Consider the two data files (users.csv, transactions.csv):
The users file has the following fields:
a) UserID
b) EmailID
c) NativeLanguage
d) Location
The transactions file has the following fields:
a) Transaction_ID
b) Product_ID
c) UserID
d) Price
e) Product_Description
Using Spark, find:
a) The count of unique locations in which each product is sold.
b) The products bought by each user.
c) The total spending by each user on each product.
Transactions (Transaction_ID, Product_ID, UserID, Price, Product_Description):
1 1004 19 129 whatchamacallit
2 1001 10 99 thingamajig
3 1004 17 129 whatchamacallit
4 1001 9 99 thingamajig
5 1003 3 89 gadget
6 1002 19 149 gizmo
7 1002 30 149 gizmo
8 1002 26 149 gizmo
9 1001 22 99 thingamajig
10 1003 6 89 gadget
11 1004 1 129 whatchamacallit
12 1004 2 129 whatchamacallit
13 1005 5 199 doohickey
14 1004 7 129 whatchamacallit
15 1002 16 149 gizmo
Users (UserID, EmailID, NativeLanguage, Location):
1 u..1@company.com ES MX
2 u..4@domain.com EN US
3 u..5@company.com FR FR
4 u..9@site.org HI IN
5 u..2@service.io EN CA
6 u..7@website.net FR FR
7 u..1@company.com FR FR
8 u..5@company.com FR FR
9 u..7@school.edu ES MX
10 u..1@website.net EN CA
11 u..6@website.net FR FR
12 u..9@domain.com FR FR
13 u..1@company.com ES MX
14 u..5@domain.com HI IN
15 u..8@site.org ES MX
16 u..3@school.edu EN US
17 u..7@school.edu ES MX
18 u..9@website.net HI IN
19 u..4@school.edu EN US
20 u..7@domain.com HI IN
21 u..8@site.org EN US
22 u..1@domain.com ES MX
23 u..4@service.io EN US
24 u..9@website.net ES MX
25 u..1@site.org EN US
26 u..5@service.io HI IN
27 u..9@service.io EN CA
28 u..1@company.com EN CA
29 u..6@site.org ES MX
30 u..9@website.net EN US