
Pig - Another Use case in PIG

asked by marvit, November 23, 2014 07:05 AM

Use case: For all of your registered users, you want to count how many came to your site this month. You want this count both by geography (zip code) and by demographic group (age and gender).


1 Answer

answered by Experts-976


Load web server logs

logs = load 'server_logs' using HCatLoader();
thismonth = filter logs by date >= '20121101' and date < '20121201';

Load users

users = load 'user_info' using HCatLoader();

Remove any user that did not visit this month

grpd = cogroup thismonth by userid, users by userid;
filtrd = filter grpd by not IsEmpty(thismonth);
visited = foreach filtrd generate flatten(users);

Count by ZipCode

grpdbyzip = group visited by zip;
cntzip = foreach grpdbyzip generate group, COUNT(visited);
store cntzip into 'by_zip' using HCatStorer('date=201211');

Count by demographics

grpdbydemo = group visited by (age, gender);
cntdemo = foreach grpdbydemo generate flatten(group), COUNT(visited);
store cntdemo into 'by_demo' using HCatStorer('date=201211');
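
To try the same pipeline without HCatalog, here is a minimal local-mode sketch. It assumes two tab-separated files, logs.tsv and users.tsv, with the columns shown in the "as" clauses. The file names, the schemas, and the output field name "visitors" are assumptions for illustration and are not part of the original answer. It also assumes users.tsv has one row per userid.

```pig
-- Local-mode sketch; file names and column layouts are assumptions.
logs = load 'logs.tsv' using PigStorage('\t')
       as (userid:chararray, date:chararray);
users = load 'users.tsv' using PigStorage('\t')
        as (userid:chararray, zip:chararray, age:int, gender:chararray);

-- Keep only this month's log rows (dates stored as yyyymmdd strings).
thismonth = filter logs by date >= '20121101' and date < '20121201';

-- cogroup yields one row per userid: (group, {thismonth rows}, {users rows}).
grpd = cogroup thismonth by userid, users by userid;

-- Drop userids that have no log rows this month.
fltrd = filter grpd by not IsEmpty(thismonth);

-- Flattening the users bag emits one row per row in that bag,
-- regardless of how many log rows the user has this month.
visited = foreach fltrd generate flatten(users);

-- Count visiting users by zip code.
grpdbyzip = group visited by zip;
cntzip = foreach grpdbyzip generate group as zip, COUNT(visited) as visitors;

-- Count visiting users by (age, gender).
grpdbydemo = group visited by (age, gender);
cntdemo = foreach grpdbydemo generate flatten(group), COUNT(visited) as visitors;

dump cntzip;
dump cntdemo;
```

Saved under a name such as visits.pig (hypothetical), the script can be run with "pig -x local visits.pig". The cogroup is what keeps the counts per user rather than per page view: a plain join of thismonth and users would emit one row for every matching log line, so a user with many visits would be counted many times.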