2017-05-23 14 views
0

Conditional statements in Pig

I need to read the text file below as input and, based on the logic that follows, generate the output in another file.

customerid|Dateofsubscription|Customercode|CustomerType|CustomerText
1001|2017-05-23|455|CODE|SPRINT56
1001|2017-05-23|455|DESC|Unlimited Plan
1001|2017-05-23|455|DATE|2017-05-05
1002|2017-05-24|455|CODE|SPRINT56
1002|2017-05-24|455|DESC|Unlimited Plan
1002|2017-05-24|455|DATE|2017-05-06

Logic:

if (Customercode = 455)
    if (CustomerType = "CODE")
        Val = CustomerText
    if (CustomerType = "DESC")
        Description = CustomerText
    if (CustomerType = "DATE")
        Date = CustomerText

Output:

The file shown above is my input file.

Could you please help me with this?

Answers

0
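-- The path 'data' is a placeholder; the schema is taken from the header line of the input file
rawData = LOAD 'data' USING PigStorage('|') AS (
      customerId:chararray, Dateofsubscription:chararray, Customercode:int,
      CustomerType:chararray, CustomerText:chararray);

-- The header line has a non-numeric Customercode, which loads as null and is dropped here
filteredData = FILTER rawData BY (Customercode == 455);

-- Set Val/Description/Date from CustomerText depending on CustomerType, and null otherwise
-- (string literals in Pig use single quotes; null is cast so both branches have the same type)
ExtractedData = FOREACH filteredData GENERATE
      customerId,
      (CustomerType == 'CODE' ? CustomerText : (chararray)null) AS Val,
      (CustomerType == 'DESC' ? CustomerText : (chararray)null) AS Description,
      (CustomerType == 'DATE' ? CustomerText : (chararray)null) AS Date;

groupedData = GROUP ExtractedData BY customerId;

-- MAX ignores nulls, so it picks the single non-null value in each column
finalData = FOREACH groupedData GENERATE
      group AS CustomerId,
      MAX($1.Val) AS Val,
      MAX($1.Description) AS Description,
      MAX($1.Date) AS Date;

DUMP finalData;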

This covers the core logic. Loading, formatting, and storing the data are straightforward.
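For the storing step, a minimal sketch, assuming the result should be written back as pipe-delimited text like the input. The output path is a hypothetical placeholder, not something given in the question.

```pig
-- 'output/customer_summary' is a hypothetical path chosen for illustration
STORE finalData INTO 'output/customer_summary' USING PigStorage('|');
```

STORE would take the place of DUMP once the result looks right. The target directory must not already exist, otherwise the job fails.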

+0

Thanks – satya

0

Filter the input for Customercode = 455, generate the two required columns, group by customerid, and then use BagToString.

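-- A is assumed to be loaded with a schema, for example (the path 'data' is a placeholder):
A = LOAD 'data' USING PigStorage('|') AS (customerid:chararray, Dateofsubscription:chararray, Customercode:int, CustomerType:chararray, CustomerText:chararray);
B = FILTER A BY Customercode == 455;
C = FOREACH B GENERATE $0 AS CustomerId, $4 AS CustomerText;
D = GROUP C BY CustomerId;
-- Note: this generates 1001,SPRINT56|Unlimited Plan|2017-05-05, so the first field still has to be
-- joined with '|' to the second field, which is already delimited by '|'.
-- Bags are unordered, so the order of the joined values is not guaranteed.
E = FOREACH D GENERATE group AS CustomerId, BagToString(C.CustomerText, '|') AS CustomerText;
-- Concatenate the first field with '|', then concatenate the result with the second field
F = FOREACH E GENERATE CONCAT(CONCAT($0, '|'), $1);
DUMP F;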
+0

Thank you very much – satya