2017-06-27 4 views
1

Row-wise percent difference from a reference row, by group, in data.table

I have some data like this:

Seller               Name                         Price
ⒽomeⓄnline           Harper Hand Truck and Dolly  51.7
HomeOnline           Harper Hand Truck and Dolly  62.54
Amazon.com           Harper Hand Truck and Dolly  41.83
XpW                  Honeywell Safe Chest         41.37
XoXoGroupLLC         Honeywell Safe Chest         51.78
Toys Online          Honeywell Safe Chest         43.01
Tempus & Co.         Honeywell Safe Chest         52.7
stores123            Honeywell Safe Chest         51.21
ⒽomeⓄnline           Honeywell Safe Chest         43.88
HomeOnline           Honeywell Safe Chest         43.87
Great Brands Outlet  Honeywell Safe Chest         64.95
Connect Buy          Honeywell Safe Chest         30.1
Amazon.com           Honeywell Safe Chest         24.6

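Within each Name, I want to calculate the percent difference between each row and the row where the Seller is Amazon.com. The output would look like this, where "etc..." means the column is filled in all the way down: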

Seller               Name                         Price  Pct_Diff
ⒽomeⓄnline           Harper Hand Truck and Dolly  51.7   .23
HomeOnline           Harper Hand Truck and Dolly  62.54  .49
Amazon.com           Harper Hand Truck and Dolly  41.83
XpW                  Honeywell Safe Chest         41.37  .68
XoXoGroupLLC         Honeywell Safe Chest         51.78  1.0
Toys Online          Honeywell Safe Chest         43.01  etc...
Tempus & Co.         Honeywell Safe Chest         52.7
stores123            Honeywell Safe Chest         51.21
ⒽomeⓄnline           Honeywell Safe Chest         43.88
HomeOnline           Honeywell Safe Chest         43.87
Great Brands Outlet  Honeywell Safe Chest         64.95
Connect Buy          Honeywell Safe Chest         30.1
Amazon.com           Honeywell Safe Chest         24.6

I think there is a good data.table solution for this, but I don't know how to compare each row that does not have "Amazon.com" as the seller against the row that does.

+0

Please post your data, e.g. the output of dput. – lmo

Answers

2

You can use the following, which gives the output shown below it:

library(data.table)

# Within each Name, subtract the Amazon.com price and divide by it
dt[, pct := (Price - Price[Seller=='Amazon.com'])/Price[Seller=='Amazon.com'], by = Name] 

                 Seller                        Name Price       pct
 1:          ⒽomeⓄnline Harper Hand Truck and Dolly 51.70 0.2359551
 2:          HomeOnline Harper Hand Truck and Dolly 62.54 0.4950992
 3:          Amazon.com Harper Hand Truck and Dolly 41.83 0.0000000
 4:                 XpW        Honeywell Safe Chest 41.37 0.6817073
 5:        XoXoGroupLLC        Honeywell Safe Chest 51.78 1.1048780
 6:         Toys Online        Honeywell Safe Chest 43.01 0.7483740
 7:        Tempus & Co.        Honeywell Safe Chest 52.70 1.1422764
 8:           stores123        Honeywell Safe Chest 51.21 1.0817073
 9:          ⒽomeⓄnline        Honeywell Safe Chest 43.88 0.7837398
10:          HomeOnline        Honeywell Safe Chest 43.87 0.7833333
11: Great Brands Outlet        Honeywell Safe Chest 64.95 1.6402439
12:         Connect Buy        Honeywell Safe Chest 30.10 0.2235772
13:          Amazon.com        Honeywell Safe Chest 24.60 0.0000000
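
The one-liner gives 0 on the Amazon.com rows and assumes every Name has exactly one Amazon.com row. If the Amazon.com rows should be blank, as in the desired output in the question, one possible variation is sketched below. It is not part of the original answer: the helper name pct_vs_ref, the made-up "Widget" item, and the choice to return NA when a group has no (or more than one) reference row are all assumptions for illustration.

```r
library(data.table)

# Hypothetical helper: percent difference of each price from the
# reference seller's price within a single group.
pct_vs_ref <- function(price, seller, ref = "Amazon.com") {
    ref_price <- price[seller == ref]
    # Assumption: with zero or several reference rows, return NA for the whole group
    if (length(ref_price) != 1L) return(rep(NA_real_, length(price)))
    out <- (price - ref_price) / ref_price
    out[seller == ref] <- NA_real_  # leave the reference row itself blank
    out
}

# Small made-up table; "Widget" has no Amazon.com row
ex <- data.table(
    Seller = c("HomeOnline", "Amazon.com", "XpW", "Connect Buy"),
    Name   = c("Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly",
               "Widget", "Widget"),
    Price  = c(62.54, 41.83, 10, 12)
)

# Apply the helper group by group, adding the column by reference
ex[, Pct_Diff := pct_vs_ref(Price, Seller), by = Name]
print(ex)
```

Calling the helper inside `by = Name` keeps the grouping logic identical to the one-liner; only the handling of the reference row and of groups without one changes.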

The same logic implemented in dplyr:

library(dplyr)

dt %>% 
    group_by(Name) %>% 
    mutate(pct = (Price - Price[Seller=='Amazon.com'])/Price[Seller=='Amazon.com']) 

Data used:

dt <- structure(list(Seller = c("ⒽomeⓄnline", "HomeOnline", "Amazon.com", "XpW", "XoXoGroupLLC", "Toys Online", "Tempus & Co.", "stores123", "ⒽomeⓄnline", "HomeOnline", "Great Brands Outlet", "Connect Buy", "Amazon.com"), 
        Name = c("Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest"), 
        Price = c(51.7, 62.54, 41.83, 41.37, 51.78, 43.01, 52.7, 51.21, 43.88, 43.87, 64.95, 30.1, 24.6)), 
       .Names = c("Seller", "Name", "Price"), class = c("data.table", "data.frame"), row.names = c(NA, -13L)) 
1

Here is an approach using dplyr:

library(dplyr) 

df <- data.frame(
    Seller = c("ⒽomeⓄnline", "HomeOnline", "Amazon.com", "XpW", "XoXoGroupLLC", "Toys Online", "Tempus & Co.", "stores123", "ⒽomeⓄnline", "HomeOnline", "Great Brands Outlet", "Connect Buy", "Amazon.com"), 
    Name = c("Harper Hand Truck and Dolly","Harper Hand Truck and Dolly","Harper Hand Truck and Dolly","Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest"), 
    Price = c(51.7, 62.54, 41.83, 41.37, 51.78, 43.01, 52.7, 51.21, 43.88, 43.87, 64.95, 30.1, 24.6) 
) 

df %>% 
    # Join each row with the "Amazon.com" price for this item 
    left_join(df %>% filter(Seller == "Amazon.com"), by = "Name", suffix = c("", ".amazon")) %>% 
    # Remove unused "Seller" column 
    select(-Seller.amazon) %>% 
    # Calculate percentage for each row, except for 
    # "Amazon.com" rows, for which the percent difference is NA 
    mutate(Pct_Diff = ifelse(Seller == "Amazon.com", NA, round((Price - Price.amazon)/Price.amazon, 2))) 

#                 Seller                        Name Price Price.amazon Pct_Diff
# 1           ⒽomeⓄnline Harper Hand Truck and Dolly 51.70        41.83     0.24
# 2           HomeOnline Harper Hand Truck and Dolly 62.54        41.83     0.50
# 3           Amazon.com Harper Hand Truck and Dolly 41.83        41.83       NA
# 4                  XpW        Honeywell Safe Chest 41.37        24.60     0.68
# 5         XoXoGroupLLC        Honeywell Safe Chest 51.78        24.60     1.10
# 6          Toys Online        Honeywell Safe Chest 43.01        24.60     0.75
# 7         Tempus & Co.        Honeywell Safe Chest 52.70        24.60     1.14
# 8            stores123        Honeywell Safe Chest 51.21        24.60     1.08
# 9           ⒽomeⓄnline        Honeywell Safe Chest 43.88        24.60     0.78
# 10          HomeOnline        Honeywell Safe Chest 43.87        24.60     0.78
# 11 Great Brands Outlet        Honeywell Safe Chest 64.95        24.60     1.64
# 12         Connect Buy        Honeywell Safe Chest 30.10        24.60     0.22
# 13          Amazon.com        Honeywell Safe Chest 24.60        24.60       NA
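
Since the question asked for data.table, the same join idea can probably be expressed there as an update join. The sketch below is an assumption-laden illustration and not part of the answer above: it uses a four-row subset of the question's data, the names ex and amazon are made up, and it assumes at most one Amazon.com row per Name.

```r
library(data.table)

# Four rows taken from the question's data, for brevity
ex <- data.table(
    Seller = c("HomeOnline", "Amazon.com", "Connect Buy", "Amazon.com"),
    Name   = c("Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly",
               "Honeywell Safe Chest", "Honeywell Safe Chest"),
    Price  = c(62.54, 41.83, 30.1, 24.6)
)

# Reference prices: one row per Name, taken from the Amazon.com rows
amazon <- ex[Seller == "Amazon.com", .(Name, Price)]

# Update join: look up the Amazon.com price for each row's Name
# (i.Price refers to the Price column of 'amazon')
ex[amazon, on = "Name", Price.amazon := i.Price]

# Percent difference, NA on the Amazon.com rows themselves
ex[, Pct_Diff := ifelse(Seller == "Amazon.com", NA_real_,
                        round((Price - Price.amazon) / Price.amazon, 2))]
print(ex)
```

The update join adds Price.amazon by reference, so no copy of the table is made; a Name without an Amazon.com row would simply end up with NA in both new columns.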