Pandas: group by multiple columns and format the result
Here is my text file:
No.,Time,Source,Destination,Protocol,Length,Info,SrcPort,DstPort,src_dst_pair
1401,0.397114,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
8999,3.229111,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
18504,5.877098,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
23755,8.695843,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
28027,11.24121,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
33304,14.117213,145.95.225.186,210.218.218.164,UDP,100,Source port: hsrp Destination port: hsrp,hsrp,1985,"('145.95.225.186', '210.218.218.164')"
700443,222.305789,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
700495,222.351933,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
700496,222.352372,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
708982,225.913385,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
709797,226.130847,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
710340,226.372421,145.95.41.251,145.95.81.118,UDP,50,Source port: 36477 Destination port: snmp,36477,161,"('145.95.41.251', '145.95.81.118')"
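I want to group the data by source and destination and then, within each group:
- accumulate (sum) the Length column
- find the difference between the maximum and minimum Time

I got the result, but I need to format it as shown in the expected output. I would also like to know whether there is a better way to do this. Below is my attempt:
import pandas as pd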
data = pd.read_csv('simple_udp.csv')
# getting the accumulated sum for the group
length = data.groupby(['Source','Destination']).Length.sum()
# getting the difference in time between the max and min in the group
time = data.groupby(['Source','Destination']).Time.max() - data.groupby(['Source','Destination']).Time.min()
# This is where I have a problem. How can I format the result so that
# I get the expected output (shown below)?
print(length, time)
Expected output:
Source Destination Length Time
145.95.225.186 210.218.218.164 600 13.720099
145.95.41.251 145.95.81.118 300 4.066632
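
One possible way to get a table in that shape is sketched below. It is not the only approach. It groups once, computes both values, and combines them into a single DataFrame, with `reset_index()` turning the group keys back into ordinary columns. To keep the sketch self-contained, the inline `csv_text` stands in for 'simple_udp.csv' and holds only the columns needed here; that inline data and the names `grouped` and `result` are assumptions made for illustration.

```python
import io

import pandas as pd

# Inline stand-in for 'simple_udp.csv' (assumption: only the needed columns are kept).
csv_text = """No.,Time,Source,Destination,Length
1401,0.397114,145.95.225.186,210.218.218.164,100
8999,3.229111,145.95.225.186,210.218.218.164,100
18504,5.877098,145.95.225.186,210.218.218.164,100
23755,8.695843,145.95.225.186,210.218.218.164,100
28027,11.24121,145.95.225.186,210.218.218.164,100
33304,14.117213,145.95.225.186,210.218.218.164,100
700443,222.305789,145.95.41.251,145.95.81.118,50
700495,222.351933,145.95.41.251,145.95.81.118,50
700496,222.352372,145.95.41.251,145.95.81.118,50
708982,225.913385,145.95.41.251,145.95.81.118,50
709797,226.130847,145.95.41.251,145.95.81.118,50
710340,226.372421,145.95.41.251,145.95.81.118,50
"""
data = pd.read_csv(io.StringIO(csv_text))

# Build the groupby object once and reuse it for every aggregate.
grouped = data.groupby(['Source', 'Destination'])

# Both Series share the same (Source, Destination) index, so they line up
# as columns; reset_index() moves the group keys into regular columns.
result = pd.DataFrame({
    'Length': grouped['Length'].sum(),
    'Time': grouped['Time'].max() - grouped['Time'].min(),
}).reset_index()

# Print without the integer row index to approximate the expected layout.
print(result.to_string(index=False))
```

Reusing `grouped` avoids repeating the `groupby` call three times as in the attempt above, and `to_string(index=False)` is only one of several ways to control how the table is printed.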