Selecting distinct pandas DataFrames based on combinations of multiple column values
I have data like this:
Time locIP remIp locPort remPort numReads numWrites
0 20180529235221 127.0.0.1 127.0.0.1 22 565 36736 36751
1 20180529235221 127.0.0.1 127.0.0.1 22 566 36736 74690
2 20180529235221 127.0.0.1 127.0.0.1 12 567 36736 36749
3 20180529235221 10.8.21.41 10.8.21.34 22 565 36744 36738
4 20180529235221 10.8.21.41 10.8.21.34 22 566 36744 36738
5 20180529235225 127.0.0.1 127.0.0.1 22 565 36788 36751
6 20180529235225 127.0.0.1 127.0.0.1 22 566 36788 74700
7 20180529235225 127.0.0.1 127.0.0.1 12 567 36788 36800
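I want to plot a time series of numReads for each combination of (locIP, remIp, locPort, remPort).
To do that, I am looking to split the data into smaller data frames, for example: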
Time locIP remIp locPort remPort numReads numWrites
0 20180529235221 127.0.0.1 127.0.0.1 22 565 36736 36751
5 20180529235225 127.0.0.1 127.0.0.1 22 565 36788 36751
And another:
Time locIP remIp locPort remPort numReads numWrites
1 20180529235221 127.0.0.1 127.0.0.1 22 566 36736 74690
6 20180529235225 127.0.0.1 127.0.0.1 22 566 36788 74700
I tried filtering with conditions on multiple columns:
df1 = df[(df["locIP"] == '127.0.0.1') & (df["remIp"] == '127.0.0.1') & (df['locPort'] == '22') & (df['remPort'] == '565')]
But this way I have to spell out every combination by hand in the conditions. I am looking for a better way.
This may work for you.
import itertools

# Create a dictionary to populate with the unique values of each column.
d = {}
# Grab the list of column names.
head = list(df)
# Collect the unique values of each column.
for x in head:
    d[x] = list(set(df[x]))
# Create all possible combinations of the four key columns.
c = list(itertools.product(d['locIP'], d['locPort'], d['remIp'], d['remPort']))
# Create a list to populate with the selected dataframes.
NonEmpdf = []
for x in c:
    # The IPs are strings and need quoting; the ports are left unquoted,
    # which assumes those columns hold numbers.
    selectTxt = "locIP == '{}' & locPort == {} & remIp == '{}' & remPort == {}".format(x[0], x[1], x[2], x[3])
    print(selectTxt)
    dfSel = df.query(selectTxt)
    if dfSel.empty:
        print('Empty')
    else:
        NonEmpdf.append(dfSel)
# NonEmpdf is now a collection of all non-empty dataframes you can iterate through and plot.
NonEmpdf
Also, .any() may be useful to you.
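As an alternative to enumerating every possible combination, a groupby on the four key columns visits only the combinations that actually occur in the data. The following is a minimal sketch, not the answerer's method. The sample frame is a subset of the rows from the question, the names key_cols and frames are made up for illustration, and the port columns are assumed to be integers.

```python
import pandas as pd

# Sample rows copied from the question (a subset); ports assumed to be integers.
df = pd.DataFrame({
    "Time": [20180529235221, 20180529235221, 20180529235221,
             20180529235225, 20180529235225],
    "locIP": ["127.0.0.1", "127.0.0.1", "10.8.21.41", "127.0.0.1", "127.0.0.1"],
    "remIp": ["127.0.0.1", "127.0.0.1", "10.8.21.34", "127.0.0.1", "127.0.0.1"],
    "locPort": [22, 22, 22, 22, 22],
    "remPort": [565, 566, 565, 565, 566],
    "numReads": [36736, 36736, 36744, 36788, 36788],
    "numWrites": [36751, 74690, 36738, 36751, 74700],
})

# Hypothetical name: the columns whose combination identifies one connection.
key_cols = ["locIP", "remIp", "locPort", "remPort"]

# groupby yields one (key tuple, sub-frame) pair per combination present in df,
# so no empty selections are ever produced.
frames = {key: sub for key, sub in df.groupby(key_cols)}

for key, sub in frames.items():
    print(key, len(sub))
```

Each value in frames is a smaller data frame like the ones shown in the question. Assuming matplotlib is installed, each one could then be plotted with something like sub.plot(x="Time", y="numReads"). Converting Time first with pd.to_datetime(df["Time"].astype(str), format="%Y%m%d%H%M%S") would give a proper time axis.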