
Selecting distinct pandas data frames based on a combination of multiple column values

I have data like this:

   Time            locIP       remIp       locPort  remPort  numReads  numWrites
0  20180529235221  127.0.0.1   127.0.0.1   22       565      36736     36751
1  20180529235221  127.0.0.1   127.0.0.1   22       566      36736     74690
2  20180529235221  127.0.0.1   127.0.0.1   12       567      36736     36749
3  20180529235221  10.8.21.41  10.8.21.34  22       565      36744     36738
4  20180529235221  10.8.21.41  10.8.21.34  22       566      36744     36738
5  20180529235225  127.0.0.1   127.0.0.1   22       565      36788     36751
6  20180529235225  127.0.0.1   127.0.0.1   22       566      36788     74700
7  20180529235225  127.0.0.1   127.0.0.1   12       567      36788     36800

I want to plot a time series of numReads for each combination of (locIP, remIp, locPort, remPort).

To do that, I am looking for separate, smaller data frames, for example:

   Time            locIP       remIp       locPort  remPort  numReads  numWrites
0  20180529235221  127.0.0.1   127.0.0.1   22       565      36736     36751
5  20180529235225  127.0.0.1   127.0.0.1   22       565      36788     36751

And another:

Time            locIP       remIp       locPort  remPort  numReads  numWrites
20180529235221  127.0.0.1   127.0.0.1   22       566      36736     74690
20180529235225  127.0.0.1   127.0.0.1   22       566      36788     74700

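I tried a condition on multiple columns:

df1 = df[(df["locIP"] == '127.0.0.1') & (df["remIp"] == '127.0.0.1') & (df['locPort'] == '22') & (df['remPort'] == '565')]

But this way I have to write out every combination in the condition by hand. I am looking for a better way.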

This might work for you.

import itertools

# Assumes df is the data frame from the question and the port columns are numeric.
# Create a dictionary holding the unique values of each column.
d = {}
# Grab the header list.
head = list(df)
for x in head:
    d[x] = list(set(df[x]))

# Create all possible combinations of the four key columns.
c = list(itertools.product(d['locIP'], d['locPort'], d['remIp'], d['remPort']))

# Create a list to populate with the selected data frames.
NonEmpdf = []
for x in c:
    # IP addresses are strings and must be quoted; ports are left unquoted.
    selectTxt = "locIP == '{}' & locPort == {} & remIp == '{}' & remPort == {}".format(x[0], x[1], x[2], x[3])
    print(selectTxt)
    dfSel = df.query(selectTxt)
    if dfSel.empty:
        print('Empty')
    else:
        NonEmpdf.append(dfSel)

# NonEmpdf is now a collection of all non-empty data frames that you can iterate through and plot.
NonEmpdf
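As a possible alternative to enumerating every combination, pandas' groupby can split the frame directly on the four key columns, so only combinations that actually occur are produced. The sketch below rebuilds the sample data from the question; the integer port types and the `groups` dictionary name are assumptions made for illustration, not part of the original posts.

```python
import pandas as pd

# Sample rows copied from the question; ports are assumed to be integers.
rows = [
    (20180529235221, "127.0.0.1", "127.0.0.1", 22, 565, 36736, 36751),
    (20180529235221, "127.0.0.1", "127.0.0.1", 22, 566, 36736, 74690),
    (20180529235221, "127.0.0.1", "127.0.0.1", 12, 567, 36736, 36749),
    (20180529235221, "10.8.21.41", "10.8.21.34", 22, 565, 36744, 36738),
    (20180529235221, "10.8.21.41", "10.8.21.34", 22, 566, 36744, 36738),
    (20180529235225, "127.0.0.1", "127.0.0.1", 22, 565, 36788, 36751),
    (20180529235225, "127.0.0.1", "127.0.0.1", 22, 566, 36788, 74700),
    (20180529235225, "127.0.0.1", "127.0.0.1", 12, 567, 36788, 36800),
]
columns = ["Time", "locIP", "remIp", "locPort", "remPort", "numReads", "numWrites"]
df = pd.DataFrame(rows, columns=columns)

# Parse the YYYYMMDDhhmmss stamps so they can serve as a time axis.
df["Time"] = pd.to_datetime(df["Time"].astype(str), format="%Y%m%d%H%M%S")

keys = ["locIP", "remIp", "locPort", "remPort"]

# groupby yields one (key tuple, sub-frame) pair per combination present in df.
groups = {key: sub for key, sub in df.groupby(keys)}

for key, sub in groups.items():
    print(key, sub["numReads"].tolist())
```

Each sub-frame keeps its original row index, so it corresponds to one of the smaller data frames asked for in the question. If matplotlib is installed, something like sub.plot(x="Time", y="numReads") should draw one series per group.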

Also, .any() might be useful to you.
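The answer does not say how .any() would be used. One possible reading is to build a boolean mask for a combination and call .any() on it to check whether any row matches before selecting. The following is a minimal sketch under that assumption; the helper name `select_combo`, the reduced sample data, and the integer port types are all hypothetical.

```python
import pandas as pd

# Reduced sample based on the question's data; ports are assumed to be integers.
df = pd.DataFrame(
    {
        "Time": [20180529235221, 20180529235221, 20180529235225],
        "locIP": ["127.0.0.1", "10.8.21.41", "127.0.0.1"],
        "remIp": ["127.0.0.1", "10.8.21.34", "127.0.0.1"],
        "locPort": [22, 22, 22],
        "remPort": [565, 565, 565],
        "numReads": [36736, 36744, 36788],
    }
)


def select_combo(frame, loc_ip, rem_ip, loc_port, rem_port):
    """Hypothetical helper: return the matching rows, or None if nothing matches."""
    mask = (
        (frame["locIP"] == loc_ip)
        & (frame["remIp"] == rem_ip)
        & (frame["locPort"] == loc_port)
        & (frame["remPort"] == rem_port)
    )
    # mask.any() is True when at least one row satisfies all four conditions.
    return frame[mask] if mask.any() else None


hit = select_combo(df, "127.0.0.1", "127.0.0.1", 22, 565)
miss = select_combo(df, "10.8.21.41", "127.0.0.1", 22, 565)
print(hit)
print(miss)
```

Checking the mask with .any() plays the same role as the dfSel.empty test in the loop above, but avoids building the selection when nothing matches.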
