Pandas: adding MultiIndex Series/DataFrames containing lists

How do I add/merge two MultiIndex Series/DataFrames that contain lists as elements (a port sequence or timestamp sequence in my case)? In particular, how do I deal with index entries that appear in only one of the Series/DataFrames? Unfortunately, the .add() method accepts only floats for the fill_value argument, not empty lists.

My data:

print(series1)
print(series2)

IP               sessionID
195.12*.21*.11*  49                    [5900]
                 50         [5900, 5900, 5900, 5900, ...

IP               sessionID
85.15*.24*.12*   63                    [3389]
91.20*.4*.14*    68           [445, 445, 139]
113.9*.4*.16*    75                 [23, 210]
195.12*.21*.11*  49                    [5905]

Expected result:

IP               sessionID
195.12*.21*.11*  49              [5900, 5905]
                 50         [5900, 5900, 5900, 5900, ...
85.15*.24*.12*   63                    [3389]
91.20*.4*.14*    68           [445, 445, 139]
113.9*.4*.16*    75                 [23, 210]

Oddly enough, series1.add(series1) or series2.add(series2) does work and appends the lists as expected, whereas series1.add(series2) produces runtime errors. series1.combine_first(series2) works, but it does not merge the lists; it simply takes one of them. Any ideas?

Yes, I know that lists as elements are bad style, but that's the way my data is right now. Sorry about that. To keep it short I have posted only the Series example; let me know if you also need the DataFrame example.

In case there is any other poor soul out there who needs this info: it seems like a dirty workaround, but it works:

# add() works for mutual indices, so find intersection and call it
# fortunately, it appends list2 to list1!
intersection = series1.index.intersection(series2.index)
inter1 = series1[series1.index.isin(intersection)]
inter2 = series2[series2.index.isin(intersection)]
interAppend = inter1.add(inter2)

# combine_first() unions indices and keeps the values of the caller,
# so it will keep the appended lists on mutual indices,
# while it adds new indices and corresponding values
exclusiveAdd = interAppend.combine_first(series1).combine_first(series2)
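
The steps above can be tried end to end on a small, made-up data set. The sketch below is only that, a sketch: the IP strings and the function names merge_by_workaround and merge_by_groupby are hypothetical, and the second function uses a different technique (concatenate both Series, group by the index levels, flatten the lists per key) that is not part of the workaround above. It assumes Python 3 and a reasonably recent pandas.

```python
import pandas as pd

# Made-up sample data; the IPs are placeholders, not the masked ones above.
names = ["IP", "sessionID"]
series1 = pd.Series(
    [[5900], [5900, 5900]],
    index=pd.MultiIndex.from_tuples(
        [("10.0.0.1", 49), ("10.0.0.1", 50)], names=names
    ),
)
series2 = pd.Series(
    [[3389], [445, 445, 139], [5905]],
    index=pd.MultiIndex.from_tuples(
        [("10.0.0.2", 63), ("10.0.0.3", 68), ("10.0.0.1", 49)], names=names
    ),
)


def merge_by_workaround(s1, s2):
    """Same steps as above: add() on shared keys, then combine_first()."""
    shared = s1.index.intersection(s2.index)
    appended = s1[s1.index.isin(shared)].add(s2[s2.index.isin(shared)])
    return appended.combine_first(s1).combine_first(s2)


def merge_by_groupby(s1, s2):
    """Alternative (assumption, not from the answer): concat, then flatten per key."""
    stacked = pd.concat([s1, s2])
    return stacked.groupby(level=list(stacked.index.names), sort=False).agg(
        lambda lists: [port for lst in lists for port in lst]
    )


result_a = merge_by_workaround(series1, series2).to_dict()
result_b = merge_by_groupby(series1, series2).to_dict()
print(result_b)
```

Both results should hold the same list for each (IP, sessionID) key. The row order may differ between the two, since combine_first() unions the indices while groupby(sort=False) keeps keys in first-seen order.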
