From multiple values per row of a pandas dataframe: get two columns with every relation of the values (to analyse the network with NetworkX)
I have a dataframe with names of persons in it. The persons listed in a row work together on the same item.
item  names
a     moriz, jon, cate
b     jon, lenard
c     cate, martin, leo, jil
I would like to get two columns with every pair of persons per item:

item  person 1  person 2
a     moriz     jon
a     moriz     cate
a     jon       cate
b     jon       lenard
c     cate      martin
c     cate      leo
c     cate      jil
c     jil       martin
c     jil       leo
c     martin    leo
You could do something like this (df is your dataframe):
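from itertools import combinations

# Split each name string into a list, then build every 2-person combination.
df["names"] = df["names"].str.split(", ").map(lambda l: [*combinations(l, 2)])
# One row per pair; the original index value is repeated for each pair of an item.
df = df.explode("names")
# Unpack each (person 1, person 2) tuple into two separate columns.
df[["person 1", "person 2"]] = df["names"].str.join(",").str.split(",", expand=True)
df = df.drop(columns="names")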
Result for the sample:
   item  person 1  person 2
0  a     moriz     jon
0  a     moriz     cate
0  a     jon       cate
1  b     jon       lenard
2  c     cate      martin
2  c     cate      leo
2  c     cate      jil
2  c     martin    leo
2  c     martin    jil
2  c     leo       jil
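
Since the goal is to analyse the network with NetworkX, here is a minimal, self-contained sketch of how the pair columns could be turned into a graph. It rebuilds the sample data from the question and, as a variation on the code above, creates the pairs with a plain list comprehension instead of explode (the names pairs and G are only chosen for illustration). It assumes an undirected graph is wanted, with persons as nodes and the item stored as an edge attribute.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Sample data from the question: one row per item, names as a comma-separated string.
df = pd.DataFrame({
    "item": ["a", "b", "c"],
    "names": ["moriz, jon, cate", "jon, lenard", "cate, martin, leo, jil"],
})

# One row per (item, pair of persons); an item with a single name yields no rows.
pairs = pd.DataFrame(
    [
        (item, p1, p2)
        for item, names in zip(df["item"], df["names"])
        for p1, p2 in combinations(names.split(", "), 2)
    ],
    columns=["item", "person 1", "person 2"],
)

# Undirected graph: persons are nodes, each pair is an edge labelled with its item.
G = nx.from_pandas_edgelist(
    pairs, source="person 1", target="person 2", edge_attr="item"
)

print(G.number_of_nodes(), G.number_of_edges())  # 7 10
print(G.degree["cate"])                          # 5
```

Note that in a plain nx.Graph, two persons who share more than one item end up with a single edge whose item attribute is the last one seen. If every shared item should be kept as its own edge, passing create_using=nx.MultiGraph to from_pandas_edgelist is one option.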