
From multiple values per row of a pandas DataFrame: get two columns with every relation between the values (to analyse the network with NetworkX)

Original post: 2021-11-08 16:20:45 · Tags: python, pandas, dataframe, networkx

I have a dataframe with names of persons in it. The persons work together on the same item.

item   names
a      moriz, jon, cate 
b      jon, lenard 
c      cate, martin, leo, jil

I would like to prepare the names for a network visualisation. I need to split each names cell into two columns so that every relation between the persons is shown, like this:

item    person 1    person 2
a       moriz       jon
a       moriz       cate
a       jon         cate
b       jon         lenard
c       cate        martin
c       cate        leo
c       cate        jil
c       jil         martin
c       jil         leo
c       martin      leo

I know how to split the names cell into multiple cells for each item, but I don't know how to list the names in pairs so that every relation per item appears.

1 answer

You could do something like this (where df is your dataframe):

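from itertools import combinations

# Split each cell into a list of names, then build every unordered pair of names
df["names"] = df["names"].str.split(", ").map(lambda names: [*combinations(names, 2)])
# One row per pair; the original index value is repeated for each pair
df = df.explode("names")
# Unpack each (name, name) tuple into two separate columns
df[["person 1", "person 2"]] = df["names"].str.join(",").str.split(",", expand=True)
df = df.drop(columns="names")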

Result for the sample:

  item person 1 person 2
0    a    moriz      jon
0    a    moriz     cate
0    a      jon     cate
1    b      jon   lenard
2    c     cate   martin
2    c     cate      leo
2    c     cate      jil
2    c   martin      leo
2    c   martin      jil
2    c      leo      jil
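
Since the goal is to analyse the pairs with NetworkX, here is a minimal self-contained sketch of the whole path from the sample data to a graph. It is a variation on the approach above, not the same code: it builds the pair rows in a list comprehension and strips whitespace from each name (the sample cells contain trailing spaces). The names `rows`, `edges` and `graph` are only illustrative, and an undirected graph is assumed.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Sample data from the question (note the stray trailing spaces in some cells).
df = pd.DataFrame(
    {
        "item": ["a", "b", "c"],
        "names": ["moriz, jon, cate ", "jon, lenard ", "cate, martin, leo, jil"],
    }
)

# Build one (item, person 1, person 2) row per unordered pair of names.
rows = [
    (item, first, second)
    for item, cell in zip(df["item"], df["names"])
    for first, second in combinations([name.strip() for name in cell.split(",")], 2)
]
edges = pd.DataFrame(rows, columns=["item", "person 1", "person 2"])

# Assumption: an undirected graph, with the item stored as an edge attribute.
graph = nx.from_pandas_edgelist(
    edges, source="person 1", target="person 2", edge_attr="item"
)

print(edges)
# Persons sorted by how many other persons they are connected to.
print(sorted(graph.degree, key=lambda pair: pair[1], reverse=True))
```

With a plain `nx.Graph`, a pair that works together on several items is stored as a single edge, so only one item survives as the attribute. If that matters, passing `create_using=nx.MultiGraph` to `from_pandas_edgelist` should keep one edge per item instead.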


