Iterate over all columns in pandas dataframe to split on delimiter
I have a dataframe that looks like this:
name val
0 cat ['Furry: yes', 'Fast: yes', 'Slimy: no', 'Living: yes']
1 dog ['Furry: yes', 'Fast: yes', 'Slimy: no', 'Living: yes']
2 snail ['Furry: no', 'Fast: no', 'Slimy: yes', 'Living: yes']
3 paper ['Furry: no', 'Fast: no', 'Slimy: no', 'Living: no']
For each item in the list in the val column, I want to split the item on the ':' delimiter. Then I want item[0] to be the column name and item[1] to be the value for that specific column. Like so:
name Furry Fast Slimy Living
0 cat yes yes no yes
1 dog yes yes no yes
2 snail no no yes yes
3 paper no no no no
I've tried applying apply(pd.Series) to the val column, but that still leaves me with many columns that I'd have to either split manually or iterate over and split one by one. I'd prefer to split from scratch and create the column names directly. Any idea how I can achieve this?
pd.DataFrame accepts a list of dictionaries directly. Therefore, you can construct a dataframe via a list comprehension and then join it to the name column.
L = [dict(i.split(': ') for i in x) for x in df['val']]
df = df[['name']].join(pd.DataFrame(L))
print(df)
name Fast Furry Living Slimy
0 cat yes yes yes no
1 dog yes yes yes no
2 snail no no yes yes
3 paper no no no no
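For a self-contained run of the approach above, here is a minimal sketch. The sample frame is rebuilt from the question's data. Passing a maxsplit of 1 to split and passing index=df.index are added safeguards, not part of the original answer: the first covers a value that itself contains ': ', the second keeps the join aligned if df has a non-default index. Column order in the result can differ between pandas versions.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "name": ["cat", "dog", "snail", "paper"],
    "val": [
        ["Furry: yes", "Fast: yes", "Slimy: no", "Living: yes"],
        ["Furry: yes", "Fast: yes", "Slimy: no", "Living: yes"],
        ["Furry: no", "Fast: no", "Slimy: yes", "Living: yes"],
        ["Furry: no", "Fast: no", "Slimy: no", "Living: no"],
    ],
})

# One dict per row, e.g. {"Furry": "yes", "Fast": "yes", ...}
# maxsplit=1 is an assumption added here: split only on the first ': '
records = [dict(item.split(": ", 1) for item in row) for row in df["val"]]

# Reuse df's index so the join lines up row for row
result = df[["name"]].join(pd.DataFrame(records, index=df.index))
print(result)
```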
Use apply with split to create a dictionary for each row:
df.val = df.val.apply(lambda x: dict([i.split(': ') for i in x]))
Then use apply with pd.Series to create the columns:
df.join(df.val.apply(pd.Series)).drop('val', axis=1)
name Furry Fast Slimy Living
0 cat yes yes no yes
1 dog yes yes no yes
2 snail no no yes yes
3 paper no no no no
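The two apply steps above overwrite df.val in place. If you would rather leave the original frame untouched, they can be wrapped in a small function. This is only a sketch: expand_pairs and its parameters are hypothetical names introduced for illustration, and the two-row sample frame is a shortened version of the question's data.

```python
import pandas as pd

def expand_pairs(frame, column="val", sep=": "):
    """Hypothetical helper: return a new frame with `column` expanded into key columns."""
    # Step 1: turn each list of "key: value" strings into a dict
    as_dicts = frame[column].apply(lambda items: dict(i.split(sep, 1) for i in items))
    # Step 2: expand each dict into its own columns
    wide = as_dicts.apply(pd.Series)
    # Drop the source column and attach the new ones; `frame` itself is not modified
    return frame.drop(columns=column).join(wide)

df = pd.DataFrame({
    "name": ["cat", "snail"],
    "val": [["Furry: yes", "Slimy: no"], ["Furry: no", "Slimy: yes"]],
})

out = expand_pairs(df)
print(out)
```

Note that apply(pd.Series) builds one Series per row, so on large frames the list comprehension from the first answer is likely to be faster.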