How to replace a string that is a part of a dataframe with a list in pandas?
I am a beginner at coding, and since this is a very simple question, I know there must be answers out there. However, I've searched for about half an hour, typing countless queries into Google, and it has all gone over my head.
Let's say I have a dataframe with columns "Name" and "Hobbies" and 2 people, so 2 rows. Currently, I have the hobbies as strings in the form "hobby1, hobby2". I would like to change this into ["hobby1", "hobby2"].
hobbies_as_string = df.iloc[0, 2]
hobbies_as_list = hobbies_as_string.split(',')
df.iloc[0, -2] = hobbies_as_list
However, this fails with an error: ValueError: Must have equal len keys and value when setting with an iterable. I don't understand why: if I take hobbies_as_string as a copy, I'm able to assign the hobbies column as a list with no problem. I'm also able to assign a string, such as "Hey", to df.iloc[0,-2], and that works fine. I'm guessing it has to do with the ValueError. Why won't pandas let me assign it as a list?
Thank you very much for your help and explanation.
Are you looking to apply a split row-wise, turning each value into a list?
import pandas as pd
df = pd.DataFrame({'Name' : ['John', 'Kate'],
'Hobbies' : ["Hobby1, Hobby2", "Hobby2, Hobby3"]})
df['Hobbies'] = df['Hobbies'].apply(lambda x: x.split(', '))
df
Or, if you are not a big lambda user, you can call str.split() on the entire column, which is easier:
import pandas as pd
df = pd.DataFrame({'Name' : ['John', 'Kate'],
'Hobbies' : ["Hobby1, Hobby2", "Hobby2, Hobby3"]})
df['Hobbies'] = df['Hobbies'].str.split(", ")
df
Output:
Name Hobbies
0 John [Hobby1, Hobby2]
1 Kate [Hobby2, Hobby3]
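If the spacing after the commas is not consistent, splitting on a fixed separator can leave stray spaces inside the list items. A minimal sketch of one way to handle that, assuming made-up sample strings with uneven spacing (the data here is illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data with inconsistent spacing after the commas
df = pd.DataFrame({'Name': ['John', 'Kate'],
                   'Hobbies': ["Hobby1,Hobby2", "Hobby2,   Hobby3"]})

# Split on the bare comma, then strip whitespace from every piece
df['Hobbies'] = df['Hobbies'].apply(
    lambda s: [hobby.strip() for hobby in s.split(',')])

print(df['Hobbies'].tolist())
```

Stripping each piece after the split keeps the separator simple and does not depend on how many spaces follow each comma.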
Another way of doing it:
df=pd.DataFrame({'hobbiesStrings':['"hobby1, hobby2"']})
df
Replace each comma followed by whitespace with "," and put the hobbiesStrings values in a list:
# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
x=df.hobbiesStrings.str.replace(r'(,\s+)+','","',regex=True).values.tolist()
x
Here I use a regular expression. Basically, I am replacing a comma followed by whitespace (\s) with ","
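To see what this substitution does to a single value, here is a small sketch using Python's built-in re module instead of pandas. The sample string mirrors the one in the dataframe above; the variable names are only illustrative. Note that the result of the replacement is still one string, so a further split is needed if a real Python list is wanted.

```python
import re

raw = '"hobby1, hobby2"'

# Replace a comma followed by whitespace with ","
replaced = re.sub(r',\s+', '","', raw)
print(replaced)  # "hobby1","hobby2"

# The result is still a single string; split it and drop the quotes
# to get an actual list of hobbies
as_list = [item.strip('"') for item in replaced.split(',')]
print(as_list)  # ['hobby1', 'hobby2']
```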
Rewrite the column using df.assign:
df=df.assign(hobbies_stringsnes=[x])
Chained together:
df=df.assign(hobbies_stringsnes=[df.hobbiesStrings.str.replace(r',\s','","',regex=True).values.tolist()])
df
Use the "at" method to replace a value with a list:
import pandas as pd
# create a dataframe
df = pd.DataFrame(data={'Name': ['Stinky', 'Lou'],
'Hobbies': ['Shooting Sports', 'Poker']})
# replace Lou's hobby of poker with a list of degen hobbies using the at method
df.at[1, 'Hobbies'] = ['Poker', 'Ponies', 'Dice']
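Applied to the situation in the question (one cell holding "hobby1, hobby2"), the same idea might look like the sketch below. The dataframe is a made-up stand-in for the asker's data, and the cast to object dtype is an assumption added here so that a cell can hold a list regardless of how the installed pandas version stores string columns.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe described in the question
df = pd.DataFrame({'Name': ['John', 'Kate'],
                   'Hobbies': ['hobby1, hobby2', 'hobby2, hobby3']})

# Assumption: make sure the column is object dtype so a cell can store a list
df['Hobbies'] = df['Hobbies'].astype(object)

# Read one cell, split it, and write the list back into the same cell
hobbies_as_list = df.at[0, 'Hobbies'].split(', ')
df.at[0, 'Hobbies'] = hobbies_as_list

print(df.at[0, 'Hobbies'])
```

Because at addresses exactly one cell by row label and column name, the list is stored as a single value; only the first row is changed and the second row stays a string.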