
Using a Pandas DataFrame, how can I split the strings in a specific column and then replace each string with the first element of the split?

I am trying to clean the location data of a data set, and some of the locations contain multiple cities separated by commas. I want to split those strings on the comma and then replace each string with the first element of the split (e.g., "Mumbai, Delhi, Calcutta" should become just "Mumbai"). This is the code I wrote to try to do it. Can someone tell me what I am doing wrong?

df_train = pd.read_csv("Final_Train_Dataset.csv", index_col= None)

for cell in df_train["location"]:
  new = df_train["location"].str.split(",")
df_train["new_location"] = new[0]
df_train["new_location"].head()

Any help is much appreciated. I don't think this is too hard to figure out, but I am new to pandas and we are using it for a class project.

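This will solve your problem: call .str.split with the comma as the separator and expand=True, then take the first column. Without an explicit separator, split breaks on whitespace instead, which would leave a trailing comma and cut multi-word city names.

df_train["new_location"] = df_train["location"].str.split(",", expand=True)[0]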
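As for why the loop in the question does not work: each pass recomputes the same split for the whole column, so the loop adds nothing, and `new[0]` then selects the row labelled 0 (a single list of pieces) rather than the first piece of every row. Assigning that list to a column typically fails with a length mismatch. Below is a minimal sketch of an alternative that takes the first piece per row with the `.str[0]` accessor and also trims stray whitespace. The sample data is made up for illustration and stands in for Final_Train_Dataset.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real CSV file
df_train = pd.DataFrame(
    {"location": ["Mumbai, Delhi, Calcutta", "Bangalore", "New Delhi, Pune"]}
)

# Split every string on commas, keep the first piece of each row,
# and strip any surrounding whitespace
df_train["new_location"] = (
    df_train["location"].str.split(",").str[0].str.strip()
)

print(df_train["new_location"].tolist())
# ['Mumbai', 'Bangalore', 'New Delhi']
```

Rows without a comma pass through unchanged, because splitting them yields a one-element list whose first piece is the original string.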
