Using a Pandas DataFrame, how do I split the strings in a specific column and then replace each string with the first element of the split?

Question

I am trying to clean the location data of a data set, and some of the locations have multiple cities separated by commas. I want to split the strings that contain commas on the comma and then replace each string with the first element of the split (i.e., "Mumbai, Delhi, Calcutta" should become just "Mumbai"). This is the code I wrote to try to do it. Can someone tell me what I am doing wrong?

df_train = pd.read_csv("Final_Train_Dataset.csv", index_col= None)

for cell in df_train["location"]:
  new = df_train["location"].str.split(",")
df_train["new_location"] = new[0]
df_train["new_location"].head()

Any help is much appreciated. I don't think this is too hard to figure out, but I am new to pandas and we are using it for a project in a class.

Answer 1

This will solve your problem: use .str.split(",", expand=True) and take the first column.

df_train["new_location"] = df_train["location"].str.split(",", expand=True)[0]
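As for what went wrong in the question's code: `new` is a Series of lists, so `new[0]` selects the list belonging to the first row rather than the first element of every row, and the `for` loop only recomputes the same Series on each pass. Below is a minimal sketch of an alternative that takes the first element per row with `.str[0]`. The sample data is made up for illustration (the real `Final_Train_Dataset.csv` is not shown in the question), and the `.str.strip()` call is an added assumption that stray whitespace around city names should be removed.

```python
import pandas as pd

# Hypothetical sample data standing in for Final_Train_Dataset.csv
df_train = pd.DataFrame(
    {"location": ["Mumbai, Delhi, Calcutta", "Bangalore", "Pune,Chennai"]}
)

# Split each string on commas, keep the first piece of every row,
# and strip surrounding whitespace (the strip is an assumption).
df_train["new_location"] = (
    df_train["location"].str.split(",").str[0].str.strip()
)

print(df_train["new_location"].tolist())
# ['Mumbai', 'Bangalore', 'Pune']
```

Rows without a comma are left unchanged, since splitting them yields a one-element list. To overwrite the original column instead of creating a new one, assign the result back to df_train["location"].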
