
Using a Pandas Dataframe, how can I split the strings in a specific column and then replace that string with the first index of the split?

I am trying to clean the location data of a data set, and some of the locations have multiple cities separated by commas. I want to split the strings that contain commas and then replace each string with the first element of the split (e.g. "Mumbai, Delhi, Calcutta" should become just "Mumbai"). This is the code I wrote to try to do it. Can someone tell me what I am doing wrong?

df_train = pd.read_csv("Final_Train_Dataset.csv", index_col= None)

for cell in df_train["location"]:
  new = df_train["location"].str.split(",")
df_train["new_location"] = new[0]
df_train["new_location"].head()

Any help is much appreciated. I don't think this is too hard to figure out, but I am new to pandas and we are using it for a class project.

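This will solve your problem: use .str.split(",", expand=True) and take the first column. Note that the separator has to be passed explicitly, because without it the split happens on whitespace and the comma stays attached to the first city.

df_train["new_location"] = df_train["location"].str.split(",", expand=True)[0]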
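To see why the loop in the question does not work: each iteration recomputes the same split for the whole column, and `new[0]` then selects the list for row 0 rather than the first city of every row. Below is a minimal sketch of a vectorized alternative. The sample data is made up to stand in for Final_Train_Dataset.csv, and the `.str.strip()` call is an extra step that assumes you also want surrounding whitespace removed.

```python
import pandas as pd

# Hypothetical sample data standing in for Final_Train_Dataset.csv
df_train = pd.DataFrame(
    {"location": ["Mumbai, Delhi, Calcutta", "Bangalore", "Pune , Chennai", None]}
)

# Split every string on commas, keep the first piece of each row,
# and trim stray whitespace. No explicit loop is needed, and
# missing values stay missing.
df_train["new_location"] = (
    df_train["location"].str.split(",").str[0].str.strip()
)

print(df_train[["location", "new_location"]])
```

Here `.str[0]` indexes into each row's list, which is what the original `new[0]` was meant to do. Rows without a comma are returned unchanged, since splitting them produces a one-element list.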

