I am trying to clean the location data of a data set, and some of the locations have multiple cities separated by commas. I want to split the strings that contain commas on the comma and then replace each string with the first element of the split (e.g., "Mumbai, Delhi, Calcutta" should become just "Mumbai"). This is the code I wrote to try to do it. Can someone tell me what I am doing wrong?
df_train = pd.read_csv("Final_Train_Dataset.csv", index_col= None)
for cell in df_train["location"]:
    new = df_train["location"].str.split(",")
    df_train["new_location"] = new[0]
df_train["new_location"].head()
Any help is much appreciated. I don't think this is too hard to figure out, but I am new to pandas and we are using it for a project in a class.
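This will solve your problem: pass the comma as the separator to .str.split together with expand=True, then take the first column.
df_train["new_location"] = df_train["location"].str.split(",", expand=True)[0]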
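For context on why the loop in the question does not work: `new` is a Series holding one list per row, so `new[0]` selects the row labelled 0 (the list for the first row), not the first element of every row. The loop variable `cell` is also never used. A minimal sketch of an alternative that avoids the loop is below. The sample DataFrame is made up to stand in for Final_Train_Dataset.csv, and the `.str.strip()` call is an extra step that assumes stray whitespace around city names should be removed.

```python
import pandas as pd

# Hypothetical sample data standing in for Final_Train_Dataset.csv
df_train = pd.DataFrame(
    {"location": ["Mumbai, Delhi, Calcutta", "Pune", "Chennai , Bangalore"]}
)

# .str.split(",") produces a list per row; .str[0] then takes the first
# element of each list, and .str.strip() trims surrounding whitespace.
df_train["new_location"] = (
    df_train["location"].str.split(",").str[0].str.strip()
)

print(df_train["new_location"].tolist())
```

Rows without a comma are left as a single-element list, so `.str[0]` returns the original city for them.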