
In Python, how to use split('_') on a column with dtype object?

I am working on a data frame in which one column is of type object:

example: name 36512 non-null object

I have tried the following with no success:

> name_str = autos['name'].to_string()
> print(type(name_str))   # this makes the name_str as type string
> autos['name'] = name_str  # putting it back in the data frame brought it back to type object

I also tried the following:

> import json
> autos['name'] = json.dumps(name_str)

My goal is to extract the first two words using split('_'), but I am unable to do so unless the type is string.

example: BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik

output: ['BMW', '740i'] in a new column

import pandas as pd

df = pd.DataFrame({'name':['BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik']})

df['new'] = df['name'].str.split('_').str[:2]

print(df)

Output:

                                         name          new
0  BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik  [BMW, 740i]
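If the two words are wanted as separate columns rather than as a list, one possible variant is sketched below. The column names `make` and `model` are made up for illustration and do not come from the question; the sketch also assumes every name contains at least one underscore.

```python
import pandas as pd

df = pd.DataFrame({'name': ['BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik']})

# n=2 stops after two splits, so each row yields at most three pieces;
# expand=True returns the pieces as columns 0, 1, 2 of a new DataFrame
parts = df['name'].str.split('_', n=2, expand=True)

# 'make' and 'model' are illustrative names, not from the question
df['make'] = parts[0]
df['model'] = parts[1]

print(df[['make', 'model']])
```

Limiting the split with `n=2` avoids cutting the rest of the name into pieces that are discarded anyway.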

Note that a column of strings is considered an "object" dtype in pandas, so you should already have the right dtype.
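A quick way to check this is to look at the type of an individual element rather than the dtype of the column. The sketch below uses a small stand-in for the `autos` frame; the second row is an invented value added only for illustration. It also shows why the `to_string()` attempt could not work: that method renders the whole Series as one piece of text, so assigning it back would put the same text in every row.

```python
import pandas as pd

# Stand-in for the asker's frame; the second row is a made-up value
autos = pd.DataFrame({'name': ['BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik',
                               'Audi_A4_Avant']})

# The column dtype describes the container; the elements are plain Python str
print(autos['name'].dtype)
print(type(autos['name'].iloc[0]))

# to_string() returns ONE string for the entire Series (index included)
whole = autos['name'].to_string()
print(type(whole))

# If a column might mix strings with other values, astype(str) converts
# each element to its string form before the .str accessor is used
autos['name'] = autos['name'].astype(str)
autos['new'] = autos['name'].str.split('_').str[:2]
print(autos['new'].tolist())
```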


 