I am exploring the Titanic data set and want to create a column that groups similar names. For example, any name that contains "Charles" should show as "Ch", because I want to do some group-bys on that column later on. I wrote a function using the following code:
def cont(Name):
    for a in Name:
        if a.str.contains('Charles'):
            return('Ch')
and then applied using this:
titanic['namest']=titanic['Name'].apply(cont,axis=1)
Error: 'str' object has no attribute 'str'
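You can use the vectorized str.contains to get a boolean mask, and set every row that satisfies the condition to the desired value, rather than using a loop or apply: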
titanic.loc[titanic['Name'].str.contains('Charles'), 'namest'] = 'Ch'
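As a minimal sketch of the mask-based approach on a tiny made-up frame (the rows are invented for illustration and are not the real Titanic data): `na=False` is an optional extra that treats missing names as non-matches, and rows that do not match are simply left as NaN in the new column.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic frame; only the Name column matters here.
titanic = pd.DataFrame({
    "Name": [
        "Williams, Mr. Charles Eugene",
        "Braund, Mr. Owen Harris",
        None,  # a missing name, to show what na=False does
        "Fry, Mr. Charles Albert",
    ]
})

# Boolean mask: True where the name contains "Charles"; missing names count as False.
mask = titanic["Name"].str.contains("Charles", na=False)

# Assign only the matching rows; the new column is NaN everywhere else.
titanic.loc[mask, "namest"] = "Ch"

print(titanic)
```

If the non-matching rows should keep the original name instead of NaN, one option is to create the column first with `titanic['namest'] = titanic['Name']` and then apply the `.loc` assignment.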
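apply will call the cont function and pass it the values from the Name column one at a time. That means the Name variable inside the cont function is already a plain string, which has no .str accessor (hence the error), and looping over it with for only iterates over its characters.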
Also note that a function used with apply should return a value for every input; otherwise the rows that don't match end up with a missing value. So in case the name doesn't contain 'Charles', the name itself is returned.
Finally, the Series apply method doesn't have an axis keyword argument (axis belongs to DataFrame.apply), so drop axis=1 from the call.
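A small sketch to illustrate the two points about apply, using two made-up names; `cont_without_fallback` is a hypothetical variant written only to show what happens when the fallback return is missing.

```python
import pandas as pd

names = pd.Series(
    ["Williams, Mr. Charles Eugene", "Braund, Mr. Owen Harris"],
    name="Name",
)

# Series.apply hands the function one element at a time,
# so each argument is a plain Python str, not a Series.
seen_types = names.apply(lambda value: type(value).__name__).tolist()
print(seen_types)

def cont_without_fallback(name):
    # Hypothetical variant: no return for names that do not match,
    # so Python implicitly returns None for them.
    if "Charles" in name:
        return "Ch"

result = names.apply(cont_without_fallback).tolist()
print(result)
```

The second name comes back as a missing value rather than the original string, which is why a fallback return is needed.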
With those fixes, the function becomes:
def cont(Name):
    if 'Charles' in Name:
        return 'Ch'
    return Name
You don't even need to define a named function; a lambda does the same thing:
titanic['namest'] = titanic['Name'].apply(lambda x: 'Ch' if 'Charles' in x else x)
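Since the goal in the question is to group on the new column later, here is a rough sketch of that follow-up step. The three rows are invented for illustration, and counting rows per group with `size()` is just one assumed example of a group-by.

```python
import pandas as pd

# Made-up rows standing in for the Titanic data.
titanic = pd.DataFrame({
    "Name": [
        "Williams, Mr. Charles Eugene",
        "Braund, Mr. Owen Harris",
        "Fry, Mr. Charles Albert",
    ]
})

# Same lambda as above: collapse every "Charles" to "Ch", keep other names as-is.
titanic["namest"] = titanic["Name"].apply(lambda x: "Ch" if "Charles" in x else x)

# Count how many rows fall into each group of the new column.
counts = titanic.groupby("namest").size()
print(counts)
```

Note that the lambda assumes every name is a string; a missing name would raise a TypeError on the `in` check, so missing values would need to be filled or dropped first.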