str.contains to create new column in pandas dataframe

Question

I am exploring the Titanic data set and want to create a column that groups similar names. For example, any name that contains "Charles" should show as "Ch", because I want to group by this column later on. I created a function using the following code:

def cont(Name):
    for a in Name:
        if a.str.contains('Charles'):
            return('Ch')

and then applied using this:

titanic['namest']=titanic['Name'].apply(cont,axis=1)

Error: 'str' object has no attribute 'str'

notebook_link

Answer 1

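Instead of using a loop or apply, you can use the vectorized str.contains to return a boolean mask and set all rows that meet the condition to the desired value: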

titanic.loc[titanic['Name'].str.contains('Charles'), 'namest'] = 'Ch'
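
As a self-contained illustration of the mask approach, here is a minimal sketch. The four-row frame is a hypothetical stand-in for the real Titanic data, and na=False is an added assumption so that missing names count as non-matches rather than producing NaN in the mask.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data; only the 'Name' column matters here.
titanic = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
        "Hays, Mr. Charles Melville",
        "Williams, Mr. Charles Eugene",
    ]
})

# Boolean mask: True where the name contains the substring 'Charles'.
# na=False (an assumption, not in the original answer) treats missing
# names as non-matches instead of propagating NaN into the mask.
mask = titanic["Name"].str.contains("Charles", na=False)

# Assign 'Ch' only to the matching rows; the new column is NaN elsewhere.
titanic.loc[mask, "namest"] = "Ch"

print(titanic)
```

Note that rows that do not match are left as NaN in namest, so they need to be filled separately if the original name should be kept there.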

Answer 2

apply calls the cont function and passes it the values from the Name column one at a time. That means the Name variable inside cont is already a plain string, which is why calling .str on it raises the error.

Also note that the function used by apply should return a value for every input (otherwise it returns None for those rows), so when the name doesn't contain 'Charles', the name itself is returned.

Second, the Series apply method doesn't have an axis keyword argument.

def cont(Name):
    if 'Charles' in Name:
        return 'Ch'
    return Name

You don't even need to define a named function:

titanic['namest'] = titanic['Name'].apply(lambda x: 'Ch' if 'Charles' in x else x)
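
To check that apply really hands the function one string at a time, the following sketch runs cont on a hypothetical two-row frame. The isinstance assertion and the Series.mask comparison are additions for illustration only and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample; the real Titanic frame has many more rows and columns.
titanic = pd.DataFrame(
    {"Name": ["Hays, Mr. Charles Melville", "Braund, Mr. Owen Harris"]}
)

def cont(name):
    # Series.apply passes each element on its own, so `name` is a plain str
    # and the ordinary `in` operator is the right substring test.
    assert isinstance(name, str)
    if "Charles" in name:
        return "Ch"
    return name

titanic["namest"] = titanic["Name"].apply(cont)

# Vectorized comparison: replace the value with 'Ch' where the mask is True,
# keep the original name everywhere else.
vectorized = titanic["Name"].mask(
    titanic["Name"].str.contains("Charles", na=False), "Ch"
)

print(titanic["namest"].tolist())
print(vectorized.tolist())
```

Both lists should be identical. On a column with missing names the apply version would need an extra check, because `in` fails on NaN.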

solution1
7 ACCEPTED 2016-04-15 19:30:31

solution2
3 2016-04-15 17:42:36
