create new column Pandas df with str.contains gives: Length of values does not match length of index

Question

I've seen many similar questions, but I still haven't found the right answer.

My df has a column ['Naam'] containing the names of all kinds of stores. I want to categorize these by giving, for example, a grocery store the label 'Supermarkt' in a new column df['Type'].

I first did this:

df['Type'] = df['Naam'].str.contains('Albert')

This gives a boolean (True/False) Series.

After that I did this:

df['Type'] = df['Type'].replace({True: 'Supermarkt'})

That works, but it is not very smart: after writing another str.contains line for another shop, every value in df['Type'] obviously became a bool again.

Then I did this:

df['Type'] = (df['Naam'].str.contains('Albert'), 'Supermarkt')

My idea was that I would be able to reuse this code over and over with a different part of a string.

But.....

df['Type'] = (df['Naam'].str.contains('Albert'), 'Supermarkt')

gives an error:

Length of values does not match length of index. I think I understand what it means, but I can't figure out why the first str.contains() gives a full Series and this one gives an error.

So my question is: is there a way to alter df['Type'] = (df['Naam'].str.contains('Albert'), 'Supermarkt') so that True becomes 'Supermarkt' and all the False values stay in place or are replaced by something else?

Thanks in advance. Greetings Jan

Answer 1

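# create a selection: a boolean Series that is True where 'Naam' contains 'Albert'
boolean_indexer = df['Naam'].str.contains('Albert')

# set 'Type' only on the selected rows; all other rows are left untouched
# (they are NaN if the 'Type' column did not exist yet)
df.loc[boolean_indexer, 'Type'] = 'Supermarkt'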
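
As for the error itself: (df['Naam'].str.contains('Albert'), 'Supermarkt') is a tuple with two elements (a Series and a string), so pandas most likely tries to fit a sequence of length 2 into a column that has one value per row, and the lengths do not match. The plain str.contains() call works because it returns one value per row. To reuse the df.loc approach for several shops, a loop over a mapping is one option. The following is only a sketch: the sample data, the labels dict and the default label 'Overig' are made up for illustration; only the column names 'Naam' and 'Type' come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Naam' is from the question.
df = pd.DataFrame({
    "Naam": ["Albert Heijn", "Jumbo Foodmarkt", "Shell Station",
             "Albert Heijn XL", "Bakkerij Jansen"],
})

# Hypothetical mapping: part of the store name -> label for the 'Type' column.
labels = {
    "Albert": "Supermarkt",
    "Jumbo": "Supermarkt",
    "Shell": "Tankstation",
}

# Start with a default label (an assumption) so unmatched rows are not NaN.
df["Type"] = "Overig"

for fragment, label in labels.items():
    # regex=False treats the fragment as a literal string;
    # na=False makes missing names count as "no match".
    mask = df["Naam"].str.contains(fragment, regex=False, na=False)
    # Only the matching rows are overwritten; earlier labels stay in place.
    df.loc[mask, "Type"] = label

print(df)
```

Because each pass only writes to the rows selected by its own mask, adding another shop is just one more entry in the dict, and labels set earlier are not turned back into booleans.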

1 answer

solution1
1 ACCEPTED 2020-06-23 14:15:45