Looping over a string list in Python

Question

I have a list of different colours, all stored as strings.

Preferredcolours = ['red','yellow','green', 'blue']

I have a pandas DataFrame which contains information about cars. One of its columns, DfCar['colour'], holds the colours of these cars. I want to create a new column in my DataFrame, named PreferredMathcing, which equals 1 if the value in the colour column matches one of the colours in the list, and 0 otherwise. How can I use a for loop to solve this?

Ideally, I would like a result like this:

+=================+============================+
| DfCar['colour'] | DfCar['PreferredMathcing'] |
+=================+============================+
| white           |                          0 |
+-----------------+----------------------------+
| yellow          |                          1 |
+-----------------+----------------------------+
| black           |                          0 |
+-----------------+----------------------------+
| purple          |                          0 |
+-----------------+----------------------------+
| green           |                          1 |
+-----------------+----------------------------+

Answer 1

You can use .isin(), which returns a Series of True/False values indicating, for each row, whether its value is in a given list. Then use .astype(int) to get 1/0 instead.

Try this:

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red','yellow','green', 'blue']

df["PreferredMathcing"] = df['colour'].isin(Preferredcolours).astype(int)

print(df)

Output:

   colour  PreferredMathcing
0   white                  0
1  yellow                  1
2   black                  0
3  purple                  0
4   green                  1

NOTE:

A solution built on a vectorised library function will likely outperform one that uses apply with custom Python logic.

Benchmarking the two against each other on my machine suggests .isin() is almost 8 times faster:

with '.isin()': 1.0591506958007812
with '.apply()': 8.234664678573608
ratio: 7.774780974248154
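A comparison along these lines can be sketched with timeit. The row count, the number of repetitions and the random test data below are assumptions chosen for illustration, not the setup that produced the figures above, so the absolute timings and the ratio will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

Preferredcolours = ['red', 'yellow', 'green', 'blue']

# Hypothetical test data: 100,000 randomly chosen colours (size is arbitrary).
rng = np.random.default_rng(0)
all_colours = ['white', 'yellow', 'black', 'purple', 'green', 'red', 'blue']
df = pd.DataFrame({'colour': rng.choice(all_colours, size=100_000)})


def with_isin():
    # Vectorised membership test, then True/False -> 1/0.
    return df['colour'].isin(Preferredcolours).astype(int)


def with_apply():
    # Python-level function called once per element.
    return df['colour'].apply(lambda x: 1 if x in Preferredcolours else 0)


# Both approaches should agree element by element before they are timed.
assert (with_isin() == with_apply()).all()

t_isin = timeit.timeit(with_isin, number=10)
t_apply = timeit.timeit(with_apply, number=10)
print("with '.isin()':", t_isin)
print("with '.apply()':", t_apply)
print("ratio:", t_apply / t_isin)
```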

Answer 2

The following will give you the desired output:

def check_colour(x, Preferredcolours):
    # x is a single row of the DataFrame (axis=1 passes rows to the function)
    return 1 if x['colour'] in Preferredcolours else 0

DfCar['PreferredMathcing'] = DfCar.apply(check_colour, args=(Preferredcolours,), axis=1)

Answer 3

You can use np.where, as shown below:

import pandas as pd
import numpy as np

DfCar = pd.DataFrame.from_dict({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red','yellow','green', 'blue']

DfCar['PreferredMathcing'] = np.where(DfCar['colour'].isin(Preferredcolours), 1, 0)

Answer 4

Assuming DfCar is your DataFrame:

Preferredcolours = ['red','yellow','green', 'blue']    
DfCar['PreferredMatching'] = DfCar['colour'].apply(lambda x: x in Preferredcolours)

This applies the lambda function to every element in your "colour" column. It simply checks whether the element is in Preferredcolours and returns True or False.
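Since the question asks specifically about a for loop and about 1/0 values rather than True/False, both can be sketched as follows. The sample data is borrowed from Answer 1, and the variable names flags and as_int are only illustrative. For large frames the vectorised .isin() approach is likely to be faster than either variant.

```python
import pandas as pd

# Sample data taken from Answer 1.
DfCar = pd.DataFrame({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red', 'yellow', 'green', 'blue']

# Explicit for loop: build the list of flags one row at a time.
flags = []
for colour in DfCar['colour']:
    flags.append(1 if colour in Preferredcolours else 0)
DfCar['PreferredMathcing'] = flags

# The lambda version, cast from True/False to 1/0.
as_int = DfCar['colour'].apply(lambda x: x in Preferredcolours).astype(int)

print(DfCar)
print(as_int.tolist())  # [0, 1, 0, 0, 1]
```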

4 answers

solution1
1 ACCEPTED 2019-06-24 12:47:48

solution2
1 2019-06-24 12:48:31

solution3
0 2019-06-24 12:46:19

solution4
0 2019-06-24 12:49:27
