We have a pandas DataFrame df and a set of values set_vals. For a particular column (let's say 'name'), I would now like to compute a new column which is True whenever the value of df['name'] is in set_vals and False otherwise.
One way to do this is to write:
df['name'].apply(lambda x: x in set_vals)
but when both df and set_vals become large, this method is very slow. Is there a more efficient way of creating this new column?
The real problem is that the complexity of df['name'].apply(lambda x: x in set_vals) is O(M*N), where M is the length of df and N is the length of set_vals, if set_vals is a list (or another type for which membership testing is linear). The complexity can be improved to O(M) if set_vals is hashed (turned into a set or dict), because then each membership test is O(1).
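A minimal sketch of the hashing idea, using made-up example data (the names and values are assumptions for illustration). pandas also ships Series.isin, a vectorized built-in that performs the same hashed membership check:

```python
import pandas as pd

# Hypothetical example data; column contents are assumptions for illustration
df = pd.DataFrame({"name": ["alice", "bob", "carol", "dave"]})
set_vals = ["alice", "carol"]  # a list: each membership test is O(N)

# Hash the values once so each lookup is O(1), giving O(M) overall
hashed_vals = set(set_vals)
df["in_set"] = df["name"].apply(lambda x: x in hashed_vals)

# Vectorized equivalent: pandas hashes the values internally
df["in_set_vec"] = df["name"].isin(set_vals)
```

Either way, the one-time cost of hashing set_vals is amortized over all M lookups.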
Another option is MapReduce. In general, it is a programming model for processing big data in parallel on multiple nodes. For this problem the idea is simple: split the loop into chunks, let's say [0:i], [i+1:j], [j+1:k], and so on, run the membership test for each chunk in its own thread (Python's concurrent.futures module makes this straightforward), and then combine the partial results.
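A minimal sketch of this chunk-and-combine idea using a thread pool; the example data and chunk count are assumptions. Note that for a plain membership test, hashing set_vals into a set is usually faster than adding threads, so treat this as an illustration of the pattern rather than a recommended optimization:

```python
import numpy as np
import pandas as pd
from concurrent.futures import ThreadPoolExecutor

# Hypothetical example data; names and worker count chosen for illustration
df = pd.DataFrame({"name": ["alice", "bob", "carol", "dave", "erin", "frank"]})
set_vals = {"alice", "carol", "erin"}  # already hashed for O(1) lookups

def check_chunk(chunk):
    # "Map" step: membership test over one contiguous slice of the column
    return chunk.isin(set_vals)

# Split the column into contiguous chunks: [0:i], [i:j], [j:k]
chunks = np.array_split(df["name"], 3)

with ThreadPoolExecutor(max_workers=3) as pool:
    partial = list(pool.map(check_chunk, chunks))

# "Reduce" step: reassemble the partial results, aligned by index
df["in_set"] = pd.concat(partial)
```

For CPU-bound work like this, a process pool (ProcessPoolExecutor) avoids Python's GIL, at the cost of serializing each chunk between processes.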