Pandas: Split a Dataframe into separate Dataframes based on certain Column's string values

Question

I haven't found any answers that I could apply to my problem, so here goes:

I have an initial dataframe of images that I would like to split into two, based on the description of that image, which is a string in the "Description" column.

My problem is that the descriptions are not written consistently. Here's an example of what I mean:

Some images are accelerated and others aren't. That's the criterion I want to use to split the dataset.

However, the descriptions vary even within the accelerated and non-accelerated groups.

My strategy would be to rename every string that has "ACC" in it - this would cover all accelerated images - to "ACCELERATED IMAGE".

Then I could do:

df_Accl = df[df.Description == "ACCELERATED IMAGE"]
df_NonAccl = df[df.Description != "ACCELERATED IMAGE"]

How can I achieve this? This is just a strategy that I came up with; if there's a more efficient way of doing this, feel free to suggest it.

Answer 1

You can use str.contains to build a boolean mask, then filter with boolean indexing.

To invert the mask, use ~, which selects the rows that do not contain ACC:

mask = df.Description.str.contains("ACC")
df_Accl = df[mask]
df_NonAccl = df[~mask]
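
To try the snippet above end to end, here is a minimal self-contained sketch. The sample descriptions are made up for illustration, since the question does not show the real ones; only the "Description" column name and the "ACC" substring come from the question.

```python
import pandas as pd

# Hypothetical sample data; the real descriptions are not shown in the question.
df = pd.DataFrame({
    "Description": ["T1 ACC SAG", "T1 SAG", "ACCELERATED T2", "T2 AXIAL"],
})

# True for every row whose description contains the substring "ACC".
mask = df["Description"].str.contains("ACC")

df_Accl = df[mask]      # rows that contain "ACC"
df_NonAccl = df[~mask]  # all remaining rows

print(df_Accl["Description"].tolist())
print(df_NonAccl["Description"].tolist())
```

This way there is no need to rename the descriptions first: the mask does the split directly and the original strings stay untouched.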

Answer 2

You can use contains to find the rows that contain the substring ACC:

df['Description'].str.contains('ACC')
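
If the descriptions may differ in letter case or some rows have no description at all, the case and na parameters of str.contains can help. This is only a sketch under those assumptions; the question does not say whether either situation occurs, and the sample rows below are invented.

```python
import pandas as pd

# Hypothetical sample data with mixed case and one missing description.
df = pd.DataFrame({
    "Description": ["t1 acc sag", None, "T2 AXIAL", "Accelerated T2"],
})

# case=False ignores letter case; na=False treats missing values as "no match",
# so the mask is purely boolean and safe to invert with ~.
mask = df["Description"].str.contains("ACC", case=False, na=False)

df_Accl = df[mask]
df_NonAccl = df[~mask]  # includes the row with the missing description
```

Keep in mind that a plain substring test also matches unrelated words that happen to contain "ACC", so it is worth checking a few rows of each result.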

2 answers

solution1
4 ACCEPTED 2018-11-18 17:56:48

solution2
0 2018-11-18 17:56:56