String contains function in Python within a pandas DataFrame?

I'm very new to Python, so there may be a simple solution here. I'm trying to clean a data set about rent prices and square footage in a pandas DataFrame. My bedrooms column includes information about bedrooms AND square feet. Most of the entries are formatted like "/ 1br - 950ft²", but some are "/ 1br" and some are "/950ft²". I'm trying to create a clean column with just bedrooms, but because the formatting varies I can't simply split the string after a certain character.

I've decided I need a function that tests whether the string contains "br", but I'm getting an error.

Here's my code:

def cleaned_bedrooms(x):
    if df[df['bedrooms'].str.contains('br')]:
        df['bedrooms'] = df['bedrooms'].str.split('-').str[0]
    else:
        return None
df['bedrooms'].map(cleaned_bedrooms)

I seem to have set up a boolean test on the whole DataFrame (I assume triggered by the if statement), because the error I'm getting is "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()." on the line containing .map(cleaned_bedrooms).

If this is your dataframe,

    bedrooms
0   / 1br - 950ft²
1   / 1br
2   /950ft²

You can use str.extract with a regular expression to pull out the bedroom count. Rows without a match become NaN:

df['bedrooms'] = df['bedrooms'].str.extract(r'(\d+br)', expand=False)

You get

    bedrooms
0   1br
1   1br
2   NaN
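
As for the error itself: inside the function, df[df['bedrooms'].str.contains('br')] is a filtered DataFrame, and an if statement cannot reduce a whole DataFrame to a single True or False. When a function is passed to .map, x is one cell value, so the check can be done on x directly. The sketch below is one possible way to put both approaches side by side; the sample frame is rebuilt from the three rows above, and the variable names extracted and mapped are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the three example rows above.
df = pd.DataFrame({"bedrooms": ["/ 1br - 950ft²", "/ 1br", "/950ft²"]})

# Option 1: vectorized extraction. Rows with no "<digits>br" become NaN.
extracted = df["bedrooms"].str.extract(r"(\d+br)", expand=False)


# Option 2: a row-wise function. Here x is a single cell value,
# not the DataFrame, so a plain substring test works.
def cleaned_bedrooms(x):
    if isinstance(x, str) and "br" in x:
        # Keep the part before "-" and trim the leading "/ " and spaces.
        return x.split("-")[0].strip(" /")
    return None


mapped = df["bedrooms"].map(cleaned_bedrooms)

print(extracted)
print(mapped)
```

Both versions leave a missing value for the "/950ft²" row. The str.extract version is usually the shorter choice; the row-wise function is closer to the original attempt and may be easier to extend with extra rules.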
