
Can anyone explain why this apply() call isn't working?

This doesn't work:

def rator(row):
    if row['country'] == 'Canada':
        row['stars'] = 3
    elif row['points'] >= 95:
        row['stars'] = 3
    elif row['points'] >= 85:
        row['stars'] = 2
    else:
        row['stars'] = 1
    return row

with_stars = reviews.apply(rator, axis='columns')

But this works:

def rator(row):
    if row['country'] == 'Canada':
        return 3
    elif row['points'] >= 95:
        return 3
    elif row['points'] >= 85:
        return 2
    else:
        return 1

with_stars = reviews.apply(rator, axis='columns')

I'm practicing on Kaggle, and reading through their tutorial as well as the documentation. I am a bit confused by the concept.

I understand that the apply() method acts on an entire row of a DataFrame, while map() acts on each element in a column. And that it's supposed to return a DataFrame, while map() returns a Series.

Just not sure how the mechanics work here, since it's not letting me return rows inside the function...

some of the data:

  | country | description | designation | points | price | province | region_1 | region_2 | taster_name | taster_twitter_handle | title | variety | winery
0 | Italy | Aromas include tropical fruit, broom, brimston... | Vulkà Bianco | -1.447138 | NaN | Sicily & Sardinia | Etna | NaN | Kerin O’Keefe | @kerinokeefe | Nicosia 2013 Vulkà Bianco (Etna) | White Blend | Nicosia
1 | Portugal | This is ripe and fruity, a wine that is smooth... | Avidagos | -1.447138 | 15.0 | Douro | NaN | NaN | Roger Voss | @vossroger | Quinta dos Avidagos 2011 Avidagos Red (Douro) | Portuguese Red | Quinta dos Avidagos

Index(['country', 'description', 'designation', 'points', 'price', 'province',
       'region_1', 'region_2', 'taster_name', 'taster_twitter_handle', 'title',
       'variety', 'winery'],
      dtype='object')

https://www.kaggle.com/residentmario/summary-functions-and-maps

The official docs state:

Functions that mutate the passed object can produce unexpected behavior or errors and are not supported

It seems that you are doing something explicitly forbidden. Try adding the stars data after calling apply instead.
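A minimal sketch of that suggestion follows. The small frame stands in for the Kaggle reviews data and its values are invented for illustration. rator only reads from the row and returns a number, and the new column is attached once apply has finished:

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle `reviews` frame; the values are invented.
reviews = pd.DataFrame({
    "country": ["Italy", "Canada", "Portugal", "US"],
    "points": [87, 80, 96, 84],
})

def rator(row):
    # Only reads from the row; never assigns into it.
    if row["country"] == "Canada":
        return 3
    if row["points"] >= 95:
        return 3
    if row["points"] >= 85:
        return 2
    return 1

# With a scalar-returning function, apply yields a Series: one value per row.
stars = reviews.apply(rator, axis="columns")

# Attach the result afterwards; assign returns a new frame and leaves
# `reviews` itself unchanged.
with_stars = reviews.assign(stars=stars)
print(with_stars)
```

Using assign here is just one option; plain column assignment on a copy of the frame would do the same job.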

You shouldn't use apply with a function that modifies the input. You could change your code to this:

def rator(row):
    new_row = row.copy()
    if row['country'] == 'Canada':
        new_row['stars'] = 3
    elif row['points'] >= 95:
        new_row['stars'] = 3
    elif row['points'] >= 85:
        new_row['stars'] = 2
    else:
        new_row['stars'] = 1
    return new_row

with_stars = reviews.apply(rator, axis='columns')

However, it's simpler to return just the value you care about than to return an entire row just to set one column. If you write rator to return a single value but still want a full DataFrame, you can do with_stars = reviews.copy() and then with_stars['stars'] = reviews.apply(rator, axis='columns') . Also, when an if branch ends with a return, the next test can be a plain if rather than elif . You can also simplify the points thresholds with the pandas cut function ( pd.cut ).
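As a rough illustration of the cut idea, here is one way it might look. The sample frame is invented, and the sketch assumes the points column has no missing values. pd.cut only handles the points thresholds, so the Canada rule is applied as a separate step:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Kaggle `reviews` frame; the values are invented.
reviews = pd.DataFrame({
    "country": ["Italy", "Canada", "Portugal", "US"],
    "points": [87, 80, 96, 84],
})

# Bin the points: below 85 -> 1, 85 up to (not including) 95 -> 2, 95+ -> 3.
# right=False makes each bin closed on the left, matching the >= tests in rator.
# astype(int) assumes there are no NaN points (an assumption of this sketch).
stars = pd.cut(
    reviews["points"],
    bins=[-np.inf, 85, 95, np.inf],
    labels=[1, 2, 3],
    right=False,
).astype(int)

# The country rule is not a range on points, so override it separately.
stars = stars.mask(reviews["country"] == "Canada", 3)

with_stars = reviews.copy()
with_stars["stars"] = stars
print(with_stars)
```

This avoids calling a Python function once per row, which tends to matter on a frame as large as the full reviews data.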
