
Lambda function notation in Pandas

I received a wonderful lambda function from a user a while ago.

actresses_modified['Winner_Count'] = actresses_modified.apply(lambda x: actresses_modified.Name.value_counts()[x.Name], axis=1)

The data frame to which it is applied looks like this:

    Year    Award           Winner  Name
2   1928    Best Actress    0.0     Louise Dresser
3   1928    Best Actress    1.0     Janet Gaynor
4   1928    Best Actress    0.0     Gloria Swanson
40  1929    Best Actress    0.0     Ruth Chatterton
41  1929    Best Actress    0.0     Betty Compson

The problem is that I have forgotten how it works (I had to step away from this "for fun" project) and, more specifically, exactly what is going on with [x.Name].

The line actresses_modified.Name.value_counts() by itself gives me the count of all actress names in the data frame. What does [x.Name] mean in English? How does it manage to tally up all of the 1s next to each person's name in the data frame's Winner column and return a correct tally of the total number of wins? Of equal importance, does this type of syntax have a name? My Google searches turned up nada.

Any thoughts would be appreciated.

I'm not sure I made myself clear in the comment, so here is a fuller explanation. The apply method "Applies function along input axis of DataFrame." With axis=1, that means the function is called on each row. For simplicity's sake, let's say that we have a collection of Actress objects called actresses_modified that looks like this:

   actresses_modified = [<Actress>, <Actress>, <Actress>, <Actress>]

Let's assume that this is how Actress is defined:

class Actress:
    Name = "Some String"

Our lambda function is then applied to each actress in the collection, with that actress passed in as x. Meanwhile, value_counts() returns an "object containing counts of unique values."
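To make the analogy concrete, here is a minimal sketch of what apply hands to the lambda when axis=1. The two-row frame is made-up sample data in the shape of the question's table, and the variable names are only illustrative. Each x is a pandas Series holding one row, so x.Name reads that row's Name column.

```python
import pandas as pd

# Hypothetical two-row sample in the shape of the question's data frame.
df = pd.DataFrame({
    "Year": [1928, 1928],
    "Winner": [0.0, 1.0],
    "Name": ["Louise Dresser", "Janet Gaynor"],
})

# With axis=1 the lambda receives each row as a Series called x.
row_types = df.apply(lambda x: type(x).__name__, axis=1)

# x.Name is attribute-style access to the row's "Name" entry.
row_names = df.apply(lambda x: x.Name, axis=1)

print(row_types.tolist())
print(row_names.tolist())
```

The same value is available as x["Name"], which avoids any clash with built-in Series attributes such as the lowercase name.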

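So when we index the result of value_counts() with each actress's name, we are looking up that actress's count by key. Let's pretend that value_counts() returns a dict of actress names and their counts (in pandas it is actually a Series indexed by name, which supports the same kind of lookup) and that it looks like this: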

counts = {
    'Jane Doe': 1,
    'Betty Ross': 3,
}

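Suppose our first Actress object has the Name "Jane Doe". When we call value_counts()[x.Name] we're doing counts["Jane Doe"], which returns 1. The square brackets are ordinary indexing (subscript) syntax: a lookup by label, just like a dict lookup by key. Note that this counts how many times each name appears in the data frame; it never reads the Winner column.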
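The dict picture can be checked against pandas directly. The following is a sketch on made-up data (the five rows below are hypothetical, not the asker's full data set), and the names counts, mapped and wins are introduced only for illustration. It shows the lookup by label, the original lambda, an equivalent that avoids recomputing the counts for every row, and, for comparison, one possible way to total the Winner column per name.

```python
import pandas as pd

# Hypothetical sample data in the shape of the question's data frame.
actresses_modified = pd.DataFrame({
    "Year": [1928, 1928, 1928, 1929, 1930],
    "Winner": [0.0, 1.0, 0.0, 0.0, 1.0],
    "Name": ["Louise Dresser", "Janet Gaynor", "Gloria Swanson",
             "Gloria Swanson", "Janet Gaynor"],
})

# A Series: index = unique names, values = number of rows per name.
counts = actresses_modified.Name.value_counts()

# Square brackets look a value up by its index label, like a dict key.
gaynor_rows = counts["Janet Gaynor"]

# The original lambda: look up each row's name in the counts Series.
# It recomputes value_counts() for every row, which is slow on large frames.
actresses_modified["Winner_Count"] = actresses_modified.apply(
    lambda x: actresses_modified.Name.value_counts()[x.Name], axis=1
)

# Same result without the per-row recomputation: map each name to its count.
mapped = actresses_modified["Name"].map(counts)

# For an actual tally of wins, sum the Winner column within each name.
wins = actresses_modified.groupby("Name")["Winner"].transform("sum")

print(actresses_modified.assign(Rows_Per_Name=mapped, Wins=wins))
```

In this sample Gloria Swanson appears in two rows but has no wins, so the Winner_Count column and the grouped sum disagree for her, which is the difference between counting rows and counting wins.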

