Python - referencing dataframe column using str() instead of quotations

My code is a function that calculates the conjunctive probability that every country flag in a sample contains a specific colour. It is from a DataQuest exercise. The exercise only asks for the probability of three flags containing red, but I wanted to challenge myself and write a function that works for any n and any colour.

flags is a dataframe with one column per colour. A value is 1 if the flag contains that colour and 0 if it does not.

import numpy as np

def conjunctive_probability(n, colour):
    total_count = flags.shape[0]
    colour_picked = flags[flags[str(colour)] == 1].shape[0]
    p = 0
    probabilities = []

    for p in range(n):
        probability = colour_picked / total_count
        probabilities.append(probability)
        colour_picked -= 1
        total_count -= 1
        p += 1
    return np.prod(np.array(probabilities))

three_red = conjunctive_probability(3, red)

I get an error on line 5 (colour_picked). If I type in a colour there, such as:

colour_picked = flags[flags['red'] == 1].shape[0]

it works.

But I don't understand why str() doesn't work. It gives me:

KeyError: '153'

which is the number of flags that have the colour red.

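The str() call isn't the problem. A KeyError means the key you looked up does not exist. In this specific error, flags['153'] does not exist, meaning the flags dataframe has no column named '153'. Note that your call passes the bare name red rather than the string 'red', so red is most likely a variable you defined earlier that holds 153, and str(red) turns it into '153'. Calling conjunctive_probability(3, 'red') should look up the column you intended. I can't confirm this, because your post doesn't show where flags or red is defined.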
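
To illustrate the point, here is a minimal sketch of the function called with the column name as a string. The small flags dataframe below is a made-up stand-in for the DataQuest data (its values and the name column are assumptions), and the final lines reproduce the reported error under the assumption that red was a variable holding a count.

```python
import pandas as pd

# Hypothetical stand-in for the DataQuest flags data: 1 = colour present.
flags = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "red": [1, 1, 1, 0, 1],
    "green": [0, 1, 0, 1, 0],
})


def conjunctive_probability(n, colour, df=flags):
    """Probability that n flags drawn without replacement all contain `colour`."""
    total_count = len(df)
    # `colour` must be a column name such as "red", not a number.
    colour_picked = int((df[colour] == 1).sum())
    probability = 1.0
    for _ in range(n):
        probability *= colour_picked / total_count
        colour_picked -= 1
        total_count -= 1
    return probability


# Pass the column name as a string.
three_red = conjunctive_probability(3, "red")
print(three_red)

# Assumed cause of the original error: a variable named red holding a count.
red = int(flags["red"].sum())
try:
    flags[str(red)]  # looks up a column named after the count, which does not exist
except KeyError as err:
    print("KeyError:", err)
```

With the string 'red' the lookup finds the column. With a numeric variable, str() produces a column name that is not in the dataframe, which matches the KeyError in the question.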
