I'm new to Python and am working with a pandas dataframe (I can't use any packages besides pandas). I have taken user input (make, model, type, rating) describing 6 different cars:
     make    model   type rating
0    ford  mustang  coupe      A
1   chevy   camaro  coupe      B
2    ford   fiesta  sedan      C
3    ford    focus  sedan      A
4    ford   taurus  sedan      B
5  toyota    camry  sedan      B
I wanted conditional probabilities for this data, which I computed by dividing the grouped value_counts by the grouped counts:
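# Count of each type within each rating, then the total number of rows per rating
print(df.groupby('rating')['type'].value_counts())
print(df.groupby('rating')['type'].count())
# Dividing the two gives P(type | rating); reset_index turns the result into a dataframe
conditional = (df.groupby('rating')['type'].value_counts() / df.groupby('rating')['type'].count()).reset_index(name="cond")
print(conditional)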
Which resulted in the conditional probabilities I was looking for:
  rating   type      cond
0      A  coupe  0.500000
1      A  sedan  0.500000
2      B  sedan  0.666667
3      B  coupe  0.333333
4      C  sedan  1.000000
Now I need to print individual probabilities. How would I go about selecting individual probabilities here based on conditions in the 'make' and 'model' columns?
For example, in the conditional probability dataframe, P(type=sedan | rating=B) = 0.666667. I want to select and print this individual probability, not by positional index (such as index 2 of the "cond" column), but by selecting the value in "cond" where rating = B and type = sedan.
If I understand correctly, you can get this with crosstab and its normalize argument:
pd.crosstab(df.rating,df.type,normalize='index').stack().reset_index()
Out[36]:
  rating   type         0
0      A  coupe  0.500000
1      A  sedan  0.500000
2      B  coupe  0.333333
3      B  sedan  0.666667
4      C  coupe  0.000000
5      C  sedan  1.000000
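To pick out a single probability by its labels rather than by row position, one option is to keep the crosstab in its wide form and look the cell up with .loc, or to filter the long form with a boolean mask. The sketch below rebuilds the sample data from the question so that it runs on its own; the names table, long_form, mask, p_sedan_given_b and p_from_mask are only illustrative, and the "cond" column name is borrowed from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "make": ["ford", "chevy", "ford", "ford", "ford", "toyota"],
    "model": ["mustang", "camaro", "fiesta", "focus", "taurus", "camry"],
    "type": ["coupe", "coupe", "sedan", "sedan", "sedan", "sedan"],
    "rating": ["A", "B", "C", "A", "B", "B"],
})

# Rows are ratings, columns are types. normalize="index" makes each row
# sum to 1, so every cell is P(type | rating).
table = pd.crosstab(df["rating"], df["type"], normalize="index")

# Option 1: label-based lookup on the wide table (row label, column label).
p_sedan_given_b = table.loc["B", "sedan"]
print(round(p_sedan_given_b, 6))

# Option 2: boolean mask on the long form, which has the same shape as
# the question's "conditional" dataframe.
long_form = table.stack().reset_index(name="cond")
mask = (long_form["rating"] == "B") & (long_form["type"] == "sedan")
p_from_mask = long_form.loc[mask, "cond"].iloc[0]
print(round(p_from_mask, 6))
```

Both lookups return the same value for P(type=sedan | rating=B). The same boolean-mask pattern should work on any other pair of columns, for example conditions on 'make' and 'model', as long as the dataframe being filtered contains those columns.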