Below I have a function in which I would like the parameter "label" to be used as a DataFrame column name. In the call, label is the string "alcohol", and the DataFrame has a column named "alcohol" as well. Inside the function, a call such as df.label.median() should be equivalent to df.alcohol.median(), where alcohol is an actual column in the DataFrame.
import pandas as pd

df = pd.read_csv('winequality-red.csv', sep=';')

def mean_quality_rating(df, label):
    median_label = df.label.median()  # should evaluate as df.alcohol.median()
    for i, the_label in enumerate(df.label):
        if the_label >= median_label:
            df.loc[i, label] = 'high'
        else:
            df.loc[i, label] = 'low'
    return df.groupby(label).quality.mean()

mean_quality_rating(df, 'alcohol')
Try:
def mean_quality_rating(df, label):
    # df[label] looks up the column whose name is stored in the variable label,
    # so this evaluates as df.alcohol.median() when label == 'alcohol'
    median_label = df[label].median()
    for i, the_label in enumerate(df[label]):
        if the_label >= median_label:
            df.loc[i, label] = 'high'
        else:
            df.loc[i, label] = 'low'
    return df.groupby(label).quality.mean()

mean_quality_rating(df, 'alcohol')
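You can't use dot notation with a variable: df.label looks up an attribute or column literally named "label", not the column whose name is stored in label. Use bracket indexing, df[label], instead.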
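To illustrate the difference, here is a minimal sketch on a small made-up DataFrame (the values in toy are invented for illustration and are not taken from winequality-red.csv). It also shows one possible vectorized variant of the function, with the hypothetical name mean_quality_by_level, which builds the 'high'/'low' labels with numpy.where instead of a row-by-row loop and leaves the original column untouched. This is only one way to write it, not necessarily what the asker needs.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for the wine CSV (illustrative values only).
toy = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 9.0],
    "quality": [5, 5, 6, 7, 8, 5],
})

label = "alcohol"

# Bracket indexing resolves the column named by the variable.
assert toy[label].median() == toy.alcohol.median()
# getattr also works, but brackets are the usual choice.
assert getattr(toy, label).equals(toy[label])


def mean_quality_by_level(df, label):
    """Mean quality for rows below vs. at or above the median of df[label].

    Hypothetical helper: it does not modify df.
    """
    median_label = df[label].median()
    # Label every row in one step; keep the index aligned with df.
    level = pd.Series(
        np.where(df[label] >= median_label, "high", "low"),
        index=df.index,
        name=f"{label}_level",
    )
    return df.groupby(level)["quality"].mean()


result = mean_quality_by_level(toy, label)
print(result)
```

Because the labels live in a separate Series, the numeric alcohol column is still available afterwards, which the loop version overwrites with strings.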