Pandas dataframe conditional mean

Question

I'm trying to find the average number of cigarettes smoked per day among women who smoked during pregnancy for a given dataset. Currently, I'm trying:

mean = data.groupby(['male', 'cigs']).mean()
print(mean)

That gives me the mean family income for each number of cigarettes smoked per day (i.e. 0 per day, 2 per day, 8 per day, etc.). How do I get the average family income for those who smoked >= 1?

Also, this is my first post on Stack Overflow, so forgive me if there isn't enough detail.

Answer 1

I assume "cigs" refers to the number of cigarettes smoked per day. You can first filter the data on cigs >= 1 and then apply what you were already doing.

data_on_people_who_smoke = data[data.cigs >= 1]
mean = data_on_people_who_smoke.groupby(['male', 'cigs']).mean()
print(mean)
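
Because the snippet above still groups by cigs, it returns one row per cigarette count rather than a single figure for smokers. A minimal sketch of the same filter-then-group idea that groups by male only is shown here; the toy values are invented, and the column name income is taken from the second answer rather than confirmed by the question.

```python
import pandas as pd

# Toy data: the column names mirror the thread; the values are made up.
data = pd.DataFrame({
    "male":   [0, 0, 0, 1, 1, 1],
    "cigs":   [0, 2, 8, 0, 5, 10],
    "income": [30.0, 20.0, 10.0, 40.0, 25.0, 15.0],
})

# Keep only the rows for people who smoke at least one cigarette per day.
smokers = data[data["cigs"] >= 1]

# Group by `male` alone so each group yields a single mean income.
mean_by_sex = smokers.groupby("male")["income"].mean()
print(mean_by_sex)
```

Dropping cigs from the groupby keys is what collapses the per-count rows into one mean per group.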

Answer 2

mean = data[data['cigs'] >= 1]['income'].mean()
print(mean)

This gives you the mean income of all respondents who smoke at least 1 cigarette. Don't group by gender or cigs. Filter first, then take the mean.
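
The question's first sentence asks for the average number of cigarettes among women who smoked, which needs two conditions combined. A rough sketch follows, assuming male == 0 marks women (the thread never states the encoding) and using invented toy values.

```python
import pandas as pd

# Toy data: the column names mirror the thread; the values are made up.
data = pd.DataFrame({
    "male":   [0, 0, 0, 1, 1, 1],
    "cigs":   [0, 2, 8, 0, 5, 10],
    "income": [30.0, 20.0, 10.0, 40.0, 25.0, 15.0],
})

# Assumption: male == 0 encodes women. Combine boolean masks with `&`,
# wrapping each comparison in parentheses.
women_smokers = (data["male"] == 0) & (data["cigs"] >= 1)

# Mean cigarettes per day among women who smoke.
avg_cigs = data.loc[women_smokers, "cigs"].mean()

# Mean income among all smokers, with no grouping.
avg_income = data.loc[data["cigs"] >= 1, "income"].mean()

print(avg_cigs, avg_income)
```

Using .loc with a mask and a column label selects the rows and the column in one step, which avoids chained indexing.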
