Rolling up categorical values into a second variable in Python

Question

I have data that comes from a data source with pre-coded categorical variables. Unfortunately, these are not the variables I need for my analysis, so I need to roll them up into a second column:

age_group  lifestage
18-24      young adult
25-34      adult
35-44      adult
45-54      adult
.          .
.          .
.          .

I am currently looping through lists to do this:

ya_list = ['18-24']
adult_list = ['25-34', '35-44', '45-54']

for age in age_group:
    if age in ya_list:
        lifestage = 'young adult' 
    elif age in adult_list:
        lifestage = 'adult'

This works OK for this example, where there are only a few groups to recode into, but when I have 10 or more groups to recode, it becomes much more unwieldy. I can't help but think there has to be a better way to do this, but I haven't been able to find one.

Answer 1

You want a dictionary:

stages = {'18-24': 'young adult',
          '25-34': 'adult', ...}

for age in age_group:
    lifestage = stages[age]

This is the canonical replacement for a long chain of elifs in Python.
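When there are many codes per group, it can be easier to write the mapping the other way round (stage to list of codes, much like the lists in the question) and invert it once. The following is only a sketch: the `groups` dict, the sample `age_group` list, and the "unknown" fallback are assumptions made for illustration, not part of the original data.

```python
# Hypothetical grouping: each life stage lists the age codes that roll up into it.
groups = {
    "young adult": ["18-24"],
    "adult": ["25-34", "35-44", "45-54"],
}

# Invert the grouping into a flat code -> stage lookup table.
stages = {code: stage for stage, codes in groups.items() for code in codes}

# Made-up sample column; "65+" is deliberately not covered by any group.
age_group = ["18-24", "35-44", "25-34", "65+"]

# dict.get returns the fallback instead of raising KeyError for unmapped codes.
lifestage = [stages.get(age, "unknown") for age in age_group]
```

Collecting the results in a list keeps one recoded value per input row, rather than overwriting a single variable on each pass through the loop.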

Answer 2

You could use split() and a list comprehension to get the actual numbers to work with:

for age in age_group:
    # Parse a code such as "25-34" into its integer bounds.
    lower, higher = [int(i) for i in age.split("-")]
    if higher <= 24:
        lifestage = "young adult"
    elif lower <= 54:
        lifestage = "adult"
    # etc...

I'm not sure whether you are scaling up the number of age ranges or the number of stages, but hopefully this helps you get started.
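If the number of stages grows, the chain of comparisons could be replaced by a sorted list of cut-offs and a binary search. This is a rough sketch of that idea, not the code above: it classifies each range by its upper bound only, the helper name `lifestage_for` is made up, and the "older adult" label for anything above 54 is an assumption.

```python
from bisect import bisect_left

# Upper bound (inclusive) of each stage, in ascending order.
upper_bounds = [24, 54]
# One more label than bounds; the last one ("older adult") is a hypothetical catch-all.
labels = ["young adult", "adult", "older adult"]

def lifestage_for(age_range):
    """Map a code such as "25-34" to a stage, using the range's upper bound."""
    lower, higher = (int(part) for part in age_range.split("-"))
    # bisect_left finds the first cut-off that is >= higher.
    return labels[bisect_left(upper_bounds, higher)]

# Made-up sample column.
age_group = ["18-24", "25-34", "45-54", "55-64"]
lifestage = [lifestage_for(age) for age in age_group]
```

Adding a stage then means adding one cut-off and one label instead of another elif branch. Codes that are not of the form "low-high" (for example "65+") would need separate handling.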

Answer 1: accepted, posted 2014-06-19 22:40:48
Answer 2: posted 2014-06-19 22:42:05
