
How to compare two columns in pandas to make a third column?

I have two columns, age and sex, in a pandas DataFrame:

sex = ['m', 'f' , 'm', 'f', 'f', 'f', 'f']
age = [16 ,  15 , 14 , 9  , 8   , 2   , 56 ]

Now I want to derive a third column like this: if age <= 9, output 'child'; if age > 9, output the respective gender.

sex = ['m', 'f'  , 'm','f'    ,'f'    ,'f'    , 'f']
age = [16 ,  15  , 14 , 9     , 8     , 2     , 56 ]
yes = ['m', 'f'  ,'m' ,'child','child','child','f' ]

Please help. P.S. I am still working on it; if I find anything, I will update immediately.

Use numpy.where:

df['col3'] = np.where(df['age'] <= 9, 'child', df['sex'])

The resulting output:

   age sex   col3
0   16   m      m
1   15   f      f
2   14   m      m
3    9   f  child
4    8   f  child
5    2   f  child
6   56   f      f
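
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the DataFrame is built from the two lists in the question; the column name col3 is the one used above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'sex': ['m', 'f', 'm', 'f', 'f', 'f', 'f'],
    'age': [16, 15, 14, 9, 8, 2, 56],
})

# np.where(condition, value_if_true, value_if_false) works element-wise:
# rows with age <= 9 get 'child', all other rows keep their 'sex' value.
df['col3'] = np.where(df['age'] <= 9, 'child', df['sex'])

print(df)
```

The condition is evaluated for the whole column at once, so no explicit loop over rows is needed.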

Timings

Using the following setup to get a larger sample DataFrame:

np.random.seed([3,1415])
n = 10**5
df = pd.DataFrame({'sex': np.random.choice(['m', 'f'], size=n), 'age': np.random.randint(0, 100, size=n)})

I get the following timings:

%timeit np.where(df['age'] <= 9, 'child', df['sex'])
1000 loops, best of 3: 1.26 ms per loop

%timeit df['sex'].where(df['age'] > 9, 'child')
100 loops, best of 3: 3.25 ms per loop

%timeit df.apply(lambda x: 'child' if x['age'] <= 9 else x['sex'], axis=1)
100 loops, best of 3: 3.92 ms per loop
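
Outside IPython, where %timeit is not available, a comparison like this can be sketched with the standard timeit module. The helper names, the seed, and the smaller sample size below are assumptions for illustration only, so the numbers it prints will not match the timings above. The sketch also checks that the three approaches agree before timing them.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed setup: a smaller random sample than above, with an arbitrary seed.
rng = np.random.default_rng(0)
n = 10**4
big = pd.DataFrame({
    'sex': rng.choice(['m', 'f'], size=n),
    'age': rng.integers(0, 100, size=n),
})

# Hypothetical helper names wrapping the three expressions being timed.
def with_np_where(d):
    return np.where(d['age'] <= 9, 'child', d['sex'])

def with_series_where(d):
    # Series.where keeps values where the condition is True, so it is inverted.
    return d['sex'].where(d['age'] > 9, 'child')

def with_apply(d):
    return d.apply(lambda row: 'child' if row['age'] <= 9 else row['sex'], axis=1)

candidates = (with_np_where, with_series_where, with_apply)

# Confirm all three produce the same values before comparing speed.
outputs = {f.__name__: list(f(big)) for f in candidates}
expected = outputs['with_np_where']
assert all(out == expected for out in outputs.values())

# Time each approach: average seconds per call over a few repetitions.
for f in candidates:
    seconds = timeit.timeit(lambda: f(big), number=3) / 3
    print(f"{f.__name__}: {seconds * 1000:.2f} ms per call")
```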

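You could use pandas.Series.where (the Series counterpart of pandas.DataFrame.where). It keeps the values where the condition is True and substitutes the second argument elsewhere, so the condition is inverted relative to numpy.where. For example:

df['yes'] = df['sex'].where(df['age'] > 9, 'child')

Another option is DataFrame.apply with a row-wise lambda:

import pandas as pd

df = pd.DataFrame({'sex':['m', 'f' , 'm', 'f', 'f', 'f', 'f'],
    'age':[16, 15, 14, 9, 8, 2, 56]})
# axis=1 passes each row to the lambda
df['yes'] = df.apply(lambda x: 'child' if x['age'] <= 9 else x['sex'], axis=1)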

Result:

   age sex    yes
0   16   m      m
1   15   f      f
2   14   m      m
3    9   f  child
4    8   f  child
5    2   f  child
6   56   f      f


