Create new column by counting distinct values in another column in pandas

Hello, I have a dataframe such as:

COL1_1 COL1_3            COL2
Chr1_0 Canis_lupus       A
Chr1_0 Canis_lupus       A
Chr1_0 Canis_lupus       B
Chr1_0 Canis_lupus       B
Chr1_0 Canis_lupus       B
Chr1_0 Felis_cattus      B
Chr1_0 Felis_cattus      B
Chr2_0 Felis_cattus      A
Chr2_0 Felis_cattus      B
Chr2_1 Felis_cattus      C
Chr2_1 Felis_cattus      D
Chr2_1 Felis_cattus      E

The idea is, within each (COL1_1, COL1_3) group, to count the number of distinct COL2 values.

For example, for Chr1_0 and Canis_lupus there are 2 distinct COL2 values (A and B), so I put 2 into the new COL3 column.

If there is only one distinct value, I put a 0.

Here I should then get:

COL1_1 COL1_3            COL2  COL3
Chr1_0 Canis_lupus       A     2
Chr1_0 Canis_lupus       A     2
Chr1_0 Canis_lupus       B     2
Chr1_0 Canis_lupus       B     2
Chr1_0 Canis_lupus       B     2
Chr1_0 Felis_cattus      B     0
Chr1_0 Felis_cattus      B     0
Chr2_0 Felis_cattus      A     2
Chr2_0 Felis_cattus      B     2
Chr2_1 Felis_cattus      C     3
Chr2_1 Felis_cattus      D     3
Chr2_1 Felis_cattus      E     3

Maybe one idea would be to group by COL1_1 and COL1_3 and count the number of distinct COL2 values.

Use GroupBy.transform with SeriesGroupBy.nunique (COL2 is selected from the grouped frame), then Series.mask to replace 1 with 0:

df['COL3'] = (df.groupby(['COL1_1', 'COL1_3']).COL2.transform('nunique')
                .mask(lambda x: x == 1, 0))

Or use Series.replace:

df['COL3'] = df.groupby(['COL1_1', 'COL1_3']).COL2.transform('nunique').replace({1:0})

print(df)
    COL1_1        COL1_3 COL2  COL3
0   Chr1_0   Canis_lupus    A     2
1   Chr1_0   Canis_lupus    A     2
2   Chr1_0   Canis_lupus    B     2
3   Chr1_0   Canis_lupus    B     2
4   Chr1_0   Canis_lupus    B     2
5   Chr1_0  Felis_cattus    B     0
6   Chr1_0  Felis_cattus    B     0
7   Chr2_0  Felis_cattus    A     2
8   Chr2_0  Felis_cattus    B     2
9   Chr2_1  Felis_cattus    C     3
10  Chr2_1  Felis_cattus    D     3
11  Chr2_1  Felis_cattus    E     3
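
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and contrasts a plain aggregation (one value per group) with transform (one value per original row), which is why transform can be assigned straight back as a column. The variable names keys, per_group and distinct are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Rebuild the sample frame from the question (12 rows).
df = pd.DataFrame({
    "COL1_1": ["Chr1_0"] * 7 + ["Chr2_0"] * 2 + ["Chr2_1"] * 3,
    "COL1_3": ["Canis_lupus"] * 5 + ["Felis_cattus"] * 7,
    "COL2": list("AABBBBBABCDE"),
})

keys = ["COL1_1", "COL1_3"]  # illustrative name for the grouping columns

# Plain aggregation: one distinct-count per group (4 groups here).
per_group = df.groupby(keys)["COL2"].nunique()

# transform broadcasts each group's count back onto its rows,
# so the result has the same index and length as df.
distinct = df.groupby(keys)["COL2"].transform("nunique")

# Groups with a single distinct value get 0 instead of 1.
df["COL3"] = distinct.mask(distinct == 1, 0)

print(per_group)
print(df)
```

The mask step is equivalent to the replace({1: 0}) variant shown earlier; both leave counts other than 1 untouched.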
