Add column with unique identifiers based on values from other columns in pandas

I have the following dataframe:

Cnt Year    JD  Min_Temp
S   2000    1   277.139
S   2000    2   274.725
S   2001    1   270.945
S   2001    2   271.505
N   2000    1   257.709
N   2000    2   254.533
N   2000    3   258.472
N   2001    1   255.763
N   2001    2   265.714
N   2001    3   267.943

I would like to add a new column in which the rows for each value of 'Cnt' are numbered with a unique identifier (1, 2, 3, ...). The result should look like this:

Cnt Year    JD  Min_Temp    unq
S   2000    1   277.139     1
S   2000    2   274.725     2
S   2001    1   270.945     3
S   2001    2   271.505     4
N   2000    1   257.709     1
N   2000    2   254.533     2
N   2000    3   258.472     3
N   2001    1   255.763     4
N   2001    2   265.714     5
N   2001    3   267.943     6

Here, each row that shares the same value in the column 'Cnt' has a unique identifier within that group.

Currently, all I can do is add a new column with increasing values over the whole dataframe: df['unq'] = numpy.arange(1, len(df) + 1)

You could use groupby with cumcount:

>>> df["unq"] = df.groupby("Cnt").cumcount() + 1
>>> df
  Cnt  Year  JD  Min_Temp  unq
0   S  2000   1   277.139    1
1   S  2000   2   274.725    2
2   S  2001   1   270.945    3
3   S  2001   2   271.505    4
4   N  2000   1   257.709    1
5   N  2000   2   254.533    2
6   N  2000   3   258.472    3
7   N  2001   1   255.763    4
8   N  2001   2   265.714    5
9   N  2001   3   267.943    6
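
For anyone who wants to reproduce the session above, here is a minimal self-contained sketch. The way the dataframe is constructed is an assumption (the question only shows the printed table); the groupby/cumcount line is the one from the answer.

```python
import pandas as pd

# Rebuild the example table from the question (construction is an assumption;
# the original post only shows the printed dataframe).
df = pd.DataFrame({
    "Cnt": ["S"] * 4 + ["N"] * 6,
    "Year": [2000, 2000, 2001, 2001, 2000, 2000, 2000, 2001, 2001, 2001],
    "JD": [1, 2, 1, 2, 1, 2, 3, 1, 2, 3],
    "Min_Temp": [277.139, 274.725, 270.945, 271.505, 257.709,
                 254.533, 258.472, 255.763, 265.714, 267.943],
})

# cumcount() numbers the rows within each group starting at 0, so add 1
# to get identifiers that start at 1.
df["unq"] = df.groupby("Cnt").cumcount() + 1
print(df)
```

Because cumcount starts at 0, dropping the "+ 1" would give identifiers 0, 1, 2, ... instead.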

Note that the groups are based on the Cnt column values and not on contiguity. If you have a second group of S below the group of N, the numbering continues from the first S group, so the first unq value in that second group will be 5.
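
The note above can be illustrated with a small sketch. The dataframe df2 and the column name unq_run are hypothetical and made up for illustration. The second half shows one possible way to restart the counter for every contiguous run instead, by comparing each value with the previous row; this is an alternative technique, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: a second run of S appears below the N rows.
df2 = pd.DataFrame({"Cnt": ["S"] * 4 + ["N"] * 2 + ["S"] * 2})

# Value-based grouping: the counter for S continues across the gap,
# so the second run of S starts at 5.
df2["unq"] = df2.groupby("Cnt").cumcount() + 1

# Run-based grouping (alternative): start a new block whenever Cnt differs
# from the previous row, then count within each block.
block = (df2["Cnt"] != df2["Cnt"].shift()).cumsum()
df2["unq_run"] = df2.groupby(block).cumcount() + 1

print(df2)
```

Which of the two behaviours is wanted depends on whether repeated runs of the same 'Cnt' value should be treated as one group or as separate groups.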


