
How to assign a number in a pandas DataFrame for each unique value appearing in a row, based on a given column

The DataFrame looks like this:

Unique Id     Date    
   H1         2/03/2022
   H1         2/03/2022
   H1         2/03/2022
   H1         3/03/2022
   H1         4/03/2022
   H2         9/03/2022
   H2         9/03/2022
   H2         10/03/2022

Expected DataFrame:

    Unique Id     Date       Count
   H1         2/03/2022       1
   H1         2/03/2022       1
   H1         2/03/2022       1
   H1         3/03/2022       2
   H1         4/03/2022       3
   H2         9/03/2022       1
   H2         9/03/2022       1
   H2         10/03/2022      2

Repeated dates should be assigned the number 1; any other date should be assigned some other number.

I have tried multiple approaches; please assist.

There are a bunch of ways to do this. The primary issue is that you need to treat the date as a date object rather than a string, so that October doesn't get sorted ahead of September in your second group (compared as strings, '10/03/2022' sorts before '9/03/2022').
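To make the ordering problem concrete, here is a minimal sketch comparing string sorting with datetime sorting for the two dates in the second group. The `format='%d/%m/%Y'` argument is an assumption about how the dates are written; use `'%m/%d/%Y'` if the data is month-first.

```python
import pandas as pd

dates = pd.Series(['9/03/2022', '10/03/2022'])

# Compared as plain strings, '10/...' sorts before '9/...' because '1' < '9'.
print(dates.sort_values().tolist())

# Parsed as datetimes, the values sort chronologically.
# The day/month/year format is an assumption about the data.
parsed = pd.to_datetime(dates, format='%d/%m/%Y')
print(parsed.sort_values().dt.strftime('%Y-%m-%d').tolist())
```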

import pandas as pd
df = pd.DataFrame({'Unique_Id': ['H1', 'H1', 'H1', 'H1', 'H1', 'H2', 'H2', 'H2'],
 'Date': ['2/03/2022',
  '2/03/2022',
  '2/03/2022',
  '3/03/2022',
  '4/03/2022',
  '9/03/2022',
  '9/03/2022',
  '10/03/2022']})

Dense Rank

df.groupby('Unique_Id')['Date'].apply(lambda x: pd.to_datetime(x).rank(method='dense'))
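If the goal is to store the result as the `Count` column from the question, one possible variant is to parse the dates once and call the group-wise `rank` directly, which keeps the original row index. Depending on the pandas version, the `.apply` form may return a result indexed by the group key as well, which does not align for assignment. This is a sketch only; the day/month/year format is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    'Unique_Id': ['H1', 'H1', 'H1', 'H1', 'H1', 'H2', 'H2', 'H2'],
    'Date': ['2/03/2022', '2/03/2022', '2/03/2022', '3/03/2022',
             '4/03/2022', '9/03/2022', '9/03/2022', '10/03/2022'],
})

# Parse once up front; the day/month/year format is an assumption about the data.
parsed = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Dense rank of each date within its Unique_Id group.
# rank() returns floats, so cast to int for a clean counter column.
df['Count'] = parsed.groupby(df['Unique_Id']).rank(method='dense').astype(int)
print(df)
```

For the sample data this reproduces the expected `Count` values 1, 1, 1, 2, 3 for H1 and 1, 1, 2 for H2.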

Cat Codes

df.groupby('Unique_Id')['Date'].apply(lambda x: pd.to_datetime(x).astype('category').cat.codes+1)

Factorize

df.groupby('Unique_Id')['Date'].transform(lambda x: x.factorize()[0] + 1)
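One caveat worth noting: `factorize` numbers values in order of first appearance, so the version above matches the expected output only because the rows are already in date order within each group. The sketch below uses a hypothetical unsorted series (not from the question) to show the difference, and how `sort=True` on parsed dates gives date-ordered codes instead. The date format is again an assumption.

```python
import pandas as pd

# Hypothetical unsorted input: the later date appears first.
dates = pd.to_datetime(
    pd.Series(['10/03/2022', '9/03/2022', '9/03/2022']),
    format='%d/%m/%Y',  # assumption: day/month/year
)

# Default factorize: codes follow order of appearance.
codes_by_appearance = dates.factorize()[0] + 1

# sort=True: codes follow the sorted order of the unique dates.
codes_by_date = dates.factorize(sort=True)[0] + 1

print(codes_by_appearance.tolist())
print(codes_by_date.tolist())
```

Inside the `groupby(...).transform(...)` call, the same `sort=True` argument can be passed to `x.factorize`, provided the column has been converted to datetimes first.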

Here is one way to do it, making use of groupby and transform.

The question states "Repetitive dates should be assigned with number 1, else other should be assigned some other number", so I chose 2 where the values are unique.

# select a single column so transform returns a Series rather than a DataFrame
df['count'] = df.groupby('Date')['Date'].transform(lambda x: 1 if x.size > 1 else 2)
df

    Unique_Id   Date    count
0   H1       2/03/2022    1
1   H1       2/03/2022    1 
2   H1       2/03/2022    1
3   H1       3/03/2022    2
4   H1       4/03/2022    2
5   H2       9/03/2022    1
6   H2       9/03/2022    1
7   H2       10/03/2022   2
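The same 1-or-2 flag can also be computed without a Python lambda. The sketch below uses `DataFrame.duplicated` with `keep=False` to mark every row whose (`Unique_Id`, `Date`) pair occurs more than once. Including `Unique_Id` in the key is an assumption on my part; the answer above groups on `Date` alone, which gives the same result for this sample.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Unique_Id': ['H1', 'H1', 'H1', 'H1', 'H1', 'H2', 'H2', 'H2'],
    'Date': ['2/03/2022', '2/03/2022', '2/03/2022', '3/03/2022',
             '4/03/2022', '9/03/2022', '9/03/2022', '10/03/2022'],
})

# keep=False marks every member of a duplicated (Unique_Id, Date) pair as True.
is_repeated = df.duplicated(subset=['Unique_Id', 'Date'], keep=False)

# 1 for repeated dates, 2 for dates that occur only once in their group.
df['count'] = np.where(is_repeated, 1, 2)
print(df)
```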


