Count values, keep duplicates with Pandas

Question

I have a dataset in which column A holds GUIDs (250,000 values). I need to count how many times each GUID appears in that column and add that count as another column in the dataset. The problem is that .value_counts() in pandas gives me a list with the duplicates removed, so when I try to join the counts back onto the original dataset, the two don't align.

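import pandas as pd

path = r"D:\Users\cdoyle\Desktop\Final2_.xlsx"
df = pd.read_excel(path)
df = df[['Data BoundingBoxGUID', 'Data Line', 'Data Remove Item:', 'Data Status:', 'Model']]

# value_counts() returns one row per unique GUID, indexed by the GUID itself
df2 = df['Data BoundingBoxGUID'].value_counts()

# concat aligns on the index (row numbers vs. GUIDs), so the counts don't line up with the rows
df_output = pd.concat([df, df2], axis=1)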

Answer 1

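The usual approach is groupby with transform, which returns one count per original row, so duplicates are kept and the result aligns with the DataFrame: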

df['new'] = df.groupby('Data BoundingBoxGUID')['Data BoundingBoxGUID'].transform('count')
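
To make the alignment concrete, here is a minimal sketch on a small made-up DataFrame (the sample GUIDs "a", "b", "c", the Model values, and the column name new_via_map are hypothetical stand-ins, not taken from the asker's spreadsheet). It also shows an alternative that maps each GUID onto the value_counts() result. It assumes the GUID column has no missing values.

```python
import pandas as pd

# Hypothetical sample data standing in for the real spreadsheet
df = pd.DataFrame({
    "Data BoundingBoxGUID": ["a", "b", "a", "c", "a", "b"],
    "Model": ["m1", "m2", "m3", "m4", "m5", "m6"],
})

col = "Data BoundingBoxGUID"

# transform('count') returns one value per original row,
# so the result has the same length and index as df
df["new"] = df.groupby(col)[col].transform("count")

# Alternative: look up each row's GUID in the value_counts() result
df["new_via_map"] = df[col].map(df[col].value_counts())

print(df)
```

Both columns should hold the same counts. The transform version stays within a single groupby call, while the map version reuses the value_counts() result the question already computes.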

Answer 1: score 4, accepted, 2019-11-21 01:54:02
