I have a dataset with GUIDs in column A (250,000 values). I need to count how many times each GUID appears in that column and add that count as another column in the dataset. The problem is that value_counts() in pandas returns one entry per unique GUID, so the duplicates are collapsed. When I try to attach the counts to the original data, the rows don't align.
import os
import pandas as pd
path = r"D:\Users\cdoyle\Desktop\Final2_.xlsx"
df = pd.read_excel(path)
df = df[['Data BoundingBoxGUID', 'Data Line', 'Data Remove Item:', 'Data Status:', 'Model']]
df2 = df['Data BoundingBoxGUID'].value_counts()
df_output = pd.concat([df,df2], axis=1)
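The usual approach is groupby with transform, which returns one count per original row, so the result lines up with the DataFrame: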
df['new'] = df.groupby('Data BoundingBoxGUID')['Data BoundingBoxGUID'].transform('count')
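To see why this lines up, here is a minimal sketch on toy data. The six-row frame and the column names count_transform and count_map are made up for illustration; only 'Data BoundingBoxGUID' and 'Model' come from the question. It also shows an alternative that should give the same result: mapping each GUID through the value_counts() Series.

```python
import pandas as pd

# Hypothetical toy data standing in for the 250,000-row spreadsheet.
df = pd.DataFrame({
    "Data BoundingBoxGUID": ["a", "b", "a", "c", "a", "b"],
    "Model": ["m1", "m2", "m3", "m4", "m5", "m6"],
})

guid = df["Data BoundingBoxGUID"]

# transform('count') broadcasts each group's count back to every row of
# that group, so the result keeps the original index and length.
df["count_transform"] = df.groupby("Data BoundingBoxGUID")["Data BoundingBoxGUID"].transform("count")

# Alternative: value_counts() is indexed by the unique GUIDs, so map()
# can look up the count for each row's GUID.
df["count_map"] = guid.map(guid.value_counts())

print(df)
```

Both columns hold one count per row (here "a" appears 3 times, "b" twice, "c" once). The original pd.concat attempt fails because value_counts() is indexed by GUID while df is indexed by row number, so the two objects share no index labels to align on.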