I have two dataframes:
The first one has n rows of names.
The second one also has n rows of names.
For each name in the first dataframe, I want to know how often it appears in the second one.
The code looks something like this:
df5 = pd.read_excel(item1, usecols="B",skiprows=6)
df10 = pd.read_excel('SMR4xx_Change_situation.xlsm', sheet_name='LoPN',usecols='D', skiprows=4)
How do I count the number of times a name appears in the second dataframe and output it beside the name from the first dataframe?
Ex: The first name in the first dataframe is John. John appears in the second dataframe 4 times => output: John 4
Either print it to the console, or write a separate Excel file with the names from the first dataframe in the first column and the number of appearances in the second column.
Anything could help.
Well, you can create a dataframe for the records you are seeking. First get the list of unique names in the first dataframe, like this:
uniqueNames = df5.iloc[:, 0].unique() # usecols="B" loads one column; its label comes from the header row, so select it by position
dfCount = pd.DataFrame(columns=['name', 'count'])
Now you can iterate through each of the unique names in the first dataframe and compare against the second dataframe like this:
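rows = []
for eachName in uniqueNames:
    # Compare against the single column loaded into df10 (Excel column D)
    rows.append({'name': eachName, 'count': (df10.iloc[:, 0] == eachName).sum()})
dfCount = pd.DataFrame(rows, columns=['name', 'count']) # DataFrame.append was removed in pandas 2.0, so build from a list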
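The same table can probably be built without a Python loop by counting every name in the second dataframe once with value_counts and then looking up the unique names. This is only a sketch: the toy frames and the "name" column label are assumptions standing in for the real df5 and df10 read from Excel.

```python
import pandas as pd

# Toy stand-ins for df5 and df10 (assumed data; the real frames come from pd.read_excel).
df5 = pd.DataFrame({"name": ["John", "Mary", "Zed"]})
df10 = pd.DataFrame({"name": ["John", "Mary", "John", "John", "Ann", "John"]})

# Count every name in the second dataframe in one pass.
occurrences = df10["name"].value_counts()

# Look up each unique name from the first dataframe; names absent from df10 get 0.
unique_names = df5["name"].unique()
dfCount = pd.DataFrame({
    "name": unique_names,
    "count": occurrences.reindex(unique_names, fill_value=0).to_numpy(),
})
print(dfCount)
```

For long name lists this avoids scanning the second dataframe once per name.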
Or, if you want the counts to sit next to the names in the first dataframe, something like this should work:
counts = df10.iloc[:, 0].value_counts() # occurrences of each name in the second dataframe
df5['counts'] = df5.iloc[:, 0].map(counts).fillna(0).astype(int) # names missing from df10 get 0
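For the output step the question asks about (console or a separate Excel file), a minimal sketch might look like the following. The helper count_appearances, the toy data, and the file name name_counts.xlsx are all hypothetical; the Excel line is left commented out because to_excel needs an Excel writer such as openpyxl to be installed.

```python
import pandas as pd

def count_appearances(first: pd.Series, second: pd.Series) -> pd.DataFrame:
    """Return one row per entry of `first` with its number of matches in `second`."""
    counts = first.map(second.value_counts()).fillna(0).astype(int)
    return pd.DataFrame({"name": first.to_numpy(), "count": counts.to_numpy()})

# Toy data (assumed); in the question these would be df5.iloc[:, 0] and df10.iloc[:, 0].
first = pd.Series(["John", "Mary", "Zed"])
second = pd.Series(["John", "Mary", "John", "John", "Ann", "John"])
result = count_appearances(first, second)

# Console output in the "John 4" style from the question.
lines = [f"{name} {count}" for name, count in zip(result["name"], result["count"])]
print("\n".join(lines))

# Excel output: names in the first column, counts in the second.
# result.to_excel("name_counts.xlsx", index=False)  # hypothetical file name
```

With the toy data this prints John 4, Mary 1 and Zed 0 on separate lines.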