
How to loop through pandas dataframe using a regular expression, counter, or string method and return a dictionary?

I have a pandas dataframe with common baby names, one per row. I need to go through each name and count how many times each letter of the alphabet appears as the last character of a name. I then need to return a dictionary where the keys are the 26 letters of the alphabet and the values are how often each letter appears as the last character across all the baby names in the dataframe. [Image: sample of the pandas data]

Do I use a for loop with a regular expression? Do I use a counter? Do I use a string method after turning the column into a series?

With respect to a for loop and a regular expression, so far I have tried:

import re

for index, row in male_names.iterrows():
    male_last_letter_freq = row['name'](r'/(\w)\b/')
    male_letter_freq.update(male_last_letter_freq)

male_last_letter_freq

Clearly, I don't know the syntax for including the regular expression within the loop.

I have also tried to turn the 'name' column from the dataframe into a series and call some pandas .str methods:

male_name_series = male_names['name']
male_name_series.str.extract(r'/(\w)\b/')

Both ways return errors. I am really at a loss on how to do such a specific thing. Any help would be greatly appreciated.

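If I understand your question correctly, you don't need regular expressions. You can just use:

dict(df["name"].str[-1].value_counts())

Explanation: df["name"].str[-1] extracts the last character of each name, value_counts() counts how often each distinct character occurs, and finally dict converts the result to a dictionary. The top-level function pd.value_counts(...) does the same counting, but it is deprecated in recent pandas versions, so the Series method is used here. Note that a letter that never occurs as a last character will simply be missing from the dictionary, and that upper- and lower-case characters are counted separately.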
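If the dictionary has to contain all 26 letters, including those with a count of zero, the same idea can be extended roughly as follows. This is only a sketch: the sample names and the helper names last_letter_counts and last_letter_counts_regex are made up for illustration, and it assumes the names are plain ASCII and that case should be ignored. The second helper shows how the regular-expression attempt from the question could be written (the pattern is passed without the surrounding slashes), in case a regex is preferred.

```python
import string
from collections import Counter

import pandas as pd

# Hypothetical sample data; the real dataframe is only known to have a 'name' column.
male_names = pd.DataFrame(
    {"name": ["Liam", "Noah", "Oliver", "Elijah", "Lucas", "Ethan"]}
)


def last_letter_counts(names: pd.Series) -> dict:
    """Count the last character of each name, returning all 26 letters as keys."""
    # Strip stray whitespace, take the last character, and ignore case.
    last = names.dropna().str.strip().str[-1].str.lower()
    counts = last.value_counts()
    # reindex keeps only a-z and fills letters that never occur with 0.
    counts = counts.reindex(list(string.ascii_lowercase), fill_value=0)
    return {letter: int(n) for letter, n in counts.items()}


def last_letter_counts_regex(names: pd.Series) -> dict:
    """Same result using str.extract and collections.Counter."""
    # Python patterns are plain strings: no /.../ delimiters.
    # (\w)\s*$ captures the last word character before optional trailing spaces.
    last = names.dropna().str.extract(r"(\w)\s*$", expand=False).str.lower()
    tally = Counter(last.dropna())
    return {letter: tally.get(letter, 0) for letter in string.ascii_lowercase}


male_last_letter_freq = last_letter_counts(male_names["name"])
print(male_last_letter_freq)
```

Both helpers avoid iterrows entirely; the .str accessor applies the operation to the whole column at once, which is usually both shorter and faster than looping over rows.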

