I have a pandas dataframe with common baby names, one name per row. I need to go through every name and count how many times each letter of the alphabet appears as the last character of a name. I then need to return a dictionary whose keys are the 26 letters of the alphabet and whose values are the number of names in the dataframe that end in each letter.
Do I use a for loop with a regular expression? Do I use a counter? Do I use a string method after turning the column into a series?
With respect to a for loop and a regular expression, so far I have tried:
import re
for index, row in male_names.iterrows():
    male_last_letter_freq = row['name'](r'/(\w)\b/')
    male_letter_freq.update(male_last_letter_freq)
male_last_letter_freq
Clearly, I don't know the syntax for including the regular expression within the loop.
I have also tried to turn the 'name' column from the dataframe into a series and call some pandas .str methods:
male_name_series = male_names['name']
male_name_series.str.extract(r'/(\w)\b/')
Both ways return errors. I am really at a loss on how to do such a specific thing. Any help would be greatly appreciated.
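If I understand your question correctly, you don't need regular expressions. You can just use:
df["name"].str[-1].value_counts().to_dict()
Explanation: df["name"].str[-1] extracts the last character of each name, value_counts() counts how often each unique value occurs, and to_dict() converts the result to a dictionary. (The top-level pd.value_counts function does the same counting but is deprecated in recent pandas versions, so the Series method is used here.)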
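One caveat: counting values this way only yields keys for letters that actually occur, whereas the question asks for all 26 letters. The following is a minimal sketch of one way to fill in the missing letters with zeros. The function name last_letter_freq and the sample names are hypothetical, and it assumes names should be compared case-insensitively. It also shows the regular-expression route from the question with the pattern written without the surrounding slashes, which pandas would otherwise treat as literal characters.

```python
import string

import pandas as pd


def last_letter_freq(names: pd.Series) -> dict:
    """Map each letter a-z to the number of names ending in that letter."""
    # Assumption: strip stray whitespace and ignore case before counting.
    last = names.dropna().str.strip().str.lower().str[-1]
    counts = last.value_counts()
    # Reindex over the full alphabet so letters that never occur map to 0.
    full = counts.reindex(list(string.ascii_lowercase), fill_value=0)
    return full.astype(int).to_dict()


# Hypothetical sample data standing in for the asker's male_names dataframe.
male_names = pd.DataFrame({"name": ["Noah", "Liam", "Mason", "Ethan", "Elijah"]})
male_last_letter_freq = last_letter_freq(male_names["name"])

# Regex alternative: capture the final word character of each name.
# expand=False returns a Series instead of a one-column DataFrame.
last_via_regex = male_names["name"].str.extract(r"(\w)$", expand=False)
```

Here male_last_letter_freq["h"] and male_last_letter_freq["n"] are both 2, and letters such as "a" that end no name in the sample map to 0. Reindexing also drops any final character that is not a letter a-z, which may or may not be what you want for names with punctuation or accented letters.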