Create DataFrame with rows made for each string appearing in a column of another DataFrame
Suppose you have a DataFrame like this:
genre                 mean_average_budget
horror thriller       x
romance comedy        y
action thriller       z
documentary           a
comedy documentary    b
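How can I build a DataFrame whose rows are the individual strings that appear in the genre column, one row per distinct genre? For example: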
genre mean_average_budget
horror h
thriller i
action k
documentary l
comedy m
Try this:
new_df = df.set_index('mean_average_budget').genre.str.split().\
apply(pd.Series).stack().reset_index(1,drop = True).\
reset_index(name = 'genre')
mean_average_budget genre
0 x horror
1 x thriller
2 y romance
3 y comedy
4 z action
5 z thriller
6 a documentary
7 b comedy
8 b documentary
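As a cross-check, the same long-format table can be sketched with `DataFrame.explode` (available in pandas 0.25 and later). This is an alternative to the split/stack chain above, not the answer's original method. The sample frame below simply re-creates the question's data, with the letters kept as placeholder budgets.

```python
import pandas as pd

# Sample data mirroring the question; the budget letters are placeholders.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": ["x", "y", "z", "a", "b"],
})

# Split each genre string into a list of words, then give every
# list element its own row while repeating the budget value.
exploded = (
    df.assign(genre=df["genre"].str.split())
      .explode("genre")
      .reset_index(drop=True)
)
print(exploded)
```

The column order differs from the output above (genre comes first here), but the row pairs are the same.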
To get the mean per genre, try the following (this requires the budget column to hold numeric data):
new_df.groupby('genre')['mean_average_budget'].mean()
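A minimal end-to-end sketch of the numeric case follows. The budget figures are invented purely for illustration (the question only gives placeholder letters), and the long-format table is built with `explode` rather than the split/stack chain, which is an assumption of this sketch.

```python
import pandas as pd

# Hypothetical numeric budgets, invented purely for illustration.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# One row per (genre word, budget) pair.
long_df = df.assign(genre=df["genre"].str.split()).explode("genre")

# Average the budgets of every original row that mentions each genre.
genre_means = long_df.groupby("genre")["mean_average_budget"].mean()
print(genre_means)
```

With these made-up numbers, "comedy" appears in the rows with budgets 20 and 50, so its mean is 35.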
If the budget values are strings and you want to concatenate them per genre instead:
new_df.groupby('genre')['mean_average_budget'].apply('+'.join)
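The string case can be sketched the same way, using the question's placeholder letters. Building the long table with `explode` is again an assumption of this sketch; the `'+'.join` aggregation is the one shown above.

```python
import pandas as pd

# Sample data mirroring the question; the budget letters are placeholders.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": ["x", "y", "z", "a", "b"],
})

# One row per (genre word, budget) pair.
long_df = df.assign(genre=df["genre"].str.split()).explode("genre")

# Join the budget strings of each genre with '+', in original row order.
joined = long_df.groupby("genre")["mean_average_budget"].apply("+".join)
print(joined)
```

Genres that occur in only one row, such as "horror", keep their single value unchanged.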