
Create Dataframe with rows made for each string appearing in a column of another Dataframe

Suppose you have a dataframe like this:

genre                 mean_average_budget
horror thriller       x
romance comedy        y 
action thriller       z
documentary           a
comedy documentary    b

How can you build a dataframe whose rows are the individual strings that appear in the genre column? For example:

genre                 mean_average_budget
horror                h
thriller              i 
action                k
documentary           l
comedy                m

Try this:

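import pandas as pd

# One row per genre word: split, expand to columns, stack, then flatten the index.
new_df = df.set_index('mean_average_budget').genre.str.split().\
    apply(pd.Series).stack().reset_index(1,drop = True).\
    reset_index(name = 'genre')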

    mean_average_budget genre
0   x                   horror
1   x                   thriller
2   y                   romance
3   y                   comedy
4   z                   action
5   z                   thriller
6   a                   documentary
7   b                   comedy
8   b                   documentary
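As an alternative sketch (not the method used in the answer above), newer pandas versions (0.25 and later) provide `DataFrame.explode`, which should give the same one-row-per-genre result with fewer steps. The input dataframe is rebuilt here from the question, with `x`, `y`, `z`, `a`, `b` kept as placeholder strings; the name `exploded` is only for illustration.

```python
import pandas as pd

# Sample data copied from the question; budgets are placeholder strings.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": ["x", "y", "z", "a", "b"],
})

# Turn each genre string into a list of words, then give every
# list element its own row. The other columns are repeated.
exploded = (
    df.assign(genre=df["genre"].str.split())
      .explode("genre")
      .reset_index(drop=True)
)
print(exploded)
```

The column order differs from the output above (`genre` comes first here), but the row contents are the same.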

To find the mean per genre, try this if the budget column holds numeric data:

new_df.groupby('genre')['mean_average_budget'].mean()
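A small worked example may make the result easier to picture. The budget values below (10 to 50) are invented for illustration, since the question only shows placeholders, and the split table is built with `explode` as a shortcut rather than with the chain shown earlier.

```python
import pandas as pd

# Hypothetical numeric budgets; the question only gives x, y, z, a, b.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# One row per individual genre word.
new_df = df.assign(genre=df["genre"].str.split()).explode("genre")

# Average the budget over every row in which a genre appears.
means = new_df.groupby("genre")["mean_average_budget"].mean()
print(means)
```

With these made-up numbers, `thriller` appears in the rows with 10 and 30, so its mean is 20. Note that this is an unweighted mean of the existing per-row means; if the original rows summarised different numbers of films, it will not necessarily equal the true per-genre average.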

If you want to aggregate strings instead:

new_df.groupby('genre')['mean_average_budget'].apply('+'.join)
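To illustrate the string case with the placeholder values from the question, here is a minimal self-contained sketch. It builds the split table with `explode` for brevity (an assumption, not the chain used above) and then joins the budget labels of each genre with `+`.

```python
import pandas as pd

# Placeholder string budgets exactly as given in the question.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": ["x", "y", "z", "a", "b"],
})

# One row per individual genre word.
new_df = df.assign(genre=df["genre"].str.split()).explode("genre")

# Concatenate the string values of each genre group, separated by '+'.
joined = new_df.groupby("genre")["mean_average_budget"].apply("+".join)
print(joined)
```

Here `thriller` collects the values from the first and third rows, giving `x+z`, while genres that occur only once keep their single value.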


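Disclaimer: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original source. For any questions, contact: yoyou2525@163.com.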

 