
Pandas pivot heatmap filter most frequent values

My end goal is a heatmap of the x most popular destinations against the x most common origin countries (similar to the R question "How to create heatmap only for 50 highest value"). Let's say x = 2 to match the small toy dataframe below:

import pandas as pd

df = pd.DataFrame({'destination_1': ['Germany', 'France', 'UK', 'India', 'China'],
                   'destination_2': ['China', 'Vietnam', 'Namibia', 'India', 'UK'],
                   'destination_3' : ['France', 'Italy', 'Namibia', 'China', 'UK'],
                   'origin' : ['Germany', 'US', 'UK', 'China', 'UK']})

The destination count should be based on mentions across all three destination variables. To account for this, I melt and pivot the data.

df1 = df.melt(id_vars=['origin'],
              value_vars=['destination_1', 'destination_2', 'destination_3'],
              var_name='columns')
# Count how often each destination (column 'value') is mentioned per origin
df_heatmap = df1.pivot_table(index='origin', columns='value', aggfunc='count')
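To make the melt-and-pivot step above easier to inspect, here is a minimal sketch of the same count table with flat column labels and zeros instead of NaN. Passing values='columns' and fill_value=0 is my own choice for readability, and the name counts is just illustrative; neither is required for the heatmap.

```python
import pandas as pd

df = pd.DataFrame({'destination_1': ['Germany', 'France', 'UK', 'India', 'China'],
                   'destination_2': ['China', 'Vietnam', 'Namibia', 'India', 'UK'],
                   'destination_3': ['France', 'Italy', 'Namibia', 'China', 'UK'],
                   'origin': ['Germany', 'US', 'UK', 'China', 'UK']})

# Long format: one row per (origin, destination mention)
df1 = df.melt(id_vars=['origin'],
              value_vars=['destination_1', 'destination_2', 'destination_3'],
              var_name='columns')

# Naming the counted column via values= keeps the result columns flat;
# fill_value=0 replaces missing origin/destination pairs with 0
counts = df1.pivot_table(index='origin', columns='value',
                         values='columns', aggfunc='count', fill_value=0)
print(counts)
```

Each cell is the number of times a destination is mentioned for an origin, across all three destination columns.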

df_heatmap already holds the data for the heatmap, and visualizing it is no problem. What I can't figure out is where and how to apply a filter that keeps only the x most common origins and destinations.

It would surely be better to filter the pivot table itself so that the "totals" stay true, but here's a way that at least gets you an x-by-x pivot table. I take the top x values by count in each dimension and use those lists to filter the melted dataframe before pivoting it.

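df1 = df.melt(id_vars=['origin'],
              value_vars=['destination_1', 'destination_2', 'destination_3'],
              var_name='columns')

# Top 2 origins and top 2 destinations by number of mentions
most = df1['origin'].value_counts()[:2].index.tolist()
most2 = df1['value'].value_counts()[:2].index.tolist()
# Keep only rows whose origin and destination are both in the top lists
filt = (df1['origin'].isin(most) & df1['value'].isin(most2))
df2 = df1[filt]

# margins=True adds a 'Total' row and column, computed on the filtered rows only
df_heatmap = df2.pivot_table(index='origin', columns='value', aggfunc='count',
                             margins=True, margins_name='Total')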
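For the alternative mentioned at the start of this answer (filtering the pivot table itself), a minimal sketch could look like the following. It swaps in pd.crosstab to build the full count table and Series.nlargest to pick the top x labels; the names x, full, top_origins and top_destinations are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'destination_1': ['Germany', 'France', 'UK', 'India', 'China'],
                   'destination_2': ['China', 'Vietnam', 'Namibia', 'India', 'UK'],
                   'destination_3': ['France', 'Italy', 'Namibia', 'China', 'UK'],
                   'origin': ['Germany', 'US', 'UK', 'China', 'UK']})

x = 2  # number of origins and destinations to keep

df1 = df.melt(id_vars=['origin'],
              value_vars=['destination_1', 'destination_2', 'destination_3'])

# Full origin x destination count table, built before any filtering
full = pd.crosstab(df1['origin'], df1['value'])

# Rank both dimensions by their totals over the unfiltered table
top_origins = full.sum(axis=1).nlargest(x).index
top_destinations = full.sum(axis=0).nlargest(x).index

# Slice the full table down to the x-by-x block used for the heatmap
df_heatmap = full.loc[top_origins, top_destinations]
print(df_heatmap)
```

Because the ranking uses the unfiltered table, full.sum(axis=1) and full.sum(axis=0) remain available as totals over all mentions. Note that in the toy data Germany, US and China are tied with three mentions each, so which of them takes the second origin slot depends on how ties are broken.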
