
How to visualize saved data after extracting it from the dataset

This is what I extracted. Sample of the original dataset.

Basically, I want to extract, for each year, the genre with the highest value count, and then plot the result in a bar chart to answer the question: which genre is most popular from year to year?

The first idea is to create a three-column DataFrame with Series.reset_index, keep only the first row for each year with DataFrame.drop_duplicates, and reshape with DataFrame.pivot. This assumes the counts in temp_1 are sorted in descending order within each year, so that the first row per year is the most frequent genre:

# pivot arguments are keyword-only in pandas 2.0 and later
df = (temp_1.reset_index(name='count')
            .drop_duplicates('release_year')
            .pivot(index='release_year', columns='genres', values='count'))
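As a minimal sketch of this first approach, the snippet below runs it on a small made-up dataset. The post does not show how temp_1 was produced, so the `groupby(...).value_counts()` step, the `movies` frame, and its rows are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real dataset is not shown in the post.
movies = pd.DataFrame({
    "release_year": [2000, 2000, 2000, 2001, 2001, 2001],
    "genres": ["Drama", "Drama", "Comedy", "Action", "Action", "Drama"],
})

# Assumed origin of temp_1: genre counts per year, sorted descending within each year.
temp_1 = movies.groupby("release_year")["genres"].value_counts()

# Keep the first (most frequent) genre per year, then spread genres into columns.
top = (temp_1.reset_index(name="count")
             .drop_duplicates("release_year")
             .pivot(index="release_year", columns="genres", values="count"))
print(top)
```

Each row of `top` holds a single non-missing value: the count of that year's most frequent genre. If two genres tie within a year, only whichever comes first in temp_1 is kept.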

Alternatively, remove the duplicated years directly in the MultiIndex by combining Index.get_level_values, Index.duplicated and boolean indexing, then reshape with Series.unstack. The result is already a wide table with years as rows and genres as columns, which is the shape needed for plotting:

df = (temp_1[~temp_1.index.get_level_values('release_year').duplicated()]
            .unstack())
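The MultiIndex-based filtering can be sketched on toy data as follows. The counts and index values are hypothetical, and the Series is assumed to be sorted so that the largest count comes first within each year. The sketch stops at the unstacked wide table, since that is the shape a bar plot needs.

```python
import pandas as pd

# Hypothetical counts, already sorted descending within each year (an assumption about temp_1).
index = pd.MultiIndex.from_tuples(
    [(2000, "Drama"), (2000, "Comedy"), (2001, "Action"), (2001, "Drama")],
    names=["release_year", "genres"],
)
temp_1 = pd.Series([2, 1, 2, 1], index=index, name="count")

# True for every repeated year after its first occurrence.
is_repeat = temp_1.index.get_level_values("release_year").duplicated()

# Drop the repeats, then move the genres level into the columns.
top = temp_1[~is_repeat].unstack()
print(top)
```

Because the filtering works on the index alone, no intermediate three-column DataFrame has to be built.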

Finally, plot with DataFrame.plot.bar:

df.plot.bar()
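For the plotting step, here is a small self-contained sketch that assumes matplotlib is installed and uses a hypothetical wide table in place of the real df. Passing `stacked=True` is a suggestion rather than part of the answer above: each year has only one non-missing genre, so stacking draws one full-width bar per year instead of a thin bar surrounded by empty slots.

```python
import pandas as pd

# Hypothetical wide table: one row per year, one column per winning genre.
nan = float("nan")
df = pd.DataFrame(
    {"Action": [nan, 2.0], "Drama": [2.0, nan]},
    index=pd.Index([2000, 2001], name="release_year"),
)

# Missing values are drawn as zero-height bars; the legend maps colours to genres.
ax = df.plot.bar(stacked=True)
ax.set_ylabel("count")
ax.set_title("Most frequent genre per year")
```

In a script, follow this with `ax.figure.savefig(...)` or `matplotlib.pyplot.show()` to actually see the chart.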
