
Python: Scatter plot using group_by function in Pandas

I have a dataframe with a column named Genre. Each genre has multiple movie values (Movie_val). The format is shown below:

   Movie_val  Genre
       2      Fantasy
      11      Adventure
      12      Comedy
       2      Fantasy
       2      Adventure
      11      Adventure
      13      Thriller
      12      Fantasy
      10      Thriller
      11      Drama
       1      Fantasy

I need to group the rows by genre and plot each group in a scatter plot as a cluster (e.g., Action movies in one cluster or color, Adventure in another, etc.). I checked the matplotlib library, and it expects two values, X and Y, for a scatter plot. Each group will contain many movie values (e.g., the Adventure genre has many values), and I am not sure how to plot the values as a group.

Also, each group should be shown in a different color. I tried the code below for a bar plot, but I am looking for a scatter plot, which this approach doesn't support.

     # Median Movie_val per genre (column name matches the sample data above)
     result = df.groupby('Genre')['Movie_val'].quantile(0.5)
     result.sort_values().plot(kind='barh')

I am trying this in Python using the pandas library. Any help would be greatly appreciated.

The seaborn library can probably give you what you're after. Of course you still need to pick which columns of your data frame will provide the coordinates for the scatter plot.

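import matplotlib.pyplot as plt
import seaborn as sns

# One color per genre; seaborn 0.9+ uses `height` in place of the older `size` argument
g = sns.FacetGrid(df, hue="Genre", height=5)
g.map(plt.scatter, "column name for x dimension", "column name for y dimension", s=50, alpha=.7)
g.add_legend()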

See also the examples with more complex faceting here: https://stanford.edu/~mwaskom/software/seaborn/tutorial/axis_grids.html
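
If you would rather stay with plain matplotlib, the same idea can be sketched by looping over the groups and calling scatter once per genre, so each group gets the next color in the default cycle. This is only a rough sketch: the sample data is copied from the question, and because the question has a single numeric column, using the row index as the x coordinate is an assumption made purely for illustration. The output file name is also just a placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Movie_val": [2, 11, 12, 2, 2, 11, 13, 12, 10, 11, 1],
    "Genre": ["Fantasy", "Adventure", "Comedy", "Fantasy", "Adventure",
              "Adventure", "Thriller", "Fantasy", "Thriller", "Drama", "Fantasy"],
})

fig, ax = plt.subplots()

# One scatter call per genre: matplotlib assigns each call a new default color
for genre, group in df.groupby("Genre"):
    # Assumption: the row index stands in for the x coordinate
    ax.scatter(group.index, group["Movie_val"], label=genre, s=50, alpha=0.7)

ax.set_xlabel("row index (assumed x coordinate)")
ax.set_ylabel("Movie_val")
ax.legend(title="Genre")
fig.savefig("genre_scatter.png")  # placeholder file name
```

Swap group.index for whichever real column should supply the x values once you have decided on one.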
