
Plot specific values in a pandas df

I'm not sure if there's a more efficient way to do this. I've got a df with one column containing the information of interest. For the df below, I'm interested in column B. I want to create a separate plot for each of the values W, X, Y and Z. I'd also like a new line to start within a plot each time the value in B changes.

import pandas as pd
import matplotlib.pyplot as plt

d = {
    'A': [1, 2, 3, 4, 5, 6, 7, 8, 1, 3],
    'B': ['W', 'W', 'X', 'X', 'Y', 'Y', 'Z', 'Z', 'W', 'W'],
}

df = pd.DataFrame(data=d)

So this df would produce 4 different plots. There would be 2 lines for the W value.

I'm currently exporting the values from the above df into their own separate series. To plot the values relating to W, the exported output would be:

   W1  W2  X1  Y1  Z1
0   1   1   3   5   7
1   2   3   4   6   8

fig, ax = plt.subplots()

plt.plot(df['W1'])
plt.plot(df['W2'])

But this would mean I'm creating numerous separate series and plots. This would be very inefficient if my df contained thousands of rows that continuously changed between values.

Is there an easier way? I think I'll still have to export each value to its own series when the values change.

But I'm hoping there's an easier way to plot each series over the top of each other without doing this.

I think you need:

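# Label each consecutive run of equal values in B
g = df['B'].ne(df['B'].shift()).cumsum()
# Number the runs within each B value, so the second W run becomes '2'
df['C'] =  g.groupby(df['B']).transform(lambda x: pd.factorize(x)[0]).add(1).astype(str)
# Position of each row inside its own run
df['D'] = df.groupby(['B','C']).cumcount()
# One column per (B, C) pair, then flatten the column names to W1, X1, ...
df = df.set_index(['D','C','B'])['A'].unstack([2,1])
df.columns = df.columns.map(''.join)
print(df)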
   W1  X1  Y1  Z1  W2
D                    
0   1   3   5   7   1
1   2   4   6   8   3
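To see how the column labels above come about, it can help to print the intermediate values. This is only a sketch on the sample data from the question; the names `demo`, `run_id`, `run_no` and `pos` are made up for illustration and are not part of the code above.

```python
import pandas as pd

demo = pd.DataFrame({
    'A': [1, 2, 3, 4, 5, 6, 7, 8, 1, 3],
    'B': ['W', 'W', 'X', 'X', 'Y', 'Y', 'Z', 'Z', 'W', 'W'],
})

# A new run starts wherever B differs from the previous row.
run_id = demo['B'].ne(demo['B'].shift()).cumsum()
print(run_id.tolist())   # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

# Number the runs within each value of B: the second W run gets 2.
run_no = run_id.groupby(demo['B']).transform(lambda s: pd.factorize(s)[0]) + 1
print(run_no.tolist())   # [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]

# Position of each row inside its own run (this becomes the index D).
pos = demo.groupby(run_id).cumcount()
print(pos.tolist())      # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
```

The value in B combined with `run_no` gives the names W1, X1, Y1, Z1, W2, and `pos` lines the runs up row by row.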

df.groupby(df.columns.str[0], axis=1).plot()
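Note that recent pandas releases deprecate `axis=1` in `groupby`, so the line above may warn or fail depending on your version. A minimal sketch of an equivalent loop follows, assuming the wide frame printed earlier and single-character values in B; the names `wide`, `columns_by_value` and `axes` are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Wide frame in the same shape as the reshaped df printed above.
wide = pd.DataFrame({
    'W1': [1, 2], 'X1': [3, 4], 'Y1': [5, 6], 'Z1': [7, 8], 'W2': [1, 3],
})

# Collect column names by their first character ('W', 'X', ...).
# Assumption: every value in B is a single character, as in the question.
columns_by_value = {}
for col in wide.columns:
    columns_by_value.setdefault(col[0], []).append(col)

# One figure per value; each run of that value is drawn as its own line.
axes = {}
for value, cols in columns_by_value.items():
    fig, ax = plt.subplots()
    wide[cols].plot(ax=ax, title=value)
    axes[value] = ax
```

Call `plt.show()` afterwards to display the four figures. The W figure holds two lines (W1 and W2), the others one each.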

