I have a dataframe df with a 2-level MultiIndex. I want a scatter plot with level 0 on the x-axis and level 1 on the y-axis, with a dot for every combination that satisfies a condition, say a nonzero value in a specific column 'col'.
import matplotlib.pyplot as plt
from itertools import product
import numpy as np
import pandas as pd
lengths = [3, 2]
df_index = pd.MultiIndex.from_product([list(product([-1,1], repeat=li)) for li in lengths], names=['level1', 'level2'])
df_cols = ['cols']
df = pd.DataFrame([[0.] * len(df_cols)] * len(df_index), index=df_index, columns=df_cols)
df['cols'] = np.random.randint(0, 2, size = len(df))
df
This yields a dataframe of the following form:
                        cols
level1       level2
(-1, -1, -1) (-1, -1)      0
             (-1, 1)       0
             (1, -1)       0
             (1, 1)        0
(-1, -1, 1)  (-1, -1)      1
             (-1, 1)       0
             (1, -1)       1
             (1, 1)        1
(-1, 1, -1)  (-1, -1)      0
             (-1, 1)       0
             (1, -1)       0
             (1, 1)        0
(-1, 1, 1)   (-1, -1)      0
             (-1, 1)       0
             (1, -1)       1
             (1, 1)        0
(1, -1, -1)  (-1, -1)      0
             (-1, 1)       0
             (1, -1)       1
             (1, 1)        1
(1, -1, 1)   (-1, -1)      0
             (-1, 1)       1
             (1, -1)       1
             (1, 1)        0
...
Now, I want a scatter plot with the level1 index on the x-axis and the level2 index on the y-axis, such that there is a dot for every (x, y) with cols(x, y) != 0.
Let's first create an example dataframe with a 2-level MultiIndex:
import pandas as pd
import numpy as np
iterables = [[1, 2, 3, 4], [0,1, 2, 3, 4,5]]
my_multiindex=pd.MultiIndex.from_product(iterables, names=['first', 'second'])
series1 = pd.Series(np.random.randn(24), index=my_multiindex)
series2 = pd.Series(np.random.randn(24), index=my_multiindex)
df=pd.DataFrame({'col1':series1,'col2':series2})
Now, let's get the index values that satisfy a given condition:
index_values=df[df.col1<0].index.values
We then separate the x and y coordinates:
xs=[a[0] for a in index_values]
ys=[a[1] for a in index_values]
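As an alternative to the list comprehensions, the same coordinates can probably be read straight off the index with get_level_values. The following is only a sketch: it rebuilds a frame shaped like the example above, and the seeded generator and the name selected are my own additions for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the example above (seeded so the run is repeatable).
rng = np.random.default_rng(0)
my_multiindex = pd.MultiIndex.from_product(
    [[1, 2, 3, 4], [0, 1, 2, 3, 4, 5]], names=["first", "second"]
)
df = pd.DataFrame(
    {"col1": rng.standard_normal(24), "col2": rng.standard_normal(24)},
    index=my_multiindex,
)

# Keep only the rows that satisfy the condition, then read each index level by name.
selected = df[df["col1"] < 0]
xs = selected.index.get_level_values("first").to_numpy()
ys = selected.index.get_level_values("second").to_numpy()
```

This avoids looping over the index tuples in Python and keeps xs and ys aligned with the filtered rows.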
We then plot:
from matplotlib import pyplot as plt
plt.scatter(xs,ys)
If you want the size of the scatter dots to reflect the actual values, you can use:
column_values=abs(df[df.col1<0].col1.values)
plt.scatter(xs,ys,s=column_values*10)
Edit to reflect the edited question:
You would just need to convert your xs and ys to strings. I am also using a large figure so that the axis tick labels don't overlap:
plt.figure(figsize=(10,10))
plt.scatter([str(a) for a in xs],[str(a) for a in ys])
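Putting the pieces together for the dataframe from the question, a minimal end-to-end sketch might look like this. It assumes the condition is "cols is nonzero"; the seeded generator, the name nonzero, the axis labels and the tick rotation are illustrative choices of mine rather than part of the original answer.

```python
from itertools import product

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild the question's frame: both index levels hold tuples of -1/1.
rng = np.random.default_rng(0)
lengths = [3, 2]
df_index = pd.MultiIndex.from_product(
    [list(product([-1, 1], repeat=li)) for li in lengths],
    names=["level1", "level2"],
)
df = pd.DataFrame({"cols": rng.integers(0, 2, size=len(df_index))}, index=df_index)

# Keep the rows with a nonzero value, then split each index entry into x and y.
nonzero = df[df["cols"] != 0]
xs = [str(level1) for level1, level2 in nonzero.index]
ys = [str(level2) for level1, level2 in nonzero.index]

# Strings are plotted as categories, so each distinct tuple gets its own tick.
fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(xs, ys)
ax.set_xlabel("level1")
ax.set_ylabel("level2")
ax.tick_params(axis="x", labelrotation=45)  # keep the tuple labels readable
```

In a script, call plt.show() or fig.savefig(...) afterwards to see the result. Note that categories appear on each axis in the order they are first encountered, and only tuples that occur in at least one nonzero row get a tick.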