Python: Conditionally plotting data from many columns from a Dataframe in a loop

Question

I have about 200 pairs of columns in a dataframe that I would like to plot in a single plot. Each pair of columns can be thought of as related "x" and "y" variables. Some of the "y" variables are 0 at certain points in the data. I don't want to plot those points; I would rather they show up as a discontinuity in the plot. I have not been able to figure out an efficient way to exclude them. There is also a "date" variable that I don't need in the plot, but I am keeping it in the sample data to mirror the real data.

Here is a sample data set and what I have done with it. I created the sample in a hurry; in the original data, every pair of columns has a unique "y" value for a given "x" value.

import pandas as pd
import matplotlib.pyplot as plt
from numpy.random import randint

data1y = [n**3 -n**2+n for n in range(12)]
data1x = [randint(0, 100) for n in range(12)]
data1x.sort()
data2y = [n**3 for n in range(12)]
data2x = [randint(0, 100) for n in range(12)]
data2x.sort()
data3y = [n**3 - n**2 for n in range(12)]
data3x = [randint(0, 100) for n in range(12)]
data3x.sort()
data1y = [0 if x%7==0 else x for x in data1y]
data2y = [0 if x%7==0 else x for x in data2y]
data3y = [0 if x%7==0 else x for x in data3y]

date = ['Jan','Feb','Mar','Apr','May', 'Jun','Jul','Aug','Sep','Oct','Nov','Dec']
df = pd.DataFrame({'Date':date,'Var1':data1y, 'Var1x':data1x, 'Vartwo':data2y, 'Vartwox':data2x,'datatree':data3y, 'datatreex':data3x})

print(df)

ax = plt.gca()
fig = plt.figure()
for k in ['Var1','Vartwo','datatree']:
    df.plot(x=k+'x', y=k, kind = 'line',ax=ax)

The output I get is this:

I would like to see a discontinuity wherever the "y" variables are zero.

I have tried:

import numpy as np
df2 = df.copy()
df2[df2.Var1 < 0.5] = np.nan

But this makes an entire row NaN when I only want it to be a particular variable.
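
For reference, a small sketch of the difference between the two kinds of assignment, on a made-up three-row frame. The frame and the names `rows` and `cols` are illustrative only and are not part of the original data. A boolean mask used as the whole indexer selects rows, while `.loc` with a column label limits the assignment to that column:

```python
import numpy as np
import pandas as pd

# Illustrative frame, not the real data
df = pd.DataFrame({"Var1": [0, 5, 0], "Var1x": [1, 2, 3], "Vartwo": [7, 0, 9]})

# Boolean mask as the whole indexer: every column of the matching rows is overwritten
rows = df.astype(float)
rows[rows.Var1 < 0.5] = np.nan

# Same mask restricted to one column with .loc: only Var1 is changed
cols = df.astype(float)
cols.loc[cols.Var1 < 0.5, "Var1"] = np.nan

print(rows)
print(cols)
```

The columns are cast to float first because NaN is a floating-point value.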

I'm trying this, but it isn't working.

ax = plt.gca()
fig = plt.figure()
for k in ['Var1','Vartwo','datatree']:
    filter = df.k.values > 0
    x = df.k+'x'
    y = df.k
    plot(x[filter], y[filter], kind = 'line',ax=ax)

This works for a single variable, but I don't know how to loop it across 200 variables, and it also doesn't show the discontinuities.

import matplotlib.pyplot as plt
ax = plt.gca()
fig = plt.figure()
for k in ['Var1','Vartwo','datatree']:
    filter = df.Var1.values > 0
    x = df.Var1x[filter]
    y = df.Var1[filter]
    plt.plot(x, y)

Answer 1

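You're looking for .replace():

df2 = df.copy()
# Replace zeros with NaN in the "y" columns only, so the "x" columns stay intact
cols_to_replace = ['Var1','Vartwo','datatree']
df2[cols_to_replace] = df2[cols_to_replace].replace({0:np.nan})

fig, ax = plt.subplots()
for k in ['Var1','Vartwo','datatree']:
    # NaN values are not drawn, which leaves a gap in the line
    df2.plot(x=k+'x', y=k, kind = 'line',ax=ax)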
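
With roughly 200 pairs, listing the columns by hand gets tedious. Here is a minimal sketch of one way to generalize, assuming every "x" column is named as its "y" column plus a trailing 'x', as in the sample data. The helper names `pair_columns` and `plot_pairs_with_gaps` are made up for illustration, and the sketch calls matplotlib directly instead of `df.plot`:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def pair_columns(df, suffix="x"):
    """Return (x_col, y_col) pairs for every column that has a '<name><suffix>' partner."""
    cols = set(df.columns)
    return [(c + suffix, c) for c in df.columns if c + suffix in cols]


def plot_pairs_with_gaps(df, ax=None, suffix="x", legend=True):
    """Plot each (x, y) pair on one axes, leaving a gap wherever y is 0."""
    if ax is None:
        _, ax = plt.subplots()
    for x_col, y_col in pair_columns(df, suffix):
        # Zeros become NaN; matplotlib does not draw NaN points, so the line breaks there
        y = df[y_col].astype(float).mask(df[y_col] == 0)
        ax.plot(df[x_col], y, label=y_col)
    if legend:
        ax.legend()
    return ax
```

Calling `plot_pairs_with_gaps(df)` on the sample frame from the question should pick up the three pairs and skip `Date`, since there is no `Datex` column. With 200 lines the legend is probably better switched off via `legend=False`.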

Result:

solution1
1 ACCEPTED 2020-05-18 11:08:40