Creating subplots with filtered y values

Question

I have a plot that looks like this:

I want to create subplots, each with its own linear regression, for the y-axis values in the ranges 0-5, 5-10, and 10-15 days. My code currently looks like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression

#Read in proper dataframe
storms = pd.read_csv('storms.csv', sep=',', header=0)

#Define variables for linear regression - frequency
x_col = 'season'
y_col = 'days'
x = storms[x_col]
y = storms[y_col]
x_array = np.array(x).reshape(-1, 1)
y_array = np.array(y).reshape(-1, 1)

#Perform linear regression for frequency
linreg = LinearRegression().fit(x_array, y_array)

#Plot
sns.set_theme(context='notebook', style='darkgrid')
sns.light_palette("#79C")

plt.scatter(x_array, y_array, alpha=0.25)
plt.plot(x_array, linreg.predict(x_array), label='y=-0.0154x+38.5978')
plt.xlabel('Season')
plt.ylabel('Duration of Storms (in days)')
plt.title('Duration of Storms Over Time')
plt.legend()
plt.show()

I tried defining a function to filter the y-axis values and then passing the result to a plt.subplot plot, like this:

def Filter1(f):
    if f <= 5:
        return False
    else:
        return True

y_filtered = Filter1(y_array)

#Plotting subplots
plt.subplot(3,1,1)
plt.plot(x_array,y_filtered)

But no success yet. Any suggestions?

Answer 1

When you take your x and y values from the dataframe columns, you can convert them to NumPy arrays with the .values attribute:

x = storms[x_col].values
y = storms[y_col].values

Then you don't have to define any function to filter the values; you can do it with NumPy's boolean indexing. A comparison on an array returns an array of True and False values. For example, x > 5 gives an array of the same size as x that is False at every index where the value is less than or equal to 5 and True where it is greater than 5. You can then use that boolean array as a mask for other arrays of the same size, which returns only the elements at the indices where the mask is True. That's how you can filter with NumPy:

mask = (y>=5) & (y<10)
y_filtered = y[mask]
x_filtered = x[mask]
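To see what the mask does, here is a minimal sketch with made-up values. The numbers are invented for illustration and are not taken from storms.csv.

```python
import numpy as np

# Made-up values for illustration only (not from storms.csv)
x = np.array([1990, 1991, 1992, 1993, 1994])
y = np.array([2.0, 5.0, 7.5, 10.0, 12.0])

# Boolean array with the same length as y: True only where 5 <= y < 10
mask = (y >= 5) & (y < 10)

# Indexing with the mask keeps the elements at the True positions,
# so x_filtered and y_filtered stay aligned pair by pair
y_filtered = y[mask]
x_filtered = x[mask]

print(mask)
print(x_filtered, y_filtered)
```

The parentheses around each comparison are needed because & binds more tightly than >= and <.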

Then perform the regression with x_filtered and y_filtered. Note that LinearRegression expects a 2-D x, so reshape it with x_filtered.reshape(-1, 1) as in your original code.
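Putting the pieces together, one possible sketch of the three subplots is below. It rests on several assumptions: the data is random stand-in data rather than storms.csv, the function name plot_duration_subplots is made up for this example, the ranges are treated as half-open (5 belongs to 5-10, not 0-5), and np.polyfit is used in place of LinearRegression to keep the example short. Both fit an ordinary least-squares line.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_duration_subplots(x, y, bins=((0, 5), (5, 10), (10, 15))):
    """Draw one scatter plot with a least-squares line per y range.

    Hypothetical helper for illustration; each range needs at least
    two points for the fit to work.
    """
    fig, axes = plt.subplots(len(bins), 1, sharex=True, squeeze=False,
                             figsize=(6, 9))
    fits = []
    for ax, (lo, hi) in zip(axes.ravel(), bins):
        # Half-open range [lo, hi): an assumption about the boundaries
        mask = (y >= lo) & (y < hi)
        x_f, y_f = x[mask], y[mask]

        # Degree-1 polynomial fit returns (slope, intercept)
        slope, intercept = np.polyfit(x_f, y_f, deg=1)
        fits.append((slope, intercept))

        # Sort by x so the regression line is drawn left to right
        order = np.argsort(x_f)
        ax.scatter(x_f, y_f, alpha=0.25)
        ax.plot(x_f[order], slope * x_f[order] + intercept,
                label=f"y={slope:.4f}x{intercept:+.4f}")
        ax.set_ylabel(f"{lo}-{hi} days")
        ax.legend()
    axes.ravel()[-1].set_xlabel("Season")
    fig.tight_layout()
    return fig, fits


# Random stand-in data (assumption): replace with
# storms['season'].values and storms['days'].values
rng = np.random.default_rng(0)
x = rng.integers(1950, 2021, size=300)
y = rng.uniform(0, 15, size=300)

fig, fits = plot_duration_subplots(x, y)
```

Calling plt.show() afterwards displays the figure. Computing each mask inside the loop keeps x and y filtered together, which is what the original Filter1 attempt was missing.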
