Mark specific points based on conditions in Matplotlib

I plotted the minimum points of df['Data'].

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Timestamp = pd.date_range('2020-02-06 08:23:04', periods=1000, freq='s')
df = pd.DataFrame({'Timestamp': Timestamp,
                   'Data': 30+15*np.cos(np.linspace(0,10,Timestamp.size))})

# forward differences: each row is compared with the next one
df['timediff'] = (df['Timestamp'].shift(-1) - df['Timestamp']).dt.total_seconds()
df['datadiff'] = df['Data'].shift(-1) - df['Data']
df['gradient'] = df['datadiff'] / df['timediff']

min_pt = np.min(df['Data'])
# filter_pt = df.loc(df['gradient'] >= -0.1) # & df.loc[i, 'gradient'] <=0.1

# blue for the minimum point(s), yellow for everything else
mask = np.array(df['Data']) == min_pt
color = np.where(mask, 'blue', 'yellow')

fig, ax = plt.subplots(figsize=(20,10))
# plt.plot_date(df['Timestamp'], df['Data'], '-' )
ax.scatter(df['Timestamp'], df['Data'], color=color, s=10)
plt.show()

The plot looks like this: [image]

I want to extend the condition using the df['gradient'] column:

  1. What if, instead of marking only the minimum points, I want to mark the points where the gradient lies between -0.1 and 0.1 inclusive?
  2. Additional condition: take only the first data point in each such range (i.e. between -0.1 and 0.1 inclusive).
  3. How do I loop through the whole dataset, rather than just taking the first data point that satisfies these conditions (which is what my current plot does)?

I tried to add:


df1 = df[df.gradient <= 0.1 & df.gradient >= -0.1]
plt.plot(df1.Timestamp,df1.Data, label="filter")

before the mask, based on this answer, which returned an error:

TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

I think what I did wasn't very efficient. How can I do it more efficiently?


Update:

With the code

Timestamp = pd.date_range('2020-02-06 08:23:04', periods=1000, freq='s')
df = pd.DataFrame({'Timestamp': Timestamp,
                   'Data': 30+15*np.cos(np.linspace(0,10,Timestamp.size))})

df['timediff'] = (df['Timestamp'].shift(-1) - df['Timestamp']).dt.total_seconds()    
df['datadiff'] = df['Data'].shift(-1) - df['Data']
df['gradient'] = df['datadiff'] / df['timediff']

fig,ax = plt.subplots(figsize=(20,10))
df1 = df[(df.gradient <= 0.1) & (df.gradient >= -0.1)]
plt.plot(df1.Timestamp,df1.Data, label="filter")
plt.show()

it returned: [image]

After changing the range to

df1 = df[(df.gradient <= 0.01) & (df.gradient >= -0.01)]

it returned: [image]

Why?

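Wrap each condition in parentheses so that the logical AND is applied row by row. Without them, & binds more tightly than <= and >=, so Python tries to evaluate 0.1 & df.gradient first, which is what raises the TypeError.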

df1 = df[(df.gradient <= 0.1) & (df.gradient >= -0.1)]
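
To make the precedence issue concrete, here is a minimal sketch on a small made-up Series (the values in `gradient` are hypothetical, chosen only for illustration and not taken from the question's data). It also shows `Series.between`, which should give the same inclusive range test without needing the parentheses:

```python
import pandas as pd

# Hypothetical gradient values, for illustration only.
gradient = pd.Series([-0.15, -0.1, 0.0, 0.05, 0.1, 0.2])

# Without parentheses, `&` is evaluated before `<=` and `>=`,
# so pandas is asked to compute `0.1 & gradient`, which fails
# (a TypeError in the question).
try:
    gradient <= 0.1 & gradient >= -0.1
    failed = False
except (TypeError, ValueError):
    failed = True

# Parenthesised version: each comparison yields a boolean Series first.
mask_parens = (gradient <= 0.1) & (gradient >= -0.1)

# Equivalent inclusive test; both bounds are included by default.
mask_between = gradient.between(-0.1, 0.1)
```

Either mask can then be used as `df[mask]`; `between` is mostly a readability choice.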

Also consider using a scatter plot. Otherwise plt.plot draws a line across each gap, connecting the last point before a stretch where the absolute value of the gradient is greater than 0.1 to the first point after it.

plt.scatter(df1.Timestamp,df1.Data, label="filter")
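
If the goal is to mark the in-range points on top of the full series (point 1 of the question), one option is to reuse the question's `np.where` colouring with a gradient mask in place of the minimum mask. This is only a sketch on the question's synthetic data; the name `in_range` and the colours are illustrative choices:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same synthetic data as in the question.
timestamp = pd.date_range('2020-02-06 08:23:04', periods=1000, freq='s')
df = pd.DataFrame({'Timestamp': timestamp,
                   'Data': 30 + 15 * np.cos(np.linspace(0, 10, timestamp.size))})

# Forward-difference gradient, as in the question.
df['timediff'] = (df['Timestamp'].shift(-1) - df['Timestamp']).dt.total_seconds()
df['datadiff'] = df['Data'].shift(-1) - df['Data']
df['gradient'] = df['datadiff'] / df['timediff']

# True where -0.1 <= gradient <= 0.1; the last row has a NaN gradient
# and therefore comes out as False.
in_range = df['gradient'].between(-0.1, 0.1)

# One colour per point: blue where the condition holds, yellow elsewhere.
color = np.where(in_range, 'blue', 'yellow')

fig, ax = plt.subplots(figsize=(20, 10))
ax.scatter(df['Timestamp'], df['Data'], color=color, s=10)
```

Call `plt.show()` afterwards to display the figure. Because every point is drawn, the full curve stays visible and only the colour changes.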

This would be the final image: [image]

EDIT

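If you need only the first point of each stretch where the gradient is in range, create groups and then use groupby. The cumulative sum increases on every out-of-range row, so all rows in one in-range stretch share the same group number.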

df['groups'] = ((df.gradient > 0.1) | (df.gradient < -0.1)).cumsum()

# the outer parentheses let the method chain continue on the next line
df2 = (df[(df.gradient <= 0.1) & (df.gradient >= -0.1)]
       .groupby('groups').agg({'Timestamp':'first', 'Data':'first'}))

#        Timestamp              Data
# groups        
# 0      2020-02-06 08:23:04    45.000000
# 168    2020-02-06 08:27:05    18.814188
# 336    2020-02-06 08:32:19    41.201294
# 504    2020-02-06 08:37:33    18.783251
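
As an alternative sketch that avoids groupby: a point starts a new stretch when it is in range and the point before it is not. The code below assumes the question's synthetic data; `in_range`, `starts` and `first_points` are illustrative names. It should select the same four timestamps as the table above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same synthetic data and forward-difference gradient as in the question.
timestamp = pd.date_range('2020-02-06 08:23:04', periods=1000, freq='s')
df = pd.DataFrame({'Timestamp': timestamp,
                   'Data': 30 + 15 * np.cos(np.linspace(0, 10, timestamp.size))})
df['timediff'] = (df['Timestamp'].shift(-1) - df['Timestamp']).dt.total_seconds()
df['gradient'] = (df['Data'].shift(-1) - df['Data']) / df['timediff']

# True where -0.1 <= gradient <= 0.1 (NaN counts as out of range).
in_range = df['gradient'].between(-0.1, 0.1)

# A stretch starts where the current row is in range and the previous row
# is not; the row before the first one is treated as out of range.
starts = in_range & ~in_range.shift(fill_value=False)
first_points = df.loc[starts, ['Timestamp', 'Data']]

# Draw the full series faintly and mark only the first point of each stretch.
fig, ax = plt.subplots(figsize=(20, 10))
ax.scatter(df['Timestamp'], df['Data'], color='yellow', s=10)
ax.scatter(first_points['Timestamp'], first_points['Data'],
           color='blue', s=60, label='first in-range point')
ax.legend()
```

This works over the whole dataset in one vectorised pass, so no explicit loop is needed.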


 