
How to add outliers as separate colored markers to a line plot

val             time
5.6     2021-11-18 03:00:00
2.034   2021-11-18 05:00:00
1.171   2021-11-18 07:00:00
3.023   2021-11-18 09:00:00
4.202   2021-11-18 16:00:00
1.202   2021-11-18 17:00:00
5.202   2021-11-18 18:00:00
7.202   2021-11-18 19:00:00
2.202   2021-11-18 20:00:00
12.202  2021-11-18 21:00:00
1.202   2021-11-18 21:00:00

Above is my dataframe. I want to plot it (x=time, y=value) and draw the points where val > 5 in red.

plt.plot(ab['time'], ab['value'], '-gD', markevery=marks, label='line with select markers')

where marks, [7.202, 12.202], is a list I created manually. But this does not work. Error: markevery is iterable but not a valid numpy fancy index

I found one approach here ("if condition is true python 3"), but it is time-consuming when there are many points.

  • The easiest solution is to use Boolean indexing to create a separate dataframe for values greater than 5, and then plot them as a scatter plot with pandas.DataFrame.plot.
  • The x-axis is automatically formatted as %m-%d %H. The format changes when there is more data, and other answers discuss how to format a pandas datetime axis.
import pandas as pd
import matplotlib.pyplot as plt

# sample data
data = {'val': [5.6, 2.034, 1.171, 3.023, 4.202, 1.202, 5.202, 7.202, 2.202, 12.202, 1.202], 'time': ['2021-11-18 03:00:00', '2021-11-18 05:00:00', '2021-11-18 07:00:00', '2021-11-18 09:00:00', '2021-11-18 16:00:00', '2021-11-18 17:00:00', '2021-11-18 18:00:00', '2021-11-18 19:00:00', '2021-11-18 20:00:00', '2021-11-18 21:00:00', '2021-11-18 21:00:00']}
df = pd.DataFrame(data)

# convert the time column to a datetime dtype
df.time = pd.to_datetime(df.time)

# get the values greater than 5
masked = df[df.val.gt(5)]

# plot the line plot
ax = df.plot(x='time', marker='o', figsize=(15, 5), zorder=0)

# plot those greater than 5
masked.plot(kind='scatter', x='time', y='val', color='red', ax=ax, s=30, label='outliers')
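
The markevery attempt from the question can also be made to work. The error appears because markevery expects the integer positions of the points to mark, not their values. The following is a minimal sketch under that reading, not the method used in the answer above: deriving the positions with np.flatnonzero, the red marker colors, and the explicit DateFormatter('%m-%d %H') call are assumptions chosen for illustration.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same sample data as in the answer above
data = {
    'val': [5.6, 2.034, 1.171, 3.023, 4.202, 1.202,
            5.202, 7.202, 2.202, 12.202, 1.202],
    'time': ['2021-11-18 03:00:00', '2021-11-18 05:00:00', '2021-11-18 07:00:00',
             '2021-11-18 09:00:00', '2021-11-18 16:00:00', '2021-11-18 17:00:00',
             '2021-11-18 18:00:00', '2021-11-18 19:00:00', '2021-11-18 20:00:00',
             '2021-11-18 21:00:00', '2021-11-18 21:00:00'],
}
df = pd.DataFrame(data)
df['time'] = pd.to_datetime(df['time'])

# markevery takes integer positions, so convert the boolean mask to positions
marks = np.flatnonzero(df['val'].gt(5).to_numpy()).tolist()

fig, ax = plt.subplots(figsize=(15, 5))

# green line; diamond markers are drawn only at the positions in marks
# (the red marker colors are an illustrative choice)
ax.plot(df['time'], df['val'], '-gD', markevery=marks,
        markerfacecolor='red', markeredgecolor='red',
        label='line with select markers')

# optional: pin the tick label format instead of relying on the automatic one
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H'))
ax.legend()
```

Call plt.show() to display the figure. Because the positions come from one vectorized comparison, this avoids looping over the points, which was the concern with the approach linked in the question.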


