
How to iterate over the rows and index of a pandas DataFrame to filter boolean values

I am working on a project to find anomalies in some stock market tickers by looking for abnormal volumes. I'm struggling to filter the True values (those that pass the filter). The main objective is to create a data frame with the tickers that passed the stats filter.

import numpy as np
import pandas as pd
from pandas_datareader import data as web

Get data frame

tickers = ['F', 'GE', 'GM','TSLA']
data = pd.DataFrame()
for t in tickers:
    data[t] = web.DataReader(t, data_source='yahoo', start='2020-1-1')['Volume']

Stats filters

data_std = data.std()
data_mean = data.mean()
anomaly_cut_off = data_std * 3
upper_limit = data_mean + anomaly_cut_off

Data frame with boolean values (True or False)

outlier = data > upper_limit
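To see what this comparison produces, here is a minimal sketch on made-up numbers. The tickers AAA and BBB, the dates, and the volumes are all hypothetical, used only so the example runs without downloading anything. The Series of per-ticker limits is aligned with the DataFrame's columns, so each column is compared against its own limit.

```python
import pandas as pd

# Made-up volumes for two hypothetical tickers; not real market data.
dates = pd.date_range("2020-01-01", periods=20, freq="D")
demo = pd.DataFrame({"AAA": [100.0] * 20, "BBB": [50.0] * 20}, index=dates)
demo.loc[dates[5], "AAA"] = 10_000.0   # inject one volume spike per ticker
demo.loc[dates[12], "BBB"] = 9_000.0

# Same filter as above: mean + 3 standard deviations, computed per column.
demo_limit = demo.mean() + 3 * demo.std()

# The limits (a Series indexed by ticker) line up with the column labels,
# so the result is a boolean DataFrame with the same shape as demo.
demo_outlier = demo > demo_limit
print(demo_outlier.sum())  # number of flagged days per ticker
```

With so few rows a single spike inflates the standard deviation considerably, so this toy frame only illustrates the mechanics of the comparison.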

anomalies should be a data frame with the date (index) and the ticker ('F', 'GE', 'GM', 'TSLA'), only where the value is True. The code below worked when I changed the DataFrame to np.array(data), but only with a single ticker.

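def find_anomalies(values, upper_limit):
    # Works for one ticker at a time: values is a 1-D array of volumes
    # and upper_limit is a single number.
    anomalies = []
    for value in values:
        if value > upper_limit:
            anomalies.append(value)
    return anomalies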
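One likely reason a plain loop behaves differently on a DataFrame than on an array is that iterating over a DataFrame yields its column labels, not its rows or values. A small sketch with a made-up frame (the names AAA and BBB are hypothetical):

```python
import pandas as pd

demo = pd.DataFrame({"AAA": [1, 2], "BBB": [3, 4]})

# A bare for-loop over a DataFrame walks the column labels.
labels = [item for item in demo]
print(labels)

# To visit each column's values, iterate over (label, Series) pairs.
pairs = [(label, column.tolist()) for label, column in demo.items()]
print(pairs)
```

So in a loop over the full frame, each item is a ticker name such as 'F', not a volume.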

If you want to return the rows where at least one of your tickers is True, this works:

outlier[outlier.any(axis=1)]
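If you would rather have one row per (date, ticker) pair, one possible approach is to stack the boolean frame into a long Series and keep only the True entries. This is a sketch on a made-up boolean frame: the tickers AAA and BBB and the dates are hypothetical, and the column names Date and Ticker are my own choice.

```python
import pandas as pd

# Hypothetical boolean frame standing in for `outlier`.
dates = pd.to_datetime(["2020-03-02", "2020-03-03", "2020-03-04"])
demo_outlier = pd.DataFrame(
    {"AAA": [False, True, False], "BBB": [False, True, True]},
    index=dates,
)

# Rows where at least one ticker is flagged (same idea as above).
flagged_rows = demo_outlier[demo_outlier.any(axis=1)]

# Long format: stack() gives a Series indexed by (date, ticker);
# indexing it with itself keeps only the True entries.
flags = demo_outlier.stack()
anomalies = flags[flags].index.to_frame(index=False)
anomalies.columns = ["Date", "Ticker"]
print(anomalies)
```

No explicit Python loop over rows is needed; the same three lines should apply unchanged to the real `outlier` frame.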

