
Pandas: Iterate over rows and find frequency of occurrences

I have a dataframe with 2 columns and 3000 rows.

The first column represents time in time-steps: the first row is 0, the second is 1, ..., and the last is 2999.

The second column represents pressure. The pressure changes from row to row but shows repetitive behaviour: every few steps it drops to its minimum value (which is 375), rises again, then returns to 375, and so on.

What I want to do in Python is iterate over the rows and:

1) find the time-steps at which the pressure is at its minimum;

2) find how frequently the minimum values occur.

import pandas as pd

# Assumes a semicolon-delimited file, as the delimiter in the original loadtxt call suggests.
df = pd.read_csv('test.csv', sep=';')

# Name the two columns, then print them by label.
df.columns = ["Timestamp", "Pressure"]
print(df[["Timestamp", "Pressure"]])

You don't need to iterate row-wise. You can compare the entire column against its min value to build a boolean mask, then use the mask to find the differences between the matching time-steps:

Data setup:

In [44]:
df = pd.DataFrame({'timestep':np.arange(20), 'value':np.random.randint(375, 400, 20)})
df

Out[44]:
    timestep  value
0          0    395
1          1    377
2          2    392
3          3    396
4          4    377
5          5    379
6          6    384
7          7    396
8          8    380
9          9    392
10        10    395
11        11    393
12        12    390
13        13    393
14        14    397
15        15    396
16        16    393
17        17    379
18        18    396
19        19    390

Mask the df by comparing the column against the min value:

In [45]:    
df[df['value']==df['value'].min()]

Out[45]:
   timestep  value
1         1    377
4         4    377
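
If the pressure column holds floats rather than integers, an exact == comparison can miss readings that differ from the minimum only by rounding noise. The sketch below is one possible way to handle that with np.isclose; the data and the tolerance of 1e-6 are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical float readings; 375.0000001 is a near-minimum that == misses.
df = pd.DataFrame({
    "timestep": np.arange(6),
    "value": [380.2, 375.0, 379.1, 375.0000001, 381.4, 375.0],
})

minimum = df["value"].min()

# Exact comparison, as in the mask above.
exact = df[df["value"] == minimum]

# Tolerant comparison; atol=1e-6 is an assumed tolerance, rtol is disabled.
close = df[np.isclose(df["value"], minimum, rtol=0.0, atol=1e-6)]

print(exact["timestep"].tolist())  # [1, 5]
print(close["timestep"].tolist())  # [1, 3, 5]
```

For integer data such as the example above, the exact comparison is enough.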

We can use the mask with loc to find the corresponding 'timestep' value and use diff to find the interval differences:

In [48]:    
df.loc[df['value']==df['value'].min(),'timestep'].diff()

Out[48]:
1    NaN
4    3.0
Name: timestep, dtype: float64
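
Because the random data above rarely repeats its minimum, the same steps can be easier to follow on a signal that is periodic like the one in the question. This is a minimal sketch on made-up data: the 20-row signal and its period of 5 steps are assumptions, and the column names follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical periodic signal: pressure dips to 375 every 5 time-steps.
timestep = np.arange(20)
pressure = 375 + 5 * (timestep % 5)
df = pd.DataFrame({"Timestamp": timestep, "Pressure": pressure})

# Boolean mask: True where the pressure equals the column minimum.
is_min = df["Pressure"] == df["Pressure"].min()

# Time-steps at which the minimum occurs.
min_steps = df.loc[is_min, "Timestamp"]

# Interval between consecutive minima; the first diff is NaN, so drop it.
intervals = min_steps.diff().dropna()

print(min_steps.tolist())  # [0, 5, 10, 15]
print(intervals.tolist())  # [5.0, 5.0, 5.0]
```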

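The values above are the intervals between minima, measured in time-steps. The frequency is the reciprocal of that interval; scale it by the duration of one time-step to express it per minute or in whatever frequency unit you want.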
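
A minimal sketch of turning intervals into a frequency. The interval values and the one-second step duration are assumptions made for illustration; the question does not say how long a time-step is.

```python
import pandas as pd

# Hypothetical intervals between minima, in time-steps.
intervals = pd.Series([3.0, 5.0, 4.0])

# Assumption: one time-step lasts one second.
seconds_per_step = 1.0

# Mean period in seconds, then occurrences per minute.
period_s = intervals.mean() * seconds_per_step
freq_per_min = 60.0 / period_s

print(freq_per_min)  # 15.0
```

If the intervals vary a lot, reporting their distribution (for example with intervals.describe()) may say more than a single averaged frequency.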
