
Python: Delete specific timestamp index row (independently of date)

I have a DataFrame with the following timestamp index:

2011-01-07 09:30:00
2011-01-07 09:35:00
2011-01-07 09:40:00
...
2011-01-08 09:30:00
2011-01-08 09:35:00
2011-01-08 09:40:00
...
2011-01-09 09:30:00
2011-01-09 09:35:00
2011-01-09 09:40:00

Is there a fast way, without going through some kind of loop, to delete every row whose time is 09:30:00, regardless of the date?

Construct a test frame:

In [27]: import numpy as np; from pandas import DataFrame, date_range

In [28]: df = DataFrame(np.random.randn(400,1),index=date_range('20130101',periods=400,freq='15min'))

In [29]: df = df.take(df.index.indexer_between_time('9:00','10:00'))

In [30]: df
Out[30]: 
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:30:00 -0.654370
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:30:00  1.481053
2013-01-02 09:45:00  0.327030
2013-01-02 10:00:00  1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00  0.563914
2013-01-03 09:30:00 -0.117773
2013-01-03 09:45:00  0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00  0.930746
2013-01-04 09:30:00  1.088279
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625

[20 rows x 1 columns]
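
The selection step above can also be written with DataFrame.between_time, which filters by time of day directly instead of going through integer positions. A minimal sketch, assuming a frame built the same way as the test frame (the names by_position and by_label are only illustrative):

```python
import numpy as np
import pandas as pd

# Same shape as the test frame above; the values are random.
df = pd.DataFrame(
    np.random.randn(400, 1),
    index=pd.date_range("20130101", periods=400, freq="15min"),
)

# Positional route, as in the session above.
by_position = df.take(df.index.indexer_between_time("9:00", "10:00"))

# Label-based shortcut; both endpoints are included by default.
by_label = df.between_time("9:00", "10:00")
```

Both routes should select the same rows; the positional one is what the removal step relies on.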

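indexer_between_time returns the integer positions of the rows we want to remove. Take the matching labels from the index and remove them from the original index. Older pandas versions did this set difference with the - operator on an index; current versions no longer treat - that way, so use Index.difference instead.

In [31]: df.reindex(df.index.difference(df.index.take(df.index.indexer_between_time('9:30:00','9:30:00'))))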
Out[31]: 
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:45:00  0.327030
2013-01-02 10:00:00  1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00  0.563914
2013-01-03 09:45:00  0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00  0.930746
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625

[16 rows x 1 columns]
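
The same removal can be sketched with DataFrame.drop, which avoids the index arithmetic altogether. This is one possible variant, not necessarily what the answer's author would write; the frame is rebuilt here so the example runs on its own, and the names positions and result are illustrative:

```python
import numpy as np
import pandas as pd

# Test frame mirroring the one above: 15-minute data, 09:00 to 10:00 only.
idx = pd.date_range("2013-01-01", periods=400, freq="15min")
df = pd.DataFrame(np.random.randn(400, 1), index=idx)
df = df.take(df.index.indexer_between_time("9:00", "10:00"))

# Integer positions of the rows stamped exactly 09:30, on any date.
positions = df.index.indexer_between_time("9:30", "9:30")

# Drop those rows by label; the remaining rows keep their original order.
result = df.drop(df.index[positions])
```

Because start and end are the same time and both endpoints are inclusive, only the 09:30:00 rows are matched.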

You need to do something like this:

>>> x = pd.DataFrame([[1,2,3,4],[3,3,3,3],[8,7,3,2],[9,9,9,4],[2,2,2,4]])
>>> x
   0  1  2  3
0  1  2  3  4
1  3  3  3  3
2  8  7  3  2
3  9  9  9  4
4  2  2  2  4

[5 rows x 4 columns]
>>> x[x[3] == 4]
   0  1  2  3
0  1  2  3  4
3  9  9  9  4
4  2  2  2  4

[3 rows x 4 columns]

In your case the condition would be on the timestamps. x[x[3] == 4] keeps only those rows for which column 3 has a value of 4.
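
Applied to the question, where the timestamps are the index rather than a column, the boolean-mask idea might look like the sketch below. The frame and the column name value are made up to match the shape of the question's data:

```python
from datetime import time

import pandas as pd

# Hypothetical frame shaped like the one in the question.
stamps = pd.to_datetime([
    "2011-01-07 09:30:00", "2011-01-07 09:35:00", "2011-01-07 09:40:00",
    "2011-01-08 09:30:00", "2011-01-08 09:35:00", "2011-01-08 09:40:00",
    "2011-01-09 09:30:00", "2011-01-09 09:35:00", "2011-01-09 09:40:00",
])
df = pd.DataFrame({"value": range(9)}, index=stamps)

# index.time holds one datetime.time per row, so the comparison
# ignores the date part entirely.
mask = df.index.time != time(9, 30)
filtered = df[mask]
```

The mask is True for every row whose time of day is not 09:30:00, so indexing with it keeps only those rows.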
