I have a DataFrame with the following timestamp index:
2011-01-07 09:30:00
2011-01-07 09:35:00
2011-01-07 09:40:00
...
2011-01-08 09:30:00
2011-01-08 09:35:00
2011-01-08 09:40:00
...
2011-01-09 09:30:00
2011-01-09 09:35:00
2011-01-09 09:40:00
Without going through some kind of loop, is there a fast way to delete every row with the time 09:30:00, regardless of the date?
Construct a test frame:
In [27]: import numpy as np; import pandas as pd
In [28]: df = pd.DataFrame(np.random.randn(400,1),index=pd.date_range('20130101',periods=400,freq='15min'))
In [29]: df = df.take(df.index.indexer_between_time('9:00','10:00'))
In [30]: df
Out[30]:
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:30:00 -0.654370
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:30:00 1.481053
2013-01-02 09:45:00 0.327030
2013-01-02 10:00:00 1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00 0.563914
2013-01-03 09:30:00 -0.117773
2013-01-03 09:45:00 0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00 0.930746
2013-01-04 09:30:00 1.088279
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625
[20 rows x 1 columns]
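indexer_between_time returns the integer positions of the rows we want to remove. Take the matching labels and remove them from the original index with a set difference. (When this answer was written, - on an index performed that set difference; current pandas versions use Index.difference for it.)
In [31]: df.reindex(df.index.difference(df.index.take(df.index.indexer_between_time('9:30:00','9:30:00'))))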
Out[31]:
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:45:00 0.327030
2013-01-02 10:00:00 1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00 0.563914
2013-01-03 09:45:00 0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00 0.930746
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625
[16 rows x 1 columns]
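The same removal can also be written with DataFrame.drop instead of reindexing on a set difference. The following is a minimal sketch, assuming a recent pandas version; the names idx, positions, and result are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild a frame like the one above: 15-minute samples, 09:00-10:00 only.
idx = pd.date_range("2013-01-01", periods=400, freq="15min")
df = pd.DataFrame(np.random.randn(400, 1), index=idx)
df = df.take(df.index.indexer_between_time("9:00", "10:00"))

# Integer positions of every 09:30:00 row, whatever the date.
positions = df.index.indexer_between_time("9:30:00", "9:30:00")

# Translate the positions to index labels and drop those rows.
result = df.drop(df.index[positions])

print(len(df), len(result))  # 20 16
```

Because drop works on labels, this variant assumes the index has no duplicate timestamps that should be kept.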
You need to do something like this:
>>> import pandas as pd
>>> x = pd.DataFrame([[1,2,3,4],[3,3,3,3],[8,7,3,2],[9,9,9,4],[2,2,2,4]])
>>> x
   0  1  2  3
0  1  2  3  4
1  3  3  3  3
2  8  7  3  2
3  9  9  9  4
4  2  2  2  4
[5 rows x 4 columns]
>>> x[x[3] == 4]
   0  1  2  3
0  1  2  3  4
3  9  9  9  4
4  2  2  2  4
[3 rows x 4 columns]
In your case the condition would be on the timestamps. x[x[3] == 4] keeps only the rows for which column 3 has the value 4.
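Applied to a timestamp index, the same boolean-mask idea might look like the sketch below. It assumes the timestamps live in a DatetimeIndex, as in the question; the column name value and the variable names are made up for illustration.

```python
import datetime

import numpy as np
import pandas as pd

# Two days of 5-minute samples starting at 09:30, as in the question.
idx = pd.date_range("2011-01-07 09:30", periods=6, freq="5min")
idx = idx.append(pd.date_range("2011-01-08 09:30", periods=6, freq="5min"))
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

# df.index.time is an array of datetime.time objects, one per row,
# so comparing it to a single time gives a boolean mask.
keep = df.index.time != datetime.time(9, 30)

# Keep every row whose time of day is not 09:30:00.
result = df[keep]
```

The comparison is vectorized over the index, so no explicit Python loop is needed.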