Select rows of specific months in pandas dataframe
I have a pandas DataFrame whose date column runs from 2015 to 2021.
print(data)
date time wind_speed wind_direction
0 2015-01-01 00:00 00:00 28.0 25.0
1 2015-01-01 01:00 01:00 23.0 24.0
2 2015-01-01 02:00 02:00 25.0 24.0
3 2015-01-01 03:00 03:00 21.0 24.0
4 2015-01-01 04:00 04:00 23.0 24.0
... ... ... ... ...
61363 2021-12-31 19:00 19:00 NaN NaN
61364 2021-12-31 20:00 20:00 NaN NaN
61365 2021-12-31 21:00 21:00 NaN NaN
61366 2021-12-31 22:00 22:00 NaN NaN
61367 2021-12-31 23:00 23:00 NaN NaN
How can I select the rows where the month in the date column is 5, 6, 7, 8 or 9 (May through September)?
This is what I tried:
data['date'] = pd.to_datetime(data['date'])
data = data[data['date'].dt.month == 5, 6, 7, 8, 9]
print(data)
C:\Users\Chance\anaconda3\lib\site-packages\pandas\core\frame.py:3607: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._set_item(key, value)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-02646c054a5d> in <module>
1 data['date'] = pd.to_datetime(data['date'])
----> 2 data = data[data['date'].dt.month == 5, 6, 7, 8, 9]
3 data
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
3453 if self.columns.nlevels > 1:
3454 return self._getitem_multilevel(key)
-> 3455 indexer = self.columns.get_loc(key)
3456 if is_integer(indexer):
3457 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3359 casted_key = self._maybe_cast_indexer(key)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
3363 raise KeyError(key) from err
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
TypeError: '(0 False
1 False
2 False
3 False
4 False
...
61363 False
61364 False
61365 False
61366 False
61367 False
Name: date, Length: 61368, dtype: bool, 6, 7, 8, 9)' is an invalid key
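The key shown in the TypeError is a tuple: a boolean Series followed by 6, 7, 8, 9. That is because Python reads `x == 5, 6, 7, 8, 9` as `(x == 5), 6, 7, 8, 9`, so only month 5 is ever compared. A minimal sketch of this, using a small made-up frame in place of the real `data` (the three rows are an assumption for illustration):

```python
import pandas as pd

# Hypothetical miniature stand-in for the `data` frame in the question.
data = pd.DataFrame({
    "date": pd.to_datetime(
        ["2015-01-01 00:00", "2015-05-01 03:00", "2021-09-01 21:00"]
    ),
    "wind_speed": [28.0, 21.0, None],
})

# The commas build a tuple: (boolean Series, 6, 7, 8, 9).
key = data["date"].dt.month == 5, 6, 7, 8, 9
print(type(key).__name__, len(key))

# Only the first element is an actual comparison, and it tests month 5 alone.
mask_may_only = key[0]
print(mask_may_only.tolist())
```

Passing that tuple to `data[...]` makes pandas treat it as a single column label, which is why the lookup fails instead of filtering rows.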
Try:
#if column `date` isn't already converted:
#df["date"] = pd.to_datetime(df["date"])
print(df[(df.date.dt.month > 4) & (df.date.dt.month < 10)])
Prints:
date time wind_speed wind_direction
3 2015-05-01 03:00:00 03:00 21.0 24.0
4 2015-06-01 04:00:00 04:00 23.0 24.0
61363 2021-07-01 19:00:00 19:00 NaN NaN
61364 2021-08-01 20:00:00 20:00 NaN NaN
61365 2021-09-01 21:00:00 21:00 NaN NaN
df used:
date time wind_speed wind_direction
0 2015-01-01 00:00 00:00 28.0 25.0
1 2015-01-01 01:00 01:00 23.0 24.0
2 2015-02-01 02:00 02:00 25.0 24.0
3 2015-05-01 03:00 03:00 21.0 24.0
4 2015-06-01 04:00 04:00 23.0 24.0
61363 2021-07-01 19:00 19:00 NaN NaN
61364 2021-08-01 20:00 20:00 NaN NaN
61365 2021-09-01 21:00 21:00 NaN NaN
61366 2021-10-01 22:00 22:00 NaN NaN
61367 2021-11-01 23:00 23:00 NaN NaN
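Because May through September is a contiguous range, the two comparisons above can probably also be written with `Series.between`, which includes both endpoints by default. A self-contained sketch, assuming a small sample frame modelled on the one above (the rows and values are illustrative only):

```python
import pandas as pd

# Hypothetical sample, modelled on the df shown above.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-01-01 00:00", "2015-02-01 02:00", "2015-05-01 03:00",
        "2015-06-01 04:00", "2021-09-01 21:00", "2021-10-01 22:00",
    ]),
    "wind_speed": [28.0, 25.0, 21.0, 23.0, None, None],
})

months = df["date"].dt.month

# Two explicit comparisons; each needs its own parentheses because
# & binds more tightly than > and <.
by_comparison = df[(months > 4) & (months < 10)]

# Same selection as a single inclusive range check (5 <= month <= 9).
by_between = df[months.between(5, 9)]

print(by_between)
```

Both forms select the same rows here; `between` only helps when the months are consecutive.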
Try:
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M")
>>> df[df["date"].dt.month.isin([5,6,7,8,9])]
date time wind_speed wind_direction
2 2015-05-01 02:00:00 02:00 25 24.0
3 2015-07-01 03:00:00 03:00 21 24.0
4 2015-09-01 04:00:00 04:00 23 24.0
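For reference, here is the `isin` approach as a self-contained script. The sample rows are made up for illustration, and the `.copy()` at the end is an optional addition (not part of the answer above) that may help avoid the SettingWithCopyWarning seen in the question if the filtered frame is modified later.

```python
import pandas as pd

# Hypothetical sample data; the real frame has 61368 hourly rows.
df = pd.DataFrame({
    "date": ["2015-01-01 00:00", "2015-05-01 02:00", "2015-07-01 03:00",
             "2015-09-01 04:00", "2021-11-01 23:00"],
    "wind_speed": [28.0, 25.0, 21.0, 23.0, None],
})

# Parse the strings into datetimes so the .dt accessor is available.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M")

# isin accepts any list of months, so they need not be consecutive.
wanted_months = [5, 6, 7, 8, 9]
selected = df[df["date"].dt.month.isin(wanted_months)].copy()

print(selected)
```

Unlike a range comparison, the same pattern works for non-adjacent months, for example `[1, 2, 12]` for a winter selection.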