Python - Round CSV column to nearest 30min
My CSV data is the following:
Columns:
Desired result:
A new column named "CRASH_DATETIME" holding a Python datetime object built from the corresponding date. The year doesn't matter; the main goal is to track crashes by month, day and hour:minutes, with the time rounded to the nearest 30 min.
I tried the following, but it failed:
from datetime import datetime, timedelta

def ceil_dt(month, day, hourWithMinutes, delta):
    hour,minutes = hourWithMinutes.split(':')
    int(month)
    int(day)
    int(hour)
    int(minutes)
    dt = datetime.datetime(month=month, day=day, hour=hour, minute=minutes)
    return dt + (datetime.min - dt) % delta
and
dataInitial['TIME'] = dataInitial.apply(lambda row: ceil_dt(row['CRASH_MONTH'], row['CRASH_DAY'], row['TIMESTR'], '30'))
But it failed (using Jupyter Notebook):
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:14010)()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-40-a9ef29fd7eb7> in <module>()
----> 1 dataInitial['TIME'] = dataInitial.apply(lambda row: ceil_dt(row['CRASH_MONTH'], row['CRASH_DAY'], row['TIMESTR'], '30'))
~/anaconda2/envs/tfdeeplearning/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
4260 f, axis,
4261 reduce=reduce,
-> 4262 ignore_failures=ignore_failures)
4263 else:
4264 return self._apply_broadcast(f, axis)
~/anaconda2/envs/tfdeeplearning/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
4356 try:
4357 for i, v in enumerate(series_gen):
-> 4358 results[i] = func(v)
4359 keys.append(v.name)
4360 except Exception as e:
<ipython-input-40-a9ef29fd7eb7> in <lambda>(row)
----> 1 dataInitial['TIME'] = dataInitial.apply(lambda row: ceil_dt(row['CRASH_MONTH'], row['CRASH_DAY'], row['TIMESTR'], '30'))
~/anaconda2/envs/tfdeeplearning/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
599 key = com._apply_if_callable(key, self)
600 try:
--> 601 result = self.index.get_value(self, key)
602
603 if not is_scalar(result):
~/anaconda2/envs/tfdeeplearning/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
2475 try:
2476 return self._engine.get_value(s, k,
-> 2477 tz=getattr(series.dtype, 'tz', None))
2478 except KeyError as e1:
2479 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4404)()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4087)()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5210)()
KeyError: ('CRASH_MONTH', 'occurred at index CRASH_DATE')
Any ideas?
Your function has a few minor problems: the results of the int() conversions are not stored in variables, the datetime is built without a year, and delta is passed as a string rather than a timedelta. This version of the function works properly:
from datetime import datetime, timedelta

def ceil_dt(month, day, hourWithMinutes, delta):
    hour, minutes = hourWithMinutes.split(':')
    # Store the converted values; a bare int(x) discards its result
    month = int(month)
    day = int(day)
    hour = int(hour)
    minutes = int(minutes)
    # datetime() requires a year; 2019 is an arbitrary placeholder
    dt = datetime(year=2019, month=month, day=day, hour=hour, minute=minutes)
    # Round up to the next multiple of delta minutes
    return dt + (datetime.min - dt) % timedelta(minutes=int(delta))
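The KeyError in the traceback ('occurred at index CRASH_DATE') suggests one more problem on the calling side: DataFrame.apply defaults to axis=0 and hands the lambda one column at a time, so row['CRASH_MONTH'] is looked up inside a column. Passing axis=1 applies the function row by row. Note also that ceil_dt always rounds up; if "nearest" is really what is wanted, pandas' Series.dt.round may be a closer fit. The following is only a sketch: the sample rows are made up, the column names are taken from the question, and the year 2019 is the same arbitrary placeholder as above.

```python
from datetime import datetime, timedelta

import pandas as pd


def ceil_dt(month, day, hour_with_minutes, delta):
    """Round up to the next multiple of `delta` minutes (year fixed at 2019)."""
    hour, minutes = hour_with_minutes.split(':')
    dt = datetime(year=2019, month=int(month), day=int(day),
                  hour=int(hour), minute=int(minutes))
    return dt + (datetime.min - dt) % timedelta(minutes=int(delta))


# Hypothetical sample rows; only the column names come from the question.
dataInitial = pd.DataFrame({
    'CRASH_MONTH': [1, 6, 12],
    'CRASH_DAY': [15, 3, 31],
    'TIMESTR': ['08:10', '14:50', '23:29'],
})

# axis=1 passes each row to the lambda; the default axis=0 passes each column.
dataInitial['CRASH_DATETIME'] = dataInitial.apply(
    lambda row: ceil_dt(row['CRASH_MONTH'], row['CRASH_DAY'], row['TIMESTR'], '30'),
    axis=1,
)

# Alternative without apply: assemble the date from year/month/day columns,
# add the time of day as a timedelta, then round to the nearest 30 minutes.
dates = pd.to_datetime(pd.DataFrame({
    'year': 2019,  # placeholder year, as in the answer
    'month': dataInitial['CRASH_MONTH'],
    'day': dataInitial['CRASH_DAY'],
}))
times = pd.to_timedelta(dataInitial['TIMESTR'] + ':00')  # 'HH:MM' -> 'HH:MM:SS'
dataInitial['CRASH_DATETIME_NEAREST'] = (dates + times).dt.round('30min')

print(dataInitial)
```

With these sample rows, 08:10 becomes 08:30 under ceil_dt but 08:00 under dt.round, which is the practical difference between rounding up and rounding to the nearest half hour.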