TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
I am trying to normalize energy use data using weather data with pandas. I need my code to read a CSV of weather data, calculate some numbers from that data, and sum those numbers by month of the year. Here is my code so far:
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
data = pd.read_csv("C:\\Users\\mparlo\\Documents\\Python\\NEWYORK - NEWYORK.csv", header=None)
data.columns = ["Month", "Day", "Year", "Temperature"]
ndays = len(data)
data["hdd"] = ""
data["cdd"] = ""
t_bp = 65
for i in range(0,ndays):
    if data.at[i,"Temperature"] > t_bp:
        data.at[i,"hdd"] = 0
        data.at[i,"cdd"] = data.at[i,"Temperature"]-t_bp
    elif data.at[i,"Temperature"] < t_bp:
        data.at[i,"hdd"] = t_bp - data.at[i,"Temperature"]
        data.at[i,"cdd"] = 0
data
hddjan = data.loc[data["Month"] == 1, "hdd"].sum()
cddjan = data.loc[data["Month"] == 1, "cdd"].sum()
hddfeb = data.loc[data["Month"] == 2, "hdd"].sum()
cddfeb = data.loc[data["Month"] == 2, "cdd"].sum()
hddmar = data.loc[data["Month"] == 3, "hdd"].sum()
cddmar = data.loc[data["Month"] == 3, "cdd"].sum()
hddapr = data.loc[data["Month"] == 4, "hdd"].sum()
The data is formatted such that months are numbered 1-12.
The code works up until the last line here, where I try to sum anything past Month 3 (March). I get this error:
> ---------------------------------------------------------------------------
>TypeError Traceback (most recent call last)
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
> 118 else:
>--> 119 result = alt(values, axis=axis, skipna=skipna, **kwds)
> 120 except Exception:
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in nansum(values, axis, skipna)
> 292 dtype_sum = np.float64
>--> 293 the_sum = values.sum(axis, dtype=dtype_sum)
> 294 the_sum = _maybe_null_out(the_sum, axis, mask)
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
> 31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
>---> 32 return umr_sum(a, axis, dtype, out, keepdims)
> 33
>
>TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
>
>During handling of the above exception, another exception occurred:
>
>TypeError Traceback (most recent call last)
><ipython-input-5-beeced82f47d> in <module>()
> 8 cddmar = data.loc[data["Month"] == 3, "cdd"].sum()
> 9
>---> 10 hddapr = data.loc[data["Month"] == 4, "hdd"].sum()
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
> 6340 skipna=skipna)
> 6341 return self._reduce(f, name, axis=axis, skipna=skipna,
>-> 6342 numeric_only=numeric_only)
> 6343
> 6344 return set_function_name(stat_func, name, cls)
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
> 2379 'numeric_only.'.format(name))
> 2380 with np.errstate(all='ignore'):
>-> 2381 return op(delegate, skipna=skipna, **kwds)
> 2382
> 2383 return delegate._reduce(op=op, name=name, axis=axis, skipna=skipna,
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in _f(*args, **kwargs)
> 60 try:
> 61 with np.errstate(invalid='ignore'):
>---> 62 return f(*args, **kwargs)
> 63 except ValueError as e:
> 64 # we want to transform an object array
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
> 120 except Exception:
> 121 try:
>--> 122 result = alt(values, axis=axis, skipna=skipna, **kwds)
> 123 except ValueError as e:
> 124 # we want to transform an object array
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in nansum(values, axis, skipna)
> 291 elif is_timedelta64_dtype(dtype):
> 292 dtype_sum = np.float64
>--> 293 the_sum = values.sum(axis, dtype=dtype_sum)
> 294 the_sum = _maybe_null_out(the_sum, axis, mask)
> 295
>
>~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
> 30
> 31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
>---> 32 return umr_sum(a, axis, dtype, out, keepdims)
> 33
> 34 def _prod(a, axis=None, dtype=None, out=None, keepdims=False):
>
>TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
If anyone has any idea why the sum works for only the first three months, any help is appreciated.
[EDIT: here is a link to the data: https://docs.google.com/spreadsheets/d/1nUD1wS_ZEWCyjLFdeL14_HaYq6NBjP1VtHN9gvgPD14/edit?usp=sharing ]
While your issue is unclear without data, consider switching to pandas data manipulation methods such as conditional .loc, groupby, or pivot_table instead of looping through all rows, assigning values by index, and manually creating month sums. With the adjusted approach, you will also be able to see the extent of your data and whether it cuts off after March.
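One possibility worth ruling out, offered as an assumption rather than a confirmed diagnosis: the question's loop starts `hdd` and `cdd` as empty strings and has no branch for `Temperature == t_bp`, so a day at exactly 65 would keep `""`. The sketch below uses hypothetical sample values (not the real CSV) and a standalone `hdd` Series to replay that logic and list the rows left unassigned.

```python
import pandas as pd

# Hypothetical sample values standing in for the CSV; row 2 sits exactly at t_bp.
data = pd.DataFrame({
    "Month": [1, 2, 4, 4],
    "Temperature": [30.0, 40.0, 65.0, 70.0],
})
t_bp = 65

# Mirrors data["hdd"] = "" from the question, kept as an object-dtype Series.
hdd = pd.Series("", index=data.index, dtype=object)

# Same if/elif logic as the question: no branch handles Temperature == t_bp.
for i in data.index:
    temp = data.at[i, "Temperature"]
    if temp > t_bp:
        hdd.at[i] = 0
    elif temp < t_bp:
        hdd.at[i] = t_bp - temp

# Rows that still hold a string were never assigned a number.
unassigned = hdd[hdd.map(lambda v: isinstance(v, str))]
print(data.loc[unassigned.index, ["Month", "Temperature"]])
```

If the same check flags rows in the real data, a final `else` branch in the loop (or a `<=` comparison) would leave the column fully numeric.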
...
import calendar
# Name the columns at read time; set_axis(..., inplace=False) fails on pandas 2.0+.
data = pd.read_csv(r'C:\Users\mparlo\Documents\Python\NEWYORK - NEWYORK.csv', header=None,
                   names=["Month", "Day", "Year", "Temperature"])
t_bp = 65
data.loc[data['Temperature'] > t_bp, 'hdd'] = 0
data.loc[data['Temperature'] <= t_bp, 'hdd'] = t_bp - data['Temperature']
data.loc[data['Temperature'] > t_bp, 'cdd'] = data['Temperature'] - t_bp
data.loc[data['Temperature'] <= t_bp, 'cdd'] = 0
data['Location'] = 'New York'
data['MonthAbbrev'] = data['Month'].apply(lambda x: calendar.month_abbr[x])
# LONG FORMAT
agg_data = data.groupby(['MonthAbbrev']).agg({'hdd':'sum', 'cdd':'sum'})
# WIDE FORMAT
agg_data = data.pivot_table(index=['Location'], values=['hdd', 'cdd'],
                            columns='MonthAbbrev', aggfunc='sum')
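For anyone without the CSV at hand, here is a self-contained sketch of the same idea. The inline sample values are hypothetical, and it swaps the four `.loc` assignments for `Series.clip`, which is a variation on the approach above rather than the same code. A day exactly at `t_bp` ends up with 0 for both `hdd` and `cdd`.

```python
import calendar

import pandas as pd

# Hypothetical sample rows in the same Month/Day/Year/Temperature layout.
data = pd.DataFrame({
    "Month": [1, 1, 4, 4, 7],
    "Day": [1, 2, 1, 2, 1],
    "Year": [2018, 2018, 2018, 2018, 2018],
    "Temperature": [30.0, 40.0, 65.0, 70.0, 80.0],
})
t_bp = 65

# Degree days are the distance from the balance point, floored at zero.
data["hdd"] = (t_bp - data["Temperature"]).clip(lower=0)
data["cdd"] = (data["Temperature"] - t_bp).clip(lower=0)

# Sum per numeric month first so rows stay in calendar order, then relabel.
monthly = data.groupby("Month")[["hdd", "cdd"]].sum()
monthly.index = monthly.index.map(lambda m: calendar.month_abbr[m])

print(monthly)
```

Grouping on the numeric `Month` before relabelling keeps the months in calendar order; grouping directly on the abbreviation would sort them alphabetically.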