How do I square a column from an Excel file with pandas?

I've read an Excel file into Python using:

import pandas as pd
import numpy as np

water_consumption = pd.read_csv('Self_Data.csv')

and I'm trying to square the columns using:

exponent = 2
water_consumption['x2'] = np.power(water_consumption['Consumption_(HCF)'], exponent)
water_consumption['y2'] = np.power(water_consumption['Water&Sewer_Charges'], exponent)

I keep getting the error:

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

I'm fairly new to Python. Is there any way to easily fix this?

You can use a lambda function like this, but it only works if the column's data type is not object. To check, look at water_consumption['x2'].dtype (type(water_consumption['x2']) only tells you that it is a Series), then try:

water_consumption['x2']=water_consumption['x2'].apply(lambda x:x**2)

>>> import pandas as pd
>>> water_consumption={"x2":[1,2,3,4],"y2":[5,6,7,8]}
>>> water_consumption=pd.DataFrame(water_consumption)
>>> water_consumption
   x2  y2
0   1   5
1   2   6
2   3   7
3   4   8
>>> water_consumption['x2']=water_consumption['x2'].apply(lambda x:x**2)
>>> water_consumption
   x2  y2
0   1   5
1   4   6
2   9   7
3  16   8

Never use apply with a lambda for straightforward mathematical operations; it is orders of magnitude slower than operating on the column directly. The problem that click004 is having is that the columns are stored as str.
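As a minimal sketch of what "direct operations" means here, the snippet below squares a column with the ** operator and compares it with the apply version. The DataFrame and its column names are made up for illustration and assume the data is already numeric.

```python
import pandas as pd

# Hypothetical numeric data standing in for the real file.
df = pd.DataFrame({"x": [1, 2, 3, 4]})

# Direct (vectorized) operation: the whole column is squared at once.
df["x_squared"] = df["x"] ** 2

# Same result via apply, which calls the Python lambda once per element.
df["x_squared_apply"] = df["x"].apply(lambda v: v ** 2)

print(df)
```

Both columns hold the same values; the direct form avoids a Python-level function call per row, which is where the speed difference comes from.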

They should first be converted to a numeric dtype. One thing to try is .convert_dtypes() (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html), although it only picks better dtypes for the values as stored and will not parse numbers held as text; pd.to_numeric handles that case.

Pandas is quite good at inferring the type of a column. The fact that this one was detected as str means that some of the values are probably not plain numbers, but carry units or something else.
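A rough sketch of how that cleanup could look is below. The contents of Self_Data.csv are not shown in the question, so the sample values (an "n/a" entry, dollar signs, thousands separators) are assumptions chosen only to illustrate the idea; the column names are the ones from the question. It uses pd.to_numeric with errors="coerce", which turns anything unparseable into NaN.

```python
import pandas as pd

# Hypothetical sample data; the real file may contain different stray text.
water_consumption = pd.DataFrame({
    "Consumption_(HCF)": ["10", "12.5", "n/a", "8"],
    "Water&Sewer_Charges": ["$45.10", "$1,052.00", "$30", "$27.80"],
})

# Inspect the dtypes first: text columns are what trigger the TypeError.
print(water_consumption.dtypes)

# Parse the text as numbers; values that cannot be parsed become NaN.
consumption = pd.to_numeric(
    water_consumption["Consumption_(HCF)"], errors="coerce"
)

# Strip the assumed currency symbol and thousands separator before parsing.
charges = pd.to_numeric(
    water_consumption["Water&Sewer_Charges"].str.replace(r"[$,]", "", regex=True),
    errors="coerce",
)

# With numeric columns, squaring works directly.
water_consumption["x2"] = consumption ** 2
water_consumption["y2"] = charges ** 2

print(water_consumption)
```

Checking which rows came back as NaN (for example with consumption.isna()) is a quick way to find the values that were not plain numbers in the original file.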

