
Error while reading csv file: converting a column from string to float

I am trying to read a csv file that contains a column, SpType, which holds string values. My variable is being converted into an object array, but I need it to be float type. Here's the snippet:

data = pd.read_csv("/content/Star3642_balanced.csv")

X_orig = data[["Vmag", "Plx", "e_Plx", "B-V", "SpType", "Amag"]].to_numpy()

Here's what's giving me the error:

X = torch.tensor(X_orig, dtype=torch.float32)

The error reads "can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool."

I tried doing this after reading the csv file, but it didn't help:

data["SpType"] = data.SpType.astype(float)

Can someone please tell me what can be done about this?

Strings should be encoded into numeric values. The easiest way would be using Pandas one-hot encoding (that will create lots of extra columns in this case, but a neural network should process those without much effort):

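# dtype=float keeps the dummy columns numeric; pandas 2.0 and later return bool columns by default
ohe = pd.get_dummies(data["SpType"], drop_first=True, dtype=float)
data[ohe.columns] = ohe
data = data.drop(["SpType"], axis=1)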
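
As a rough end-to-end sketch of the one-hot approach, the example below encodes SpType and then builds a float32 array of the kind torch.tensor accepts. The small DataFrame is a hypothetical stand-in for the csv file (the values are made up), and passing dtype=float to get_dummies is an assumption on my part to keep every column numeric:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Star3642_balanced.csv; the values are made up.
data = pd.DataFrame({
    "Vmag": [5.99, 8.70, 6.55, 7.12],
    "Plx": [13.73, 2.31, 9.80, 4.05],
    "SpType": ["K5III", "B9V", "K5III", "G2V"],
})

# One indicator column per category; drop_first removes the first
# category (in sorted order) to avoid a redundant column.
ohe = pd.get_dummies(data["SpType"], drop_first=True, dtype=float)

# Swap the string column for its numeric indicator columns.
encoded = pd.concat([data.drop(columns=["SpType"]), ohe], axis=1)

# Every column is numeric now, so the array no longer has dtype object.
X_orig = encoded.to_numpy(dtype=np.float32)
print(X_orig.dtype, X_orig.shape)
```

With the string column gone, X_orig should be usable in the torch.tensor(X_orig, dtype=torch.float32) call from the question.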

Alternatively, you may use sklearn encoders or the category_encoders library. More complex encodings may require processing the test set separately to avoid target leakage.
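
A minimal sketch of the sklearn route, assuming a hypothetical train/test split with made-up SpType values: the encoder is fitted on the training rows only and then reused on the test rows. OneHotEncoder itself does not look at the target, but the same fit-on-train, transform-on-test pattern is what keeps target-based encoders from leaking information.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical split; "M0V" appears only in the test rows.
train = pd.DataFrame({"SpType": ["K5III", "B9V", "G2V"]})
test = pd.DataFrame({"SpType": ["G2V", "M0V"]})

# handle_unknown="ignore" encodes unseen categories as all zeros
# instead of raising an error at transform time.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(train[["SpType"]])  # learn the categories from training data only

# transform returns a sparse matrix by default; densify and cast to float32.
X_train_cat = enc.transform(train[["SpType"]]).toarray().astype(np.float32)
X_test_cat = enc.transform(test[["SpType"]]).toarray().astype(np.float32)

print(enc.categories_[0])
print(X_test_cat)
```

The resulting arrays can be stacked next to the numeric columns (for example with np.hstack) before building the tensor.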

