
Problem opening a CSV file with csv.reader

I have the following code:

import pandas as pd
import numpy as np
import csv
filename = (r"C:\Users\Z\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Anaconda3 (64- bit)\diabetes.csv")
raw_data = open(filename, 'rb')
reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)
data = (np.array(x)).astype('float')
print(data.shape)

But it raises an error:

----> 7 x = list(reader)
Error: iterator should return strings, not bytes (did you open the file in text mode?)

When I change 'rb' to 'rt':

raw_data = open(filename, 'rt')

It says:

----> 8 data = (np.array(x)).astype('float')
ValueError: could not convert string to float: 'Pregnancies'

And when I delete .astype('float'), the result is (769, 9), but the expected result is (768, 9).

It counts the header as data. Can you tell me why?

The error comes from the mode used to open the file. You opened it in 'rb' (binary) mode, which yields bytes, but in Python 3 csv.reader expects an iterator that returns strings. A CSV file is text, so it must be opened in text mode.

Use:

raw_data = open(filename, 'r')

or

raw_data = open(filename)

Both are equivalent because open() defaults to mode 'r', which reads the file as text.
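To make the difference concrete, here is a minimal sketch that reads the same file in both modes. The temporary file and its two sample rows are made up for illustration and only stand in for diabetes.csv; the newline="" argument is not in the original code but is what the csv module documentation recommends when opening files for csv.reader.

```python
import csv
import os
import tempfile

# Write a tiny stand-in CSV (hypothetical data, not the real diabetes.csv)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("Pregnancies,Glucose\n6,148\n1,85\n")
    path = tmp.name

# Binary mode yields bytes, which csv.reader rejects in Python 3
binary_error = None
try:
    with open(path, "rb") as raw_data:
        list(csv.reader(raw_data))
except csv.Error as exc:
    binary_error = str(exc)

# Text mode yields str, which is what csv.reader expects
with open(path, "r", newline="") as raw_data:
    rows = list(csv.reader(raw_data))

os.remove(path)
print(binary_error)
print(rows)
```

Note that the header row is still the first element of rows here; opening in text mode fixes the bytes error but does not skip the header.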

As for the header being read as data, skip it while collecting the rows:

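# Pass any other csv.reader parameters you need after the delimiter
reader = csv.reader(raw_data, delimiter=',')
data = []
for row in reader:
    # Skip the header row, recognised by its first cell
    # ('Pregnancies' is the first header cell in this file)
    if row[0] == "Pregnancies":
        continue
    data.append(row)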

Now you can use the list to create a NumPy array.
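As a rough sketch of that last step, the snippet below turns header-free rows into a float array. The three sample rows are invented for illustration and only stand in for what csv.reader would return for diabetes.csv.

```python
import numpy as np

# Hypothetical rows as csv.reader would return them (every cell is a string)
rows = [
    ["Pregnancies", "Glucose", "Outcome"],
    ["6", "148", "1"],
    ["1", "85", "0"],
]

# Drop the header by comparing its first cell
data_rows = [row for row in rows if row[0] != "Pregnancies"]

# Every remaining cell is numeric text, so the float conversion succeeds
data = np.array(data_rows).astype(float)
print(data.shape)
```

With the header removed, the row count matches the number of data records, which is why the real file should give (768, 9) rather than (769, 9).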

Instead of:

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)

try:

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
next(reader)
x = list(reader)

which skips the header row of the CSV file.

This is documented at https://docs.python.org/3/library/csv.html#csv.csvreader.__next__
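For reference, a self-contained sketch of this approach follows. It reads from an in-memory io.StringIO with made-up sample values instead of the real file, and it passes None as a default to next(), which is an addition not in the snippet above, so that an empty file does not raise StopIteration.

```python
import csv
import io

# In-memory stand-in for the opened file (sample values are made up)
raw_data = io.StringIO("Pregnancies,Glucose,Outcome\n6,148,1\n1,85,0\n")

reader = csv.reader(raw_data, delimiter=",", quoting=csv.QUOTE_NONE)

# next() consumes the first row; the None default covers an empty file
header = next(reader, None)

# Only data rows remain, so each cell can be converted to float
x = [[float(cell) for cell in row] for row in reader]

print(header)
print(len(x), len(x[0]))
```

Keeping the header in its own variable, rather than discarding it, also leaves the column names available for later use.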
