Pandas pd.read_csv KeyError: not in index error
I'm working with a CSV file with over a million entries. I'm trying to read the data for each candidate as a separate column. I was able to parse the data for the first candidate, but when I get to the next candidate I get the error ['cand_nm'] not in index.
Here is the link to the file: Link to data file
%matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
reader_bachmann = pd.read_csv('P00000001-ALL.csv', squeeze=True, low_memory=False, nrows=411 )
print(reader_bachmann)
cand_bachmann = reader_bachmann[["cand_nm"]]
print(cand_bachmann)
reader_romney = pd.read_csv('P00000001-ALL.csv', skiprows=range(0,411) , na_filter =True, low_memory=False)
print(reader_romney)
cand_romney = reader_romney[["cand_nm"]]
print(cand_romney)
KeyError: "['cand_nm'] not in index"
When you use skiprows like that you lose the header, so the header for reader_romney is now row number 412 of the file. If this is the way you want to read the file, you will need to store the header line as a list of strings and then pass that list as the names= kwarg. For example:
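# Read the header plus the first 411 data rows
r_bachman = pd.read_csv('P00000001-ALL.csv', low_memory=False, nrows=411)
# Keep the column names so they can be reused for the second read
cols = r_bachman.columns
# The header line is skipped here, so supply the column names explicitly
r_romney = pd.read_csv('P00000001-ALL.csv', skiprows=range(0, 411),
                       na_filter=True, low_memory=False, names=cols)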
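To see the header loss in isolation, here is a minimal sketch on a tiny in-memory CSV. The data and the amount column are made up for illustration; only cand_nm comes from the question. It also shows one possible alternative that the answer above does not mention: starting the skiprows range at 1 so that line 0, the header, is never skipped.

```python
import io
import pandas as pd

# Hypothetical miniature of the file: a header line followed by four data rows.
csv_text = (
    "cand_nm,amount\n"
    "Bachmann,250\n"
    "Bachmann,50\n"
    "Romney,100\n"
    "Romney,75\n"
)

# skiprows=range(0, 3) drops file lines 0-2, including the header on line 0,
# so pandas promotes the first remaining line to column names.
lost = pd.read_csv(io.StringIO(csv_text), skiprows=range(0, 3))
print(list(lost.columns))  # 'cand_nm' is no longer among the columns

# Fix 1 (the approach from the answer): read the header separately,
# then pass it back through names=.
header = list(pd.read_csv(io.StringIO(csv_text), nrows=0).columns)
with_names = pd.read_csv(io.StringIO(csv_text), skiprows=range(0, 3), names=header)

# Fix 2 (an alternative): start the range at 1 so the header line is kept.
keep_header = pd.read_csv(io.StringIO(csv_text), skiprows=range(1, 3))

print(with_names[["cand_nm"]])
print(keep_header[["cand_nm"]])
```

Both fixes should yield the same two Romney rows with a cand_nm column, so selecting [["cand_nm"]] no longer raises the KeyError.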