How to delete unneeded rows while using read_csv
I have a CSV file, in.txt, that reads like this:
FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
Aadi,16,New York,
Suse,32,Lucknow,
Mark,33,Las vegas,
Suri,35,Patna,
Is there any way to remove the first three rows of the CSV file (FullName, DOB, and IDNUM) using read_csv()? I tried header=3 and skiprows=3, but the resulting table is not what I am looking for: it appears to be shifted one column to the left. Any help would be appreciated.
As suggested by @xyzjayne, skiprows is the way to go.
import pandas as pd
import io
myfile = """FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
Aadi,16,New York,
Suse,32,Lucknow,
Mark,33,Las vegas,
Suri,35,Patna,"""
fh = io.StringIO(myfile)
Let's read it using pandas' read_csv() function:
df = pd.read_csv(fh,
                 skiprows=3,         # skip the FullName, DOB and IDNUM lines
                 usecols=range(3))   # keep only Name, Age and City
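NOTE: since your lines end with a comma, each data row has one more field than the header. With skiprows alone, pandas then uses the first field as the index, which is why your table looks shifted one column to the left, with NaN values in the last column. Passing usecols as well keeps only the first three columns and avoids this.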
>>> print(df)
   Name  Age       City
0  jack   34     Sydeny
1  Riti   31      Delhi
2  Aadi   16   New York
3  Suse   32    Lucknow
4  Mark   33  Las vegas
5  Suri   35      Patna
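If you would rather not hard-code the number of columns, index_col=False may be an alternative to usecols. The sketch below uses a shortened copy of the sample (the variable names raw, shifted and fixed are just for illustration). It first reproduces the shifted table you get from skiprows alone, then reads the same text with index_col=False, which tells pandas not to use the first column as the index when lines end with a delimiter.

```python
import io

import pandas as pd

# Shortened copy of the sample: three metadata lines, the header,
# then data rows that each end with a trailing comma.
raw = """FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
"""

# skiprows alone: each data row has four fields but the header has three,
# so pandas takes the first field as the index and the values shift left.
shifted = pd.read_csv(io.StringIO(raw), skiprows=3)
print(shifted)

# index_col=False: do not use the first column as the index;
# the empty field after the trailing comma is discarded.
fixed = pd.read_csv(io.StringIO(raw), skiprows=3, index_col=False)
print(fixed)
```

Depending on your pandas version, the second call may emit a warning about the header length not matching the data, so usecols=range(3) remains the more explicit choice when the column count is known.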