How to remove unwanted rows when using read_csv

Question

I have a CSV file, in.txt, that reads like this:

FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
Aadi,16,New York,
Suse,32,Lucknow,
Mark,33,Las vegas,
Suri,35,Patna,

Is there any way to remove the first three rows of the CSV file (FullName, DOB, and IDNUM) using read_csv()? I tried header=3 and skiprows=3, but the resulting table is not what I am looking for: it appears to be shifted one column to the left. Any help would be appreciated.

Answer 1

As suggested by @xyzjayne, skiprows is the way to go.

import pandas as pd
import io

myfile = """FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
Aadi,16,New York,
Suse,32,Lucknow,
Mark,33,Las vegas,
Suri,35,Patna,"""

fh = io.StringIO(myfile)

Let's read it using pandas' read_csv() method:

df = pd.read_csv(fh,
                 skiprows=3,        # skip the FullName, DOB and IDNUM lines
                 usecols=range(3)   # keep only Name, Age and City
                )

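NOTE: your data lines end with a comma, so each one has one more field than the header row. Left alone, pandas misaligns the values against the column names (the first field is used as the index and the last column is filled with NaN), which is the shift you saw. You therefore also need usecols to keep only the first three columns.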

>>> print(df)

   Name  Age       City
0  jack   34     Sydeny
1  Riti   31      Delhi
2  Aadi   16   New York
3  Suse   32    Lucknow
4  Mark   33  Las vegas
5  Suri   35      Patna
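
For comparison, here is a minimal sketch that reproduces the shift from the question and shows one possible alternative to usecols. It assumes the same file layout as above, shortened to two data rows. The index_col=False option is a documented read_csv parameter but is not part of the original answer, and the names data, shifted and fixed are only illustrative.

```python
import io
import pandas as pd

# Shortened copy of the file from the question (illustrative data only).
data = """FullName: Ryan, Jack
DOB:12345
IDNUM: 1234455
Name,Age,City
jack,34,Sydeny,
Riti,31,Delhi,
"""

# Without usecols: each data line has 4 fields against 3 header names,
# so pandas uses the first field as the index and the values look shifted.
shifted = pd.read_csv(io.StringIO(data), skiprows=3)
print(shifted)

# index_col=False tells pandas not to use the first column as the index;
# the empty field created by the trailing comma is dropped instead.
fixed = pd.read_csv(io.StringIO(data), skiprows=3, index_col=False)
print(fixed)
```

Depending on the pandas version, the second call may emit a ParserWarning about the header length not matching the data. usecols=range(3) avoids that, which is one reason to prefer it.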
