Parsing a pipe-delimited file in Python
I'm trying to parse a pipe-delimited file and load the values into a list, so that I can later print selected values from the list.
The file looks like:
name|age|address|phone|||||||||||..etc
It has more than 100 columns.
Use the csv library.
First, register your dialect:
import csv
csv.register_dialect('piper', delimiter='|', quoting=csv.QUOTE_NONE)
Then, use your dialect on the file:
with open(myfile, newline="") as csvfile:  # Python 3: text mode, newline="" as the csv docs recommend
    for row in csv.DictReader(csvfile, dialect='piper'):
        print(row['name'])
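As a rough, self-contained sketch of the same approach, the snippet below reads from an in-memory string instead of a real file. The sample rows and the `wanted` list are made up for illustration; only the `piper` dialect comes from the answer above.

```python
import csv
import io

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "name|age|address|phone\n"
    "Alice|30|1 Main St|555-0100\n"
    "Bob|25|2 Oak Ave|555-0101\n"
)

csv.register_dialect('piper', delimiter='|', quoting=csv.QUOTE_NONE)

# Columns to keep; with 100+ columns you would list only the ones you need.
wanted = ['name', 'phone']

# Build one list of selected values per data row.
selected = [[row[col] for col in wanted]
            for row in csv.DictReader(sample, dialect='piper')]

for values in selected:
    print(values)
```

Because `DictReader` keys each row by the header names, the selection does not depend on where a column sits among the others.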
If you're parsing a very simple file that won't contain any | characters in the actual field values, you can use split:
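with open('file', 'r') as fileHandle:  # closes the file automatically
    for line in fileHandle:
        # strip the trailing newline so it does not end up in the last field
        fields = line.rstrip('\n').split('|')
        print(fields[0])  # prints the first field's value
        print(fields[1])  # prints the second field's value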
EDIT: A more robust way to parse tabular data is to use the csv library, as mentioned in another answer.
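To illustrate why the csv library can be more robust, here is a small sketch comparing the two approaches on a single line. The sample line is hypothetical and assumes the file wraps fields containing | in double quotes; it uses `csv.reader` with its default quoting rather than a `QUOTE_NONE` dialect.

```python
import csv
import io

# Hypothetical line whose second field contains a literal '|' inside quotes.
line = 'Alice|"12 Main St|Apt 4"|555-0100\n'

# A plain split cuts the quoted field into two pieces.
naive = line.rstrip('\n').split('|')

# csv.reader honors the double quotes (its default quotechar).
parsed = next(csv.reader(io.StringIO(line), delimiter='|'))

print(len(naive), naive)
print(len(parsed), parsed)
```

If the file never quotes its fields, both approaches give the same result and split is sufficient.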
import pandas as pd
df = pd.read_csv(filename, sep="|")
This stores the file in a DataFrame. For each column, you can apply conditions to select the values you want to print. It runs quickly: I tried it with 111047 rows.
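A minimal sketch of what applying a condition might look like, assuming pandas is installed. The sample data, the `age > 26` filter, and the chosen columns are invented for illustration and are not from the original answer.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "name|age|address|phone\n"
    "Alice|30|1 Main St|555-0100\n"
    "Bob|25|2 Oak Ave|555-0101\n"
)

df = pd.read_csv(sample, sep="|")

# Keep rows where age is above 26, and only the name and phone columns.
older = df.loc[df["age"] > 26, ["name", "phone"]]
print(older)

# Convert one column to a plain Python list if a list is what you need.
names = older["name"].tolist()
print(names)
```

For a file with many columns, passing `usecols` to `read_csv` is one way to load only the columns of interest.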
In 2022, with Python 3.8 or above, you can simply do:
import csv
with open(file_path, "r", newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter='|')
    for row in reader:
        print(row[0], row[1])
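With more than 100 columns, counting positions by hand gets error-prone. One possible variation, sketched below with made-up sample data, reads the header row first and builds a name-to-position lookup; the `index` dictionary is a helper introduced here, not part of the csv module.

```python
import csv
import io

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "name|age|address|phone\n"
    "Alice|30|1 Main St|555-0100\n"
    "Bob|25|2 Oak Ave|555-0101\n"
)

reader = csv.reader(sample, delimiter='|')

# The first row is the header; map each column name to its position.
header = next(reader)
index = {name: i for i, name in enumerate(header)}

# Each remaining row is a plain list of strings.
rows = list(reader)

for row in rows:
    print(row[index['name']], row[index['phone']])
```

This keeps every row as an ordinary list, as the question asks, while still letting you pick values by column name.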