
Python: read text file into array

I am trying to use Python to auto-parse a set of text files and turn them into XML files.

A lot of people ask how to loop through a text file and read its lines into an array. The trouble is that this won't quite work for me.

I need to read the first three lines individually, then drop the rest of the text file (the body) into a single array entry.

The text file is formatted as follows.

Headline

Subhead

by A Person

text file body content. Multiple paragraphs

How would I go about setting up an array to do this in Python?

Something like this:

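with open("data1.txt") as f:
    # read the three header lines, stripping the trailing newline from each
    head, sub, auth = [f.readline().strip() for _ in range(3)]
    # everything left in the file is the body
    data = f.read()
    print(head, sub, auth, data)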
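
The question asks for a single array holding the three header lines plus the body. Here is a minimal sketch of the same idea wrapped in a function, assuming Python 3, a UTF-8 encoded file, and no blank lines between the header lines. The name `read_article` is hypothetical and does not come from the original answer.

```python
def read_article(path):
    """Return [headline, subhead, author, body] for a file laid out as above."""
    # encoding="utf-8" is an assumption; adjust it to match the real files.
    with open(path, encoding="utf-8") as f:
        # The first three lines are the header fields; strip the trailing newline.
        header = [f.readline().strip() for _ in range(3)]
        # Everything that is left becomes a single body entry.
        body = f.read().strip()
    return header + [body]
```

Calling `read_article("data1.txt")` would then give a four-item list whose last entry is the whole body as one string.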

If there are blank lines between the header lines, try the following instead. filter() removes the empty lines. Reading six lines assumes exactly one blank line after each header line:

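with open("data1.txt") as f:
    # read six lines (three header lines, each followed by a blank line)
    # and let filter(None, ...) discard the empty strings
    head, sub, auth = filter(None, (f.readline().strip() for _ in range(6)))
    data = f.read()
    print(head, sub, auth, data)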
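
The fixed count of six lines only works when each header line is followed by exactly one blank line. A possible variation that tolerates any number of blank lines is sketched below, assuming Python 3 and a UTF-8 encoded file. The function name and the `header_count` parameter are hypothetical, introduced only for illustration.

```python
def read_article_skipping_blanks(path, header_count=3):
    """Return the first `header_count` non-empty lines plus the body as one list."""
    header = []
    # encoding="utf-8" is an assumption; adjust it to match the real files.
    with open(path, encoding="utf-8") as f:
        while len(header) < header_count:
            line = f.readline()
            if not line:  # an empty string means end of file was reached early
                break
            line = line.strip()
            if line:  # keep only non-empty lines as header fields
                header.append(line)
        # Whatever remains is the body; strip the surrounding blank lines.
        body = f.read().strip()
    return header + [body]
```

If the file ends before three header lines are found, the returned list is simply shorter, so a caller may want to check its length.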

If I understood your question correctly, you wish to put all the text except for the first 3 lines into an array (list). Here's how to do that:

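content_list = []  # the list that will receive the body text

with open("/path/to/your/file.txt") as f:
    # splitlines() drops the trailing newlines, so the join below does not double them
    all_lines = f.read().splitlines()
content_lines = all_lines[3:]
content_text = '\n'.join(content_lines)
content_list.append(content_text)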

Explanation: You first open the file and put all of its lines into a list. Then you take all the lines after the first three and put those into a new list. Then you join this new list with newlines to turn it back into a single block of text. Finally, you append this text to a list that you've created beforehand called content_list.


If you want to put the first three lines into your list as well, do the following before appending to content_list:

for line in all_lines[:3]:
    content_list.append(line)
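
Since the stated goal is to turn each file into XML, the four-entry list could then be handed to the standard library's `xml.etree.ElementTree`. This is only a sketch: the element names (`article`, `headline`, `subhead`, `author`, `body`) and the function name are assumptions, because the question does not say what the target XML should look like.

```python
import xml.etree.ElementTree as ET


def article_to_xml(fields):
    """Build an XML string from [headline, subhead, author, body]."""
    headline, subhead, author, body = fields
    # Element names below are hypothetical; rename them to match the real schema.
    root = ET.Element("article")
    ET.SubElement(root, "headline").text = headline
    ET.SubElement(root, "subhead").text = subhead
    ET.SubElement(root, "author").text = author
    ET.SubElement(root, "body").text = body
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")
```

ElementTree escapes characters such as `&` and `<` in the text, which is the main reason to prefer it over building the XML with string concatenation.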
