
Dividing a .txt file into multiple parts in Python

I'm a beginner in Python, and I have a question about file reading: I need to process the information in one file and write it to another. I know how to do that, but it's really resource-intensive for my computer because the file is very large. I do know how it's formatted, though! The file follows this format:

4 13
9 3 4 7
3 3 3 3
3 5 2 1

I won't explain what it is for, as it would take ages and would not be very useful, but the file is essentially made of groups of four lines like these, repeated again and again. For now, I use this to read the file and convert it into one very long chain (a list of integers):

inputfile = open("input.txt", "r")
output = open("output.txt", "w")
Chain = inputfile.read()
Chain = Chain.split("\n")
Chained = ' '.join(Chain)
Chain = Chained.split(" ")
Chain = list(map(int, Chain))

Afterwards, I just process it using "task IDs", but I feel that this is really not efficient. Do you know how I could divide the chain into multiple smaller ones, given that I know how the file is formatted? Thanks for reading!

How about:

res = []
with open('file', 'r') as f:
  for line in f:
    for num in line.split(' '):
      res.append(int(num))

Instead of reading the whole file into memory, you go line by line. Does this help?
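One caveat: the loop above still collects every number in res, so the full list ends up in memory anyway. If the numbers can be consumed as they arrive, a generator avoids that. This is only a sketch; iter_ints is a name I made up, and the file name in the usage comment is a placeholder.

```python
def iter_ints(lines):
    """Yield every integer from an iterable of text lines, one at a time."""
    for line in lines:
        # split() with no argument ignores the trailing newline and blank lines
        for token in line.split():
            yield int(token)


# Possible usage with a real file (file name is a placeholder):
# with open("input.txt") as f:
#     total = sum(iter_ints(f))
```

Because a file object is itself an iterable of lines, you can pass it straight in, and only one line is held in memory at a time.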

If you need to go 4 lines at a time, just add an inner loop.
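A minimal sketch of the "4 lines at a time" idea, assuming the file has no blank lines between groups. iter_records is a hypothetical helper name, and itertools.islice does the job of the inner loop here.

```python
from itertools import islice


def iter_records(lines, size=4):
    """Yield groups of `size` parsed lines; each parsed line is a list of ints."""
    lines = iter(lines)
    while True:
        # Take the next `size` lines without reading the rest of the input
        block = list(islice(lines, size))
        if not block:
            break
        yield [[int(tok) for tok in line.split()] for line in block]


# Possible usage with a real file (file name is a placeholder):
# with open("input.txt") as f:
#     for record in iter_records(f):
#         ...  # record is a list of four lists of ints
```

If the number of lines is not a multiple of 4, the last group is simply shorter, so check its length if that matters for your data.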

Regarding output, I'm assuming you want to do some computation on the input, so I wouldn't necessarily do this in the same loop. Either process the input once reading is done or, instead of using a list, use a queue and have another thread read from the queue while this thread writes to it.
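Here is a rough sketch of the queue variant using the standard queue and threading modules. The names reader, consumer and run are mine, and summing each line is just a stand-in for whatever computation you actually need. Keep in mind that threads in CPython generally don't speed up CPU-bound work, so this mostly helps keep memory bounded and the reading and processing code separate.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input


def reader(lines, q):
    """Producer: parse each line and put the list of ints on the queue."""
    for line in lines:
        q.put([int(tok) for tok in line.split()])
    q.put(_DONE)


def consumer(q, results):
    """Consumer: take parsed lines off the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is _DONE:
            break
        results.append(sum(item))  # placeholder computation


def run(lines, maxsize=100):
    # A bounded queue stops the reader from running far ahead of the consumer
    q = queue.Queue(maxsize=maxsize)
    results = []
    worker = threading.Thread(target=consumer, args=(q, results))
    worker.start()
    reader(lines, q)
    worker.join()
    return results
```

For a real file you would call run(f) inside a with open(...) block.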

A list comprehension might help a bit as well (though I doubt it will make much of an impact):

res = []
with open('file', 'r') as f:
  for line in f:
    # extend() adds the numbers themselves; append() would add one nested object per line
    res.extend([int(num) for num in line.split()])

Hmm, I believe there is a way to write to a file without reading all of it first:

Add text to end of line without loading file 将文本添加到行尾而不加载文件

https://docs.python.org/2.7/library/functions.html#print

from __future__ import print_function
# only needed if you are using Python 2.7
with open("input", "r") as i, open("output.txt", "w") as f:
    for line in i:
        # iterate over the lines of the input file
        line = line.strip()
        # strip() returns a new string without the trailing \n, so keep the result
        print(line, end=" ", file=f)
        # write the line to the output file, followed by a space

This might help. I'm a newbie too, but with better Google-fu XD

Maybe do it line by line. This way it consumes less memory.

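inputfile = open("input.txt", "r")
output = open("output.txt", "a")

while True:
    line = inputfile.readline()
    if not line:
        # readline() returns an empty string at end of file, so stop before parsing
        break

    numbers = line.split(" ")
    integers = list(map(int, numbers))

inputfile.close()
output.close()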

There is probably a newline character \n at the end of each line you read. You should also remove it, for example with line.strip().
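To show what the newline actually does, here is a small check. int() tolerates surrounding whitespace, so a trailing \n on the last token is harmless, but a blank line is not: splitting it on a single space gives [''], and int('') raises ValueError. parse_line is just a hypothetical helper name for the safer variant.

```python
def parse_line(line):
    """Return the integers on one line; a blank line gives an empty list."""
    # split() with no argument splits on any whitespace and drops the newline
    return [int(tok) for tok in line.split()]


line = "3 5 2 1\n"
tokens = line.split(" ")   # the last token still carries the newline
last = int(tokens[-1])     # int() ignores the trailing newline
blank = "\n".rstrip("\n").split(" ")  # [''] -> int('') would raise ValueError
```

So stripping the newline is a good habit, but calling split() without an argument handles both cases in one step.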

If you don't want to consume too much memory (you can run out of it if the file is very large), you need to read the file line by line.

# open the input for reading ('r'); opening it with 'w' would erase it
with open('input.txt', 'r') as inputfile, open('output.txt', 'w') as output:
    for line in inputfile:
        chain = line.split()
        # do some calculations or whatever you need
        # and write those numbers to the new file
        numbers = list(map(int, chain))
        for number in numbers:
            output.write("%d " % number)
