Pick parts from a txt file and copy them to another file using Python

Question

I'm stuck here. I need to read a .txt file that contains a sequence of records, check for the records I want, and copy them to a new file. The file content looks like this (this is just an example; the original file has more than 30,000 lines):

AAAAA|12|120 #begin file
00000|46|150 #begin register
03000|TO|460 
99999|35|436 #end register
00000|46|316 #begin register
03000|SP|467
99999|33|130 #end register
00000|46|778 #begin register
03000|TO|478
99999|33|457 #end register
ZZZZZ|15|111 #end file

The records whose 03000 line contains the characters 'TO' must be written to a new file. Based on the example, the new file should look like this:

AAAAA|12|120 #begin file
00000|46|150 #begin register
03000|TO|460 
99999|35|436 #end register
00000|46|778 #begin register
03000|TO|478
99999|33|457 #end register
ZZZZZ|15|111 #end file

Code:

file = open("file.txt",'r')
newFile = open("newFile.txt","w")    
content = file.read()
file.close()
# Here I need to check whether the 03000 line of a record contains 'TO'; if it does, copy that whole record (00000 through 99999) to the new file.

I did multiple searches and found nothing that helped. Thank you!

Answer 1

with open("file.txt",'r') as inFile, open("newFile.txt","w") as outFile:
    outFile.writelines(line for line in inFile 
                       if line.startswith("03000") and "TO" in line)

If you also need the previous and the next line, you have to iterate over inFile in triads. First define:

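def gen_triad(lines, prev=None):
    """Yield (previous, current, next) for every line that has a successor."""
    lines = iter(lines)  # accept any iterable, not only file objects
    current = next(lines, None)
    if current is None:  # empty input: nothing to yield
        return
    for after in lines:
        yield prev, current, after
        prev, current = current, after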

Then proceed as before:

outFile.writelines(''.join(triad) for triad in gen_triad(inFile) 
                   if triad[1].startswith("03000") and "TO" in triad[1])
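
For reference, here is a self-contained sketch of the same neighbour-line idea run against the sample from the question. It is not the code above: `select_with_neighbours` and `SAMPLE` are illustrative names, and it assumes every record is exactly three lines long and that the first and last lines of the file are the header and footer.

```python
SAMPLE = [
    "AAAAA|12|120 #begin file\n",
    "00000|46|150 #begin register\n",
    "03000|TO|460\n",
    "99999|35|436 #end register\n",
    "00000|46|316 #begin register\n",
    "03000|SP|467\n",
    "99999|33|130 #end register\n",
    "00000|46|778 #begin register\n",
    "03000|TO|478\n",
    "99999|33|457 #end register\n",
    "ZZZZZ|15|111 #end file\n",
]

def select_with_neighbours(lines, prefix="03000|TO|"):
    """Keep header, footer, and each matching line with its two neighbours."""
    lines = list(lines)
    if len(lines) < 2:  # no separate header and footer to copy
        return lines
    kept = [lines[0]]  # header (assumed to be the first line)
    # zip over three offsets gives (previous, current, next) windows
    for prev, current, after in zip(lines, lines[1:], lines[2:]):
        if current.startswith(prefix):
            kept.extend((prev, current, after))
    kept.append(lines[-1])  # footer (assumed to be the last line)
    return kept

result = select_with_neighbours(SAMPLE)
print("".join(result), end="")
```

This reads the whole file into a list, which should be fine for a file of about 30,000 lines; the generator above is the better fit if the input has to be streamed.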

Answer 2

import re

# raw strings keep the regex escapes (\| and \d) away from Python's own string escapes
pat = (r'^00000\|\d+\|\d+.*\n'
       r'^03000\|TO\|\d+.*\n'
       r'^99999\|\d+\|\d+.*\n'
       r'|'
       r'^AAAAA\|\d+\|\d+.*\n'
       r'|'
       r'^ZZZZZ\|\d+\|\d+.*')
rag = re.compile(pat,re.MULTILINE)

with open('fifi.txt','r') as f,\
     open('newfifi.txt','w') as g:
    g.write(''.join(rag.findall(f.read())))

For files that have additional lines between the lines beginning with 00000, 03000 and 99999, I didn't find any code simpler than this:

import re

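# group 1: a whole record (00000 ... 99999); group 2: the header or footer line
pat = (r'(^00000\|\d+\|\d+.*\n'
       r'(?:.*\n)+?'
       r'^99999\|\d+\|\d+.*\n)'
       r'|'
       r'(^AAAAA\|\d+\|\d+.*\n'
       r'|'
       r'^ZZZZZ\|\d+\|\d+.*)')
rag = re.compile(pat,re.MULTILINE)

# matches only a record that contains a 03000|TO| line
pit = (r'^00000\|.+?^03000\|TO\|\d+.+?^99999\|')
rig = re.compile(pit,re.DOTALL|re.MULTILINE)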

def yi(text):
    for g1,g2 in rag.findall(text):
        if g2:
            yield g2
        elif rig.match(g1):
            yield g1

with open('fifi.txt','r') as f,\
     open('newfifi.txt','w') as g:
    g.write(''.join(yi(f.read())))
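
As a quick check, the first pattern can be run against the sample from the question without touching any files. This is only a sketch: `SAMPLE_TEXT`, `RECORD_OR_EDGE` and `kept` are names made up for the illustration, and the in-memory string stands in for the contents of the real file.

```python
import re

SAMPLE_TEXT = (
    "AAAAA|12|120 #begin file\n"
    "00000|46|150 #begin register\n"
    "03000|TO|460\n"
    "99999|35|436 #end register\n"
    "00000|46|316 #begin register\n"
    "03000|SP|467\n"
    "99999|33|130 #end register\n"
    "00000|46|778 #begin register\n"
    "03000|TO|478\n"
    "99999|33|457 #end register\n"
    "ZZZZZ|15|111 #end file\n"
)

# Either a complete three-line record whose middle line is 03000|TO|...,
# or the header line, or the footer line.
RECORD_OR_EDGE = re.compile(
    r"^00000\|\d+\|\d+.*\n"
    r"^03000\|TO\|\d+.*\n"
    r"^99999\|\d+\|\d+.*\n"
    r"|"
    r"^AAAAA\|\d+\|\d+.*\n"
    r"|"
    r"^ZZZZZ\|\d+\|\d+.*",
    re.MULTILINE,
)

# With no capturing groups, findall returns the full text of each match.
kept = "".join(RECORD_OR_EDGE.findall(SAMPLE_TEXT))
print(kept)
```

The record with `03000|SP|` is skipped because no alternative can match starting at its 00000 line, so the regex engine moves on to the next record.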

Answer 3

file = open("file.txt",'r')
newFile = open("newFile.txt","w")
content = file.readlines()
file.close()
newFile.writelines(filter(lambda x:x.startswith("03000") and "TO" in x,content))
newFile.close()  # close the output so everything is flushed to disk

Answer 4

This seems to work. The other answers seem to write out only the records that contain '03000|TO|', but you also have to write out the records before and after each of them.

import sys
# ---------------------------------------------------------------
# import file
file_name = sys.argv[1]
file_path = 'C:\\DATA_SAVE\\pick_parts\\' + file_name
file = open(file_path,"r")
# ---------------------------------------------------------------
# create output file
output_file_path = 'C:\\DATA_SAVE\\pick_parts\\' + file_name + '.out'
output_file = open(output_file_path,"w")

# ---------------------------------------------------------------
# process file

temp = ''
temp_out = ''
good_write = False
bad_write = False
for line in file:
    if line[:5] == 'AAAAA':
        temp_out += line 
    elif line[:5] == 'ZZZZZ':
        temp_out += line
    elif good_write:
        temp += line
        temp_out += temp
        temp = ''
        good_write = False
    elif bad_write:
        bad_write = False
        temp = ''
    elif line[:5] == '03000':
        if line[6:8] != 'TO':
            temp = ''
            bad_write = True
        else:
            good_write = True
            temp += line
            temp_out += temp 
            temp = ''
    else:
        temp += line

output_file.write(temp_out)
output_file.close()
file.close()

Output:

AAAAA|12|120 #begin file
00000|46|150 #begin register
03000|TO|460 
99999|35|436 #end register
00000|46|778 #begin register
03000|TO|478
99999|33|457 #end register
ZZZZZ|15|111 #end file
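
The same state-tracking idea can be sketched more compactly by buffering each record from its 00000 line to its 99999 line and deciding only once the record is complete. This is not the code above: `filter_records` and `wanted_prefix` are hypothetical names, and the sketch assumes every record starts with a 00000 line and ends with a 99999 line, with any number of lines in between.

```python
def filter_records(lines, wanted_prefix="03000|TO|"):
    """Yield header/footer lines plus every whole record that has a wanted line."""
    record = None  # None means we are currently outside a 00000..99999 record
    for line in lines:
        if line.startswith("00000"):
            record = [line]  # a new record begins
        elif record is not None:
            record.append(line)
            if line.startswith("99999"):
                # Record complete: emit it only if some line has the wanted prefix.
                if any(item.startswith(wanted_prefix) for item in record):
                    yield from record
                record = None
        else:
            # Lines outside any record (AAAAA header, ZZZZZ footer) pass through.
            yield line

sample = [
    "AAAAA|12|120 #begin file\n",
    "00000|46|150 #begin register\n",
    "03000|TO|460\n",
    "99999|35|436 #end register\n",
    "00000|46|316 #begin register\n",
    "03000|SP|467\n",
    "99999|33|130 #end register\n",
    "00000|46|778 #begin register\n",
    "03000|TO|478\n",
    "99999|33|457 #end register\n",
    "ZZZZZ|15|111 #end file\n",
]
kept = list(filter_records(sample))
print("".join(kept), end="")
```

Because it is a generator, it can be fed an open file directly and passed to `writelines` on the output file, so only one record is held in memory at a time.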

Answer 5

Does it have to be Python? These shell commands would do the same thing in a pinch.

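head -n 1 inputfile.txt > outputfile.txt
# --no-group-separator (GNU grep) suppresses the "--" lines printed between non-adjacent matches
grep --no-group-separator -C 1 "03000|TO" inputfile.txt >> outputfile.txt
tail -n 1 inputfile.txt >> outputfile.txt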

Answer 6

# Whenever I have to parse text files I prefer to use regular expressions
# You can also customize the matching criteria if you want to
import re
what_is_being_searched = re.compile("^03000.*TO")

# don't use "file" as a variable name since it is (was?) a builtin 
# function 
with open("file.txt", "r") as source_file, open("newFile.txt", "w") as destination_file:
    for this_line in source_file:
        if what_is_being_searched.match(this_line):
            destination_file.write(this_line)

And for those who prefer a more compact representation:

import re

with open("file.txt", "r") as source_file, open("newFile.txt", "w") as destination_file:
    destination_file.writelines(this_line for this_line in source_file 
                                if re.match("^03000.*TO", this_line))
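
One caveat with a pattern like `^03000.*TO` is that it accepts 'TO' anywhere after the prefix, including inside a trailing comment. A stricter check, sketched here on the assumption that fields are separated by `|` and the code sits in the second field, compares the fields directly. `is_target` and the `tricky` line are made up for the illustration.

```python
import re

def is_target(line, record_type="03000", code="TO"):
    """Return True only if the first two '|'-separated fields match exactly."""
    fields = line.rstrip("\n").split("|")
    return len(fields) >= 2 and fields[0] == record_type and fields[1] == code

# Made-up line: 'TO' appears only inside the trailing comment.
tricky = "03000|SP|467 #TODO\n"

regex_hit = bool(re.match("^03000.*TO", tricky))
field_hit = is_target(tricky)
print(regex_hit, field_hit)  # True False
```

`is_target` can replace the regular-expression test in either loop above without any other change.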

Answer 7

Code:

fileName = '1'

##step 1: parse the file.
parsedFile = []
fil = open(fileName,'r')
for i in fil:
    firstPipe = i.find('|')
    secondPipe = i.find('|',firstPipe+1)
    ##each line becomes a tuple of its three fields
    tuple1 = (i[:firstPipe],
              i[firstPipe+1:secondPipe],
              i[secondPipe+1:].rstrip('\n'))
    parsedFile.append(tuple1)
fil.close()

##search criteria:
searchFirst = '03000'
searchString = 'TO'  ##can be changed if and when required

##step 2: use the parsed contents to write the new file
filout = open('newFile','w')

stringToWrite = parsedFile[0][0] + '|' + parsedFile[0][1] + '|' + parsedFile[0][2] + '\n'
filout.write(stringToWrite)  ##to write the first entry

##stop before the last entry so that parsedFile[i+1] always exists
for i in range(1,len(parsedFile)-1):
    if parsedFile[i][1] == searchString and parsedFile[i][0] == searchFirst:
        ##write the matching line together with the lines before and after it
        for j in range(-1,2,1):
            stringToWrite = parsedFile[i+j][0] + '|' + parsedFile[i+j][1] + '|' + parsedFile[i+j][2] + '\n'
            filout.write(stringToWrite)

stringToWrite = parsedFile[-1][0] + '|' + parsedFile[-1][1] + '|' + parsedFile[-1][2] + '\n'
filout.write(stringToWrite)  ##to write the last entry

filout.close()

I know that this solution may be a bit long, but it is quite easy to understand and seems an intuitive way to do it. I have already checked it against the data you provided, and it works.

Please tell me if you need more explanation of the code, and I will add it.

Answer 8

I found the tips (from Beasley and Joran elyase) very interesting, but they only get the contents of the 03000 line. I would like to get the contents from the 00000 line through the 99999 line. I even managed to do it here, but I am not satisfied; I wanted something cleaner. Here is how I did it:

file = open(url,'r')
newFile = open("newFile.txt",'w')
lines = file.readlines()
file.close()
lineTemp = []
state = None
for line in lines:
    lineTemp.append(line)
    if line[0:5] == '03000':
        # position of the state code as posted; in the sample above it would be line[6:8]
        state = line[21:23]
    if line[0:5] == '99999':
        if state == 'TO':
            newFile.writelines(lineTemp)
        # start a fresh buffer for the next record either way
        lineTemp = []
        state = None
newFile.close()

Suggestions are welcome. Thanks to all!

8 answers

Answer 1
6 2013-05-15 21:00:40

Answer 2
0 2013-05-17 03:00:29

Answer 3
-1 2013-05-15 20:50:33

Answer 4
-1 2013-05-16 14:25:01

Answer 5
-1 2013-05-18 19:37:43

Answer 6
-2 2013-05-15 22:56:54

Answer 7
-2 2013-05-16 15:28:59

Answer 8
-2 2013-05-16 19:08:44
