
Split a txt file into two lists in Python

I need the fastest way to read a txt file and return two lists.

The file starts with some empty rows, followed by an n*m table that I need to store in a list. After that come some more empty rows and then a list of words, where each row may contain one or more words separated by spaces.

This is what I have done, but it is too slow in my opinion. I am probably looking for something based on iteration.

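    def read_lists(file):  # function name assumed; the original snippet omitted the def line
        with open(file) as f:
            matrix = []
            word = []
            c = 0        # set to 1 once the first table row has been read
            a = False    # set to True once the blank rows after the table are reached
            for line in f:
                if line == '\n':
                    if c == 1:
                        a = True
                    continue
                if a:
                    # word section: a row may hold several space-separated words
                    word.extend(line.split())
                else:
                    # table section: keep the whole row without its newline
                    matrix.append(line.strip('\n'))
                    c = 1
        # the with block already closes the file, so no f.close() is needed
        matrix = [x.upper() for x in matrix]
        word = [x.upper() for x in word]
        return matrix, word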

An example file would look like this, where '\n' denotes a blank row:

example.txt:
\n
\n
\n
jebvoqbfvqoif
feqbfoeqbfoie
qfenfoeiqnfoi
ejfnqoeifboqe
nefoineoifneo
nfeqiofhneoif
enfqoinfeoifn
fewknfoiewnfn
\n
\n
\n
efwhhewof eiwofoiefw fwnenfif
wefioh
wfeno
ewfioef
oefkofeofo

The output should be:

list 1:
[
"jebvoqbfvqoif",
"feqbfoeqbfoie",
"qfenfoeiqnfoi",
"ejfnqoeifboqe",
"nefoineoifneo",
"nfeqiofhneoif",
"enfqoinfeoifn",
"fewknfoiewnfn"]
list 2:
[
"efwhhewof",
"eiwofoiefw", 
"fwnenfif",
"wefioh",
"wfeno",
"ewfioef",
"oefkofeofo"]

One possible solution uses itertools.groupby and itertools.chain (the file sample.txt contains the text given in the question).

For example:

from itertools import groupby, chain

with open('sample.txt', 'r') as f_in:
    matrix, words = [[*chain(*map(str.split, g))] for v, g in groupby(map(str.strip, f_in), lambda k: k != '') if v]

from pprint import pprint
pprint(matrix)
pprint(words)

Prints:

['jebvoqbfvqoif',
 'feqbfoeqbfoie',
 'qfenfoeiqnfoi',
 'ejfnqoeifboqe',
 'nefoineoifneo',
 'nfeqiofhneoif',
 'enfqoinfeoifn',
 'fewknfoiewnfn']
['efwhhewof',
 'eiwofoiefw',
 'fwnenfif',
 'wefioh',
 'wfeno',
 'ewfioef',
 'oefkofeofo']
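
The one-liner above packs several steps together, so here is a more explicit sketch of the same groupby idea. The function names split_blocks and read_matrix_and_words are made up for illustration, and the sketch assumes the file contains exactly two runs of non-blank lines (the table, then the words). Unlike the one-liner, it only splits the word rows on whitespace and keeps table rows whole, which is an assumption about what is wanted.

```python
from itertools import groupby


def split_blocks(lines):
    """Group consecutive non-blank lines into blocks, skipping blank runs."""
    stripped = (line.strip() for line in lines)
    # groupby yields (key, group) pairs; the key is True for runs of non-blank lines
    return [list(group) for non_blank, group in groupby(stripped, key=bool) if non_blank]


def read_matrix_and_words(lines):
    """Return (table rows, flat word list) from an iterable of lines."""
    blocks = split_blocks(lines)
    if len(blocks) != 2:
        raise ValueError(f"expected 2 non-blank blocks, found {len(blocks)}")
    matrix_rows, word_rows = blocks
    # Table rows are kept whole; only the word rows are split on whitespace.
    words = [word for row in word_rows for word in row.split()]
    return matrix_rows, words


# A shortened stand-in for the sample file, given as a list of lines.
sample = [
    "\n", "\n",
    "jebvoqbfvqoif\n", "feqbfoeqbfoie\n",
    "\n", "\n",
    "efwhhewof eiwofoiefw fwnenfif\n", "wefioh\n",
]
matrix, words = read_matrix_and_words(sample)
print(matrix)  # ['jebvoqbfvqoif', 'feqbfoeqbfoie']
print(words)   # ['efwhhewof', 'eiwofoiefw', 'fwnenfif', 'wefioh']
```

Because the functions accept any iterable of lines, an open file object can be passed directly, for example `with open('sample.txt') as f: matrix, words = read_matrix_and_words(f)`. Whether this is faster than the original loop would need to be measured on the real file.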
