
How can I create two different lists from the words in a file?

I have a file with the following structure

word1a word2a
word1b word2b
word1c word2c
word1d word2d
        \n -> Empty Line
word11a word21a
word11b word21b
word11c word21c
        \n -> Empty Line
word12a word22a
word12b word22b
word12c word22c
word12d word22d
         \n -> Empty Line
         \n -> Empty Line

I need to create two separate lists which look like

wordList_1 = [[word1a,word1b,word1c,word1d],[word11a,word11b,word11c],[word12a,word12b,word12c,word12d]]
wordList_2 = [[word2a,word2b,word2c,word2d],[word21a,word21b,word21c],[word22a,word22b,word22c,word22d]]

How do I do this efficiently?

I have come up with a solution as below, but I know that I haven't done a good job in achieving my goal. So please take a look at the code below and let me know how I can change it to achieve the desired results.

def fun(fname):
    words1, words2 = [], []
    with open(fname) as f:
        all_the_lines_in_file = f.read()
    # Split the text into blocks separated by an empty line
    blocks = all_the_lines_in_file.split("\n\n")
    for block in blocks:
        w1, w2 = [], []
        for line in block.split("\n"):
            if len(line) > 1:
                w = line.split()
                w1.append(w[0])
                w2.append(w[1])
        if w1:  # skip the empty blocks produced by trailing blank lines
            words1.append(w1)
            words2.append(w2)
    return words1, words2

Maybe a little more efficient than your attempt:

words1, words2 = [], []
with open(fname) as f:
    w1, w2 = [], []
    for line in f:
        if line.strip(): # line is not empty
            words = line.split()
            w1.append(words[0])
            w2.append(words[1])
        elif w1: # blank line closes the current group; repeated blank lines are skipped
            words1.append(w1)
            words2.append(w2)
            w1, w2 = [], []
    # At end of the file
    if w1: # Checking lists are not empty, only need to check w1 or w2
        words1.append(w1) 
        words2.append(w2)

print(words1)
print(words2)

Output:

[['word1a', 'word1b', 'word1c', 'word1d'], ['word11a', 'word11b', 'word11c'], ['word12a', 'word12b', 'word12c', 'word12d']]
[['word2a', 'word2b', 'word2c', 'word2d'], ['word21a', 'word21b', 'word21c'], ['word22a', 'word22b', 'word22c', 'word22d']]
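
As an alternative, the same grouping can be sketched with itertools.groupby, which collapses any run of blank lines into a single separator, so the two trailing empty lines need no special case. The function name split_columns and the sample data are made up for illustration, and the sketch assumes every non-blank line holds at least two words.

```python
from itertools import groupby


def split_columns(lines):
    """Group non-blank lines into blocks and split each block into two columns."""
    words1, words2 = [], []
    # groupby yields runs of consecutive lines that share the same key:
    # True for lines with content, False for blank lines.
    for has_text, block in groupby(lines, key=lambda line: bool(line.strip())):
        if not has_text:
            continue  # a run of blank lines is only a separator
        # Each row is [first_word, second_word]; zip(*rows) transposes rows to columns.
        rows = [line.split()[:2] for line in block]
        col1, col2 = zip(*rows)
        words1.append(list(col1))
        words2.append(list(col2))
    return words1, words2


# Hypothetical sample standing in for the lines of the file
sample = [
    "word1a word2a\n",
    "word1b word2b\n",
    "\n",
    "word11a word21a\n",
    "\n",
    "\n",
]
words1, words2 = split_columns(sample)
print(words1)
print(words2)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example with open(fname) as f: words1, words2 = split_columns(f).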
