
Python: How do I include multiple text files in my code?

I'm using Python 3 on Windows. How do I include multiple text files so that my code runs over more than one file?

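import operator
import re
import string

# Read the file and split it into lowercase words
with open('D.txt') as f:
    article_one = re.findall(r'\w+', f.read().lower())

wordbank = {}

for word in article_one:
    # \w+ also matches underscores, so strip leading and trailing punctuation
    word = word.lower().strip(string.punctuation)
    if word not in wordbank:
        wordbank[word] = 1
    else:
        wordbank[word] += 1

# Sort by count, lowest first
sortedwords = sorted(wordbank.items(), key=operator.itemgetter(1))

for word in sortedwords:
    print(word[1], word[0])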

I imagine you could just concatenate your files together before doing the regex, or just loop through the files. You can also use the collections.Counter dictionary to get the word frequency in the word list.

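import re
from collections import Counter

# Collect the words from every file into one list
words = []
for filename in ['A.txt', 'D.txt']:
    with open(filename, 'r') as f:
        words.extend(re.findall(r'\w+', f.read().lower()))

wordbank = Counter(words)

# most_common() yields (word, count) pairs, highest count first
for word, cnt in wordbank.most_common():
    print(word, cnt)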
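
The other option mentioned above, concatenating the files before running the regex, could look roughly like this. It is only a sketch: the function name count_words_concatenated is made up for illustration, and it assumes the files are small enough to hold in memory and readable with the default encoding.

```python
import re
from collections import Counter


def count_words_concatenated(filenames):
    """Join the text of all files, then count the words in one pass."""
    texts = []
    for filename in filenames:
        with open(filename, "r") as f:
            texts.append(f.read())
    # Join with a newline so the last word of one file
    # does not fuse with the first word of the next.
    combined = "\n".join(texts)
    return Counter(re.findall(r"\w+", combined.lower()))
```

Calling count_words_concatenated(['A.txt', 'D.txt']).most_common() should give the same (word, count) pairs as the loop version.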

Something like this, which loops over a list of files:

In this example you can build filelist however you want, for instance with glob or any other method. If you need help with that, please tell us your criteria for choosing the files.

import operator
import re
import string

filelist = ['D.txt', 'E.txt']
wordbank = {}  # shared across all files, so the counts accumulate

for file in filelist:
    with open(file) as f:
        article_one = re.findall(r'\w+', f.read().lower())

    for word in article_one:
        word = word.lower().strip(string.punctuation)
        if word not in wordbank:
            wordbank[word] = 1
        else:
            wordbank[word] += 1

# Sort by count, lowest first
sortedwords = sorted(wordbank.items(), key=operator.itemgetter(1))

for word in sortedwords:
    print(word[1], word[0])

You could use the "glob" module to get a list of all the files that match a pattern (e.g. *.txt). Once you have that list, you can iterate over it, opening each file in turn and running the steps you're already doing.

https://docs.python.org/3/library/glob.html
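
A minimal sketch of that idea, assuming the files can be read with the default encoding. The function name count_words and the "*.txt" pattern are placeholders for illustration, not anything from the question.

```python
import glob
import re
from collections import Counter


def count_words(pattern):
    """Count word frequencies across every file matching a glob pattern."""
    wordbank = Counter()
    # glob.glob() returns matches in no guaranteed order;
    # sorted() just makes the processing order predictable.
    for filename in sorted(glob.glob(pattern)):
        with open(filename, "r") as f:
            # update() adds this file's word counts to the running totals
            wordbank.update(re.findall(r"\w+", f.read().lower()))
    return wordbank
```

For example, count_words("*.txt").most_common() would list every word found in the .txt files of the current directory, most frequent first. If no file matches, the result is simply an empty Counter.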


 