Python - need to change a list with uppercase and lowercase words into all lowercase

My project is pretty basic... I need to take a text file of the Gettysburg Address and count the number of words and the number of unique words. I've gotten pretty much all the way to the end, but it's double-counting words that are the same except for a capital first letter, e.g. But and but. I'm not sure how to fix this :( Here is what I have so far:

def main():
    getty = open('Gettysburgaddress.txt','r')
    lines = getty.readlines()
    getty.close()

    index = 0
    while index < len(lines):
        lines[index] = lines[index].rstrip('\n')
        index += 1

    words = [word for line in lines for word in line.split()]

    size = len(words)

    print('\nThere are', size,'words in the Gettysburg Address.\n')

    unique = list(set(words))

    size_unique = len(unique)

    print('There are', size_unique,'unique words in the Gettysburg Address.\n')

    unique.sort()

    print('Sorted order of unique words:', unique)

    close = input('')

main()

Lowercase the words while collecting them:

words = [word.lower() for line in lines for word in line.split()]

or when creating the set of unique words:

unique = list(set(word.lower() for word in words))
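To see the effect on a small scale, here is a minimal sketch. The sample list is made up for illustration and simply stands in for the words read from the file:

```python
# Hypothetical sample standing in for the words read from the file.
words = ['But', 'in', 'a', 'larger', 'sense', 'but', 'In']

# Without lowercasing, 'But'/'but' and 'In'/'in' count as different words.
unique_case_sensitive = set(words)

# Lowercasing each word before building the set merges them.
unique = sorted(set(word.lower() for word in words))

print(len(unique_case_sensitive))  # 7
print(unique)                      # ['a', 'but', 'in', 'larger', 'sense']
```

Note that with this second approach the total word count (len(words)) is unaffected, since only the set of unique words is lowercased.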

You could simplify your file-loading code a little more:

with open('Gettysburgaddress.txt','r') as getty:
    words = [word.lower() for line in getty for word in line.split()]

This loads the file into a list of lower-cased words in one step, and the with statement also takes care of closing the file for you. The separate rstrip('\n') loop is no longer needed, because split() with no arguments already discards the trailing newline.
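Putting the pieces together, the whole task could be sketched roughly as follows. The function name count_words and its path parameter are my own choices for illustration, not something from the question:

```python
def count_words(path):
    """Return (total word count, sorted list of unique lower-cased words)."""
    # Iterating over the file object yields one line at a time, and
    # split() with no arguments drops the trailing newline as well.
    with open(path, 'r') as getty:
        words = [word.lower() for line in getty for word in line.split()]
    # set() removes duplicates; sorted() returns them as an ordered list.
    return len(words), sorted(set(words))
```

Calling size, unique = count_words('Gettysburgaddress.txt') then gives the total in size and the number of unique words as len(unique). Keep in mind that split() only separates on whitespace, so a word with punctuation attached (such as "but,") would still be counted separately from "but".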
