
Counting occurrences of a word in a file

I'm new to Python and I have no idea what to do. So this is the question:

Write a function that takes a filename and a word (or, if no word is given, assumes the word is "hello") and returns an integer representing how many times that word appears in the file, except that instances in the first line of the file don't count.

Call the function with two arguments: a filename, which must be in the path, and the word to count inside the file. If you don't pass the word, the function assumes the default word "hello".

def wordcount(filename, word="hello"):
    # Open the file; it is closed automatically when the block ends
    with open(filename) as file:
        # Skip the first line (the None default avoids StopIteration on an empty file)
        next(file, None)
        # Read the rest of the file as one string and count the occurrences
        return file.read().count(word)

The count method returns the number of occurrences of a substring in the given string. You can also save the first line with x = next(file) if you want to use it later.
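Because count matches substrings, it also finds the word inside longer words, so "hellow" counts as a hit for "hello". Here is a minimal sketch of a whole-word variant, assuming that is the behaviour you want. The name wordcount_whole and the utf-8 encoding are assumptions made for illustration, not part of the answer above.

```python
import re


def wordcount_whole(filename, word="hello"):
    """Count whole-word occurrences of `word`, skipping the first line.

    Hypothetical variant of wordcount(); the name is made up for this sketch.
    """
    # \b is a word boundary, so "hello" does not match inside "hellow"
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    with open(filename, encoding="utf-8") as file:
        # Skip the first line; the None default avoids StopIteration on an empty file
        next(file, None)
        return len(pattern.findall(file.read()))
```

For comparison, "hellow hello".count("hello") gives 2, while the pattern above matches only the second word.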

Call the function and print the result with print(wordcount("sample.txt", "repeat")) to count how many times the word 'repeat' appears in the file.

The sample.txt contains:

Hello ! My name is João, i will repeat this !
Hello ! My name is João, i will repeat this !
Hello ! My name is João, i will repeat this !
Hello ! My name is João, i will repeat this !
Hello ! My name is João, i will repeat this !

The result must be 4 :)
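To check this yourself, the sketch below writes the same five lines to a temporary file and runs the counting logic on it. The temporary directory and the explicit utf-8 encoding (needed here because of the "ã" in João) are assumptions made only for the demonstration.

```python
import os
import tempfile


def wordcount(filename, word="hello"):
    with open(filename, encoding="utf-8") as file:
        next(file, None)  # skip the first line
        return file.read().count(word)


line = "Hello ! My name is João, i will repeat this !\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(line * 5)  # five identical lines, as in sample.txt
    result = wordcount(path, "repeat")
    default_result = wordcount(path)  # falls back to the default word "hello"

print(result)          # 4: the first of the five lines is skipped
print(default_result)  # 0: the file has "Hello", and count is case-sensitive
```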

from pathlib import Path

def count(filename: str, word: str = "hello") -> int:
    text = Path(filename).read_text()
    # Drop the first line, which must not be counted
    lines_excluding_first = text.split("\n")[1:]

    # Split each line on spaces and count the tokens that equal the word exactly
    return sum(
        token == word
        for line in lines_excluding_first
        for token in line.split(" ")
    )

Example: say you have a txt file like:

sample.txt
----------

this is the first line so this dose not count!
hello!! this is a sample text.
text which will be used as sample.
if nothing no word specified hellow is returned!
it will return the count.

print(count("sample.txt"))
## output: 0 ("hello!!" and "hellow" are not exact matches for "hello")

EDIT:

I have made a small correction in the code: now "hellow!!" and "hellow" are two separate words. Words are separated by blank spaces and then checked for equality. Therefore "hellow" and "hellow." are also different!
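If you would rather have "hello!!" count as "hello", one option is to strip the surrounding punctuation from each token before comparing. This is a sketch of an alternative, not the code from the answer; the name count_ignoring_punctuation and the utf-8 encoding are assumptions.

```python
import string
from pathlib import Path


def count_ignoring_punctuation(filename, word="hello"):
    """Hypothetical variant: exact match after trimming punctuation."""
    # splitlines() handles "\n" and "\r\n"; [1:] drops the first line
    lines = Path(filename).read_text(encoding="utf-8").splitlines()[1:]
    return sum(
        # strip() removes punctuation only at the ends of the token
        token.strip(string.punctuation) == word
        for line in lines
        for token in line.split()
    )
```

With this version "hello!!" and "hello." both match "hello", while "hellow" still does not.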

As per the request, here is how it will look on repl.it:

Make a sample.txt first, then main.py. You can see the output as 2 for the default "hello".
[The output is case-sensitive, i.e. Hello and hello are not the same.]
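If case should not matter, both sides can be normalised before the comparison. Here is a minimal sketch, assuming str.casefold() is an acceptable way to ignore case; count_case_insensitive is a hypothetical name and the utf-8 encoding is an assumption.

```python
from pathlib import Path


def count_case_insensitive(filename, word="hello"):
    """Hypothetical variant: exact word match that ignores case."""
    target = word.casefold()
    # Drop the first line, which must not be counted
    lines = Path(filename).read_text(encoding="utf-8").splitlines()[1:]
    return sum(
        token.casefold() == target
        for line in lines
        for token in line.split()
    )
```

With this version "Hello", "hello" and "HELLO" all count as the same word, but a token with attached punctuation such as "hello!" is still different.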
