How to count how many lines contain a specific word

Question

I'm not sure whether the if statement is wrong. I tried to split each line, iterate over the words, look for 'the raven', and return the count.

def count_word(file_url, word):
    r = requests.get(file_url, stream=True)
    count = 0

    for line in r.iter_lines():
        words = line.split()
        if line[1:] == 'the raven':
            count += 1
    return count

Answer 1

When you do

`words = line.split()`

you're assigning to the variable words a list of strings - the non-whitespace strings in the line. But you're not doing anything with words after that. Instead, you do:

if line[1:] == 'the raven':

which checks if the whole line, minus its first character, is exactly 'the raven'.
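To make the difference concrete, here is a small sketch using a made-up sample line (the text is an assumption, and it is a plain str rather than the bytes that requests would yield). It shows what split() produces, what the slice comparison actually tests, and what a substring test gives instead:

```python
# Hypothetical sample line, already decoded to str for illustration
line = "Quoth the raven"

# split() returns the whitespace-separated pieces of the line
words = line.split()

# line[1:] is the whole line minus its first character
sliced = line[1:]

# The original comparison only succeeds if the sliced line IS the phrase
is_match = sliced == "the raven"

# A substring test asks whether the phrase occurs anywhere in the line
contains = "the raven" in line

print(words, repr(sliced), is_match, contains)
```

Here the equality check fails even though the phrase is present, which is why the count never increases.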

(Edited for handling unicode/bytes): If you want to add up the total number of times 'the raven' appears in your whole file, you can skip the split and the if and get the count of occurrences directly from each line. Because r.iter_lines() yields bytes objects by default in Python 3, you'll need to decode the lines with the appropriate encoding first:

for line in r.iter_lines():
    count += line.decode('utf-8').count('the raven')

If instead you want to return the total number of lines in which 'the raven' appears at all, you can do:

for line in r.iter_lines():
    if 'the raven' in line.decode('utf-8'):
        count += 1

You may need to choose a different encoding, depending on your data source.
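The two variants above can be sketched as small helpers. The names count_occurrences and count_matching_lines are hypothetical, UTF-8 is assumed as the default encoding, and the helpers accept any iterable of bytes lines so they can be tried without a network request:

```python
def count_occurrences(lines, phrase, encoding="utf-8"):
    """Total number of times phrase appears across all lines."""
    return sum(line.decode(encoding).count(phrase) for line in lines)


def count_matching_lines(lines, phrase, encoding="utf-8"):
    """Number of lines that contain phrase at least once."""
    return sum(1 for line in lines if phrase in line.decode(encoding))


# Stand-in for r.iter_lines(), which yields bytes objects (sample data is made up)
sample = [
    b"the raven sat; the raven spoke",
    b"nothing here",
    b"quoth the raven",
]

total = count_occurrences(sample, "the raven")
matching = count_matching_lines(sample, "the raven")
print(total, matching)
```

With a real response you would pass r.iter_lines() in place of sample. The first helper counts the phrase twice for the first sample line, while the second counts that line only once.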

Answer 2

The following slight edits to your code count every occurrence of the string passed as the parameter word in the file at file_url.

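import requests

def count_word(file_url, word):
    r = requests.get(file_url, stream=True)
    count = 0

    for line in r.iter_lines():
        # iter_lines() yields bytes in Python 3, so decode before counting
        # (utf-8 is an assumption; use the encoding of your data source)
        count += line.decode('utf-8').count(word)

    return count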
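One caveat worth illustrating: str.count matches substrings, so counting 'the' also counts the 'the' inside 'there'. If whole words are what you want, a rough sketch using the standard re module could look like this. The helper name count_whole_word and the sample text are hypothetical, and the text is assumed to be already decoded:

```python
import re


def count_whole_word(text, word):
    """Count occurrences of word that are delimited by word boundaries."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))


# Made-up sample text for illustration
text = "the raven and there the"

substring_hits = text.count("the")             # also matches inside "there"
whole_word_hits = count_whole_word(text, "the")
print(substring_hits, whole_word_hits)
```

Which behaviour is right depends on what you mean by "word"; for a multi-word phrase such as 'the raven', the plain substring count is usually close enough.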
