I'm not sure whether the if statement is wrong. I tried to split each line, iterate through each index, find 'the raven', and return the count.
    def count_word(file_url, word):
        r = requests.get(file_url, stream=True)
        count = 0
        for line in r.iter_lines():
            words = line.split()
            if line[1:] == 'the raven':
                count += 1
        return count
When you do `words = line.split()`, you're assigning to the variable `words` a list of strings: the non-whitespace strings in the line. But you're not doing anything with `words` after that. Instead, you do:
    if line[1:] == 'the raven':
which checks if the whole line, minus its first character, is exactly 'the raven'.
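To make the difference concrete, here is a small sketch of what `split()` and `[1:]` each produce. The sample line is made up for illustration, and it is written as a bytes literal on the assumption that you are on Python 3, where `iter_lines()` yields bytes:

```python
# A made-up sample line, as iter_lines() would yield it in Python 3 (bytes, not str).
line = b"Quoth the raven, nevermore"

words = line.split()   # list of whitespace-separated byte strings
sliced = line[1:]      # the whole line minus its first byte

print(words)
print(sliced)

# A bytes object never compares equal to a str, so this test cannot succeed.
matches = (sliced == 'the raven')
print(matches)
```

So even on a line that does contain the phrase, the original condition would not increment the count.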
(Edited to handle unicode/bytes): If you want to add up the total number of times 'the raven' appears in your whole file, you can skip the `split` and the `if` and get the count of occurrences directly from each line. Because requests gives you `bytes` objects from `iter_lines()` (in Python 3; in Python 2 they are byte-string `str` objects), you'll need to decode the lines with the appropriate encoding first:
    for line in r.iter_lines():
        count += line.decode('utf-8').count('the raven')
If instead you want to return the total number of lines in which 'the raven' appears at all, you can do:
    for line in r.iter_lines():
        if 'the raven' in line.decode('utf-8'):
            count += 1
You may need to choose a different encoding, depending on your data source.
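The two counting strategies can give different answers when a phrase appears more than once on a line. The sketch below shows this without any network access: the helper names `count_occurrences` and `count_matching_lines` and the sample data are hypothetical, and a plain list of bytes objects stands in for `r.iter_lines()`.

```python
def count_occurrences(lines, phrase, encoding="utf-8"):
    """Total number of times phrase appears across all lines."""
    return sum(line.decode(encoding).count(phrase) for line in lines)


def count_matching_lines(lines, phrase, encoding="utf-8"):
    """Number of lines that contain phrase at least once."""
    return sum(1 for line in lines if phrase in line.decode(encoding))


# Stand-in for r.iter_lines(): a list of bytes objects (made-up data).
sample = [b"the raven and the raven", b"nothing here", b"quoth the raven"]

total = count_occurrences(sample, "the raven")
matching = count_matching_lines(sample, "the raven")
print(total, matching)  # prints: 3 2
```

The first line contains the phrase twice, so it adds 2 to the occurrence count but only 1 to the line count.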
The following slight edits to your code will allow you to count any word, as defined by the parameter `word`, in the file defined by `file_url`.
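    import requests

    def count_word(file_url, word):
        r = requests.get(file_url, stream=True)
        count = 0
        for line in r.iter_lines():
            # iter_lines() yields bytes in Python 3, so decode before counting a str.
            # UTF-8 is assumed here; use the encoding that matches your data source.
            count += line.decode('utf-8').count(word)
        return count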
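One caveat: `str.count` counts substrings, so counting 'the' would also count the 'the' inside 'other' or 'bathe'. If you want whole words only, a regular expression with word boundaries is one option. This is a sketch under assumptions, not part of the original answer: the helper name `count_whole_word` and the sample data are hypothetical, and a list of bytes objects again stands in for `r.iter_lines()`.

```python
import re


def count_whole_word(lines, word, encoding="utf-8"):
    """Count occurrences of word that are delimited by word boundaries."""
    # re.escape keeps any special characters in word from being read as regex syntax.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return sum(len(pattern.findall(line.decode(encoding))) for line in lines)


# Made-up data standing in for r.iter_lines().
sample = [b"the other raven", b"bathe the cat"]

substring_count = sum(line.decode("utf-8").count("the") for line in sample)
whole_word_count = count_whole_word(sample, "the")
print(substring_count, whole_word_count)  # prints: 4 2
```

Both approaches are case-sensitive, so 'The' at the start of a sentence is not counted by either.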