How to find and count a specific word across multiple web pages or URLs using Python

Question

Here is my code. Please check it and correct me.

import requests
from bs4 import BeautifulSoup

url = ["https://www.tensorflow.org/","https://www.tomordonez.com/"]

the_word = input()

r = requests.get(url, allow_redirects=False)
soup = BeautifulSoup(r.content, 'lxml')
words = soup.find(text=lambda text: text and the_word in text)
print(words)
count = len(words)
print('\nUrl: {}\ncontains {} of word: {}'.format(url, count, the_word))

How can I change my code to parse multiple URLs and count how many times a specific word appears?

Answer 1

import requests
from bs4 import BeautifulSoup

url_list = ["https://www.tensorflow.org/", "https://www.tomordonez.com/"]

# the_word = input()
the_word = 'Python'

total_words = []
for url in url_list:
    # With allow_redirects=False a redirect response is parsed as-is,
    # so a URL that redirects may report 0 matches.
    r = requests.get(url, allow_redirects=False)
    # Lowercase the whole document so the match is case-insensitive.
    soup = BeautifulSoup(r.content.lower(), 'lxml')
    # Collect every text node that contains the word.
    words = soup.find_all(string=lambda text: text and the_word.lower() in text)
    count = len(words)
    words_list = [ele.strip() for ele in words]
    total_words.extend(words_list)

    print('\nUrl: {}\ncontains {} of word: {}'.format(url, count, the_word))
    print(words_list)

# print(total_words)
# Number of matching text nodes across all URLs.
total_count = len(total_words)

Output:

Url: https://www.tensorflow.org/
contains 0 of word: Python
[]

Url: https://www.tomordonez.com/
contains 8 of word: Python
['web scraping with python', 'this is a tutorial on web scraping with python. learn to scrape websites with python and beautifulsoup.', 'python unit testing tutorial', 'this is a tutorial about unit testing in python.', 'pip install ssl module in python is not available', 'troubleshooting ssl module in python is not available', 'python context manager', 'a short tutorial about python context manager: "with" statement.']
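
Note that `count` in the code above is the number of text nodes that contain the word, not the number of times the word occurs: a node that mentions the word twice still counts once. Below is a minimal sketch of counting occurrences instead, assuming a case-insensitive substring count is what is wanted. The helper names `count_occurrences`, `page_text` and `count_in_urls` are made up for illustration and are not part of the original answer; the 10-second timeout is likewise an assumption.

```python
def count_occurrences(text, word):
    """Count case-insensitive, non-overlapping occurrences of word in text."""
    if not word:
        return 0  # str.count("") would otherwise return len(text) + 1
    return text.lower().count(word.lower())


def page_text(html):
    """Return the text nodes of an HTML document joined by spaces."""
    # Imported here so count_occurrences works without third-party packages.
    from bs4 import BeautifulSoup
    return BeautifulSoup(html, 'html.parser').get_text(' ')


def count_in_urls(urls, word):
    """Map each URL to the number of occurrences of word in its page text."""
    import requests
    counts = {}
    for url in urls:
        # Assumption: a 10-second timeout and following redirects are acceptable.
        r = requests.get(url, timeout=10)
        counts[url] = count_occurrences(page_text(r.text), word)
    return counts
```

Calling `count_in_urls(url_list, the_word)` with the list from the answer would return one count per URL, and `sum(counts.values())` on the result would give the overall total.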

Answer 2

You can use the re module to find specific text.

import re

import requests
from bs4 import BeautifulSoup

urls = ["https://www.tensorflow.org/", "https://www.tomordonez.com/"]

the_word = 'Tableau'
# Escape the word so any regex metacharacters in it are matched literally.
pattern = re.compile(re.escape(the_word))

for url in urls:
    print(url)
    r = requests.get(url, allow_redirects=False)
    soup = BeautifulSoup(r.text, 'html.parser')
    # Text nodes that contain the word (case-sensitive).
    words = soup.find_all(string=pattern)
    print(len(words))
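
The pattern above is case-sensitive and also matches inside longer words (searching for `Java` would match `JavaScript`). Below is a hedged sketch of a stricter variant, assuming whole-word, case-insensitive matching is wanted. `count_whole_word` and `count_in_strings` are hypothetical helper names, and the strings passed to them are meant to be text nodes or page text obtained as in the loop above.

```python
import re


def count_whole_word(text, word):
    """Count case-insensitive occurrences of word as a whole word in text."""
    if not word:
        return 0
    # re.escape keeps regex metacharacters in the word literal.
    # \b anchors at word boundaries, so this only behaves as expected when
    # the word starts and ends with a letter, digit or underscore.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    return len(pattern.findall(text))


def count_in_strings(strings, word):
    """Sum whole-word occurrences over an iterable of strings."""
    return sum(count_whole_word(s, word) for s in strings)
```

Inside the loop, something like `count_in_strings(soup.find_all(string=True), the_word)` would then give the number of occurrences on one page rather than the number of matching text nodes.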

Solution 1
1 2019-03-15 09:31:56

Solution 2
0 2019-03-15 10:11:17
