Extracting list values when scraping

Question

I'm looking through https://www.nps.gov/index.htm and trying to create a dictionary where the keys are the state names from the drop-down menu and the values are the links to the page containing each state's information.

However, with my current code, each item I get back looks like this:

<li><a href="/state/wy/index.htm">Wyoming</a></li>

At my current skill level I don't know how to extract the state name, because the element doesn't have an id, a class, or any other identifier, right?

So how would I go about achieving this? Here is my current code:

import requests
from bs4 import BeautifulSoup

state_dict = {}

url = 'https://www.nps.gov/index.htm'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
drop_down_search = soup.find('ul', class_="dropdown-menu SearchBar-keywordSearch")
state_search = drop_down_search.find_all('li', recursive=True)

for state in state_search:
    print(state)

Answer 1

You can use the .text property, like this:

import requests
from bs4 import BeautifulSoup

state_dict = {}

url = 'https://www.nps.gov/index.htm'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
drop_down_search = soup.find('ul', class_="dropdown-menu SearchBar-keywordSearch")
state_search = drop_down_search.find_all('li', recursive=True)

for state in state_search:
    print(state.text)

This prints only the text:

Alabama
Alaska
American Samoa
Arizona
Arkansas
...
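
To get from the printed names to the dictionary the question asks for, the `<a>` inside each `<li>` can supply both the key (its text) and the value (its href). Below is a minimal sketch of that idea. It runs against a small inline HTML snippet instead of the live page, so `sample_html` and its two entries are assumptions modelled on the `<li>` shown in the question; the real markup on nps.gov may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the drop-down markup, modelled on the <li>
# shown in the question; the live page may be structured differently.
sample_html = """
<ul class="dropdown-menu SearchBar-keywordSearch">
  <li><a href="/state/al/index.htm">Alabama</a></li>
  <li><a href="/state/wy/index.htm">Wyoming</a></li>
</ul>
"""

soup = BeautifulSoup(sample_html, "html.parser")
menu = soup.find("ul", class_="dropdown-menu SearchBar-keywordSearch")

state_dict = {}
for item in menu.find_all("li"):
    link = item.find("a")
    # Skip list items that have no link or no href attribute.
    if link is None or not link.get("href"):
        continue
    # Key: visible state name; value: the href exactly as written in the page.
    state_dict[link.get_text(strip=True)] = link["href"]

print(state_dict)
```

With the live page, `sample_html` would be replaced by `response.text` from the code above.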

Answer 2

...

for state in state_search:
    for link in state.find_all('a'):
        print("%30s ===> %s" % (link.text, link.get('href')))
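
The href values printed this way are site-relative paths such as /state/wy/index.htm. If the dictionary values should be full links, one option is to resolve each path against the page URL with `urljoin` from the standard library. This is only a sketch: `relative_links` is a hypothetical one-entry dictionary built from the `<li>` shown in the question, not data fetched from the site.

```python
from urllib.parse import urljoin

# URL of the page the links were scraped from (taken from the question).
base_url = "https://www.nps.gov/index.htm"

# Hypothetical example input: state name -> href as it appears in the HTML.
relative_links = {"Wyoming": "/state/wy/index.htm"}

# Resolve every relative href against the page URL.
absolute_links = {
    name: urljoin(base_url, href) for name, href in relative_links.items()
}

print(absolute_links["Wyoming"])
```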

Answer 1: 4 votes, accepted, 2020-03-11 01:30:19

Answer 2: 0 votes, 2020-03-11 01:40:15
