How to extract text between TD tags using BeautifulSoup and requests in Python 2.7

Question

I am trying to extract the text between TD tags using BeautifulSoup and requests in Python 2.7. So far I get nothing at all with this code :(

import requests
from bs4 import BeautifulSoup

# Set up the Spider

def card_search(max_pages):
    page = 1
    mtgset = 'portal'
    card = 'lava-axe'

    while page <= max_pages:
        url = 'http://www.mtgotraders.com/store/search-results.html?q=lava+axe&x=0&y=0'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)

        for text in soup.findAll('td',{'class': 'price mod'}):
            pagetext = text.get('td')

            print(pagetext)
            page += 1

card_search(1)

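I am trying to automate sorting and valuing my MTG card collection, so the results from the website used in the code sample are very important. I know the site can be parsed, because I can get links back from it. Sadly, I just can't get the plain text to come through.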

Here is the code that pulls the links. It does not target the table, just the page itself.

import requests
from bs4 import BeautifulSoup

# Set up the Spider

def card_search(max_pages):
    page = 1
    mtgset = 'portal'
    card = 'lava-axe'

    while page <= max_pages:
        url = 'http://www.mtgotraders.com/store/search-results.html?q=lava+axe&x=0&y=0'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)

        for text in soup.findAll('a'):
            pagetext = text.get('href')

            print(pagetext)
            page += 1

card_search(1)

Kind regards, Sour Jack

Answer 1

If you want more flexibility when scraping, you will need something like PhantomJS. Take a look at Pickler's answer here.
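
Separately from the question of JavaScript rendering, one detail in the question's first snippet can be checked offline: `text.get('td')` looks up an HTML attribute named `td` on the tag, which does not exist, so it returns `None`. The text between the tags comes from `get_text()`. The following is a minimal sketch under assumptions: `SAMPLE_HTML` and `extract_prices` are hypothetical names, and the sample markup is invented for illustration. The real page's markup may differ, and if the price cells are filled in by JavaScript they will not appear in the response from requests at all.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for illustration only.
# The real search-results page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr><td class="name">Lava Axe</td><td class="price mod">0.05</td></tr>
  <tr><td class="name">Lava Axe (foil)</td><td class="price mod"> 0.25 </td></tr>
</table>
"""


def extract_prices(html):
    """Return the stripped text of every <td class="price mod"> cell."""
    # Naming the parser explicitly keeps results consistent across machines.
    soup = BeautifulSoup(html, 'html.parser')
    cells = soup.find_all('td', {'class': 'price mod'})
    # get_text() returns the text between the tags. By contrast,
    # cell.get('td') would look up an attribute called "td" and yield None.
    return [cell.get_text(strip=True) for cell in cells]


prices = extract_prices(SAMPLE_HTML)
print(prices)
```

To try this against the live site, pass `requests.get(url).text` to `extract_prices` instead of `SAMPLE_HTML`. Note also that in the question's code `page += 1` sits inside the `for` loop, so when no cells match, `page` never advances and the `while` loop does not terminate.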

Answer 1 posted 2015-04-20 04:04:06 (score: 0)