What does this function do in Python involving urllib2 and BeautifulSoup?

Question

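I asked a question earlier about retrieving high scores from an HTML page, and another user gave me the following code to help. I am new to Python and BeautifulSoup, so I am trying to work through other people's code step by step. I understand most of it, but I don't understand what this piece of code is or what it does: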

    def parse_string(el):
       text = ''.join(el.findAll(text=True))
       return text.strip()

Here is the full code:

    from urllib2 import urlopen
    from BeautifulSoup import BeautifulSoup
    import sys

    URL = "http://hiscore.runescape.com/hiscorepersonal.ws?user1=" + sys.argv[1]

    # Grab page html, create BeautifulSoup object
    html = urlopen(URL).read()
    soup = BeautifulSoup(html)

    # Grab the <table id="mini_player"> element
    scores = soup.find('table', {'id':'mini_player'})

    # Get a list of all the <tr>s in the table, skip the header row
    rows = scores.findAll('tr')[1:]

    # Helper function to return concatenation of all character data in an element
    def parse_string(el):
       text = ''.join(el.findAll(text=True))
       return text.strip()

    for row in rows:

       # Get all the text from the <td>s
       data = map(parse_string, row.findAll('td'))

       # Skip the first td, which is an image
       data = data[1:]

       # Do something with the data...
       print data

Answer 1

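el.findAll(text=True) returns all the text contained within an element and its sub-elements. By text I mean everything that is not inside a tag; so in <b>hello</b>, "hello" is the text, while <b> and </b> are not.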
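To make the distinction between text and tags concrete, here is a minimal sketch that uses only the standard library's html.parser rather than BeautifulSoup itself. It is written for Python 3, and the names TextCollector and text_nodes are made up for this illustration. It mimics the idea behind findAll(text=True) by keeping the character data and ignoring the tags; it is not how BeautifulSoup is implemented.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: keeps character data, ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text that sits between tags.
        self.chunks.append(data)


def text_nodes(html):
    """Return the pieces of text found in an HTML fragment, in order."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return collector.chunks


nodes = text_nodes("<td><b>hello</b> world</td>")
print(nodes)  # ['hello', ' world']
```

The <td> and <b> tags never show up in the result; only the two runs of text do, including the text nested inside the sub-element.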

So the function joins together all the text found under the given element and strips the whitespace from the front and back.
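The join-then-strip step can be tried on its own, without fetching a page. In the sketch below the list cell_chunks is invented sample data standing in for what findAll(text=True) might return for one table cell, and join_and_strip is a hypothetical name for the same two lines as parse_string, minus the BeautifulSoup call.

```python
def join_and_strip(chunks):
    # Concatenate every piece of text first...
    text = ''.join(chunks)
    # ...then trim whitespace from the two ends only.
    return text.strip()


# Invented sample data: text pieces as they might come out of one <td>.
cell_chunks = ["\n  ", "Attack", " ", "99", "\n"]
print(repr(join_and_strip(cell_chunks)))  # 'Attack 99'
```

Because strip() runs after the join, whitespace between the pieces is kept and only the leading and trailing whitespace of the whole cell is removed.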

The findAll method is described in more detail in the BeautifulSoup documentation.

Solution 1
3 Accepted 2009-06-14 02:13:37
