
AttributeError: 'list' object has no attribute 'replace' when changing an item in a list

I don't have much experience with Python and I'm having a problem building a web crawler. When I try to change an item in an array, I get this error:

AttributeError: 'list' object has no attribute 'replace'

import urllib
import mechanize
import requests
import re
from bs4 import BeautifulSoup

def getGoogle(link):
        br = mechanize.Browser()
        br.set_handle_robots(False)
        br.addheaders=[('User-agent','chrome')]

        term = link.replace(" ", "+")
        query = "http://www.google.com/search?q="+term
        htmltext = br.open(query).read()
        soup2 = BeautifulSoup(htmltext, "html.parser")
        search = soup2.findAll('div', attrs={'id':'search'})
        searchtext = str(search[:])
        soup3 = BeautifulSoup(searchtext, "html.parser")
        list_items = soup3.findAll('li', attrs={'class', 'g'})
        for li in list_items:
                for h3 in li.findAll('h3'):
                        for a in h3.findAll('a'):
                                regex = "q(?!.*q).*?amp"
                                pattern = re.compile(regex)
                                source_url = re.findall(pattern, str(a))
                                results_array = [source_url]

                                for result in results_array:
                                        result.append(str(source_url[:].replace("q=","").replace("&amp", "")))

                                        for url in results_array:
                                                r = requests.get(url)

                                                soup = BeautifulSoup(r.content, "html.parser")

                                                g_data = soup.find_all(attrs={"name":"viewport"})

                                                for item in g_data:
                                                        return r.url, 'bevat een viewport'
                                                        break
                                                else:
                                                        return r.url, 'bevat GEEN viewport'

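re.findall returns a list of strings, so source_url[:] is still a list, and lists have no replace method. Join the list into a string first, then call replace on it. You can do it like this: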

result.append(''.join(source_url).replace("q=","").replace("&amp", ""))
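To see why the original line fails and what the one-liner above does, here is a minimal sketch that runs without any network access. The anchor string is a made-up stand-in for what str(a) might look like in the question, not real Google output; the regex is the one from the question.

```python
import re

# Hypothetical anchor markup (an assumption, standing in for str(a)).
anchor = '<a href="/url?q=http://www.python.org/&amp;sa=U">Python</a>'

pattern = re.compile(r"q(?!.*q).*?amp")
source_url = pattern.findall(anchor)  # findall always returns a list

# Calling a str method on the list reproduces the error from the question.
try:
    source_url.replace("q=", "")
except AttributeError as exc:
    error_message = str(exc)

# Fix 1 (as in the answer): join the list into one string, then clean it.
joined = "".join(source_url).replace("q=", "").replace("&amp", "")

# Fix 2: clean each match separately, which keeps multiple matches apart.
cleaned = [u.replace("q=", "").replace("&amp", "") for u in source_url]

print(error_message)
print(joined)
print(cleaned)
```

Joining is fine when the regex yields a single match, as it does here. If findall could return several matches, the per-element version is probably safer, because join would glue the URLs together into one string.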

