Why won't listfunction iterate through urls?

I am trying to write a program that opens a URL, finds a name on a certain line, and saves it. Then it should find the URL on the same line as the name, open it, and find the name and URL at the same line number as on the previous page. It should do this 4 times.

I can't get it to iterate with the new URL as its parameter. It keeps returning the same name and URL. What is going wrong here? Thanks.

from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
import ssl
linklist = list()
namelist = list()
linelist = list()
count = 0
listposition = int(input("Please enter list position: "))
goodnamelist = list(["Fikret"])
nexturl = "http://py4e-data.dr-chuck.net/known_by_Fikret.html"
def listfunction(url):
    ctx = ssl.create_default_context()
    #Allows reading of HTTPS pages
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    html = urlopen(url, context=ctx).read()
    soup = BeautifulSoup(html, "html.parser")
    linelist = soup('a')
    for line in linelist:
        #Creates list of lines in webpage:
        linklist.append(re.findall("(http://.+)\"", str(line)))
        #Creates list of names in line:
        namelist.append(re.findall(">(.+)</a>", str(line)))
    #Creates list of names in the designated user-input position:
    goodnamelist.append(namelist[listposition][0])
    nexturl = linklist[listposition][0]
    return nexturl
while (count < 4):
    nexturl = listfunction(nexturl)
    print(listfunction(nexturl))
    count += 1
    print(nexturl)
    continue
print(linelist)
print(linklist)
print(namelist)
print(nexturl)
print(goodnamelist)
print(listfunction(nexturl))

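You do not actually set the global nexturl in listfunction(). The assignment inside the function creates a local variable, and its value always comes from linklist[listposition][0]. Because linklist and namelist are module-level lists that are never cleared, each call appends the new page's links to the end while index listposition still points at the entry from the first page. Therefore the function returns the same name and URL every time.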
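One possible fix is to build the list of links fresh for every page and pass each returned URL into the next call. The sketch below is not the asker's code: it uses the standard library's html.parser in place of BeautifulSoup so that it runs without third-party packages. The names AnchorCollector, pick_link, fetch and follow_links are hypothetical, and it leaves out the SSL context from the question.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class AnchorCollector(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> tag in one page."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # finished (href, text) pairs for this page only
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def pick_link(html, position):
    """Return (name, url) of the anchor at `position` on this page only."""
    parser = AnchorCollector()  # new parser per page, so no stale entries
    parser.feed(html)
    url, name = parser.anchors[position]
    return name, url


def fetch(url):
    """Download a page and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def follow_links(start_url, position, hops, fetch_page=fetch):
    """Follow the link at `position` `hops` times; return names and last URL."""
    names = []
    url = start_url
    for _ in range(hops):
        # Each iteration parses the page just fetched and moves to its link.
        name, url = pick_link(fetch_page(url), position)
        names.append(name)
    return names, url
```

With the values from the question, this would be called as follow_links("http://py4e-data.dr-chuck.net/known_by_Fikret.html", listposition, 4). The fetch_page parameter exists only so the loop can be tried against canned HTML without network access.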
