Python recursion error with no recursion?

I'm new to Python, and I'm building a web crawler for funsies/educational purposes. I don't use any recursive functions, but I still get the 'RuntimeError: maximum recursion depth exceeded' error. I'm really confused, and kinda feel like I'm missing something obvious or just misunderstanding something. Am I somehow recursing, or could it be related to my large loops? The idea is to crawl the web until you've crawled 10k pages.

UPDATES:

Latest code is here: http://pastebin.com/4v5GT7ft

Stack trace is here: http://pastebin.com/9GzAxZM9

Looks like my issue is trying to call str() on a URL that is not encoded properly. I've tried decoding the URLs and then encoding them to unicode, but I was never able to do it successfully. Any advice would be greatly appreciated!

The code you gave us doesn't actually run (it's missing all the import statements, it has indentation errors, and so on), it requires a JET database we don't have plus a third-party module to read it, and it's hardcoded to use pre-existing directories in your home directory.

I've attempted to fix all of that at http://pastebin.com/rCJriEu5 (which requires lxml and bs4; if you were using a different parsing library or BS3, I can try it that way).

And when I run it, it seems to work. It's 31.73% complete, with no errors yet. Even if I do a sys.setrecursionlimit(50) at the start of the file, it still seems to work (3.67% complete so far).
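To see why a low recursion limit is a useful test here, this is a minimal sketch, not the crawler itself. The names `current_depth`, `run_with_headroom`, `count_down` and `iterative_crawl_stub` are made up for illustration. It lowers the limit relative to the current stack depth (a fixed value like 50 can be rejected if the caller is already deeper than that), then runs a plain loop and a genuinely recursive function under it. The loop finishes because iterations don't add stack frames; only nested calls do.

```python
import sys


def current_depth():
    """Count the Python frames currently on the call stack.

    Uses sys._getframe, which is a CPython implementation detail.
    """
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def run_with_headroom(func, headroom=50):
    """Call func() with only about `headroom` frames of stack to spare."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(current_depth() + headroom)
    try:
        return func()
    finally:
        # Always restore the limit, even if func() raised.
        sys.setrecursionlimit(old_limit)


def count_down(n):
    """Deliberately recursive: uses one stack frame per step."""
    return 0 if n == 0 else 1 + count_down(n - 1)


def iterative_crawl_stub(pages=10000):
    """Stand-in for a crawl loop: 10k iterations, constant stack depth."""
    visited = 0
    for _ in range(pages):
        visited += 1
    return visited


if __name__ == "__main__":
    print(run_with_headroom(iterative_crawl_stub))
    try:
        run_with_headroom(lambda: count_down(500))
    except RuntimeError as exc:  # RecursionError subclasses RuntimeError in Python 3
        print("recursion error:", exc)
```

If a loop-only program still hits the limit, the deep call chain is almost certainly inside something the loop calls, and the innermost frames of the stack trace will show where.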

So, whatever is wrong in your code is apparently in code you haven't shown us.
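On the str() problem mentioned in the question's update: one common approach is to decode any raw bytes to text first and then percent-encode whatever is not ASCII, so later code only handles plain ASCII URLs. Here is a rough sketch assuming Python 3 (the question looks like Python 2, where the details differ). `to_ascii_url` is a hypothetical helper, and treating the bytes as UTF-8 is an assumption, since a real page may declare another encoding.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path, query and fragment.

    The host part is left untouched; internationalized host names would
    need separate handling that this sketch does not attempt.
    """
    if isinstance(url, bytes):
        # Assumption: bytes are UTF-8; undecodable bytes become U+FFFD.
        url = url.decode("utf-8", errors="replace")
    parts = urlsplit(url)
    # '%' is kept safe so already-encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))
```

For example, `to_ascii_url("http://example.com/caf\u00e9")` yields a URL whose path ends in `caf%C3%A9`, and a URL that is already percent-encoded passes through unchanged.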
