
Python Scrapy Segmentation Fault

I have a Django project that uses NLTK. I get a segmentation fault when trying to run:

import nltk

I found the exact line where it happens: collocations.py:38. I also found that I get this error only in Django unit tests and in the Django manage.py shell (actually shell_plus, but I don't think that changes anything), and not in custom management commands (which may mean I won't get this error in production).
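Since the crash depends on the environment, one way to check a given interpreter is to run the suspect statement in a child process and look at how it exited. This is only a sketch: `import_crashes` is a hypothetical helper name, and the signal check assumes a POSIX system, where a negative return code is the number of the signal that killed the child.

```python
import signal
import subprocess
import sys


def import_crashes(statement):
    """Run `statement` in a fresh interpreter and report whether it segfaulted.

    Hypothetical helper, POSIX only: a child killed by a signal yields a
    negative return code equal to minus the signal number.
    """
    return_code = subprocess.call([sys.executable, "-c", statement])
    return return_code == -signal.SIGSEGV


if __name__ == "__main__":
    # The statement is only an example; substitute whatever import sequence
    # is suspected of crashing in your environment.
    print(import_crashes("import nltk"))
```

A fresh interpreter does not load Django or anything else the test runner or shell imports first, so a `False` result here does not rule out a crash inside those environments.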

I use Django 1.4.5, NLTK 2.0.4, and Python 2.7.3.

Thanks!

UPDATE: Updating Python to 2.7.4 had no effect.

UPDATE: Updating numpy to 1.7.1 and scipy to 0.12.0 had no effect.

UPDATE: I found the statement that causes the segfault (I assume so because moving it from module scope to function scope made the segfault go away):

from scrapy.crawler import CrawlerProcess

It's a class from the Scrapy crawling framework, which I used in a custom module to run a Scrapy spider as a Python script. It seems the segfault is not NLTK's fault but Scrapy's. Probably this class somehow overwrote some of NLTK's data, which is why it segfaulted.
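The workaround from the last update, moving the import from module scope into function scope, could look roughly like the sketch below. The name `run_spider` and its parameters are hypothetical, and the `crawl()` and `start()` calls assume a recent Scrapy release (1.0 or later); the Scrapy version available when this question was asked exposed a different API.

```python
def run_spider(spider_cls, settings=None):
    """Run one spider as a script (hypothetical helper, assumes Scrapy >= 1.0)."""
    # Function-scope import: scrapy.crawler is loaded only when this function
    # is called, not when the module that defines it is imported.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings)
    process.crawl(spider_cls)
    process.start()  # blocks until the crawl has finished
```

With this layout, importing the custom module (for example from a Django test or shell session) no longer pulls in scrapy.crawler as a side effect. This only avoids the crash; it does not explain its cause.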

I had the same issue with CrawlerProcess. Scrapy had failed to install libxml2. Try entering this at the command prompt:

easy_install lxml
pip install scrapy --upgrade

My program executed correctly after this.
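To check whether the reinstall took effect, you can ask lxml which libxml2 it was built against and which one it loaded at runtime. A minimal sketch, with `libxml2_versions` as a hypothetical helper name; a mismatch between the two libxml2 versions is only a hint worth investigating, not proof of the cause.

```python
def libxml2_versions():
    """Return lxml and libxml2 version tuples, or None if lxml is not importable."""
    try:
        from lxml import etree
    except ImportError:
        # lxml is missing or its compiled extension could not be loaded.
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2_compiled": etree.LIBXML_COMPILED_VERSION,
        "libxml2_runtime": etree.LIBXML_VERSION,
    }


if __name__ == "__main__":
    print(libxml2_versions())
```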
