簡體   English   中英

如何修復“未找到模塊”和“ int對象沒有屬性”樣條線”?

[英]How to fix 'No module found' and 'int object has no attribute 'splitlines'?

我試圖在我的網站上運行Spider,並在桌面上運行scrapyrt監聽服務器。 它告訴我在運行Spider時找不到模塊“ webscrape”,並且還告訴我“ Int對​​象沒有splitlines屬性”。

https://github.com/scrapy/scrapyd/issues/311提供了scrapyd的解決方案。 https://github.com/scrapinghub/scrapyrt/pull/84似乎仍然是一個問題。

所以,我真的很茫然。

錯誤代碼:

2019-08-12 16:37:47-0700 [scrapyrt] Unhandled Error
        Traceback (most recent call last):
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 2196, in allContentReceived
            req.requestReceived(command, path, version)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 920, in requestReceived
            self.process()
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 199, in process
            self.render(resrc)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 259, in render
            body = resrc.render(self)
        --- <exception caught here> ---
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 26, in render
            result = resource.Resource.render(self, request)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\resource.py", line 250, in render
            return m(request)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 127, in render_GET
            return self.prepare_crawl(api_params, scrapy_request_args, **kwargs)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 217, in prepare_crawl
            start_requests=start_requests, *args, **kwargs)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 226, in run_crawl
            dfd = manager.crawl(*args, **kwargs)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\core.py", line 157, in crawl
            self.get_project_settings(), self)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\core.py", line 178, in get_project_settings
            return get_project_settings(custom_settings=custom_settings)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\conf\spider_settings.py", line 27, in get_project_settings
            crawler_settings.setmodule(module, priority='project')
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapy\settings\__init__.py", line 288, in setmodule
            module = import_module(module)
          File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\importlib\__init__.py", line 127, in import_module
            return _bootstrap._gcd_import(name[level:], package, level)
          File "<frozen importlib._bootstrap>", line 1006, in _gcd_import

          File "<frozen importlib._bootstrap>", line 983, in _find_and_load

          File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked

          File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed

          File "<frozen importlib._bootstrap>", line 1006, in _gcd_import

          File "<frozen importlib._bootstrap>", line 983, in _find_and_load

          File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked

        builtins.ModuleNotFoundError: No module named 'webscrape'

Unhandled Error
Traceback (most recent call last):
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 2196, in allContentReceived
    req.requestReceived(command, path, version)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 920, in requestReceived
    self.process()
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 199, in process
    self.render(resrc)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 259, in render
    body = resrc.render(self)
--- <exception caught here> ---
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 26, in render
    result = resource.Resource.render(self, request)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\resource.py", line 250, in render
    return m(request)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 127, in render_GET
    return self.prepare_crawl(api_params, scrapy_request_args, **kwargs)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 217, in prepare_crawl
    start_requests=start_requests, *args, **kwargs)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 226, in run_crawl
    dfd = manager.crawl(*args, **kwargs)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\core.py", line 157, in crawl
    self.get_project_settings(), self)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\core.py", line 178, in get_project_settings
    return get_project_settings(custom_settings=custom_settings)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\conf\spider_settings.py", line 27, in get_project_settings
    crawler_settings.setmodule(module, priority='project')
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapy\settings\__init__.py", line 288, in setmodule
    module = import_module(module)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import

  File "<frozen importlib._bootstrap>", line 983, in _find_and_load

  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked

  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed

  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import

  File "<frozen importlib._bootstrap>", line 983, in _find_and_load

  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked

builtins.ModuleNotFoundError: No module named 'webscrape'

2019-08-12 16:37:47-0700 [-] Unhandled Error
        Traceback (most recent call last):
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\protocols\basic.py", line 572, in dataReceived
            why = self.lineReceived(line)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 2105, in lineReceived
            self.allContentReceived()
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 2196, in allContentReceived
            req.requestReceived(command, path, version)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 920, in requestReceived
            self.process()
        --- <exception caught here> ---
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 199, in process
            self.render(resrc)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 259, in render
            body = resrc.render(self)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 31, in render
            return self.render_object(result, request)
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 95, in render_object
            request.setHeader('Content-Length', len(r))
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 1271, in setHeader
            self.responseHeaders.setRawHeaders(name, [value])
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 220, in setRawHeaders
            for v in self._encodeValues(values)]
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 220, in <listcomp>
            for v in self._encodeValues(values)]
          File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 40, in _sanitizeLinearWhitespace
            return b' '.join(headerComponent.splitlines())
        builtins.AttributeError: 'int' object has no attribute 'splitlines'


Traceback (most recent call last):
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 199, in process
    self.render(resrc)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\server.py", line 259, in render
    body = resrc.render(self)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 31, in render
    return self.render_object(result, request)
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\scrapyrt\resources.py", line 95, in render_object
    request.setHeader('Content-Length', len(r))
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http.py", line 1271, in setHeader
    self.responseHeaders.setRawHeaders(name, [value])
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 220, in setRawHeaders
    for v in self._encodeValues(values)]
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 220, in <listcomp>
    for v in self._encodeValues(values)]
  File "c:\users\user\microblog\job-visualizer\venv\lib\site-packages\twisted\web\http_headers.py", line 40, in _sanitizeLinearWhitespace
    return b' '.join(headerComponent.splitlines())
AttributeError: 'int' object has no attribute 'splitlines'

項目布局:

-Job-Visualizer

 -app

  -webscrape(scrapyrt ran from here in venv)

   -spiders

運行Spider時,Spider代碼應按預期返回結果。

編輯:蜘蛛代碼:

import scrapy
from scrapy_splash import SplashRequest

class IndeedSpider(scrapy.Spider):
    name = 'indeedspider'
    allowed_domains = ['https://www.indeed.com']

    def __init__(self):
        super().__init__()
        print('Spider being ran...')
        self.start_url = 'https://www.indeed.com/jobs?q=financial+aid+advisor&l=Highland%2C+CA'
        self.links = []


    def modify_realtime_request(self, request):
        return SplashRequest(url, self.parse, args=splash_args, endpoint='render.html')

    def start_requests(self):
        print(self.start_url)
        urls = [
            self.start_url
        ]
        splash_args = {
            'html': 1,
            'png': 1,
            'width': 800,
            'render_all': 1,
        }
        for url in urls:
            yield SplashRequest(url, self.parse, endpoint='render.json', args=splash_args)

    def parse(self, response):
        html = response.body
        title = response.css('title').extract()
        titles = response.xpath("//div[@class= 'title']/a/text()").getall()
        locations = response.xpath("//div[@class= 'sjcl']/span/text()").getall()
        companies = response.css("div.sjcl.span.company a::text").getall()
        summarys = response.xpath("//div[@class= 'summary']/text()").getall()

路線部分代碼:

 params = {
                'spider_name': 'indeed_scraper',
                'start_requests': True
            }
            response = requests.get('http://localhost:9080/crawl.json', params)
            data = json.loads(response.text)
            print(data)

解決方案:創建scrapy項目時,請確保scrapy.cfg位於SCRAPY項目文件夾之外。

不正確:

-app
 - webscrape
  - scrapy.cfg
  - __init__.py
  - items.py
  - middleware.py
   - spiders
    - spider.py

正確:

-app
- scrapy.cfg
 - webscrape
  - __init__.py
  - items.py
  - middleware.py
   - spiders
    - spider.py

正確結果:

{"status": "ok", "items": [], "spider_name": "indeedspider"}

您是否已導入webscrape模塊? 另外,您使用的對象類型錯誤,因此沒有分割線屬性。 如果打印對象類型,它是否顯示為int? Splitlines方法僅適用於字符串,因此您需要確保調用它的對象是字符串,而不是int數據類型。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM