ModuleNotFoundError: can't find spider


I'm hitting a ModuleNotFoundError with a web scraper I put together from a project tutorial. The crawler ran fine, but once I added code to extract the data and store it in a MongoDB database, launching the crawl started failing with a ModuleNotFoundError — apparently the module for my spider (name = 'stackspider') can't be found. Since the spider worked before, I'm not sure where the gap is.

I don't want to dump too much code into a first question. How do I resolve this ModuleNotFoundError?

Happy to provide more details or code if needed. Thanks for your help.

[Here is a link to the tutorial I used to create the scraper project](https://realpython.com/web-scraping-with-scrapy-and-mongodb/#scrapy-project)
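For context, the MongoDB storage in that setup goes through a Scrapy item pipeline registered in settings.py. A minimal sketch of that pattern, assuming pymongo is installed — MongoPipeline, MONGO_URI, and MONGO_DATABASE are illustrative names, not my actual code:

```python
# pipelines.py -- minimal MongoDB item-pipeline sketch; MongoPipeline,
# MONGO_URI, and MONGO_DATABASE are illustrative names only.
import pymongo


class MongoPipeline:
    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        # Read connection details from the project settings.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            mongo_db=crawler.settings.get("MONGO_DATABASE", "items"),
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # One document per scraped item, in a collection named after the spider.
        self.db[spider.name].insert_one(dict(item))
        return item
```

The error below appeared after wiring in that kind of pipeline.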

```
2020-04-14 08:23:28 [twisted] CRITICAL:
Traceback (most recent call last):
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/crawler.py", line 89, in crawl
    self.engine = self._create_engine()
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/crawler.py", line 103, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/core/engine.py", line 70, in __init__
    self.scraper = Scraper(crawler)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/core/scraper.py", line 71, in __init__
    self.itemproc = itemproc_cls.from_crawler(crawler)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/site-packages/scrapy/utils/misc.py", line 50, in load_object
    mod = import_module(module)
  File "/Users/blouie/.conda/envs/GoScrape/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'stackspider'
```
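Reading the tail of the traceback: Scrapy's middleware loader calls load_object() on each dotted path taken from the settings (here while building the item-pipeline manager), and that call bottoms out in importlib. Reduced to the failing step — the concrete dotted path below is an assumed example, not taken from the question:

```python
# A reduced sketch of what scrapy.utils.misc.load_object does with a
# dotted path from the settings; the path below is a hypothetical
# example of a settings entry, not the question's actual code.
from importlib import import_module

path = "stackspider.pipelines.MongoPipeline"
module_path, _, name = path.rpartition(".")
mod = import_module(module_path)  # raises ModuleNotFoundError: No module named 'stackspider'
cls = getattr(mod, name)
```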

My spider code (stackspider.py):

```python
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from scrapy import Spider
from scrapy.selector import Selector
from bigscrape.items import BigscrapeItem


class StackspiderSpider(Spider):
    name = 'stackspider'
    allowed_domains = ['stackoverflow.com']
    start_urls = ["http://stackoverflow.com/questions?pagesize=50&sort=newest"]

    # Note: Rule/LinkExtractor only take effect on CrawlSpider subclasses,
    # so this tuple is inert on a plain Spider.
    rules = (
        Rule(LinkExtractor(allow=r'Items/'), callback='parse_item', follow=True),
    )

    def parse(self, response):
        questions = Selector(response).xpath('//div[@class="summary"]/h3')
        for question in questions:
            item = BigscrapeItem()
            item['title'] = question.xpath(
                'a[@class="question-hyperlink"]/text()').get()
            item['url'] = question.xpath(
                'a[@class="question-hyperlink"]/@href').get()
            yield item
```
Tags: python, mongodb, module, scrapy
1 Answer

If you are running this as a Scrapy project, I think you have to add a scrapy.cfg file that holds this configuration: a [settings] section whose default entry points at your project's settings module. That file is the entry point of the scrapy crawl process.
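Concretely, the `from bigscrape.items import BigscrapeItem` line in the spider suggests the project package is bigscrape, so a minimal scrapy.cfg at the project root — a sketch under that assumption — would be:

```
# scrapy.cfg -- lives at the project root, next to the bigscrape/ package.
# 'bigscrape' is inferred from the spider's import; adjust if your
# package is named differently.
[settings]
default = bigscrape.settings
```

scrapy crawl reads this file to locate the settings module, so run the command from the directory that contains it.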

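A further thing worth checking, since the traceback fails while the item-pipeline manager is loading its classes: every dotted path in ITEM_PIPELINES must start with an importable package — the project package (bigscrape, by the reasoning above), not the spider name stackspider. A sketch of the relevant settings.py lines, with MongoPipeline and the MONGO_* setting names as assumed examples:

```python
# settings.py -- illustrative sketch; MongoPipeline, MONGO_URI, and
# MONGO_DATABASE are assumed names, not taken from the question.
ITEM_PIPELINES = {
    "bigscrape.pipelines.MongoPipeline": 300,  # not "stackspider.pipelines..."
}
MONGO_URI = "mongodb://localhost:27017"
MONGO_DATABASE = "stackoverflow"
```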