I created a custom Scrapy extension and added it to the project's settings.py.
The extension has to fetch a URL from RabbitMQ via pika when the spider opens (i.e. when the spider starts crawling) and then make a request for that URL with the Scrapy spider. I can get the URL through pika, but how do I turn it into a request for the spider? My extension:
import pika
from scrapy import signals


class AddRequestExample:
    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        s = cls(crawler.stats)  # __init__ expects the stats object, not the crawler
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    def spider_opened(self, spider):  # instance method: first parameter is self
        connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
        channel = connection.channel()
        # basic_get returns a (method, properties, body) tuple; index 2 is the body
        url = channel.basic_get(queue='hello')[2]
        # Here I have to make a request with the url above
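One possible approach (a sketch only, not tested against a live broker) is to schedule the request through the running crawler engine from inside the signal handler. The queue name `hello` and the localhost broker are taken from the question; note that the engine API has changed across Scrapy releases, so the exact `engine.crawl` signature depends on your version:

```python
import pika
from scrapy import Request


def spider_opened(self, spider):
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()
    # basic_get returns (method, properties, body); body is None if the queue is empty
    method, properties, body = channel.basic_get(queue='hello')
    connection.close()
    if body is None:
        return
    url = body.decode('utf-8')  # RabbitMQ delivers the message body as bytes
    # Hand the request to the engine so it is scheduled alongside the
    # spider's own requests. In older Scrapy versions the call takes the
    # spider as a second argument: engine.crawl(request, spider); in
    # recent releases it is engine.crawl(request).
    spider.crawler.engine.crawl(Request(url, callback=spider.parse))
```

An alternative worth considering is overriding the spider's `start_requests` and consuming the queue there, which avoids reaching into the engine at all; the extension approach above only makes sense if the URL must arrive via the `spider_opened` signal.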