Can't figure out why my Scrapy script isn't working

Question

import scrapy

class TestSpider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://go.twitch.tv/directory']

    def parse(self, response):
        # Collect the text of every game title card found in the page body.
        for title in response.css('body'):
            yield {'title': title.css('h3.tw-box-art-card__title::text').extract()}

        # Follow every link on the page and parse it the same way.
        for next_page in response.css('a::attr(href)'):
            yield response.follow(next_page, self.parse)

It just crawls and scrapes https://go.twitch.tv/directory but doesn't output any titles.

I'm new to Python, so the problem may be very obvious, but I can't figure it out.

Answer 1

As @Shahin mentioned, the page is generated dynamically, so you can't parse it without something like Selenium or Splash. Read this.

There is also another approach: you can look through the requests the page makes and find the one that returns the data you need.

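For example, when the page loads, or when you scroll to the bottom, a data request is sent to https://gql.twitch.tv/gql.

This request returns JSON with the descriptions of the games in the directory. So I think you just need to work out how the request data is built, send the request to gql.twitch.tv/gql instead of twitch.tv/directory, and parse the response as JSON.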

How to make a request with a body is described here (there is a body parameter).
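To make that idea concrete, here is a minimal sketch of the two pieces involved: building a POST request with a JSON body, and pulling values out of the JSON that comes back. The payload and the sample response are hypothetical placeholders, not Twitch's real format; the real ones have to be copied from the browser's network tab, possibly along with some of the headers the browser sends.

```python
import json

GQL_URL = "https://gql.twitch.tv/gql"  # endpoint named in the answer above


def build_gql_request(payload):
    """Return keyword arguments for a POST request that carries a JSON body."""
    return {
        "url": GQL_URL,
        "method": "POST",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def find_values(node, key):
    """Recursively yield every value stored under `key` in decoded JSON."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


# Hypothetical payload: the real field names must be taken from the network tab.
payload = {"operationName": "HypotheticalDirectoryQuery", "variables": {"limit": 30}}
request_kwargs = build_gql_request(payload)

# Hypothetical response shape, used only to demonstrate the parsing step.
sample_response = '{"data": {"games": [{"title": "Game A"}, {"title": "Game B"}]}}'
titles = list(find_values(json.loads(sample_response), "title"))
print(titles)
```

In a spider, these keyword arguments could be passed on as `scrapy.Request(**request_kwargs, callback=self.parse_api)`, where `parse_api` is a hypothetical callback that decodes `response.text` with `json.loads` and yields the extracted titles.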
