Unable to retrieve data using multiple class names in Scrapy with Python

Question

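I need to extract data from HTML, but whenever I try to get the "regular price" value, response.css, response.xpath, and combinations of the two all fail; the result is always None.

I need to get the text value $17.99.

Here is my code: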

HTML

<div class="price parbase"><div class="primary-row product-item-price product-item-price-discount"> <span class="price-value">$12.99</span><small class="js-price-value-original price-value-original">$17.99</small> </div> </div>

Scrapy python

def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.css('.js-price-value-original').xpath("@small").extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first()
    }
    yield item

The output should include the regular price: $17.99

Please help, thank you!

Answer 1

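Your link gives me a 404, but for your HTML snippet all you need is response.css('small.js-price-value-original::text').get(). small is the tag name, not an attribute, so xpath("@small") matches nothing.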

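UPD: Hmm, it looks like this data is rendered by JS. Check the page's HTML source and you will see a huge JSON blob; search it for the whitePrice keyword. You can retrieve this data with response.xpath('//script[contains(text(), "whitePrice")]/text()').re_first(r"'whitePrice'\s?:\s?'([^']+)'")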
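As a rough sketch of what that regular expression does, the snippet below applies the same pattern with the standard re module. The script text is a made-up stand-in for the page's embedded data (the real page was not available), and extract_white_price is a hypothetical helper name.

```python
import re

# Hypothetical script content; the real page's structure is an assumption.
SCRIPT_TEXT = "var productData = {'name': 'Example item', 'whitePrice': '$17.99'};"

# Same pattern as in the answer, written as a raw string so that \s is not
# treated as a string escape sequence.
WHITE_PRICE_RE = re.compile(r"'whitePrice'\s?:\s?'([^']+)'")


def extract_white_price(text):
    """Return the first captured whitePrice value, or None if there is no match."""
    match = WHITE_PRICE_RE.search(text)
    return match.group(1) if match else None


print(extract_white_price(SCRIPT_TEXT))
```

Like re_first, the helper returns None when the keyword is absent, so the caller can tell a missing price apart from an empty one.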

Answer 2

If this snippet is the only HTML you have, you can do this:

def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.xpath('//div/small[contains(@class, "js-price-value-original") and contains(@class, "price-value-original")]/text()').extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first()
    }
    yield item

By the way, the site you provided shows "file not found".

Answer 3

Thanks @vezunchik. If you want to use a CSS selector instead, you can use the following code:

response.css('script:contains("whitePrice")').re_first(r"'whitePrice'\s?:\s?'([^']+)'")
