Writing a spider with Scrapy: why doesn't 'yield item' work inside a for loop?


I have a spider written with Scrapy, but the yield of the item inside the for loop is never executed. See the code below.

def parse_paragraph(self, div_list, category_name, group_name):
    for div in div_list:
        duilian_text_list = div.xpath('./text()').extract()
        duilian_text_list = strip_list(duilian_text_list)
        if len(duilian_text_list) == 0:
            continue
        elif len(duilian_text_list) == 1:
            duilian_text = duilian_text_list[0]
            self.parse_duilian(duilian_text, category_name, group_name)
        elif len(duilian_text_list) == 2 and not is_single_line(duilian_text_list[0]):
            duilian_text = ''.join(duilian_text_list)
            self.parse_duilian(duilian_text, category_name, group_name)
        else:
            for duilian_text in duilian_text_list:
                duilian_item = DuilianItem()
                duilian_item['id'] = str(uuid.uuid4()).replace('-', '')
                duilian_item['category_id'] = getCategoryName(category_name)
                duilian_item['group_name'] = group_name
                duilian = parse_duilian(duilian_text)
                if duilian != '|':
                    duilian_item['name'] = duilian
                    duilian_item['desc'] = ''
                    duilian_item['author'] = ''
                    duilian_item['shuti'] = ''
                    duilian_item['word_count'] = len(duilian_item['name']) // 2
                    duilian_item['image_url'] = ''
                    print('-------I am here--------')
                    yield duilian_item

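When I call this function, nothing appears in the output window. It looks like the yield duilian_item line does not work, and it even prevents other code from executing (the print line right above it).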

When I comment out the last line, yield duilian_item, everything works and -------I am here-------- appears in the output window. What is going on?

To put it more simply: the following code prints nothing, but if I comment out yield 1, it prints the list. So does yield in Python not work inside a for loop?

def strange_yield():
    list = [1, 2, 3]
    for i in list:
        print(i)
        yield 1

strange_yield()
python scrapy yield
1 Answer

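When you use yield in a Python function, that function becomes a generator function. Calling it does not run the body; it only returns a generator object. So the right way to use your strange_yield function is to keep the object it returns: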

my_yield = strange_yield()

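my_yield is now a generator object created by the generator function strange_yield. The body runs only while the generator is being iterated: you can pull one value at a time with next(), or consume the remaining values with a for loop: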

print(next(my_yield))

for yield_value in my_yield:
    print(yield_value)
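
The same idea can be shown end to end with a minimal sketch. It assumes the situation in the question is that the generator is called but its result is discarded. The helper names outer_discards and outer_delegates are hypothetical, introduced only for illustration. A Scrapy callback has to yield its items for the engine to receive them, so a helper that is itself a generator is probably best consumed with yield from.

```python
def strange_yield():
    """Same shape as the function in the question: print, then yield."""
    for i in [1, 2, 3]:
        print(i)
        yield 1


def outer_discards():
    # Hypothetical caller: creates the generator object and throws it away,
    # so the body of strange_yield never starts and nothing is printed.
    strange_yield()
    return "done"


def outer_delegates():
    # Hypothetical caller: re-yields every value from the inner generator.
    # A spider callback that uses a generator helper would do the same.
    yield from strange_yield()


gen = strange_yield()   # nothing is printed yet; the body has not started
first = next(gen)       # runs the body up to the first yield
rest = list(gen)        # runs the remaining two loop iterations
delegated = list(outer_delegates())  # drives the inner generator to the end
```

If parse_paragraph from the question is called as a plain statement, the way outer_discards calls strange_yield, that would explain why even the print line above the yield never runs.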