AttributeError: 'NoneType' object has no attribute 'get_text' (Python 3.x)

Question

I have been struggling with this code:

import requests
from bs4 import BeautifulSoup as bs


def MainPageSpider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'url' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = bs(plain_text, 'html.parser')
        for link in soup.findAll(attrs={'class':'col4'}):
            href = 'url' + link.a['href']
            title = link.span.text

            PostPageItems(href)
        page += 1


def PostPageItems(post_url):
    source_code = requests.get(post_url)
    plain_text = source_code.text
    soup = bs(plain_text, 'html.parser')
    for items in soup.findAll(attrs={'class':'container'}):
        title2 = items.find('h1', {'class':'title'}).get_text()

        print(title2)




MainPageSpider(1)

Whenever I try to get the text from the 'h1', I get this error:

Traceback (most recent call last):
  File "Xfeed.py", line 33, in <module>
    MainPageSpider(1)
  File "Xfeed.py", line 17, in MainPageSpider
    PostPageItems(href)
  File "Xfeed.py", line 27, in PostPageItems
    test = title2.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'

But when I run it without 'get_text()', I get the 'h1' HTML:

<h1 class="title">Title 1</h1>
None
None
None
None
<h1 class="title">Title 2</h1>
None
None
None
None
<h1 class="title">Title 3</h1>
None
None
None
None

I really don't understand why this error occurs, because with title = link.span.text I have no problem getting the text. I only want the text.

Answer 1

Not every container has an h1, so find() sometimes returns None. Check for None first, and print only when an element was actually found.

for items in soup.findAll(attrs={'class':'container'}):
    title2 = items.find('h1', {'class':'title'})
    if title2:
        print(title2.text)
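
To make the check above concrete, here is a minimal, self-contained sketch. The HTML string is made up for illustration (it is not the asker's page); it only assumes that several elements share the class container while just one of them contains an h1 with class title, which would match the mix of headings and None values in the question's output.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: three elements share the class "container",
# but only the first one contains an <h1 class="title">.
html = """
<div class="container"><h1 class="title">Title 1</h1></div>
<div class="container"><p>navigation</p></div>
<div class="container"><p>footer</p></div>
"""

soup = BeautifulSoup(html, 'html.parser')

titles = []
for item in soup.find_all(attrs={'class': 'container'}):
    heading = item.find('h1', {'class': 'title'})
    # find() returns None when nothing matches, so guard before using it.
    if heading is not None:
        titles.append(heading.get_text())

print(titles)
```

Without the guard, the second container would raise the same AttributeError as in the question, because get_text() would be called on None.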

Answer 2

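Judging from the output without get_text(), title2 is often None, and since None has no get_text() attribute, the call fails with the error you posted. You can split it into two statements and add a check like this: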

title2_item = items.find('h1', {'class':'title'})

if title2_item: # Check for None
    title2 = title2_item.get_text()
    print(title2)
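
If the same check is needed in several places, it could be wrapped in a small helper. This is only a sketch: text_or_default is a hypothetical name, not part of BeautifulSoup, and the HTML is invented for the example.

```python
from bs4 import BeautifulSoup


def text_or_default(parent, name, css_class, default=''):
    """Hypothetical helper: text of the first matching tag, or `default`."""
    found = parent.find(name, class_=css_class)
    # find() gives None when there is no match, so fall back to the default.
    return found.get_text(strip=True) if found is not None else default


# Made-up markup: the second container has no <h1 class="title">.
html = (
    '<div class="container"><h1 class="title"> Title 2 </h1></div>'
    '<div class="container"></div>'
)
soup = BeautifulSoup(html, 'html.parser')

results = [
    text_or_default(container, 'h1', 'title', default='(no title)')
    for container in soup.find_all(class_='container')
]
print(results)
```

Returning a default instead of skipping keeps one result per container, which may or may not be what you want for the scraper in the question.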

Answer 3

Rewrite it with a CSS selector that selects only the matching elements:

for item in soup.select('.container h1.title'):
    title2 = item.text
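
A small runnable sketch of the selector approach, again on invented HTML rather than the asker's page. Because select() returns only elements that match the whole selector, there is nothing to check for None inside the loop.

```python
from bs4 import BeautifulSoup

# Made-up markup covering the cases the selector should skip.
html = """
<div class="container"><h1 class="title">Title 1</h1></div>
<div class="container"><h1>Heading without the title class</h1></div>
<div class="sidebar"><h1 class="title">Not inside a container</h1></div>
<div class="container"><p>no heading at all</p></div>
"""

soup = BeautifulSoup(html, 'html.parser')

# '.container h1.title' = an <h1 class="title"> that is a descendant
# of any element with class "container".
selected = [h1.get_text() for h1 in soup.select('.container h1.title')]
print(selected)
```

If no element matches, select() returns an empty list, so the loop body simply never runs.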
