I'm trying to use Python's Wikipedia API to view the table of contents of a Wikipedia page. Here is my code:
>>> import wikipedia
>>> ny = wikipedia.page("New York")
>>> ny.sections
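But the result is an empty list:
[]
When I open the page and check, I can see that the table of contents has entries. Apart from this, everything else described in the documentation seems to work. I'm new to Python, coming from a Java background.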
The current version of the wikipedia Python library has a bug. You can work around it by installing the fork maintained by lucasdnd on GitHub:
pip install git+https://github.com/lucasdnd/Wikipedia.git
(If you already have the library installed, add --upgrade to the command.) Now:
>>> import wikipedia
>>> ny = wikipedia.page("New York")
>>> ny.sections
[u'History', u'16th century', u'17th century', u'18th century, the American Revolution, and statehood', u'19th century', u'Immigration', u'September 11, 2001 attacks', u'Hurricane Sandy, 2012', u'Geography', u'Climate', u'Statescape', u'Regions', u'Adjacent geographic entities', u'State parks', u'National parks', u'Administrative divisions', u'Demographics', u'Population', u'Most populous counties', u'Major cities', u'Metropolitan areas', u'Racial and ancestral makeup', u'Languages', u'Religion', u'LGBT', u'Economy', u'Wall Street', u'Silicon Alley', u'Microelectronic hardware and photographic processing', u'Media and entertainment', u'Tourism', u'Exports', u'Education', u'Transportation', u'Government and politics', u'Government', u'Capital punishment', u'Federal representation', u'Politics', u'Sports', u'See also', u'References', u'Further reading', u'External links']
Hopefully this will be fixed in the main library soon.
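If installing a fork is not an option, one possible workaround is to read the headings out of the page text yourself. The sketch below assumes that page.content marks headings in wiki style (== History ==, === 16th century ===), which is how the library's plain-text extracts usually look. The helper name section_titles is made up for this example and is not part of the wikipedia package.

```python
import re

# A wiki-style heading on a line of its own, e.g. "== History ==".
_HEADING = re.compile(r"^(={2,})\s*(.+?)\s*\1\s*$", re.MULTILINE)


def section_titles(content):
    """Return (level, title) pairs for wiki-style headings in plain text.

    Level 1 corresponds to "== Title ==", level 2 to "=== Title ===", etc.
    """
    return [(len(marks) - 1, title) for marks, title in _HEADING.findall(content)]


# Hypothetical usage with the real library (needs network access):
#   import wikipedia
#   ny = wikipedia.page("New York")
#   print(section_titles(ny.content))
sample = (
    "Intro text.\n\n\n"
    "== History ==\nText.\n\n\n"
    "=== 16th century ===\nMore text.\n"
)
print(section_titles(sample))
```

This only recovers what is in the extracted text, so it is a fallback rather than a replacement for a working sections attribute.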
Another option is the Wikipedia-API package.
import wikipediaapi
wiki = wikipediaapi.Wikipedia('en')
mutcd = wiki.page('Comparison of MUTCD-Influenced Traffic Signs')
print("\n".join([s.title for s in mutcd.sections]))
Output:
Places
Media and entertainment
Sports
Ships
Other uses
See also
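The snippet above prints only the top-level sections. In Wikipedia-API each section object also carries its own sections list, so nested headings can be walked recursively. Here is a minimal sketch; iter_sections is a name invented for this example, and the SimpleNamespace objects are stand-ins so that it runs without network access.

```python
from types import SimpleNamespace


def iter_sections(sections, depth=0):
    """Yield (depth, title) for each section and its subsections, in page order."""
    for section in sections:
        yield depth, section.title
        # Each section exposes its own subsections as a list.
        yield from iter_sections(section.sections, depth + 1)


# Stand-in objects so the sketch runs offline; with the real library you
# would pass wiki.page(...).sections instead.
demo = [
    SimpleNamespace(title="History", sections=[
        SimpleNamespace(title="16th century", sections=[]),
    ]),
    SimpleNamespace(title="See also", sections=[]),
]

for depth, title in iter_sections(demo):
    print("  " * depth + title)
```

Using a generator keeps the traversal separate from the printing, so the same function can feed a list, a filter, or an indented outline.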
>>> wikipedia.summary('Creativity')
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.page('Creativity')
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.suggest('Creativity')
'creatity'
>>> wikipedia.search('Creativity')
['Creativity',
'Creativity (religion)',
'Creativity and mental health',
...
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.page('creativity')
PageError: Page id "creatity" does not match any pages. Try another id!
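Lowercasing the title and similar tricks do not help. Adding the "(religion)" qualifier does, but that is only useful if the religion page is the one you want. After digging into the source code and the Wikipedia API, I found that Wikipedia's suggest API is returning an invalid page title suggestion. If you are sure the page title (for example "New York") exists, you can turn off auto_suggest: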
>>> wikipedia.page('Creativity', auto_suggest=False)
<WikipediaPage 'Creativity'>
>>> wikipedia.page('New York', auto_suggest=False)
DisambiguationError: "New York" may refer to:
New York City
New York (state)
...
>>> wikipedia.page('New York City', auto_suggest=False)
<WikipediaPage 'New York City'>
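The session above can be wrapped in a small helper that tries the exact title first and, if the title turns out to be ambiguous, retries with one of the offered options. This is only a sketch: fetch_resolved and its parameters are hypothetical names, the fetcher and the exception class are passed in so that the example runs without the library, and it relies on the disambiguation exception exposing its candidates as an options attribute.

```python
def fetch_resolved(title, fetch_page, ambiguous_error, choice=0):
    """Fetch `title`; on a disambiguation error, retry with one of its options.

    fetch_page      -- callable taking a title and returning a page object
    ambiguous_error -- exception class that carries an `options` list
    choice          -- index of the option to retry with
    """
    try:
        return fetch_page(title)
    except ambiguous_error as err:
        # Other errors (for example PageError) are deliberately not caught.
        return fetch_page(err.options[choice])


# Hypothetical usage with the real library (needs network access):
#   from functools import partial
#   import wikipedia
#   page = fetch_resolved(
#       "New York",
#       partial(wikipedia.page, auto_suggest=False),
#       wikipedia.exceptions.DisambiguationError,
#   )

# Offline stand-ins that mimic the behaviour shown in the session above.
class FakeAmbiguous(Exception):
    def __init__(self, options):
        super().__init__("ambiguous title")
        self.options = options


def fake_fetch(title):
    if title == "New York":
        raise FakeAmbiguous(["New York City", "New York (state)"])
    return "<page %s>" % title


print(fetch_resolved("New York", fake_fetch, FakeAmbiguous))
```

Picking the first option blindly is a simplification; in practice you would probably inspect the options and choose the one you mean.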
Over the past 6 months there have been several pull requests implementing a fix, but none of them has been reviewed yet: https://github.com/goldsmith/Wikipedia/pull/305
The nlpia2_wikipedia package fixes these bugs, along with some others.
pip install nlpia2_wikipedia
>>> import wikipedia
>>> wikipedia.summary('Creativity')
{'summary': "Creativity is the ability to form novel and ..."}
>>> wikipedia.search("New York")
['New York',
'New York City',
'New York (state)',
...