将刮削表转换为DataFrame时的NoneType对象

问题描述 投票:1回答:1

我正在尝试废弃以下链接中显示在表格中的股票代码清单:http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A我使用漂亮的汤取消了表格,但当我将其转换为Pandas数据框架时,我收到一个错误:TypeError:'NoneType'对象是不可赎回

我尝试了以下代码:

url = 'http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A'
res = requests.get(url)
soup = BeautifulSoup(res.content,'lxml')
table = soup.find("table",{"class":"market tab1"})
df = pd.read_html(table)

但它不知道如何解决它?为什么我会收到错误?

完整错误日志:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
    796         try:
--> 797             tables = p.parse_tables()
    798         except Exception as caught:

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in parse_tables(self)
    212     def parse_tables(self):
--> 213         tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
    214         return (self._build_table(table) for table in tables)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _build_doc(self)
    618                 # try to parse the input in the simplest way
--> 619                 r = parse(self.io, parser=parser)
    620             try:

~/anaconda3/lib/python3.7/site-packages/lxml/html/__init__.py in parse(filename_or_url, parser, base_url, **kw)
    939         parser = html_parser
--> 940     return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
    941 

src/lxml/etree.pyx in lxml.etree.parse()

src/lxml/parser.pxi in lxml.etree._parseDocument()

TypeError: 'NoneType' object is not callable

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-23-c3e05c494f63> in <module>
      5 table = soup.find("table",{"class":"market tab1"})
      6 #print(table)
----> 7 df = pd.read_html(table)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na, displayed_only)
    985                   decimal=decimal, converters=converters, na_values=na_values,
    986                   keep_default_na=keep_default_na,
--> 987                   displayed_only=displayed_only)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
    799             # if `io` is an io-like object, check if it's seekable
    800             # and try to rewind it before trying the next parser
--> 801             if hasattr(io, 'seekable') and io.seekable():
    802                 io.seek(0)
    803             elif hasattr(io, 'seekable') and not io.seekable():

TypeError: 'NoneType' object is not callable

表乞讨:

<table cellpadding="0" cellspacing="1" class="market tab1" width="610">
<colgroup><col/><col/><col class="c"/></colgroup>
<tr><td class="tabh" colspan="3"><b>Companies listed on the NYSE</b></td></tr>
<tr><th>Equity</th><th>Symbol</th><th>Info</th></tr>
<tr class="ts0"><td align="left"><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/stock-price">A K Steel</a></td><td><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/stock-price">AKS</a></td><td><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/chart"><img src="/s/stock-chart.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/news"><img src="/s/stock-news.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/financials"><img src="/s/fundamentals.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/trades"><img src="/s/stock-trades.gif"/></a></td></tr>
python beautifulsoup
1个回答
1
投票

您正在将<class 'bs4.element.Tag'>元素传递给pandas read_html。你需要将它转换为string

from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A'
res = requests.get(url)
soup = BeautifulSoup(res.content,'lxml')
table = soup.find("table",{"class":"market tab1"})
df = pd.read_html(str(table))
print(df)

输出:

[                                    0       1     2
0        Companies listed on the NYSE     NaN   NaN
1                              Equity  Symbol  Info
2                           A K Steel     AKS   NaN
3                               A M R     AMR   NaN
4                      A M R Cp 7.875     AAR   NaN
5                               A V X     AVX   NaN
6                               A a R     AIR   NaN
7               A.h. Belo Corporation     AHC   NaN
8                         Aaron Rents   RNT.A   NaN
9                         Aaron Rents     RNT   NaN
10                        Aarons Cl A   AAN.A   NaN
11                        Aarons Inc.     AAN   NaN
12               Ab Svensk Cdss Arbmn     CBJ   NaN
13                   Ab Svensk Ekport     AXF   NaN
14               Ab Svensk Ekportkrdt     SQT   NaN
15               Ab Svensk Ekportkred     DVK   NaN
16               Ab Svensk Ekportkred     IWK   NaN
17               Ab Svensk Ekportkred     RCW   NaN
18               Ab Svensk Ekportkred     EOA   NaN
19                 Ab Svensk Msci Arn     MIS   NaN
20                  Ab Svensk Russell     REU   NaN
21                  Ab Svensk Sp Arns     SAD   NaN
22                  Ab Svensk Sp Arns     MHG   NaN
23                                Abb     ABB   NaN
24                        Abbott Labs     ABT   NaN
25                Abercrombie & Fitch     ANF   NaN
26                            Abitibi     ABY   NaN
27                                Abm     ABM   NaN
28                             Acadia     AKR   NaN
29                  Acc Bear Amex Egy     IMW   NaN
..                                ...     ...   ...
194                           Ashland     ASH   NaN
195                   Aspen Insurance     AHL   NaN
196  Assisted Living Concepts (nevada     ALC   NaN
197                Associated Estates     AEC   NaN
198                          Assurant     AIZ   NaN
199                  Assured Guaranty     AGO   NaN
200                           Astoria      AF   NaN
201                       Astrazeneca     AZN   NaN
202                 Atlanta Gas Light     ATG   NaN
203                    Atlas Pipeline     APL   NaN
204        Atlas Pipeline Holdings Lp     AHD   NaN
205                             Atmos     ATO   NaN
206                               Att       T   NaN
207                               Att     ATT   NaN
208                   Atwood Oceanics     ATW   NaN
209                      Au Optronics     AUO   NaN
210                           Autoliv     ALV   NaN
211                        Autonation      AN   NaN
212                          Autozone     AZO   NaN
213              Av Svensk Ekportkred     NEH   NaN
214                         Avalonbay     AVB   NaN
215              Aventine Renew Enrgy     AVR   NaN
216                    Avery Dennison     AVY   NaN
217                  Avis Budget Grp.     CAR   NaN
218                            Avista     AVA   NaN
219                             Avnet     AVT   NaN
220                     Avon Products     AVP   NaN
221                               Axa     AXA   NaN
222                              Axis     AXS   NaN
223                               Azz     AZZ   NaN

[224 rows x 3 columns]]
© www.soinside.com 2019 - 2024. All rights reserved.