AttributeError: 'list' object has no attribute 'to_sql': how do I fix this error in my scraper?

Question · Votes: 0 · Answers: 2

I'm trying to scrape spells from a D&D page into a pandas DataFrame and convert that DataFrame into a SQLite database.

Here is my code:

from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
import csv
import requests
from random import randint
import pandas as pd

url1 = "http://dnd5e.wikidot.com/spells"
url_list = [url1,]

for url in url_list:


  ## Make Requests
  data = requests.get(url).text
  soup = BeautifulSoup(data, 'html.parser')

  # Scrape all tables into a pandas df (up to this part, it works)
  df_pandas = pd.read_html(url, attrs = {'class': 'wiki-content-table'},  flavor='bs4', thousands ='.')

# converting the df into a sqlite database
import sqlite3

conn = sqlite3.connect('dndata.db')
c = conn.cursor()

df_pandas.to_sql('spells', conn, if_exists='replace')

The last line of code produces this error message:

AttributeError                            Traceback (most recent call last)
c:\Users\chpap\Documents\Dokumente\Python lernen\Python book club\Cinema_Scraper\CS_Main.ipynb Cell 12 line 1
----> 1 df_pandas.to_sql('spells', conn, if_exists='replace')

AttributeError: 'list' object has no attribute 'to_sql'

Can you help me spot and correct my mistake?

I added the line

c = conn.cursor()

because I was told this would make Python recognize `to_sql`, but the error message stayed the same.

I also added the SQL magic `%load_ext sql` to smooth the transition from Python to SQLite, but that didn't change the error message either.

python pandas sqlite beautifulsoup
2 Answers
0 votes

`read_html` returns a *list* of DataFrame objects, so you first need to select one (or iterate over all of them) and call the `to_sql` method on each DataFrame. Also, note that the first argument of `read_html` can be either a URL or the HTML itself.

import sqlite3

import pandas as pd

conn = sqlite3.connect('dndata.db')
c = conn.cursor()

for url in url_list:
    df_pandas_list = pd.read_html(url, attrs={'class': 'wiki-content-table'}, flavor='bs4', thousands='.')

    df_pandas_list[0].to_sql('spells', conn, if_exists='replace')
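If the page contains several matching tables and you want all of them rather than just the first, you can loop over the list. A self-contained sketch of that idea; the two-table inline HTML and the in-memory database are stand-ins for the real page and `dndata.db`:

```python
import io
import sqlite3

import pandas as pd

# Two small tables standing in for the scraped page (illustration only)
html = """
<table class="wiki-content-table"><tr><th>Spell</th></tr><tr><td>Fireball</td></tr></table>
<table class="wiki-content-table"><tr><th>Spell</th></tr><tr><td>Mage Hand</td></tr></table>
"""

# read_html always returns a *list* of DataFrames, one per matching table
tables = pd.read_html(io.StringIO(html), attrs={'class': 'wiki-content-table'})
print(type(tables), len(tables))  # <class 'list'> 2

# Write each DataFrame to its own SQLite table
conn = sqlite3.connect(':memory:')
for i, df in enumerate(tables):
    df.to_sql(f'spells_{i}', conn, if_exists='replace')
```

Calling `to_sql` on the list itself is exactly what raises the `AttributeError` in the question; indexing (`tables[0]`) or looping is what resolves it.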

0 votes

You may want to use `pd.concat` to concatenate all the tables found on the site into one big DataFrame:

import sqlite3

import pandas as pd
import requests
from bs4 import BeautifulSoup

url1 = "http://dnd5e.wikidot.com/spells"

url_list = [
    url1,
]

all_dfs = []
for url in url_list:
    ## Make Requests
    data = requests.get(url).text
    soup = BeautifulSoup(data, "html.parser")

    df_pandas = pd.concat(
        pd.read_html(
            url, attrs={"class": "wiki-content-table"}, flavor="bs4", thousands="."
        )
    )

    all_dfs.append(df_pandas)


conn = sqlite3.connect("dndata.db")
c = conn.cursor()

# iterate over all dataframes in list `all_dfs` and save them to database
for i, df in enumerate(all_dfs, 1):
    df.to_sql(f"spells_{i}", conn, if_exists="replace")
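To verify the write worked, you can read the data back with `pd.read_sql`. A self-contained sketch using an in-memory database and a made-up two-row DataFrame; the real code would connect to `dndata.db` and query one of the `spells_{i}` tables instead:

```python
import sqlite3

import pandas as pd

# In-memory database stands in for dndata.db in this sketch
conn = sqlite3.connect(":memory:")

# Hypothetical sample data in place of the scraped spell tables
df = pd.DataFrame({"Spell Name": ["Fireball", "Mage Hand"], "Level": [3, 0]})
df.to_sql("spells_1", conn, if_exists="replace", index=False)

# Read the table back to confirm the write succeeded
check = pd.read_sql("SELECT * FROM spells_1", conn)
print(check)
```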