Extract 4 million rows from a database in chunks of 1 million rows

Question

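I have a table in a Snowflake database that contains 4 to 8 million rows.
I want to extract 4 million rows in chunks to reduce the load on the database.
On each iteration of the loop, the rows should be appended in chunks to the array `r` (as shown in the last line of code), and at the end all the data should be saved to a CSV file.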

I am new to Python. I have looked at other answers, but they did not work as expected.

Can anyone help me?

chunksize= 1000000
query = "select * from emp_table;"
cursor = conn.cursor()
cursor.execute(query)
r=[]

Answer 1

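As you mentioned, a large volume of data should be loaded in chunks to reduce the load. However, appending every chunk to `r` keeps the entire result set in memory, which can be very memory-intensive.

To avoid that, fetch a chunk, write it to the file, then fetch the next chunk and repeat, so that memory is never overloaded.

Here is sample code: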

import snowflake.connector
import csv

def extract_and_write_data(conn, query, chunksize=1000000, 
    filename='extracted_data.csv'):
    cursor = conn.cursor()
    cursor.execute(query)

    with open(filename, 'w', newline='') as f:
        writer = csv.writer(f)
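        # cursor.description holds one metadata tuple per column; the name is its first item
        writer.writerow([col[0] for col in cursor.description])  # Write header row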

        while True:
            chunk = cursor.fetchmany(chunksize)
            if not chunk:
                break
            writer.writerows(chunk)

# Connect to Snowflake
conn = snowflake.connector.connect(
    user='your_user',
    password='your_password',
    account='your_account',
    warehouse='your_warehouse',
    database='your_database',
    schema='your_schema'
)

# SQL query to extract all rows from the table
query = "SELECT * FROM emp_table"

# Extract and write data to CSV
extract_and_write_data(conn, query)

# Close the connection
conn.close()
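
The code above writes every row the query returns, while the question asks for only 4 million rows out of a table that may hold up to 8 million. One possible way to cap the extraction is sketched below. The function name `write_rows_in_chunks`, the `max_rows` parameter, and the file name `demo_extract.csv` are hypothetical and not part of the original answer. The demo uses an in-memory `sqlite3` table as a stand-in so it can run without a Snowflake account; the function itself relies only on the standard DB-API `fetchmany` and `description`, so it should work with a Snowflake cursor as well, though that is an assumption rather than something tested here.

```python
import csv
import sqlite3


def write_rows_in_chunks(cursor, filename, chunksize=1_000_000, max_rows=4_000_000):
    """Stream rows from an already-executed DB-API cursor into a CSV file.

    Stops once max_rows rows have been written and returns the row count.
    """
    written = 0
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        # Header row: the column name is the first item of each description tuple
        writer.writerow([col[0] for col in cursor.description])
        while written < max_rows:
            # Never fetch more than what is still allowed by max_rows
            chunk = cursor.fetchmany(min(chunksize, max_rows - written))
            if not chunk:
                break  # the query returned fewer rows than max_rows
            writer.writerows(chunk)
            written += len(chunk)
    return written


# Small stand-in demo: 10 rows, chunks of 3, capped at 7 rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_table (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp_table VALUES (?, ?)",
    [(i, f"emp{i}") for i in range(10)],
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM emp_table ORDER BY id")
rows_written = write_rows_in_chunks(cursor, "demo_extract.csv", chunksize=3, max_rows=7)
conn.close()
```

Only one chunk is held in memory at a time, so peak memory is bounded by `chunksize` rather than by the total row count. Adding `LIMIT 4000000` to the SQL query is another option that keeps the cap on the database side.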
