Avoiding file-sharing locks in MS Access during bulk inserts

Question · 0 votes · 1 answer

"File sharing lock count exceeded" // Python import into an Access table.

If anyone can help, you'll be a lifesaver. I have a Python script that automates a manual process in which a user imports a .txt file containing 1,500,000 rows into an Access table. While writing the Python, I moved to bulk inserts: the script takes a dataframe, splits it into .csv files of some chunk size such as 50,000 rows, and then inserts into the Access table from each .csv.

The problem is that this is a company PC, so I cannot raise the MaxLocksPerFile default of 9,500. I am currently working with 5,000-row .csv chunks and committing every 10 batches: it inserts 5,000 rows at a time and commits only once 50,000 rows have accumulated. It made it through roughly 350,000 rows before throwing the "File sharing lock count exceeded" error.
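
For readers who do have admin rights: MaxLocksPerFile lives in the registry, and the key path depends on the engine and Office version. The paths below are the commonly cited locations, but treat them as assumptions to verify on your own machine. A minimal read-only check in Python:

import winreg

# Commonly cited registry locations for MaxLocksPerFile; the exact path
# varies by engine and Office version, so verify these before relying
# on them (they are assumptions, not confirmed here).
CANDIDATE_KEYS = [
    r"SOFTWARE\Microsoft\Jet\4.0\Engines\Jet 4.0",  # legacy Jet engine
    r"SOFTWARE\Microsoft\Office\16.0\Access Connectivity Engine\Engines\ACE",  # ACE / Office 2016+
]

for path in CANDIDATE_KEYS:
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
            value, _ = winreg.QueryValueEx(key, "MaxLocksPerFile")
            print(f"{path}: MaxLocksPerFile = {value}")
    except OSError:
        print(f"{path}: key or value not found")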

I have tried every combination of batch size and commit interval I can think of. I tried executemany with a single SQL statement several times, and I tried loading all 1.5M rows with execute and then committing. Everything failed.
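
For reference, this is roughly what the executemany attempt looked like. It is a sketch, assuming conn is an open pyodbc connection, df is the dataframe already loaded from the .txt, and MyTable is a placeholder table name; this variant also accumulates page locks within each transaction.

# Parameterized executemany variant (sketch; MyTable is a placeholder).
placeholders = ", ".join(["?"] * len(df.columns))
sql = f"INSERT INTO [MyTable] VALUES ({placeholders})"

cursor = conn.cursor()
for start in range(0, len(df), 5000):
    batch = [tuple(row) for row in df.iloc[start:start + 5000].itertuples(index=False)]
    cursor.executemany(sql, batch)
    conn.commit()  # commit each batch so the engine can release its page locks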

Has anyone done anything similar in Access? And before you say "use a more robust database": I would if I could. My supervisor still uses Access, so for now I'm stuck with it. I would use SQL Server if I could.

Here is the code, the function that imports the data:

import os
import time

import pandas as pd
import pyodbc

# connect_str, get_column_specs, and adjust_data_to_specs are defined
# elsewhere in the script.

def bulk_insert_to_access(file_path, table_name, temp_file_path=None, chunk_size=5000, commit_interval=100):
    """
    Inserts data from a dataframe into Access using a chunked bulk insert.
    Each chunk is saved to a temporary CSV file and appended with an
    INSERT INTO ... SELECT statement against the Text ISAM driver.
    """
    # Step 1: Read the .txt file and prepare the dataframe
    df = pd.read_csv(file_path, delimiter="\t", encoding="utf-8")

    # Adjust your data as needed (like column cleaning, etc.)
    column_specs = get_column_specs(table_name)
    df_adjusted = adjust_data_to_specs(df, column_specs)

    
    if not temp_file_path:
        temp_file_path = os.path.dirname(file_path)  # default to the input file's folder

    # Break DF into chunks
    chunks = [df_adjusted.iloc[i:i + chunk_size] for i in range(0, len(df_adjusted), chunk_size)]

    # List to keep track of temporary files for cleanup later
    chunk_file_paths = []

    # Step 2: Perform the bulk insert via SQL SELECT INTO method
    conn = pyodbc.connect(connect_str)
    cursor = conn.cursor()

    try:
        for idx, chunk in enumerate(chunks):
            # Save the chunk to a temporary CSV file
            chunk_file_path = os.path.join(temp_file_path, f"temp_data_chunk_{idx}.csv")
            chunk.to_csv(chunk_file_path, index=False, header=False, sep="\t")  # tab-delimited to match FMT=TabDelimited below
            chunk_file_paths.append(chunk_file_path)  # Track file for later cleanup

            # Perform the bulk insert for each chunk
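            # The [Text;...] source uses the Access Text ISAM driver:
            # FMT sets the delimiter, HDR=NO means the file has no header
            # row, and DATABASE points at the folder containing the CSV
            # (the file name itself acts as the "table" name).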
            sql = f"""
            INSERT INTO [{table_name}] 
            SELECT * FROM [Text;FMT=TabDelimited;HDR=NO;DATABASE={os.path.dirname(chunk_file_path)}].[{os.path.basename(chunk_file_path)}]
            """
            try:
                # Execute SQL statement to insert data from chunked file
                start_time = time.time()
                cursor.execute(sql)
                
                # Commit after every `commit_interval` chunks
                if (idx + 1) % commit_interval == 0:
                    conn.commit()
                    time.sleep(1)  # Add a small delay after commit to release locks
                
                elapsed_time = time.time() - start_time
                print(f"Bulk insert for chunk {idx} completed in {elapsed_time:.2f} seconds")

            except pyodbc.Error as e:
                print(f"Error during bulk insert for chunk {idx}: {e}")
                conn.rollback()  # Rolls back everything since the last commit, not just this chunk

        # Commit after the last chunk if not already committed
        conn.commit()

    except pyodbc.Error as e:
        print(f"Error during the bulk insert process: {e}")
        conn.rollback()

    finally:
        # Cleanup temporary files
        for file_path in chunk_file_paths:
            if os.path.exists(file_path):
                try:
                    os.remove(file_path)
                except OSError as e:
                    print(f"Error deleting temporary file {file_path}: {e}")

        
        cursor.close()
        conn.close()
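
A usage sketch, with the connection string, paths, and table name as placeholders for illustration (the ODBC driver name assumes the Access Database Engine is installed):

# Example invocation; all values below are placeholders.
connect_str = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\MyDatabase.accdb;"
)

bulk_insert_to_access(
    file_path=r"C:\data\import_file.txt",
    table_name="Test",
    chunk_size=5000,
    commit_interval=10,  # commit every 10 chunks = 50,000 rows, as described above
)
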
Tags: python, sql, ms-access, sql-insert
1 Answer

0 votes

Solved it by calling VBA from Python.

Python:

import win32com.client

def run_macro(macro_name):
    # access_db_path is defined elsewhere in the script
    try:
        access_app = win32com.client.Dispatch('Access.Application')
        access_app.Visible = False
        access_app.OpenCurrentDatabase(access_db_path)
        access_app.DoCmd.RunMacro(macro_name)
        access_app.Quit()
    except Exception as e:
        print(f"Error running macro {macro_name}: {e}")

run_macro("Macro")

VBA:

Sub ImportTextFileWithSpec()
    filePath = "file.txt"
    importSpec = "SpecName"
    tableName = "Test"

    DoCmd.TransferText _
        TransferType:=acImportDelim, _
        SpecificationName:=importSpec, _
        TableName:=tableName, _
        FileName:=filePath, _
        HasFieldNames:=True
End Sub
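
A possible hardening of the Python caller, not part of the original answer: run Quit() in a finally block so no orphaned MSACCESS.EXE instance is left behind when the macro fails. A sketch, with the database path passed in explicitly:

import win32com.client

def run_macro_safely(macro_name, db_path):
    # Hypothetical variant of run_macro above: Quit() runs even when
    # RunMacro raises, so the Access process is always released.
    access_app = win32com.client.Dispatch("Access.Application")
    try:
        access_app.Visible = False
        access_app.OpenCurrentDatabase(db_path)
        access_app.DoCmd.RunMacro(macro_name)
    finally:
        access_app.Quit()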
