Error loading data into a PostgreSQL database

I get the following error message:

python3 load_data.py
Traceback (most recent call last):
  File "/Users/sarahlenet/Desktop/MLOPS/Repo_MLOPS_data_atelier/src/scripts/data/load_data.py", line 113, in <module>
    insert_tsunamis_data('../../../data/tsunamis.csv')
  File "/Users/sarahlenet/Desktop/MLOPS/Repo_MLOPS_data_atelier/src/scripts/data/load_data.py", line 19, in insert_tsunamis_data
    conn = psycopg2.connect(**DB_PARAMS)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.OperationalError: connection to server at "172.25.0.2", port 5432 failed: Operation timed out
        Is the server running on that host and accepting TCP/IP connections?
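
A quick way to check whether that address is reachable at all from my machine is a plain TCP probe (a minimal sketch using only the standard library; the host and port are the ones from the error above):

import socket

# Address and port taken from the psycopg2 error message above.
HOST, PORT = '172.25.0.2', 5432

# If this raw TCP connection also times out, the problem is at the network
# level (the container IP is not reachable from this machine), not in PostgreSQL itself.
try:
    with socket.create_connection((HOST, PORT), timeout=5):
        print(f'TCP connection to {HOST}:{PORT} succeeded')
except OSError as exc:
    print(f'TCP connection to {HOST}:{PORT} failed: {exc}')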

I have a load_data.py file, which is this:

import pandas as pd
import psycopg2
from psycopg2 import sql

# Database connection parameters
DB_PARAMS = {
    'dbname': 'tsunamis',
    'user': 'my_user',
    'password': 'my_password',
    'host': '172.25.0.2',  # or your PostgreSQL server address
    'port': 5432
}

def insert_tsunamis_data(csv_file):
    # Read the CSV file into a DataFrame
    df = pd.read_csv(csv_file)

    # Create a connection to the database
    conn = psycopg2.connect(**DB_PARAMS)
    cursor = conn.cursor()

    # Prepare the SQL insert statement for the tsunamis table
    insert_query = sql.SQL("""
        INSERT INTO tsunamis (
            type, x, y, fid, id, tsevent_id, year, month, day, date_str,
            arr_day, arr_hour, arr_min, trav_hours, trav_mins, period,
            first_moti, latitude, longitude, location_n, area,
            country, region_cod, runup_ht, runup_ht_r, runup_hori,
            type_measu, type_mea_1, damage_mil, damage_amo, damage_des,
            deaths, deaths_amo, deaths_des, injuries, injuries_a,
            injuries_d, houses_des, houses_amo, houses_d_1, houses_dam,
            houses_d_2, houses_d_3, comments, doubtful, dist_from,
            event_regi, event_re_1, pred_trav, pred_trav1, url,
            tsevent_ur, hour, minute, second, region, cause_code,
            cause, event_vali, event_va_1, eq_mag_unk, eq_mag_mb,
            eq_mag_ms, eq_mag_mw, eq_mag_ml, eq_mag_mfa, eq_magnitu,
            eq_magni_1, eq_depth, max_event_, ts_mt_abe, ts_mt_ii,
            ts_intensi, missing, missing_am, missing_de, warning_st,
            num_runup, id_runup, has_ref, id_ref, slides_url,
            num_slides, map_slide_, map_eq_id, damage_m_1, damage_a_1,
            damage_tot, houses_d_4, houses_a_1, houses_tot,
            deaths_tot, deaths_a_1, deaths_t_1, injuries_t, injuries_1,
            injuries_2, missing_to, missing__1, missing__2, map_vol_id,
            houses_d_5, houses_d_6, houses_d_7, num_deposi
        ) VALUES ({placeholders})
    """).format(
        # one %s placeholder per CSV column, so the parameter count matches tuple(row)
        placeholders=sql.SQL(', ').join([sql.Placeholder()] * len(df.columns))
    )

    # Insert each row from the DataFrame
    for index, row in df.iterrows():
        cursor.execute(insert_query, tuple(row))

    # Commit the transaction
    conn.commit()
    cursor.close()
    conn.close()

def insert_gdp_data(csv_file):
    # Read the CSV file into a DataFrame
    df = pd.read_csv(csv_file)

    # Create a connection to the database
    conn = psycopg2.connect(**DB_PARAMS)
    cursor = conn.cursor()

    # Prepare the SQL insert statement for the gdp table
    insert_query = sql.SQL("""
        INSERT INTO gdp (
            type, country_name, country_code, indicator_name, indicator_code,
            "1960", "1961", "1962", "1963", "1964", "1965", "1966", "1967", "1968", "1969",
            "1970", "1971", "1972", "1973", "1974", "1975", "1976", "1977", "1978", "1979",
            "1980", "1981", "1982", "1983", "1984", "1985", "1986", "1987", "1988", "1989",
            "1990", "1991", "1992", "1993", "1994", "1995", "1996", "1997", "1998", "1999",
            "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009",
            "2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018", "2019",
            "2020", "2021", "2022", "2023"
        ) VALUES ({placeholders})
    """).format(
        # one %s placeholder per CSV column, so the parameter count matches tuple(row)
        placeholders=sql.SQL(', ').join([sql.Placeholder()] * len(df.columns))
    )

    # Insert each row from the DataFrame
    for index, row in df.iterrows():
        cursor.execute(insert_query, tuple(row))

    # Commit the transaction
    conn.commit()
    cursor.close()
    conn.close()

def insert_population_data(csv_file):
    # Read the CSV file into a DataFrame
    df = pd.read_csv(csv_file)

    # Create a connection to the database
    conn = psycopg2.connect(**DB_PARAMS)
    cursor = conn.cursor()

    # Prepare the SQL insert statement for the population table
    insert_query = sql.SQL("""
        INSERT INTO population (
            type, country, density, density_mi, population, area
        ) VALUES ({placeholders})
    """).format(
        # one %s placeholder per CSV column, so the parameter count matches tuple(row)
        placeholders=sql.SQL(', ').join([sql.Placeholder()] * len(df.columns))
    )

    # Insert each row from the DataFrame
    for index, row in df.iterrows():
        cursor.execute(insert_query, tuple(row))

    # Commit the transaction
    conn.commit()
    cursor.close()
    conn.close()

# Example usage:
insert_tsunamis_data('../../../data/tsunamis.csv')
insert_gdp_data('../../../data/gdp.csv')
insert_population_data('../../../data/population.csv')


And a docker-compose file, which is this:


# TODO: change the credentials and names
version: '3.8'

services:
  postgres:
    image: postgres:13  # PostgreSQL version
    container_name: postgres_container
    environment:
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: admin
      POSTGRES_DB: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ../src/scripts/data/init.sql:/docker-entrypoint-initdb.d/init.sql  # Mount the init script

    ports:
      - "5432:5432"
      
  pgadmin:
    image: dpage/pgadmin4:latest  # Official pgAdmin 4 image
    container_name: pgadmin_container
    environment:
      PGADMIN_DEFAULT_EMAIL: [email protected]  # pgAdmin login email
      PGADMIN_DEFAULT_PASSWORD: admin  # pgAdmin password
    ports:
      - "8080:80"  # Expose l'interface pgAdmin sur le port 8080
    depends_on:
      - postgres
    volumes:
      - pgadmin_data:/var/lib/pgadmin

volumes:
  postgres_data:
  pgadmin_data:

I have tried editing pg_hba.conf to allow access from 0.0.0.0/0.

1 Answer

Assuming you are running the containers locally on your Mac, the container's internal IP 172.25.0.2 is not reachable from the host (with Docker Desktop the containers live inside a VM). Since the compose file publishes port 5432, connect through localhost instead and update the Python script with the correct connection details:

DB_PARAMS = {
  'dbname': 'tsunamis',
  'user': 'admin',
  'password': 'admin',
  'host': 'localhost',
  'port': 5432
}
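
(This works because the compose file maps 5432:5432 to the host. If load_data.py itself ran inside a container attached to the same compose network, the host would instead be the service name postgres, which Docker's internal DNS resolves.)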

And update the docker-compose file so that the correct database is created:

environment:
  POSTGRES_USER: admin
  POSTGRES_PASSWORD: admin
  POSTGRES_DB: tsunamis
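
Note that POSTGRES_DB (and the mounted init.sql) are only applied the first time the data volume is initialized, so if the postgres_data volume already exists you may need to remove it (docker compose down -v) and recreate the containers, or create the tsunamis database manually. Once the container is up, a minimal connectivity check along these lines (a sketch, assuming the updated credentials above) should succeed:

import psycopg2

# Minimal connectivity check using the corrected parameters above
# (admin/admin, database "tsunamis", published port 5432 on localhost).
conn = psycopg2.connect(
    dbname='tsunamis',
    user='admin',
    password='admin',
    host='localhost',
    port=5432,
    connect_timeout=5,
)
with conn, conn.cursor() as cur:
    cur.execute('SELECT version();')
    print(cur.fetchone()[0])
conn.close()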