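I have set up three networked containers because I want to use Hadoop and Hive with PostgreSQL. The Docker setup is available at https://github.com/jcool12/hadoop-docker/tree/main/hivepost, where you can download the folders/files needed to run it. The Hadoop container starts fine and PostgreSQL starts fine, but the Hive container reports the following errors on startup: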
Waiting for PostgreSQL to start...
2024-05-24 00:09:28 psql: error: could not connect to server: Connection refused
2024-05-24 00:09:28 Is the server running on host "postgres" (172.22.0.2) and accepting
2024-05-24 00:09:28 TCP/IP connections on port 5432?
2024-05-24 00:09:28 Postgres is unavailable - sleeping
2024-05-24 00:09:29 psql: error: could not connect to server: Connection refused
2024-05-24 00:09:29 Is the server running on host "postgres" (172.22.0.2) and accepting
2024-05-24 00:09:29 TCP/IP connections on port 5432?
2024-05-24 00:09:29 Postgres is unavailable - sleeping
2024-05-24 00:09:30 psql: error: could not connect to server: Connection refused
2024-05-24 00:09:30 Is the server running on host "postgres" (172.22.0.2) and accepting
2024-05-24 00:09:30 TCP/IP connections on port 5432?
2024-05-24 00:09:30 Postgres is unavailable - sleeping
2024-05-24 00:09:31 psql: error: could not connect to server: Connection refused
2024-05-24 00:09:31 Is the server running on host "postgres" (172.22.0.2) and accepting
2024-05-24 00:09:31 TCP/IP connections on port 5432?
2024-05-24 00:09:31 Postgres is unavailable - sleeping
2024-05-24 00:09:32 Postgres is up - checking for required tables
2024-05-24 00:09:32 Required tables not found. Proceeding with initialization...
2024-05-24 00:09:32 WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
[25 timestamp-only blank log lines, 00:09:42 to 00:09:43, omitted]
2024-05-24 00:09:28 Waiting for PostgreSQL to start at postgres...
2024-05-24 00:09:39 Initializing the schema to: 4.0.0
2024-05-24 00:09:39 Metastore connection URL: jdbc:postgresql://postgres:5432/hive
2024-05-24 00:09:39 Metastore connection Driver : org.postgresql.Driver
2024-05-24 00:09:39 Metastore connection User: hiveuser
2024-05-24 00:09:39 Starting metastore schema initialization to 4.0.0
2024-05-24 00:09:39 Initialization script hive-schema-4.0.0.postgres.sql
2024-05-24 00:09:50 Initialization script completed
2024-05-24 00:09:51 Initializing Hive schema...
2024-05-24 00:09:56 Initializing the schema to: 4.0.0
2024-05-24 00:09:56 Metastore connection URL: jdbc:postgresql://postgres:5432/hive
2024-05-24 00:09:56 Metastore connection Driver : org.postgresql.Driver
2024-05-24 00:09:56 Metastore connection User: hiveuser
2024-05-24 00:09:57 Starting metastore schema initialization to 4.0.0
2024-05-24 00:09:57 Initialization script hive-schema-4.0.0.postgres.sql
2024-05-24 00:10:01 2024-05-23 23:10:01: Starting Hive Metastore Server
[about 240 timestamp-only blank log lines, 00:09:43 to 00:09:50, omitted]
2024-05-24 00:09:50 Postgres is up - executing command
2024-05-24 00:09:51 Password for user hiveuser:
2024-05-24 00:09:51 psql: error: fe_sendauth: no password supplied
2024-05-24 00:09:51 WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
[11 timestamp-only blank log lines at 00:10:00 omitted]
2024-05-24 00:10:00 Error: ERROR: relation "BUCKETING_COLS" already exists (state=42P07,code=0)
2024-05-24 00:10:00 Schema initialization FAILED! Metastore state would be inconsistent!
2024-05-24 00:10:00 Underlying cause: java.io.IOException : Schema script failed, errorcode 2
2024-05-24 00:10:00 Use --verbose for detailed stacktrace.
2024-05-24 00:10:00 *** schemaTool failed ***
2024-05-24 00:10:01 WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
2024-05-24 00:10:01 WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
2024-05-24 00:10:13 2024-05-23 23:10:13: Starting HiveServer2
2024-05-24 00:10:13 WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
2024-05-24 00:10:24 Hive Session ID = 854876ef-81d3-409f-af2d-223ec90b215f
2024-05-24 00:11:34 Hive Session ID = b46b34ab-c27e-4898-afaa-71d79fc03f44
2024-05-24 00:12:34 Hive Session ID = 85678875-8418-45fc-aeb6-c75bce3b926e
Could you help me resolve these problems so that Hive runs properly with postgres?
I would suggest a few updates to the entrypoint.sh file.
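#!/bin/bash
# Remove Hadoop's slf4j-reload4j JAR; -f avoids an error if it is already gone (e.g. on restart).
rm -f /opt/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar
# Make sure Hadoop and Hive each find a log4j.properties file.
if [ ! -f "$HADOOP_CONF_DIR/log4j.properties" ]; then
    cp "$HADOOP_HOME/conf/log4j.properties" "$HADOOP_CONF_DIR/log4j.properties"
fi
if [ ! -f "$HIVE_HOME/conf/log4j.properties" ]; then
    cp "$HIVE_HOME/conf/log4j-hive.properties" "$HIVE_HOME/conf/log4j.properties"
fi
# Let psql authenticate without prompting (avoids "fe_sendauth: no password supplied").
export PGPASSWORD="$POSTGRES_PASSWORD"
# Look for one of the metastore tables; the result is empty if the schema has not been created yet.
TABLE_EXISTS=$(psql -h postgres -U "$POSTGRES_USER" -d "$POSTGRES_DB" -tc "SELECT 1 FROM pg_tables WHERE schemaname='public' AND tablename='BUCKETING_COLS';")
if [ -z "$TABLE_EXISTS" ]; then
    echo "Initializing Hive schema"
    # grep -v '^$' drops the empty lines that schematool prints.
    "$HIVE_HOME/bin/schematool" -initSchema -dbType postgres 2>&1 | grep -v '^$'
else
    echo "Hive schema is already initialized, skipping schema initialization"
fi
# Stop any HiveServer2 process that is already running.
pid=$(pgrep -f 'hive.*service.*hiveserver2')
if [ -n "$pid" ]; then
    echo "Stopping running HiveServer2 process..."
    # $pid is left unquoted on purpose: pgrep may return several PIDs.
    kill -9 $pid
fi
# Run the metastore in the background and HiveServer2 in the foreground.
"$HIVE_HOME/bin/hive" --service metastore &
"$HIVE_HOME/bin/hive" --service hiveserver2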
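The two decisions in that script (whether to initialize, and what gets filtered from the schematool output) can be tried without a database. The following is only a minimal sketch: needs_init and drop_blank_lines are hypothetical helper names that are not part of the repository, and the strings passed in stand in for what psql -tc would print.

```shell
#!/bin/sh
# Hypothetical helper (not in the original entrypoint.sh).
# $1 is the raw text printed by the psql -tc existence query.
# Succeeds (exit 0) when that text is empty, i.e. no matching table was found.
needs_init() {
    # Remove all whitespace first, because psql -t pads values with a space.
    trimmed=$(printf '%s' "$1" | tr -d '[:space:]')
    [ -z "$trimmed" ]
}

# Hypothetical helper: drop empty lines from stdin, as grep -v '^$' does
# at the end of the schematool command.
drop_blank_lines() {
    grep -v '^$'
}

# Simulated query results: "" means no row came back, " 1" means the table exists.
if needs_init ""; then
    echo "table missing: would run schematool -initSchema"
fi
if ! needs_init " 1"; then
    echo "table present: would skip initialization"
fi

# Simulated schematool output with blank lines in between.
printf 'Initializing the schema\n\n\nInitialization script completed\n' | drop_blank_lines
```

Stripping the whitespace is a small extra safeguard that the script above does not strictly need, since the query prints nothing at all when the table is missing.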
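The updated script checks whether the BUCKETING_COLS table exists and uses that to decide whether to initialize the Hive schema.
A grep is added to the end of the schematool command to suppress the large number of empty lines in the log.
The execution of wait-for-postgres.sh is removed and the wait is moved into the docker-compose.yml file (see below). This is not really necessary, but I find things clearer when the dependency is made explicit in the Docker Compose configuration.
The docker-compose.yml below has been simplified to focus on the key change (adding a health check to the postgres service). See the 🚨 comments.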
version: '3.8'
services:
  postgres:
    image: postgres:13
    container_name: postgres
    environment:
      POSTGRES_HOST: postgres
      POSTGRES_DB: hive
      POSTGRES_USER: hiveuser
      POSTGRES_PASSWORD: hivepassword
    # 🚨 Check that PostgreSQL is ready for connections before marking as "healthy".
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      interval: 5s
      timeout: 5s
      retries: 5
  hive:
    build: ./hive # Assumes you have a Dockerfile in the ./hive directory
    container_name: hive
    environment:
      POSTGRES_JDBC_VERSION: 42.2.24
      POSTGRES_HOST: postgres # Ensure PostgreSQL host is set
      POSTGRES_DB: hive
      POSTGRES_USER: hiveuser
      POSTGRES_PASSWORD: hivepassword
    depends_on:
      # 🚨 Only start this container when PostgreSQL is "healthy".
      postgres:
        condition: service_healthy
      hadoop:
        condition: service_started
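If you would rather keep the wait inside a script instead of relying on the health check, a bounded retry loop along these lines could work. This is a sketch under assumptions, not what the repository does: wait_until is a hypothetical helper, and the commented pg_isready call assumes the PostgreSQL client tools are installed in the Hive image and reuses the host, database and user from the compose file above.

```shell
#!/bin/sh
# Hypothetical helper: wait_until RETRIES INTERVAL COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most RETRIES times, sleeping INTERVAL
# seconds between attempts. Returns 0 on success, 1 if every attempt failed.
wait_until() {
    retries=$1
    interval=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$retries failed" >&2
        # Sleep only when another attempt is still to come.
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example with the same numbers as the health check (5 tries, 5 seconds apart).
# Left commented out because it needs a reachable PostgreSQL server:
# wait_until 5 5 pg_isready -h postgres -d hive -U hiveuser || exit 1
```

Unlike the original log, where the wait loop kept printing "Postgres is unavailable - sleeping" with no limit in sight, this version gives up after a fixed number of attempts, so a database that never comes up results in a clear failure.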