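I'm having trouble setting up Cassandra nodes in different data centers.
Each data center has public IPs; say 111.111.111.x and 222.222.222.x. The Cassandra nodes are started in Docker containers, 3 nodes in each DC. The problem is that the DCs cannot see each other.
Here is the docker-compose file for the first cluster, on the host with IP 111.111.111.x: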
version: '3.3'
services:
  cass1:
    image: cassandra
    container_name: cass1
    hostname: cass1
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9042:9042"
      - "7000:7000"
    environment: &environment
      HEAP_NEWSIZE: 128M
      MAX_HEAP_SIZE: 2048M
      CASSANDRA_SEEDS: "cass1, cass2, 222.222.222.222"
      CASSANDRA_CLUSTER_NAME: ks
      CASSANDRA_DC: DC1
      CASSANDRA_RACK: West
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
      CASSANDRA_NUM_TOKENS: 256
  cass2:
    image: cassandra
    container_name: cass2
    hostname: cass2
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9043:9042"
      - "7001:7000"
    environment: *environment
    depends_on:
      - cass1
  cass3:
    image: cassandra
    container_name: cass3
    hostname: cass3
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9044:9042"
      - "7002:7000"
    environment: *environment
    depends_on:
      - cass2
networks:
  cassandra:
The second cluster is on the host with IP 222.222.222.x:
version: '3.3'
services:
  cass1:
    image: cassandra
    container_name: cass1
    hostname: cass1
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9042:9042"
      - "7000:7000"
    environment: &environment
      HEAP_NEWSIZE: 128M
      MAX_HEAP_SIZE: 2048M
      CASSANDRA_SEEDS: "cass1, cass2, 111.111.111.111"
      CASSANDRA_CLUSTER_NAME: ks
      CASSANDRA_DC: DC2
      CASSANDRA_RACK: East
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
      CASSANDRA_NUM_TOKENS: 256
  cass2:
    image: cassandra
    container_name: cass2
    hostname: cass2
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9043:9042"
      - "7001:7000"
    environment: *environment
    depends_on:
      - cass1
  cass3:
    image: cassandra
    container_name: cass3
    hostname: cass3
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9044:9042"
      - "7002:7000"
    environment: *environment
    depends_on:
      - cass2
networks:
  cassandra:
After startup, each cluster sees only itself and not the other one. For example, here is the nodetool status output on 111.111.111.111:
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.18.0.4 108.81 KiB 256 60.1% 3ceecd38-0c30-4189-9a52-cb23465ca0ca West
UN 172.18.0.3 113.77 KiB 256 70.4% f1f6f63a-7cc5-4883-8b90-5159c6b6613c West
UN 172.18.0.2 108.74 KiB 256 69.5% 94395942-8efa-447b-9961-2d80375bce35 West
netcat shows that the other host is reachable:
nc -v 222.222.222.222 7000
Connection to 222.222.222.222 7000 port [tcp/*] succeeded!
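There are no errors in the Cassandra logs. Maybe there is a configuration mistake somewhere? I tried running 2 DCs on the same host, and communication between the DCs worked fine.
TY for answers.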
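For nodes to form a cluster, three things need to be true: they share the same cluster_name, they can reach the seed nodes, and they have network connectivity to each other for gossip. When a node comes online, it tries to gossip with the seed nodes to get information about the cluster. If it has no network connectivity to the other nodes, it cannot join the cluster.
If a node gossips with another node that has a different cluster name, the two nodes reject each other because they do not belong to the same cluster.
In your case, I suspect the nodes cannot gossip with each other across regions. Typically on public clouds, nodes cannot reach another region over private IPs, so the nodes need to be configured with:
listen_address: private_IP
broadcast_address: public_IP
With this configuration, nodes in the same region connect using their private IPs but use their public IPs to connect to nodes in another region.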
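As a rough sketch of how those two settings could be passed to the containers: the official cassandra Docker image reads CASSANDRA_LISTEN_ADDRESS and CASSANDRA_BROADCAST_ADDRESS from the environment. The fragment below covers a single node and assumes 111.111.111.111 is a public IP that routes to this container. Listing the seeds by public IP is also an assumption on my part, so adjust it to your setup.

```yaml
services:
  cass1:
    image: cassandra
    environment:
      CASSANDRA_CLUSTER_NAME: ks
      CASSANDRA_DC: DC1
      CASSANDRA_RACK: West
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
      # listen_address: "auto" is the image default and resolves to
      # the container's own (private) IP at startup.
      CASSANDRA_LISTEN_ADDRESS: auto
      # broadcast_address: the address other nodes are told to use.
      # Assumed here to be a public IP that reaches this container.
      CASSANDRA_BROADCAST_ADDRESS: 111.111.111.111
      # Assumption: seeds are given as public IPs so that both DCs
      # can reach them.
      CASSANDRA_SEEDS: "111.111.111.111,222.222.222.222"
```

Since the broadcast address differs per node, it cannot live in a shared &environment anchor as-is; each service would need its own value.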
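I also noticed from your Docker compose file that you've exposed the gossip port (7000) on different ports on the host (7000, 7001 and 7002), which means those nodes won't be able to gossip with nodes in the other region on port 7000.
Cheers!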
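One possible way to keep gossip on port 7000 for every node, assuming the host really has several public IPs (the 111.111.111.112 and 111.111.111.113 addresses below are hypothetical), is to bind each container's port 7000 to a different host IP using the IP:host_port:container_port form of a compose port mapping:

```yaml
services:
  cass1:
    ports:
      # Each node keeps port 7000, bound to its own host IP.
      - "111.111.111.111:7000:7000"
  cass2:
    ports:
      - "111.111.111.112:7000:7000"   # hypothetical second public IP
  cass3:
    ports:
      - "111.111.111.113:7000:7000"   # hypothetical third public IP
```

If the host has only one public IP, this does not apply as written, and running one node per host is probably the simpler route.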