我试图运行带有存储的Janusgraph作为Cassandra,作为同一集群中的另一个服务运行,并使用Elasticsearch进行索引,再次作为同一集群中的另一个服务运行。
虽然两个服务中都打开了所需的端口,但是janusgraph pods的日志表示在连接到Cassandra时它面临连接超时。
23343 [main] WARN org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] configured at [conf/gremlin-server/janusgraph.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:82)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:70)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.lambda$new$0(DefaultGraphManager.java:57)
at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.<init>(DefaultGraphManager.java:55)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:110)
at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:89)
at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:354)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:78)
... 13 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:409)
at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1376)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:113)
... 18 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
... 24 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:619)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(AstyanaxStoreManager.java:314)
... 29 more
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=cassandra(SERVICE_IP):9160, latency=10001(10001), attempts=1]Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:352)
at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateKeyspace(ThriftClusterImpl.java:321)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:294)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:614)
我正在为cassandra运行janusgrah v2图像和gcr.io/google-samples/cassandra:v13图像。
我也试过从busybox pod连接到cassandra端口9160。但似乎没有用。但有趣的是:ping
似乎适用于服务名称(cassandra)。但只有当它在端口9160或9042上到达telnet
时才会出现连接拒绝错误。
这是cassandra STS:
apiVersion: v1
kind: Service
metadata:
labels:
app: cassandra
name: cassandra
spec:
clusterIP: None
ports:
- port: 9042
name: cql
- port: 9160
name: thrift
selector:
app: cassandra
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: cassandra
labels:
app: cassandra
spec:
serviceName: cassandra
replicas: 3
selector:
matchLabels:
app: cassandra
template:
metadata:
labels:
app: cassandra
spec:
terminationGracePeriodSeconds: 1800
#schedulerName: stork #Check benefits of using STORK as scheduler.
containers:
- name: cassandra
image: gcr.io/google-samples/cassandra:v13
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7001
name: tls-intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
- containerPort: 9160
name: thrift
- containerPort: 9142
name: transportssl
resources:
limits:
cpu: "1Gi"
memory: 2Gi
requests:
cpu: "500m"
memory: 1Gi
securityContext:
capabilities:
add:
- IPC_LOCK
lifecycle:
preStop:
exec:
command:
- /bin/sh
- -c
- nodetool drain
env:
- name: CASSANDRA_SEEDS
value: cassandra-0.cassandra.default.svc.cluster.local
- name: MAX_HEAP_SIZE
value: 512M
- name: HEAP_NEWSIZE
value: 512M
- name: CASSANDRA_CLUSTER_NAME
value: "Cassandra"
- name: CASSANDRA_DC
value: "DC1"
- name: CASSANDRA_RACK
value: "Rack1"
- name: CASSANDRA_AUTO_BOOTSTRAP
value: "false"
- name: CASSANDRA_ENDPOINT_SNITCH
value: GossipingPropertyFileSnitch
- name: CASSANDRA_RPC_ADDRESS
value: 0.0.0.0
- name: CASSANDRA_NUM_TOKENS
value: "32"
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
volumeMounts:
- name: nfs-pvc-cassandra
mountPath: /srv/nfs/kubedata/janus
restartPolicy: Always
volumes:
- name: nfs-pvc-cassandra
persistentVolumeClaim:
claimName: nfs-pvc-cassandra
有什么方法可以进一步调试?
如果您的janusgraph正在您的主机上运行,您可能需要对您的kubernetes服务端口进行端口转发以在本地访问它。也许你已经做到了
我可以确认您的StatefulSet yaml工作正常,无头服务创建指向pod端点的dns名称。我创建了简单的nginx pod来telnet服务以进行检查。这是输出:
检查cassandra是否存在端点和服务
$ kubectl get svc,ep cassandra
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cassandra ClusterIP None <none> 9042/TCP,9160/TCP 2h
NAME ENDPOINTS AGE
endpoints/cassandra 10.56.0.10:9160,10.56.3.3:9160,10.56.4.2:9160 + 3 more... 2h
执行到相同命名空间中的邻居窗格并远程登录到服务和窗格
$ kubectl exec -it nginx-79dbd67896-9dwp8 bash
root@nginx-79dbd67896-9dwp8:/# telnet cassandra 9042
Trying 10.56.3.3...
Connected to cassandra.default.svc.cluster.local.
Escape character is '^]'.
telnet> quit
Connection closed.
root@nginx-79dbd67896-9dwp8:/# telnet 10.56.0.10 9042
Trying 10.56.0.10...
Connected to 10.56.0.10.
Escape character is '^]'.
从输出服务看来,pod正在侦听端口9042,而不是9160,因为端口9160用于Cassandra的Thrift API服务器,默认情况下是禁用的。有关此问题的更多信息,请查看https://github.com/docker-library/cassandra/issues/127。您必须检查如何启用Thrift API端口。
您可以通过执行此pod中的一个并运行以下命令来检查cassandra pod上的侦听端口:
root@cassandra-0:/# apt update && apt install telnet net-tools
<output omitted>
root@cassandra-0:/# netstat -tulpen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 0.0.0.0:9042 0.0.0.0:* LISTEN 1000 10163244 -
tcp 0 0 127.0.0.1:43669 0.0.0.0:* LISTEN 1000 10162974 -
tcp 0 0 10.56.0.10:7000 0.0.0.0:* LISTEN 1000 10163145 -
tcp 0 0 127.0.0.1:7199 0.0.0.0:* LISTEN 1000 10162973 -
希望能帮助到你!