我使用Infinispan 5.3设置了2个节点的群集。我正在测试故障转移方案。当我杀死一个节点时,出现以下异常(使用同步缓存)。群集无法获取。因此,我需要重新启动应用程序,这在生产环境中实际上是不可能的]
2020-05-06 18:50:28,082 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] ISPN000136: Execution error
java.lang.IllegalStateException: Transaction TransactionImple < ac, BasicAction: -3f57f478:dd0a:5eb2b455:2d461 status: ActionStatus.ABORT_ONLY > is not in a valid state to be invoking cache operations on.
at org.infinispan.interceptors.TxInterceptor.enlist(TxInterceptor.java:275)
at org.infinispan.interceptors.TxInterceptor.enlistIfNeeded(TxInterceptor.java:239)
at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:233)
at org.infinispan.interceptors.TxInterceptor.visitGetKeyValueCommand(TxInterceptor.java:229)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:134)
at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:96)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:216)
at org.infinispan.statetransfer.StateTransferInterceptor.handleDefault(StateTransferInterceptor.java:200)
at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:96)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
at org.infinispan.interceptors.CacheMgmtInterceptor.visitGetKeyValueCommand(CacheMgmtInterceptor.java:113)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:134)
at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:96)
at org.infinispan.interceptors.IsMarshallableInterceptor.visitGetKeyValueCommand(IsMarshallableInterceptor.java:97)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128)
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92)
at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:96)
at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343)
at org.infinispan.CacheImpl.containsKey(CacheImpl.java:372)
at org.infinispan.DecoratedCache.containsKey(DecoratedCache.java:410)
at com.abcr.ServiceContext.existsInSyncCache(ServiceContext.java:1740)
at com.abcr.ServiceContext.getObjectForUpdateInSyncCache(ServiceContext.java:1778)
at com.abcr.core.cache.ClusterServiceNodeListCacheManager.getObjectForUpdate(ClusterServiceNodeListCacheManager.java:90)
at com.suntecgroup.tbms.tpe.core.server.ServerManager.callBackOnMembersModified(ServerManager.java:3385)
at com.abcr.core.ServiceContainerCommandDespatcher.run(ServiceContainerCommandDespatcher.java:64)
2020-05-06 18:50:28,086 ERROR [com.abcr.core.ServiceContainer] Invocation of callback APIs on leaving coordinator role failed for service 'ABC'.
com.suntecgroup.tbms.container.services.ContainerPlatformServicesException: Failed to retrieve object[SERVER/SERVICE_NODES/28000] for update.
这是我的infinspan和jgroups配置
<infinispan
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:infinispan:config:5.3 http://www.infinispan.org/schemas/infinispan-config-5.3.xsd"
xmlns="urn:infinispan:config:5.3">
<global>
<!-- Note that if these are left blank, defaults are used. See the user guide for what these defaults are -->
<asyncListenerExecutor factory="org.infinispan.executors.DefaultExecutorFactory">
<properties>
<property name="maxThreads" value="5" />
<property name="threadNamePrefix" value="AsyncListenerThread" />
</properties>
</asyncListenerExecutor>
<asyncTransportExecutor factory="org.infinispan.executors.DefaultExecutorFactory">
<properties>
<property name="maxThreads" value="25" />
<property name="threadNamePrefix" value="AsyncSerializationThread" />
</properties>
</asyncTransportExecutor>
<evictionScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
<properties>
<property name="threadNamePrefix" value="EvictionThread" />
</properties>
</evictionScheduledExecutor>
<replicationQueueScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
<properties>
<property name="threadNamePrefix" value="ReplicationQueueThread" />
</properties>
</replicationQueueScheduledExecutor>
<globalJmxStatistics enabled="false" jmxDomain="infinispan_1" />
<!--
If the transport is omitted, there is no way to create distributed or clustered caches.
There is no added cost to defining a transport but not creating a cache that uses one, since the transport
is created and initialized lazily.
-->
<transport clusterName="PC_SITE_1" distributedSyncTimeout="50000" transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
<properties>
<property name="configurationFile" value="./tmp/_clusterconfig/pc_jgroups_main_sync.xml" />
</properties>
</transport>
<!-- Note that the JGroups transport uses sensible defaults if no configuration property is defined. -->
<!-- See the JGroupsTransport javadocs for more flags -->
<!-- Again, sensible defaults are used here if this is omitted. -->
<serialization marshallerClass="org.infinispan.marshall.VersionAwareMarshaller" version="1.0" />
<!--
Used to register JVM shutdown hooks.
hookBehavior: DEFAULT, REGISTER, DONT_REGISTER
-->
<shutdown hookBehavior="DEFAULT" />
</global>
<!-- *************************** -->
<!-- Default "template" settings -->
<!-- *************************** -->
<!-- this is used as a "template" configuration for all caches in the system. -->
<default>
<!--
isolation levels supported: READ_COMMITTED and REPEATABLE_READ
-->
<locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="60000" writeSkewCheck="false" concurrencyLevel="5000" useLockStriping="false" />
<!--
Used to register a transaction manager and participate in ongoing transactions.
-->
<!-- ECPCacheTxManagerLookup -->
<!--
Used to register JMX statistics in any available MBean server
-->
<jmxStatistics enabled="false" />
<!--
Used to enable invocation batching and allow the use of Cache.startBatch()/endBatch() methods.
-->
<clustering mode="replication">
<sync replTimeout="600000" />
<stateTransfer timeout="480000" fetchInMemoryState="true" />
</clustering>
<storeAsBinary enabled="true" />
</default>
<namedCache name="GLOBAL_SYNC_CACHE">
<transaction transactionMode="TRANSACTIONAL" transactionManagerLookupClass="com.suntecgroup.tbms.container.services.cluster.ContainerCacheTxManagerLookup" syncRollbackPhase="false" syncCommitPhase="true" useEagerLocking="true" lockingMode="PESSIMISTIC" />
</namedCache>
<namedCache name="GLOBAL_NONTX_SYNC_CACHE">
<transaction transactionMode="NON_TRANSACTIONAL" />
</namedCache>
</infinispan>
JGROUPS配置..
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns="urn:org:jgroups" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd">
<TCP bind_port="7800" loopback="true" recv_buf_size="20M" send_buf_size="640K" max_bundle_size="64000" max_bundle_timeout="30" enable_bundling="false" use_send_queues="true" sock_conn_timeout="300" tcp_nodelay="true" thread_pool.enabled="true" thread_pool.min_threads="1" thread_pool.max_threads="25" thread_pool.keep_alive_time="5000" thread_pool.queue_enabled="false" thread_pool.queue_max_size="100" thread_pool.rejection_policy="run" oob_thread_pool.enabled="true" oob_thread_pool.min_threads="1" oob_thread_pool.max_threads="8" oob_thread_pool.keep_alive_time="5000" oob_thread_pool.queue_enabled="false" oob_thread_pool.queue_max_size="100" oob_thread_pool.rejection_policy="run" enable_diagnostics="false" />
<!--MPING mcast_addr="232.1.2.13"
mcast_port="7500"
num_initial_members="2"
timeout="2000" /-->
<TCPPING timeout="3000" initial_hosts="${jgroups.tcpping.initial_hosts:localhost[7800],localhost[7801]}" port_range="0" num_initial_members="3" />
<MERGE2 max_interval="100000" min_interval="20000" />
<FD_SOCK />
<FD timeout="60000" max_tries="5" />
<VERIFY_SUSPECT timeout="30000" />
<BARRIER />
<pbcast.NAKACK use_mcast_xmit="false" exponential_backoff="500" discard_delivered_msgs="true" />
<UNICAST timeout="300,600,1200" />
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000" />
<pbcast.GMS print_local_addr="false" join_timeout="3000" view_bundling="true" />
FRAG2 frag_size="60000"
pbcast.STATE_TRANSFER
</config>
当前事务已中止(可能是由于超时,但可能是由于交付失败所致)。您需要回滚当前事务并开始新的事务。
但是我要注意5.3是在2013年6月26日发布的-您使用的版本已接近7年。如果存在错误,甚至没有人会尝试将其检出。