Apache Ignite cache long running transaction

Sometimes, but not always, my application hangs for a minute or more when the topology of my cluster changes. The log then shows the Ignite warning below, which I assume is the reason why my application hangs at a cache operation.

What is causing the long-running transaction? I suspect either network issues or GC.

I wasn't able to find out which cache operation in my code is causing this long transaction. Does the warning help me find out what operation it is?

22:00:30.456 [grid-timeout-worker-#63][101] WARN org.apache.ignite.internal.diagnostic-[warning] Found long running transaction [startTime=21:58:57.176, curTime=22:00:30.456, tx=GridNearTxLocal [mappings=IgniteTxMappingsImpl , nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null, hasRemoteLocks=false, trackTimeout=false, lb=null, thread=<failed to find active thread 1498>, mappings=IgniteTxMappingsImpl , super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=, dhtNodes=, explicitLock=false, super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl [activeCacheIds=, recovery=null, txMap=], super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=147994093, order=1536523104974, nodeOrder=74], writeVer=null, implicit=false, loc=true, threadId=1498, startTime=1536523137176, nodeId=e8153238-1d5a-4149-8db8-83a9fc820750, startVer=GridCacheVersion [topVer=147994093, order=1536523104974, nodeOrder=74], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=null, finalizing=NONE, invalidParts=null, state=ACTIVE, timedOut=false, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], duration=93280ms, onePhaseCommit=false], size=0]]]]

My caches are created like this:

    <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="name" value="MFDB_JobList" />
        <property name="cacheMode" value="PARTITIONED" />
        <property name="backups" value="0" />
        <property name="atomicityMode" value="TRANSACTIONAL"/>
        <property name="writeSynchronizationMode" value="FULL_SYNC"/>
        <property name="indexedTypes">
            <list>
                <value>java.util.UUID</value>
                <value>CacheJobQueueEntry</value>
            </list>
        </property>
    </bean>

The relevant ignite configuration looks like this:

    <property name="networkTimeout" value="60000" />
    <property name="networkSendRetryCount" value="10" />
    <property name="failureDetectionTimeout" value="100000" />
    <property name="clientFailureDetectionTimeout" value="100000" />

It sounds like partition map exchange is running far too slowly in your cluster. Could you please share the logs and a few consecutive thread dumps taken while the application hangs?

– antkr
Sep 10 '18 at 10:10

@DonTequila How many nodes in the cluster do you have? How much data are you storing? Is persistence to disk enabled?

– Dmitriy
Sep 11 '18 at 18:45

@Dmitriy Sorry for leaving out all this information. These errors already occurred with only 3 nodes. Persistence is enabled. I think I found the cause: I was querying a cache with lots of data using SqlQuery, which took quite a long time on every cache update. Now I'm using SqlFieldsQuery with only a few columns, and the whole cluster is much faster. Data throughput was about 500mb/s on average before and is now only 40kb/s. I will keep an eye on whether the error still happens. Thanks!

– DonTequila
Sep 11 '18 at 21:14

1 Answer

See the following issue:

https://issues.apache.org/jira/browse/IGNITE-6980

Also try setting

<property name="atomicityMode" value="ATOMIC"/>

on the cache and using asynchronous cache operations. It might help to reduce the issue described above.
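The warning in the question shows timeout=0, meaning the transaction has no timeout and can wait indefinitely. A minimal sketch of bounding it through TransactionConfiguration follows. The values are assumptions chosen for illustration, not recommendations, and txTimeoutOnPartitionMapExchange is only available from Ignite 2.5 onwards, as far as I know.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="transactionConfiguration">
        <bean class="org.apache.ignite.configuration.TransactionConfiguration">
            <!-- Assumed value: time out any transaction that is still open after 30 s. -->
            <property name="defaultTxTimeout" value="30000"/>
            <!-- Assumed value: timeout enforced on open transactions once a
                 partition map exchange (topology change) starts. -->
            <property name="txTimeoutOnPartitionMapExchange" value="20000"/>
        </bean>
    </property>
</bean>
```

With a timeout in place, the blocked cache operation should fail with a transaction timeout exception instead of hanging, and the stack trace of that exception may help identify which operation in the application code was involved.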
