Shutdown might be pending endlessly for replication to finish

Description

Relocation of a PU instance got stuck. Looking at the log, the customer saw that the shutdown of the original instance hung.
Looking at the code, the customer identified the problem:
AbstractReplicationSourceGroup.flushPendingReplication. The outer loop looks like this:
while (remainingTime >= 0) {

Inside the loop:
if (remainingTime > 0)
try {
...
remainingTime -= sleepTime;

If remainingTime is reduced to exactly 0, it is never decremented again, and the outer loop condition (remainingTime >= 0) stays true, so the loop never exits.
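
The loop shape described above can be reproduced in isolation. The following is a minimal sketch, not the actual GigaSpaces source: the class name FlushLoopSketch, the method names, the LongSupplier standing in for the backlog size check, and the maxIterations guard are all assumptions made for illustration. The real sleep is elided so the sketch finishes instantly. The second method shows one possible correction (a strict > 0 loop condition), which is an assumption about a fix and not the fix that was shipped.

```java
import java.util.function.LongSupplier;

public class FlushLoopSketch {

    /**
     * Loop shape as described in the report (hypothetical reconstruction).
     * Returns the number of iterations executed, or -1 if the loop was
     * still running when the illustrative maxIterations guard was reached.
     */
    static int runReportedLoop(long timeoutMs, long sleepMs,
                               LongSupplier backlogSize, int maxIterations) {
        long remainingTime = timeoutMs;
        int iterations = 0;
        while (remainingTime >= 0) {
            if (iterations == maxIterations) {
                return -1; // guard for the demo only; the reported loop has no such exit
            }
            iterations++;
            if (backlogSize.getAsLong() == 0) {
                break; // nothing left to replicate
            }
            if (remainingTime > 0) {
                // Thread.sleep(sleepMs) would go here; elided in this sketch
                remainingTime -= sleepMs;
            }
            // When remainingTime == 0, nothing changes and the loop spins.
        }
        return iterations;
    }

    /** Same loop with a strict condition, so reaching exactly 0 ends the wait. */
    static int runBoundedLoop(long timeoutMs, long sleepMs,
                              LongSupplier backlogSize, int maxIterations) {
        long remainingTime = timeoutMs;
        int iterations = 0;
        while (remainingTime > 0) {
            if (iterations == maxIterations) {
                return -1;
            }
            iterations++;
            if (backlogSize.getAsLong() == 0) {
                break;
            }
            // Thread.sleep(sleepMs) would go here; elided in this sketch
            remainingTime -= sleepMs;
        }
        return iterations;
    }

    public static void main(String[] args) {
        LongSupplier neverEmpty = () -> 5L; // backlog that never drains
        System.out.println("reported, timeout is a multiple of sleep: "
                + runReportedLoop(1000, 100, neverEmpty, 1000));
        System.out.println("reported, timeout is not a multiple:      "
                + runReportedLoop(1050, 100, neverEmpty, 1000));
        System.out.println("bounded,  timeout is a multiple of sleep: "
                + runBoundedLoop(1000, 100, neverEmpty, 1000));
    }
}
```

In this sketch the hang appears only when the timeout is an exact multiple of the sleep interval and the backlog never empties. With any other timeout, remainingTime skips past 0 to a negative value and the loop exits. That would be consistent with the issue occurring only in some shutdowns.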

Thread dump taken about 15 minutes after the relocation:

"GS-LRMI Connection-pool-1-thread-4" Id=128 RUNNABLE
at com.gigaspaces.internal.cluster.node.impl.backlog.AbstractSingleFileGroupBacklog.size(AbstractSingleFileGroupBacklog.java:1176)
at com.gigaspaces.internal.cluster.node.impl.groups.AbstractReplicationSourceGroup.flushPendingReplication(AbstractReplicationSourceGroup.java:478)
at com.gigaspaces.internal.cluster.node.impl.ReplicationNode.flushPendingReplication(ReplicationNode.java:1034)
at com.gigaspaces.internal.cluster.node.impl.ReplicationNodeAdmin.flushPendingReplication(ReplicationNodeAdmin.java:61)
at com.gigaspaces.internal.server.space.SpaceEngine.waitForConsistentState(SpaceEngine.java:3784)
at com.gigaspaces.internal.server.space.SpaceImpl.beforeShutdown(SpaceImpl.java:1263)
at com.gigaspaces.internal.server.space.SpaceImpl.shutdown(SpaceImpl.java:1214)
at com.j_spaces.core.JSpaceContainerImpl.shutdownInternal(JSpaceContainerImpl.java:1128)

  • locked java.lang.Object@1011675000
    at com.j_spaces.core.JSpaceContainerImpl.shutdown(JSpaceContainerImpl.java:1107)
    at com.j_spaces.core.LRMIJSpaceContainer.shutdown(LRMIJSpaceContainer.java:86)
    at com.gigaspaces.internal.client.spaceproxy.SpaceProxyImpl.shutdown(SpaceProxyImpl.java:300)
    at org.openspaces.core.space.AbstractSpaceFactoryBean.close(AbstractSpaceFactoryBean.java:236)

  • locked org.openspaces.core.space.UrlSpaceFactoryBean@696698257
    at org.openspaces.core.space.AbstractSpaceFactoryBean.destroy(AbstractSpaceFactoryBean.java:212)

Workaround

None

Acceptance Test

NA

Status

Assignee

Reporter

Ester Atzmon

Labels

Priority

Medium

SalesForce Case ID

10682

Fix versions

None

Commitment Version/s

None

Due date

None

Product

None

Edition

None

Platform

All