  1. GS-13631

When a transaction aborts due to network failure, it may not release the participating entries immediately

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Medium
    • Resolution: Fixed
    • Affects versions: None
    • Fix versions: 14.0, 12.3.1-patch2
    • Labels:
      None
    • Platform:
      All
    • SalesForce Case ID:
      11523
    • Acceptance Test:
       
    • Sprint:
    • Product:
      XAP
    • Edition:
      Open Source

      Description

      After getting the following exception:
      StackTrace: net.jini.core.transaction.TransactionException: Transaction was disconnected due to communication fault: ServerTransaction [id=1729275, manager=TxnMgrProxy [proxyId=ddface01-624d-4f9e-a8ef-9c4ce2292fe3, isDirect=false]]
      at com.j_spaces.core.transaction.TransactionHandler.checkTransactionDisconnection(TransactionHandler.java:498)
      at com.gigaspaces.internal.server.space.SpaceImpl.readMultiple(SpaceImpl.java:2038)
      at com.gigaspaces.internal.server.space.operations.ReadTakeEntriesSpaceOperation.execute(ReadTakeEntriesSpaceOperation.java:41)
      at com.gigaspaces.internal.server.space.operations.ReadTakeEntriesSpaceOperation.execute(ReadTakeEntriesSpaceOperation.java:32)
      on the server side, we expect the transaction to be rolled back and/or an exception-handling hook to be triggered.
      In fact, no further processing takes place until the transaction times out.
      This issue is related to GS-13284.


      Steps to manually reproduce:
      1. Apply the attached patch.
          The patch adds a counter field to SpaceImpl that is incremented every time a take under a transaction is performed.
          In the finally clause of readMultiple (line 2132) we throw a RemoteException to simulate the short disconnection
          (it should be thrown only when counter == 1).
          The transaction lease is hard-coded to one hour to allow for debugging.
      2. Fill the space with FIFO objects and run a FIFO polling container.
      3. The first take operation triggers the RemoteException.
      4. The second take operation causes the com.j_spaces.core.transaction.TransactionHandler#checkTransactionDisconnection method to throw a TransactionException and call abort internally.
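
      The fault injection described in step 1 can be sketched roughly as follows. This is not the attached patch: FaultInjectingSpace, txnTakeCount and takeUnderTxn are hypothetical names used only for illustration, and the real change lives inside SpaceImpl.readMultiple. The sketch only shows the idea of counting takes and throwing a RemoteException from the finally clause on the first one.

      ```java
      import java.rmi.RemoteException;
      import java.util.concurrent.atomic.AtomicInteger;

      /** Hypothetical stand-in for the patched server-side operation; not GigaSpaces code. */
      class FaultInjectingSpace {
          // Counts takes performed under a transaction, as the patch is described to do.
          private final AtomicInteger txnTakeCount = new AtomicInteger();

          /** Simulates a take that completes on the server but fails once, on the first call only. */
          String takeUnderTxn(String entry) throws RemoteException {
              int count = txnTakeCount.incrementAndGet();
              try {
                  return entry; // the operation itself succeeds on the server
              } finally {
                  if (count == 1) {
                      // Throwing here replaces the normal return, so the caller
                      // sees a failure although the work was already done.
                      throw new RemoteException("simulated short disconnection");
                  }
              }
          }
      }
      ```

      With this sketch, the first call to takeUnderTxn fails with a RemoteException and every later call returns normally, which mirrors steps 3 and 4.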


      NOTE: The bug occurs only when the first operation of the transaction throws a RemoteException. The operation is performed on the server, but as far as the client knows no entries were locked in the space, so the client calls disjoin.
      As a result, the org.openspaces.events.polling.SimplePollingEventListenerContainer#rollbackOnException method, which is called when a TransactionException is thrown, has no effect (there are no participants to call abort on).
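
      The note above can be illustrated with a toy model. ToyServer and ToyClientTxn are hypothetical classes, not XAP APIs, and the model assumes a much simpler join/disjoin bookkeeping than the real client. It only shows why a rollback that walks an empty participant list leaves the server-side lock in place.

      ```java
      import java.rmi.RemoteException;
      import java.util.HashSet;
      import java.util.Set;

      /** Hypothetical server: locks the entry, then fails the first call after the work is done. */
      class ToyServer {
          final Set<String> lockedEntries = new HashSet<>();
          private boolean failNext = true;

          String take(String entry) throws RemoteException {
              lockedEntries.add(entry); // the lock is taken on the server
              if (failNext) {
                  failNext = false;
                  throw new RemoteException("simulated disconnection after the operation ran");
              }
              return entry;
          }

          void abort() {
              lockedEntries.clear(); // releases everything held by the transaction
          }
      }

      /** Hypothetical client-side transaction that tracks which servers it joined. */
      class ToyClientTxn {
          final Set<ToyServer> participants = new HashSet<>();

          String take(ToyServer server, String entry) throws RemoteException {
              boolean joinedBefore = !participants.add(server); // join
              try {
                  return server.take(entry);
              } catch (RemoteException e) {
                  if (!joinedBefore) {
                      participants.remove(server); // disjoin: client assumes nothing was locked
                  }
                  throw e;
              }
          }

          void rollback() {
              for (ToyServer p : participants) {
                  p.abort(); // no-op when the participant set is empty
              }
          }
      }
      ```

      If the first take fails, the participant set is empty again, so rollback() never reaches the server and the entry stays locked. In the real system the lock is then held until the transaction lease expires.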
          
          
