The Space instance identifies that another instance has been elected as primary and changes its state to stopped.
2018-03-21 13:42:30,280 data-processor.72  INFO [com.gigaspaces.space.active-election.space.72] - terminating - another space has been elected as primary [72_1]
2018-03-21 13:42:30,363 data-processor.72  INFO [com.gigaspaces.space.space.72] - Space Stopped successfully
The Grid Service Manager (GSM) suspects that the instance has failed, but because it is not yet actively managing the instance, it does not forcefully terminate it.
2018-03-21 13:42:34,440 GSM WARNING [org.openspaces.pu.container.servicegrid.PUFaultDetectionHandler] - Suspecting failure of service: [data-processor.72 ] pid host[ip-172-30-0-227.eu-west-1.compute.internal/172.30.0.227] - RTT[7.7 ms]. Retrying to reach service.; Caused by: com.j_spaces.core.SpaceUnhealthyException: Space is in unhealthy state: Space [space_container72:space] is in stopped state.
terminating - another space has been elected as primary [72_1]
We noticed that the time between the GSM being granted leadership and the time it started actively managing the services was longer than the time we wait before forcefully terminating an unhealthy instance (about 3 min vs. a 1 min wait).
2018-03-21 13:42:29,118 GSM INFO [com.gigaspaces.grid.gsm.leader] - Granted leadership
2018-03-21 13:45:51,311 GSM INFO [com.gigaspaces.grid.gsm.leader] - Actively managing: [data-processor, data-feeder]
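The gap can be checked directly from the two log lines above. The following is a minimal Python sketch, not part of the product: the helper name `parse_ts` and the constant `FORCE_TERMINATE_WAIT` are made up for illustration, the timestamp layout is assumed from the log excerpts, and the 1 min value is the wait mentioned in this report.

```python
from datetime import datetime, timedelta

# Assumed layout of the timestamp prefix seen in the log lines above.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
# Wait before forceful termination, as stated in this report (1 min).
FORCE_TERMINATE_WAIT = timedelta(minutes=1)


def parse_ts(line: str) -> datetime:
    """Parse the leading 23-character timestamp of a log line."""
    return datetime.strptime(line[:23], LOG_TS_FORMAT)


granted = ("2018-03-21 13:42:29,118 GSM INFO "
           "[com.gigaspaces.grid.gsm.leader] - Granted leadership")
active = ("2018-03-21 13:45:51,311 GSM INFO "
          "[com.gigaspaces.grid.gsm.leader] - Actively managing: "
          "[data-processor, data-feeder]")

# Time the GSM spent between being granted leadership and actively managing.
gap = parse_ts(active) - parse_ts(granted)

print(f"gap: {gap}")
print(f"exceeds wait: {gap > FORCE_TERMINATE_WAIT}")
```

For these two lines the gap is 3 min 22.193 s, well past the 1 min wait, which matches the observation that the instance was never forcefully terminated.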
We should be seeing the following message:
GSM INFO [com.gigaspaces.grid.gsm.services] - Forcefully destroy unhealthy service [data-processor.72 ]
Instances that the GSM began actively managing before the timeout were terminated.
regression tests (disconnect, manager suite)