Space failover might take an extended period of time (above 60 sec)

Description

Setup:
Running with 3 PUs, one of which is deployed with 14 partitions.
4 physical hosts in total. 2 LUS running on the hosts which were not killed.

Customer was able to reproduce the long failover with 3 spaces: 2 spaces have only one partition each and one space has 14 partitions. With backups this makes 32 stateful PUIs (16 primaries plus 16 backups). Failover sometimes takes over 60 sec.

Analysis:
EventExpireThread holds a given lock for an extended period of time (> 60 sec), which prevents another thread from signaling the waiting connection threads.
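
The contention pattern described in the analysis can be illustrated with a minimal, standalone Java sketch. It is not the product code: `LockHoldDemo`, `measureSignalDelayMillis`, and the thread roles are hypothetical names, and the use of a `ReentrantLock` with a `Condition` is an assumption made only for illustration. One thread keeps the lock for a configurable time (standing in for EventExpireThread), and the sketch measures how long the signalling thread is blocked before it can wake the waiter.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names come from the product code.
final class LockHoldDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private boolean signalled = false; // guarded by lock

    /** Returns how long (ms) the signalling thread was blocked waiting for the lock. */
    static long measureSignalDelayMillis(long holdMillis) {
        LockHoldDemo d = new LockHoldDemo();
        CountDownLatch waiterHasLock = new CountDownLatch(1);
        CountDownLatch holderHasLock = new CountDownLatch(1);

        // Stands in for a connection thread waiting to be signalled.
        Thread waiter = new Thread(() -> {
            d.lock.lock();
            try {
                waiterHasLock.countDown();
                while (!d.signalled) {
                    d.ready.awaitUninterruptibly(); // releases the lock while waiting
                }
            } finally {
                d.lock.unlock();
            }
        });

        // Stands in for the long-running thread that keeps the lock.
        Thread holder = new Thread(() -> {
            d.lock.lock();
            try {
                holderHasLock.countDown();
                Thread.sleep(holdMillis); // simulated long work under the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                d.lock.unlock();
            }
        });

        try {
            waiter.start();
            waiterHasLock.await();
            holder.start();          // acquires the lock once the waiter parks
            holderHasLock.await();

            long start = System.nanoTime();
            d.lock.lock();           // the signaller blocks here until the holder is done
            long blockedNanos = System.nanoTime() - start;
            try {
                d.signalled = true;
                d.ready.signalAll();
            } finally {
                d.lock.unlock();
            }
            waiter.join();
            holder.join();
            return TimeUnit.NANOSECONDS.toMillis(blockedNanos);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("signaller blocked for ~" + measureSignalDelayMillis(200) + " ms");
    }
}
```

In this sketch the signaller's delay tracks the hold time almost exactly, so a hold of more than 60 sec would delay the waiting threads by the same amount.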

Workaround

None

Acceptance Test

Reproduction test case

Status

Assignee

Meron Avigdor

Reporter

Meron Avigdor

Labels

Priority

Major

SalesForce Case ID

10562

Fix versions

Commitment Version/s

None

Due date

None

Product

None

Edition

None

Platform

All

Sprint

None