Incident Report Intermittent Connection Issue 211019 PDF
Author
Mark McAllister CTO
1. Introduction
2. Incident / Symptoms
3. Root Cause
4. Conclusions / Outcome
1. Introduction
This Incident Report covers major or recurring incidents. It sets out the outcome of the root cause
analysis and details any remedial actions taken or to be taken.
2. Incident / Symptoms
When accessing the Precision login pages on the Backoffice portal domain, users would see a
connection-refused error message (ERR_CONNECTION_REFUSED).
When this occurred, the user could either click reload (which would require a new login) or click the refresh
button followed by the continue button. Once the connection was restored, the user would be where they
would have been had the connection not dropped.
3. Root Cause
The error was intermittent. Since it was first reported we have taken a wide range of steps:
1. 05/07/2019 – 07/08/2019 – On initial reporting we carried out the first remedial actions, clearing the
cache and checking logs to identify any issues that might be causing the problem. For the next month we
conducted the previously reported actions.
2. 07/08/2019 – 03/09/2019 (system stability gradually improved) – The issues began to be
mitigated (but not completely removed) by the work of the multi-skilled team
assembled by Giant, which included engineers from Rackspace, Cisco, Wintel and Microsoft.
Pressure was still being applied to Rackspace to get the Datapipe 1 and Metro 1 routing resolved.
They continued to advise that we should continue all the component replacement we had begun.
3. 03/09/2019 – 08/09/2019 – Rackspace carried out remedial works and then replaced Datapipe 1. When
this was announced, Giant stopped all further works, and there has been NO recurrence of the
intermittent issues. The decision was made to continue the stoppage of all activity to deliver a
period of stability to our clients.
4. 18/10/2019 – There continue to be NO instances of the intermittent issues that we and our clients
had been experiencing. We will now begin scheduling the system upgrade activities necessary for
4. Conclusions / Outcome
Alongside the system amendments we had begun and completed, our supplier carried out the remedial
works that we had been requesting over a protracted period. As soon as those works were done, we suspended
all of our further actions, and this has delivered a period of stability.
We are now planning to continue implementing all of the necessary upgrades and transfers that form
part of our standard upgrade plans. We will notify clients as necessary each time an action
may have implications for system operation or downtime.
If there are any further questions, please do not hesitate to contact me directly. My mobile number is
07801 369022.
Mark McAllister
Chief Technology Officer
Giant Precision