
Report Title

Precision Portal Intermittent Connection Error Requiring Refresh – Final Update

Mark McAllister CTO

1. Introduction
2. Incident / Symptoms
3. Root Cause
4. Conclusions / Outcome

Confidential Page 1 21-Oct-19


1. Introduction
This Incident Report covers major or recurring incidents, the outcome of root cause analysis, and any
remedial actions taken or to be taken.

Service Impact: Major

Initial Incident Notification: 05/07/2019

Notification of Incident Resolution Concluded: 08/09/2019

2. Incident / Symptoms
When users accessed the Precision login pages on the Backoffice portal domain, the page would intermittently
show a connection refused error (ERR_CONNECTION_REFUSED).

When this occurred, the user could either click reload (which required a new login) or click the refresh
button followed by the continue button. Once the connection was restored, the user was returned to where they
would have been had the connection not dropped.

3. Root Cause

The error was intermittent. Since it was first reported, we have taken the following steps.

1. 05/07/2019 – 07/08/2019 On initial reporting, we carried out the first remedial actions: clearing the
cache and checking logs to identify any issues that might be causing the problem. Over the following month
we carried out the previously reported actions:

   a. Re-route the network infrastructure to mitigate any risk there (Completed)
   b. Move the sites from one webserver to another (Completed)
   c. Replace the webservers with new hardware (Completed)
   d. Completely reroute the network via another entry point within Rackspace (Completed)
   e. Prepare a further contingency if c. or d. do not work
      i. Prepare a webserver in another of our UK datacentre locations (Completed but not used, as the
         issue was resolved)
      ii. Migrate the sites to that location (Not required)

2. 07/08/2019 – 03/09/2019 (system stability gradually improved) – The issues began to be mitigated
(but not completely removed) by the work of the multi-skilled team assembled by Giant, which included
engineers from Rackspace, Cisco, Wintel and Microsoft. Pressure was still being applied to Rackspace
to get the Datapipe 1 and Metro 1 routing resolved. They continued to advise that we should continue
all the component replacement we had begun.

3. 03/09/2019 – 08/09/2019 Rackspace carried out remedial works and then replaced Datapipe 1. When
this was announced, Giant stopped all further works, and there has been NO recurrence of the
intermittent issues. The decision was made to keep all activity on hold to deliver a period of
stability to our clients.

4. 18/10/2019 – There continue to be NO instances of the intermittent issues that we and our clients
had been experiencing. We will now begin scheduling the system upgrade activities necessary for
proper maintenance of the infrastructure. Clients will be notified as and when these activities
affect them.

4. Conclusions / Outcome

As soon as our supplier carried out the remedial works that we had been requesting over a protracted period,
we suspended all further actions of our own. Together with the system amendments we had already completed,
this has delivered a period of stability.

We now plan to resume the necessary upgrades and transfers that form part of our standard upgrade plans.
We will notify clients whenever an action may affect system operation or cause downtime.

If there are any further questions, please do not hesitate to contact me directly. My mobile number is
07801 369022.

Mark McAllister
Chief Technology Officer
Giant Precision
