System outages at healthcare institutions caused by hardware failure, not lack of manpower: Janil Puthucheary
The two IT system outages on Aug 27 and Sep 5 significantly impacted the operations of public healthcare institutions, said Dr Puthucheary.
SINGAPORE: The two IT system outages at healthcare institutions last month and earlier this month were caused by failures of hardware devices in data centres, not a lack of manpower, said Senior Minister of State for Health Janil Puthucheary in Parliament on Monday (Sep 12).
Dr Puthucheary was responding to questions by MPs on the outages on Aug 27 and Sep 5.
Patients at some polyclinics had their appointments delayed or rescheduled in the Aug 27 incident, according to a report by The Straits Times.
Member of Parliament (MP) Tan Wu Meng (PAP-Jurong GRC) asked for a breakdown of the patients affected by the outage, its causes, and how the Integrated Health Information Systems (IHiS) compared against international healthcare institutions, while MP Wan Rizal Wan Zakariah (PAP-Jalan Besar GRC) asked about the safeguards in place to prevent similar disruptions in the future.
Detailing the incidents, Dr Puthucheary said that both had caused a "significant impact on operations".
"On both 27th August and 5th September, our affected public healthcare institutions activated their downtime procedures and business continuity plans to keep operations running using alternative systems and in some cases manual documentation.
"These business continuity plans are exercised regularly, and staff were able to switch processes to sustain operations during the outage. But they had to work doubly hard to keep healthcare operations running," said the Senior Minister of State.
In the earlier incident, the public healthcare monitoring systems detected IT network connectivity failures from 7am. The faults were rectified and systems restored by 10.45am the same day. In total, 26 IT applications, including electronic medical records, appointment, pharmacy and laboratory systems, were affected.
A total of 17 public healthcare institutions, including community hospitals, specialist outpatient clinics and all polyclinics, were impacted.
For the second incident on Sep 5, another fault occurred in the IT infrastructure at 10am. Partial functionality was restored from 1pm that day, while full functionality was only restored by 6pm the next day. The disruption affected eight public healthcare institutions and two out of three polyclinic groups, according to Dr Puthucheary.
Due to the outages, some patients faced longer waiting times of up to an hour at affected institutions, while others had their appointments rescheduled. There were also delays in dispensing medication.
"Fortunately, there was no compromise to urgent care services across the institutions during the IT disruptions. Nobody was turned away from the emergency departments, or denied urgent care," he said.
CAUSES OF THE OUTAGES
Investigations into the incidents showed that the main causes were failures of hardware devices in data centres, said Dr Puthucheary.
In the days leading up to the Aug 27 incident, two nodes (hardware devices which operate in tandem) in firewall zones at data centres failed.
The system usually has other nodes to manage the load of data traffic and continue service operations if one fails. On Aug 27, engineers tried to restore the two failed nodes, but the operation failed. This caused the cluster of firewall nodes to malfunction, which in turn caused the outage.
Dr Puthucheary said that the engineers worked to reset the systems to their prior state, without the function of the two affected nodes, and service was progressively restored.
"The failure of the nodes was caused by bugs in the firmware of the devices. They have since been identified by the manufacturer, Cisco, and the devices have been patched," he added.
The Sep 5 incident was caused by "the simultaneous failure of two further nodes, again from the same manufacturer, and of the same model", noted Dr Puthucheary.
The way the failure occurred was different from the previous incident, and more time was needed to restore operations. The cause of this failure is still under investigation.
"There was a suggestion in one of the questions from members that the failures may be due to the lack of manpower of IHiS. IHiS has a headcount of 3,500 personnel, they have a lot to do and will always welcome more manpower, but a lack of manpower is not the cause of these failures," said the Senior Minister of State.
He added that investigations had found no indications of a security compromise to the affected systems.
In response to Dr Tan's question on whether systems were benchmarked against "best-in-class" systems elsewhere, Dr Puthucheary said: "There are service level agreements about the uptime availability as well as the sort of user interface usability for the products that IHiS manages, and these are indeed benchmarked against best in class around the world."
MP He Ting Ru (WP-Sengkang GRC) asked what assistance was given to affected frontline staff, and if they were provided with training to deal with such circumstances.
In reply, Dr Puthucheary said that support from the Ministry of Health and IHiS centred on "communication and providing clear information about what has happened, what are the expected steps taken to restore functionality, (and) how much time will be required".
The healthcare institutions also mobilised staff to deal with the outages and the extra processes involved, said Dr Puthucheary.
"Is training provided? Yes, business continuity plans ... disaster recovery plans are drilled regularly in the healthcare units and teams and are part of standard training for all the healthcare workers that are in our public healthcare sector. And the training is updated on a regular basis.
"This is not something that is standardised in every team and in every unit across the healthcare ecosystem because these are peculiar to the operations and flows within each clinical team."