Root Cause Analysis
(OBJ 4.8)
Root Cause Analysis (RCA)
- Systematic process to identify the initial source of an incident and prevent it from recurring (occurring again)
Steps in Root Cause Analysis
- Define and Scope the Incident
- Determine the initial cause and scope of the incident
- Understand how many systems/users have been affected and the operational impact
- Example:
- Let's say we have a malware infection inside our organization.
- The malware may have been introduced into our network because somebody plugged a USB thumb drive into their workstation, clicked a link in a spear-phishing campaign, or visited a malicious website.
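As a rough illustration of scoping, the sketch below counts how many distinct systems and users appear in a set of alerts and tallies the suspected entry vectors. The alert records, field names, and the `summarize_scope` helper are all hypothetical; real data would come from whatever detection tooling the organization uses.

```python
from collections import Counter

# Hypothetical alert records; in practice these might come from an EDR or SIEM export.
alerts = [
    {"host": "WS-014", "user": "asmith", "vector": "usb"},
    {"host": "WS-014", "user": "asmith", "vector": "usb"},
    {"host": "WS-022", "user": "bjones", "vector": "phishing"},
    {"host": "WS-031", "user": "asmith", "vector": "usb"},
]

def summarize_scope(records):
    """Count distinct hosts and users, and tally suspected entry vectors."""
    hosts = {r["host"] for r in records}
    users = {r["user"] for r in records}
    vectors = Counter(r["vector"] for r in records)
    return {"hosts": len(hosts), "users": len(users), "vectors": dict(vectors)}

scope = summarize_scope(alerts)
print(scope)
```

Counting distinct hosts and users, rather than raw alerts, avoids overstating the impact when one machine generates many alerts.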
- Determine Causal Relationships
- Identify the causal relationships that led to the incident
- Understand how the incident occurred, such as through malware infection via USB drive or other vectors
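One way to picture causal relationships is as a chain of links from the observed symptom back to the earliest recorded cause. The sketch below is only an illustration under that assumption; the cause descriptions and the `trace_root_cause` helper are made up for this example.

```python
# Hypothetical causal map: each observed effect points to its direct cause.
CAUSES = {
    "malware running on workstation": "infected file executed from USB drive",
    "infected file executed from USB drive": "USB mass storage allowed on workstation",
    "USB mass storage allowed on workstation": "no policy restricting removable media",
}

def trace_root_cause(symptom, causes):
    """Follow cause links from a symptom until no earlier cause is recorded."""
    chain = [symptom]
    seen = {symptom}
    while chain[-1] in causes:
        earlier = causes[chain[-1]]
        if earlier in seen:  # guard against circular links in the map
            break
        chain.append(earlier)
        seen.add(earlier)
    return chain

chain = trace_root_cause("malware running on workstation", CAUSES)
root_cause = chain[-1]
print(" <- ".join(chain))
```

The last entry in the chain is the candidate root cause, which is where the solutions in the next step should be aimed.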
- Identify Effective Solutions
- Find solutions to prevent the incident from recurring
- Solutions may include adding antivirus, restricting data transfer from USB devices, or applying software patches
- Example:
- Adding antivirus software to all hosts
- Ensuring the latest version of antivirus software is installed
- Updating the version of Windows or installing security patches
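The antivirus checks above could be sketched as a simple compliance pass over a host list. This is a minimal sketch: the host records, the minimum version string, and the helper names are assumptions for illustration, not output from any real antivirus product.

```python
# Hypothetical minimum acceptable antivirus version for this example.
MIN_AV_VERSION = "4.18.2"

# Hypothetical host inventory; None means no antivirus was found on the host.
hosts = [
    {"name": "WS-014", "av_version": "4.18.5"},
    {"name": "WS-022", "av_version": "4.9.10"},
    {"name": "WS-031", "av_version": None},
]

def parse_version(text):
    """Turn '4.18.2' into (4, 18, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def remediation_for(host, minimum):
    """Return the action a host needs, or None if it is already compliant."""
    version = host["av_version"]
    if version is None:
        return "install antivirus"
    if parse_version(version) < parse_version(minimum):
        return "update antivirus"
    return None

actions = {h["name"]: remediation_for(h, MIN_AV_VERSION) for h in hosts}
print(actions)
```

Versions are compared as integer tuples because a plain string comparison would wrongly rank "4.9.10" above "4.18.2".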
- Implement and Track Solutions
- Execute the solutions and ensure the incident is fully resolved
- Use change management processes to update systems and configurations
- Look across the network and see if there are any other machines that could have been affected
- Identify the incident's cause and assess how many other elements of the organization's network share similar features
- Example:
- Block mass storage devices from being read on the user's system, and ensure that its antivirus and antimalware solution is up to date.
- Tell the system administrators, through a change management process, to install the registry change that prevents USB devices from being read on that machine, and make sure it is recorded in our asset and change management process.
- Ask, "How many Windows 10 machines do we have?"
- Upgrading all of them may be a very large undertaking.
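To show how the question "how many Windows 10 machines do we have?" feeds change management, the sketch below filters a hypothetical asset inventory and drafts one pending change record per affected machine. The inventory, record fields, and `plan_usb_block` helper are invented for illustration. The registry setting shown (the `Start` value of the `USBSTOR` service set to 4, meaning disabled) is one commonly documented way to block USB mass storage on Windows; treat it as an assumption to verify against your own environment. The code only plans the change and does not touch any registry.

```python
# Commonly documented location of the USB mass storage driver settings (verify before use).
USBSTOR_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\USBSTOR"
USBSTOR_DISABLED = 4  # service start type 4 = disabled

# Hypothetical asset inventory.
assets = [
    {"name": "WS-014", "os": "Windows 10"},
    {"name": "WS-022", "os": "Windows 10"},
    {"name": "WS-031", "os": "Windows 11"},
    {"name": "SRV-002", "os": "Windows Server 2019"},
]

def plan_usb_block(inventory, os_name):
    """Draft one pending change record per asset running the given OS."""
    return [
        {
            "asset": a["name"],
            "key": USBSTOR_KEY,
            "value_name": "Start",
            "value_data": USBSTOR_DISABLED,
            "status": "pending approval",
        }
        for a in inventory
        if a["os"] == os_name
    ]

requests = plan_usb_block(assets, "Windows 10")
print(f"{len(requests)} Windows 10 machines need the change")
```

Keeping each change as a record with a status makes it possible to track whether the solution was actually rolled out everywhere it was needed.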
Benefits of Root Cause Analysis
- Identifies vulnerabilities and weaknesses in security practices
- Creates more robust protections against cyber threats
- Encourages a no-blame culture, focusing on solutions and improvements rather than assigning fault
- No-Blame Approach
- RCA should not assign blame to individuals or teams
- Encourages open and honest reporting to improve cybersecurity practices
- Recognizes that human errors often result from systemic issues within organizations, such as training procedures or regulatory oversight
- Example:
- Two plane crashes involving the Boeing 737 MAX, in October 2018 and March 2019
- An independent, no-blame root cause analysis was conducted by the NTSB
- The goal was to determine what went wrong and why
- Involved multiple experts across various fields, who reviewed factors such as:
- Pilot error
- Weather conditions
- Technical malfunctions
- Maintenance records
- Voice recordings
- It was determined that both crashes were caused by issues in the aircraft's new flight control software
- Defective sensors on the aircraft
- The NTSB is known for recognizing that human errors are often the result of systemic issues within the aviation industry
- Recommendations are given to prevent these issues in the future