How should federal agencies prioritize vulnerabilities?

Your guess is as good as mine...

Nov 11, 2022

Just last week, I released an in-depth post regarding the U.S. government’s vulnerability management practices. To summarize it, I found that there are at least eight different vulnerability prioritization methods in federal use and recommended consolidating the various methods into a single quantitative standard.

Well, either no one at the Cybersecurity and Infrastructure Security Agency (CISA) saw the piece or they did…and completely ignored me.

The reason I say this is that yesterday CISA released their own blog post and white paper, which add even more confusion to the situation (Note: please see Chris Hughes’ lightning-fast and in-depth article on the topic for an analysis without my editorial comments).

In addition to making it unclear which standard(s) agencies under CISA’s purview should use and under what circumstances, the new Stakeholder-Specific Vulnerability Categorization (SSVC) is itself extremely vague and unhelpful in driving risk decision-making and tradeoff evaluations.

I’ll recap the key guidance from CISA (all of it still active) over the past few years that makes clear how fractured the situation is.

Key statements

Cyber Hygiene leverages the Common Vulnerability Scoring System (CVSS), which is a vulnerability scoring system designed to provide a universally open and standardized method for rating IT vulnerabilities
Critical vulnerabilities [detected on the agency’s Internet-accessible systems] must be remediated within 15 calendar days of initial detection…[h]igh vulnerabilities must be remediated within 30 calendar days of initial detection.

- CISA Binding Operational Directive (BOD) 19-02

[T]he Common Vulnerability Scoring System (CVSS) base score does not account for if the vulnerability is actually being used to attack systems. Our experts have observed that attackers do not rely only on “critical” vulnerabilities to achieve their goals; some of the most widespread and devastating attacks have included multiple vulnerabilities rated “high”, “medium”, or even “low”.
Known exploited vulnerabilities [KEV] should be the top priority for remediation…this Directive is intended to help agencies prioritize their remediation work; it does not release them from any of their compliance obligations, including the resolution of other vulnerabilities [i.e. “critical” and “high” items per the CVSS as required by BOD 19-02].
The [KEV] catalog will list exploited vulnerabilities that carry significant risk to the federal enterprise with the requirement to remediate within 6 months for vulnerabilities with a Common Vulnerabilities and Exposures (CVE) ID assigned prior to 2021 and within two weeks for all other vulnerabilities.

- Cybersecurity and Infrastructure Security Agency (CISA) Binding Operational Directive (BOD) 22-01

The Common Vulnerability Scoring System (CVSS) is widely misused for vulnerability prioritization and risk assessment, despite being designed to measure technical severity. Furthermore, the CVSS scoring algorithm is not justified, either formally or empirically. Misuse of CVSS as a risk score means you are not likely learning what you thought you were learning from it, while the formula design flaw means that the output is unreliable regardless. Therefore, CVSS is inadequate.

- Jonathan M. Spring, Eric Hatleback, Allen Householder, Art Manion, and Deana Shick, authors of the original Stakeholder-Specific Vulnerability Categorization (SSVC) underlying CISA’s version, in their article “Towards Improving CVSS”

[W]e must help organizations more effectively prioritize vulnerability management resources through use of Stakeholder Specific Vulnerability Categorization (SSVC)…CISA encourages every organization to use a vulnerability management framework that considers a vulnerability’s exploitation status, such as SSVC.

- Eric Goldstein, Executive Assistant Director for Cybersecurity, Cybersecurity and Infrastructure Security Agency

The four SSVC scoring decisions, described in this guide, outline how CISA messages out patching prioritization….[w]hen CISA becomes aware of a vulnerability, there are four possible decisions.

- CISA Stakeholder-Specific Vulnerability Categorization Guide

Analysis of CISA’s position

Because as of today there is nothing superseding the aforementioned BODs, CISA’s stance can currently be summarized as the following:

You must remediate KEVs published in 2020 or earlier within six months of identification - or two weeks for those published in 2021 or later - but also all known internet-facing CVSS “Criticals” (even though we acknowledge attackers do not rely on these) within 15 days of detection and all CVSS “Highs” within 30 days. Additionally, we suggest - or will “message[] out” - that you remediate vulnerabilities in accordance with the below flow chart, which is based on a system that originated as a critique of the CVSS:

In addition to the confusion described above, the proposed new system is quite flawed even when taken by itself.
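
To make the overlap between the two BODs concrete, here is a minimal sketch of the deadline logic as I summarized it above. It is my own illustration, not anything CISA publishes: the `Finding` fields and the `applicable_deadlines` function are hypothetical names, "6 months" is approximated as 182 days, the clock is assumed to start on a single `start` date, and I assume both BOD 19-02 timelines apply only to internet-facing systems.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 182 days for illustration only.
SIX_MONTHS = timedelta(days=182)


@dataclass
class Finding:
    """Hypothetical record for one vulnerability on one system."""
    cve_year: int          # year the CVE ID was assigned
    in_kev: bool           # listed in the KEV catalog
    internet_facing: bool  # on an Internet-accessible system
    cvss_rating: str       # "critical", "high", "medium", or "low"
    start: date            # assumed start of the remediation clock


def applicable_deadlines(f: Finding) -> dict:
    """Return every deadline that applies, keyed by the directive imposing it."""
    deadlines = {}
    if f.in_kev:
        # BOD 22-01: 6 months for CVE IDs assigned before 2021, else two weeks.
        window = SIX_MONTHS if f.cve_year < 2021 else timedelta(weeks=2)
        deadlines["BOD 22-01"] = f.start + window
    if f.internet_facing:
        # BOD 19-02: 15 days for CVSS criticals, 30 days for highs.
        if f.cvss_rating == "critical":
            deadlines["BOD 19-02"] = f.start + timedelta(days=15)
        elif f.cvss_rating == "high":
            deadlines["BOD 19-02"] = f.start + timedelta(days=30)
    return deadlines


# An older known exploited vulnerability that is also an internet-facing critical.
old_kev = Finding(2019, True, True, "critical", date(2022, 11, 11))
print(applicable_deadlines(old_kev))
```

Under these assumptions, the older KEV picks up two deadlines at once, and the CVSS-driven one comes due months before the KEV-driven one, even though KEVs are supposed to be the top priority.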

Highlights from the CISA SSVC

Timelines and remediation actions

This is where I have the biggest problem, because although the SSVC advertises itself as being action-driven, these proposed outcomes are extremely vague and make no reference to the traditional and well-understood risk management actions.

Track

The vulnerability does not require action at this time. The organization would continue to track the vulnerability and reassess it if new information becomes available. CISA recommends remediating Track vulnerabilities within standard update timelines.

Track*

The vulnerability contains specific characteristics that may require closer monitoring for changes. CISA recommends remediating Track* vulnerabilities within standard update timelines.

Attend

The vulnerability requires attention from the organization's internal, supervisory-level individuals. Necessary actions may include requesting assistance or information about the vulnerability and may involve publishing a notification, either internally and/or externally, about the vulnerability. CISA recommends remediating Attend vulnerabilities sooner than standard update timelines.

Act

The vulnerability requires attention from the organization's internal, supervisory-level and leadership-level individuals. Necessary actions include requesting assistance or information about the vulnerability, as well as publishing a notification either internally and/or externally. Typically, internal groups would meet to determine the overall response and then execute agreed upon actions. CISA recommends remediating Act vulnerabilities as soon as possible.

Here are some questions:

What are “standard update timelines”?
What does “requesting assistance or information about the vulnerability” mean?
What is “as soon as possible”?
How should “internal groups…meet to determine the overall response and then execute agreed upon actions”?
Why are there both “Track” and “Track*” (which I will term “trackstar” going forward) categories? That seems bound to cause a lot of confusion! Especially since there is no difference in the recommended action for addressing them, these categories appear duplicative.

None of these questions are answered in any detail in the white paper.

Exploitability

This is the most useful part of the document, and it provides relatively clear guidelines for how to classify the exploitability of vulnerabilities. These criteria include:

(State of) Exploitation, which represents the current or likely future exploitation of a given vulnerability in the wild.
Technical Impact, which the CISA white paper erroneously describes as being similar to the CVSS base score (which FIRST describes as measuring severity, not risk). The CVSS base score includes both likelihood and impact factors, but the SSVC’s “technical impact” is in fact more similar to the CVSS privileges required and user interaction attributes.
Automation, which refers to whether the attacker can execute the entire kill chain without human intervention.

These categories are mutually exclusive and collectively exhaustive for the most part, but would greatly benefit from attaching a numerical probability of exploitation to each situation. The Exploit Prediction Scoring System (EPSS) does this based on empirical data, and I would recommend using it unless you have something better (e.g. proprietary tool or data set).
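
As a rough sketch of what attaching a probability buys you, the snippet below ranks vulnerabilities by EPSS score multiplied by an estimated loss. Everything here is an assumption for illustration: the `Vulnerability` class, the placeholder identifiers, the EPSS values, and the dollar figures are all hypothetical, and multiplying probability by loss is just one simple way to combine them.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    cve_id: str               # placeholder identifier, not a real CVE
    epss: float               # EPSS probability of exploitation (0 to 1)
    loss_if_exploited: float  # hypothetical loss estimate, in dollars


def expected_loss(v: Vulnerability) -> float:
    """Probability of exploitation times the estimated loss if exploited."""
    return v.epss * v.loss_if_exploited


def rank(vulns):
    """Sort vulnerabilities from highest to lowest expected loss."""
    return sorted(vulns, key=expected_loss, reverse=True)


# Hypothetical inputs: a likely-but-cheap issue and an unlikely-but-costly one.
findings = [
    Vulnerability("CVE-EXAMPLE-A", epss=0.5, loss_if_exploited=10_000),
    Vulnerability("CVE-EXAMPLE-B", epss=0.05, loss_if_exploited=1_000_000),
]

for v in rank(findings):
    print(v.cve_id, expected_loss(v))
```

With these made-up numbers, the less likely but far costlier vulnerability ranks first, which is a tradeoff a purely categorical label cannot express.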

Impact

Finally, the CISA SSVC introduces an extremely complex qualitative system for evaluating impact that categorizes “public well-being impact” as being “minimal,” “material,” and “irreversible.” Merging that with a second axis of “mission prevalence,” constituting “minimal,” “support,” and “essential” functions, it gets you back to an outcome of, you guessed it…“high,” “medium,” and “low.”

While the CISA white paper gives examples of what might constitute each of these categories, they are remarkably vague. For example, it defines “irreversible” impact as situations where one or both of the following are true:

Multiple fatalities are likely.
The cyber-physical system, of which the vulnerable componen[t] is a part, is likely lost or destroyed.

So if a single connected robot arm in a smart factory is destroyed in a cyber attack, that is the same as multiple people likely dying? Those two events seem to be entirely different orders of magnitude. Additionally, what is the definition of “multiple fatalities”? Two? One thousand? The unfortunate reality is that it is only logical to treat these outcomes as being quite different.

Digging more deeply into this system, the comparisons become even more egregious. As Douglas Hubbard and Richard Seiersen note in their book How to Measure Anything in Cybersecurity Risk, one of the many flaws of qualitative two-dimensional matrices is that they often suggest ludicrous prescriptions. And CISA does not disappoint here.

Suppose, for example, that the Department of Labor (DoL) has a vulnerable system in place that supports a certain mission essential function (MEF). MEFs are those “directly related to accomplishing the organization’s mission as set forth in its statutory or executive charter.” In this case, this system allows the Secretary of Labor to ensure that “scrap paper balers and paper box compactors…be at least as protective of the safety of minors as the standard described” in another provision of the Fair Labor Standards Act of 1938.

I would say that the Department must perform this task “during a disruption to normal operations,” as injury to minors could potentially result if not. Without the vulnerable system in place, however, presume the Department will completely fail in its enforcement mission. It’s certainly conceivable that the DoL scrap paper baler/compactor program runs entirely on a single legacy system, so I am good with this assumption.

Based on the CISA definition, this is an “essential” level “mission prevalence decision value.”

Looking at the “public well-being decision values,” it seems fair to say that a total outage of such a system would introduce “occupational safety hazards.” This outcome would be rated as “material.”

So, as a result, this outage at the Department of Labor due to a vulnerability would be rated as “high” impact according to the SSVC.

Now, assume that there is a vulnerability in the Centers for Disease Control (CDC) laboratory in Atlanta, Georgia that handles one of the few known remaining smallpox samples in the world. If this vulnerability is exploited, the negative pressure system in the facility will fail. I would generously characterize this as a “support” mission prevalence scenario, although it could certainly be argued that the system “directly provides capabilities that constitute at least one MEF” and is thus “essential.”

Now, there is no guarantee that the smallpox will escape, but it’s fair to say the risk of it doing so would constitute an “immediate public health threat” and thus be “irreversible.” And given that the world just suffered a pandemic that might very well have resulted from a similar lab leak, this isn’t a crazy scenario to contemplate.

Thus, this second situation would also be rated as “high” according to the SSVC.

So, according to this system, the complete disruption of an occupational labor safety program is equivalent in outcome to the resurgence of one of the worst diseases ever known to man.

Any reasonable observer would see that these situations are in fact many orders of magnitude different.
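
The collapse described above can be shown in a few lines. This is a minimal sketch that encodes only the two matrix cells discussed in this post, with the ratings as I read them from the CISA guide. The `IMPACT_MATRIX` dictionary and `combined_impact` function are hypothetical names of my own, and the rest of the matrix is deliberately left out.

```python
# Only the two cells walked through in this post are encoded here;
# the remaining cells of the CISA matrix are intentionally omitted.
IMPACT_MATRIX = {
    # (mission prevalence, public well-being impact): combined rating
    ("essential", "material"): "high",
    ("support", "irreversible"): "high",
}


def combined_impact(mission_prevalence: str, public_wellbeing: str) -> str:
    """Look up the combined rating for one of the two encoded scenarios."""
    return IMPACT_MATRIX[(mission_prevalence, public_wellbeing)]


# Department of Labor baler/compactor program outage.
dol_rating = combined_impact("essential", "material")
# CDC smallpox lab negative pressure failure.
cdc_rating = combined_impact("support", "irreversible")

print(dol_rating, cdc_rating, dol_rating == cdc_rating)
```

Once both scenarios are reduced to the same label, nothing downstream of the matrix can tell them apart.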

Conclusion

In addition to such outlandish outcomes, there are other problems with the system.

As I have said before, speaking in generalities makes cost/benefit analyses impossible and does not help you answer important questions. For example, to fix an “Act” vulnerability, should we:

Ground an aircraft about to take off for a combat mission?
Remove a medical device embedded in a human patient?
Shut down the power grid for an entire city?

There is no way to answer using the SSVC. The only way these decisions get made is in the heat of the moment, when people at the “sharp end” are stressed out, sleep deprived, and not functioning at their cognitive best.

To counter this inherently problematic decision making mode, quantitative analysis ahead of time is vital. I have proposed one such way to perform it, and there are many others available. Thus, I remain bewildered by CISA’s continued advocacy for ineffective qualitative tools.

I doubt CISA incorporates what I write into their decision-making, but if they do, then this new guidance is a pretty good troll, tbh. Instead of consolidating existing guidance into any sort of coherent quantitative system, they in fact layered on an even more complex qualitative one. One which may or may not be binding on government agencies!

As we approach the year 2023, I am somewhat saddened yet unsurprised to see such confusion coming out of federal cybersecurity leadership. Maybe I have lost my audience (on the government side at least), but I’ll make the offer yet again: I am here to help.

Deploy Securely