This series was born from an interview on the Application Security Podcast, season 5, episode 18. On this episode, Chris and Robert interviewed Steve Springett about the world of the secure supply chain.

In part one of the series, we covered software supply chain risk, the depths of the software composition analysis market, and the current state of commercial and open-source SCA. Read part one first to set the stage on SCA and software supply chain risk.

In part two, we branch into how to choose an enterprise SCA tool. Steve, as a contributor to Dependency-Check and the project lead for Dependency-Track, knows a thing or two about software composition analysis. Steve shared the list of things that he looks for in an SCA tool.

Steve provided a total of eleven things to look for in a Software Composition Analysis solution. We have not seen a list this detailed published anywhere else. It comes from Steve, a trusted insider who understands the technology and shares what he looks for in an SCA tool.

#1: Generate an accurate inventory of third-party code in an application.

Generating an accurate inventory is surprisingly tricky. There are a few different approaches to the inventory problem. There’s the binary approach, where the tool scans the actual binary output searching for components. There is also the manifest approach, where the tool reads a bill of materials (BOM) or a manifest such as package.json and learns from it which components the application contains. Both approaches have their advantages and disadvantages, and a robust tool should be able to use either technique.

To assist with the inventory process, adopt the Package URL (purl) specification. There is currently a core working group defining the specification. It’s a universal way to describe packages regardless of their ecosystem. Instead of specifying Maven GAV coordinates (GroupID:ArtifactID:Version), you have a package URL, and that becomes a universal way to point to a specific package.
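As a rough illustration of the idea, here is a minimal Python sketch that turns Maven GAV coordinates into a package URL of the form pkg:maven/<group>/<artifact>@<version>. The function names are our own, and the sketch ignores optional purl parts such as qualifiers and subpaths.

```python
from urllib.parse import quote


def maven_gav_to_purl(group_id: str, artifact_id: str, version: str) -> str:
    """Build a package URL from Maven coordinates (simplified sketch)."""
    # Percent-encode each segment so characters like "/" cannot break the URL.
    namespace = quote(group_id, safe="")
    name = quote(artifact_id, safe="")
    return f"pkg:maven/{namespace}/{name}@{quote(version, safe='')}"


def gav_string_to_purl(gav: str) -> str:
    """Accept the familiar GroupID:ArtifactID:Version string form."""
    group_id, artifact_id, version = gav.split(":")
    return maven_gav_to_purl(group_id, artifact_id, version)


print(gav_string_to_purl("org.apache.commons:commons-lang3:3.12.0"))
# pkg:maven/org.apache.commons/commons-lang3@3.12.0
```

The same pkg:<type>/... shape applies to other ecosystems, which is what makes the identifier useful across tools.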

#2: Build environment support.

Make sure the solution supports the environments that your build processes rely upon. These include build tools such as Maven, Gradle, and Ant, CI servers such as Jenkins, Azure DevOps, Bamboo, and GitLab CI, and surrounding tools such as Git, Jira, and SonarQube, among many others. The key here is that you have a suite of tools in your build environment and need to ensure that your SCA solution matches up with what you already have.

#3: Incorporate your approach to package management.

Many different repositories exist across the software ecosystem, including RubyGems, Maven Central, npm, NuGet, and PyPI, to name a few. Ensure that your SCA solution supports the repositories that you rely upon for building software.

With component onboarding, ensure that your solution assists you with specifying a golden repository of your blessed set of components, and enforces that list in the package management process.
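To make the golden repository idea concrete, here is a small Python sketch of an allowlist check. The GOLDEN_REPOSITORY contents and the find_unapproved helper are hypothetical; a real solution would enforce this inside the package manager or repository proxy rather than in a script.

```python
# Hypothetical "golden repository": the blessed (name, version) pairs.
GOLDEN_REPOSITORY = {
    ("commons-lang3", "3.12.0"),
    ("jackson-databind", "2.15.2"),
}


def find_unapproved(requested, approved=GOLDEN_REPOSITORY):
    """Return requested (name, version) pairs that are not on the blessed list."""
    return sorted(set(requested) - set(approved))


requested = [("commons-lang3", "3.12.0"), ("left-pad", "1.3.0")]
violations = find_unapproved(requested)
if violations:
    print("Blocked components:", violations)
```

Matching on exact name and version is the strictest form of the policy. An organization might instead approve version ranges, which this sketch does not attempt.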

#4: Analyze individual components for vulnerabilities.

The bread and butter of SCA is the ability to detect vulnerabilities in individual components. Vulnerability detection is the heart of why you buy this type of product. If the solution you are looking at isn’t excelling here, stop here, do not pass go, do not collect $200, and look elsewhere for an answer.

#5: Parse and identify various open-source licenses.

Open-source license tracking and management is a component of managing software supply chain risk. License compliance is a different type of risk from vulnerabilities. You must track which licenses are acceptable in your organization. You could end up being sued for using open-source code without meeting the obligations of its license, such as giving back your changes. The solution you choose should be able to track licenses as well, as license identification happens in the same places as vulnerability detection. The SCA tool already has access to the source of truth for detecting and reporting on licenses.
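A license policy can be as simple as sorting each detected license into allowed, denied, or needs review. The sketch below assumes licenses arrive as SPDX identifiers; the two policy sets are illustrative examples only, not a recommendation for any organization.

```python
# Illustrative policy sets keyed by SPDX license identifier (assumptions).
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}
DENIED_LICENSES = {"AGPL-3.0-only"}


def classify_license(spdx_id):
    """Classify one license against the organization's policy."""
    if spdx_id in DENIED_LICENSES:
        return "denied"
    if spdx_id in ALLOWED_LICENSES:
        return "allowed"
    # Unknown or missing licenses go to a human instead of passing silently.
    return "needs-review"


for license_id in ["MIT", "AGPL-3.0-only", "WTFPL", None]:
    print(license_id, "->", classify_license(license_id))
```

Defaulting unknown licenses to review rather than allowing them keeps the policy conservative when the tool cannot identify a license.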

#6: Analyze and make decisions upon component-specific metadata.

Each software component has several qualities that help SCA tools make the best possible decisions about the risk level for the element.

The age of the component

As the old saying goes, open-source ages like a child’s dirty diaper. The older it gets, the smellier it gets. The age of the component is an essential piece of metadata to track.

Your solution/program should enforce an open-source policy where you determine the acceptable age of components, with a maximum age of three to five years. Deny anything older than that. Security researchers investigate both new and old elements.
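An age policy like this is easy to express in code. The following Python sketch assumes you already know each component version's release date; the five-year default and the function names are our own choices for illustration.

```python
from datetime import date

MAX_AGE_YEARS = 5  # assumed policy ceiling; pick your own within three to five


def age_in_years(released: date, today: date) -> float:
    """Approximate age of a release in years."""
    return (today - released).days / 365.25


def is_too_old(released: date, today: date, max_age_years: float = MAX_AGE_YEARS) -> bool:
    """True when the release is older than the policy allows."""
    return age_in_years(released, today) > max_age_years


today = date(2024, 1, 1)
print(is_too_old(date(2015, 1, 1), today))  # True: about nine years old
print(is_too_old(date(2022, 6, 1), today))  # False: under two years old
```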

The health of the project

The health of the project is determined based on how active the project is. Is the project accepting pull requests? Historically, what types of issues is the project notorious for producing? Common Weakness Enumeration (CWE) is a useful dataset for evaluating the types of weaknesses a project has had over time. If the same kinds of faults are continually coming up, perhaps the project has an architectural problem. An SCA solution needs to measure the health of the projects and factor health into the overall risk level.

Component capability and duplication detection

The solution must consider metadata about the capabilities of a given component to assist in detecting if this component functionally duplicates another element. Take an XML parser as an example. If you are adding an XML parser to an application that already has two, you are unnecessarily increasing the attack surface.

Choosing the best components from fewer suppliers is a best practice that your SCA solution must help you enforce. As applications and component usage grow over time, a development team’s ability to maintain component hygiene throughout that application’s life cycle diminishes. The complexity of maintenance grows with the addition of more and more components. Your solution must help you choose fewer, better-quality components by evaluating the function of each element and avoiding duplication.
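One way to picture duplication detection is to tag each component with the capabilities it provides and then look for capabilities that appear more than once. In this Python sketch, the component names and capability tags are invented for illustration; real tools would need a curated source for this metadata.

```python
from collections import defaultdict


def find_duplicate_capabilities(components):
    """Map each capability to the components providing it, keeping only overlaps."""
    by_capability = defaultdict(list)
    for name, capabilities in components.items():
        for capability in capabilities:
            by_capability[capability].append(name)
    return {
        capability: sorted(names)
        for capability, names in by_capability.items()
        if len(names) > 1
    }


# Hypothetical inventory: three components can all parse XML.
inventory = {
    "parser-a": {"xml-parsing"},
    "parser-b": {"xml-parsing", "xpath"},
    "parser-c": {"xml-parsing"},
    "logger-a": {"logging"},
}
print(find_duplicate_capabilities(inventory))
# {'xml-parsing': ['parser-a', 'parser-b', 'parser-c']}
```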

End of life or end of support status

As components age out, they experience an end of life or end of support milestone. After these milestones, the component will no longer receive any bug fixes. In the world of software components, this sometimes occurs when a part makes a significant version number change due to new functionality. The older 1.x version will have a limited life span, and your SCA solution needs to know when component versions are no longer receiving fixes.

#7: Ingest the BOM formats.

A Bill of Materials (BOM) defines and describes the contents of a deliverable and metadata about the manufacturing and packaging process. In software supply chains, this refers to the contents of all components bundled with the software, including authors, publishers, names, versions, licenses, and copyrights. BOMs are useful for representing component inventory in a standardized format that is interchangeable between the various systems.

BOMs are the way of the future for software supply chain risk. Ensure that your solution works with the two most popular formats: CycloneDX and SPDX.
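To give a feel for what ingesting a BOM involves, here is a minimal Python sketch that reads a small CycloneDX document in JSON form and lists its components. The sample BOM is hand-written for this example and heavily trimmed, and list_components is our own helper, not part of any CycloneDX library.

```python
import json

# A hand-written, trimmed CycloneDX JSON document used only for illustration.
SAMPLE_BOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "version": 1,
  "components": [
    {
      "type": "library",
      "group": "org.apache.commons",
      "name": "commons-lang3",
      "version": "3.12.0",
      "purl": "pkg:maven/org.apache.commons/commons-lang3@3.12.0",
      "licenses": [{"license": {"id": "Apache-2.0"}}]
    }
  ]
}
"""


def list_components(bom_json: str):
    """Return (name, version, purl) for every component in a CycloneDX JSON BOM."""
    bom = json.loads(bom_json)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX JSON document")
    return [
        (c.get("name"), c.get("version"), c.get("purl"))
        for c in bom.get("components", [])
    ]


for name, version, purl in list_components(SAMPLE_BOM):
    print(name, version, purl)
```

A production tool would validate the document against the official schema and handle SPDX as well; this sketch only shows the shape of the data.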

#8: Evaluate pedigree and provenance.

Consider whether a solution can determine or report on the pedigree or provenance of a component. If a developer forks an existing component, you lose all traceability and tracking of the original.

Many organizations choose an open-source component, and it doesn’t quite meet their needs. They’ll fork it, make modifications to it, and generate a new element. A solid SCA tool needs to be able to track the origin of the component and the differences between the custom version and the latest. Failure of your tool to understand pedigree and provenance may blind you to a license or security risk.

#9: Enforce a security policy with risk ratings.

Enforcing a security policy means defining what risk ratings are acceptable for your organization in the solution, and having those settings applied throughout your software lifecycle.

A robust solution needs the ability to reprioritize the risk ratings. There are always going to be errors and false positives. Different organizations have different risk tolerances or different interpretations of particular things. Reprioritizing the risk of findings allows you to adjust a critical result to a lower level because you have determined it is not in use.
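The interplay between a policy threshold and reprioritized findings can be sketched in a few lines of Python. The severity scale, the finding identifiers, and the override table below are all assumptions made for the example.

```python
# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def effective_severity(finding_id, reported, overrides):
    """Use the organization's reprioritized rating when one exists."""
    return overrides.get(finding_id, reported)


def violates_policy(severity, max_allowed="medium"):
    """True when a finding is more severe than the policy tolerates."""
    return SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(max_allowed)


# Hypothetical override: FINDING-1 was reviewed and judged not in use.
overrides = {"FINDING-1": "low"}
findings = [("FINDING-1", "critical"), ("FINDING-2", "high")]

for finding_id, reported in findings:
    severity = effective_severity(finding_id, reported, overrides)
    status = "FAIL" if violates_policy(severity) else "pass"
    print(finding_id, reported, "->", severity, status)
```

Keeping the reported rating alongside the override, as the printout does, preserves a record of what was downgraded.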

#10: Data flow exploitability.

Data flow exploitability is the ability to identify the usage of a given method in your application. Be very cautious with this feature and existing commercial implementations. Use it as a risk prioritization tool, not as a determination as to whether to fix something or not.

The challenge with data flow exploitability is that tools claim to confirm that a component is exploitable. The problem is that the lack of an established data flow does not indicate safe component usage, especially when an application has multiple languages. Use this capability for risk prioritization only.

#11: Measure reporting and intelligence effectiveness.

Reporting and integrations are essential. You want to ensure that your SCA solution communicates with the other tools you use, like Jira or GitHub, for issue tracking. To assist developers in fixing these issues, you have to track the problems where the developers are already working.

Another item to consider is the effectiveness of the vendor’s intelligence program. Effectiveness depends largely on the size of the vendor’s team that tracks down new information about components. Determine whether the vendor has a team doing this or is just relying upon public sources for information.

Conclusion

Software composition analysis is a maturing space. We provide this list of capabilities to help you evaluate solutions and determine the best, most feature-rich solution that meets the needs of your organization. If you can get the best of breed based on the life experience of an industry insider, why not? Happy vulnerable component hunting to you!