By Curtis Yanko
While we were at RSA I saw that CISA released a document from its working group on SBOM Types. I also noticed a bit of a kerfuffle on LinkedIn, with folks asking pointed questions and scratching their heads about just what this document does and means. Since I participated in this forum, and a couple of others, I thought I’d share my own observations on how we got here.
The first thing to keep in mind is that this paper is a ‘community’ effort that came out of the SBOM Tooling working group. These working groups exist to gather input from the community, which makes their output very different from something like the Secure Software Self-Attestation Common Form. CISA published that form as a direct response to Executive Order 14028 and the Office of Management and Budget’s memo M-22-18, which required the development of a self-attestation form for software producers serving the federal government. The SBOM Types document, by contrast, is a collection of community voices where no one was shut down or silenced, so keep that in mind.
The introduction to the SBOM Types paper provides us with this bit of context: “Given the disparate ways SBOM data can be collected, tool outputs may vary and provide value in different use cases.” In simpler terms, there are lots of different ways and times to create an SBOM, and they are not all the same. This is more of an intermediate conversation happening amongst SBOM practitioners. With all of that in mind, let me attempt to go through each type and provide a bit more context as to the thinking that went into each one, in reverse SDLC order.
The Runtime type talks about Instrumented and/or Dynamic SBOMs. These tools are hooked into the system so they can see what is actually being loaded into memory at execution time and can help avoid the false positives of unused (never loaded) components. I don’t have a lot of personal experience with this SBOM generation technique so I am unsure of what trade-offs might be taking place. My guess is that, on its own, we might not be able to infer where the component came from or its authenticity. If you’re using runtime SBOM tools please leave a comment with your experience and the trade-offs you’ve seen.
Next up is the Deployed type. In principle, this is simple: we create an SBOM of everything on a system, which sounds like a job for end-point scanners that already produce such inventories. However, the document implies that we might be stitching together several SBOMs that were previously generated, perhaps at build time, to represent the system. This sounds more like the work I see at SwiftBOM, where there is a product SBOM that acts as an aggregator of all of the SBOMs that make up the whole system. The de-coupled nature of this approach is appealing to me, but I’ve never done this in practice. If anyone has, please share your experience in the comments below.
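To make the aggregation idea a little more concrete, here is a minimal Python sketch of stitching several previously generated SBOMs into one product-level view. Everything in it is an assumption for illustration: the function name `aggregate_sboms` and the `sources` field are hypothetical, and the dictionaries are a simplified shape with only `name`, `version`, and `components`, not a spec-compliant CycloneDX or SPDX document. It is not how SwiftBOM or any other tool does it.

```python
def aggregate_sboms(product_name, sboms):
    """Merge the component lists of several SBOM dicts into one product-level SBOM.

    Each input SBOM is assumed (hypothetically) to look like:
        {"name": "web", "components": [{"name": "zlib", "version": "1.2.13"}]}
    """
    merged = {}
    for sbom in sboms:
        source = sbom.get("name", "unknown")
        for comp in sbom.get("components", []):
            # Treat name + version as the identity of a component.
            key = (comp["name"], comp["version"])
            entry = merged.setdefault(
                key,
                {"name": comp["name"], "version": comp["version"], "sources": []},
            )
            # Remember which subsystem SBOM(s) contributed this component.
            if source not in entry["sources"]:
                entry["sources"].append(source)
    components = sorted(merged.values(), key=lambda c: (c["name"], c["version"]))
    return {"name": product_name, "components": components}


# Example inputs with placeholder component data.
web = {"name": "web", "components": [
    {"name": "openssl", "version": "3.0.8"},
    {"name": "zlib", "version": "1.2.13"},
]}
db = {"name": "db", "components": [{"name": "openssl", "version": "3.0.8"}]}

product = aggregate_sboms("product", [web, db])
for comp in product["components"]:
    print(comp["name"], comp["version"], "from", ", ".join(comp["sources"]))
```

Keeping track of which subsystem SBOM each component came from is what preserves the de-coupled nature of the approach: each team keeps producing its own SBOM, and the product view is just a roll-up.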
The Analyzed type is a little more straightforward and represents an SBOM generation technique more so than an SDLC phase like most of the others. Sometimes referred to as Binary SCA, this technique relies on scanning the file, application, or system as delivered to tease out what components went into the item being scanned. It is essentially a reverse engineering approach to SBOM generation, which is useful for software consumers in the absence of an SBOM or to help verify and validate an SBOM that has been provided. Scanning binaries will produce some false positives, but the technique excels at identifying statically linked libraries like OpenSSL, ffmpeg, and others that even build-time SBOMs would likely miss.
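The verify-and-validate use case boils down to comparing two component lists: the SBOM a supplier provided and the one an analysis tool produced. A rough Python sketch of that comparison follows. The function name `compare_sboms`, the result keys, and the simplified component shape (just `name` and `version`) are all hypothetical, and real tools would also need to cope with naming differences and false positives that an exact match like this ignores.

```python
def compare_sboms(declared, observed):
    """Compare a supplier-provided component list with one found by analysis.

    Both arguments are lists of dicts with "name" and "version" keys
    (a simplified, hypothetical shape).
    """
    declared_set = {(c["name"], c["version"]) for c in declared}
    observed_set = {(c["name"], c["version"]) for c in observed}
    return {
        # Found in the binary but missing from the supplied SBOM.
        "undeclared": sorted(observed_set - declared_set),
        # Listed in the supplied SBOM but not seen by the scanner.
        "unconfirmed": sorted(declared_set - observed_set),
    }


# Placeholder data: the scan finds a statically linked library
# that the supplied SBOM does not mention.
declared = [{"name": "zlib", "version": "1.2.13"}]
observed = [
    {"name": "zlib", "version": "1.2.13"},
    {"name": "openssl", "version": "3.0.8"},
]
print(compare_sboms(declared, observed))
```

Anything in the undeclared bucket is a question to take back to the supplier, while the unconfirmed bucket may simply reflect the limits of the scanner.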
Build time SBOMs are the most common, and therefore the best-known, type of SBOM out there today. I would assume that when most folks hear the term SBOM, this is what they are thinking of. This approach usually relies on the build system resolving all of the dependencies onto the build machine, where they can be scanned. Because the files are present to be scanned, this approach produces rich information about each file, like its hash and where it was sourced from, which can help establish provenance and authenticity.
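As a small illustration of why having the files on disk matters, here is a Python sketch that walks a directory of resolved dependencies and records a SHA-256 hash for each file. The function names `sha256_of` and `inventory` and the output shape are hypothetical, the directory path in the usage line is a placeholder, and a real build-time tool would also record where each file was sourced from, which this sketch does not attempt.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(dep_dir):
    """List every file under dep_dir with its relative path and hash."""
    root = Path(dep_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            entries.append({
                "file": path.relative_to(root).as_posix(),
                "sha256": sha256_of(path),
            })
    return entries


if __name__ == "__main__":
    # "build/deps" is a placeholder; point it at wherever your build
    # tool resolves dependencies.
    for entry in inventory("build/deps"):
        print(entry["sha256"], entry["file"])
```

A hash recorded this way can later be compared against the value published by the component’s source, which is one way the hash helps with authenticity.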
Next up are Source SBOMs, and this is where I started scratching my own head trying to differentiate them from the Build time SBOM. As written in the paper, this sounds like the exact same thing as Build except that it tries to exclude the build tool. The claim is that we’ll have access to the source and required dependencies, and the paper goes on to suggest that this is where most SCA tools play today, even though I am not aware of one that doesn’t invoke at least the dependency resolution phase of the build tool. For me personally, I’m not sure there is a distinction to be made here between Source and Build, but I am open to feedback that would help me better understand the differences.
The whole reason I did this list in reverse SDLC order was to save the Design time SBOM for last. I understand that this is theoretically and technically possible; I just struggle to come up with a practical use case for producing an SBOM at design time. I do not know of any tooling that would help create a machine-readable document from whatever other design artifacts might exist, so would this be done by hand? The most pertinent information we could get from this kind of SBOM would be around licensing compliance, to ensure the team isn’t thinking of using some tech that would impact the intellectual property or distribution mechanisms of the final product. What I think is really needed at this point in the dev cycle is something more along the lines of a high-level tech stack that would help us shop for the best open-source components to fill each need from a business perspective as opposed to a technical one. For instance, if I need a graph database, I’d like to see a list of the different database solutions available along with information like licensing, release frequency, MTTD, and MTTR, as I feel those inputs help us make the best selection from that business perspective. I feel IT shops would gravitate toward a solution that is good about finding and fixing its own issues and that comes with a business-friendly license. Sadly, in my experience, most tech stacks are chosen because they have been used before or because they are hot new tech that looks good on a resume, and I don’t see how an SBOM helps me do the comparisons I feel are needed at this point in time.
Hopefully, this article has helped folks better understand just what this CISA working group document is trying to convey, and why. Now it’s your turn: what did the paper miss, or what did it get wrong?