Modern software development is a double-edged sword. On the one hand, we can easily download open source software libraries and applications that provide tons of technological capabilities. On the other hand, we might only use 10% of those capabilities and end up with much more software than we need, along with the maintenance and management overhead that comes with it.
These libraries, large as they are, help us move quickly and build new functionality with just a few lines of code. Simply take someone else’s component and leverage it for your own benefit, often for free. Modern software development has become more about “assembling” others’ components than writing our own. The benefits are tremendous: improved performance, functionality, speed, economics, and interoperability. It’s a trend that’s here to stay, and rightly so. Building and deploying amazing software is cheaper and faster than ever.
In this evolutionary process, we’ve also become very good at finding, including, and merging the code we need. Package managers let us easily and automatically install and manage dependencies and open source components. Not too long ago, software packages were small and interoperability was low. In 2022, it’s quite the opposite. In fact, software packages are often so large that we now need to manage excess functionality to decrease threat opportunities, reduce footprint, and improve performance. The Linux Foundation estimates that the average software container consists of 80-90% open source components.
The practice of reducing a software footprint down to what’s necessary is called software optimization. Software is 100% optimized when it contains only the components it needs to run. Currently, the average software container is only ~20% optimized, meaning roughly 80% of its code is never used but must still be maintained.
Because of software supply chain exploits like Log4j, we’re seeing the very first indications that software optimization must become a critical piece of software deployment in the coming years. At RapidFort, we’re betting our futures on it. And so are progressive organizations that are optimizing their containerized applications, including the Department of Defense.
In this article, we’re going to talk about software optimization of containerized applications, what’s changing and why, and how you can get started. We begin with Software Bills of Materials.
Installing open source software packages is a single command away. Whether you’re into RPM or Homebrew or something else, sophisticated software installations have been reduced to a few keystrokes. But do you know what you’re getting when you type `docker pull nginx:latest`?
The truth is, almost nobody does. We might enjoy all those installation messages flying by on the terminal, but what is all that stuff? Some installation logs are thousands of lines long. What’s the impact of all that work and new software now running in our infrastructure? We often have to cross our fingers and hope that whoever authored the package knew what they were doing.
For example, when you download a container image from the Docker Hub container library, you have limited insight into what's in the container or whether you need everything it contains. You might download an NGINX container just to run health checks on your workload, but that use case is relatively light and doesn’t require most of the components the container includes.
The National Telecommunications and Information Administration (NTIA) defines a Software Bill of Materials (SBOM) as “a nested inventory for software, a list of ingredients that make up software components.” It’s very similar to a food label: “In this portion of food, there are these ingredients.” It’s not the nutrition facts label, but the ingredients list.
SBOMs are obtained using a software component analyzer, which parses the metadata of all the components and provides a complete list. The list includes information about each item: who wrote them, version and license information, and dependencies on other components to provide a view of the workload composition.
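To make the idea concrete, here is a minimal sketch of an SBOM-style inventory. It is not RapidFort’s analyzer, and a real SBOM tool would emit a standard format like SPDX or CycloneDX; this example just uses Python’s standard `importlib.metadata` to collect the name, version, license, and declared dependencies of each installed package:

```python
# A toy SBOM-style inventory of installed Python packages.
# Illustrative sketch only: real SBOM generators parse many more
# package ecosystems and emit standard formats (SPDX, CycloneDX).
from importlib.metadata import distributions

def build_sbom():
    """Return one component record per installed distribution."""
    sbom = []
    for dist in distributions():
        meta = dist.metadata
        sbom.append({
            "name": meta["Name"],
            "version": dist.version,
            "license": meta.get("License", "UNKNOWN"),
            # Dependencies the package declares, if any.
            "requires": dist.requires or [],
        })
    return sbom

if __name__ == "__main__":
    for component in build_sbom():
        print(f'{component["name"]} {component["version"]} ({component["license"]})')
```

Even this crude inventory answers the basic questions an SBOM answers: what is installed, at what version, under what license, and depending on what.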
They’ve become increasingly popular, especially considering the White House executive order mandating their use in US government software implementations. Now everyone needs SBOMs. Though the regulations are not punitive yet, SBOMs are required for most security certifications, and the industry is recognizing their value.
Outside of the compliance requirements for SBOMs, there are many reasons to have them. The bigger the software footprint in your infrastructure, the more code you’re running, and the greater your risk of malicious attack. At RapidFort, we refer to this as your Software Attack Surface. It's very hard, sometimes impossible, for engineers to manually noodle through the code to determine what is not used and can be safely removed. An SBOM gives them a major head start.
While we are big proponents of SBOMs as the first step, they are not the final solution to today’s cybersecurity problems. We think there’s an essential next step forward.
Software Attack Surface minimization is the practice of using only what you need. It’s like customizing a shoe specifically for your foot: “This is the use case, these are the requirements, so only include the code that supports these runtime needs.” We mentioned earlier that a piece of software is 100% optimized when there’s nothing left to remove without impairing its functionality. SBOMs tell you what you have (pre-optimization), and RBOMs tell you what you need to have (post-optimization). Less code to manage typically means less risk, fewer problems, and less software ‘weight’ to carry.
Knowing everything in a container is great, but as we’ve already made clear, the list is so long that it's hard to understand its implications. It’s not the final destination. There’s still a lot of “extra” code in there, which, at RapidFort, we remove as part of our software optimization process. To begin that process, we develop something we call a Real Bill of Materials, or an RBOM. Our secret sauce is to “instrument” your container and run it in a “fancy sandbox” to observe what it does, and then reverse engineer the components needed to support that runtime behavior.
Whereas an SBOM tells you everything in your container, an RBOM tells you everything in your container that you are actually using. We develop RBOMs using a suite of dynamic composition analysis tools within our product, RapidFort, the industry’s first Software Attack Surface Management system. It performs a thorough analysis of what processes are actually running, what system calls are made, what network traffic patterns are exercised, and what libraries are actually being used. It’s a different way of looking at an SBOM, and it is much more informative and actionable.
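The SBOM-versus-RBOM distinction can be illustrated with a toy example. This is not RapidFort’s dynamic analysis; it simply contrasts everything installed (the SBOM view) with what a process has actually imported at runtime (a crude RBOM view), using `sys.modules` as a stand-in for real runtime observation:

```python
# Toy illustration of SBOM vs RBOM. A real dynamic analysis would
# trace processes, system calls, and loaded libraries; here we
# approximate "what's actually used" with the modules a Python
# process has imported so far.
import sys
from importlib.metadata import distributions

def sbom_names():
    """Everything installed: the SBOM view."""
    return {dist.metadata["Name"] for dist in distributions()}

def rbom_names():
    """Top-level modules imported so far: a crude RBOM view."""
    return {name.split(".")[0] for name in sys.modules}

if __name__ == "__main__":
    import json  # the workload "uses" json at runtime
    print(f"installed distributions: {len(sbom_names())}")
    print(f"modules in use:          {len(rbom_names())}")
    # Anything installed but never used is a candidate for removal.
```

The gap between the two sets is the point: components that appear in the inventory but never in the runtime profile are candidates for removal.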
Obtaining an RBOM manually is challenging work. Only now is the technology being developed to achieve the profiling granularity needed to build viable RBOMs. Other companies have different approaches that are viable but less complete.
With an RBOM, there’s no second-guessing what you’re running. You know exactly what is active in your architecture and where the risks lie. You don’t need to patch, fix, or defend code you don’t use, especially if you remove it altogether.
Containers, SBOMs, and RBOMs aside, here’s the reality: the more software you have, the more risk you have. As a thought experiment, assume you have 20 components that are each 95% secure. Because you have chained them together, your entire system is now 95%^20, or only about 36%, secure. The more components you chain together, the more the risk compounds.
Another reality: you’re only as safe as your weakest link. If you have 19 components that are 100% secure but one at 50%, the entire system is only 50% secure. Now expand this scenario to thousands, perhaps millions, of packages and components. These are basic mathematical forces that cause risk to accumulate: more components chained together means more risk, and reducing the number of components reverses the trend.
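Both thought experiments above reduce to a few lines of arithmetic. Assuming the components fail independently, chaining n components that are each p secure yields p^n overall, while the weakest-link view is just the minimum:

```python
# The compounding-risk arithmetic from the text.
def chained_security(probs):
    """Probability that every independently-secure component holds up."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def weakest_link(probs):
    """The weakest-link view: the system is only as safe as its worst part."""
    return min(probs)

# 20 components at 95% each compound down to ~36% overall.
print(round(chained_security([0.95] * 20), 2))  # 0.36
# 19 perfect components plus one at 50%: the weak link dominates.
print(weakest_link([1.0] * 19 + [0.5]))         # 0.5
```

Removing components pulls both numbers back up, which is the mathematical case for optimization.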
Software optimization essentially reverses this compounding of risk. Every software component and line of code is a “liability” that can be optimized away. Optimized code bases are the path to success in today’s open source software-dependent ecosystem. Without optimization, the entire system becomes unwieldy; there are simply too many components to patch and manage. With optimization there is less code, less risk, and fewer problems.
The future of cloud-native software involves reduction and optimization. In five years, we won’t be deploying bloated, insecure containers in our infrastructure as we do today. We will be deploying only what we need. RapidFort can get you started with software optimization today.
Depending on the programming language, we can reduce your container security risk by upwards of 80% and reduce the size of your containers by similar amounts. We’ll gladly show you in a product demo and you can see for yourself. Just reach out and let’s talk.