Project Goals

With the Mequal project we have 3 high-level goals we’d like to achieve:

Validation

We would like to identify if a manifest is up-to-standard not just in its schema representation, but beyond, where we can also identify problems where the standard schema checks of different manifest formats just wouldn’t be able to identify them. We’d also like to validate certain representations of a manifest depending on the context or what type of manifest it is.

Through this, we would like to validate the "correctness" of a manifest given a specific set of rules within a certain business context, going beyond the basic schema checks.

Evaluation

For situations where direct validation of a manifest may not be so easy, such as optional or alternative data representations in manifests, we’d like to at the very least return feedback on what problems this would cause in what situations, and offer guidance and solutions to address these.

Through this, we would like to ensure that we make gradual quality improvements in the information contained in our manifests.

Policy Creation and Management

We’d like to enable policy creation in a simplified manner, without requiring high-level expertise in the underlying policy language in order to leverage the expertise of others, and to keep track of these policies by managing them in a version-controlled fashion, where we can keep track of the higher-level details, the evolution, and the ongoing validity of these policies.

1. Validation

Validation can be thought of as extended SBOM schema checks, including but not limited to gating policy and automating the evaluation process with a more flexible approach compared to what an actual JSON or XML schema would allow. It’s intended to be portable for other tooling dealing with generation and storage of SBOM to use.

Extended Schema examples

  • Gating checks for the very basics like PURL string and external reference format. These should always be valid if present.

  • Ecosystem specialization

    • Handling alternative approaches of package and source management. (For example RPMs having a SRPM)

  • Embedded extended schema in SBOMs

    • Validating a document further down a pipeline as changes are possibly made

    • Build/generation tool identification and enforcement (versioned pipelines)

  • Extended checks by policy delegation, eg. valid link checking, internal and external url checks, requires non-policy hooks or custom builtins

  • Chain of custody

    • Being permissive or restrictive on what has interacted with the SBOM

Why validate through policies vs. schemas?

Doing validation from the schemas of the two main SBOM formats (CycloneDX and SPDX) is possible but it’s difficult to write and harder to document. A schema is pass or fail, and without maintaining a large catalogue of schema with subtle variations it’s not possible to view the same SBOM through a different lens.

Writing the basic checks in a policy language like Rego removes some of these issues. For example it’s easier to share a “valid” PURL-check function amongst multiple formats and schema versions. It allows for more descriptive metadata about a policy where we’re not restricted to a description field in a JSON schema.

Ecosystem specialization is also a useful advantage to a policy-based extended schema. For example, checking if an RPM is generated from a source RPM would be difficult to encode in a JSON schema and would again deviate from the base specification. With policy language, multiple rules can be run simultaneously.

An embedded extended schema could also be utilized like a schema URI that resolves to a specific set of policies that ran at the time of the document creation, and like a schema it can ensure that the onward enrichment process is also checked against the policies intended by the author.

In policy, although most checks are possible relying on just the SBOM document itself, it is also possible, unlike a schema, to carry out more exhaustive checks like verifying if a URL is reachable or if a component’s checksum matches. Although this is somewhat counterproductive for portability, it is something we could utilize.

Policy alongside signing could also be used to validate the integrity of an SBOM and that it meets a particular organizations’ standards. This means a chain of custody can be established, and this may be an useful mechanism where there are multiple SBOM sources. For example upstream community SBOM or SBOMs derived from an early-access release, which do not need to meet the same threshold.

Who is validation useful to?

As mentioned, these use cases are more suited for validation purposes. Firstly for the individuals and tools responsible for SBOM creation (e.g. SBOMer [1]) and the initial review, and then later for process tools which may want to discriminate ingested SBOMs based on the policy (e.g. trustify [2] Evaluation)

2. Evaluation

Evaluation differs from validation in that is related to qualitative aspects of the manifest. This is less likely to be gating checks and instead is more about user feedback. Providing the details of a rule, why it exists, what it’s expected to provide, and examples on how to improve an SBOM. This would guide individuals to provide a better quality SBOM which contains more useful information further downstream for SMEs to action upon.

Quality improvement examples

  • Information requirements for multiple stakeholders

    • Full component metadata (SHA etc.)

    • Legal

      • Licenses

      • Cryptographic libraries (export licensing)

    • Build environment

      • Supply chain questions

    • Project or Product info

      • Naming

      • Versions

      • Release names

  • SBOM Type (layering and phases)

    • Service or infrastructure BOMs should contain endpoints, ports, protocols, etc.

  • SBOM Onboarding/Lifecycle Representation

    • SBOM type and evolution

    • Process should start with a design SBOM and end with decommissioning SBOM)

  • Component Registry (trustify [2] client)

    • Dependencies or related projects with vulnerabilities

    • Dependency SBOM quality (recursive SBOM)

    • Other usages of this component (how popular it is, who supports it) - leading people to choosing the best supported version of a component, suggesting alternatives, etc.

Why is evaluation through policies useful for SBOM quality?

We described policy evaluation as qualitative and this is what these use cases aim to improve. There are already a number of guidelines for SBOM creation and tools to assess SBOM quality, but these often fit into the former validation category and as a result produce a simplistic report. Many provide a poor abstraction and do not direct the user to what the problem is, where it occurs or how to improve upon it.

Textual feedback to the user akin to a compiler warning is a better way to provide these key bits of information. This guides the user on a gradual improvement to reach verification of their target policy set.

Identifying information requirements from multiple Subject Matter Experts will allow gradual improvement in multiple aspects of an SBOM without the requirement for the user to have that expert knowledge. As an example, a product engineer might not know that mixing components of two contradicting licenses will be problematic for the organization, or may not properly consider supply chain attacks when quickly pulling a project together.

Another example of knowledge that could be imparted onto the user is knowledge about the SBOMs themselves and how they are used in their organization. This would include information about how to use the full SBOM lifecycle to help spread the information requirement gathering amongst multiple teams.

SBOM quality is also an opportunity to highlight information from other tools. For example, feedback about potentially vulnerable components from Software Composition Analysis. Or as a more complex example, if we have chosen a stricter set of policies that requires a hermetic build, do any of the components we rely upon also conform to these policies?

Who is evaluation useful to?

As mentioned, these use cases are around quality improvement and guidance. This is focused at end-users rather than services or tooling. The end users could be the Software Production teams, Product and QE teams, Product Security team, legal team, etc. Anyone that interacts with a project or product and has an interest in improving the quality of the SBOM for their own use case or others.

3. Policy Creation and Management

Allowing easy policy creation by subject matter experts, coupled with an effective way to manage them, will provide a consistent and accessible way to share knowledge with the wider organization.

These subject matter experts will not need to be knowledgable in existing policies and how they are represented in Mequal, or the details of the SBOM formats and schemas, as these are all large barriers to entry. Instead, the policy creation will focus on capturing scenarios (e.g. Supply chain attack), the questions they would ask to resolve the scenario (e.g. “Which projects are using this repository?”), and the information requirement (e.g. The URL of the component’s origin) and provide an interface to create these types of policies.

If we can get easy policy contributions from subject matter experts and carefully manage the policies with clear, concrete definitions, who contributed them, and for what purposes, this would enable us to keep track of what policies are in place, why they exist, and their evolution through time through version control principles.

Policies are ever evolving and as part of that process some policies will become irrelevant. Our goal is to create and manage policies where we can easily know if a policy is still relevant and why we as an organization should still rely on them.

The easy creation and management of policies and coupling them with certain important information (scenarios, questions, information requirements, why, for whom) helps us achieve a number of goals:

  • Consistency and gating of policy

  • Approachability

    • You don’t need to know SBOM specifications or know Rego to describe a scenario and the questions you would ask to resolve it.

    • You don’t need to know the workflow or the ins-and-outs of scenario to provide an information requirement.

  • Policy categorization

    • Adjustable scope or levels (we can ensure that all facets of a scenario are covered)

    • Maybe we’re not interested in supply chain attacks or legal policies when we’re dealing with a development SBOM

  • Policy reuse

    • Forking of other organizations policy and customizing it to your own specifics.

    • The same information requirement can answer multiple questions, this will help reduce the split-brain problem where the same policy is written in multiple ways by multiple authors.

  • Policy attribution

    • Who asked for this policy, why is it useful?

    • Policy review and refinement

Who is policy creation and management useful to?

Easy policy creation and management will be useful to subject matter experts and policy implementers. It will help keep track of policies for SBOMs and allow organizations to manage and share their policies, even with customers. It’s also useful for development teams associated with services that produce and consume SBOMs. These policies and their related information give insight into the practical use of various SBOM formats and specification versions.