diff mbox series

sstatesig: Ensure SPDX dependencies update when ABI safe recipes change

Message ID 20240109153651.1732423-1-JPEWhacker@gmail.com
State New
Headers show
Series sstatesig: Ensure SPDX dependencies update when ABI safe recipes change | expand

Commit Message

Joshua Watt Jan. 9, 2024, 3:36 p.m. UTC
Because of the way that SPDX documents are linked together, the SPDX
creation tasks must re-run, even for ABI safe recipes. This sstatesig
change is more comprehensive version of the same fix found in [1],
which also has more information about the specific reproducer.

In order to this to work correctly with the concept of allarch recipes,
a few changes need to be made to the SPDX creation tasks:
 1) All tasks now have stamp-extra-info on MACHINE_ARCH, which ensure
    they are machine specific (but doesn't affect the actual stamp
    contents)
 2) All machine specific variables are excluded from signature values
    (specifically, SSTATE_PKGARCH and MACHINE_ARCH). These aren't
    necessary since the tasks are now machine dependent anyway.
 3) do_create_spdx is no longer run before do_build. Practically
    speaking, this means users should explicitly run `bitbake -c
    create_spdx <recipe>` to generate the SPDX for a specific recipe. No
    changes are needed to generate the SPDX for a complete image as the
    tasks are automatically pulled in for that case, so this should not
    affect very many users

[1]: https://lists.openembedded.org/g/openembedded-core/message/192743

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 meta/classes/create-spdx-2.2.bbclass        | 30 +++++++++++++++------
 meta/lib/oe/sstatesig.py                    |  6 +++++
 meta/lib/oeqa/selftest/cases/sstatetests.py |  2 ++
 3 files changed, 30 insertions(+), 8 deletions(-)

Comments

Richard Purdie Jan. 9, 2024, 3:50 p.m. UTC | #1
On Tue, 2024-01-09 at 08:36 -0700, Joshua Watt wrote:
> Because of the way that SPDX documents are linked together, the SPDX
> creation tasks must re-run, even for ABI safe recipes. This sstatesig
> change is more comprehensive version of the same fix found in [1],
> which also has more information about the specific reproducer.
> 
> In order to this to work correctly with the concept of allarch recipes,
> a few changes need to be made to the SPDX creation tasks:
>  1) All tasks now have stamp-extra-info on MACHINE_ARCH, which ensure
>     they are machine specific (but doesn't affect the actual stamp
>     contents)
>  2) All machine specific variables are excluded from signature values
>     (specifically, SSTATE_PKGARCH and MACHINE_ARCH). These aren't
>     necessary since the tasks are now machine dependent anyway.
>  3) do_create_spdx is no longer run before do_build. Practically
>     speaking, this means users should explicitly run `bitbake -c
>     create_spdx <recipe>` to generate the SPDX for a specific recipe. No
>     changes are needed to generate the SPDX for a complete image as the
>     tasks are automatically pulled in for that case, so this should not
>     affect very many users
> 
> [1]: https://lists.openembedded.org/g/openembedded-core/message/192743
> 
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>  meta/classes/create-spdx-2.2.bbclass        | 30 +++++++++++++++------
>  meta/lib/oe/sstatesig.py                    |  6 +++++
>  meta/lib/oeqa/selftest/cases/sstatetests.py |  2 ++
>  3 files changed, 30 insertions(+), 8 deletions(-)

Sadly, I'm afraid I'm not happy about this.

This change forces a do_unpack in order for the SPDX task to run and
they have to rerun per machine. This means that any time you switch
machines, it will force everything to be unpacked again if for example
you'd cleaned up TMPDIR in the meantime.

The point behind sstate in the first place is to stop the system having
to do large chunks of work such as unpacking or compiling. This change
undermines the source extraction.

I do think we're going to have to find a better way to handle this
unfortunately, painful as that may be.

When I suggested the machine specific stamps, I was assuming the bulk
of the work was already done and could be reused but that isn't the
same from what I'm seeing in the patch as it needs the unpacked data
every time.

Cheers,

Richard
diff mbox series

Patch

diff --git a/meta/classes/create-spdx-2.2.bbclass b/meta/classes/create-spdx-2.2.bbclass
index 486efadba96..ac0ec60716c 100644
--- a/meta/classes/create-spdx-2.2.bbclass
+++ b/meta/classes/create-spdx-2.2.bbclass
@@ -4,7 +4,8 @@ 
 # SPDX-License-Identifier: GPL-2.0-only
 #
 
-DEPLOY_DIR_SPDX ??= "${DEPLOY_DIR}/spdx"
+DEPLOY_DIR_SPDX ??= "${DEPLOY_DIR}/spdx/${MACHINE_ARCH}"
+DEPLOY_DIR_SPDX[vardepsexclude] += "MACHINE_ARCH"
 
 # The product name that the CVE database uses.  Defaults to BPN, but may need to
 # be overriden per recipe (for example tiff.bb sets CVE_PRODUCT=libtiff).
@@ -100,6 +101,14 @@  python() {
         d.setVar("SPDX_LICENSE_DATA", data)
 }
 
+def unpack_deps(d):
+    # NOTE: Depending on do_unpack is a hack that is necessary to
+    # (transitively) pull in its dependencies so that source can be unpacked
+    # for reporting and archiving
+    if d.getVar("SPDX_INCLUDE_SOURCES") == "1":
+        return "do_unpack"
+    return ""
+
 def convert_license_to_spdx(lic, document, d, existing={}):
     from pathlib import Path
     import oe.spdx
@@ -502,11 +511,13 @@  python do_collect_spdx_deps() {
     with spdx_deps_file.open("w") as f:
         json.dump(deps, f)
 }
-# NOTE: depending on do_unpack is a hack that is necessary to get it's dependencies for archive the source
-addtask do_collect_spdx_deps after do_unpack
+do_collect_spdx_deps[vardepsexclude] = "SSTATE_PKGARCH"
+
+addtask do_collect_spdx_deps after ${@unpack_deps(d)}
 do_collect_spdx_deps[depends] += "${PATCHDEPENDENCY}"
 do_collect_spdx_deps[deptask] = "do_create_spdx"
 do_collect_spdx_deps[dirs] = "${SPDXDIR}"
+do_collect_spdx_deps[stamp-extra-info] = "${MACHINE_ARCH}"
 
 python do_create_spdx() {
     from datetime import datetime, timezone
@@ -693,9 +704,10 @@  python do_create_spdx() {
 
             oe.sbom.write_doc(d, package_doc, pkg_arch, "packages", indent=get_json_indent(d))
 }
-do_create_spdx[vardepsexclude] += "BB_NUMBER_THREADS"
-# NOTE: depending on do_unpack is a hack that is necessary to get it's dependencies for archive the source
-addtask do_create_spdx after do_package do_packagedata do_unpack do_collect_spdx_deps before do_populate_sdk do_build do_rm_work
+do_create_spdx[vardepsexclude] += "BB_NUMBER_THREADS SSTATE_PKGARCH"
+addtask do_create_spdx \
+    before do_populate_sdk do_rm_work \
+    after do_package do_packagedata do_collect_spdx_deps ${@unpack_deps(d)}
 
 SSTATETASKS += "do_create_spdx"
 do_create_spdx[sstate-inputdirs] = "${SPDXDEPLOY}"
@@ -709,6 +721,7 @@  addtask do_create_spdx_setscene
 do_create_spdx[dirs] = "${SPDXWORK}"
 do_create_spdx[cleandirs] = "${SPDXDEPLOY} ${SPDXWORK}"
 do_create_spdx[depends] += "${PATCHDEPENDENCY}"
+do_create_spdx[stamp-extra-info] = "${MACHINE_ARCH}"
 
 def collect_package_providers(d):
     from pathlib import Path
@@ -869,9 +882,9 @@  python do_create_runtime_spdx() {
             oe.sbom.write_doc(d, runtime_doc, pkg_arch, "runtime", spdx_deploy, indent=get_json_indent(d))
 }
 
-do_create_runtime_spdx[vardepsexclude] += "OVERRIDES SSTATE_ARCHS"
+do_create_runtime_spdx[vardepsexclude] += "OVERRIDES SSTATE_ARCHS SSTATE_PKGARCH"
 
-addtask do_create_runtime_spdx after do_create_spdx before do_build do_rm_work
+addtask do_create_runtime_spdx after do_create_spdx before do_rm_work
 SSTATETASKS += "do_create_runtime_spdx"
 do_create_runtime_spdx[sstate-inputdirs] = "${SPDXRUNTIMEDEPLOY}"
 do_create_runtime_spdx[sstate-outputdirs] = "${DEPLOY_DIR_SPDX}"
@@ -884,6 +897,7 @@  addtask do_create_runtime_spdx_setscene
 do_create_runtime_spdx[dirs] = "${SPDXRUNTIMEDEPLOY}"
 do_create_runtime_spdx[cleandirs] = "${SPDXRUNTIMEDEPLOY}"
 do_create_runtime_spdx[rdeptask] = "do_create_spdx"
+do_create_runtime_spdx[stamp-extra-info] = "${MACHINE_ARCH}"
 
 def spdx_get_src(d):
     """
diff --git a/meta/lib/oe/sstatesig.py b/meta/lib/oe/sstatesig.py
index 1b4380f21bc..cf39f68a102 100644
--- a/meta/lib/oe/sstatesig.py
+++ b/meta/lib/oe/sstatesig.py
@@ -44,6 +44,12 @@  def sstate_rundepfilter(siggen, fn, recipename, task, dep, depname, dataCaches):
             return False
         return True
 
+    # The link between do_collect_spdx_deps and do_create_spdx must be
+    # preserved in order to correctly link documents together, regardless of
+    # ABI safe status of a recipe or task
+    if task == "do_collect_spdx_deps" and deptaskname == "do_create_spdx":
+        return True
+
     # Exclude well defined recipe->dependency
     if "%s->%s" % (recipename, depname) in siggen.saferecipedeps:
         return False
diff --git a/meta/lib/oeqa/selftest/cases/sstatetests.py b/meta/lib/oeqa/selftest/cases/sstatetests.py
index 393eaf63393..119ce5a95f5 100644
--- a/meta/lib/oeqa/selftest/cases/sstatetests.py
+++ b/meta/lib/oeqa/selftest/cases/sstatetests.py
@@ -510,6 +510,8 @@  BB_SIGNATURE_HANDLER = "OEBasicHash"
                         continue
                     if "qemux86copy-" in root or "qemux86-" in root:
                         continue
+                    if re.match(r'.*spdx.*\.(qemux86copy|qemux86)$', name):
+                        continue
                     if "do_build" not in name and "do_populate_sdk" not in name:
                         f.append(os.path.join(root, name))
             return f