[RFC] classes/native: Propagate dependencies to outhash

Message ID 20220114171222.1788462-1-JPEWhacker@gmail.com
State Accepted, archived
Commit d6c7b9f4f0e61fa6546d3644e27abe3e96f597e2

Commit Message

Joshua Watt Jan. 14, 2022, 5:12 p.m. UTC
Native task outputs are run directly on the host system after being
built. Even if the output of a native recipe doesn't change, a
change in one of its dependencies may cause a change in the output it
generates (e.g. rpm output depends on the output of its dependent zstd
library).

This can cause poor interactions with hash equivalence, since this
recipe's output-changing dependency is "hidden" and downstream tasks only
see that this recipe has the same outhash and therefore is equivalent.
This can result in different output in different cases.

To resolve this, unhide the output-changing dependency by adding its
unihash to this task's outhash calculation. Unfortunately, we don't
know specifically which dependencies are output-changing, so we have to
add all of them.

[YOCTO #14685]

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 meta/classes/native.bbclass | 31 +++++++++++++++++++++++++++++++
 meta/lib/oe/sstatesig.py    | 10 +++++++---
 2 files changed, 38 insertions(+), 3 deletions(-)
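
The effect described in the commit message can be illustrated with a toy
hash, independent of BitBake. This is only a sketch: `toy_outhash` is a
hypothetical stand-in (not OEOuthashBasic), and the recipe names, file
contents and unihash values are made up for illustration. It shows that
byte-identical output hashes differently once the dependency's unihash is
folded in, so the dependency is no longer "hidden".

```python
import hashlib


def toy_outhash(files, extra_sigdata=None):
    """Hypothetical stand-in for an output hash.

    Folds optional extra signature data into the digest first, then the
    sorted file names and contents.
    """
    h = hashlib.sha256()
    h.update(b"toy-outhash\n")
    if extra_sigdata:
        h.update((extra_sigdata + "\n").encode("utf-8"))
    for name in sorted(files):
        h.update(name.encode("utf-8") + b"\0")
        h.update(files[name] + b"\0")
    return h.hexdigest()


# The same (made-up) rpm-native output, built against two zstd-native builds.
rpm_files = {"usr/bin/rpm": b"identical bytes"}

# Without extra data, both builds collapse to one hash: the change is hidden.
hidden_a = toy_outhash(rpm_files)
hidden_b = toy_outhash(rpm_files)

# With the dependency's unihash included, the two builds are distinguishable.
against_old_zstd = toy_outhash(rpm_files, "zstd-native: aaaa")
against_new_zstd = toy_outhash(rpm_files, "zstd-native: bbbb")
```

The cost of this approach, as the commit message notes, is that every
dependency is included, so some rebuilds that would have been judged
equivalent no longer are.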

Comments

Jacob Kroon Jan. 14, 2022, 5:50 p.m. UTC | #1
On 1/14/22 18:12, Joshua Watt wrote:
> Native task outputs are directly run on the target (host) system after

"target" or "host"? The latter, I suppose.

> being built. Even if the output of a native recipe doesn't change, a
> change in one of its dependencies may cause a change in the output it
> generates (e.g. rpm output depends on the output of its dependent zstd
> library).
> 
> This can cause poor interactions with hash equivalence, since this
> recipes output-changing dependency is "hidden" and downstream task only
> see that this recipe has the same outhash and therefore is equivalent.
> This can result in different output in different cases.
> 
> To resolve this, unhide the output-changing dependency by adding it's
> unihash to this tasks outhash calculation. Unfortunately, don't know
> specifically know which dependencies are output-changing, so we have to
> add all of them.
> 

"don't know specifically know which.."

> [YOCTO #14685]
> 
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>  meta/classes/native.bbclass | 31 +++++++++++++++++++++++++++++++
>  meta/lib/oe/sstatesig.py    | 10 +++++++---
>  2 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/meta/classes/native.bbclass b/meta/classes/native.bbclass
> index 76a599bc15..fc7422c5d7 100644
> --- a/meta/classes/native.bbclass
> +++ b/meta/classes/native.bbclass
> @@ -195,3 +195,34 @@ USE_NLS = "no"
>  
>  RECIPERDEPTASK = "do_populate_sysroot"
>  do_populate_sysroot[rdeptask] = "${RECIPERDEPTASK}"
> +
> +#
> +# Native task outputs are directly run on the target (host) system after being

see above

> +# built. Even if the output of this recipe doesn't change, a change in one of
> +# its dependencies may cause a change in the output it generates (e.g. rpm
> +# output depends on the output of its dependent zstd library).
> +#
> +# This can cause poor interactions with hash equivalence, since this recipes
> +# output-changing dependency is "hidden" and downstream task only see that this
> +# recipe has the same outhash and therefore is equivalent. This can result in
> +# different output in different cases.
> +#
> +# To resolve this, unhide the output-changing dependency by adding its unihash
> +# to this tasks outhash calculation. Unfortunately, don't know specifically
> +# know which dependencies are output-changing, so we have to add all of them.
> +#

see above

> +python native_add_do_populate_sysroot_deps () {
> +    current_task = "do_" + d.getVar("BB_CURRENTTASK")
> +    if current_task != "do_populate_sysroot":
> +        return
> +
> +    taskdepdata = d.getVar("BB_TASKDEPDATA", False)
> +    pn = d.getVar("PN")
> +    deps = {
> +        dep[0]:dep[6] for dep in taskdepdata.values() if
> +            dep[1] == current_task and dep[0] != pn
> +    }
> +
> +    d.setVar("HASHEQUIV_EXTRA_SIGDATA", "\n".join("%s: %s" % (k, deps[k]) for k in sorted(deps.keys())))
> +}
> +SSTATECREATEFUNCS += "native_add_do_populate_sysroot_deps"
> diff --git a/meta/lib/oe/sstatesig.py b/meta/lib/oe/sstatesig.py
> index 038404e377..abcd96231e 100644
> --- a/meta/lib/oe/sstatesig.py
> +++ b/meta/lib/oe/sstatesig.py
> @@ -491,7 +491,8 @@ def OEOuthashBasic(path, sigfile, task, d):
>      if task == "package":
>          include_timestamps = True
>          include_root = False
> -    extra_content = d.getVar('HASHEQUIV_HASH_VERSION')
> +    hash_version = d.getVar('HASHEQUIV_HASH_VERSION')
> +    extra_sigdata = d.getVar("HASHEQUIV_EXTRA_SIGDATA")
>  
>      filemaps = {}
>      for m in (d.getVar('SSTATE_HASHEQUIV_FILEMAP') or '').split():
> @@ -506,8 +507,11 @@ def OEOuthashBasic(path, sigfile, task, d):
>          basepath = os.path.normpath(path)
>  
>          update_hash("OEOuthashBasic\n")
> -        if extra_content:
> -            update_hash(extra_content + "\n")
> +        if hash_version:
> +            update_hash(hash_version + "\n")
> +
> +        if extra_sigdata:
> +            update_hash(extra_sigdata + "\n")
>  
>          # It is only currently useful to get equivalent hashes for things that
>          # can be restored from sstate. Since the sstate object is named using
> 
> 
> 

Sounds to me like something we should do.

Jacob
Alexander Kanavin Jan. 14, 2022, 6:04 p.m. UTC | #2
While we're here, recipes inheriting the qemu class should do this too, but
for their target dependencies.

Alex

Richard Purdie Jan. 17, 2022, 11:50 a.m. UTC | #3
On Fri, 2022-01-14 at 11:12 -0600, Joshua Watt wrote:
> Native task outputs are directly run on the target (host) system after
> being built. Even if the output of a native recipe doesn't change, a
> change in one of its dependencies may cause a change in the output it
> generates (e.g. rpm output depends on the output of its dependent zstd
> library).
> 
> This can cause poor interactions with hash equivalence, since this
> recipes output-changing dependency is "hidden" and downstream task only
> see that this recipe has the same outhash and therefore is equivalent.
> This can result in different output in different cases.
> 
> To resolve this, unhide the output-changing dependency by adding it's
> unihash to this tasks outhash calculation. Unfortunately, don't know
> specifically know which dependencies are output-changing, so we have to
> add all of them.
> 
> [YOCTO #14685]
> 
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>  meta/classes/native.bbclass | 31 +++++++++++++++++++++++++++++++
>  meta/lib/oe/sstatesig.py    | 10 +++++++---
>  2 files changed, 38 insertions(+), 3 deletions(-)

Thanks for this. I know it was an RFC, but after testing over the weekend and
doing some checks of my own, I've merged it. I think it is the correct solution,
and we don't have any other good option to fix the issue, which is breaking
builds on the autobuilder.

Cheers,

Richard

Patch

diff --git a/meta/classes/native.bbclass b/meta/classes/native.bbclass
index 76a599bc15..fc7422c5d7 100644
--- a/meta/classes/native.bbclass
+++ b/meta/classes/native.bbclass
@@ -195,3 +195,34 @@  USE_NLS = "no"
 
 RECIPERDEPTASK = "do_populate_sysroot"
 do_populate_sysroot[rdeptask] = "${RECIPERDEPTASK}"
+
+#
+# Native task outputs are directly run on the target (host) system after being
+# built. Even if the output of this recipe doesn't change, a change in one of
+# its dependencies may cause a change in the output it generates (e.g. rpm
+# output depends on the output of its dependent zstd library).
+#
+# This can cause poor interactions with hash equivalence, since this recipes
+# output-changing dependency is "hidden" and downstream task only see that this
+# recipe has the same outhash and therefore is equivalent. This can result in
+# different output in different cases.
+#
+# To resolve this, unhide the output-changing dependency by adding its unihash
+# to this tasks outhash calculation. Unfortunately, don't know specifically
+# know which dependencies are output-changing, so we have to add all of them.
+#
+python native_add_do_populate_sysroot_deps () {
+    current_task = "do_" + d.getVar("BB_CURRENTTASK")
+    if current_task != "do_populate_sysroot":
+        return
+
+    taskdepdata = d.getVar("BB_TASKDEPDATA", False)
+    pn = d.getVar("PN")
+    deps = {
+        dep[0]:dep[6] for dep in taskdepdata.values() if
+            dep[1] == current_task and dep[0] != pn
+    }
+
+    d.setVar("HASHEQUIV_EXTRA_SIGDATA", "\n".join("%s: %s" % (k, deps[k]) for k in sorted(deps.keys())))
+}
+SSTATECREATEFUNCS += "native_add_do_populate_sysroot_deps"
diff --git a/meta/lib/oe/sstatesig.py b/meta/lib/oe/sstatesig.py
index 038404e377..abcd96231e 100644
--- a/meta/lib/oe/sstatesig.py
+++ b/meta/lib/oe/sstatesig.py
@@ -491,7 +491,8 @@  def OEOuthashBasic(path, sigfile, task, d):
     if task == "package":
         include_timestamps = True
         include_root = False
-    extra_content = d.getVar('HASHEQUIV_HASH_VERSION')
+    hash_version = d.getVar('HASHEQUIV_HASH_VERSION')
+    extra_sigdata = d.getVar("HASHEQUIV_EXTRA_SIGDATA")
 
     filemaps = {}
     for m in (d.getVar('SSTATE_HASHEQUIV_FILEMAP') or '').split():
@@ -506,8 +507,11 @@  def OEOuthashBasic(path, sigfile, task, d):
         basepath = os.path.normpath(path)
 
         update_hash("OEOuthashBasic\n")
-        if extra_content:
-            update_hash(extra_content + "\n")
+        if hash_version:
+            update_hash(hash_version + "\n")
+
+        if extra_sigdata:
+            update_hash(extra_sigdata + "\n")
 
         # It is only currently useful to get equivalent hashes for things that
         # can be restored from sstate. Since the sstate object is named using
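
For readers who want to see what ends up in HASHEQUIV_EXTRA_SIGDATA, the
filtering done by native_add_do_populate_sysroot_deps can be tried outside
BitBake on mock data. This is a sketch under assumptions: the helper name
`build_extra_sigdata` and the mock entries are hypothetical, and only the
tuple positions the patch itself reads are meaningful here (index 0 compared
against PN, index 1 compared against the task name, index 6 recorded as the
value). All other fields are placeholders.

```python
def build_extra_sigdata(taskdepdata, pn, current_task):
    """Mirror the patch's comprehension on plain Python data.

    Keeps entries for the same task in *other* recipes and maps recipe
    name (index 0) to the value at index 6.
    """
    deps = {
        dep[0]: dep[6]
        for dep in taskdepdata.values()
        if dep[1] == current_task and dep[0] != pn
    }
    # Sorting keeps the string stable regardless of dictionary order.
    return "\n".join("%s: %s" % (k, deps[k]) for k in sorted(deps))


_ = None  # placeholder for fields the patch does not read

# Made-up dependency data for a hypothetical rpm-native build.
mock_taskdepdata = {
    0: ("rpm-native", "do_populate_sysroot", _, _, _, _, "1111"),
    1: ("zstd-native", "do_populate_sysroot", _, _, _, _, "2222"),
    2: ("zstd-native", "do_compile", _, _, _, _, "3333"),
    3: ("bzip2-native", "do_populate_sysroot", _, _, _, _, "4444"),
}

sigdata = build_extra_sigdata(
    mock_taskdepdata, "rpm-native", "do_populate_sysroot"
)
print(sigdata)
```

The recipe's own entry and the do_compile entry are filtered out, leaving
one "name: value" line per do_populate_sysroot dependency in sorted order.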