[1/1] siggen.py: Improve taskhash reproducibility

Message ID 20230822203350.264625-1-paulo@myneves.com
State Accepted, archived
Commit 5293a1b36eeb89f57577cb709ec7f293909039a1

Commit Message

Paulo Neves Aug. 22, 2023, 8:33 p.m. UTC
File checksums are part of the data that is hashed
to generate the task hash. The list of file checksums
was not ordered.

This commit makes sure the task hash is computed from
a list of checksum data that is ordered by unique file name,
thus guaranteeing reproducibility.

Signed-off-by: Paulo Neves <paulo@myneves.com>
---
 lib/bb/siggen.py | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
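
The ordering problem described in the commit message can be illustrated with a minimal sketch. It is not the siggen.py code: task_hash, the "base" string and the sample file names are hypothetical, and the way names and checksums are joined is simplified. It only shows that a hash over concatenated entries depends on their order, and that sorting the entries first removes that dependency.

```python
import hashlib

def task_hash(base, file_checksums):
    # Hypothetical stand-in for the hashing step: append each
    # (file name, checksum) pair to the data, then hash the result.
    data = base
    for name, checksum in file_checksums:
        if checksum:
            data += name + ":" + checksum
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

# The same two entries, seen in two different orders.
checksums_a = [("b.patch", "2222"), ("a.patch", "1111")]
checksums_b = list(reversed(checksums_a))

# Without sorting, the order of the list leaks into the hash.
unordered_differs = task_hash("base", checksums_a) != task_hash("base", checksums_b)

# With sorting, both orders produce the same hash.
ordered_matches = (task_hash("base", sorted(checksums_a))
                   == task_hash("base", sorted(checksums_b)))
```

Sorting only makes the hash independent of list order; a task whose list was not already in sorted order gets a different hash value than it had before.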

Comments

Luca Ceresoli Aug. 24, 2023, 7:58 a.m. UTC | #1
Hello Paulo,

On Tue, 22 Aug 2023 20:33:58 +0000
"Paulo Neves" <paulo@myneves.com> wrote:

> file checksums are part of the data checksummed
> to generate the task hash. The list of file checksums
> was not ordered.
> 
> In this commit we make sure the task hash checksum takes
> a list of checksum data that is ordered by unique file name
> thus guaranteeing reproducibility.
> 
> Signed-off-by: Paulo Neves <paulo@myneves.com>

Lots of errors like the following on the autobuilders are possibly
caused by this patch:

Hash for task dependency pseudo-native:do_populate_sysroot changed from f4435f759ac00c51c07eaa840d9d61249f38f82d39a8dbffa4e8f599d874a3c6 to 07a200198639768d86d5c2c654fcbd0d4ea23d333e735848ac22c12474db1c2d
(and a lot more errors)

Logs:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5592/steps/14/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/5646/steps/14/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/127/builds/1954/steps/15/logs/stdio

Luca

Paulo Neves Aug. 24, 2023, 9:08 a.m. UTC | #2
On 24/08/2023 09:58, Luca Ceresoli wrote:
> Hello Paulo,
>
> On Tue, 22 Aug 2023 20:33:58 +0000
> "Paulo Neves" <paulo@myneves.com> wrote:
>
>> file checksums are part of the data checksummed
>> to generate the task hash. The list of file checksums
>> was not ordered.
>>
>> In this commit we make sure the task hash checksum takes
>> a list of checksum data that is ordered by unique file name
>> thus guaranteeing reproducibility.
>>
>> Signed-off-by: Paulo Neves <paulo@myneves.com>
> Lots of errors like the following on the autobuilders are possibly
> caused by this patch:
>
> Hash for task dependency pseudo-native:do_populate_sysroot changed from f4435f759ac00c51c07eaa840d9d61249f38f82d39a8dbffa4e8f599d874a3c6 to 07a200198639768d86d5c2c654fcbd0d4ea23d333e735848ac22c12474db1c2d
> (and a lot more errors)
>
> Logs:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5592/steps/14/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/5646/steps/14/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/127/builds/1954/steps/15/logs/stdio
>
> Luca
>
> --
> Luca Ceresoli, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

Thank you! I will have a look. Can you tell me how to run the test case
that shows this? I think it may be related to me using a different
signature handler than the ones used in the test.

Richard Purdie Aug. 24, 2023, 10:33 a.m. UTC | #3
On Thu, 2023-08-24 at 09:58 +0200, Luca Ceresoli via
lists.openembedded.org wrote:
> Hello Paulo,
> 
> On Tue, 22 Aug 2023 20:33:58 +0000
> "Paulo Neves" <paulo@myneves.com> wrote:
> 
> > file checksums are part of the data checksummed
> > to generate the task hash. The list of file checksums
> > was not ordered.
> > 
> > In this commit we make sure the task hash checksum takes
> > a list of checksum data that is ordered by unique file name
> > thus guaranteeing reproducibility.
> > 
> > Signed-off-by: Paulo Neves <paulo@myneves.com>
> 
> Lots of errors like the following on the autobuilders are possibly
> caused by this patch:
> 
> Hash for task dependency pseudo-native:do_populate_sysroot changed from f4435f759ac00c51c07eaa840d9d61249f38f82d39a8dbffa4e8f599d874a3c6 to 07a200198639768d86d5c2c654fcbd0d4ea23d333e735848ac22c12474db1c2d
> (and a lot more errors)
> 
> Logs:
> 
> https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5592/steps/14/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/5646/steps/14/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/127/builds/1954/steps/15/logs/stdio

FWIW I had this patch in master-next and didn't see those issues so it
is possibly a different change...

Cheers,

Richard

Patch

diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py
index 879c136e1..b023b79ec 100644
--- a/lib/bb/siggen.py
+++ b/lib/bb/siggen.py
@@ -361,7 +361,7 @@  class SignatureGeneratorBasic(SignatureGenerator):
         for dep in sorted(self.runtaskdeps[tid]):
             data += self.get_unihash(dep[1])
 
-        for (f, cs) in self.file_checksum_values[tid]:
+        for (f, cs) in sorted(self.file_checksum_values[tid], key=clean_checksum_file_path):
             if cs:
                 if "/./" in f:
                     data += "./" + f.split("/./")[1]
@@ -426,7 +426,7 @@  class SignatureGeneratorBasic(SignatureGenerator):
         if runtime and tid in self.taskhash:
             data['runtaskdeps'] = [dep[0] for dep in sorted(self.runtaskdeps[tid])]
             data['file_checksum_values'] = []
-            for f,cs in self.file_checksum_values[tid]:
+            for f,cs in sorted(self.file_checksum_values[tid], key=clean_checksum_file_path):
                 if "/./" in f:
                     data['file_checksum_values'].append(("./" + f.split("/./")[1], cs))
                 else:
@@ -745,6 +745,12 @@  class SignatureGeneratorTestEquivHash(SignatureGeneratorUniHashMixIn, SignatureG
         self.server = data.getVar('BB_HASHSERVE')
         self.method = "sstate_output_hash"
 
+def clean_checksum_file_path(file_checksum_tuple):
+    f, cs = file_checksum_tuple
+    if "/./" in f:
+        return "./" + f.split("/./")[1]
+    return f
+
 def dump_this_task(outfile, d):
     import bb.parse
     mcfn = d.getVar("BB_FILENAME")
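
As a rough illustration of the sort key, the sketch below copies clean_checksum_file_path from the patch and applies it to two made-up entries (the paths and checksum strings are assumptions, not taken from a real build). It suggests why the key matters: the entries are ordered by the same "./"-relative name that is fed into the hash, not by the absolute prefix in front of "/./".

```python
def clean_checksum_file_path(file_checksum_tuple):
    # Same logic as the helper added by the patch: keep only the part
    # of the path after "/./", prefixed with "./".
    f, cs = file_checksum_tuple
    if "/./" in f:
        return "./" + f.split("/./")[1]
    return f

# Hypothetical entries whose absolute prefixes differ.
entries = [
    ("/home/user/build/./files/b.cfg", "bbbb"),
    ("/srv/other/./files/a.cfg", "aaaa"),
]

# Sorting on the raw tuples would follow the absolute prefix.
by_full_path = sorted(entries)

# Sorting with the key follows the cleaned, prefix-independent name.
by_clean_path = sorted(entries, key=clean_checksum_file_path)
clean_names = [clean_checksum_file_path(e) for e in by_clean_path]
```

Here by_full_path keeps b.cfg first because "/home" sorts before "/srv", while by_clean_path puts a.cfg first regardless of where the files live on disk.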