[v3] fetch2/git: stop generated tarballs from leaking info

Message ID 20220328173618.960965-1-o.mandel@menlosystems.com
State Accepted, archived
Commit 0178ab83e6312e97e528aa8c5e12105f5165d896

Commit Message

Olaf Mandel March 28, 2022, 5:36 p.m. UTC
When using BB_GENERATE_MIRROR_TARBALLS="1" to generate mirror tarballs
of git repositories, they leaked local information: username, group and
time of the last fetch. Remove all these by setting fixed information:

 * uname = pokybuild (6000)
 * gname = users (100)
 * mtime = committer time of newest commit in repo

The username and group values were taken from the archives available on
the downloads.yoctoproject.org mirror. The modification time is chosen
so it still retains some relationship to the contents of the archive.

Signed-off-by: Olaf Mandel <o.mandel@menlosystems.com>
---
 lib/bb/fetch2/git.py  |  5 ++++-
 lib/bb/tests/fetch.py | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 1 deletion(-)
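
For readers who want to try the idea outside of BitBake, the normalization described in the commit message can be sketched with Python's standard `tarfile` module. This is only an illustration, not the patch itself (the patch passes `--owner`, `--group` and `--mtime` to the `tar` command). The helper names `normalize` and `create_clean_tarball` are hypothetical, and the `mtime` argument is assumed to be the committer time of the newest commit as a Unix timestamp, e.g. from `git log --all -1 --format=%ct`.

```python
import tarfile


def normalize(info: tarfile.TarInfo, mtime: int) -> tarfile.TarInfo:
    """Overwrite the fields that would otherwise describe the local machine."""
    # Fixed owner and group, matching the values named in the commit message.
    info.uname, info.uid = "pokybuild", 6000
    info.gname, info.gid = "users", 100
    # Assumed to be the committer time of the newest commit (Unix timestamp).
    info.mtime = mtime
    return info


def create_clean_tarball(src_dir: str, out_path: str, mtime: int) -> None:
    """Pack src_dir into a gzip-compressed tarball with fixed member metadata.

    Hypothetical helper: only the tar member headers are normalized here;
    the gzip wrapper itself is left untouched.
    """
    with tarfile.open(out_path, "w:gz") as archive:
        # The filter callback is applied to every member before it is written.
        archive.add(src_dir, arcname=".", filter=lambda ti: normalize(ti, mtime))
```

Calling `create_clean_tarball("clonedir", "mirror.tar.gz", 1648489000)` (all three values are placeholders) would write an archive whose members all carry the same owner, group and modification time, regardless of who ran the fetch or when.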

Comments

Olaf Mandel April 5, 2022, 11:38 a.m. UTC | #1
Hello,

On 28.03.2022 at 19:36, Olaf Mandel wrote:
> When using BB_GENERATE_MIRROR_TARBALLS="1" to generate mirror tarballs
> of git repositories, they leaked local information: username, group and
> time of the last fetch. Remove all these by setting fixed information:
> 
>   * uname = pokybuild (6000)
>   * gname = users (100)

it was pointed out to me by Marek that instead of using pokybuild:users, 
which seems to be an artifact of the YP autobuilder, I should use the 
"canonical" combination oe:oe. But that raises the question: which 
numerical IDs should I use then?

Best regards,
Olaf Mandel
Alexandre Belloni April 5, 2022, 1:19 p.m. UTC | #2
Hello,

On 05/04/2022 13:38:41+0200, Olaf Mandel wrote:
> Hello,
> 
> On 28.03.2022 at 19:36, Olaf Mandel wrote:
> > When using BB_GENERATE_MIRROR_TARBALLS="1" to generate mirror tarballs
> > of git repositories, they leaked local information: username, group and
> > time of the last fetch. Remove all these by setting fixed information:
> > 
> >   * uname = pokybuild (6000)
> >   * gname = users (100)
> 
> it was pointed out to me by Marek that instead of using pokybuild:users,
> which seems to be an artifact of the YP autobuilder, I should use the
> "canonical" combination oe:oe. But that raises the question: which numerical
> IDs should I use then?
> 

Note that the patch has already been applied, so you'd have to send a patch
on top of master.

> Best regards,
> Olaf Mandel




Richard Purdie April 5, 2022, 2:29 p.m. UTC | #3
On Tue, 2022-04-05 at 13:38 +0200, Olaf Mandel wrote:
> Hello,
> 
> On 28.03.2022 at 19:36, Olaf Mandel wrote:
> > When using BB_GENERATE_MIRROR_TARBALLS="1" to generate mirror tarballs
> > of git repositories, they leaked local information: username, group and
> > time of the last fetch. Remove all these by setting fixed information:
> > 
> >   * uname = pokybuild (6000)
> >   * gname = users (100)
> 
> it was pointed out to me by Marek that instead of using pokybuild:users, 
> which seems to be an artifact of the YP autobuilder, I should use the 
> "canonical" combination oe:oe. But that raises the question: which 
> numerical IDs should I use then?

I was happy to get a consistent value; I'm less worried about what that value is,
but if we want something different I'm ok with that...

Cheers,

Richard
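
Whichever owner and group values are eventually settled on, the consistency discussed above can be checked on an existing mirror tarball. The following is a minimal sketch, assuming the pokybuild:users values from the commit message; `EXPECTED` and `leaking_members` are hypothetical names, and `expected_mtime` is assumed to be the committer time of the newest commit as a Unix timestamp.

```python
import tarfile

# Owner and group values from the commit message; adjust if oe:oe
# (or anything else) is chosen instead.
EXPECTED = {"uname": "pokybuild", "uid": 6000, "gname": "users", "gid": 100}


def leaking_members(tarball_path: str, expected_mtime: int) -> list:
    """Return the names of members whose metadata differs from the fixed values."""
    leaks = []
    with tarfile.open(tarball_path) as archive:
        for member in archive.getmembers():
            actual = {key: getattr(member, key) for key in EXPECTED}
            if actual != EXPECTED or member.mtime != expected_mtime:
                leaks.append(member.name)
    return leaks
```

An empty list means every member carries the fixed owner, group and timestamp; any returned name points at an entry that still exposes local information.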

Patch

diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index f6f6b63a..727cebdc 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -448,7 +448,10 @@  class Git(FetchMethod):
 
             logger.info("Creating tarball of git repository")
             with create_atomic(ud.fullmirror) as tfile:
-                runfetchcmd("tar -czf %s ." % tfile, d, workdir=ud.clonedir)
+                mtime = runfetchcmd("git log --all -1 --format=%cD", d,
+                        quiet=True, workdir=ud.clonedir)
+                runfetchcmd("tar -czf %s --owner pokybuild:6000 --group users:100 --mtime \"%s\" ."
+                        % (tfile, mtime), d, workdir=ud.clonedir)
             runfetchcmd("touch %s.done" % ud.fullmirror, d)
 
     def clone_shallow_local(self, ud, dest, d):
diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index 301c4683..68934e79 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -11,6 +11,7 @@  import hashlib
 import tempfile
 import collections
 import os
+import tarfile
 from bb.fetch2 import URI
 from bb.fetch2 import FetchMethod
 import bb
@@ -584,6 +585,37 @@  class GitShallowTarballNamingTest(FetcherTest):
         self.assertIn(self.mirror_tarball, dir)
 
 
+class CleanTarballTest(FetcherTest):
+    def setUp(self):
+        super(CleanTarballTest, self).setUp()
+        self.recipe_url = "git://git.openembedded.org/bitbake"
+        self.recipe_tarball = "git2_git.openembedded.org.bitbake.tar.gz"
+
+        self.d.setVar('BB_GENERATE_MIRROR_TARBALLS', '1')
+        self.d.setVar('SRCREV', '82ea737a0b42a8b53e11c9cde141e9e9c0bd8c40')
+
+    @skipIfNoNetwork()
+    def test_that_the_tarball_contents_does_not_leak_info(self):
+        fetcher = bb.fetch.Fetch([self.recipe_url], self.d)
+
+        fetcher.download()
+
+        fetcher.unpack(self.unpackdir)
+        mtime = bb.process.run('git log --all -1 --format=%ct',
+                cwd=os.path.join(self.unpackdir, 'git'))
+        self.assertEqual(len(mtime), 2)
+        mtime = int(mtime[0])
+
+        archive = tarfile.open(os.path.join(self.dldir, self.recipe_tarball))
+        self.assertNotEqual(len(archive.members), 0)
+        for member in archive.members:
+            self.assertEqual(member.uname, 'pokybuild')
+            self.assertEqual(member.uid, 6000)
+            self.assertEqual(member.gname, 'users')
+            self.assertEqual(member.gid, 100)
+            self.assertEqual(member.mtime, mtime)
+
+
 class FetcherLocalTest(FetcherTest):
     def setUp(self):
         def touch(fn):