diff mbox series

[PATCHv2,5/5] lib/oe/patch: Use git notes to store the filenames for the patches

Message ID 20240219012832.3090768-5-pkj@axis.com
State Accepted, archived
Commit f5e6183b9557477bef74024a587de0bfcc2b7c0d
Headers show
Series [PATCHv2,1/5] lib/oe/patch: Make extractPatches() not extract ignored commits | expand

Commit Message

Peter Kjellerstedt Feb. 19, 2024, 1:28 a.m. UTC
The old way of keeping track of the filenames for the patches that
correspond to the commits was to add a special comment line to the end
of the commit message, e.g., "%% original patch: <filename>", using a
temporary git hook. This method had some drawbacks, e.g.:

* It caused problems if one wanted to push the commits upstream as the
  comment line had to be manually removed.
* The comment line would end up in patches if someone used git
  format-path rather than devtool finish to generate the patches.
* The comment line could interfere with global Git hooks used to
  validate the format of the Git commit message.
* When regenerating patches with `devtool finish --force-patch-refresh`,
  the process typically resulted in adding empty lines to the end of the
  commit messages in the updated patches.

A better way of keeping track of the patch filenames is to use Git
notes. This way the commit messages remain unaffected, but the
information is still shown when, e.g., doing `git log`. A special Git
notes space, refs/notes/devtool, is used to not intefere with the
default Git notes. It is configured to be shown in, e.g., `git log` and
to survive rewrites (i.e., `git commit --amend` and `git rebase`).

Since there is no longer any need for a temporary Git hook, the code
that manipulated the .git/hooks directory has also been removed. To
avoid potential problems due to global Git hooks, --no-verify was added
to the `git commit` command.

To not cause troubles for those who have done `devtool modify` for a
recipe with the old solution and then do `devtool finish` with the new
solution, the code will fall back to look for the old strings in the
commit message if no Git note can be found.

While not technically motivated like above, the way to keep track of
ignored commits is also changed to use Git notes to avoid having
different methods to store similar information.

Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
---

PATCHv2:
* Remove the --fixed-value option from calls to git config. It was
  added in Git version 2.30, but OE Core currently only requires Git
  version 1.8.3.1.
* Work around a bug (or feature?) in git rebase related to commits with
  notes that are completely absorbed by already existing commits during
  a rebase.
* Configure notes.rewriteMode = ignore. This is part of the solution to
  the above problem.

 meta/lib/oe/patch.py                    | 109 +++++++++++++++---------
 meta/lib/oeqa/selftest/cases/devtool.py |   9 +-
 scripts/lib/devtool/standard.py         |  15 ++--
 scripts/lib/devtool/upgrade.py          |  24 +++++-
 4 files changed, 103 insertions(+), 54 deletions(-)

Comments

Ross Burton Feb. 19, 2024, 12:56 p.m. UTC | #1
On 19 Feb 2024, at 01:28, Peter Kjellerstedt via lists.openembedded.org <peter.kjellerstedt=axis.com@lists.openembedded.org> wrote:
> +    @staticmethod
> +    def addNote(repo, ref, key, value=None):
> +        note = key + (": %s" % value if value else "")
> +        notes_ref = GitApplyTree.notes_ref
> +        runcmd(["git", "config", "notes.rewriteMode", "ignore"], repo)
> +        runcmd(["git", "config", "notes.displayRef", notes_ref, notes_ref], repo)
> +        runcmd(["git", "config", "notes.rewriteRef", notes_ref, notes_ref], repo)
> +        runcmd(["git", "notes", "--ref", notes_ref, "append", "-m", note, ref], repo)

It feels like the config calls could be done once when setting up the repository somehow?

> +                        # This loop, which is used to remove any line that
> +                        # starts with "%% original patch", is kept for backwards
> +                        # compatibility. If/when that compatibility is dropped,
> +                        # it can be replaced with code to just read the first
> +                        # line of the patch file to get the SHA-1, and the code
> +                        # below that writes the modified patch file can be
> +                        # replaced with a simple file move.

This is all that is needed for the new code to work correctly with an old worktree, right?

Ross
Peter Kjellerstedt Feb. 19, 2024, 3:03 p.m. UTC | #2
> -----Original Message-----
> From: Ross Burton <Ross.Burton@arm.com>
> Sent: den 19 februari 2024 13:56
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> Cc: openembedded-core@lists.openembedded.org
> Subject: Re: [OE-core] [PATCHv2 5/5] lib/oe/patch: Use git notes to store
> the filenames for the patches
> 
> On 19 Feb 2024, at 01:28, Peter Kjellerstedt via lists.openembedded.org
> <peter.kjellerstedt=axis.com@lists.openembedded.org> wrote:
> > +    @staticmethod
> > +    def addNote(repo, ref, key, value=None):
> > +        note = key + (": %s" % value if value else "")
> > +        notes_ref = GitApplyTree.notes_ref
> > +        runcmd(["git", "config", "notes.rewriteMode", "ignore"], repo)
> > +        runcmd(["git", "config", "notes.displayRef", notes_ref, notes_ref], repo)
> > +        runcmd(["git", "config", "notes.rewriteRef", notes_ref, notes_ref], repo)
> > +        runcmd(["git", "notes", "--ref", notes_ref, "append", "-m", note, ref], repo)
> 
> It feels like the config calls could be done once when setting up the
> repository somehow?

While I do agree that would be better, the reason I did it like this is because 
there is nothing in this class that is related to setting up the repositories 
that are used while calling this function. Thus it felt wrong to rely on that 
the repository has been configured correctly beforehand for the function to 
work as intended.

> > +                        # This loop, which is used to remove any line that
> > +                        # starts with "%% original patch", is kept for backwards
> > +                        # compatibility. If/when that compatibility is dropped,
> > +                        # it can be replaced with code to just read the first
> > +                        # line of the patch file to get the SHA-1, and the code
> > +                        # below that writes the modified patch file can be
> > +                        # replaced with a simple file move.
> 
> This is all that is needed for the new code to work correctly with an old
> worktree, right?

Well, it relies on the fallback (i.e., the exception clause) in GitApplyTree.getNotes() 
that reads from the commit message in case a Git note cannot be found. Do you want me 
to extend the comment so that it includes that refence as well?

> 
> Ross

//Peter
Ross Burton Feb. 19, 2024, 3:52 p.m. UTC | #3
On 19 Feb 2024, at 15:03, Peter Kjellerstedt <peter.kjellerstedt@axis.com> wrote:
>> This is all that is needed for the new code to work correctly with an old
>> worktree, right?
> 
> Well, it relies on the fallback (i.e., the exception clause) in GitApplyTree.getNotes() 
> that reads from the commit message in case a Git note cannot be found. Do you want me 
> to extend the comment so that it includes that refence as well?

No need, I was just checking that the new code would work on an old worktree using comments.

Ross
diff mbox series

Patch

diff --git a/meta/lib/oe/patch.py b/meta/lib/oe/patch.py
index 3ded5f3601..60a0cc8291 100644
--- a/meta/lib/oe/patch.py
+++ b/meta/lib/oe/patch.py
@@ -294,8 +294,9 @@  class PatchTree(PatchSet):
         self.Pop(all=True)
 
 class GitApplyTree(PatchTree):
-    patch_line_prefix = '%% original patch'
-    ignore_commit_prefix = '%% ignore'
+    notes_ref = "refs/notes/devtool"
+    original_patch = 'original patch'
+    ignore_commit = 'ignore'
 
     def __init__(self, dir, d):
         PatchTree.__init__(self, dir, d)
@@ -452,7 +453,7 @@  class GitApplyTree(PatchTree):
         # Prepare git command
         cmd = ["git"]
         GitApplyTree.gitCommandUserOptions(cmd, commituser, commitemail)
-        cmd += ["commit", "-F", tmpfile]
+        cmd += ["commit", "-F", tmpfile, "--no-verify"]
         # git doesn't like plain email addresses as authors
         if author and '<' in author:
             cmd.append('--author="%s"' % author)
@@ -460,15 +461,53 @@  class GitApplyTree(PatchTree):
             cmd.append('--date="%s"' % date)
         return (tmpfile, cmd)
 
+    @staticmethod
+    def addNote(repo, ref, key, value=None):
+        note = key + (": %s" % value if value else "")
+        notes_ref = GitApplyTree.notes_ref
+        runcmd(["git", "config", "notes.rewriteMode", "ignore"], repo)
+        runcmd(["git", "config", "notes.displayRef", notes_ref, notes_ref], repo)
+        runcmd(["git", "config", "notes.rewriteRef", notes_ref, notes_ref], repo)
+        runcmd(["git", "notes", "--ref", notes_ref, "append", "-m", note, ref], repo)
+
+    @staticmethod
+    def removeNote(repo, ref, key):
+        notes = GitApplyTree.getNotes(repo, ref)
+        notes = {k: v for k, v in notes.items() if k != key and not k.startswith(key + ":")}
+        runcmd(["git", "notes", "--ref", GitApplyTree.notes_ref, "remove", "--ignore-missing", ref], repo)
+        for note, value in notes.items():
+            GitApplyTree.addNote(repo, ref, note, value)
+
+    @staticmethod
+    def getNotes(repo, ref):
+        import re
+
+        note = None
+        try:
+            note = runcmd(["git", "notes", "--ref", GitApplyTree.notes_ref, "show", ref], repo)
+            prefix = ""
+        except CmdError:
+            note = runcmd(['git', 'show', '-s', '--format=%B', ref], repo)
+            prefix = "%% "
+
+        note_re = re.compile(r'^%s(.*?)(?::\s*(.*))?$' % prefix)
+        notes = dict()
+        for line in note.splitlines():
+            m = note_re.match(line)
+            if m:
+                notes[m.group(1)] = m.group(2)
+
+        return notes
+
     @staticmethod
     def commitIgnored(subject, dir=None, files=None, d=None):
         if files:
             runcmd(['git', 'add'] + files, dir)
-        message = "%s\n\n%s" % (subject, GitApplyTree.ignore_commit_prefix)
         cmd = ["git"]
         GitApplyTree.gitCommandUserOptions(cmd, d=d)
-        cmd += ["commit", "-m", message, "--no-verify"]
+        cmd += ["commit", "-m", subject, "--no-verify"]
         runcmd(cmd, dir)
+        GitApplyTree.addNote(dir, "HEAD", GitApplyTree.ignore_commit)
 
     @staticmethod
     def extractPatches(tree, startcommits, outdir, paths=None):
@@ -484,18 +523,20 @@  class GitApplyTree(PatchTree):
                 out = runcmd(["sh", "-c", " ".join(shellcmd)], os.path.join(tree, name))
                 if out:
                     for srcfile in out.split():
-                        outfile = os.path.basename(srcfile)
+                        # This loop, which is used to remove any line that
+                        # starts with "%% original patch", is kept for backwards
+                        # compatibility. If/when that compatibility is dropped,
+                        # it can be replaced with code to just read the first
+                        # line of the patch file to get the SHA-1, and the code
+                        # below that writes the modified patch file can be
+                        # replaced with a simple file move.
                         for encoding in ['utf-8', 'latin-1']:
                             patchlines = []
                             try:
                                 with open(srcfile, 'r', encoding=encoding, newline='') as f:
                                     for line in f:
-                                        if line.startswith(GitApplyTree.patch_line_prefix):
-                                            outfile = line.split()[-1].strip()
+                                        if line.startswith("%% " + GitApplyTree.original_patch):
                                             continue
-                                        if line.startswith(GitApplyTree.ignore_commit_prefix):
-                                            outfile = None
-                                            break
                                         patchlines.append(line)
                             except UnicodeDecodeError:
                                 continue
@@ -503,11 +544,16 @@  class GitApplyTree(PatchTree):
                         else:
                             raise PatchError('Unable to find a character encoding to decode %s' % srcfile)
 
-                        if outfile:
-                            bb.utils.mkdirhier(os.path.join(outdir, name))
-                            with open(os.path.join(outdir, name, outfile), 'w') as of:
-                                for line in patchlines:
-                                    of.write(line)
+                        sha1 = patchlines[0].split()[1]
+                        notes = GitApplyTree.getNotes(os.path.join(tree, name), sha1)
+                        if GitApplyTree.ignore_commit in notes:
+                            continue
+                        outfile = notes.get(GitApplyTree.original_patch, os.path.basename(srcfile))
+
+                        bb.utils.mkdirhier(os.path.join(outdir, name))
+                        with open(os.path.join(outdir, name, outfile), 'w') as of:
+                            for line in patchlines:
+                                of.write(line)
         finally:
             shutil.rmtree(tempdir)
 
@@ -555,28 +601,11 @@  class GitApplyTree(PatchTree):
 
             return runcmd(["sh", "-c", " ".join(shellcmd)], self.dir)
 
-        # Add hooks which add a pointer to the original patch file name in the commit message
         reporoot = (runcmd("git rev-parse --show-toplevel".split(), self.dir) or '').strip()
         if not reporoot:
             raise Exception("Cannot get repository root for directory %s" % self.dir)
-        gitdir = (runcmd("git rev-parse --absolute-git-dir".split(), self.dir) or '').strip()
-        if not gitdir:
-            raise Exception("Cannot get gitdir for directory %s" % self.dir)
-        hooks_dir = os.path.join(gitdir, 'hooks')
-        hooks_dir_backup = hooks_dir + '.devtool-orig'
-        if os.path.lexists(hooks_dir_backup):
-            raise Exception("Git hooks backup directory already exists: %s" % hooks_dir_backup)
-        if os.path.lexists(hooks_dir):
-            shutil.move(hooks_dir, hooks_dir_backup)
-        os.mkdir(hooks_dir)
-        commithook = os.path.join(hooks_dir, 'commit-msg')
-        applyhook = os.path.join(hooks_dir, 'applypatch-msg')
-        with open(commithook, 'w') as f:
-            # NOTE: the formatting here is significant; if you change it you'll also need to
-            # change other places which read it back
-            f.write('echo "\n%s: $PATCHFILE" >> $1' % GitApplyTree.patch_line_prefix)
-        os.chmod(commithook, 0o755)
-        shutil.copy2(commithook, applyhook)
+
+        patch_applied = True
         try:
             patchfilevar = 'PATCHFILE="%s"' % os.path.basename(patch['file'])
             if self._need_dirty_check():
@@ -587,7 +616,7 @@  class GitApplyTree(PatchTree):
                     pass
                 else:
                     if output:
-                        # The tree is dirty, not need to try to apply patches with git anymore
+                        # The tree is dirty, no need to try to apply patches with git anymore
                         # since they fail, fallback directly to patch
                         output = PatchTree._applypatch(self, patch, force, reverse, run)
                         output += self._commitpatch(patch, patchfilevar)
@@ -620,10 +649,12 @@  class GitApplyTree(PatchTree):
                     output = PatchTree._applypatch(self, patch, force, reverse, run)
                 output += self._commitpatch(patch, patchfilevar)
                 return output
+        except:
+            patch_applied = False
+            raise
         finally:
-            shutil.rmtree(hooks_dir)
-            if os.path.lexists(hooks_dir_backup):
-                shutil.move(hooks_dir_backup, hooks_dir)
+            if patch_applied:
+                GitApplyTree.addNote(self.dir, "HEAD", GitApplyTree.original_patch, os.path.basename(patch['file']))
 
 
 class QuiltTree(PatchSet):
diff --git a/meta/lib/oeqa/selftest/cases/devtool.py b/meta/lib/oeqa/selftest/cases/devtool.py
index c4dcdb4550..d37848bdef 100644
--- a/meta/lib/oeqa/selftest/cases/devtool.py
+++ b/meta/lib/oeqa/selftest/cases/devtool.py
@@ -989,9 +989,10 @@  class DevtoolModifyTests(DevtoolBase):
         self.assertIn(tempdir, result.output)
         # Check git repo
         self._check_src_repo(tempdir)
-        # Check that the patch is correctly applied
-        # last commit message in the tree must contain
-        # %% original patch: <patchname>
+        # Check that the patch is correctly applied.
+        # The last commit message in the tree must contain the following note:
+        # Notes (devtool):
+        #     original patch: <patchname>
         # ..
         patchname = None
         for uri in src_uri:
@@ -999,7 +1000,7 @@  class DevtoolModifyTests(DevtoolBase):
                 patchname = uri.replace("file://", "").partition('.patch')[0] + '.patch'
         self.assertIsNotNone(patchname)
         result = runCmd('git -C %s log -1' % tempdir)
-        self.assertIn("%%%% original patch: %s" % patchname, result.output)
+        self.assertIn("Notes (devtool):\n    original patch: %s" % patchname, result.output)
 
         # Configure the recipe to check that the git dependencies are correctly patched in cargo config
         bitbake('-c configure %s' % testrecipe)
diff --git a/scripts/lib/devtool/standard.py b/scripts/lib/devtool/standard.py
index 6d7fd17fbd..7972b4f822 100644
--- a/scripts/lib/devtool/standard.py
+++ b/scripts/lib/devtool/standard.py
@@ -937,14 +937,13 @@  def modify(args, config, basepath, workspace):
             seen_patches = []
             for branch in branches:
                 branch_patches[branch] = []
-                (stdout, _) = bb.process.run('git log devtool-base..%s' % branch, cwd=srctree)
-                for line in stdout.splitlines():
-                    line = line.strip()
-                    if line.startswith(oe.patch.GitApplyTree.patch_line_prefix):
-                        origpatch = line[len(oe.patch.GitApplyTree.patch_line_prefix):].split(':', 1)[-1].strip()
-                        if not origpatch in seen_patches:
-                            seen_patches.append(origpatch)
-                            branch_patches[branch].append(origpatch)
+                (stdout, _) = bb.process.run('git rev-list devtool-base..%s' % branch, cwd=srctree)
+                for sha1 in stdout.splitlines():
+                    notes = oe.patch.GitApplyTree.getNotes(srctree, sha1.strip())
+                    origpatch = notes.get(oe.patch.GitApplyTree.original_patch)
+                    if origpatch and origpatch not in seen_patches:
+                        seen_patches.append(origpatch)
+                        branch_patches[branch].append(origpatch)
 
         # Need to grab this here in case the source is within a subdirectory
         srctreebase = srctree
diff --git a/scripts/lib/devtool/upgrade.py b/scripts/lib/devtool/upgrade.py
index ef58523dc8..fa5b8ef3c7 100644
--- a/scripts/lib/devtool/upgrade.py
+++ b/scripts/lib/devtool/upgrade.py
@@ -261,7 +261,7 @@  def _extract_new_source(newpv, srctree, no_patch, srcrev, srcbranch, branch, kee
     revs = {}
     for path in paths:
         (stdout, _) = _run('git rev-parse HEAD', cwd=path)
-        revs[os.path.relpath(path,srctree)] = stdout.rstrip()
+        revs[os.path.relpath(path, srctree)] = stdout.rstrip()
 
     if no_patch:
         patches = oe.recipeutils.get_recipe_patches(crd)
@@ -272,17 +272,35 @@  def _extract_new_source(newpv, srctree, no_patch, srcrev, srcbranch, branch, kee
             _run('git checkout devtool-patched -b %s' % branch, cwd=path)
             (stdout, _) = _run('git branch --list devtool-override-*', cwd=path)
             branches_to_rebase = [branch] + stdout.split()
+            target_branch = revs[os.path.relpath(path, srctree)]
+
+            # There is a bug (or feature?) in git rebase where if a commit with
+            # a note is fully rebased away by being part of an old commit, the
+            # note is still attached to the old commit. Avoid this by making
+            # sure all old devtool related commits have a note attached to them
+            # (this assumes git config notes.rewriteMode is set to ignore).
+            (stdout, _) = __run('git rev-list devtool-base..%s' % target_branch)
+            for rev in stdout.splitlines():
+                if not oe.patch.GitApplyTree.getNotes(path, rev):
+                    oe.patch.GitApplyTree.addNote(path, rev, "dummy")
+
             for b in branches_to_rebase:
-                logger.info("Rebasing {} onto {}".format(b, revs[os.path.relpath(path,srctree)]))
+                logger.info("Rebasing {} onto {}".format(b, target_branch))
                 _run('git checkout %s' % b, cwd=path)
                 try:
-                    _run('git rebase %s' % revs[os.path.relpath(path, srctree)], cwd=path)
+                    _run('git rebase %s' % target_branch, cwd=path)
                 except bb.process.ExecutionError as e:
                     if 'conflict' in e.stdout:
                         logger.warning('Command \'%s\' failed:\n%s\n\nYou will need to resolve conflicts in order to complete the upgrade.' % (e.command, e.stdout.rstrip()))
                         _run('git rebase --abort', cwd=path)
                     else:
                         logger.warning('Command \'%s\' failed:\n%s' % (e.command, e.stdout))
+
+            # Remove any dummy notes added above.
+            (stdout, _) = __run('git rev-list devtool-base..%s' % target_branch)
+            for rev in stdout.splitlines():
+                oe.patch.GitApplyTree.removeNote(path, rev, "dummy")
+
             _run('git checkout %s' % branch, cwd=path)
 
     if tmpsrctree: