Patchwork [bitbake-devel] git: Use merge-base instead of log for testing if a commit is present

login
register
mail settings
Submitter Richard Purdie
Date Nov. 18, 2013, 5:17 p.m.
Message ID <1384795036.6460.247.camel@ted>
Download mbox | patch
Permalink /patch/61919/
State New
Headers show

Comments

Richard Purdie - Nov. 18, 2013, 5:17 p.m.
The current use of git log to check if a given revision is present can be
a little fragile.

For example if revision X was on branch A, and then later added to branch
B, the update checks would not notice this since they just check for X
being in the repository.

We also had some autobuilder corruption where an older packed-refs file
was copied over a new repository containing newer pack files. There
was no update to the refs file since the revision was present but
not accessible in any branch.

The correct fix is to check that the required revisions are present
on the specific branches. This patch does this using merge-base.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
Martin Jansa - Nov. 18, 2013, 6:32 p.m.
On Mon, Nov 18, 2013 at 05:17:16PM +0000, Richard Purdie wrote:
> The current use of git log to check if a given revision is present can be
> a little fragile.
> 
> For example if revision X was on branch A, and then later added to branch
> B, the update checks would not notice this since they just check for X
> being in the repository.
> 
> We also had some autobuilder corruption where an older packed-refs file
> was copied over a new repository containing newer pack files. There
> was no update to the refs file since the revision was present but
> not accessible in any branch.
> 
> The correct fix is to check that the required revisions are present
> on the specific branches. This patch does this using merge-base.

I guess that merge-base is probably faster or easier to use, but did you
consider using git branch --contains?

e.g. checking if selected branch is in
git branch --contains ud.revisions[name]

I'm asking only because I'm using "git branch --contains" in some
scripts and maybe there is good reason I should rewrite them to use git
merge-base instead.

Thanks

> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index 6175e4c..99230c1 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -150,7 +150,7 @@ class Git(FetchMethod):
>              return True
>          os.chdir(ud.clonedir)
>          for name in ud.names:
> -            if not self._contains_ref(ud.revisions[name], d):
> +            if not self._contains_ref(ud.revisions[name], ud.branches[name], d):
>                  return True
>          if ud.write_tarballs and not os.path.exists(ud.fullmirror):
>              return True
> @@ -197,7 +197,7 @@ class Git(FetchMethod):
>          # Update the checkout if needed
>          needupdate = False
>          for name in ud.names:
> -            if not self._contains_ref(ud.revisions[name], d):
> +            if not self._contains_ref(ud.revisions[name], ud.branches[name], d):
>                  needupdate = True
>          if needupdate:
>              try: 
> @@ -281,13 +281,14 @@ class Git(FetchMethod):
>      def supports_srcrev(self):
>          return True
>  
> -    def _contains_ref(self, tag, d):
> +    def _contains_ref(self, tag, branch, d):
>          basecmd = data.getVar("FETCHCMD_git", d, True) or "git"
> -        cmd = "%s log --pretty=oneline -n 1 %s -- 2> /dev/null | wc -l" % (basecmd, tag)
> -        output = runfetchcmd(cmd, d, quiet=True)
> -        if len(output.split()) > 1:
> -            raise bb.fetch2.FetchError("The command '%s' gave output with more then 1 line unexpectedly, output: '%s'" % (cmd, output))
> -        return output.split()[0] != "0"
> +        cmd = "%s merge-base --is-ancestorlog %s %s" % (basecmd, tag, branch)
> +        try:
> +            output = runfetchcmd(cmd, d, quiet=True)
> +        except bb.fetch2.FetchError:
> +            return False
> +        return True
>  
>      def _revision_key(self, url, ud, d, name):
>          """
> 
> 
> _______________________________________________
> bitbake-devel mailing list
> bitbake-devel@lists.openembedded.org
> http://lists.openembedded.org/mailman/listinfo/bitbake-devel
Richard Purdie - Nov. 18, 2013, 9:36 p.m.
On Mon, 2013-11-18 at 19:32 +0100, Martin Jansa wrote:
> On Mon, Nov 18, 2013 at 05:17:16PM +0000, Richard Purdie wrote:
> > The current use of git log to check if a given revision is present can be
> > a little fragile.
> > 
> > For example if revision X was on branch A, and then later added to branch
> > B, the update checks would not notice this since they just check for X
> > being in the repository.
> > 
> > We also had some autobuilder corruption where an older packed-refs file
> > was copied over a new repository containing newer pack files. There
> > was no update to the refs file since the revision was present but
> > not accessible in any branch.
> > 
> > The correct fix is to check that the required revisions are present
> > on the specific branches. This patch does this using merge-base.
> 
> I guess that merge-base is probably faster or easier to use, but did you
> consider using git branch --contains?
> 
> e.g. checking if selected branch is in
> git branch --contains ud.revisions[name]
> 
> I'm asking only because I'm using "git branch --contains" in some
> scripts and maybe there is good reason I should rewrite them to use git
> merge-base instead.

git branch --contains gives you a branch list back which you'd have to
then parse to see if branch X was listed. git merge-base is faster and
we can directly ask what we need with an exit code. There is no more
deeply technical reason than that, was just neater.

Cheers,

Richard

Patch

diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
index 6175e4c..99230c1 100644
--- a/bitbake/lib/bb/fetch2/git.py
+++ b/bitbake/lib/bb/fetch2/git.py
@@ -150,7 +150,7 @@  class Git(FetchMethod):
             return True
         os.chdir(ud.clonedir)
         for name in ud.names:
-            if not self._contains_ref(ud.revisions[name], d):
+            if not self._contains_ref(ud.revisions[name], ud.branches[name], d):
                 return True
         if ud.write_tarballs and not os.path.exists(ud.fullmirror):
             return True
@@ -197,7 +197,7 @@  class Git(FetchMethod):
         # Update the checkout if needed
         needupdate = False
         for name in ud.names:
-            if not self._contains_ref(ud.revisions[name], d):
+            if not self._contains_ref(ud.revisions[name], ud.branches[name], d):
                 needupdate = True
         if needupdate:
             try: 
@@ -281,13 +281,14 @@  class Git(FetchMethod):
     def supports_srcrev(self):
         return True
 
-    def _contains_ref(self, tag, d):
+    def _contains_ref(self, tag, branch, d):
         basecmd = data.getVar("FETCHCMD_git", d, True) or "git"
-        cmd = "%s log --pretty=oneline -n 1 %s -- 2> /dev/null | wc -l" % (basecmd, tag)
-        output = runfetchcmd(cmd, d, quiet=True)
-        if len(output.split()) > 1:
-            raise bb.fetch2.FetchError("The command '%s' gave output with more then 1 line unexpectedly, output: '%s'" % (cmd, output))
-        return output.split()[0] != "0"
+        cmd = "%s merge-base --is-ancestorlog %s %s" % (basecmd, tag, branch)
+        try:
+            output = runfetchcmd(cmd, d, quiet=True)
+        except bb.fetch2.FetchError:
+            return False
+        return True
 
     def _revision_key(self, url, ud, d, name):
         """