Patchwork [bitbake-devel] fetch2: Shorten long srcrevs

login
register
mail settings
Submitter Richard Purdie
Date May 19, 2013, 10:21 a.m.
Message ID <1368958915.32727.96.camel@ted>
Download mbox | patch
Permalink /patch/50085/
State New
Headers show

Comments

Richard Purdie - May 19, 2013, 10:21 a.m.
The long srcrevs are mainly used or the workdir construction as well as
the package version. The long entries are hashes generated by the git fetcher
and other scms using a similar revision mechanism.

We need these to change when the package changes however collisions are
unlikely to happen within the domains we care about. The long revisions
have generated negative user feedback due to the use in path and file
names.

This patch therefore truncates the revisions to 10 characters maximum.

This should be safe in the contexts where these revisions are used as
the chances of spatially close collisions is very low (distant
collisions are not a major issue in the way we use these).

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
CC'd various people who I believe might have feelings about this. The
patch depends on the sortable_revisions patch I just sent out.
Martin Jansa - May 19, 2013, 12:32 p.m.
On Sun, May 19, 2013 at 01:21:55PM +0300, Richard Purdie wrote:
> The long srcrevs are mainly used or the workdir construction as well as
> the package version. The long entries are hashes generated by the git fetcher
> and other scms using a similar revision mechanism.
> 
> We need these to change when the package changes however collisions are
> unlikely to happen within the domains we care about. The long revisions
> have generated negative user feedback due to the use in path and file
> names.
> 
> This patch therefore truncates the revisions to 10 characters maximum.

What about 7 characters like git log --oneline is using?

SRCREVs in recipes are still supposed to be 40 characters for git,
right? Because otherwise this part needs to be changed too:

lib/bb/fetch2/git.py
            # Ensure anything that doesn't look like a sha256
            # checksum/revision is translated into one
            if not ud.revisions[name] or len(ud.revisions[name]) != 40 or (False in [c in "abcdef0123456789" for c in ud.revisions[name]]):

Cheers,

> This should be safe in the contexts where these revisions are used as
> the chances of spatially close collisions is very low (distant
> collisions are not a major issue in the way we use these).
> 
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
> CC'd various people who I believe might have feelings about this. The
> patch depends on the sortable_revisions patch I just sent out.
> 
> diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
> index f8f8244..402329d 100644
> --- a/bitbake/lib/bb/fetch2/__init__.py
> +++ b/bitbake/lib/bb/fetch2/__init__.py
> @@ -617,6 +617,8 @@ def get_srcrev(d):
>  
>      if len(scms) == 1 and len(urldata[scms[0]].names) == 1:
>          autoinc, rev = urldata[scms[0]].method.sortable_revision(scms[0], urldata[scms[0]], d, urldata[scms[0]].names[0])
> +        if len(rev) > 10:
> +            rev = rev[:10]
>          if autoinc:
>              return "AUTOINC+" + rev
>          return rev
> @@ -633,6 +635,8 @@ def get_srcrev(d):
>          ud = urldata[scm]
>          for name in ud.names:
>              autoinc, rev = ud.method.sortable_revision(scm, ud, d, name)
> +            if len(rev) > 10:
> +                rev = rev[:10]
>              if autoinc and not seenautoinc:
>                  rev = "AUTOINC+" + rev
>                  seenautoinc
> 
>
Bruce Ashfield - May 19, 2013, 4:02 p.m.
On 13-05-19 6:21 AM, Richard Purdie wrote:
> The long srcrevs are mainly used or the workdir construction as well as
> the package version. The long entries are hashes generated by the git fetcher
> and other scms using a similar revision mechanism.
>
> We need these to change when the package changes however collisions are
> unlikely to happen within the domains we care about. The long revisions
> have generated negative user feedback due to the use in path and file
> names.
>
> This patch therefore truncates the revisions to 10 characters maximum.
>
> This should be safe in the contexts where these revisions are used as
> the chances of spatially close collisions is very low (distant
> collisions are not a major issue in the way we use these).

I for one like the change, the split of PN/<hashes> was a nice
improvement and this will make life a little bit easier again.

10 should definitely be enough to avoid collisions, I've rarely had
to go above 7 or 8 characters in my kernel hashes, and 10 will make
collisions that much harder to find.

As Martin asked, I assume we continue to specify 40 character hashes
in our recipes, which is fine with me since it makes the recipe
completely accurate.

Cheers,

Bruce


>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
> CC'd various people who I believe might have feelings about this. The
> patch depends on the sortable_revisions patch I just sent out.
>
> diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
> index f8f8244..402329d 100644
> --- a/bitbake/lib/bb/fetch2/__init__.py
> +++ b/bitbake/lib/bb/fetch2/__init__.py
> @@ -617,6 +617,8 @@ def get_srcrev(d):
>
>       if len(scms) == 1 and len(urldata[scms[0]].names) == 1:
>           autoinc, rev = urldata[scms[0]].method.sortable_revision(scms[0], urldata[scms[0]], d, urldata[scms[0]].names[0])
> +        if len(rev) > 10:
> +            rev = rev[:10]
>           if autoinc:
>               return "AUTOINC+" + rev
>           return rev
> @@ -633,6 +635,8 @@ def get_srcrev(d):
>           ud = urldata[scm]
>           for name in ud.names:
>               autoinc, rev = ud.method.sortable_revision(scm, ud, d, name)
> +            if len(rev) > 10:
> +                rev = rev[:10]
>               if autoinc and not seenautoinc:
>                   rev = "AUTOINC+" + rev
>                   seenautoinc
>
>
Darren Hart - May 19, 2013, 5:07 p.m.
On 05/19/2013 03:21 AM, Richard Purdie wrote:
> The long srcrevs are mainly used or the workdir construction as well as
> the package version. The long entries are hashes generated by the git fetcher
> and other scms using a similar revision mechanism.
> 
> We need these to change when the package changes however collisions are
> unlikely to happen within the domains we care about. The long revisions
> have generated negative user feedback due to the use in path and file
> names.
> 
> This patch therefore truncates the revisions to 10 characters maximum.
> 
> This should be safe in the contexts where these revisions are used as
> the chances of spatially close collisions is very low (distant
> collisions are not a major issue in the way we use these).
> 
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>


Thank you kindly, you beat me to it.

Acked-by: Darren Hart <dvhart@linux.intel.com>


> ---
> CC'd various people who I believe might have feelings about this. The
> patch depends on the sortable_revisions patch I just sent out.
> 
> diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
> index f8f8244..402329d 100644
> --- a/bitbake/lib/bb/fetch2/__init__.py
> +++ b/bitbake/lib/bb/fetch2/__init__.py
> @@ -617,6 +617,8 @@ def get_srcrev(d):
>  
>      if len(scms) == 1 and len(urldata[scms[0]].names) == 1:
>          autoinc, rev = urldata[scms[0]].method.sortable_revision(scms[0], urldata[scms[0]], d, urldata[scms[0]].names[0])
> +        if len(rev) > 10:
> +            rev = rev[:10]
>          if autoinc:
>              return "AUTOINC+" + rev
>          return rev
> @@ -633,6 +635,8 @@ def get_srcrev(d):
>          ud = urldata[scm]
>          for name in ud.names:
>              autoinc, rev = ud.method.sortable_revision(scm, ud, d, name)
> +            if len(rev) > 10:
> +                rev = rev[:10]
>              if autoinc and not seenautoinc:
>                  rev = "AUTOINC+" + rev
>                  seenautoinc
> 
>
Richard Purdie - May 20, 2013, 8:33 a.m.
On Sun, 2013-05-19 at 14:32 +0200, Martin Jansa wrote:
> On Sun, May 19, 2013 at 01:21:55PM +0300, Richard Purdie wrote:
> > The long srcrevs are mainly used or the workdir construction as well as
> > the package version. The long entries are hashes generated by the git fetcher
> > and other scms using a similar revision mechanism.
> > 
> > We need these to change when the package changes however collisions are
> > unlikely to happen within the domains we care about. The long revisions
> > have generated negative user feedback due to the use in path and file
> > names.
> > 
> > This patch therefore truncates the revisions to 10 characters maximum.
> 
> What about 7 characters like git log --oneline is using?

git uses varying lengths depending on whether it detects collisions. I
really don't want to have to do the collision detection or worry about
this so 10 characters seemed like a nice number, its short enough to
address the complaints but long enough to avoid problems.

> SRCREVs in recipes are still supposed to be 40 characters for git,
> right? Because otherwise this part needs to be changed too:
> 
> lib/bb/fetch2/git.py
>             # Ensure anything that doesn't look like a sha256
>             # checksum/revision is translated into one
>             if not ud.revisions[name] or len(ud.revisions[name]) != 40 or (False in [c in "abcdef0123456789" for c in ud.revisions[name]]):

Correct, we still specify the exact revisions, this just truncates it
when its used in the user visible places like the version in the
directory name or package version.

Cheers,

Richard
Darren Hart - May 20, 2013, 6:23 p.m.
On 05/20/2013 01:33 AM, Richard Purdie wrote:
> On Sun, 2013-05-19 at 14:32 +0200, Martin Jansa wrote:
>> On Sun, May 19, 2013 at 01:21:55PM +0300, Richard Purdie wrote:
>>> The long srcrevs are mainly used or the workdir construction as well as
>>> the package version. The long entries are hashes generated by the git fetcher
>>> and other scms using a similar revision mechanism.
>>>
>>> We need these to change when the package changes however collisions are
>>> unlikely to happen within the domains we care about. The long revisions
>>> have generated negative user feedback due to the use in path and file
>>> names.
>>>
>>> This patch therefore truncates the revisions to 10 characters maximum.
>>
>> What about 7 characters like git log --oneline is using?
> 
> git uses varying lengths depending on whether it detects collisions. I
> really don't want to have to do the collision detection or worry about
> this so 10 characters seemed like a nice number, its short enough to
> address the complaints but long enough to avoid problems.


Exactly. 10 seems like a good choice to me as well, it significantly
reduces the path length while keeping the implementation simple and
minimizing the chances of collisions.
Martin Jansa - May 22, 2013, 7:51 p.m.
On Sun, May 19, 2013 at 01:21:55PM +0300, Richard Purdie wrote:
> The long srcrevs are mainly used or the workdir construction as well as
> the package version. The long entries are hashes generated by the git fetcher
> and other scms using a similar revision mechanism.
> 
> We need these to change when the package changes however collisions are
> unlikely to happen within the domains we care about. The long revisions
> have generated negative user feedback due to the use in path and file
> names.
> 
> This patch therefore truncates the revisions to 10 characters maximum.
> 
> This should be safe in the contexts where these revisions are used as
> the chances of spatially close collisions is very low (distant
> collisions are not a major issue in the way we use these).

Interesting side-effect of this change is that it rebuilds all recipes
with git SRCPV in PV even with OEBasic signature handler, because stamp files 
are also renamed.

I'm not saying it's bad thing, I was just surprised to see so many
rebuilds after last update.

Patch

diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
index f8f8244..402329d 100644
--- a/bitbake/lib/bb/fetch2/__init__.py
+++ b/bitbake/lib/bb/fetch2/__init__.py
@@ -617,6 +617,8 @@  def get_srcrev(d):
 
     if len(scms) == 1 and len(urldata[scms[0]].names) == 1:
         autoinc, rev = urldata[scms[0]].method.sortable_revision(scms[0], urldata[scms[0]], d, urldata[scms[0]].names[0])
+        if len(rev) > 10:
+            rev = rev[:10]
         if autoinc:
             return "AUTOINC+" + rev
         return rev
@@ -633,6 +635,8 @@  def get_srcrev(d):
         ud = urldata[scm]
         for name in ud.names:
             autoinc, rev = ud.method.sortable_revision(scm, ud, d, name)
+            if len(rev) > 10:
+                rev = rev[:10]
             if autoinc and not seenautoinc:
                 rev = "AUTOINC+" + rev
                 seenautoinc