Patchwork [1/1] bbclass/sstate: only allowed sstate-cache objects are allowed in a build (read-only sstate-cache)

Submitter Hongxu Jia
Date Aug. 21, 2014, 2:36 a.m.
Message ID <990d4ba779e4127ff546796d62e2921392dbc0f7.1408588282.git.hongxu.jia@windriver.com>
Permalink /patch/78721/
State New

Comments

Hongxu Jia - Aug. 21, 2014, 2:36 a.m.
The requirement comes from developers who demand that only the "new"
software they write may be compiled from source; everything else must
be reused as binaries from an existing sstate-cache. If the developer
makes a change that triggers a rebuild, it should be an instant error.

The purpose of this is for the sstate-cache code to check whether the
item exists. If it doesn't, the item needs to be in a whitelist or we
need to fail.

In the sstate-cache code, add a check in the return path of
sstate_checkhashes. If the read-only sstate-cache is enabled and the
recipe's ${PN} is not in ${SSTATECACHE_WHITELIST}, it triggers an
instant error.

...
$ bitbake db
ERROR: Read-only sstate-cache is enabled, the build of
"db rpm-native gcc-runtime eglibc linux-libc-headers libgcc"
did not come from sstate-cache. Only the recipe listed in
SSTATECACHE_WHITELIST is allowed to build from source

Summary: There was 1 ERROR message shown, returning a non-zero exit code.
...
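
The whitelist check described above can be sketched outside BitBake as
follows. This is only an illustration under stated assumptions: the
function name disallowed_recipes and its arguments are hypothetical and
not part of the patch, which reads ${PN} and ${SSTATECACHE_WHITELIST}
from the datastore instead.

```python
def disallowed_recipes(missed_pns, whitelist):
    """Return recipes that missed the sstate-cache and are not whitelisted.

    `missed_pns` is a list of recipe names (hypothetical input standing in
    for the ${PN} of each missed task); `whitelist` is the space-separated
    value of SSTATECACHE_WHITELIST. An empty whitelist disables the check.
    """
    allowed = set(whitelist.split())
    if not allowed:
        return []

    rejected = []
    for pn in missed_pns:
        # Keep first-seen order and report each recipe only once.
        if pn and pn not in allowed and pn not in rejected:
            rejected.append(pn)
    return rejected


if __name__ == "__main__":
    missed = ["db", "rpm-native", "db", "bash"]
    print(disallowed_recipes(missed, "bash"))
```

A non-empty result corresponds to the case where the patch raises the
fatal error shown above.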

[YOCTO #6639]

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
---
 meta/classes/sstate.bbclass | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
Richard Purdie - Sept. 5, 2014, 9:42 a.m.
Hi Hongxu,

Sorry for the delay in getting to this, with the other patches for the
release, time has been difficult.

I have been able to take a look at this and the other locked sstate
patches I've had pending for some time.

Having thought quite a bit about this, I think we really want to make
this functionality part of the siggen class code. Where we need to add
hooks, we should do so with callbacks into the siggen code.

I've just sent out a patch which has my locked code in it. Could you take a
look and see if we can make that approach work with your code too?

In particular, I don't really want to have multiple whitelist type
variables. Could we use the SIGGEN_LOCKEDSIGS variable as the definitive
way to control which recipes can float and which should not?

Cheers,

Richard
Mark Hatle - Sept. 5, 2014, 2:05 p.m.
On 9/5/14, 4:42 AM, Richard Purdie wrote:
> In particular, I don't really want to have multiple whitelist type
> variables. Could we use the SIGGEN_LOCKEDSIGS variable as the definitive
> way to control which recipes can float and which should not?

I'm all for combining this.. however it appears the patches have two slightly 
separate goals.

The locked sstate patch tells the system to use a specific hash value and ONLY 
that hash value.   So if a user modified a system behavior in a way that would 
change the behavior of the component there is no warning or error.

The read-only sstate-cache, on the other hand, says to reuse the sstate-cache
items (there may of course be more than one per recipe name, each with a
different hash based on a different configuration), but to warn/error when the
system configuration changes in a way that would have changed the hash.
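
The difference between the two behaviours can be sketched in plain Python.
This is a toy model under stated assumptions, not BitBake code: the names
locked_hash, readonly_action and SstateMiss, and the dict/set data shapes,
are all hypothetical.

```python
class SstateMiss(Exception):
    """Raised when a read-only cache has no object for the computed hash."""


def locked_hash(pn, computed_hash, locked_sigs):
    # Locked behaviour: the pinned hash always wins. The computed hash is
    # ignored silently, so a configuration change produces no warning.
    return locked_sigs.get(pn, computed_hash)


def readonly_action(pn, computed_hash, cache, build_allowed):
    # Read-only behaviour: any cached object for this recipe may be reused
    # as long as its hash matches what the current configuration computes.
    if computed_hash in cache.get(pn, set()):
        return "reuse"
    # Exempt recipes may still be built from source.
    if pn in build_allowed:
        return "build"
    raise SstateMiss("%s: hash %s not in read-only sstate-cache"
                     % (pn, computed_hash))


if __name__ == "__main__":
    print(locked_hash("eglibc", "bbb", {"eglibc": "aaa"}))
    print(readonly_action("eglibc", "aaa", {"eglibc": {"aaa", "ccc"}}, set()))
```

In the locked case a changed configuration still resolves to the pinned
hash; in the read-only case the same change surfaces as an error unless the
recipe is exempt.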

So there needs to be an interface that says which items are allowed to be built 
from source, and which items require either the locked or readonly sstate behavior.

Something like:

SIGGEN_LOCKEDSIGS = "\
gcc-cross:do_populate_sysroot:ro \
eglibc:do_populate_sysroot:ro \
eglibc:do_packagedata:ro \
gcc-cross:do_packagedata:ro \
"

could work to extend it, but I have a concern with this.  It only affects 
individual recipes.  The request we've had from our customers is that once they 
have built their environment(s), and distributed the sstate-cache to their 
developers -- only the items they distributed are allowed to be used, unless an 
exception is granted.

So having to maintain a SIGGEN_LOCKEDSIGS variable with every possible package
in the system seems complicated at best (for the RO case).  They really want a
switch that sets everything, except for the exceptions, as read-only.

SIGGEN_LOCKEDSIGS = "\
*:*:ro \
bash:*:rw \
"

but I don't know if the glob/wildcarding would add more processing overhead than
is acceptable.
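
A rough sketch of how such wildcard entries could be resolved, using
Python's fnmatch module. The "recipe:task:mode" syntax is only the proposal
above, not something BitBake supports; parse_entries, resolve_mode and the
"later entries override earlier ones" rule are assumptions made for
illustration.

```python
from fnmatch import fnmatchcase


def parse_entries(value):
    """Split a whitespace-separated 'recipe:task:mode' list into tuples."""
    entries = []
    for item in value.split():
        pn_pattern, task_pattern, mode = item.split(":")
        entries.append((pn_pattern, task_pattern, mode))
    return entries


def resolve_mode(entries, pn, task, default="rw"):
    """Return the mode for (pn, task); later matching entries override."""
    mode = default
    for pn_pattern, task_pattern, entry_mode in entries:
        if fnmatchcase(pn, pn_pattern) and fnmatchcase(task, task_pattern):
            mode = entry_mode
    return mode


if __name__ == "__main__":
    entries = parse_entries("*:*:ro bash:*:rw")
    print(resolve_mode(entries, "bash", "do_compile"))
    print(resolve_mode(entries, "eglibc", "do_populate_sysroot"))
```

In this sketch each lookup is one pass over the entries, so the cost grows
with the number of patterns times the number of recipe/task pairs checked;
whether that is acceptable in a real build would need measuring.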

--Mark

Richard Purdie - Sept. 7, 2014, 9:24 a.m.
I guess my main point right now is more about the controlling interface
of this which I believe should be the "siggen" class. What I'd like to
try and do is standardise on that as being the interface which handles
this policy, whatever it may be. If it needs hooks adding, we can do
that like the one I added from sstate.

Whether we end up with one siggen class or we have different ones
(perhaps subclassing each other) and we switch between them for
different behaviours I'm less sure about right now.

I'll give the locked sstate interface a bit more thought based on the
above, thanks. I suspect it may be a case of adding a new subclass with
some different functionality (e.g. readonly cache), then if that is what
the user requires, we select that class using BB_SIGNATURE_HANDLER?
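
The subclass-and-select idea can be sketched as below. Everything here is a
stand-in: the class names, the handler name strings and the on_cache_miss
hook are hypothetical and are not the real bb.siggen API. The sketch only
shows a policy living in a subclass and being chosen by a name, in the way
a BB_SIGNATURE_HANDLER value would choose one.

```python
class BasicHashHandler:
    """Stand-in base handler: a cache miss simply falls back to building."""
    name = "basichash-sketch"

    def on_cache_miss(self, pn):
        return "build"


class ReadOnlyCacheHandler(BasicHashHandler):
    """Stand-in subclass: a cache miss is an error unless the recipe is exempt."""
    name = "readonly-sketch"

    def __init__(self, exceptions=()):
        self.exceptions = set(exceptions)

    def on_cache_miss(self, pn):
        if pn in self.exceptions:
            return "build"
        raise RuntimeError("%s did not come from the read-only sstate-cache" % pn)


def select_handler(desired, handlers=(BasicHashHandler, ReadOnlyCacheHandler)):
    """Pick the handler class whose `name` matches the configured value."""
    for cls in handlers:
        if cls.name == desired:
            return cls
    raise ValueError("no signature handler named %r" % desired)


if __name__ == "__main__":
    handler = select_handler("readonly-sketch")(exceptions=["bash"])
    print(handler.on_cache_miss("bash"))
```

With this shape the default behaviour stays untouched and the read-only
policy is only active when the corresponding handler name is selected.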

Richard
Mark Hatle - Sept. 9, 2014, 2:49 p.m.
On 9/7/14, 4:24 AM, Richard Purdie wrote:
> I'll give the locked sstate interface a bit more thought based on the
> above, thanks. I suspect it may be a case of adding a new subclass with
> some different functionality (e.g. readonly cache), then if that is what
> the user requires, we select that class using BB_SIGNATURE_HANDLER?

Works for me.  I don't mind the behavior of subclassing at all.  The key thing
is we've gotten requests for odd behaviors (which, like I've said before, I don't
think the community is or should be interested in -- but something that we have
to do for our own customers...)

Nothing like a change (pgp signatures) that cripples the speed of the
sstate-cache.  :P

--Mark


Patch

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index 4eec6bd..4a04efb 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -41,6 +41,14 @@  EXTRA_STAGING_FIXMES ?= ""
 sstate_create_package[dirs] = "${SSTATE_BUILDDIR}"
 sstate_unpack_package[dirs] = "${SSTATE_INSTDIR}"
 
+# 1) If ${SSTATECACHE_WHITELIST} is "", the read-only sstate-cache is
+#    disabled;
+#
+# 2) If the read-only sstate-cache is enabled and the recipe's ${PN} is not
+#    listed in ${SSTATECACHE_WHITELIST}, a build from source triggers an
+#    instant error;
+SSTATECACHE_WHITELIST ?= ""
+
 python () {
     if bb.data.inherits_class('native', d):
         d.setVar('SSTATE_PKGARCH', d.getVar('BUILD_ARCH'))
@@ -382,6 +390,15 @@  sstate_clean[vardepsexclude] = "SSTATE_MANFILEPREFIX"
 CLEANFUNCS += "sstate_cleanall"
 
 python sstate_cleanall() {
+    whitelist = d.getVar('SSTATECACHE_WHITELIST', True)
+    if whitelist:
+        pn = d.getVar('PN', True)
+        if pn not in whitelist.split():
+            msg =  'Read-only sstate-cache is enabled, the clean of \n'
+            msg += '%s is not allowed. Only the recipe listed in\n' % pn
+            msg += 'SSTATECACHE_WHITELIST is allowed to clean sstate-cache'
+            bb.fatal(msg)
+
     bb.note("Removing shared state for package %s" % d.getVar('PN', True))
 
     manifest_dir = d.getVar('SSTATE_MANIFESTS', True)
@@ -704,6 +721,29 @@  def sstate_checkhashes(sq_fn, sq_task, sq_hash, sq_hashfn, d):
             evdata['found'].append( (sq_fn[task], sq_task[task], sq_hash[task], sstatefile ) )
         bb.event.fire(bb.event.MetadataEvent("MissedSstate", evdata), d)
 
+    whitelist = d.getVar('SSTATECACHE_WHITELIST', True)
+    if whitelist:
+        missed_pn = []
+        for task in missed:
+            fn = sq_fn[task]
+            data = bb.cache.Cache.loadDataFull(fn, '', d)
+            pn = data.getVar('PN', True) or ""
+            if pn and pn not in missed_pn:
+                missed_pn.append(pn)
+
+        if missed_pn:
+            blacklist = [pn for pn in missed_pn if pn not in whitelist.split()]
+            if blacklist:
+                # We should manually unlock the bitbake lock, because the fatal
+                # msg will exit the build immediately.
+                lockfile = d.expand("${TOPDIR}/bitbake.lock")
+                os.unlink(lockfile)
+                msg =  'Read-only sstate-cache is enabled, the build of \n'
+                msg += '"' + ' '.join(blacklist) + '"\n'
+                msg += 'did not come from sstate-cache. Only the recipe listed in\n'
+                msg += 'SSTATECACHE_WHITELIST is allowed to build from source'
+                bb.msg.fatal('sstate', msg)
+
     return ret
 
 BB_SETSCENE_DEPVALID = "setscene_depvalid"