| Submitter | Dongxiao Xu |
|---|---|
| Date | April 9, 2012, 8:41 a.m. |
| Message ID | <ccb22c39f4aeb8e310f9a6f8e0f9c55c18ac6891.1333960654.git.dongxiao.xu@intel.com> |
| Download | mbox | patch |
| Permalink | /patch/25357/ |
| State | New |
| Headers | show |
Comments
On Mon, 2012-04-09 at 16:41 +0800, Dongxiao Xu wrote: > The order of keys are not sensitive for config hash, so we need to > identify its order while calculating the md5 value. > > While for certain values, order is also not sensitive (for example, > BBINCLUDED), we also need to identify its order while calculating md5 > value. > > This could fix the problem that Martin Jansa reported in the mailing > list: > > http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > --- > lib/bb/data_smart.py | 10 +++++++--- > 1 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py > index 2c200db..cc61a03 100644 > --- a/lib/bb/data_smart.py > +++ b/lib/bb/data_smart.py > @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): > self.delVar(var) > > def get_hash(self): > - data = "" > + data = {} > config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) > + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) > keys = set(key for key in iter(self) if not key.startswith("__")) > for key in keys: > if key in config_whitelist: > continue > value = self.getVar(key, False) or "" > - data = data + key + ': ' + str(value) + '\n' > + if key in config_sort: > + value = " ".join(sorted(value.split())) > + data.update({key:value}) > > - return hashlib.md5(data).hexdigest() > + data_str = str([(k, data[k]) for k in sorted(data.keys())]) > + return hashlib.md5(data_str).hexdigest() This and the corresponding change in bitbake.conf look rather worrying to me. The order in BBINCLUDED is significant and if it changes we should be reparsing. Looking at the code, get_file_depends(d) is probably buggy due to the use of set() which could reorder the data. We need to keep the data in order there and this issue should then be resolved. Lets fix the real bug. Cheers, Richard
On Wed, 2012-04-11 at 00:00 +0100, Richard Purdie wrote: > On Mon, 2012-04-09 at 16:41 +0800, Dongxiao Xu wrote: > > The order of keys are not sensitive for config hash, so we need to > > identify its order while calculating the md5 value. > > > > While for certain values, order is also not sensitive (for example, > > BBINCLUDED), we also need to identify its order while calculating md5 > > value. > > > > This could fix the problem that Martin Jansa reported in the mailing > > list: > > > > http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html > > > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > --- > > lib/bb/data_smart.py | 10 +++++++--- > > 1 files changed, 7 insertions(+), 3 deletions(-) > > > > diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py > > index 2c200db..cc61a03 100644 > > --- a/lib/bb/data_smart.py > > +++ b/lib/bb/data_smart.py > > @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): > > self.delVar(var) > > > > def get_hash(self): > > - data = "" > > + data = {} > > config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) > > + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) > > keys = set(key for key in iter(self) if not key.startswith("__")) > > for key in keys: > > if key in config_whitelist: > > continue > > value = self.getVar(key, False) or "" > > - data = data + key + ': ' + str(value) + '\n' > > + if key in config_sort: > > + value = " ".join(sorted(value.split())) > > + data.update({key:value}) > > > > - return hashlib.md5(data).hexdigest() > > + data_str = str([(k, data[k]) for k in sorted(data.keys())]) > > + return hashlib.md5(data_str).hexdigest() > > > This and the corresponding change in bitbake.conf look rather worrying > to me. The order in BBINCLUDED is significant and if it changes we > should be reparsing. Hi Richard, Why do you say the order in BBINCLUDED is significant? I saw the original code ignores the order when handling __depends and __base_depends. For example: def mark_dependency(d, f): if f.startswith('./'): f = "%s/%s" % (os.getcwd(), f[2:]) deps = d.getVar('__depends') or set() deps.update([(f, cached_mtime(f))]) d.setVar('__depends', deps) I think the get_file_depends(d) just follows the original logic. Or do you mean the mark_dependency(d, f) is also buggy? Thanks, Dongxiao > > Looking at the code, get_file_depends(d) is probably buggy due to the > use of set() which could reorder the data. We need to keep the data in > order there and this issue should then be resolved. > > Lets fix the real bug. > > Cheers, > > Richard > > >
On Mon, 2012-04-16 at 15:15 +0800, Xu, Dongxiao wrote: > On Wed, 2012-04-11 at 00:00 +0100, Richard Purdie wrote: > > On Mon, 2012-04-09 at 16:41 +0800, Dongxiao Xu wrote: > > > The order of keys are not sensitive for config hash, so we need to > > > identify its order while calculating the md5 value. > > > > > > While for certain values, order is also not sensitive (for example, > > > BBINCLUDED), we also need to identify its order while calculating md5 > > > value. > > > > > > This could fix the problem that Martin Jansa reported in the mailing > > > list: > > > > > > http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html > > > > > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > > --- > > > lib/bb/data_smart.py | 10 +++++++--- > > > 1 files changed, 7 insertions(+), 3 deletions(-) > > > > > > diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py > > > index 2c200db..cc61a03 100644 > > > --- a/lib/bb/data_smart.py > > > +++ b/lib/bb/data_smart.py > > > @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): > > > self.delVar(var) > > > > > > def get_hash(self): > > > - data = "" > > > + data = {} > > > config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) > > > + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) > > > keys = set(key for key in iter(self) if not key.startswith("__")) > > > for key in keys: > > > if key in config_whitelist: > > > continue > > > value = self.getVar(key, False) or "" > > > - data = data + key + ': ' + str(value) + '\n' > > > + if key in config_sort: > > > + value = " ".join(sorted(value.split())) > > > + data.update({key:value}) > > > > > > - return hashlib.md5(data).hexdigest() > > > + data_str = str([(k, data[k]) for k in sorted(data.keys())]) > > > + return hashlib.md5(data_str).hexdigest() > > > > > > This and the corresponding change in bitbake.conf look rather worrying > > to me. The order in BBINCLUDED is significant and if it changes we > > should be reparsing. > > Hi Richard, > > Why do you say the order in BBINCLUDED is significant? > > I saw the original code ignores the order when handling __depends and > __base_depends. > > For example: > > def mark_dependency(d, f): > if f.startswith('./'): > f = "%s/%s" % (os.getcwd(), f[2:]) > deps = d.getVar('__depends') or set() > deps.update([(f, cached_mtime(f))]) > d.setVar('__depends', deps) > > I think the get_file_depends(d) just follows the original logic. > > Or do you mean the mark_dependency(d, f) is also buggy? In that case I think the original code is also buggy. Cheers, Richard
On Mon, 2012-04-16 at 09:28 +0100, Richard Purdie wrote: > On Mon, 2012-04-16 at 15:15 +0800, Xu, Dongxiao wrote: > > On Wed, 2012-04-11 at 00:00 +0100, Richard Purdie wrote: > > > On Mon, 2012-04-09 at 16:41 +0800, Dongxiao Xu wrote: > > > > The order of keys are not sensitive for config hash, so we need to > > > > identify its order while calculating the md5 value. > > > > > > > > While for certain values, order is also not sensitive (for example, > > > > BBINCLUDED), we also need to identify its order while calculating md5 > > > > value. > > > > > > > > This could fix the problem that Martin Jansa reported in the mailing > > > > list: > > > > > > > > http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html > > > > > > > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > > > --- > > > > lib/bb/data_smart.py | 10 +++++++--- > > > > 1 files changed, 7 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py > > > > index 2c200db..cc61a03 100644 > > > > --- a/lib/bb/data_smart.py > > > > +++ b/lib/bb/data_smart.py > > > > @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): > > > > self.delVar(var) > > > > > > > > def get_hash(self): > > > > - data = "" > > > > + data = {} > > > > config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) > > > > + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) > > > > keys = set(key for key in iter(self) if not key.startswith("__")) > > > > for key in keys: > > > > if key in config_whitelist: > > > > continue > > > > value = self.getVar(key, False) or "" > > > > - data = data + key + ': ' + str(value) + '\n' > > > > + if key in config_sort: > > > > + value = " ".join(sorted(value.split())) > > > > + data.update({key:value}) > > > > > > > > - return hashlib.md5(data).hexdigest() > > > > + data_str = str([(k, data[k]) for k in sorted(data.keys())]) > > > > + return hashlib.md5(data_str).hexdigest() > > > > > > > > > This and the corresponding change in bitbake.conf look rather worrying > > > to me. The order in BBINCLUDED is significant and if it changes we > > > should be reparsing. > > > > Hi Richard, > > > > Why do you say the order in BBINCLUDED is significant? > > > > I saw the original code ignores the order when handling __depends and > > __base_depends. > > > > For example: > > > > def mark_dependency(d, f): > > if f.startswith('./'): > > f = "%s/%s" % (os.getcwd(), f[2:]) > > deps = d.getVar('__depends') or set() > > deps.update([(f, cached_mtime(f))]) > > d.setVar('__depends', deps) > > > > I think the get_file_depends(d) just follows the original logic. > > > > Or do you mean the mark_dependency(d, f) is also buggy? > > In that case I think the original code is also buggy. Could you explain more why __depends should be ordered? I saw the variable's definition is set() type and there is no code depends on the order of "__depends". Thanks, Dongxiao > > Cheers, > > Richard >
On Mon, 2012-04-16 at 16:41 +0800, Xu, Dongxiao wrote: > On Mon, 2012-04-16 at 09:28 +0100, Richard Purdie wrote: > > On Mon, 2012-04-16 at 15:15 +0800, Xu, Dongxiao wrote: > > > On Wed, 2012-04-11 at 00:00 +0100, Richard Purdie wrote: > > > > On Mon, 2012-04-09 at 16:41 +0800, Dongxiao Xu wrote: > > > > > The order of keys are not sensitive for config hash, so we need to > > > > > identify its order while calculating the md5 value. > > > > > > > > > > While for certain values, order is also not sensitive (for example, > > > > > BBINCLUDED), we also need to identify its order while calculating md5 > > > > > value. > > > > > > > > > > This could fix the problem that Martin Jansa reported in the mailing > > > > > list: > > > > > > > > > > http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html > > > > > > > > > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > > > > --- > > > > > lib/bb/data_smart.py | 10 +++++++--- > > > > > 1 files changed, 7 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py > > > > > index 2c200db..cc61a03 100644 > > > > > --- a/lib/bb/data_smart.py > > > > > +++ b/lib/bb/data_smart.py > > > > > @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): > > > > > self.delVar(var) > > > > > > > > > > def get_hash(self): > > > > > - data = "" > > > > > + data = {} > > > > > config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) > > > > > + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) > > > > > keys = set(key for key in iter(self) if not key.startswith("__")) > > > > > for key in keys: > > > > > if key in config_whitelist: > > > > > continue > > > > > value = self.getVar(key, False) or "" > > > > > - data = data + key + ': ' + str(value) + '\n' > > > > > + if key in config_sort: > > > > > + value = " ".join(sorted(value.split())) > > > > > + data.update({key:value}) > > > > > > > > > > - return hashlib.md5(data).hexdigest() > > > > > + data_str = str([(k, data[k]) for k in sorted(data.keys())]) > > > > > + return hashlib.md5(data_str).hexdigest() > > > > > > > > > > > > This and the corresponding change in bitbake.conf look rather worrying > > > > to me. The order in BBINCLUDED is significant and if it changes we > > > > should be reparsing. > > > > > > Hi Richard, > > > > > > Why do you say the order in BBINCLUDED is significant? > > > > > > I saw the original code ignores the order when handling __depends and > > > __base_depends. > > > > > > For example: > > > > > > def mark_dependency(d, f): > > > if f.startswith('./'): > > > f = "%s/%s" % (os.getcwd(), f[2:]) > > > deps = d.getVar('__depends') or set() > > > deps.update([(f, cached_mtime(f))]) > > > d.setVar('__depends', deps) > > > > > > I think the get_file_depends(d) just follows the original logic. > > > > > > Or do you mean the mark_dependency(d, f) is also buggy? > > > > In that case I think the original code is also buggy. > > Could you explain more why __depends should be ordered? I saw the > variable's definition is set() type and there is no code depends on the > order of "__depends". Well, your new code does! Bitbake does care deeply about the order things are included in so I believe this variable should reflect that. Cheers, Richard
Patch
diff --git a/lib/bb/data_smart.py b/lib/bb/data_smart.py index 2c200db..cc61a03 100644 --- a/lib/bb/data_smart.py +++ b/lib/bb/data_smart.py @@ -462,13 +462,17 @@ class DataSmart(MutableMapping): self.delVar(var) def get_hash(self): - data = "" + data = {} config_whitelist = set((self.getVar("BB_HASHCONFIG_WHITELIST", True) or "").split()) + config_sort = set((self.getVar("BB_HASHCONFIG_SORT", True) or "").split()) keys = set(key for key in iter(self) if not key.startswith("__")) for key in keys: if key in config_whitelist: continue value = self.getVar(key, False) or "" - data = data + key + ': ' + str(value) + '\n' + if key in config_sort: + value = " ".join(sorted(value.split())) + data.update({key:value}) - return hashlib.md5(data).hexdigest() + data_str = str([(k, data[k]) for k in sorted(data.keys())]) + return hashlib.md5(data_str).hexdigest()
The order of keys are not sensitive for config hash, so we need to identify its order while calculating the md5 value. While for certain values, order is also not sensitive (for example, BBINCLUDED), we also need to identify its order while calculating md5 value. This could fix the problem that Martin Jansa reported in the mailing list: http://lists.linuxtogo.org/pipermail/bitbake-devel/2012-March/002122.html Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> --- lib/bb/data_smart.py | 10 +++++++--- 1 files changed, 7 insertions(+), 3 deletions(-)