Patchwork [1/1] sstate.bbclass: fix parallel building issue

Submitter rongqing.li@windriver.com
Date Aug. 13, 2013, 8:20 a.m.
Message ID <02e6f25c210b0628dc4ee4482474b0e6ce5606e4.1376379182.git.rongqing.li@windriver.com>
Permalink /patch/55523/
State Accepted
Commit 0acde33c75f90e06516c0b9ce4291921aa9d4e58

Comments

rongqing.li@windriver.com - Aug. 13, 2013, 8:20 a.m.
From: "Roy.Li" <rongqing.li@windriver.com>

sstate_package creates hard links from the sysroot to SSTATE_BUILDDIR, and
sstate_create_package then stores SSTATE_BUILDDIR in an archive file using
tar. If other packages install the same file into the sysroot in the
meantime, creating the archive file fails with the error below:

    DEBUG: Executing shell function sstate_create_package
    tar: x86_64-linux/usr/share/aclocal/xorg-macros.m4: file changed as we read it

This kind of error is harmless; use --ignore-failed-read to ignore it.

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
---
 meta/classes/sstate.bbclass |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Saul Wold - Aug. 13, 2013, 7:02 p.m.
On 08/13/2013 01:20 AM, rongqing.li@windriver.com wrote:
> From: "Roy.Li" <rongqing.li@windriver.com>
>
> sstate_package creates hard links from the sysroot to SSTATE_BUILDDIR, and
> sstate_create_package then stores SSTATE_BUILDDIR in an archive file using
> tar. If other packages install the same file into the sysroot in the
> meantime, creating the archive file fails with the error below:
>
>      DEBUG: Executing shell function sstate_create_package
>      tar: x86_64-linux/usr/share/aclocal/xorg-macros.m4: file changed as we read it
>
> This kind of error is harmless; use --ignore-failed-read to ignore it.
>
I am not sure it's so harmless. What if the file is corrupted? Then we
have a bad sstate tarball.  You have identified part of the root
cause as being the hardlink, but what if the file actually does change
(which would potentially be a different bug)? Then you're packaging a
different set of macros (in this case) with the sysroot.
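
One possible way to keep real failures visible, sketched here as an assumption and not as what the patch does: GNU tar documents exit status 1 as "some files differ" (which covers "file changed as we read it" during --create) and exit status 2 as a fatal error. A wrapper could therefore tolerate only status 1. The helper name `tar_tolerate_changed` and the scratch paths are hypothetical, and the sketch cannot tell a link-count change apart from a genuine content change.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of sstate.bbclass): create an archive and
# treat GNU tar exit status 1 ("some files differ") as a warning, while
# still failing on any other non-zero status such as 2 (fatal error).
tar_tolerate_changed() {
    archive=$1
    shift
    rc=0
    tar -czf "$archive" "$@" || rc=$?
    if [ "$rc" -eq 1 ]; then
        echo "warning: tar reported changed files while archiving" >&2
        return 0
    fi
    return "$rc"
}

# Usage on a scratch directory (paths are made up for illustration).
workdir=$(mktemp -d)
mkdir "$workdir/src"
echo "macro" > "$workdir/src/xorg-macros.m4"
status=0
tar_tolerate_changed "$workdir/out.tar.gz" -C "$workdir/src" . || status=$?
listing=$(tar -tzf "$workdir/out.tar.gz")
echo "status: $status"
```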


Sau!

> Signed-off-by: Roy.Li <rongqing.li@windriver.com>
> ---
>   meta/classes/sstate.bbclass |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> index c1ca54b..3e2fedd 100644
> --- a/meta/classes/sstate.bbclass
> +++ b/meta/classes/sstate.bbclass
> @@ -565,7 +565,7 @@ sstate_create_package () {
>   	TFILE=`mktemp ${SSTATE_PKG}.XXXXXXXX`
>   	# Need to handle empty directories
>   	if [ "$(ls -A)" ]; then
> -		tar -czf $TFILE *
> +		tar --ignore-failed-read -czf $TFILE *
>   	else
>   		tar -cz --file=$TFILE --files-from=/dev/null
>   	fi
>
rongqing.li@windriver.com - Aug. 14, 2013, 5:28 a.m.
On 08/14/2013 03:02 AM, Saul Wold wrote:
> I am not sure it's so harmless. What if the file is corrupted? Then we
> have a bad sstate tarball.  You have identified part of the root
> cause as being the hardlink, but what if the file actually does change
> (which would potentially be a different bug)? Then you're packaging a
> different set of macros (in this case) with the sysroot.
>
>
> Sau!


The file is not corrupted and its content has not changed. "tar" reports
xorg-macros.m4 as changed because the link count of xorg-macros.m4
changes when other packages run their configure step and call
autotools_copy_aclocal, which hard-links the file into ${ACLOCALDIR}.
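
A small sketch of the effect described here, assuming GNU coreutils `stat` and a filesystem that supports hard links: adding a link leaves the content untouched but changes the inode's link count and change time (ctime), which is presumably what tar notices. The file names are made up for illustration.

```shell
#!/bin/sh
# Show that adding a hard link changes inode metadata (link count, ctime)
# without touching the file content. Assumes GNU coreutils `stat`.
dir=$(mktemp -d)
echo "content" > "$dir/xorg-macros.m4"

links_before=$(stat -c '%h' "$dir/xorg-macros.m4")
ctime_before=$(stat -c '%Z' "$dir/xorg-macros.m4")
sum_before=$(cksum < "$dir/xorg-macros.m4")

sleep 1   # make the ctime difference visible at one-second resolution
ln "$dir/xorg-macros.m4" "$dir/copy.m4"

links_after=$(stat -c '%h' "$dir/xorg-macros.m4")
ctime_after=$(stat -c '%Z' "$dir/xorg-macros.m4")
sum_after=$(cksum < "$dir/xorg-macros.m4")

echo "links: $links_before -> $links_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -rf "$dir"
```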

If this fix can be accepted, I will rework the commit header.

-Roy


Martin Jansa - Aug. 14, 2013, 6:56 a.m.
On Wed, Aug 14, 2013 at 01:28:53PM +0800, Rongqing Li wrote:
> The file is not corrupted and its content has not changed. "tar" reports
> xorg-macros.m4 as changed because the link count of xorg-macros.m4
> changes when other packages run their configure step and call
> autotools_copy_aclocal, which hard-links the file into ${ACLOCALDIR}.
> 
> If this fix can be accepted, I will rework the commit header.

I think there is still some other issue.

I haven't seen this on ext4 filesystems, but with reiserfs I was able to
reproduce "cp: will not create hard link" issue, e.g.:

do_populate_lic_setscene task failing in sstate_install with 
cp: will not create hard link `/OE/deploy/licenses/recipe' to directory `/OE/deploy/licenses/recipe' (same path)

or
ERROR: Error executing a python function in pn.bb:
CalledProcessError: Command 'cp -afl /OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/* /OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi' returned non-zero exit status 1 with output 
cp: warning: source file `/OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/pn' specified more than once

cp: will not create hard link `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/runtime' to directory `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/runtime'
cp: will not create hard link `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/runtime-reverse' to directory `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/runtime-reverse'
cp: will not create hard link `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/shlibs' to directory `/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi/shlibs'

Number of hardlinks is:
$ find pn/1.0/pkgdata -printf "%f/%n/%i\n"
pkgdata/5/190867045
runtime-reverse/2/190867046
pn-dbg/1/190867047
pn-dev/1/190867048
pn-doc/1/190867049
pn/1/190867067
pn-staticdev/1/190867051
pn-locale/1/190867078
runtime/2/190867053
pn-dbg.packaged/1/190867054
pn-dev.packaged/1/190867056
pn-dbg/1/190867057
pn-dev/1/190867058
pn-doc/1/190867059
pn/1/190867060
pn-staticdev/1/190867062
pn.packaged/1/190867063
pn-locale/1/190867064
pn/1/190867065
shlibs/2/190867069

find ~ -xdev -samefile pn/1.0/pkgdata 2>/dev/null
pn/1.0/pkgdata

I'm not sure where the other pkgdata hardlinks came from.
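
For reference, a minimal sketch of how such a listing can be produced on a scratch tree, assuming GNU find. On many Unix filesystems a directory's link count is 2 plus the number of its subdirectories, so a count of 5 for pkgdata with three subdirectories would not by itself indicate extra hard links; that reading is an assumption, not something confirmed in this thread. The paths below are hypothetical.

```shell
#!/bin/sh
# Inspect link counts and inode numbers in a scratch tree (GNU find assumed).
dir=$(mktemp -d)
mkdir -p "$dir/pkgdata/runtime"
echo "data" > "$dir/pkgdata/pn"
ln "$dir/pkgdata/pn" "$dir/pkgdata/runtime/pn"   # second name, same inode

# name/link-count/inode for every entry, in the same format as the listing
find "$dir/pkgdata" -printf '%f/%n/%i\n'

# count every path that shares an inode with pkgdata/pn
same=$(find "$dir" -xdev -samefile "$dir/pkgdata/pn" | wc -l | tr -d ' ')
echo "paths sharing the inode: $same"
rm -rf "$dir"
```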

The problem is that I can reproduce it only on 1-2 random recipes out of the few hundred
included in a bigger image, and not even in every build. After the error is shown
everything looks sane; the only way to reproduce the same error manually is to actually
specify the source dirs twice:

$ cp -afl \
/OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/* \
/OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/* \
/OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi

shows exactly the same 1 warning and 3 errors.
rongqing.li@windriver.com - Aug. 14, 2013, 9:27 a.m.
Your problem seems to be a filesystem issue.

Could you add more debug output, such as an strace log?

-Roy
Richard Purdie - Aug. 14, 2013, 10:46 a.m.
On Wed, 2013-08-14 at 08:56 +0200, Martin Jansa wrote:
> I think there is still some other issue.
> 
> I haven't seen this on ext4 filesystems, but with reiserfs I was able to
> reproduce "cp: will not create hard link" issue, e.g.:
> 
> do_populate_lic_setscene task failing in sstate_install with 
> cp: will not create hard link `/OE/deploy/licenses/recipe' to directory `/OE/deploy/licenses/recipe' (same path)
> 
> or
> ERROR: Error executing a python function in pn.bb:
> CalledProcessError: Command 'cp -afl /OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/* /OE/pkgdata/armv7a-vfp-neon-oe-linux-gnueabi' returned non-zero exit status 1 with output 
> cp: warning: source file `/OE/work/armv7a-vfp-neon-oe-linux-gnueabi/pn/1.0/pkgdata/pn' specified more than once

This sounds like a race issue in reiserfs to me...

Cheers,

Richard
Martin Jansa - Aug. 14, 2013, 10:59 a.m.
On Wed, Aug 14, 2013 at 11:46:57AM +0100, Richard Purdie wrote:
> This sounds like a race issue in reiserfs to me...

True, I assume the same until someone else is able to reproduce it on
some other filesystem.

Any idea how to confirm this theory at least to add warning in
documentation that using reiserfs on build partition is causing random
build failures?
rongqing.li@windriver.com - Aug. 15, 2013, 9:51 a.m.
On 08/14/2013 06:59 PM, Martin Jansa wrote:
> True, I assume the same until someone else is able to reproduce it on
> some other filesystem.
>
> Any idea how to confirm this theory at least to add warning in
> documentation that using reiserfs on build partition is causing random
> build failures?
>



OK, but your issue is not related to mine.

I can reproduce my issue with two simple scripts.

1. Repeatedly make a hard link to a file
(0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch); the source file itself is
never modified.

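#! /bin/bash
# Repeatedly create and remove a hard link to the patch file.
# Only the inode metadata (link count, ctime) changes; the content does not.
# Note: cp -l needs the mktemp location to be on the same filesystem.

n=0

while [ "$n" -le 100000 ] ; do
	n=$((n + 1))
	aa=$(mktemp)
	rm "$aa"
	cp -lf 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch "$aa"
	rm "$aa"
done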



2. tar this file in a loop (run at the same time as the first script)

#! /bin/bash
# Archive the same file in a loop while the first script is running.

n=0

while [ "$n" -le 100 ] ; do
	n=$((n + 1))
	tar -czvf aa.tar.gz 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
	rm aa.tar.gz
done

3. The output of tar is below:
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
tar: 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch: file changed as we read it
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
tar: 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch: file changed as we read it
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
tar: 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch: file changed as we read it
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
tar: 0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch: file changed as we read it
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
0001-qemu-set-COMPATIBLE_HOST-for-mips64.patch
Phil Blundell - Aug. 15, 2013, 9:55 a.m.
On Thu, 2013-08-15 at 17:51 +0800, Rongqing Li wrote:
> OK, but your issue is not related to mine.
>
> I can reproduce my issue with two simple scripts.

If tar is deciding that the file has "changed" just because the link
count on the dentry has increased, that sounds like it is probably a bug
in tar and ought to be fixed there.

That said, I can't immediately think why autotools_copy_aclocal couldn't
use a symlink rather than a hard link which would avoid this whole
problem.  If the file is in the sysroot then there should be no risk of
it going away underneath its user.

p.
rongqing.li@windriver.com - Aug. 15, 2013, 10:08 a.m.
On 08/15/2013 05:55 PM, Phil Blundell wrote:
> If tar is deciding that the file has "changed" just because the link
> count on the dentry has increased, that sounds like it is probably a bug
> in tar and ought to be fixed there.
>
> That said, I can't immediately think why autotools_copy_aclocal couldn't
> use a symlink rather than a hard link which would avoid this whole
> problem.  If the file is in the sysroot then there should be no risk of
> it going away underneath its user.
>

Good idea: use a symlink in autotools_copy_aclocal, and pass the -h
parameter to tar when it creates the archive file.

   -h, --dereference
            follow symlinks; archive and dump the files they point to
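
A short sketch of what -h changes, assuming GNU tar; the directory names `sysroot` and `aclocal` are stand-ins chosen for illustration, not the real build paths. Without -h the archive records the symlink itself, with -h it records the file the link points to.

```shell
#!/bin/sh
# Archive a symlink with and without -h (GNU tar assumed).
dir=$(mktemp -d)
mkdir "$dir/sysroot" "$dir/aclocal"
echo "macro" > "$dir/sysroot/xorg-macros.m4"
ln -s "$dir/sysroot/xorg-macros.m4" "$dir/aclocal/xorg-macros.m4"

# Without -h the archive stores the link itself ...
tar -cf "$dir/links.tar" -C "$dir/aclocal" xorg-macros.m4
# ... with -h it stores the file the link points to.
tar -chf "$dir/files.tar" -C "$dir/aclocal" xorg-macros.m4

# The first character of `tar -tv` output is the entry type:
# "l" for a symlink, "-" for a regular file.
type_plain=$(tar -tvf "$dir/links.tar" | cut -c1)
type_deref=$(tar -tvf "$dir/files.tar" | cut -c1)
echo "without -h: $type_plain, with -h: $type_deref"
rm -rf "$dir"
```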




-Roy



Richard Purdie - Aug. 15, 2013, 4:23 p.m.
On Thu, 2013-08-15 at 10:55 +0100, Phil Blundell wrote:
> If tar is deciding that the file has "changed" just because the link
> count on the dentry has increased, that sounds like it is probably a bug
> in tar and ought to be fixed there.
> 
> That said, I can't immediately think why autotools_copy_aclocal couldn't
> use a symlink rather than a hard link which would avoid this whole
> problem.  If the file is in the sysroot then there should be no risk of
> it going away underneath its user.

Sadly this doesn't work. We block copy a set of .m4 files from the
sysroot. We can be running do_configure of package A whilst package B is
de-installed from the sysroot and this leads to files disappearing
whilst they're being accessed. It's turned out to be a really awkward
problem to fix.
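
The difference can be sketched on a scratch directory: once the original name is removed, a hard link still reaches the data while a symlink dangles. The layout below (`sysroot`, `work`, `pkg.m4`) is hypothetical and only stands in for package B's files being removed from the sysroot.

```shell
#!/bin/sh
# Compare a hard link and a symlink after the original name is removed.
dir=$(mktemp -d)
mkdir "$dir/sysroot" "$dir/work"
echo "macro" > "$dir/sysroot/pkg.m4"
ln    "$dir/sysroot/pkg.m4" "$dir/work/hard.m4"
ln -s "$dir/sysroot/pkg.m4" "$dir/work/soft.m4"

rm "$dir/sysroot/pkg.m4"   # simulate the file being removed from the sysroot

hard_ok=no
soft_ok=no
if cat "$dir/work/hard.m4" >/dev/null 2>&1; then hard_ok=yes; fi
if [ -e "$dir/work/soft.m4" ]; then soft_ok=yes; fi
echo "hard link readable: $hard_ok, symlink resolves: $soft_ok"
rm -rf "$dir"
```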

Cheers,

Richard
Mark Hatle - Aug. 15, 2013, 4:27 p.m.
On 8/15/13 11:23 AM, Richard Purdie wrote:
> Sadly this doesn't work. We block copy a set of .m4 files from the
> sysroot. We can be running do_configure of package A whilst package B is
> de-installed from the sysroot and this leads to files disappearing
> whilst they're being accessed. It's turned out to be a really awkward
> problem to fix.

Do we need some kind of read/write lock for accessing those files?  (Is this
even something that we can do easily through the existing mechanisms?)

--Mark

Phil Blundell - Aug. 15, 2013, 4:38 p.m.
On Thu, 2013-08-15 at 17:23 +0100, Richard Purdie wrote:
> Sadly this doesn't work. We block copy a set of .m4 files from the
> sysroot. We can be running do_configure of package A whilst package B is
> de-installed from the sysroot and this leads to files disappearing
> whilst they're being accessed. It's turned out to be a really awkward
> problem to fix.

Oh, I see, this is the aclocal "scan all .m4 files" thing.  I suppose
the ideal arrangement, following the earlier discussion today about
accidental library linkage, would be to provide a way to only copy
the .m4 files that were installed by recipes in DEPENDS (recursively of
course).  This would have the pleasant side effect of reducing the
number of files that aclocal needs to scan which might make it a bit
faster as well.

But, looking at aclocal itself, it doesn't seem as though it would be
very hard to patch it to cope a bit more gracefully with files which
disappear (or turn out to be unreadable) underneath it.  I wonder if
that would be a better fix and we could then just remove all this
copying altogether.

p.
Richard Purdie - Aug. 15, 2013, 11:04 p.m.
On Thu, 2013-08-15 at 11:27 -0500, Mark Hatle wrote:
> Do we need some kind of read/write lock for accessing those files?  (Is this
> even something that we can do easily through the existing mechanisms?)

It would kill performance for no good reason, been there, looked at
it...

Cheers,

Richard
rongqing.li@windriver.com - Aug. 16, 2013, 8:25 a.m.
On 08/16/2013 07:04 AM, Richard Purdie wrote:
> It would kill performance for no good reason, been there, looked at
> it...
>
> Cheers,
>
> Richard
>

I think reverting the optimization below may be better than using a lock:

commit 8c5544c2311b080bb212efb7f6b804db63e125f5
Author: Richard Purdie <richard.purdie@linuxfoundation.org>
Date:   Thu Oct 11 13:36:53 2012 +0100

     scripts/cp-noerror: Try and use hardlinks if possible

     Since we generally have lots of copies of the directories created using this tool, use
     hardlinks where possible. This should save a little disk space and improve performance
     slightly.

     (From OE-Core rev: bfa11c028c2da093f7b4e6b7b1d611da90ae052f)

     Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>


-Roy



rongqing.li@windriver.com - Aug. 16, 2013, 9:05 a.m.
On 08/16/2013 04:25 PM, Rongqing Li wrote:
>>> Do we need some kind of a read/write lock on accessing those files.
>>> (Is this
>>> even something that we can do easily though the existing mechanisms?)
>>
>> It would kill performance for no good reason, been there, looked at
>> it...
>>
>> Cheers,
>>
>> Richard
>>
>
> I think reverting the optimization below may be better than using a lock:
>
> commit 8c5544c2311b080bb212efb7f6b804db63e125f5
> Author: Richard Purdie <richard.purdie@linuxfoundation.org>
> Date:   Thu Oct 11 13:36:53 2012 +0100
>
>      scripts/cp-noerror: Try and use hardlinks if possible
>
>      Since we generally have lots of copies of the directories created using this tool, use
>      hardlinks where possible. This should save a little disk space and improve performance
>      slightly.
>
>      (From OE-Core rev: bfa11c028c2da093f7b4e6b7b1d611da90ae052f)
>
>      Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
>
>
> -Roy
>
>

I think the above commit saves a lot of space, but the time saved can
probably be ignored.

On my usual image build:

1. aclocal size and numbers of files
         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ du -sh aclocal
         768K    aclocal

         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ ls aclocal|wc
              54      54     621
2. do hardlink copy 1000 times
         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ cat ./aa
         #! /bin/bash

         n=0

         while [ $n -lt 1000 ] ; do
             n=`expr "$n" + 1`
             cp -alf aclocal ./tmp/
             rm -rf ./tmp/aclocal
         done
         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ time ./aa

         real    0m4.416s
         user    0m0.084s
         sys     0m0.256s

3. do a full copy 1000 times
         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ cat ./aa
         #! /bin/bash

         n=0

         while [ $n -lt 1000 ] ; do
             n=`expr "$n" + 1`
             cp -rf aclocal ./tmp/
             rm -rf ./tmp/aclocal
         done

         bitbake_build/tmp/sysroots/x86_64-linux/usr/share$ time ./aa

         real    0m8.707s
         user    0m0.104s
         sys     0m0.324s


Since we need several hours to build an image, an improvement of a few
seconds hardly matters.
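
The two timing scripts above can be folded into one self-contained
comparison. This is only a sketch: it builds stand-in data in a temporary
directory instead of using the real sysroot, the file names and sizes are
made up, and it assumes GNU coreutils (cp -l, stat -c). Absolute timings
will differ from the numbers above.

```shell
#!/bin/bash
# Compare hard-link copies (cp -al) with full copies (cp -r) of a small tree.
set -eu

work=$(mktemp -d)
mkdir -p "$work/aclocal" "$work/tmp"

# Stand-in data: 54 files of about 14 KB each (hypothetical contents).
for i in $(seq 1 54); do
    head -c 14000 /dev/zero > "$work/aclocal/file$i.m4"
done

# run_copies CP_OPTIONS ITERATIONS
run_copies() {
    local n=0
    while [ "$n" -lt "$2" ]; do
        cp "$1" "$work/aclocal" "$work/tmp/"
        rm -rf "$work/tmp/aclocal"
        n=$((n + 1))
    done
}

iterations=100
echo "hard-link copy x $iterations:"
time run_copies -alf "$iterations"
echo "full copy x $iterations:"
time run_copies -rf "$iterations"

# Link counts show what each mode actually did.
cp -alf "$work/aclocal" "$work/tmp/"
links_hard=$(stat -c %h "$work/tmp/aclocal/file1.m4")
rm -rf "$work/tmp/aclocal"
cp -rf "$work/aclocal" "$work/tmp/"
links_copy=$(stat -c %h "$work/tmp/aclocal/file1.m4")
echo "link count after cp -al: $links_hard, after cp -r: $links_copy"

rm -rf "$work"
```

The link counts at the end confirm that cp -al shares inodes with the
source tree while cp -r creates independent files.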
Richard Purdie - Aug. 16, 2013, 9:27 a.m.
On Fri, 2013-08-16 at 17:05 +0800, Rongqing Li wrote:
> 
> On 08/16/2013 04:25 PM, Rongqing Li wrote:
> >>> Do we need some kind of read/write lock on accessing those files?
> >>> (Is this
> >>> even something that we can do easily through the existing mechanisms?)
> >>
> >> It would kill performance for no good reason, been there, looked at
> >> it...
> >>
> >> Cheers,
> >>
> >> Richard
> >>
> >
> > I think reverting the optimization below may be better than using a lock:
> >
> > commit 8c5544c2311b080bb212efb7f6b804db63e125f5
> > Author: Richard Purdie <richard.purdie@linuxfoundation.org>
> > Date:   Thu Oct 11 13:36:53 2012 +0100
> >
> >      scripts/cp-noerror: Try and use hardlinks if possible
> >
> >      Since we generally have lots of copies of the directories created using this tool, use
> >      hardlinks where possible. This should save a little disk space and improve performance
> >      slightly.
> >
> >      (From OE-Core rev: bfa11c028c2da093f7b4e6b7b1d611da90ae052f)
> >
> >      Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> >
> >
> > -Roy
> >
> >
> 
> I think the above commit saves a lot of space, but the time saved can
> probably be ignored.

It's more that we have fewer files bouncing around the kernel, so the disk
IO queues can be used for more useful work. You wouldn't hit blocked IO
in your tests above. The commit message says space savings were the primary
benefit; the speed/IO gain is just a nice bonus and can't hurt.

I think Phil is right, we should look at fixing aclocal so file
disappeared errors are just handled gracefully, then this whole mess can
go away and we'll get even better performance.
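
As a rough illustration of what "handled gracefully" could mean, the
sketch below copies a previously listed set of files and skips any that
have disappeared in the meantime, while still failing on other errors. It
is not the actual aclocal or scripts/cp-noerror code; the function name
copy_tolerant and the demo paths are hypothetical.

```shell
#!/bin/bash
# copy_tolerant DEST FILE...
# Copy each FILE into DEST, tolerating files that vanished after being listed.
copy_tolerant() {
    local dest=$1
    shift
    local f
    mkdir -p "$dest"
    for f in "$@"; do
        if ! cp -f "$f" "$dest/" 2>/dev/null; then
            # Only a source that has disappeared is tolerated.
            if [ -e "$f" ]; then
                echo "error: could not copy $f" >&2
                return 1
            fi
            echo "note: $f disappeared, skipping" >&2
        fi
    done
}

# Demo: take the listing first, then remove a file before copying,
# as if another task had de-installed it from the sysroot.
demo=$(mktemp -d)
mkdir "$demo/src"
touch "$demo/src/a.m4" "$demo/src/b.m4"
files=("$demo"/src/*.m4)
rm "$demo/src/b.m4"

copy_tolerant "$demo/dest" "${files[@]}"
status=$?
copied=$(ls "$demo/dest")
echo "status=$status copied=$copied"
rm -rf "$demo"
```

There is still a small window between the failed cp and the existence
check, so this only narrows the race; it does not remove it.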

I like the idea of iterating DEPENDS and figuring out which ones to add
but that is quite a bit more work. Ultimately it would be worthwhile
though and the same logic could then be used for a sysroot per workdir
type logic.

Cheers,

Richard
Richard Purdie - Sept. 12, 2013, 3:39 p.m.
On Tue, 2013-08-13 at 16:20 +0800, rongqing.li@windriver.com wrote:
> From: "Roy.Li" <rongqing.li@windriver.com>
> 
> sstate_package creates hardlink from sysroot to SSTATE_BUILDDIR, then
> sstate_create_package will store SSTATE_BUILDDIR into a archive file by
> tar, but once other packages install the same file into sysroot, the
> creating the archive file will fail with below error:
> 
>     DEBUG: Executing shell function sstate_create_package
>     tar: x86_64-linux/usr/share/aclocal/xorg-macros.m4: file changed as we read it
> 
> This kind of error is harmless, use --ignore-failed-read to ignore it.
> 
> Signed-off-by: Roy.Li <rongqing.li@windriver.com>


I've dug into this issue a bit and, having looked at the code in tar for
this warning, I believe this is the right fix.

Throughout the system we now hardlink files together, and the timestamps
of the files can change when the number of hardlinks changes, so it's
possible to race against various parts of the system. The aclocal-copy
is the most frequently used, so we hit it there most easily.
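
The timestamp effect can be reproduced in isolation. The sketch below
assumes the timestamp in question is the inode change time (ctime) and
that GNU stat is available; the file names are only examples. Adding a
hard link updates the link count and ctime, while the contents and mtime
stay the same.

```shell
#!/bin/bash
# Show that adding a hard link changes ctime but not mtime.
demo=$(mktemp -d)
echo "macro" > "$demo/xorg-macros.m4"

links_before=$(stat -c %h "$demo/xorg-macros.m4")
ctime_before=$(stat -c %z "$demo/xorg-macros.m4")
mtime_before=$(stat -c %y "$demo/xorg-macros.m4")

# Wait so the change is visible even with coarse timestamp granularity.
sleep 1
ln "$demo/xorg-macros.m4" "$demo/hardlink.m4"

links_after=$(stat -c %h "$demo/xorg-macros.m4")
ctime_after=$(stat -c %z "$demo/xorg-macros.m4")
mtime_after=$(stat -c %y "$demo/xorg-macros.m4")

echo "links: $links_before -> $links_after"
echo "ctime: $ctime_before -> $ctime_after"
echo "mtime: $mtime_before -> $mtime_after"
rm -rf "$demo"
```

If another task creates such a link while tar is reading the file, the
file's data is unchanged even though its metadata is not.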

So I'm now in favour of taking this patch. We could improve aclocal but
that is really a separate issue.

Cheers,

Richard

> ---
>  meta/classes/sstate.bbclass |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> index c1ca54b..3e2fedd 100644
> --- a/meta/classes/sstate.bbclass
> +++ b/meta/classes/sstate.bbclass
> @@ -565,7 +565,7 @@ sstate_create_package () {
>  	TFILE=`mktemp ${SSTATE_PKG}.XXXXXXXX`
>  	# Need to handle empty directories
>  	if [ "$(ls -A)" ]; then
> -		tar -czf $TFILE *
> +		tar --ignore-failed-read -czf $TFILE *
>  	else
>  		tar -cz --file=$TFILE --files-from=/dev/null
>  	fi

Patch

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index c1ca54b..3e2fedd 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -565,7 +565,7 @@  sstate_create_package () {
 	TFILE=`mktemp ${SSTATE_PKG}.XXXXXXXX`
 	# Need to handle empty directories
 	if [ "$(ls -A)" ]; then
-		tar -czf $TFILE *
+		tar --ignore-failed-read -czf $TFILE *
 	else
 		tar -cz --file=$TFILE --files-from=/dev/null
 	fi
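
For trying the patched logic outside of bitbake, the two branches of the
hunk above can be sketched as a standalone function. This is not the real
sstate_create_package: the function name create_package, its arguments and
the demo directories are hypothetical, and the mktemp/SSTATE_PKG handling
is left out.

```shell
#!/bin/bash
# create_package DIR OUT: archive the contents of DIR into OUT (absolute path).
create_package() {
    local dir=$1 out=$2
    (
        cd "$dir" || exit 1
        # Need to handle empty directories, as in the original function.
        if [ "$(ls -A)" ]; then
            tar --ignore-failed-read -czf "$out" *
        else
            tar -cz --file="$out" --files-from=/dev/null
        fi
    )
}

demo=$(mktemp -d)
mkdir "$demo/full" "$demo/empty"
echo "macro" > "$demo/full/xorg-macros.m4"

create_package "$demo/full" "$demo/full.tgz"
create_package "$demo/empty" "$demo/empty.tgz"

# Count archive members; grep -c '' counts lines and prints 0 for no input.
full_entries=$(tar -tzf "$demo/full.tgz" | grep -c '')
empty_entries=$(tar -tzf "$demo/empty.tgz" | grep -c '')
echo "entries: full=$full_entries empty=$empty_entries"
rm -rf "$demo"
```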