
[v3,2/2] shadow: link executables statically for -native variant

Message ID 20240111131521.2305172-2-alex@linutronix.de
State Accepted, archived
Commit 495ff95eae14a91c94187f78a0b30c7957c9b168
Series [v3,1/2] shadow: update 4.13 -> 4.14.2

Commit Message

Alexander Kanavin Jan. 11, 2024, 1:15 p.m. UTC
shadow 4.14.x adds a number of libraries it dynamically links with
(md, bsd, attr). This causes troubles in setscene tasks where
shadow executables are used (such as useradd), as pulling in
the needed dynamic libraries needs unpleasant special-casing.

Signed-off-by: Alexander Kanavin <alex@linutronix.de>

---
v2: patch only Makefiles that produce executables and libshadow.a
(that executables all statically link with), do not patch libsubid/Makefile,
as patching in .a linking can clash with producing dynamic libraries.
libsubid is used only in getsubids executable, which is not used in
setscene user management (or anywhere else from what I can see).

v3: add -no-pie to linker flags, as otherwise some host distros
would refuse to link against libattr produced on other host distros
and supplied via sstate (libattr made with gcc 13 and used on gcc 11/12
hosts seems to be problematic)

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
---
 meta/conf/distro/include/no-static-libs.inc |  5 +++++
 meta/recipes-extended/shadow/shadow.inc     | 10 ++++++++++
 2 files changed, 15 insertions(+)

Comments

Dmitry Baryshkov Jan. 17, 2024, 12:46 p.m. UTC | #1
On Thu, 11 Jan 2024 at 15:15, Alexander Kanavin <alex.kanavin@gmail.com> wrote:
>
> shadow 4.14.x adds a number of libraries it dynamically links with
> (md, bsd, attr). This causes troubles in setscene tasks where
> shadow executables are used (such as useradd), as pulling in
> the needed dynamic libraries needs unpleasant special-casing.
>
> Signed-off-by: Alexander Kanavin <alex@linutronix.de>

It seems this is causing issues with the TuxOE builds. We have been
observing image creation choking on the files in the home dirs in the
TuxOE build environment. Reverting this patch seems to fix
the problem. The build environment is Ubuntu 20.04 running in a
container on Ubuntu 22.04.

ERROR: rpb-weston-image-1.0-r0 do_image_tar:
ExecutionError('/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075',
1, None, None)
ERROR: Logfile of failure stored in:
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/log.do_image_tar.160075
Log data follows:
| DEBUG: Executing python function set_image_size
| DEBUG: 1402908.000000 = 1079160 * 1.300000
| DEBUG: 1402908.000000 = max(1402908.000000, 65536)[1402908.000000] + 0
| DEBUG: 1402908.000000 = int(1402908.000000)
| DEBUG: 1404928 = aligned(1402908)
| DEBUG: returning 1404928
| DEBUG: Python function set_image_size finished
| DEBUG: Executing shell function do_image_tar
| tar: ./home/linaro/.bashrc: Unknown file type; file ignored
| tar: ./home/linaro/.profile: Unknown file type; file ignored
| tar: Exiting with failure status due to previous errors
| WARNING: /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075:146
exit 1 from '[ $? -eq 1 ]'
| WARNING: Backtrace (BB generated script):
| #1: do_image_tar,
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075,
line 146
| #2: main, /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075,
line 152
NOTE: recipe rpb-weston-image-1.0-r0: task do_image_tar: Failed
ERROR: Task (/oe/build/conf/../../layers/meta-rpb/recipes-samples/images/rpb-weston-image.bb:do_image_tar)
failed with exit code '1'

| DEBUG: Python function extend_recipe_sysroot finished
| DEBUG: Executing python function set_image_size
| DEBUG: 1402908.000000 = 1079160 * 1.300000
| DEBUG: 1402908.000000 = max(1402908.000000, 65536)[1402908.000000] + 0
| DEBUG: 1402908.000000 = int(1402908.000000)
| DEBUG: 1404928 = aligned(1402908)
| DEBUG: returning 1404928
| DEBUG: Python function set_image_size finished
| DEBUG: Executing shell function do_image_ext4
| DEBUG: Executing dd if=/dev/zero
of=/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/deploy-rpb-weston-image-image-complete/rpb-weston-image-qcom-armv8a.rootfs-20240117062023.ext4
seek=1404928 count=0 bs=1024
| 0+0 records in
| 0+0 records out
| 0 bytes copied, 9.0878e-05 s, 0.0 kB/s
| DEBUG: Actual Rootfs size:  1075376
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/rootfs
| DEBUG: Actual Partition size: 1438646272
| DEBUG: Executing mkfs.ext4 -F -b 4096
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/deploy-rpb-weston-image-image-complete/rpb-weston-image-qcom-armv8a.rootfs-20240117062023.ext4
-d /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/rootfs
| mke2fs 1.47.0 (5-Feb-2023)
| Discarding device blocks:      0/351232
             done
| Creating filesystem with 351232 4k blocks and 87824 inodes
| Filesystem UUID: 27cefb6f-e38e-44ed-ab0f-4b613d3594f2
| Superblock backups stored on blocks:
| 32768, 98304, 163840, 229376, 294912
|
| Allocating group tables:  0/11               done
| Writing inode tables:  0/11               done
| Creating journal (8192 blocks): done
| Copying files into the device: __populate_fs: ignoring entry ".bashrc"
| .bashrc: File not found by ext2_lookup while looking up ".bashrc"
| mkfs.ext4: File not found by ext2_lookup while populating file system
| WARNING: /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_ext4.160095:176
exit 1 from 'mkfs.$fstype -F $extra_imagecmd
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/deploy-rpb-weston-image-image-complete/rpb-weston-image-qcom-armv8a.rootfs-20240117062023.$fstype
-d /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/rootfs'
| WARNING: Backtrace (BB generated script):
| #1: oe_mkext234fs,
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_ext4.160095,
line 176
| #2: do_image_ext4,
/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_ext4.160095,
line 146
| #3: main, /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_ext4.160095,
line 213
NOTE: recipe rpb-weston-image-1.0-r0: task do_image_ext4: Failed



>
> ---
> v2: patch only Makefiles that produce executables and libshadow.a
> (that executables all statically link with), do not patch libsubid/Makefile,
> as patching in .a linking can clash with producing dynamic libraries.
> libsubid is used only in getsubids executable, which is not used in
> setscene user management (or anywhere else from what I can see).
>
> v3: add -no-pie to linker flags, as otherwise some host distros
> would refuse to link against libattr produced on other host distros
> and supplied via sstate (libattr made with gcc 13 and used on gcc 11/12
> hosts seems to be problematic)
>
> Signed-off-by: Alexander Kanavin <alex@linutronix.de>
> ---
>  meta/conf/distro/include/no-static-libs.inc |  5 +++++
>  meta/recipes-extended/shadow/shadow.inc     | 10 ++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/meta/conf/distro/include/no-static-libs.inc b/meta/conf/distro/include/no-static-libs.inc
> index 75359928a14..8898d53d756 100644
> --- a/meta/conf/distro/include/no-static-libs.inc
> +++ b/meta/conf/distro/include/no-static-libs.inc
> @@ -21,6 +21,11 @@ DISABLE_STATIC:pn-libusb1-native = ""
>  # needed by rust
>  DISABLE_STATIC:pn-musl = ""
>
> +# needed by shadow-native to build static executables, particularly useradd
> +DISABLE_STATIC:pn-attr-native = ""
> +DISABLE_STATIC:pn-libbsd-native = ""
> +DISABLE_STATIC:pn-libmd-native = ""
> +
>  EXTRA_OECONF:append = "${DISABLE_STATIC}"
>
>  EXTRA_OECMAKE:append:pn-libical = " -DSHARED_ONLY=True"
> diff --git a/meta/recipes-extended/shadow/shadow.inc b/meta/recipes-extended/shadow/shadow.inc
> index c024746d4ff..43f456251a5 100644
> --- a/meta/recipes-extended/shadow/shadow.inc
> +++ b/meta/recipes-extended/shadow/shadow.inc
> @@ -47,6 +47,16 @@ EXTRA_OECONF += "--without-libcrack \
>
>  CFLAGS:append:libc-musl = " -DLIBBSD_OVERLAY"
>
> +# Force static linking of utilities so we can use from the sysroot/sstate for useradd
> +# without worrying about the dependency libraries being available
> +LDFLAGS:append:class-native = " -no-pie"
> +do_compile:prepend:class-native () {
> +       sed -i -e 's#\(LIBS.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' \
> +              -e 's#\(LIBBSD.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' \
> +              -e 's#\(LIBATTR.*\)-lattr#\1 ${STAGING_LIBDIR}/libattr.a#g' \
> +               ${B}/lib/Makefile ${B}/src/Makefile
> +}
> +
>  NSCDOPT = ""
>  NSCDOPT:class-native = "--without-nscd"
>  NSCDOPT:class-nativesdk = "--without-nscd"
> --
> 2.39.2
>
>
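The effect of the three sed expressions in do_compile:prepend can be tried out on a sample Makefile fragment. This is only a minimal sketch: /sysroot/usr/lib is a hypothetical stand-in for ${STAGING_LIBDIR}, the function name static_link_rewrite is made up for illustration, and the input lines are invented rather than copied from shadow's generated Makefiles.

```shell
#!/bin/sh
# Hypothetical stand-in for ${STAGING_LIBDIR}; in the recipe BitBake expands it.
STAGING_LIBDIR=/sysroot/usr/lib

# Rewrite -lbsd / -lattr on matching lines to the static archives,
# using the same three expressions as the patch (reading stdin here
# instead of editing the Makefiles in place).
static_link_rewrite() {
    sed -e "s#\(LIBS.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g" \
        -e "s#\(LIBBSD.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g" \
        -e "s#\(LIBATTR.*\)-lattr#\1 ${STAGING_LIBDIR}/libattr.a#g"
}

# Invented sample input: two lines that should be rewritten, one that should not.
printf '%s\n' 'LIBS = -lbsd' 'LIBATTR = -lattr' 'CFLAGS = -O2' | static_link_rewrite
```

Lines that mention neither library pass through unchanged, and libmd.a is appended next to libbsd.a wherever -lbsd is replaced.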
Dmitry Baryshkov Jan. 18, 2024, 9:50 a.m. UTC | #2

Alexander, any additional ideas on how we can debug this? So far we
observe this kind of issue randomly. Vishal was able to reproduce the
issue on the Ubuntu 22.04 host.
I was not able to reproduce the issue with OE-Core up to the commit
e85069acf304 ("shadow: update 4.13 -> 4.14.2").

You can find the full success and failure logs e.g. via the following link:
https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/nicolas/plans/2ay49JMJCSuYzpqrGBP8bXqyb78

Richard Purdie Jan. 18, 2024, 9:59 a.m. UTC | #3
On Wed, 2024-01-17 at 14:46 +0200, Dmitry Baryshkov wrote:
> On Thu, 11 Jan 2024 at 15:15, Alexander Kanavin <alex.kanavin@gmail.com> wrote:
> > 
> > shadow 4.14.x adds a number of libraries it dynamically links with
> > (md, bsd, attr). This causes troubles in setscene tasks where
> > shadow executables are used (such as useradd), as pulling in
> > the needed dynamic libraries needs unpleasant special-casing.
> > 
> > Signed-off-by: Alexander Kanavin <alex@linutronix.de>
> 
> It seems, this is causing issues with the TuxOE builds. We have been
> observing issues with the TuxOE build environment with the image
> creation choking on the home dirs. Reverting this patch seems to fix
> the problem. The build environment is Ubuntu 20.04 running in a
> container on Ubuntu 22.04.
> 
> ERROR: rpb-weston-image-1.0-r0 do_image_tar:
> ExecutionError('/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075',
> 1, None, None)
> ERROR: Logfile of failure stored in:
> /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/log.do_image_tar.160075
> Log data follows:
> | DEBUG: Executing python function set_image_size
> | DEBUG: 1402908.000000 = 1079160 * 1.300000
> | DEBUG: 1402908.000000 = max(1402908.000000, 65536)[1402908.000000] + 0
> | DEBUG: 1402908.000000 = int(1402908.000000)
> | DEBUG: 1404928 = aligned(1402908)
> | DEBUG: returning 1404928
> | DEBUG: Python function set_image_size finished
> | DEBUG: Executing shell function do_image_tar
> | tar: ./home/linaro/.bashrc: Unknown file type; file ignored
> | tar: ./home/linaro/.profile: Unknown file type; file ignored
> | tar: Exiting with failure status due to previous errors
> | WARNING: /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075:146
> exit 1 from '[ $? -eq 1 ]'
> 

The error is coming from tar during archive creation, "Unknown file
type; file ignored". I'm a little confused/concerned about what it is
seeing which it can't handle.

It might also be good to work out if that is tar from the host or tar
from tar-native. Is the host's tar unable to support something we're
relying upon?

If it were me, I'd probably have a look into the tar source code too,
see what might trigger an error like that.

Cheers,

Richard
Alexander Kanavin Jan. 18, 2024, 10:13 a.m. UTC | #4
I'd like to clarify the 'randomly' part: does the failure disappear if
you re-run bitbake on the same build directory, or is it random only
between different builds? If it's deterministic in the same build
directory, then you can perhaps narrow it down to the specific tar
invocation directly from the command line, and then try to see what it
is in the tree tar operates on that triggers the error.

Alex

Dmitry Baryshkov Jan. 18, 2024, 1:32 p.m. UTC | #5
On Thu, 18 Jan 2024 at 12:13, Alexander Kanavin <alex.kanavin@gmail.com> wrote:
>
> I'd like to clarify the 'randomly' part: does the failure disappear if
> you re-run bitbake on the same build directory, or is it random only
> between different builds?

It is random between different builds. Rerunning the build frequently
(but not always) makes it go away.

> If it's deterministic in the same build
> directory, then you can narrow it down to the specific tar invocation
> directly from command line perhaps, and then try to see what is it in
> the tree that tar operates on, that triggers the error.
>
> Alex
Dmitry Baryshkov Jan. 18, 2024, 1:37 p.m. UTC | #6
On Thu, 18 Jan 2024 at 11:59, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> The error is coming from tar during archive creation, "Unknown file
> type; file ignored". I'm a little confused/concerned about what it is
> seeing which it can't handle.
>
> It might also be good to work out if that is tar from the host or tar
> from tar-native. Is the host's tar unable to support something we're
> relying upon?
>
> If it were me, I'd probably have a look into the tar source code too,
> see what might trigger an error like that.

I compared this to the ext4 creation error. The code for __populate_fs
is easier to follow. Basically this error means that st.st_mode
doesn't match any of the S_IFCHR, S_IFBLK, S_IFREG and S_IFDIR checks.
I assume the files in the home dir are created by useradd in some way
that bypasses pseudo. Then when tar / mkfs are executed, pseudo doesn't
know about the file and returns a bad st_mode through lstat().
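The st_mode check described above can be sketched in shell. This is an illustration only, not code from tar or e2fsprogs: the helper names mode_type and report_unknown_types are made up, and the scan relies on GNU stat's %f (raw mode in hex) and %n formats. To see what tar or mkfs see, it would presumably have to run in the same pseudo context as the image task; outside pseudo the files may well look like ordinary regular files.

```shell
#!/bin/sh
# Map the file-type bits of a raw st_mode (hex, as printed by GNU `stat -c %f`)
# to a name. "unknown" means none of the S_IF* type values matched.
mode_type() {
    # 0xf000 is the S_IFMT mask (0170000 octal).
    case $(printf '%x' $(( 0x$1 & 0xf000 ))) in
        8000) echo regular ;;    # S_IFREG
        4000) echo directory ;;  # S_IFDIR
        2000) echo chardev ;;    # S_IFCHR
        6000) echo blockdev ;;   # S_IFBLK
        a000) echo symlink ;;    # S_IFLNK
        1000) echo fifo ;;       # S_IFIFO
        c000) echo socket ;;     # S_IFSOCK
        *)    echo unknown ;;
    esac
}

# Print "<hex mode> <path>" for every entry under directory $1 whose
# type bits are not recognised. File names containing newlines are not handled.
report_unknown_types() {
    find "$1" -exec stat -c '%f %n' {} + | while read -r mode name; do
        [ "$(mode_type "$mode")" = unknown ] && printf '%s %s\n' "$mode" "$name"
    done
    return 0
}
```

A call such as report_unknown_types on the image rootfs directory would then be expected to list ./home/linaro/.bashrc and .profile if the assumption about a missing file type in st_mode holds.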
Richard Purdie Jan. 18, 2024, 1:51 p.m. UTC | #7
On Thu, 2024-01-18 at 15:37 +0200, Dmitry Baryshkov wrote:
> I compared this to the ext4 creation error. The code for __populate_fs
> is easier to follow. Basically this error means that st.st_mode
> doesn't match any of the S_IFCHR, S_IFBLK, S_IFREG and S_IFDIR checks.
> I assume the files in the home dir are created by useradd in some way that
> bypasses pseudo. Then when tar / mkfs are executed, pseudo doesn't
> know about the file and returns a bad st_mode through lstat().

Files shouldn't be getting created in useradd that pseudo doesn't know
about. If they are, we could be missing an intercept on some glibc
function call for example.

Is this in a multiple worker setup with a shared sstate? We need to
track down which OS the escape is happening on. In theory it should be
reproducible. Do you have anything with a bleeding edge glibc there
(e.g. gentoo)?

Cheers,

Richard
Dmitry Baryshkov Jan. 18, 2024, 3:12 p.m. UTC | #8
On Thu, 18 Jan 2024 at 15:51, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Thu, 2024-01-18 at 15:37 +0200, Dmitry Baryshkov wrote:
> > On Thu, 18 Jan 2024 at 11:59, Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> > >
> > > On Wed, 2024-01-17 at 14:46 +0200, Dmitry Baryshkov wrote:
> > > > On Thu, 11 Jan 2024 at 15:15, Alexander Kanavin <alex.kanavin@gmail.com> wrote:
> > > > >
> > > > > shadow 4.14.x adds a number of libraries it dynamically links with
> > > > > (md, bsd, attr). This causes troubles in setscene tasks where
> > > > > shadow executables are used (such as useradd), as pulling in
> > > > > the needed dynamic libraries needs unpleasant special-casing.
> > > > >
> > > > > Signed-off-by: Alexander Kanavin <alex@linutronix.de>
> > > >
> > > > It seems, this is causing issues with the TuxOE builds. We have been
> > > > observing issues with the TuxOE build environment with the image
> > > > creation choking on the home dirs. Reverting this patch seems to fix
> > > > the problem. The build environment is Ubuntu 20.04 running in a
> > > > container on Ubuntu 22.04.
> > > >
> > > > ERROR: rpb-weston-image-1.0-r0 do_image_tar:
> > > > ExecutionError('/oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075',
> > > > 1, None, None)
> > > > ERROR: Logfile of failure stored in:
> > > > /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/log.do_image_tar.160075
> > > > Log data follows:
> > > > > DEBUG: Executing python function set_image_size
> > > > > DEBUG: 1402908.000000 = 1079160 * 1.300000
> > > > > DEBUG: 1402908.000000 = max(1402908.000000, 65536)[1402908.000000] + 0
> > > > > DEBUG: 1402908.000000 = int(1402908.000000)
> > > > > DEBUG: 1404928 = aligned(1402908)
> > > > > DEBUG: returning 1404928
> > > > > DEBUG: Python function set_image_size finished
> > > > > DEBUG: Executing shell function do_image_tar
> > > > > tar: ./home/linaro/.bashrc: Unknown file type; file ignored
> > > > > tar: ./home/linaro/.profile: Unknown file type; file ignored
> > > > > tar: Exiting with failure status due to previous errors
> > > > > WARNING: /oe/build/tmp-rpb_wayland-glibc/work/qcom_armv8a-linaro-linux/rpb-weston-image/1.0/temp/run.do_image_tar.160075:146
> > > > exit 1 from '[ $? -eq 1 ]'
> > > >
> > >
> > > The error is coming from tar during archive creation, "Unknown file
> > > type; file ignored". I'm a little confused/concerned about what it is
> > > seeing which it can't handle.
> > >
> > > It might also be good to work out if that is tar from the host or tar
> > > from tar-native. Is the host's tar unable to support something we're
> > > relying upon?
> > >
> > > If it were me, I'd probably have a look into the tar source code too,
> > > see what might trigger an error like that.
> >
> > I compared this to the ext4 creation error. The code for __populate_fs
> > is easier to follow. Basically this error means that st.st_mode
> > doesn't match any of the S_IFCHR, S_IFBLK, S_IFREG and S_IFDIR checks.
> > I assume the files in the home dir are created by useradd in some way that
> > bypasses pseudo. Then when tar / mkfs are executed, pseudo doesn't
> > know about the file and returns a bad st_mode through lstat().
>
> Files shouldn't be getting created in useradd that pseudo doesn't know
> about. If they are, we could be missing an intercept on some glibc
> function call for example.
>
> Is this in a multiple worker setup with a shared sstate? We need to
> track down which OS the escape is happening on. In theory it should be
> reproducible. Do you have anything with a bleeding edge glibc there
> (e.g. gentoo)?

We initially saw this on Ubuntu 20.04 inside Docker on top of an Ubuntu
22.04 host. Then Vishal was able to reproduce this on a bare Ubuntu
22.04 host.
Bisection points to this particular patch, so I assume that static
linking causes some issue here. For example, one of the syscalls might
get inlined in one of these static libs, so that pseudo is no longer
able to override it.
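
One coarse check related to this hypothesis is whether the resulting useradd binary still requests a dynamic loader at all, since LD_PRELOAD-based interception (which pseudo relies on) only applies to executables the loader processes. The sketch below is an assumption-laden illustration, not something run as part of this report: the function name has_interp is hypothetical, and it handles ELF64 images only. A True result does not prove that every call is interceptable; it only rules out the fully static case.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interp(data: bytes) -> bool:
    """Return True if an ELF64 image contains a PT_INTERP program header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if data[4] != 2:
        raise ValueError("only ELF64 is handled in this sketch")
    # e_ident[5] is 1 for little-endian and 2 for big-endian images.
    endian = "<" if data[5] == 1 else ">"
    # In the ELF64 header, e_phoff is at byte 32, and e_phentsize and
    # e_phnum are at bytes 54 and 56.
    (phoff,) = struct.unpack_from(endian + "Q", data, 32)
    phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    for i in range(phnum):
        # p_type is the first 32-bit field of each program header.
        (p_type,) = struct.unpack_from(endian + "I", data, phoff + i * phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Usage would be along the lines of has_interp(open(path_to_useradd, "rb").read()), where path_to_useradd is whatever location the native sysroot provides.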

--
With best wishes
Dmitry
Patch

diff --git a/meta/conf/distro/include/no-static-libs.inc b/meta/conf/distro/include/no-static-libs.inc
index 75359928a14..8898d53d756 100644
--- a/meta/conf/distro/include/no-static-libs.inc
+++ b/meta/conf/distro/include/no-static-libs.inc
@@ -21,6 +21,11 @@  DISABLE_STATIC:pn-libusb1-native = ""
 # needed by rust
 DISABLE_STATIC:pn-musl = ""
 
+# needed by shadow-native to build static executables, particularly useradd
+DISABLE_STATIC:pn-attr-native = ""
+DISABLE_STATIC:pn-libbsd-native = ""
+DISABLE_STATIC:pn-libmd-native = ""
+
 EXTRA_OECONF:append = "${DISABLE_STATIC}"
 
 EXTRA_OECMAKE:append:pn-libical = " -DSHARED_ONLY=True"
diff --git a/meta/recipes-extended/shadow/shadow.inc b/meta/recipes-extended/shadow/shadow.inc
index c024746d4ff..43f456251a5 100644
--- a/meta/recipes-extended/shadow/shadow.inc
+++ b/meta/recipes-extended/shadow/shadow.inc
@@ -47,6 +47,16 @@  EXTRA_OECONF += "--without-libcrack \
 
 CFLAGS:append:libc-musl = " -DLIBBSD_OVERLAY"
 
+# Force static linking of utilities so we can use from the sysroot/sstate for useradd
+# without worrying about the dependency libraries being available
+LDFLAGS:append:class-native = " -no-pie"
+do_compile:prepend:class-native () {
+	sed -i -e 's#\(LIBS.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' \
+	       -e 's#\(LIBBSD.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' \
+	       -e 's#\(LIBATTR.*\)-lattr#\1 ${STAGING_LIBDIR}/libattr.a#g' \
+               ${B}/lib/Makefile ${B}/src/Makefile
+}
+
 NSCDOPT = ""
 NSCDOPT:class-native = "--without-nscd"
 NSCDOPT:class-nativesdk = "--without-nscd"
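
The effect of the three sed expressions in the do_compile:prepend hunk above can be illustrated with a small Python sketch that applies equivalent substitutions to sample lines. The function name link_statically, the sample Makefile lines, and the /sysroot/usr/lib path (standing in for ${STAGING_LIBDIR}) are all hypothetical; the real generated Makefiles may look different.

```python
import re


def link_statically(line: str, libdir: str) -> str:
    """Apply the three substitutions from the recipe to one Makefile line."""
    bsd = f" {libdir}/libbsd.a {libdir}/libmd.a"
    attr = f" {libdir}/libattr.a"
    # Same order as the sed invocation: LIBS, then LIBBSD, then LIBATTR.
    # A function is used as the replacement so that backslashes in libdir
    # are never interpreted as escapes.
    line = re.sub(r"(LIBS.*)-lbsd", lambda m: m.group(1) + bsd, line)
    line = re.sub(r"(LIBBSD.*)-lbsd", lambda m: m.group(1) + bsd, line)
    line = re.sub(r"(LIBATTR.*)-lattr", lambda m: m.group(1) + attr, line)
    return line


# Hypothetical Makefile lines for illustration only.
for sample in ("LIBBSD = -lbsd", "LIBATTR = -lattr", "CC = gcc"):
    print(link_statically(sample, "/sysroot/usr/lib"))
```

Because the replacement names the .a archives by full path, the linker takes the static archives directly instead of resolving -lbsd or -lattr to a shared library, which is why the patch also re-enables static builds for attr-native, libbsd-native and libmd-native.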