Message ID | 2ed08c8656181cf2b4495978e28e70c5274fb7e7.camel@etorok.net |
---|---|
State | New |
Series | xz: do not OOM with 3x builds, lower memlimit |
I'm not sure this is right as a global change. This fixes situations
where 3 xz instances run on your specific setup, what about 4? Or 5?
Or n instances for all n in N?

I think you need to limit the amount of threads rather to match
available RAM, and do it via site.conf and not bitbake.conf. ?=
assignment is used in bitbake.conf exactly for that purpose.

Alex

On Sat, 27 Aug 2022 at 20:15, Edwin Török via lists.openembedded.org
<edwin=etorok.net@lists.openembedded.org> wrote:
>
> By default a build such as [1] might run 3 'xz' in parallel:
> ```
> Currently 3 running tasks (11878 of 11883) 99%
> |######################################################################
> #### |
> 0: demo-coreip-xfce4-1.0-r0 do_image_ext4 - 3m17s (pid 2088739)
> 1: demo-coreip-xfce4-1.0-r0 do_image_tar - 3m16s (pid 2088743)
> 2: demo-coreip-xfce4-1.0-r0 do_image_wic - 3m16s (pid 2088745)
> ```
>
> However the default memory usage limit of `xz` is 50% each, so this will
> attempt to use 150% memory, and it gets OOM killed by systemd-oomd on
> Fedora 36.
> ```
> Aug 27 18:38:57 fedora systemd-oomd[3150]: Killed
> /user.slice/user-1000.slice/user@1000.service/app.slice/app-
> org.gnome.Terminal.slice/vte-spawn-2d92eb7b-b005-41b4-a786-
> fc8c0d360ce3.scope due to memory used (66890584064) / total
> (67332812800) and swap used (7744446464) / total (8589930496) being
> more than 90.00%
> ```
>
> Even with systemd-oomd turned off it'd eventually start swapping heavily
> on a system with 64GiB of physical memory and 8GiB of swap.
>
> Reduce memory limit on xz so that we can run 3 in parallel without
> driving the host close to or OOM. 25% seems to work on this particular
> build and allows it to complete successfully.
>
> [1] https://github.com/sifive/freedom-u-sdk/tree/2022.06.00
>
> Signed-off-by: Edwin Török <edwin@etorok.net>
> ---
>  meta/conf/bitbake.conf | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
> index 2a3cf6f8aa..48ba52c12c 100644
> --- a/meta/conf/bitbake.conf
> +++ b/meta/conf/bitbake.conf
> @@ -857,7 +857,8 @@ BB_NUMBER_THREADS ?= "${@oe.utils.cpu_count()}"
>  PARALLEL_MAKE ?= "-j ${@oe.utils.cpu_count()}"
>
>  # Default parallelism and resource usage for xz
> -XZ_MEMLIMIT ?= "50%"
> +# A build might run 3 'xz' in parallel, so don't exhaust memory
> +XZ_MEMLIMIT ?= "25%"
>  XZ_THREADS ?= "${@oe.utils.cpu_count(at_least=2)}"
>  XZ_THREADS[vardepvalue] = "1"
>  XZ_DEFAULTS ?= "--memlimit=${XZ_MEMLIMIT} --threads=${XZ_THREADS}"
On Sat, 2022-08-27 at 20:23 +0200, Alexander Kanavin wrote:
> I'm not sure this is right as a global change. This fixes situations
> where 3 xz instances run on your specific setup, what about 4? Or 5?
> Or n instances for all n in N?

Thanks for the quick reply. I'd like to make the defaults safe for
everyone, perhaps a simple fix would be to set XZ_MEMLIMIT to just
below 100/BB_NUMBER_THREADS by default?

> I think you need to limit the amount of threads rather to match
> available RAM, and do it via site.conf and not bitbake.conf. ?=

I don't think it is currently possible to limit the number of 'xz'
processes though. (Note that they are 3 distinct 'xz' processes, so
setting XZ_THREADS has no effect, which controls only the number of
threads that a single 'xz' process would use). The closest would be
BB_NUMBER_THREADS, but that controls a lot of other things too
(parallel package fetch perhaps)?

> assignment is used in bitbake.conf exactly for that purpose.

Currently the openembedded defaults don't work: the build always OOMs,
at least on this particular target (or one with a large enough rootfs).
The defaults should be conservative enough so they work on all systems
(i.e. memlimit * number of xz processes < 100%), the user can then
further tweak the memlimit vs number of processes (note *not* threads)
tradeoff in their local conf file as you suggest.
Best regards,
--Edwin
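The rule stated in this reply (memlimit * number of xz processes < 100%) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the patch or from bitbake: the function name and the `headroom_percent` parameter are hypothetical, and the number of parallel xz processes is assumed to be bounded by BB_NUMBER_THREADS.

```python
def safe_xz_memlimit_percent(parallel_xz, headroom_percent=10):
    """Return a whole-percent memlimit for each xz process.

    The result is chosen so that `parallel_xz` processes together stay at or
    below (100 - headroom_percent) percent of RAM. It is clamped to at least
    1%, so for very large process counts the bound no longer holds.
    Both parameter names are hypothetical, for illustration only.
    """
    if parallel_xz < 1:
        raise ValueError("parallel_xz must be at least 1")
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")
    return max(1, (100 - headroom_percent) // parallel_xz)


if __name__ == "__main__":
    # Assumed upper bounds on concurrent xz processes (e.g. BB_NUMBER_THREADS).
    for n in (1, 3, 4, 16):
        pct = safe_xz_memlimit_percent(n)
        print(f"{n} xz processes: --memlimit={pct}% each, {n * pct}% in total")
```

With three image tasks this gives a per-process limit of the same order as the 25% used in the patch, while still shrinking as the task count grows.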
On Sat, 27 Aug 2022 at 20:45, Edwin Török <edwin@etorok.net> wrote:
> I don't think it is currently possible to limit the number of 'xz'
> processes though.

This is a general problem and has been discussed many times over.
Bitbake historically doesn't limit the amount of compiler instances or
any other processes that may start, and makes no promises about
avoiding OOM situations. If you find yourself in one, you need to
tighten BB_NUMBER_THREADS/PARALLEL_MAKE until OOM goes away, or add
more RAM. More conservative defaults will never be conservative enough
for every user, so we stick with defaults that make use of every
available CPU core.

That said, there are recent bitbake patches to track 'memory pressure'
and (I'm not sure exactly how) try to hold back if memory gets tight,
so you're welcome to take them and run experiments.

Alex
On Sat, 2022-08-27 at 21:00 +0200, Alexander Kanavin wrote:
> On Sat, 27 Aug 2022 at 20:45, Edwin Török <edwin@etorok.net> wrote:
> > I don't think it is currently possible to limit the number of 'xz'
> > processes though.
>
> This is a general problem and has been discussed many times over.
> Bitbake historically doesn't limit the amount of compiler instances or
> any other processes that may start, and makes no promises about
> avoiding OOM situations.

Indeed the more general problem of limiting memory used by compiler
processes is more difficult to solve.

> If you find yourself in one, you need to
> tighten BB_NUMBER_THREADS/PARALLEL_MAKE until OOM goes away

If XZ always uses 50% of available memory then adding more RAM won't
help though. The only safe value would be BB_NUMBER_THREADS=1.

There seems to be an unintended consequence of XZ_MEMLIMIT=50%.
AFAICT from the manpage 'xz -9' is meant to use 674MiB of memory at
most. XZ_MEMLIMIT=50% is 32GiB in my case, so it appears to increase
memory usage beyond what -9 would use (and compression time) by quite a
lot.
It can also create problems when decompressing on less powerful systems
(xz says it might need up to 20% of the compressor's memory, so ~6.4
GiB in my case, but that number could be higher if compressed on a
machine with even more RAM).

Unless the intention here was to use as much memory as available to
squeeze out a few extra savings in compressed image size I think that
perhaps the default should be '50%' or '674MiB', whichever is smaller.
(And then "add more RAM" is a perfectly valid way to get out of the OOM
situation)

What do you think, should I try a patch for that?

Thanks,
--Edwin

> , or add
> more RAM. More conservative defaults will never be conservative enough
> for every user, so we stick with defaults that make use of every
> available CPU core.
>
> That said, there are recent bitbake patches to track 'memory pressure'
> and (I'm not sure exactly how) try to hold back if memory gets tight,
> so you're welcome to take them and run experiments.
>
> Alex
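The "'50%' or '674MiB', whichever is smaller" idea from this message can be written out as a small calculation. The sketch below is only an illustration under stated assumptions: the 674 MiB figure is the one quoted from the xz manpage in the message above, the 64 GiB host size is the nominal figure from the thread, and the function and constant names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Figure quoted in the thread for 'xz -9'; treated here as a given.
XZ_PRESET9_COMPRESS_BYTES = 674 * MIB


def capped_memlimit_bytes(total_ram_bytes, percent=50,
                          cap_bytes=XZ_PRESET9_COMPRESS_BYTES):
    """Return min(percent of total RAM, cap_bytes) as a byte count."""
    return min(total_ram_bytes * percent // 100, cap_bytes)


if __name__ == "__main__":
    host_ram = 64 * GIB                      # nominal host size from the thread
    uncapped = host_ram * 50 // 100          # what XZ_MEMLIMIT=50% allows
    capped = capped_memlimit_bytes(host_ram)
    print(f"50% of RAM:        {uncapped / GIB:.1f} GiB")
    print(f"capped limit:      {capped / MIB:.0f} MiB")
    # The thread cites up to 20% of the compressor's memory for decompression.
    print(f"20% of uncapped:   {uncapped * 20 // 100 / GIB:.1f} GiB")
```

On a small host the percentage still wins (50% of 1 GiB is below the cap), so under this rule "add more RAM" only helps up to the point where the fixed cap takes over.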
On Sat, 27 Aug 2022 at 21:15, Edwin Török <edwin@etorok.net> wrote:
> If XZ always uses 50% of available memory then adding more RAM won't
> help though. The only safe value would be BB_NUMBER_THREADS=1.

But does it? I wonder if reducing XZ_THREADS will actually reduce the
RAM consumption as well. If the problem was common, it would be
reported a lot more, and yet it's not. So I'd suggest you run
experiments with XZ_THREADS first.

Alex
On 2022-08-27 15:20, Alexander Kanavin wrote:
> On Sat, 27 Aug 2022 at 21:15, Edwin Török <edwin@etorok.net> wrote:
>> If XZ always uses 50% of available memory then adding more RAM won't
>> help though. The only safe value would be BB_NUMBER_THREADS=1.
> But does it? I wonder if reducing XZ_THREADS will actually reduce the
> RAM consumption as well. If the problem was common, it would be
> reported a lot more, and yet it's not. So I'd suggest you run
> experiments with XZ_THREADS first.
>
> Alex

Edwin, Alex,

We have a shiny new hammer called PSI or /proc/pressure (1)
and I think that everyone should use it! ;-)

We should both try your build to see if it helps at this early stage
but 3x builds on a 64 GB machine does seem to be too much.

If not, maybe we could make a decision about the % of memory /
XZ_THREADS more dynamically based on available memory or even by
making xz be more aware of limited available RAM somehow.

../Randy

1) https://lore.kernel.org/bitbake-devel/?q=%2Fproc%2Fpressure
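For readers who have not used PSI, the sketch below shows how /proc/pressure/memory can be read and interpreted. It is not the bitbake implementation referred to above. It only assumes the kernel's documented file format (lines starting with `some` or `full`, followed by `avg10=`, `avg60=`, `avg300=` and `total=` fields); the function names and the 10% threshold are hypothetical.

```python
def parse_pressure(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        # Each remaining field looks like "avg10=0.00" or "total=12345".
        result[kind] = {key: float(value)
                        for key, value in (pair.split("=", 1) for pair in pairs)}
    return result


def memory_is_tight(text, threshold=10.0):
    """True if the 10-second 'some' average exceeds a (hypothetical) threshold."""
    return parse_pressure(text).get("some", {}).get("avg10", 0.0) > threshold


if __name__ == "__main__":
    try:
        with open("/proc/pressure/memory") as psi_file:
            current = psi_file.read()
    except OSError:
        # PSI is unavailable (non-Linux host, old kernel, or PSI disabled).
        print("PSI not available on this system")
    else:
        print(parse_pressure(current))
        print("memory tight:", memory_is_tight(current))
```

A scheduler could poll such a value before starting another memory-hungry task, which is the general idea behind the "hold back if memory gets tight" patches mentioned earlier in the thread.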
diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index 2a3cf6f8aa..48ba52c12c 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -857,7 +857,8 @@ BB_NUMBER_THREADS ?= "${@oe.utils.cpu_count()}"
 PARALLEL_MAKE ?= "-j ${@oe.utils.cpu_count()}"
 
 # Default parallelism and resource usage for xz
-XZ_MEMLIMIT ?= "50%"
+# A build might run 3 'xz' in parallel, so don't exhaust memory
+XZ_MEMLIMIT ?= "25%"
 XZ_THREADS ?= "${@oe.utils.cpu_count(at_least=2)}"
 XZ_THREADS[vardepvalue] = "1"
 XZ_DEFAULTS ?= "--memlimit=${XZ_MEMLIMIT} --threads=${XZ_THREADS}"
By default a build such as [1] might run 3 'xz' in parallel:
```
Currently 3 running tasks (11878 of 11883) 99%
|######################################################################
#### |
0: demo-coreip-xfce4-1.0-r0 do_image_ext4 - 3m17s (pid 2088739)
1: demo-coreip-xfce4-1.0-r0 do_image_tar - 3m16s (pid 2088743)
2: demo-coreip-xfce4-1.0-r0 do_image_wic - 3m16s (pid 2088745)
```

However the default memory usage limit of `xz` is 50% each, so this will
attempt to use 150% memory, and it gets OOM killed by systemd-oomd on
Fedora 36.
```
Aug 27 18:38:57 fedora systemd-oomd[3150]: Killed
/user.slice/user-1000.slice/user@1000.service/app.slice/app-
org.gnome.Terminal.slice/vte-spawn-2d92eb7b-b005-41b4-a786-
fc8c0d360ce3.scope due to memory used (66890584064) / total
(67332812800) and swap used (7744446464) / total (8589930496) being
more than 90.00%
```

Even with systemd-oomd turned off it'd eventually start swapping heavily
on a system with 64GiB of physical memory and 8GiB of swap.

Reduce memory limit on xz so that we can run 3 in parallel without
driving the host close to or OOM. 25% seems to work on this particular
build and allows it to complete successfully.

[1] https://github.com/sifive/freedom-u-sdk/tree/2022.06.00

Signed-off-by: Edwin Török <edwin@etorok.net>
---
 meta/conf/bitbake.conf | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
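The figures in the commit message can be checked with a little arithmetic. The sketch below uses only numbers that appear in the systemd-oomd log line and in the patch; the helper name `percent` is hypothetical.

```python
def percent(used, total):
    """Return used/total as a percentage."""
    return 100.0 * used / total


# Byte counts copied from the systemd-oomd log line in the commit message.
MEM_USED, MEM_TOTAL = 66890584064, 67332812800
SWAP_USED, SWAP_TOTAL = 7744446464, 8589930496

# Worst-case sum of the per-process limits for 3 parallel xz processes.
OLD_TOTAL_LIMIT = 3 * 50   # XZ_MEMLIMIT ?= "50%"
NEW_TOTAL_LIMIT = 3 * 25   # XZ_MEMLIMIT ?= "25%"

if __name__ == "__main__":
    print(f"memory used: {percent(MEM_USED, MEM_TOTAL):.1f}%")
    print(f"swap used:   {percent(SWAP_USED, SWAP_TOTAL):.1f}%")
    print(f"old combined limit: {OLD_TOTAL_LIMIT}% of RAM")
    print(f"new combined limit: {NEW_TOTAL_LIMIT}% of RAM")
```

Both memory and swap usage are above the 90% mark named in the log, and the combined limit only drops below 100% of RAM with the lower setting, as long as no more than three xz processes run at once.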