[3/3] oe/utils: by default cap cpu_count() to 64 cores

Message ID 20211124171529.4107434-3-ross.burton@arm.com
State Accepted, archived
Commit 765d0f25ce48636b1838a5968e2dc15de2127428
Headers show
Series [1/3] oe/utils: allow naming threads in ThreadedPool | expand

Commit Message

Ross Burton Nov. 24, 2021, 5:15 p.m. UTC
Larger systems may have large numbers of cores, but beyond a certain
point they cannot all be used effectively: whilst purely
compute-intensive jobs can be parallelised across hundreds of cores,
operations such as compressing (which needs lots of RAM) or compiling
(which is I/O-heavy) don't scale linearly.

For example, the Marvell ThunderX2 has 32 cores, each capable of
executing four threads, and can be configured with two sockets, giving
256 CPUs according to Linux. Zstd using 256 threads has been seen to
fail to allocate memory even during small recipes such as iso-codes.

Add a default cap of 64 CPUs to the cpu_count() method so that extreme
parallelisation is limited.  64 is high enough that most systems won't
be affected, while meaningful gains beyond it are unlikely.

Signed-off-by: Ross Burton <ross.burton@arm.com>
---
 meta/lib/oe/utils.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
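The patched helper can be sketched standalone (assuming Linux, where `os.sched_getaffinity` is available; the function below mirrors the `cpu_count()` change in the diff):

```python
import os

def cpu_count(at_least=1, at_most=64):
    # CPUs this process may actually run on (honours taskset, cgroup
    # cpusets, etc.), clamped into the range [at_least, at_most].
    cpus = len(os.sched_getaffinity(0))
    return max(min(cpus, at_most), at_least)

print(cpu_count())  # never more than 64 by default
```

Note the ordering: `min()` applies the cap first, so an explicit `at_least` larger than `at_most` still wins, preserving the existing floor semantics.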

Comments

Konrad Weihmann Nov. 24, 2021, 5:20 p.m. UTC | #1
On 24.11.21 18:15, Ross Burton wrote:
> Larger systems may have large numbers of cores, but beyond a certain
> point they cannot all be used effectively: whilst purely
> compute-intensive jobs can be parallelised across hundreds of cores,
> operations such as compressing (which needs lots of RAM) or compiling
> (which is I/O-heavy) don't scale linearly.
> 
> For example, the Marvell ThunderX2 has 32 cores, each capable of
> executing four threads, and can be configured with two sockets, giving
> 256 CPUs according to Linux. Zstd using 256 threads has been seen to
> fail to allocate memory even during small recipes such as iso-codes.
> 
> Add a default cap of 64 CPUs to the cpu_count() method so that extreme
> parallelisation is limited.  64 is high enough that most systems won't
> be affected, while meaningful gains beyond it are unlikely.
> 
> Signed-off-by: Ross Burton <ross.burton@arm.com>
> ---
>   meta/lib/oe/utils.py | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/meta/lib/oe/utils.py b/meta/lib/oe/utils.py
> index 7982b2b511..136650e6f7 100644
> --- a/meta/lib/oe/utils.py
> +++ b/meta/lib/oe/utils.py
> @@ -248,9 +248,9 @@ def trim_version(version, num_parts=2):
>       trimmed = ".".join(parts[:num_parts])
>       return trimmed
>   
> -def cpu_count(at_least=1):
> +def cpu_count(at_least=1, at_most=64):
>       cpus = len(os.sched_getaffinity(0))
> -    return max(cpus, at_least)
> +    return max(min(cpus, at_most), at_least)

I like this patch; 64 threads seems a reasonable choice to me.
Can we somehow have that documented, in the migration guide for 
instance, as I think there may be one or another user out there who 
will notice a change in their build performance.

>   
>   def execute_pre_post_process(d, cmds):
>       if cmds is None:

Patch

diff --git a/meta/lib/oe/utils.py b/meta/lib/oe/utils.py
index 7982b2b511..136650e6f7 100644
--- a/meta/lib/oe/utils.py
+++ b/meta/lib/oe/utils.py
@@ -248,9 +248,9 @@ def trim_version(version, num_parts=2):
     trimmed = ".".join(parts[:num_parts])
     return trimmed
 
-def cpu_count(at_least=1):
+def cpu_count(at_least=1, at_most=64):
     cpus = len(os.sched_getaffinity(0))
-    return max(cpus, at_least)
+    return max(min(cpus, at_most), at_least)
 
 def execute_pre_post_process(d, cmds):
     if cmds is None:
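The clamping expression itself is easy to check at the boundaries; the `clamped()` helper below is hypothetical, taking the raw CPU count as a parameter purely for illustration:

```python
def clamped(cpus, at_least=1, at_most=64):
    # Same expression as the patch: cap high counts, then enforce the floor.
    return max(min(cpus, at_most), at_least)

# The dual-socket ThunderX2 case from the commit message: 256 is capped to 64.
assert clamped(256) == 64
# Machines below the cap are unaffected.
assert clamped(8) == 8
# The at_least floor is applied last, so it overrides the cap if needed.
assert clamped(256, at_least=128) == 128
```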