| Message ID | 20211124171529.4107434-3-ross.burton@arm.com |
|---|---|
| State | Accepted, archived |
| Commit | 765d0f25ce48636b1838a5968e2dc15de2127428 |
| Series | [1/3] oe/utils: allow naming threads in ThreadedPool |
On 24.11.21 18:15, Ross Burton wrote:
> Larger systems may have large numbers of cores, but beyond a certain
> point they can't all be used for compiling: whilst purely
> compute-intensive jobs can be parallelised to hundreds of cores,
> operations such as compressing (needs lots of RAM) or compiling (lots of
> I/O) don't scale linearly.
>
> For example, the Marvell ThunderX2 has 32 cores, each capable of
> executing four threads, and can be configured with two sockets, making
> 256 CPUs according to Linux. Zstd using 256 threads has been seen to
> fail to allocate memory during even small recipes such as iso-codes.
>
> Add a default cap of 64 CPUs to the cpu_count() method so that extreme
> parallelisation is limited. 64 is high enough that meaningful gains
> beyond it are unlikely, and high enough that most systems won't be
> affected.
>
> Signed-off-by: Ross Burton <ross.burton@arm.com>
> ---
>  meta/lib/oe/utils.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/meta/lib/oe/utils.py b/meta/lib/oe/utils.py
> index 7982b2b511..136650e6f7 100644
> --- a/meta/lib/oe/utils.py
> +++ b/meta/lib/oe/utils.py
> @@ -248,9 +248,9 @@ def trim_version(version, num_parts=2):
>      trimmed = ".".join(parts[:num_parts])
>      return trimmed
>
> -def cpu_count(at_least=1):
> +def cpu_count(at_least=1, at_most=64):
>      cpus = len(os.sched_getaffinity(0))
> -    return max(cpus, at_least)
> +    return max(min(cpus, at_most), at_least)

I like that patch; 64 threads seems a reasonable choice to me. Could we somehow have that documented, in the migration guide for instance? I think there may be one or another user out there who will notice a change in their build performance.

>
>  def execute_pre_post_process(d, cmds):
>      if cmds is None:
```diff
diff --git a/meta/lib/oe/utils.py b/meta/lib/oe/utils.py
index 7982b2b511..136650e6f7 100644
--- a/meta/lib/oe/utils.py
+++ b/meta/lib/oe/utils.py
@@ -248,9 +248,9 @@ def trim_version(version, num_parts=2):
     trimmed = ".".join(parts[:num_parts])
     return trimmed
 
-def cpu_count(at_least=1):
+def cpu_count(at_least=1, at_most=64):
     cpus = len(os.sched_getaffinity(0))
-    return max(cpus, at_least)
+    return max(min(cpus, at_most), at_least)
 
 def execute_pre_post_process(d, cmds):
     if cmds is None:
```
Larger systems may have large numbers of cores, but beyond a certain point they can't all be used for compiling: whilst purely compute-intensive jobs can be parallelised to hundreds of cores, operations such as compressing (needs lots of RAM) or compiling (lots of I/O) don't scale linearly.

For example, the Marvell ThunderX2 has 32 cores, each capable of executing four threads, and can be configured with two sockets, making 256 CPUs according to Linux. Zstd using 256 threads has been seen to fail to allocate memory during even small recipes such as iso-codes.

Add a default cap of 64 CPUs to the cpu_count() method so that extreme parallelisation is limited. 64 is high enough that meaningful gains beyond it are unlikely, and high enough that most systems won't be affected.

Signed-off-by: Ross Burton <ross.burton@arm.com>
---
 meta/lib/oe/utils.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
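The effect of the patched `cpu_count()` can be sketched standalone. The clamp below mirrors the `max(min(...))` arithmetic from the diff, with the `os.sched_getaffinity(0)` call replaced by an explicit CPU count so the boundary behaviour is visible (the helper name `clamp_cpus` is illustrative, not part of oe.utils):

```python
def clamp_cpus(cpus, at_least=1, at_most=64):
    """Mirror the patched cpu_count() arithmetic: clamp cpus to [at_least, at_most]."""
    return max(min(cpus, at_most), at_least)

# A two-socket ThunderX2 exposing 256 CPUs is now capped at 64 threads:
print(clamp_cpus(256))  # 64
# Typical build hosts are unaffected:
print(clamp_cpus(8))    # 8
# The pre-existing lower bound still applies:
print(clamp_cpus(0))    # 1
```

Since `min` is applied before `max`, an empty or zero affinity mask still yields `at_least`, preserving the old guarantee while adding the new ceiling.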