[RFC,0/1] package.bbclass: Expose list of split out debug files

Message ID 20240228062139.473528-1-philip.lorenz@bmw.de

Message

Philip Lorenz Feb. 28, 2024, 6:21 a.m. UTC
With the introduction of debuginfod ([1]), providing debug symbols to
developers has been greatly simplified. Initial support for spawning a
debuginfod server is already available as part of poky.

However, this relies on debuginfod scraping the debug packages for their
build IDs. This is not only inefficient (as all packages need to be
extracted again), but it also does not scale well when covering a large
number of builds.

To mitigate this, we are currently working on an approach to extract the
metadata needed to provide debug symbols as part of the bitbake build.
This metadata includes the mapping of the GNU build ID to the package
holding the debug symbol. The metadata will be treated as another build
artifact and can be consumed by a daemon implementing the debuginfod
HTTP API to serve debug symbol file requests from the package feed
produced by the bitbake build.

Initially, we considered implementing the generation of debug metadata
directly as part of emit_pkgdata() in package.bbclass (disabled by
default). However, we discarded this idea as introducing a configuration
option would increase maintenance effort for a feature that would
potentially only be enabled in very few builds.  Instead, we opted to
extend package.bbclass to expose the minimal information needed to
reliably identify debug symbol files, which can then be consumed by a
packaging hook.

Is this extension something that is viable to be merged? We are
considering open-sourcing the other parts needed to implement the setup
described above, but as those parts are still in the prototyping phase,
it will require some more time.

[1] https://sourceware.org/elfutils/Debuginfod.html

Philip Lorenz (1):
  package.bbclass: Expose list of split out debug files

 meta/classes-global/package.bbclass |  4 ++++
 meta/lib/oe/package.py              | 19 ++++++++++---------
 2 files changed, 14 insertions(+), 9 deletions(-)

Comments

Alexander Kanavin Feb. 28, 2024, 7:41 a.m. UTC | #1
On Wed, 28 Feb 2024 at 07:22, Philip Lorenz <philip.lorenz@bmw.de> wrote:
> However, this relies on debuginfod scraping the debug packages for their
> build IDs. This is not only inefficient (as all packages need to be
> extracted again), but it also does not scale well when covering a large
> number of builds.

Is it possible to see numbers behind this claim? When there is a
proposal to increase code complexity, that needs to be justified in a
way that can be locally observed.

> Is this extension something that is viable to be merged? We are
> considering open-sourcing the other parts needed to implement the setup
> described above, but as those parts are still in the prototyping phase,
> it will require some more time.

The patch looks okay, but it's not useful without those other parts,
so you need to get them ready and submit the whole set.

Alex
Richard Purdie Feb. 28, 2024, 9:14 a.m. UTC | #2
On Wed, 2024-02-28 at 07:21 +0100, Philip Lorenz wrote:
> With the introduction of debuginfod ([1]), providing debug symbols to
> developers has been greatly simplified. Initial support for spawning a
> debuginfod server is already available as part of poky.
> 
> However, this relies on debuginfod scraping the debug packages for their
> build IDs. This is not only inefficient (as all packages need to be
> extracted again), but it also does not scale well when covering a large
> number of builds.
> 
> To mitigate this, we are currently working on an approach to extract the
> metadata needed to provide debug symbols as part of the bitbake build.
> This metadata includes the mapping of the GNU build ID to the package
> holding the debug symbol. The metadata will be treated as another build
> artifact and can be consumed by a daemon implementing the debuginfod
> HTTP API to serve debug symbol file requests from the package feed
> produced by the bitbake build.
> 
> Initially, we considered implementing the generation of debug metadata
> directly as part of emit_pkgdata() in package.bbclass (disabled by
> default). However, we discarded this idea as introducing a configuration
> option would increase maintenance effort for a feature that would
> potentially only be enabled in very few builds.  Instead, we opted to
> extend package.bbclass to expose the minimal information needed to
> reliably identify debug symbol files, which can then be consumed by a
> packaging hook.
> 
> Is this extension something that is viable to be merged? We are
> considering open-sourcing the other parts needed to implement the setup
> described above, but as those parts are still in the prototyping phase,
> it will require some more time.
> 
> [1] https://sourceware.org/elfutils/Debuginfod.html

I think this is the kind of direction we've wanted to go in. I'm not
sure the patch as it stands is that useful as it just lists files which
you could just as easily obtain with a os.walk on the filesystem but in
principle I'd be fine with writing some extra data during do_package or
do_packagedata which saves the buildid mappings.
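
For reference, the os.walk approach mentioned above could look roughly
like the following. This is only a minimal sketch: the function name and
the idea of treating every file that starts with the ELF magic as a
candidate are assumptions made for illustration, not what package.bbclass
does. It is exactly the kind of heuristic that explicit bookkeeping
during packaging would make unnecessary.

```python
import os

ELF_MAGIC = b"\x7fELF"


def find_elf_files(root):
    """Yield paths below root whose first four bytes are the ELF magic.

    Hypothetical helper: it cannot tell a split-out debug file from a
    regular binary, it only filters out non-ELF files.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                if f.read(4) == ELF_MAGIC:
                    yield path
```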

So yes, in principle the idea sounds good but obviously the final
decision would depend upon the patches.

I'm assuming this data wouldn't be that large or that expensive to
compute so I'd prefer not to hide it behind extra configuration options
if we can help it. That does depend on the overheads/costs though.

Cheers,

Richard
Philip Lorenz Feb. 28, 2024, 3:22 p.m. UTC | #3
Hi Alex,

On 28.02.24 08:41, Alexander Kanavin wrote:
> On Wed, 28 Feb 2024 at 07:22, Philip Lorenz <philip.lorenz@bmw.de> wrote:
>> However, this relies on debuginfod scraping the debug packages for their
>> build IDs. This is not only inefficient (as all packages need to be
>> extracted again), but it also does not scale well when covering a large
>> number of builds.
> Is it possible to see numbers behind this claim? When there is a
> proposal to increase code complexity, that needs to be justified in a
> way that can be locally observed.

Let me provide some numbers from both an internal medium-sized build
based on kirkstone and a core-image-minimal build based on master.

Kirkstone:

> find -name "*.ipk" | wc
>    8415    8415  615076
> du -h -c
> 3.5G    total

> time /bin/sh -c 'for f in */*.ipk; do ar p $f data.tar.xz | tar -tJ > /dev/null; done'
>
> real    5m13.629s
> user    4m56.653s
> sys     1m41.578s

master (core-image-minimal):

> find -name "*.ipk" | wc
>    4553    4553  287890
> du -h -c
> 2.1G    total

> time /bin/sh -c 'for f in */*.ipk; do ar p $f data.tar.zst | tar --zstd -t > /dev/null; done'
>
> real    1m2.521s
> user    0m40.876s
> sys     1m8.232s
Exact figures of course vary, and this can be optimized further by
introducing parallelism. However, the artifacts are available
uncompressed during packaging, and the packaging step is also the one
that splits out the debug symbols. Limiting build ID extraction to the
files that are known to contain debug symbols is therefore an
efficiency win as well, and it avoids having to implement any kind of
heuristic to determine which files actually contain the debug symbols.
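
To make "build ID extraction" concrete, here is a minimal sketch of
reading the GNU build ID from an uncompressed ELF file by walking its
SHT_NOTE sections. It is not the tooling discussed in this thread; the
function names are made up for illustration. It assumes a well-formed
file, 4-byte note padding, and fewer than 0xff00 sections, and it will
raise struct.error on truncated input.

```python
import struct

SHT_NOTE = 7          # section type holding ELF notes
NT_GNU_BUILD_ID = 3   # note type of the GNU build ID


def parse_notes(data, endian):
    """Yield (name, type, desc) for each note in a note section."""
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from(endian + "III", data, offset)
        offset += 12
        name = data[offset:offset + namesz]
        offset += (namesz + 3) & ~3   # name is padded to 4 bytes
        desc = data[offset:offset + descsz]
        offset += (descsz + 3) & ~3   # so is the descriptor
        yield name, ntype, desc


def read_build_id(path):
    """Return the GNU build ID as a hex string, or None if not found."""
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            return None
        is64 = ident[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        endian = "<" if ident[5] == 1 else ">"  # EI_DATA: 1 = little endian
        # Read e_shoff, then e_shentsize and e_shnum from the ELF header.
        if is64:
            f.seek(0x28)
            (shoff,) = struct.unpack(endian + "Q", f.read(8))
            f.seek(0x3A)
        else:
            f.seek(0x20)
            (shoff,) = struct.unpack(endian + "I", f.read(4))
            f.seek(0x2E)
        shentsize, shnum = struct.unpack(endian + "HH", f.read(4))
        for index in range(shnum):
            f.seek(shoff + index * shentsize)
            hdr = f.read(shentsize)
            (sh_type,) = struct.unpack_from(endian + "I", hdr, 4)
            if sh_type != SHT_NOTE:
                continue
            # sh_offset and sh_size sit at different offsets per ELF class.
            if is64:
                sh_offset, sh_size = struct.unpack_from(endian + "QQ", hdr, 24)
            else:
                sh_offset, sh_size = struct.unpack_from(endian + "II", hdr, 16)
            f.seek(sh_offset)
            for name, ntype, desc in parse_notes(f.read(sh_size), endian):
                if name == b"GNU\0" and ntype == NT_GNU_BUILD_ID:
                    return desc.hex()
    return None
```

Because only the ELF header, the section header table and the note
sections are read, the cost per file is small compared to decompressing
a whole package archive.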

>
>> Is this extension something that is viable to be merged? We are
>> considering open-sourcing the other parts needed to implement the setup
>> described above, but as those parts are still in the prototyping phase,
>> it will require some more time.
> The patch looks okay, but it's not useful without those other parts,
> so you need to get them ready and submit the whole set.

I'll answer this as part of my reply to Richard. I'd be more than happy 
to share the tooling we use to produce the build ID metadata and this 
was more an issue of where to actually place it. The only thing that is 
not yet in a state ready for public consumption is our daemon that 
consumes this metadata and then transparently fulfills any incoming 
debuginfo requests by retrieving the debug file from the corresponding 
package.

Br,

Philip
Philip Lorenz Feb. 28, 2024, 3:41 p.m. UTC | #4
Hi Richard,

On 28.02.24 10:14, Richard Purdie wrote:
> On Wed, 2024-02-28 at 07:21 +0100, Philip Lorenz wrote:
>> With the introduction of debuginfod ([1]), providing debug symbols to
>> developers has been greatly simplified. Initial support for spawning a
>> debuginfod server is already available as part of poky.
>>
>> However, this relies on debuginfod scraping the debug packages for their
>> build IDs. This is not only inefficient (as all packages need to be
>> extracted again), but it also does not scale well when covering a large
>> number of builds.
>>
>> To mitigate this, we are currently working on an approach to extract the
>> metadata needed to provide debug symbols as part of the bitbake build.
>> This metadata includes the mapping of the GNU build ID to the package
>> holding the debug symbol. The metadata will be treated as another build
>> artifact and can be consumed by a daemon implementing the debuginfod
>> HTTP API to serve debug symbol file requests from the package feed
>> produced by the bitbake build.
>>
>> Initially, we considered implementing the generation of debug metadata
>> directly as part of emit_pkgdata() in package.bbclass (disabled by
>> default). However, we discarded this idea as introducing a configuration
>> option would increase maintenance effort for a feature that would
>> potentially only be enabled in very few builds.  Instead, we opted to
>> extend package.bbclass to expose the minimal information needed to
>> reliably identify debug symbol files, which can then be consumed by a
>> packaging hook.
>>
>> Is this extension something that is viable to be merged? We are
>> considering open-sourcing the other parts needed to implement the setup
>> described above, but as those parts are still in the prototyping phase,
>> it will require some more time.
>>
>> [1] https://sourceware.org/elfutils/Debuginfod.html
> I think this is the kind of direction we've wanted to go in. I'm not
> sure the patch as it stands is that useful as it just lists files which
> you could just as easily obtain with a os.walk on the filesystem but in
> principle I'd be fine with writing some extra data during do_package or
> do_packagedata which saves the buildid mappings.
In one of my first iterations I placed the build ID to file mapping into 
the "extended" section of "pkgdata". We'd then consume this data after 
the build has finished to produce the debug info metadata database which 
contains the mapping from build ID to debug symbol file and the package 
containing the file. If this sounds sane to you I can clean up that 
version and share it here.
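
As an illustration of that post-build step, the sketch below inverts a
per-package mapping of debug file to build ID into an index keyed by
build ID. The data layout, key names and function names are assumptions
made for this example; they are not the actual pkgdata format.

```python
import json


def build_index(per_package):
    """Invert {package: {debug_file: build_id}} into
    {build_id: [{"package": ..., "file": ...}, ...]}.

    A list is kept per build ID because identical binaries may be
    shipped by more than one package.
    """
    index = {}
    for package, files in sorted(per_package.items()):
        for path, build_id in sorted(files.items()):
            index.setdefault(build_id, []).append(
                {"package": package, "file": path}
            )
    return index


def dump_index(index):
    """Serialize the index as stable, human-readable JSON."""
    return json.dumps(index, indent=2, sort_keys=True)
```
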
> So yes, in principle the idea sounds good but obviously the final
> decision would depend upon the patches.
>
> I'm assuming this data wouldn't be that large or that expensive to
> compute so I'd prefer not to hide it behind extra configuration options
> if we can help it. That does depend on the overheads/costs though.
>
I just executed build ID extraction on the debug packages of our medium 
sized kirkstone based distro (see my reply to Alex for more details). 
Sequentially extracting build IDs from around 8000 files took around 
1:30 minutes on my machine. While I wouldn't call this excessive, I am 
also not sure whether this is too much overhead given that I only expect 
this data to be used in some deployments.

Br,

Philip
Alexander Kanavin Feb. 28, 2024, 5:40 p.m. UTC | #5
On Wed, 28 Feb 2024 at 16:41, Philip Lorenz <philip.lorenz@bmw.de> wrote:
> > I'm assuming this data wouldn't be that large or that expensive to
> > compute so I'd prefer not to hide it behind extra configuration options
> > if we can help it. That does depend on the overheads/costs though.
> >
> I just executed build ID extraction on the debug packages of our medium
> sized kirkstone based distro (see my reply to Alex for more details).
> Sequentially extracting build IDs from around 8000 files took around
> 1:30 minutes on my machine. While I wouldn't call this excessive, I am
> also not sure whether this is too much overhead given that I only expect
> this data to be used in some deployments.

I have to object to the numbers because they were done with a
sequential shell loop. Debuginfod does it in threads and is able to
complete the scans much faster. So you need to check how quickly it
completes its job when started with oe-debuginfod rather. There might
be an improvement coming from what you are proposing, but it's most
likely not going to be as drastic.

From debuginfod manpage:

-c NUM --concurrency=NUM
Set the concurrency limit for the scanning queue threads, which work
together to process archives & files located by the traversal thread.
This important for controlling CPU-intensive operations like parsing
an ELF file and especially decompressing archives. The default is the
number of processors on the system; the minimum is 1.
https://manpages.debian.org/testing/debuginfod/debuginfod.8.en.html

There's also something else I noticed just now: there seems to be an
alternative implementation of debuginfod you want to introduce? Why?
If the original from elfutils isn't working well enough, shouldn't we
make it better?

One possibility is teaching it to mass-import pre-computed entries
into its index, so that sweeping file tree scans with archive
extractions can be avoided altogether. Or doing incremental index
imports directly from do_package.

Alex
Philip Lorenz Feb. 29, 2024, 8:20 a.m. UTC | #6
Hi Alex,

On 28.02.24 18:40, Alexander Kanavin wrote:
> On Wed, 28 Feb 2024 at 16:41, Philip Lorenz <philip.lorenz@bmw.de> wrote:
>>> I'm assuming this data wouldn't be that large or that expensive to
>>> compute so I'd prefer not to hide it behind extra configuration options
>>> if we can help it. That does depend on the overheads/costs though.
>>>
>> I just executed build ID extraction on the debug packages of our medium
>> sized kirkstone based distro (see my reply to Alex for more details).
>> Sequentially extracting build IDs from around 8000 files took around
>> 1:30 minutes on my machine. While I wouldn't call this excessive, I am
>> also not sure whether this is too much overhead given that I only expect
>> this data to be used in some deployments.
> I have to object to the numbers because they were done with a
> sequential shell loop. Debuginfod does it in threads and is able to
> complete the scans much faster. So you need to check how quickly it
> completes its job when started with oe-debuginfod rather. There might
> be an improvement coming from what you are proposing, but it's most
> likely not going to be as drastic.

I think there's a misunderstanding that I'd like to sort out first:
This is in no way about deprecating or not using debuginfod. It is
rather an optimization of how build IDs are extracted, and the result
can be used by a variety of tools (such as debuginfod). As such, a
sequential scan should give a rough idea of how much time it takes to
extract the build IDs during do_package (wall clock time is bound to
differ). Based on this we can see that it's not free but also not
extremely expensive, although I'd like to leave the judgement call on
whether this is something that should be enabled on all builds to
someone else.

> There's also something else I noticed just now: there seems to be an
> alternative implementation of debuginfod you want to introduce? Why?
> If the original from elfutils isn't working well enough, shouldn't we
> make it better?

Let me try to give you some insight into how we are planning to use
it; I hope this clarifies things:

In our case we are dealing with hundreds of bitbake builds whose
artifacts (including package feeds) are published to storage that is
accessible via HTTP. We would now like to offer a service that gives
developers seamless access to the debug files (i.e. we want to
eliminate the process of manually having to download the debug packages
matching a particular build). To accomplish this, our setup is based
around a lightweight "gateway" daemon that translates a debuginfo HTTP
request into a fetch of the corresponding package from the matching
repository, extracts the debug symbol file and then serves it to
the requesting client.
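
The request translation step can be sketched as follows. This is not
the daemon described above, only a hypothetical illustration: it
assumes the debuginfod request path /buildid/BUILDID/debuginfo and an
in-memory index (a made-up structure) that maps a build ID to the
package URL and the file inside that package. Fetching and unpacking
the package is left out.

```python
import re

# Request path used by debuginfod clients to ask for a debug file.
_DEBUGINFO_RE = re.compile(r"/buildid/([0-9a-f]+)/debuginfo")


def resolve_debuginfo_request(url_path, index):
    """Map a request path to a (package_url, member) tuple, or None.

    index is a hypothetical dict of the form
    {build_id: (package_url, path_inside_package)}.
    """
    match = _DEBUGINFO_RE.fullmatch(url_path)
    if match is None:
        return None  # not a debuginfo request, or malformed build ID
    return index.get(match.group(1))
```

A gateway would answer with HTTP 404 when this returns None, and
otherwise download the package, extract the named member and stream it
back to the client.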

This is quite different from the way debuginfod works (which seems to be
built around the idea of having the debug symbol files readily available
via the file system), and I also see advantages in that approach when
one has a fairly static set of debug symbol files to serve. There are
also some other non-functional requirements that would make deployment
of debuginfod in our case quite difficult.

This is in no way meant to be a fully fledged debuginfod
reimplementation but a simple gateway between the debuginfod protocol
and a backing package repository. I am not sure whether such an
extension is in scope of the elfutils package.
> One possibility is teaching it to mass-import pre-computed entries
> into its index, so that sweeping file tree scans with archive
> extractions can be avoided altogether. Or doing incremental index
> imports directly from do_package.
Producing this data is exactly what this RFC is about. Using the 
extracted build ID information to optimize the import into debuginfod is 
one of the possible use cases but I'd also suggest to keep the extracted 
data agnostic of any concrete tooling (e.g. pkgdata).

Br,

Philip
Alexander Kanavin Feb. 29, 2024, 8:54 a.m. UTC | #7
On Thu, 29 Feb 2024 at 09:20, Philip Lorenz <philip.lorenz@bmw.de> wrote:
> > One possibility is teaching it to mass-import pre-computed entries
> > into its index, so that sweeping file tree scans with archive
> > extractions can be avoided altogether. Or doing incremental index
> > imports directly from do_package.
> Producing this data is exactly what this RFC is about. Using the
> extracted build ID information to optimize the import into debuginfod is
> one of the possible use cases but I'd also suggest to keep the extracted
> data agnostic of any concrete tooling (e.g. pkgdata).

This is fair enough. But you need to think upfront about how producing
this data should be tested with just oe-core/poky and what use cases
it could have. A simple sanity check is OK, but improving debuginfod to
import the pre-computed values would be much better. Maybe something
else too?

This also allows you to develop and publish the alternative service on
its own schedule and terms, if the code is not mature or BMW legal is
having a hard time signing off on making it public etc. We don't need
to see it, if there's a use case in core.

Alex