Patchwork [1/2,RFC] rpm: Add workaround for debugedit-segv

login
register
mail settings
Submitter Mark Hatle
Date March 25, 2013, 5:19 p.m.
Message ID <1364231993-31670-2-git-send-email-mark.hatle@windriver.com>
Download mbox | patch
Permalink /patch/46871/
State Accepted
Commit 5a9d83de1f764dc7ceebe4f0c754d77aa506ffe3
Headers show

Comments

Phil Blundell - March 25, 2013, 5:02 p.m.
On Mon, 2013-03-25 at 12:19 -0500, Mark Hatle wrote:
> +Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address
> +of 0, but a size != 0.

That doesn't seem like totally unreasonable behaviour for a NOBITS
section.  What were you expecting libelf to do in that case?

++      if (data != NULL && size != 0)
++        hashFunctionContextUpdateMC (&ctx, &chunk);

I suppose one could argue that allocating a chunk of zero-filled memory
of the right size and then hashing that would be a slightly better fix.
Whether it matters in practice or not would depend on what exactly is
going into this hash and what it's being used for.

p.
Mark Hatle - March 25, 2013, 5:10 p.m.
On 3/25/13 12:02 PM, Phil Blundell wrote:
> On Mon, 2013-03-25 at 12:19 -0500, Mark Hatle wrote:
>> +Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address
>> +of 0, but a size != 0.
>
> That doesn't seem like totally unreasonable behaviour for a NOBITS
> section.  What were you expecting libelf to do in that case?
>
> ++      if (data != NULL && size != 0)
> ++        hashFunctionContextUpdateMC (&ctx, &chunk);
>
> I suppose one could argue that allocating a chunk of zero-filled memory
> of the right size and then hashing that would be a slightly better fix.
> Whether it matters in practice or not would depend on what exactly is
> going into this hash and what it's being used for.

It appears in the past it either didn't load the section at all, or the size was 
set to 0.

It's a combination of the data pointer set to NULL and the size != 0 that is 
causing the segfault.  This doesn't appear to happen outside of PPC and MIPS.

I'm going to look into identifying if the section is a NOBITS and skipping the 
whole operation if it is.

--Mark
Mark Hatle - March 25, 2013, 5:19 p.m.
[ YOCTO #4089 ]

On PPC and MIPS, there appears to be a condition that causes
debugedit to segfault.  The segfault is related to a call into
the md5hash algorithm, an address of '0', and a size > 0 is passed
causing the access of the address to segv.

This workaround may prove to be the final fix, but it's currently
unclear what the actual cause of the 0 address is.

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 meta/recipes-devtools/rpm/rpm/debugedit-segv.patch | 35 ++++++++++++++++++++++
 meta/recipes-devtools/rpm/rpm_5.4.9.bb             |  3 +-
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 meta/recipes-devtools/rpm/rpm/debugedit-segv.patch
Phil Blundell - March 25, 2013, 5:45 p.m.
On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote:
> I'm going to look into identifying if the section is a NOBITS and skipping the 
> whole operation if it is.

That would mean that a change in the size of the .bss wouldn't have any
impact on the hash.  Maybe that's fine for your application though, I
dunno.

p.
Mark Hatle - March 25, 2013, 7:32 p.m.
On 3/25/13 12:45 PM, Phil Blundell wrote:
> On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote:
>> I'm going to look into identifying if the section is a NOBITS and skipping the
>> whole operation if it is.
>
> That would mean that a change in the size of the .bss wouldn't have any
> impact on the hash.  Maybe that's fine for your application though, I
> dunno.

I'm not completely familiar with how the buildid is calculated, other then an 
md5 hash over the sections themselves.  I can't think of a case where the 
contents of the sections wouldn't change along with the .bss and .plt size.

(buildid is supposed to be used by gdb to find/verify that the debuginfo matches 
the binary actually being debugged...)

Through my google search, I never found a full spec of what the buildid was 
supposed to contain, only a high-level description from an old Fedora 8 work 
item.  (Note, as long the buildid is consistent it really doesn't appear to 
matter how it was generated, just that both the app and debuginfo contain it.)

--Mark


> p.
>
>
Mark Hatle - March 25, 2013, 9:47 p.m.
On 3/25/13 2:32 PM, Mark Hatle wrote:
> On 3/25/13 12:45 PM, Phil Blundell wrote:
>> On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote:
>>> I'm going to look into identifying if the section is a NOBITS and skipping the
>>> whole operation if it is.
>>
>> That would mean that a change in the size of the .bss wouldn't have any
>> impact on the hash.  Maybe that's fine for your application though, I
>> dunno.
>
> I'm not completely familiar with how the buildid is calculated, other then an
> md5 hash over the sections themselves.  I can't think of a case where the
> contents of the sections wouldn't change along with the .bss and .plt size.
>
> (buildid is supposed to be used by gdb to find/verify that the debuginfo matches
> the binary actually being debugged...)
>
> Through my google search, I never found a full spec of what the buildid was
> supposed to contain, only a high-level description from an old Fedora 8 work
> item.  (Note, as long the buildid is consistent it really doesn't appear to
> matter how it was generated, just that both the app and debuginfo contain it.)

I've looked at the code some more.  It does checksum the header itself and then 
if it's got contents, it also adds the contents to the checksum.. That is where 
the failure appears to be happening:

           if (u.shdr.sh_type != SHT_NOBITS)
             {
               Elf_Data *d = elf_rawdata (dso->scn[i], NULL);
               if (d == NULL)
                 goto bad;
               process (d->d_buf, d->d_size);
             }

So it's specifically checking for SHT_NOBITS, but it's matching so it falls 
through and d->d_buf == 0, causing the failure.  I'll keep investigating, but 
somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!].

--Mark

> --Mark
>
>
>> p.
>>
>>
>
>
> _______________________________________________
> Openembedded-core mailing list
> Openembedded-core@lists.openembedded.org
> http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-core
>
Phil Blundell - March 26, 2013, 10:38 a.m.
On Mon, 2013-03-25 at 16:47 -0500, Mark Hatle wrote:
> I've looked at the code some more.  It does checksum the header itself and then 
> if it's got contents, it also adds the contents to the checksum.. That is where 
> the failure appears to be happening:
> 
>            if (u.shdr.sh_type != SHT_NOBITS)
>              {
>                Elf_Data *d = elf_rawdata (dso->scn[i], NULL);
>                if (d == NULL)
>                  goto bad;
>                process (d->d_buf, d->d_size);
>              }
> 
> So it's specifically checking for SHT_NOBITS, but it's matching so it falls 
> through and d->d_buf == 0, causing the failure.  I'll keep investigating, but 
> somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!].

Ah.  If a NOBITS section is getting reported as something else then it
seems that this must clearly be a bug in libelf and ought to be fixed
there rather than working around it in rpm.  What do you actually get as
u.shdr.sh_type?

p.
Mark Hatle - March 26, 2013, 12:20 p.m.
On 3/26/13 5:38 AM, Phil Blundell wrote:
> On Mon, 2013-03-25 at 16:47 -0500, Mark Hatle wrote:
>> I've looked at the code some more.  It does checksum the header itself and then
>> if it's got contents, it also adds the contents to the checksum.. That is where
>> the failure appears to be happening:
>>
>>             if (u.shdr.sh_type != SHT_NOBITS)
>>               {
>>                 Elf_Data *d = elf_rawdata (dso->scn[i], NULL);
>>                 if (d == NULL)
>>                   goto bad;
>>                 process (d->d_buf, d->d_size);
>>               }
>>
>> So it's specifically checking for SHT_NOBITS, but it's matching so it falls
>> through and d->d_buf == 0, causing the failure.  I'll keep investigating, but
>> somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!].
>
> Ah.  If a NOBITS section is getting reported as something else then it
> seems that this must clearly be a bug in libelf and ought to be fixed
> there rather than working around it in rpm.  What do you actually get as
> u.shdr.sh_type?

I posted an updated patch last night.  debugedit was re-translating (byte 
swapping) the header elements during the buildid calculations.  This was causing 
the value of sh_type to be in the ELF binaries native endian during the check. 
So big endian binaries failed to match the SHT_NOBITS, thus the problem being 
discovered on PPC and MIPS.

(Fix was to not translate the original values, but instead only translate a 
copy.  I verified the produced buildid was expected, as I now have a better 
understanding of how the buildid is generated.)

--Mark

> p.
>

Patch

diff --git a/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch b/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch
new file mode 100644
index 0000000..bd91693
--- /dev/null
+++ b/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch
@@ -0,0 +1,35 @@ 
+There are cases, especially on PPC and MIPS, where the data address
+returned is 0, but the size is not 0.
+
+It appears to happen when the sections headers are similar to:
+
+  [21] .data             PROGBITS        000239c0 0139c0 000010 00  WA  0   0  8
+  [22] .got              PROGBITS        000239d0 0139d0 000014 04 WAX  0   0  4
+  [23] .plt              NOBITS          000239e4 0139e4 000234 00 WAX  0   0  4
+  [24] .bss              NOBITS          00023c18 0139e4 0001c8 00  WA  0   0  8
+  [25] .comment          PROGBITS        00000000 0139e4 000011 01  MS  0   0  1
+  [26] .debug_aranges    PROGBITS        00000000 0139f8 000d68 00      0   0  8
+
+Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address
+of 0, but a size != 0.
+
+This could be a bug in libelf...
+
+Upstream-status: Pending
+
+Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
+
+Index: rpm-5.4.9/tools/debugedit.c
+===================================================================
+--- rpm-5.4.9.orig/tools/debugedit.c
++++ rpm-5.4.9/tools/debugedit.c
+@@ -1434,7 +1434,8 @@ handle_build_id (DSO *dso, Elf_Data *bui
+     auto inline void process (const void *data, size_t size)
+     {
+       memchunk chunk = { .data = (void *) data, .size = size };
+-      hashFunctionContextUpdateMC (&ctx, &chunk);
++      if (data != NULL && size != 0)
++        hashFunctionContextUpdateMC (&ctx, &chunk);
+     }
+     union
+     {
diff --git a/meta/recipes-devtools/rpm/rpm_5.4.9.bb b/meta/recipes-devtools/rpm/rpm_5.4.9.bb
index ba24111..e9c8f23 100644
--- a/meta/recipes-devtools/rpm/rpm_5.4.9.bb
+++ b/meta/recipes-devtools/rpm/rpm_5.4.9.bb
@@ -84,7 +84,8 @@  SRC_URI = "http://www.rpm5.org/files/rpm/rpm-5.4/rpm-5.4.9-0.20120508.src.rpm;ex
 	   file://python-rpm-rpmsense.patch \
 	   file://rpm-reloc-macros.patch \
 	   file://rpm-platform2.patch \
-     file://rpm-remove-sykcparse-decl.patch \
+	   file://rpm-remove-sykcparse-decl.patch \
+	   file://debugedit-segv.patch \
 	  "
 
 # Uncomment the following line to enable platform score debugging