[0/6] keep reducing regression reports noise

Message ID 20230228181052.4191521-1-alexis.lothore@bootlin.com

Alexis Lothoré Feb. 28, 2023, 6:10 p.m. UTC
From: Alexis Lothoré <alexis.lothore@bootlin.com>

Hello,
here is another batch of fixes to reduce noise in regression reports. The fixes
are directly linked to the main noise sources seen in the 4.2_M3 regression
report ([1]):
- fix some existing selftests for resulttool
- add more filters for ptests incorrectly logging failures. The root cause of
  those wrongly named test results still remains to be fixed so that broken
  test results stop being saved, but this series at least fixes parsing of
  existing results
- stop logging "newly passing" tests in regression reports. Some real
  regressions are "hidden" in big chunks of newly passing tests:
foo: FAIL -> PASS
bar: FAIL -> PASS
moo: FAIL -> PASS
[...]
xxx: PASS -> FAIL
[...]
yyy: FAIL -> PASS
zzz: FAIL -> PASS

We are reaching a point where regression reports are small enough to be posted
on a pastebin, so by following [2] you can find a sample report generated with
tooling patched with this series.
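The "newly passing" filtering described above could be sketched roughly like
this; the data layout and the names `transitions` and `is_improvement` are
illustrative assumptions, not resulttool's actual internals:

```python
# Hypothetical sketch of dropping "newly passing" tests from a regression
# report; the data layout and names below are illustrative, not taken from
# resulttool's actual implementation.

transitions = {
    "foo": ("FAIL", "PASS"),
    "bar": ("FAIL", "PASS"),
    "xxx": ("PASS", "FAIL"),
    "yyy": ("FAIL", "PASS"),
}

def is_improvement(base, target):
    # A test going from a non-passing state to PASS is an improvement,
    # not a regression, so it should not clutter the report.
    return target == "PASS" and base != "PASS"

# Keep only the transitions worth reporting.
report = {name: change for name, change in transitions.items()
          if not is_improvement(*change)}

for name, (base, target) in report.items():
    print(f"{name}: {base} -> {target}")
```

With the sample data above, only the real regression (xxx: PASS -> FAIL)
survives the filter.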

[1] https://autobuilder.yocto.io/pub/releases/yocto-4.2_M3.rc1/testresults/testresult-regressions-report.txt
[2] https://pastebin.com/QgfLKhpx

Alexis Lothoré (6):
  scripts/resulttool: call fixup_ptest_names in regression_common
  oeqa/selftest/resulttool: fix ptest filtering tests
  oeqa/selftest/resulttool: fix fake data used for testing
  scripts/resulttool: fix ptests results containing a non reproducible
    path
  oeqa/selftest/resulttool: add test for error propagation in test name
    filtering
  scripts/resulttool: do not count newly passing tests as regressions

 .../oeqa/selftest/cases/resulttooltests.py    | 112 +++++++++++-------
 scripts/lib/resulttool/regression.py          |  66 +++++++----
 2 files changed, 112 insertions(+), 66 deletions(-)

Comments

Tim Orling March 1, 2023, 6:24 a.m. UTC | #1
On Tue, Feb 28, 2023 at 10:10 AM Alexis Lothoré via lists.openembedded.org
<alexis.lothore=bootlin.com@lists.openembedded.org> wrote:

> From: Alexis Lothoré <alexis.lothore@bootlin.com>
>
> Hello,
> here is another batch of fixes to reduce noise in regression reports.
> Fixes are
> directly linked to main noise sources seen in 4.2_M3 regression report
> ([1]).
> - fix some existing selftests for resulttool
> - add more filters for ptests incorrectly logging failures. The root cause
> of
>   those wrongly named test results remained to be fixed to stop saving
> broken
>   test results, but this series fixes at least parsing for existing results
> - Stop logging "newly passing" tests in regression reports. Some real
>   regressions are "hidden" in big chunks of newly passing tests:
> foo: FAIL -> PASS
> bar: FAIL -> PASS
> moo: FAIL -> PASS
> [...]
> xxx: PASS -> FAIL
> [...]
> yyy: FAIL -> PASS
> zzz: FAIL -> PASS
>
> We are reaching a point where regression reports are small enough to get
> posted
> on pastebin, so by following [2] you can find a report sample generated
> with
> tooling patched with this series
>
> [1] https://autobuilder.yocto.io/pub/releases/yocto-4.2_M3.rc1/t
> <https://autobuilder.yocto.io/pub/releases/yocto-4.2_M3.rc1/testresults/testresult-regressions-report.txt>


It seems a bit odd that all the regressions are changing from a valid state
(PASS, SKIP…) -> None. Does this literally mean the only changes were
dropped test cases?

Alexis Lothoré March 1, 2023, 7:58 a.m. UTC | #2
Hello Tim,
On 3/1/23 07:24, Tim Orling wrote:

> On Tue, Feb 28, 2023 at 10:10 AM Alexis Lothoré via lists.openembedded.org
> <alexis.lothore=bootlin.com@lists.openembedded.org> wrote:
> 
>     From: Alexis Lothoré <alexis.lothore@bootlin.com>
> 
>     [...]
> 
>     [1] https://autobuilder.yocto.io/pub/releases/yocto-4.2_M3.rc1/testresults/testresult-regressions-report.txt
> 
> 
> It seems a bit odd that all the regressions are changing from a valid state
> (PASS, SKIP…) -> None. Does this literally mean the only changes were dropped
> test cases?

For most of the "XXX -> None" transitions in the 4.2_M3.rc1 regression report,
tests were not dropped between base and target: they are present in both, but
the test names saved and stored in git are incorrect and carry a "non
reproducible" part, which makes the tooling raise many of those wrong
transitions. Here is an example:

ptestresult.binutils-ld.in testcase
/home/pokybuild/yocto-worker/qemux86/build/build-st-15167/tmp/work/core2-32-poky-linux/binutils-cross-testsuite/2.40-r0/git/ld/testsuite/ld-ctf/ctf.exp:
ERROR -> None

This binutils-ld test result is present in both base and target results, but the
test name is very likely broken: the error has been captured as part of the test
name, and worse than than, it contains multiple parts that change between
executions (possibly "core2-32-poky-linux" and "qemux86", but especially
"build-st-15167") because of path embedded in the error log. So when running
"resulttool regression-git", the tool does not find in target the test it has
found in base, which raises a "XXX -> None".
Obviously the main issue has to be fixed in all runners generating those errors
(so far, we have seen this kind of issues with ptests for binutils, curl, dbus,
toolchains, glibc, etc), but since we want to be able to work with current tests
results history, we must make the tools able to circumvent those issues.
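The kind of name fixup described above could look roughly like this; the regex,
the function name and the "<path>" placeholder are hypothetical illustrations,
not resulttool's actual code:

```python
import re

# Hypothetical sketch: strip the non-reproducible build path embedded in a
# ptest name so base and target results map to the same key. The regex and
# the "<path>" placeholder are illustrative, not resulttool's actual code.

def fixup_ptest_name(name):
    # Replace any /home/pokybuild/.../build-st-<number>/... path with a
    # stable placeholder; "build-st-<number>" changes on every run.
    return re.sub(r'/home/pokybuild/\S*build-st-\d+\S*', '<path>', name)

name = ("ptestresult.binutils-ld.in testcase "
        "/home/pokybuild/yocto-worker/qemux86/build/build-st-15167/tmp/work/"
        "core2-32-poky-linux/binutils-cross-testsuite/2.40-r0/git/ld/"
        "testsuite/ld-ctf/ctf.exp")
print(fixup_ptest_name(name))
```

After such a normalization, the base and target entries for this test would
share the same key, and the spurious "ERROR -> None" transition would
disappear from the report.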

Regards,