[1/1] buildstats.bbclass: add functionality to collect build system stats

Submitted by Sakib Sajal on Oct. 23, 2020, 4:56 p.m. | Patch ID: 177420

Details

Message ID 20201023165648.5515-2-sakib.sajal@windriver.com
State New

Commit Message

Sakib Sajal Oct. 23, 2020, 4:56 p.m.
There are a number of timeout and hang defects where
it would be useful to collect statistics about what
is running on a build host when that condition occurs.

This adds functionality to collect build system stats
on a regular interval and/or on task failure. Both
features are disabled by default.

To enable logging on a regular interval, set:
BB_HEARTBEAT_EVENT = "<interval>"
Logs are stored in ${BUILDSTATS_BASE}/<build_name>/host_stats

To enable logging on a task failure, set:
BB_LOG_HOST_STAT_ON_FAILURE = "1"
Logs are stored in ${BUILDSTATS_BASE}/<build_name>/build_stats

The list of commands, along with the desired options, needs
to be specified in the BB_LOG_HOST_STAT_CMDS variable,
delimited by ';', as follows:
BB_LOG_HOST_STAT_CMDS = "/<absolute>/<path>/<executable> <options> ; ... ;"

Signed-off-by: Sakib Sajal <sakib.sajal@windriver.com>

---
 meta/classes/buildstats.bbclass | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

-- 
2.27.0
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#143718): https://lists.openembedded.org/g/openembedded-core/message/143718
Mute This Topic: https://lists.openembedded.org/mt/77756272/3616849
Group Owner: openembedded-core+owner@lists.openembedded.org
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [michael@yoctoproject.org]
-=-=-=-=-=-=-=-=-=-=-=-

Patch

diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
index 6f87187233..c68d7bb8a2 100644
--- a/meta/classes/buildstats.bbclass
+++ b/meta/classes/buildstats.bbclass
@@ -104,14 +104,38 @@  def write_task_data(status, logfile, e, d):
             f.write("Status: FAILED \n")
         f.write("Ended: %0.2f \n" % e.time)
 
+def write_host_data(logfile, e, d):
+    import subprocess, os, datetime
+    cmds = d.getVar('BB_LOG_HOST_STAT_CMDS').split(";")
+    with open(logfile, "a") as f:
+        f.write("Event Time: %f\nDate: %s\n" % (e.time, datetime.datetime.now()))
+        for cmd in cmds:
+            if len(cmd) == 0:
+                continue
+            c = cmd.split()
+            if os.path.isfile(c[0]) and os.access(c[0], os.X_OK):
+                try:
+                    output = subprocess.check_output(c, stderr=subprocess.STDOUT).decode('utf-8')
+                except subprocess.CalledProcessError as err:
+                    output = "Error running command: %s\n%s" % (cmd, err)
+                f.write("%s\n%s\n" % (cmd, output))
+            else:
+                f.write("Error running command: '%s': %s is not an executable.\n" % (cmd, c[0]))
+
 python run_buildstats () {
     import bb.build
     import bb.event
     import time, subprocess, platform
 
     bn = d.getVar('BUILDNAME')
-    bsdir = os.path.join(d.getVar('BUILDSTATS_BASE'), bn)
-    taskdir = os.path.join(bsdir, d.getVar('PF'))
+    # bitbake fires HeartbeatEvent even before a build has been
+    # triggered, causing BUILDNAME to be None
+    if bn is not None:
+        bsdir = os.path.join(d.getVar('BUILDSTATS_BASE'), bn)
+        taskdir = os.path.join(bsdir, d.getVar('PF'))
+        if isinstance(e, bb.event.HeartbeatEvent):
+            bb.utils.mkdirhier(bsdir)
+            write_host_data(os.path.join(bsdir, "host_stats"), e, d)
 
     if isinstance(e, bb.event.BuildStarted):
         ########################################################################
@@ -186,10 +210,12 @@  python run_buildstats () {
         build_status = os.path.join(bsdir, "build_stats")
         with open(build_status, "a") as f:
             f.write(d.expand("Failed at: ${PF} at task: %s \n" % e.task))
+            if bb.utils.to_boolean(d.getVar("BB_LOG_HOST_STAT_ON_FAILURE")):
+                write_host_data(build_status, e, d)
 }
 
 addhandler run_buildstats
-run_buildstats[eventmask] = "bb.event.BuildStarted bb.event.BuildCompleted bb.build.TaskStarted bb.build.TaskSucceeded bb.build.TaskFailed"
+run_buildstats[eventmask] = "bb.event.BuildStarted bb.event.BuildCompleted bb.event.HeartbeatEvent bb.build.TaskStarted bb.build.TaskSucceeded bb.build.TaskFailed"
 
 python runqueue_stats () {
     import buildstats
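
A note on the command parsing in write_host_data above: d.getVar() returns None for an unset variable, so the .split(";") call would fail if BB_LOG_HOST_STAT_CMDS is not set. A whitespace-only segment (for example after a trailing "; ") also passes the len(cmd) == 0 check, after which cmd.split() yields an empty list and c[0] raises IndexError. A minimal sketch of a more defensive parser follows. The helper name parse_host_stat_cmds is hypothetical and not part of the patch, and it uses shlex.split in place of the patch's str.split so that quoted options stay together; that swap is an assumption, not the submitted behaviour.

```python
import shlex


def parse_host_stat_cmds(raw):
    """Split a ';'-delimited command string into a list of argv lists.

    `raw` stands in for the value of BB_LOG_HOST_STAT_CMDS and may be
    None when the variable is unset.
    """
    if not raw:
        return []
    argv_lists = []
    for segment in raw.split(";"):
        # shlex.split returns [] for empty or whitespace-only segments,
        # so those are skipped instead of being indexed.
        argv = shlex.split(segment)
        if argv:
            argv_lists.append(argv)
    return argv_lists
```

Inside the class this could be fed with d.getVar('BB_LOG_HOST_STAT_CMDS'), and each resulting list passed to subprocess.check_output as in the patch.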

Comments

Richard Purdie Oct. 28, 2020, 2:27 p.m.
On Fri, 2020-10-23 at 12:56 -0400, Sakib Sajal wrote:
> There are a number of timeout and hang defects where
> it would be useful to collect statistics about what
> is running on a build host when that condition occurs.
> 
> This adds functionality to collect build system stats
> on a regular interval and/or on task failure. Both
> features are disabled by default.
> 
> To enable logging on a regular interval, set:
> BB_HEARTBEAT_EVENT = "<interval>"
> Logs are stored in ${BUILDSTATS_BASE}/<build_name>/host_stats
> 
> To enable logging on a task failure, set:
> BB_LOG_HOST_STAT_ON_FAILURE = "1"
> Logs are stored in ${BUILDSTATS_BASE}/<build_name>/build_stats
> 
> The list of commands, along with the desired options, needs
> to be specified in the BB_LOG_HOST_STAT_CMDS variable,
> delimited by ';', as follows:
> BB_LOG_HOST_STAT_CMDS = "/<absolute>/<path>/<executable> <options> ; ... ;"
> 
> Signed-off-by: Sakib Sajal <sakib.sajal@windriver.com>
> ---
>  meta/classes/buildstats.bbclass | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
> index 6f87187233..c68d7bb8a2 100644
> --- a/meta/classes/buildstats.bbclass
> +++ b/meta/classes/buildstats.bbclass
> @@ -104,14 +104,38 @@ def write_task_data(status, logfile, e, d):
>              f.write("Status: FAILED \n")
>          f.write("Ended: %0.2f \n" % e.time)
>  
> +def write_host_data(logfile, e, d):
> +    import subprocess, os, datetime
> +    cmds = d.getVar('BB_LOG_HOST_STAT_CMDS').split(";")
> +    with open(logfile, "a") as f:
> +        f.write("Event Time: %f\nDate: %s\n" % (e.time, datetime.datetime.now()))
> +        for cmd in cmds:
> +            if len(cmd) == 0:
> +                continue
> +            c = cmd.split()
> +            if os.path.isfile(c[0]) and os.access(c[0], os.X_OK):
> +                try:
> +                    output = subprocess.check_output(c, stderr=subprocess.STDOUT).decode('utf-8')
> +                except subprocess.CalledProcessError as err:
> +                    output = "Error running command: %s\n%s" % (cmd, err)
> +                f.write("%s\n%s\n" % (cmd, output))
> +            else:
> +                f.write("Error running command: '%s': %s is not an executable.\n" % (cmd, c[0]))
> +


I am a little worried about this for some of the reasons Chris
mentions. I worry that not all distros will have a standard location
for some of the tools we want to run.

One trick you could try is to use something like: 

path = d.getVar("PATH") + ":" + d.getVar("BB_ORIGENV", False).getVar("PATH")

which means we'd add back in the original search PATH for the tools as
well as our own directories.

Cheers,

Richard
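
The PATH suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: resolve_tool is a hypothetical helper, not existing class code, and it takes the two search paths as plain strings so it can run outside BitBake. In the class they would presumably come from d.getVar("PATH") and d.getVar("BB_ORIGENV", False).getVar("PATH"), as in the line quoted in the review.

```python
import os
import shutil


def resolve_tool(name, build_path, orig_path):
    """Return the full path of `name`, or None if it cannot be found.

    build_path is searched first, then orig_path (the user's original
    PATH), so tools need not live in one distro-specific location.
    """
    # Join only the non-empty parts to avoid a stray leading or
    # trailing separator in the combined search path.
    search = os.pathsep.join(p for p in (build_path, orig_path) if p)
    # shutil.which checks that the candidate exists and is executable.
    return shutil.which(name, path=search)
```

With a lookup like this, BB_LOG_HOST_STAT_CMDS entries could name a tool such as "top" without an absolute path, and a None result could be logged in the same way the patch logs a non-executable path.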
View/Reply Online (#143854): https://lists.openembedded.org/g/openembedded-core/message/143854