Patchwork [bitbake-devel] request for comments - dynamic task selection

Submitter Alexandru DAMIAN
Date Feb. 20, 2014, 7:17 p.m.
Message ID <CAJ2CSBs-RL_8V_nGYhY21k0J2E_ME3xwC3kgKLqt1PCEYwxLNQ@mail.gmail.com>
Permalink /patch/67111/
State New

Comments

Alexandru DAMIAN - Feb. 20, 2014, 7:17 p.m.
Hello,

I have an RFC patch that attempts to improve build performance by
dynamically selecting the next task to run based on the currently running
tasks. The general idea is that if we overload the network with too many
fetch tasks, it's better to start doing unpack tasks while the disks are
idle instead of starting new fetch tasks. The concept applies generally,
e.g. it's better to schedule package tasks in parallel with compile tasks
instead of running all the compile tasks first and then all the package
tasks.

This patch just looks at the currently running tasks and selects the next
available task that doesn't have a similar-type task already running.
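The heuristic can be sketched as follows; this is a minimal standalone
illustration, not the runqueue code itself, and `buildable`, `running`,
`task_name` and `priority` are hypothetical stand-ins for BitBake's
internal structures:

```python
def next_buildable_task(buildable, running, task_name, priority):
    """Pick the highest-priority buildable task whose type (do_fetch,
    do_compile, ...) is not already represented among the running
    tasks; fall back to plain priority order if every type is busy."""
    running_types = {task_name(tid) for tid in running}
    # lowest priority value = most preferred task
    ordered = sorted(buildable, key=priority)
    for tid in ordered:
        if task_name(tid) not in running_types:
            return tid
    # every buildable task's type is already running: keep priority order
    return ordered[0] if ordered else None
```

With one do_fetch running, this would skip further fetch tasks and pick a
buildable do_unpack task even if a fetch task has better static priority.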

I'm seeing a roughly 2% build-time reduction when building the
virtual:native components, with just 26 tasks reordered out of the 244
tasks executed. I am attaching a toaster.sqlite file that captures the
test: build no. 1 was done with the origin/master source, and build no. 2
with this patch applied.

DO NOT MERGE - this patch exposes dependency problems in OE-Core,
specifically in the GCC 4.8 recipes, and breaks the OE-Core builds.

I'm waiting for your feedback. Meanwhile, I'm going to try to solve the
GCC problems this exposes.
Richard Purdie - Feb. 22, 2014, 11:22 a.m.
On Thu, 2014-02-20 at 19:17 +0000, Damian, Alexandru wrote:
> I have an RFC patch that attempts to improve build performance by
> dynamically selecting the next task to run based on the currently
> running tasks. The general idea is that if we overload the network with
> too many fetch tasks, it's better to start doing unpack tasks while the
> disks are idle instead of starting new fetch tasks. The concept applies
> generally, e.g. it's better to schedule package tasks in parallel with
> compile tasks instead of running all the compile tasks first and then
> all the package tasks.
>
> This patch just looks at the currently running tasks and selects the
> next available task that doesn't have a similar-type task already
> running.
>
> I'm seeing a roughly 2% build-time reduction when building the
> virtual:native components, with just 26 tasks reordered out of the 244
> tasks executed. I am attaching a toaster.sqlite file that captures the
> test: build no. 1 was done with the origin/master source, and build
> no. 2 with this patch applied.

The trouble with scheduler changes is that they're hard to evaluate: a
change may show benefits on one machine but cause issues on another.

Firstly, I'd like to see this as a new scheduler rather than a change to
the existing one. I'd then like Stefan to try it against our standard
performance tests and see what it does to our various benchmarks there.
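For reference, a new scheduler can be wired in without touching the
default one: BitBake's BB_SCHEDULER and BB_SCHEDULERS configuration
variables select and register scheduler implementations. A sketch (the
scheduler name "interleave" and the module path are hypothetical
examples, not values from this patch):

```
# local.conf -- select a scheduler by the 'name' attribute of its class
BB_SCHEDULER = "interleave"

# Register out-of-tree scheduler classes (space-separated module.ClassName)
BB_SCHEDULERS = "mymodule.InterleaveScheduler"
```

The class itself would subclass RunQueueScheduler and override the
task-selection method, leaving the existing schedulers untouched.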

I'd also note that do_fetch is not the best task to optimise for, since
most people have relatively up-to-date source directories locally at any
point after their first build.

Cheers,

Richard

Patch

From 51d008326c8a9c8c76e2f7f7d46f8dbc6b4996e7 Mon Sep 17 00:00:00 2001
From: Alexandru DAMIAN <alexandru.damian@intel.com>
Date: Tue, 18 Feb 2014 17:43:43 +0000
Subject: [PATCH 1/1] bitbake: runqueue: improve task selection algorithm

When selecting the next task to run, BitBake uses a
priority map to select one task from the list of currently
buildable tasks. The priority map is statically computed by
the chosen scheduler before the run starts.

This patch changes the task selection to add dynamic selection
based on the currently running tasks. The basic assumption is that
we get a lower total build time if we select tasks so as to
spread the load better across subsystems, i.e. select
disk-intensive tasks instead of network-intensive tasks if there
are too many network tasks currently running.

This first implementation simply looks at the currently running
tasks and favours selecting a new task when no task of the same
type is already running.

Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com>
---
 bitbake/lib/bb/runqueue.py | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
index 413d59f..290320b 100644
--- a/bitbake/lib/bb/runqueue.py
+++ b/bitbake/lib/bb/runqueue.py
@@ -135,16 +135,33 @@  class RunQueueScheduler(object):
             for taskid in xrange(self.numTasks):
                 self.rev_prio_map[self.prio_map[taskid]] = taskid
 
-        best = None
-        bestprio = None
-        for taskid in self.buildable:
-            prio = self.rev_prio_map[taskid]
-            if not bestprio or bestprio > prio:
-                stamp = self.stamps[taskid]
-                if stamp in self.rq.build_stamps.itervalues():
-                    continue
-                bestprio = prio
-                best = taskid
+        runningtasks = []
+        for taskid in xrange(self.numTasks):
+            if self.rq.runq_running[taskid] == 1 and self.rq.runq_complete[taskid] == 0:
+                runningtasks.append(taskid)
+
+        if (len(runningtasks)):
+            runningtasknames = {}
+            for taskid in runningtasks:
+                taskname = self.rqdata.runq_task[taskid]
+                if taskname in runningtasknames.keys():
+                    runningtasknames[taskname] += 1
+                else:
+                    runningtasknames[taskname] = 1
+
+            # we select a task that isn't in the list of currently running task names
+            b = sorted(self.buildable, key=lambda x: self.rev_prio_map[x])
+            for taskid in b:
+                taskname = self.rqdata.runq_task[taskid]
+                if not taskname in runningtasknames:
+                    stamp = self.stamps[taskid]
+                    if stamp in self.rq.build_stamps.itervalues():
+                        continue
+                    #bb.note("Selected task %s" % self.rqdata.runq_task[taskid])
+                    return taskid
+
+        b = sorted(self.buildable, key=lambda x: self.rev_prio_map[x])
+        return b[0]
 
         return best
 
-- 
1.8.3.2