| Submitter | Shane Wang |
|---|---|
| Date | March 29, 2012, 12:54 p.m. |
| Message ID | <b0f25a4ae7b257c5e0631a3e5c1f90facf25aca6.1333025491.git.shane.wang@intel.com> |
| Permalink | /patch/24861/ |
| State | New |
Comments
On Thu, 2012-03-29 at 20:54 +0800, Shane Wang wrote:
> ne.wang@intel.com>
> To: bitbake-devel@lists.openembedded.org
> Subject: [bitbake-devel] [PATCH 5/8] runqueue.py: check results[0] in keys of build_pids before being used to avoid exceptions
> Date: Thu, 29 Mar 2012 20:54:54 +0800 (29/03/12 13:54:54)
>
> [Yocto #2186]
>
> Signed-off-by: Shane Wang <shane.wang@intel.com>
> ---
> bitbake/lib/bb/runqueue.py | 20 ++++++++++++--------
> 1 files changed, 12 insertions(+), 8 deletions(-)

This kind of change sets off alarm bells. The big question is why are you seeing exceptions? I suspect you're forking off processes within hob which are then confusing the waitpid code. I'd have to ask why the UI is forking processes when a build is running and why we've suddenly started seeing this...

So can you please explain the problem further so we can fix the real problem? I did look at #2186 but that doesn't help me either :(

Cheers,

Richard
Richard Purdie wrote on 2012-03-30:

> On Thu, 2012-03-29 at 20:54 +0800, Shane Wang wrote:
>> ne.wang@intel.com>
>> To: bitbake-devel@lists.openembedded.org
>> Subject: [bitbake-devel] [PATCH 5/8] runqueue.py: check results[0] in keys of build_pids before being used to avoid exceptions
>> Date: Thu, 29 Mar 2012 20:54:54 +0800 (29/03/12 13:54:54)
>>
>> [Yocto #2186]
>>
>> Signed-off-by: Shane Wang <shane.wang@intel.com>
>> ---
>> bitbake/lib/bb/runqueue.py | 20 ++++++++++++--------
>> 1 files changed, 12 insertions(+), 8 deletions(-)
>
> This kind of change sets off alarm bells. The big question is why are
> you seeing exceptions? I suspect you're forking off processes within hob
> which are then confusing the waitpid code. I'd have to ask why the UI is
> forking processes when a build is running and why we've suddenly started
> seeing this...

The steps I took were to "Force stop" a build and then click "build packages" to rebuild. Then I saw the exceptions.
In command-line mode there is no issue, because the process exits.

In finish_now() in runqueue.py, os.kill() kills all the sub-processes, but they don't exit. When I start a new build, self.build_pids is empty but, for the reason above, os.waitpid() can still return their pids.

> So can you please explain the problem further so we can fix the real
> problem? I did look at #2186 but that doesn't help me either :(

OK, I am going to submit another patch, but I think the condition check is also needed.
Otherwise, in the current code of runqueue_process_waitpid(), why do we have:

    if result[0] in self.build_stamps.keys():
        del self.build_stamps[result[0]]

> Cheers,
>
> Richard

--
Shane
By the way, I have never hit the exception when I stop bitbake normally.

--
Shane
On Thu, Mar 29, 2012 at 11:10 PM, Wang, Shane <shane.wang@intel.com> wrote:
>> So can you please explain the problem further so we can fix the real
>> problem? I did look at #2186 but that doesn't help me either :(
> OK, I am going to submit another patch, but I think the condition check is also needed.
> Otherwise, in the current code of runqueue_process_waitpid(), why do we have:
>     if result[0] in self.build_stamps.keys():
>         del self.build_stamps[result[0]]

This is also off, from a code standpoint, even assuming it's needed. There's no need to use the keys method at all for a map. 'in' against a map automatically checks by key:

    result[0] in self.build_stamps
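To make the point about `in` concrete, here is a minimal sketch. The dictionary contents and the pid value are made up for illustration and are not taken from runqueue.py. On Python 2, `.keys()` also builds a list first, so the direct form avoids a linear scan as well as the extra call.

```python
# Hypothetical stand-in for self.build_stamps: maps a child pid to a stamp path.
build_stamps = {4242: "stamps/example.do_compile"}

pid = 4242

# Both expressions test membership by key and give the same answer...
via_keys = pid in build_stamps.keys()
direct = pid in build_stamps

# ...so the guarded delete can be written without .keys().
if pid in build_stamps:
    del build_stamps[pid]

# pop() with a default does the check and the delete in one step and never
# raises KeyError; the entry was already deleted above, so this returns None.
removed = build_stamps.pop(pid, None)
```

Whether the guard should exist at all is a separate question from how it is spelled; this only covers the spelling.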
On Fri, 2012-03-30 at 06:10 +0000, Wang, Shane wrote:
> Richard Purdie wrote on 2012-03-30:
>
>> On Thu, 2012-03-29 at 20:54 +0800, Shane Wang wrote:
>>> ne.wang@intel.com>
>>> To: bitbake-devel@lists.openembedded.org
>>> Subject: [bitbake-devel] [PATCH 5/8] runqueue.py: check results[0] in keys of build_pids before being used to avoid exceptions
>>> Date: Thu, 29 Mar 2012 20:54:54 +0800 (29/03/12 13:54:54)
>>>
>>> [Yocto #2186]
>>>
>>> Signed-off-by: Shane Wang <shane.wang@intel.com>
>>> ---
>>> bitbake/lib/bb/runqueue.py | 20 ++++++++++++--------
>>> 1 files changed, 12 insertions(+), 8 deletions(-)
>>
>> This kind of change sets off alarm bells. The big question is why are
>> you seeing exceptions? I suspect you're forking off processes within hob
>> which are then confusing the waitpid code. I'd have to ask why the UI is
>> forking processes when a build is running and why we've suddenly started
>> seeing this...
>
> The steps I took were to "Force stop" a build and then click "build packages" to rebuild. Then I saw the exceptions.
> In command-line mode there is no issue, because the process exits.

Ok, so what it sounds like is that waitpid() is not being called in the "force stop" mode to collect the exit values of the processes. We should fix the code to collect the exit values even in force stop mode.

Cheers,

Richard
(resisting the urge to talk about reaping and zombies)
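A rough sketch of what "collect the exit values even in force stop mode" could look like, written for Python 3 on a POSIX system. This is not BitBake's code: `spawn_sleeper`, `force_stop` and `leftover_children` are hypothetical names, `build_pids` is only loosely modelled on `self.build_pids`, and the use of SIGKILL is an assumption made so that the blocking wait is guaranteed to return.

```python
import os
import signal
import time


def spawn_sleeper():
    """Fork a child that only sleeps and return its pid (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child: wait to be killed, and never fall back into the parent's code.
        try:
            time.sleep(60)
        finally:
            os._exit(0)
    return pid


def force_stop(build_pids, sig=signal.SIGKILL):
    """Signal every tracked child, then wait for each one so none is left unreaped.

    build_pids maps pid -> task. Returns {pid: raw wait status}.
    SIGKILL is an assumption: a child that ignores a gentler signal would
    make the blocking waitpid() below hang.
    """
    statuses = {}
    for pid in list(build_pids):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # no such process any more
    for pid in list(build_pids):
        try:
            # Blocking wait on this specific pid collects its exit status.
            _, status = os.waitpid(pid, 0)
            statuses[pid] = status
        except ChildProcessError:
            pass  # already reaped elsewhere
        del build_pids[pid]
    return statuses


def leftover_children():
    """Return True if a non-blocking waitpid(-1) still finds an exited child."""
    try:
        return os.waitpid(-1, os.WNOHANG)[0] != 0
    except ChildProcessError:
        return False  # this process has no children at all
```

With the children reaped at stop time, a later `os.waitpid(-1, os.WNOHANG)` in a fresh build has no stale pids to return, which is the situation the lookup in `self.build_pids` was tripping over.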
Patch
diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
index 6970548..67ad14b 100644
--- a/bitbake/lib/bb/runqueue.py
+++ b/bitbake/lib/bb/runqueue.py
@@ -1049,17 +1049,21 @@ class RunQueueExecute:
         result = os.waitpid(-1, os.WNOHANG)
         if result[0] == 0 and result[1] == 0:
             return None
-        task = self.build_pids[result[0]]
-        del self.build_pids[result[0]]
-        self.build_pipes[result[0]].close()
-        del self.build_pipes[result[0]]
+        task = None
+        if result[0] in self.build_pids.keys():
+            task = self.build_pids[result[0]]
+            del self.build_pids[result[0]]
+        if result[0] in self.build_pipes.keys():
+            self.build_pipes[result[0]].close()
+            del self.build_pipes[result[0]]
         # self.build_stamps[result[0]] may not exist when use shared work directory.
         if result[0] in self.build_stamps.keys():
             del self.build_stamps[result[0]]
-        if result[1] != 0:
-            self.task_fail(task, result[1]>>8)
-        else:
-            self.task_complete(task)
+        if task:
+            if result[1] != 0:
+                self.task_fail(task, result[1]>>8)
+            else:
+                self.task_complete(task)
         return True
 
     def finish_now(self):
[Yocto #2186]

Signed-off-by: Shane Wang <shane.wang@intel.com>
---
 bitbake/lib/bb/runqueue.py | 20 ++++++++++++--------
 1 files changed, 12 insertions(+), 8 deletions(-)
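For reference, the guarded bookkeeping in the patch can be sketched as a standalone function. This is an illustration only, not the code in runqueue.py: `handle_child_exit`, its return tuples and `FakePipe` are hypothetical, and the three dictionaries merely stand in for `self.build_pids`, `self.build_pipes` and `self.build_stamps`. One thing the sketch does differently from the patch is test `task is None` instead of `if task:`; if task identifiers can be 0 (an assumption, the thread does not say), a plain truthiness test would silently skip that task.

```python
class FakePipe:
    """Stand-in for a pipe object; records whether close() was called."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def handle_child_exit(result, build_pids, build_pipes, build_stamps):
    """Drop bookkeeping for a reaped child and classify its outcome.

    result is the (pid, status) pair returned by os.waitpid().
    Returns None when nothing exited, ("unknown", pid) for a child that was
    not being tracked, ("fail", task, exit_code) or ("complete", task).
    """
    pid, status = result
    if pid == 0 and status == 0:
        return None

    # pop() with a default never raises KeyError for an untracked pid.
    task = build_pids.pop(pid, None)
    pipe = build_pipes.pop(pid, None)
    if pipe is not None:
        pipe.close()
    build_stamps.pop(pid, None)

    # Compare against None rather than truthiness so task id 0 still counts.
    if task is None:
        return ("unknown", pid)
    if status != 0:
        return ("fail", task, status >> 8)
    return ("complete", task)
```

As the review comments point out, a guard like this only stops the exception; an "unknown" pid showing up here still means some child was signalled earlier without being waited for.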