Message ID | 20230728111311.1665976-4-richard.purdie@linuxfoundation.org |
---|---|
State | Accepted, archived |
Commit | 3924e94214b5135369be2551d54fb92097d35e95 |
Headers | show |
Series | [1/4] oeqa/runtime/ltp: Increase ltp test output timeout | expand |
On Fri, Jul 28, 2023 at 5:13 AM Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > > As spotted by Joshua Watt, the returncode isn't set until .poll() or .wait() > is called so we need to call this after the .kill() call. > > This fixes return code reporting so that timeouts for example now return an > exit code when they didn't before. > > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> > --- > meta/lib/oeqa/core/target/ssh.py | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py > index 243e45dd99e..72ed1adbf89 100644 > --- a/meta/lib/oeqa/core/target/ssh.py > +++ b/meta/lib/oeqa/core/target/ssh.py > @@ -265,6 +265,7 @@ def SSHCall(command, logger, timeout=None, **opts): > time.sleep(5) > try: > process.kill() > + process.wait() I think this can still result in a orphaned process; if the process.terminate() on the line above is successful, the process will exit and then process.kill() here will raise an exception and will call wait. This is true generally everywhere; wait() must be called regardless of if kill() successed, because kill() will fail if the process is already dead. It's a little tricky because process.commuinicate() calls wait(), so you can't unconditionally call wait(), but my solution was to remove all the kill() and wait() calls that already exist and add: try: run() except: logger.debug(....) raise finally: if process.returncode is None: try: process.wait(timeout=5) except TimeoutExpired: try: process.kill() except OSError: pass process.wait() > except OSError: > logger.debug('OSError when killing process') > pass > @@ -287,6 +288,7 @@ def SSHCall(command, logger, timeout=None, **opts): > except TimeoutExpired: > try: > process.kill() > + process.wait() > except OSError: > logger.debug('OSError') > pass > @@ -316,6 +318,7 @@ def SSHCall(command, logger, timeout=None, **opts): > # whilst running and ensure we don't leave a process behind. > if process.poll() is None: > process.kill() > + process.wait() > logger.debug('Something went wrong, killing SSH process') > raise > > -- > 2.39.2 > > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#185024): https://lists.openembedded.org/g/openembedded-core/message/185024 > Mute This Topic: https://lists.openembedded.org/mt/100408183/3616693 > Group Owner: openembedded-core+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [JPEWhacker@gmail.com] > -=-=-=-=-=-=-=-=-=-=-=- >
On Fri, Jul 28, 2023 at 8:01 AM Joshua Watt <jpewhacker@gmail.com> wrote: > > On Fri, Jul 28, 2023 at 5:13 AM Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > > > As spotted by Joshua Watt, the returncode isn't set until .poll() or .wait() > > is called so we need to call this after the .kill() call. > > > > This fixes return code reporting so that timeouts for example now return an > > exit code when they didn't before. > > > > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> > > --- > > meta/lib/oeqa/core/target/ssh.py | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py > > index 243e45dd99e..72ed1adbf89 100644 > > --- a/meta/lib/oeqa/core/target/ssh.py > > +++ b/meta/lib/oeqa/core/target/ssh.py > > @@ -265,6 +265,7 @@ def SSHCall(command, logger, timeout=None, **opts): > > time.sleep(5) > > try: > > process.kill() > > + process.wait() > > I think this can still result in a orphaned process; if the > process.terminate() on the line above is successful, the process will > exit and then process.kill() here will raise an exception and will > call wait. This is true generally everywhere; wait() must be called "will _not_ call wait" - Need to proofread better > regardless of if kill() successed, because kill() will fail if the > process is already dead. It's a little tricky because > process.commuinicate() calls wait(), so you can't unconditionally call > wait(), but my solution was to remove all the kill() and wait() calls > that already exist and add: > > try: > run() > except: > logger.debug(....) > raise > finally: > if process.returncode is None: > try: > process.wait(timeout=5) > except TimeoutExpired: > try: > process.kill() > except OSError: > pass > process.wait() > > > except OSError: > > logger.debug('OSError when killing process') > > pass > > @@ -287,6 +288,7 @@ def SSHCall(command, logger, timeout=None, **opts): > > except TimeoutExpired: > > try: > > process.kill() > > + process.wait() > > except OSError: > > logger.debug('OSError') > > pass > > @@ -316,6 +318,7 @@ def SSHCall(command, logger, timeout=None, **opts): > > # whilst running and ensure we don't leave a process behind. > > if process.poll() is None: > > process.kill() > > + process.wait() > > logger.debug('Something went wrong, killing SSH process') > > raise > > > > -- > > 2.39.2 > > > > > > -=-=-=-=-=-=-=-=-=-=-=- > > Links: You receive all messages sent to this group. > > View/Reply Online (#185024): https://lists.openembedded.org/g/openembedded-core/message/185024 > > Mute This Topic: https://lists.openembedded.org/mt/100408183/3616693 > > Group Owner: openembedded-core+owner@lists.openembedded.org > > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [JPEWhacker@gmail.com] > > -=-=-=-=-=-=-=-=-=-=-=- > >
On Fri, 2023-07-28 at 08:02 -0600, Joshua Watt wrote: > On Fri, Jul 28, 2023 at 8:01 AM Joshua Watt <jpewhacker@gmail.com> wrote: > > > > On Fri, Jul 28, 2023 at 5:13 AM Richard Purdie > > <richard.purdie@linuxfoundation.org> wrote: > > > > > > As spotted by Joshua Watt, the returncode isn't set until .poll() or .wait() > > > is called so we need to call this after the .kill() call. > > > > > > This fixes return code reporting so that timeouts for example now return an > > > exit code when they didn't before. > > > > > > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> > > > --- > > > meta/lib/oeqa/core/target/ssh.py | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py > > > index 243e45dd99e..72ed1adbf89 100644 > > > --- a/meta/lib/oeqa/core/target/ssh.py > > > +++ b/meta/lib/oeqa/core/target/ssh.py > > > @@ -265,6 +265,7 @@ def SSHCall(command, logger, timeout=None, **opts): > > > time.sleep(5) > > > try: > > > process.kill() > > > + process.wait() > > > > I think this can still result in a orphaned process; if the > > process.terminate() on the line above is successful, the process will > > exit and then process.kill() here will raise an exception and will > > call wait. This is true generally everywhere; wait() must be called > > "will _not_ call wait" - Need to proofread better > > > regardless of if kill() successed, because kill() will fail if the > > process is already dead. It's a little tricky because > > process.commuinicate() calls wait(), so you can't unconditionally call > > wait(), but my solution was to remove all the kill() and wait() calls > > that already exist and add: > > > > try: > > run() > > except: > > logger.debug(....) > > raise > > finally: > > if process.returncode is None: > > try: > > process.wait(timeout=5) > > except TimeoutExpired: > > try: > > process.kill() > > except OSError: > > pass > > process.wait() > > > > I think I'll probably merge this as it does work better than the existing code and lets me clean up the ltp test results but I'd take a further patch... Cheers, Richard
diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py index 243e45dd99e..72ed1adbf89 100644 --- a/meta/lib/oeqa/core/target/ssh.py +++ b/meta/lib/oeqa/core/target/ssh.py @@ -265,6 +265,7 @@ def SSHCall(command, logger, timeout=None, **opts): time.sleep(5) try: process.kill() + process.wait() except OSError: logger.debug('OSError when killing process') pass @@ -287,6 +288,7 @@ def SSHCall(command, logger, timeout=None, **opts): except TimeoutExpired: try: process.kill() + process.wait() except OSError: logger.debug('OSError') pass @@ -316,6 +318,7 @@ def SSHCall(command, logger, timeout=None, **opts): # whilst running and ensure we don't leave a process behind. if process.poll() is None: process.kill() + process.wait() logger.debug('Something went wrong, killing SSH process') raise
As spotted by Joshua Watt, the returncode isn't set until .poll() or .wait() is called so we need to call this after the .kill() call. This fixes return code reporting so that timeouts for example now return an exit code when they didn't before. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> --- meta/lib/oeqa/core/target/ssh.py | 3 +++ 1 file changed, 3 insertions(+)