Message ID | 20211217113722.21974-1-ranjitsinhrathod1991@gmail.com |
---|---|
State | New, archived |
Headers | show |
Series | [meta,dunfell] boost: Add a NULL check for the pointer which causes a crash | expand |
On Fri, Dec 17, 2021 at 1:37 AM Ranjitsinh Rathod <ranjitsinhrathod1991@gmail.com> wrote: > > From: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > > Issues seen in boost asio call when used within shared libraries > particularly in aarch64. > The discussion with boost maintainers is on going at > github.com/chriskohlhoff/asio/issues/588 originally reported at > github.com/chriskohlhoff/asio/issues/642. The crash is quite frequent > with no solution in sight at present. > As a workaround this simple patch that checks the nullness of the > call stack seems to avoid the crash. Not sure I feel good about "seems to" ! Especially when the comments in the linked pull request include: "Please don't merge this just yet. While I no longer get SIGSEGVs with this patch, my event loop (io.run()) now just quits. There is a deeper underlying issue here." and "Closing. As noted elsewhere, this change is not correct. The compensating_work_started function is only called from inside the scheduler where we must have a valid, non-null call stack. That it's null may indicate that something is seriously wrong with the way the program is built. (A build issue with shared libraries perhaps?) Suppressing it like this will instead introduce a subtle, hard-to-debug work counting problem." So I'd like to see a little more discussion before we decide to take this patch. Steve > > Typical Crash backtrace: > Thread 1 (LWP 907): > 0 boost::asio::detail::scheduler::compensating_work_started (this=0x559e174230) at /usr/include/boost/asio/detail/impl/scheduler.ipp:321 > 1 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=0x7f861ab348, __in_chrg=<optimized out>) > at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:712 > 2 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x7f800023d0) at > /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:730 > 3 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (owner=0x559e174230, base=0x7f800023d0, ec=..., bytes_transferred=<optimized out>) > at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:774 > 4 0x0000007f894a4398 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=17, ec=..., owner=0x559e174230, this=0x7f800023d0) > at /usr/include/boost/asio/detail/scheduler_operation.hpp:40 > 5 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x559e174230) at > /usr/include/boost/asio/detail/impl/scheduler.ipp:447 > 6 boost::asio::detail::scheduler::run (this=0x559e174230, ec=...) at /usr/include/boost/asio/detail/impl/scheduler.ipp:200 > 7 0x0000007f895e444c in boost::asio::io_context::run (this=0x559e174b70) at /usr/include/boost/asio/impl/io_context.ipp:63 > > Signed-off-by: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > Signed-off-by: Ranjitsinh Rathod <ranjitsinhrathod1991@gmail.com> > --- > ...L-check-for-the-pointer-which-causes.patch | 54 +++++++++++++++++++ > meta/recipes-support/boost/boost_1.72.0.bb | 1 + > 2 files changed, 55 insertions(+) > create mode 100644 meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > > diff --git a/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > new file mode 100644 > index 0000000000..5ffea59e29 > --- /dev/null > +++ b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > @@ -0,0 +1,54 @@ > +From d5efa47962b0101d9ec305b38a1520e9c13f118e Mon Sep 17 00:00:00 2001 > +From: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > +Date: Mon, 31 May 2021 14:40:12 +0100 > +Subject: [PATCH] boost: Add a NULL check for the pointer which causes a crash > + > +Issues seen in boost asio call when used within shared libraries > +particularly in aarch64. > +The discussion with boost maintainers is on going at > +github.com/chriskohlhoff/asio/issues/588 originally reported at > +github.com/chriskohlhoff/asio/issues/642. The crash is quite frequent > +with no solution in sight at present. > +As a workaround this simple patch that checks the nullness of the > +call stack seems to avoid the crash. > + > +Typical Crash backtrace: > +Thread 1 (LWP 907): > +0 boost::asio::detail::scheduler::compensating_work_started (this=0x559e174230) at /usr/include/boost/asio/detail/impl/scheduler.ipp:321 > +1 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=0x7f861ab348, __in_chrg=<optimized out>) > +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:712 > +2 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x7f800023d0) at > +/usr/include/boost/asio/detail/impl/epoll_reactor.ipp:730 > +3 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (owner=0x559e174230, base=0x7f800023d0, ec=..., bytes_transferred=<optimized out>) > +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:774 > +4 0x0000007f894a4398 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=17, ec=..., owner=0x559e174230, this=0x7f800023d0) > +at /usr/include/boost/asio/detail/scheduler_operation.hpp:40 > +5 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x559e174230) at > +/usr/include/boost/asio/detail/impl/scheduler.ipp:447 > +6 boost::asio::detail::scheduler::run (this=0x559e174230, ec=...) at /usr/include/boost/asio/detail/impl/scheduler.ipp:200 > +7 0x0000007f895e444c in boost::asio::io_context::run (this=0x559e174b70) at /usr/include/boost/asio/impl/io_context.ipp:63 > + > +Upstream-Status: Submitted [https://github.com/chriskohlhoff/asio/pull/330] > + > +Signed-off-by: Mikko Rapeli <mikko.rapeli@bmw.de> > +Signed-off-by: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > + > +--- > + boost/asio/detail/impl/scheduler.ipp | 4 ++-- > + 1 file changed, 2 insertions(+), 2 deletions(-) > + > +diff --git a/boost/asio/detail/impl/scheduler.ipp b/boost/asio/detail/impl/scheduler.ipp > +index 4ef5c8668..02fc0d2e5 100644 > +--- a/boost/asio/detail/impl/scheduler.ipp > ++++ b/boost/asio/detail/impl/scheduler.ipp > +@@ -317,8 +317,8 @@ void scheduler::restart() > + > + void scheduler::compensating_work_started() > + { > +- thread_info_base* this_thread = thread_call_stack::contains(this); > +- ++static_cast<thread_info*>(this_thread)->private_outstanding_work; > ++ if (thread_info_base* this_thread = thread_call_stack::contains(this)) > ++ ++static_cast<thread_info*>(this_thread)->private_outstanding_work; > + } > + > + void scheduler::post_immediate_completion( > diff --git a/meta/recipes-support/boost/boost_1.72.0.bb b/meta/recipes-support/boost/boost_1.72.0.bb > index df1cc16937..7fb8526b33 100644 > --- a/meta/recipes-support/boost/boost_1.72.0.bb > +++ b/meta/recipes-support/boost/boost_1.72.0.bb > @@ -9,4 +9,5 @@ SRC_URI += " \ > file://0001-dont-setup-compiler-flags-m32-m64.patch \ > file://0001-revert-cease-dependence-on-range.patch \ > file://0001-added-typedef-executor_type.patch \ > + file://0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch \ > " > -- > 2.17.1 > > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#159821): https://lists.openembedded.org/g/openembedded-core/message/159821 > Mute This Topic: https://lists.openembedded.org/mt/87787403/3617601 > Group Owner: openembedded-core+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [sakoman@gmail.com] > -=-=-=-=-=-=-=-=-=-=-=- >
On Fri, Dec 17, 2021 at 6:47 AM Steve Sakoman <sakoman@gmail.com> wrote: > > On Fri, Dec 17, 2021 at 1:37 AM Ranjitsinh Rathod > <ranjitsinhrathod1991@gmail.com> wrote: > > > > From: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > > > > Issues seen in boost asio call when used within shared libraries > > particularly in aarch64. > > The discussion with boost maintainers is on going at > > github.com/chriskohlhoff/asio/issues/588 originally reported at > > github.com/chriskohlhoff/asio/issues/642. The crash is quite frequent > > with no solution in sight at present. > > As a workaround this simple patch that checks the nullness of the > > call stack seems to avoid the crash. > > Not sure I feel good about "seems to" ! > > Especially when the comments in the linked pull request include: > > "Please don't merge this just yet. While I no longer get SIGSEGVs with > this patch, my event loop (io.run()) now just quits. There is a deeper > underlying issue here." > > and > > "Closing. As noted elsewhere, this change is not correct. The > compensating_work_started function is only called from inside the > scheduler where we must have a valid, non-null call stack. That it's > null may indicate that something is seriously wrong with the way the > program is built. (A build issue with shared libraries perhaps?) > Suppressing it like this will instead introduce a subtle, > hard-to-debug work counting problem." > > So I'd like to see a little more discussion before we decide to take this patch. Good call Steve, I also think its just hiding the issue a null return might result in other symptoms which might be entirely differrent. > > Steve > > > > > Typical Crash backtrace: > > Thread 1 (LWP 907): > > 0 boost::asio::detail::scheduler::compensating_work_started (this=0x559e174230) at /usr/include/boost/asio/detail/impl/scheduler.ipp:321 > > 1 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=0x7f861ab348, __in_chrg=<optimized out>) > > at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:712 > > 2 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x7f800023d0) at > > /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:730 > > 3 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (owner=0x559e174230, base=0x7f800023d0, ec=..., bytes_transferred=<optimized out>) > > at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:774 > > 4 0x0000007f894a4398 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=17, ec=..., owner=0x559e174230, this=0x7f800023d0) > > at /usr/include/boost/asio/detail/scheduler_operation.hpp:40 > > 5 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x559e174230) at > > /usr/include/boost/asio/detail/impl/scheduler.ipp:447 > > 6 boost::asio::detail::scheduler::run (this=0x559e174230, ec=...) at /usr/include/boost/asio/detail/impl/scheduler.ipp:200 > > 7 0x0000007f895e444c in boost::asio::io_context::run (this=0x559e174b70) at /usr/include/boost/asio/impl/io_context.ipp:63 > > > > Signed-off-by: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > > Signed-off-by: Ranjitsinh Rathod <ranjitsinhrathod1991@gmail.com> > > --- > > ...L-check-for-the-pointer-which-causes.patch | 54 +++++++++++++++++++ > > meta/recipes-support/boost/boost_1.72.0.bb | 1 + > > 2 files changed, 55 insertions(+) > > create mode 100644 meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > > > > diff --git a/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > > new file mode 100644 > > index 0000000000..5ffea59e29 > > --- /dev/null > > +++ b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch > > @@ -0,0 +1,54 @@ > > +From d5efa47962b0101d9ec305b38a1520e9c13f118e Mon Sep 17 00:00:00 2001 > > +From: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > > +Date: Mon, 31 May 2021 14:40:12 +0100 > > +Subject: [PATCH] boost: Add a NULL check for the pointer which causes a crash > > + > > +Issues seen in boost asio call when used within shared libraries > > +particularly in aarch64. > > +The discussion with boost maintainers is on going at > > +github.com/chriskohlhoff/asio/issues/588 originally reported at > > +github.com/chriskohlhoff/asio/issues/642. The crash is quite frequent > > +with no solution in sight at present. > > +As a workaround this simple patch that checks the nullness of the > > +call stack seems to avoid the crash. > > + > > +Typical Crash backtrace: > > +Thread 1 (LWP 907): > > +0 boost::asio::detail::scheduler::compensating_work_started (this=0x559e174230) at /usr/include/boost/asio/detail/impl/scheduler.ipp:321 > > +1 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=0x7f861ab348, __in_chrg=<optimized out>) > > +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:712 > > +2 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x7f800023d0) at > > +/usr/include/boost/asio/detail/impl/epoll_reactor.ipp:730 > > +3 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (owner=0x559e174230, base=0x7f800023d0, ec=..., bytes_transferred=<optimized out>) > > +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:774 > > +4 0x0000007f894a4398 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=17, ec=..., owner=0x559e174230, this=0x7f800023d0) > > +at /usr/include/boost/asio/detail/scheduler_operation.hpp:40 > > +5 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x559e174230) at > > +/usr/include/boost/asio/detail/impl/scheduler.ipp:447 > > +6 boost::asio::detail::scheduler::run (this=0x559e174230, ec=...) at /usr/include/boost/asio/detail/impl/scheduler.ipp:200 > > +7 0x0000007f895e444c in boost::asio::io_context::run (this=0x559e174b70) at /usr/include/boost/asio/impl/io_context.ipp:63 > > + > > +Upstream-Status: Submitted [https://github.com/chriskohlhoff/asio/pull/330] > > + > > +Signed-off-by: Mikko Rapeli <mikko.rapeli@bmw.de> > > +Signed-off-by: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> > > + > > +--- > > + boost/asio/detail/impl/scheduler.ipp | 4 ++-- > > + 1 file changed, 2 insertions(+), 2 deletions(-) > > + > > +diff --git a/boost/asio/detail/impl/scheduler.ipp b/boost/asio/detail/impl/scheduler.ipp > > +index 4ef5c8668..02fc0d2e5 100644 > > +--- a/boost/asio/detail/impl/scheduler.ipp > > ++++ b/boost/asio/detail/impl/scheduler.ipp > > +@@ -317,8 +317,8 @@ void scheduler::restart() > > + > > + void scheduler::compensating_work_started() > > + { > > +- thread_info_base* this_thread = thread_call_stack::contains(this); > > +- ++static_cast<thread_info*>(this_thread)->private_outstanding_work; > > ++ if (thread_info_base* this_thread = thread_call_stack::contains(this)) > > ++ ++static_cast<thread_info*>(this_thread)->private_outstanding_work; > > + } > > + > > + void scheduler::post_immediate_completion( > > diff --git a/meta/recipes-support/boost/boost_1.72.0.bb b/meta/recipes-support/boost/boost_1.72.0.bb > > index df1cc16937..7fb8526b33 100644 > > --- a/meta/recipes-support/boost/boost_1.72.0.bb > > +++ b/meta/recipes-support/boost/boost_1.72.0.bb > > @@ -9,4 +9,5 @@ SRC_URI += " \ > > file://0001-dont-setup-compiler-flags-m32-m64.patch \ > > file://0001-revert-cease-dependence-on-range.patch \ > > file://0001-added-typedef-executor_type.patch \ > > + file://0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch \ > > " > > -- > > 2.17.1 > > > > > > > > > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#159827): https://lists.openembedded.org/g/openembedded-core/message/159827 > Mute This Topic: https://lists.openembedded.org/mt/87787403/1997914 > Group Owner: openembedded-core+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [raj.khem@gmail.com] > -=-=-=-=-=-=-=-=-=-=-=- >
Steve, Earlier we have also thought the same things from the upstream issue comment and we have tried another workaround mentioned on the ticket which is disabled epoll. We have tried it and it didn't work for the project. we still facing the issue. Then we tried the NULL check for the variable. Maybe it's not a proper fix (just a workaround to check the pointer before accessing), but there is nothing wrong with checking the null before accessing any variable. Ranjitsinh
No, as upstream explained, in this case the null indicates invalid internal state, so the correct behavior for the program is indeed to halt the program execution and for you it is to find out how the invalid internal state occurs in the first place through debugging or similar. Alex Alex On Mon 20. Dec 2021 at 12.12, Ranjitsinh Rathod < ranjitsinhrathod1991@gmail.com> wrote: > Steve, > > Earlier we have also thought the same things from the upstream issue > comment and we have tried another workaround mentioned on the ticket which > is disabled epoll. > We have tried it and it didn't work for the project. we still facing the > issue. > > Then we tried the NULL check for the variable. Maybe it's not a proper fix > (just a workaround to check the pointer before accessing), but there is > nothing wrong with checking the null before accessing any variable. > > Ranjitsinh > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#159854): > https://lists.openembedded.org/g/openembedded-core/message/159854 > Mute This Topic: https://lists.openembedded.org/mt/87787403/1686489 > Group Owner: openembedded-core+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [ > alex.kanavin@gmail.com] > -=-=-=-=-=-=-=-=-=-=-=- > >
diff --git a/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch new file mode 100644 index 0000000000..5ffea59e29 --- /dev/null +++ b/meta/recipes-support/boost/boost/0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch @@ -0,0 +1,54 @@ +From d5efa47962b0101d9ec305b38a1520e9c13f118e Mon Sep 17 00:00:00 2001 +From: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> +Date: Mon, 31 May 2021 14:40:12 +0100 +Subject: [PATCH] boost: Add a NULL check for the pointer which causes a crash + +Issues seen in boost asio call when used within shared libraries +particularly in aarch64. +The discussion with boost maintainers is on going at +github.com/chriskohlhoff/asio/issues/588 originally reported at +github.com/chriskohlhoff/asio/issues/642. The crash is quite frequent +with no solution in sight at present. +As a workaround this simple patch that checks the nullness of the +call stack seems to avoid the crash. + +Typical Crash backtrace: +Thread 1 (LWP 907): +0 boost::asio::detail::scheduler::compensating_work_started (this=0x559e174230) at /usr/include/boost/asio/detail/impl/scheduler.ipp:321 +1 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=0x7f861ab348, __in_chrg=<optimized out>) +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:712 +2 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x7f800023d0) at +/usr/include/boost/asio/detail/impl/epoll_reactor.ipp:730 +3 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (owner=0x559e174230, base=0x7f800023d0, ec=..., bytes_transferred=<optimized out>) +at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:774 +4 0x0000007f894a4398 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=17, ec=..., owner=0x559e174230, this=0x7f800023d0) +at /usr/include/boost/asio/detail/scheduler_operation.hpp:40 +5 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x559e174230) at +/usr/include/boost/asio/detail/impl/scheduler.ipp:447 +6 boost::asio::detail::scheduler::run (this=0x559e174230, ec=...) at /usr/include/boost/asio/detail/impl/scheduler.ipp:200 +7 0x0000007f895e444c in boost::asio::io_context::run (this=0x559e174b70) at /usr/include/boost/asio/impl/io_context.ipp:63 + +Upstream-Status: Submitted [https://github.com/chriskohlhoff/asio/pull/330] + +Signed-off-by: Mikko Rapeli <mikko.rapeli@bmw.de> +Signed-off-by: Ranjitsinh Rathod <ranjitsinh.rathod@kpit.com> + +--- + boost/asio/detail/impl/scheduler.ipp | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/boost/asio/detail/impl/scheduler.ipp b/boost/asio/detail/impl/scheduler.ipp +index 4ef5c8668..02fc0d2e5 100644 +--- a/boost/asio/detail/impl/scheduler.ipp ++++ b/boost/asio/detail/impl/scheduler.ipp +@@ -317,8 +317,8 @@ void scheduler::restart() + + void scheduler::compensating_work_started() + { +- thread_info_base* this_thread = thread_call_stack::contains(this); +- ++static_cast<thread_info*>(this_thread)->private_outstanding_work; ++ if (thread_info_base* this_thread = thread_call_stack::contains(this)) ++ ++static_cast<thread_info*>(this_thread)->private_outstanding_work; + } + + void scheduler::post_immediate_completion( diff --git a/meta/recipes-support/boost/boost_1.72.0.bb b/meta/recipes-support/boost/boost_1.72.0.bb index df1cc16937..7fb8526b33 100644 --- a/meta/recipes-support/boost/boost_1.72.0.bb +++ b/meta/recipes-support/boost/boost_1.72.0.bb @@ -9,4 +9,5 @@ SRC_URI += " \ file://0001-dont-setup-compiler-flags-m32-m64.patch \ file://0001-revert-cease-dependence-on-range.patch \ file://0001-added-typedef-executor_type.patch \ + file://0001-boost-Add-a-NULL-check-for-the-pointer-which-causes.patch \ "