Patchwork [1/1] ncurses: Disable parallel make

Submitter Robert Yang
Date May 18, 2012, 12:14 p.m.
Message ID <4FB63D21.7060603@windriver.com>
Permalink /patch/27963/
State New

Comments

Robert Yang - May 18, 2012, 12:14 p.m.
I've looked into the code. This is a race between install.libs and
install.data:

1) install.data needs to run tic
2) tic needs libtinfo.so
3) install.libs regenerates libtinfo.so
4) install.data doesn't depend on install.libs, so the two can run in
    parallel

So the error only shows up in a narrow window: if tic starts running
while install.libs is still regenerating libtinfo.so, the library is
incomplete at that moment and tic fails with:

tic: error while loading shared libraries: /path/to/lib/libtinfo.so.5: file too short
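
The window described above can be imitated without ncurses. What follows is
only a sketch under stated assumptions: lib.so is a hypothetical stand-in for
libtinfo.so, the background subshell plays install.libs, and it assumes the
writer truncates the file and rewrites it in place (the thread does not say
how the linker writes its output). The sleeps only make the timing repeatable:

```shell
#!/bin/sh
# Sketch only: lib.so and the sizes are made up for illustration.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
lib="$work/lib.so"

# Size of a file in bytes, without the padding some wc versions print.
size_of() { echo $(( $(wc -c < "$1") )); }

# A complete 4096-byte "library", as left behind by the compile stage.
dd if=/dev/zero of="$lib" bs=4096 count=1 2>/dev/null
full_size=$(size_of "$lib")

# Writer (plays install.libs): truncate, write a little, pause, finish.
(
    : > "$lib"
    dd if=/dev/zero bs=16 count=1 2>/dev/null >> "$lib"
    sleep 2
    dd if=/dev/zero bs=4080 count=1 2>/dev/null >> "$lib"
) &
writer=$!

# Reader (plays tic): looks at the file while the writer is paused.
sleep 1
partial_size=$(size_of "$lib")

wait "$writer"
final_size=$(size_of "$lib")
echo "full=$full_size partial=$partial_size final=$final_size"
```

When the timing holds, the reader sees fewer bytes than the finished file
contains, which is the situation the dynamic loader reports as "file too
short".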

Here is the related code:
1) The do_install in meta/recipes-core/ncurses/ncurses.inc:

make install.libs ... install.data

2) The install.libs and install.data rules in ncurses-5.9/narrowc/Makefile.
    Note that they are double-colon rules, which means that every rule
    naming the target is run, and if a rule has no prerequisites, its
    recipe is always executed (even if the target already exists).
    install.libs appears in 9 rules (only 2 are pasted here):

install.libs uninstall.libs \
install.data uninstall.data ::
         cd misc && ${MAKE} ${CF_MFLAGS} $@
...
install.libs \
uninstall.libs \
install.ncurses \
uninstall.ncurses ::
         cd ncurses && ${MAKE} ${CF_MFLAGS} $@
...
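
The double-colon behaviour can be checked in isolation. A minimal sketch,
assuming GNU make is on the PATH; the Makefile is a cut-down, hypothetical
imitation of the two rules above, with echo in place of the sub-make calls:

```shell
#!/bin/sh
# Sketch only: a toy Makefile, not the one shipped with ncurses.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Two independent double-colon rules that both name install.libs.
printf 'install.libs install.data ::\n\t@echo "misc: $@"\n' > Makefile
printf 'install.libs ::\n\t@echo "ncurses: $@"\n' >> Makefile

# Both recipes run for the one goal.
first=$(make -s install.libs)

# A file with the target's name does not stop them: neither rule has
# prerequisites, so the recipes run again.
touch install.libs
second=$(make -s install.libs)

printf 'first run:\n%s\n' "$first"
printf 'second run:\n%s\n' "$second"
```

Both runs should print one line per rule, which is why a single
"make install.libs" ends up descending into several subdirectories.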


3) The install.libs in ncurses-5.9/narrowc/ncurses/Makefile:

../lib/libtinfo.so.$(REL_VERSION) : \
                 ../lib \
                 $(SHARED_T_OBJS)
         @echo linking $@
         $(MK_SHARED_LIB) $(SHARED_T_OBJS) $(TINFO_LIST) $(LDFLAGS)
         cd ../lib && ($(LN_S) libtinfo.so.$(REL_VERSION) libtinfo.so.$(ABI_VERSION); $(LN_S) libtinfo.so.$(ABI_VERSION) libtinfo.so; )


The ../lib/libtinfo.so was already generated in the compile stage, but it is
regenerated because its prerequisite (../lib) has changed.
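
A directory prerequisite can make an up-to-date library look stale because
make compares modification times, and a directory's mtime moves whenever an
entry is created or removed in it. A small sketch with hypothetical file
names (libdemo.so stands in for libtinfo.so; what actually touches ../lib
during the install is not stated in this thread):

```shell
#!/bin/sh
# Sketch only: the file names are made up for illustration.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir lib
touch lib/libdemo.so      # the "library" built in the compile stage
sleep 1                   # keep the two timestamps clearly apart
touch lib/another-file    # any new entry updates the mtime of lib/

# -nt ("newer than") is not in POSIX test, but common shells support it.
if [ lib -nt lib/libdemo.so ]; then
    dir_is_newer=yes
else
    dir_is_newer=no
fi
echo "directory newer than library: $dir_is_newer"
```

A rule that lists lib as a normal prerequisite of lib/libdemo.so would treat
the library as out of date at this point and relink it.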

The easiest way to fix this is to run install.libs first and then
install.data. Here is the patch:


Another solution is to modify the Makefiles, but that is not as simple as
modifying ncurses.inc.
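
The effect of splitting the goals across two make invocations can be sketched
with a toy Makefile. This assumes GNU make on the PATH; the targets libs and
data and the LOG variable are hypothetical stand-ins for install.libs,
install.data and their side effects:

```shell
#!/bin/sh
# Sketch only: a toy Makefile, not the ncurses one.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# "libs" is slow, "data" is quick, and neither depends on the other.
printf 'libs:\n\t@echo "libs start" >> $(LOG); sleep 1; echo "libs end" >> $(LOG)\n' > Makefile
printf 'data:\n\t@echo "data start" >> $(LOG)\n' >> Makefile

# One invocation, both goals: "data" is free to start while "libs" runs.
make -s -j2 LOG=parallel.log libs data

# Two invocations: "data" cannot start until "libs" has finished.
make -s -j2 LOG=serial.log libs
make -s -j2 LOG=serial.log data

# Line number of an exact line in a log file.
line_of() { grep -n -x "$1" "$2" | cut -d: -f1; }

echo "parallel: data start on line $(line_of 'data start' parallel.log)," \
     "libs end on line $(line_of 'libs end' parallel.log)"
echo "serial:   data start on line $(line_of 'data start' serial.log)," \
     "libs end on line $(line_of 'libs end' serial.log)"
```

In the parallel log "data start" lands before "libs end"; in the serial log
it can only come after it. The patch applies the same idea by moving
install.data into its own oe_runmake call.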

Xiaofeng will send the V2 after enough testing.

// Robert

On 05/18/2012 03:33 PM, Xiaofeng Yan wrote:
> On 2012-05-17 20:02, Jason Wessel wrote:
>> On 05/16/2012 09:01 PM, Xiaofeng Yan wrote:
>>> On 2012-05-16 19:02, Saul Wold wrote:
>>>> On 05/16/2012 01:10 PM, xiaofeng.yan@windriver.com wrote:
>>>>> From: Xiaofeng Yan<xiaofeng.yan@windriver.com>
>>>>>
>>>>> The ncurses non-gplv3 build fails because of a race issue, so disable
>>>>> parallel make when building this package.
>>>>>
>>>> This is not the best approach as you disable PARALLEL_MAKE for both
>>>> non-gplv3 and gplv3 versions. Further, we want to get rid of [M1]
>>>> setting as much as possible, so this patch is not helping that.
>>>>
>>>> Did you try running on a large many core machine? It might help if you
>>>> have some other builds going also to stress the machine.
>>>>
>>>> Sau!
>>> Thanks for your reply. The most cores I have is eight. I also set
>>> PARALLEL_MAKE=j1000 and 10000. I think I need to find a new way to
>>> fix this bug.
>>>
>> Do you have an error file from a failed build (and ideally the failed build
>> directory)? Having diagnosed many problems like this in the past, it is
>> easiest to look for the failure case and add a sleep statement in the
>> Makefile to get it to trigger every time in the same way.
> Hi Jason,
> The failed build information is in *Bug 2298*
> <https://bugzilla.yoctoproject.org/show_bug.cgi?id=2298>. The error appears
> in the install stage, not in configure or compile.
> Do you have any ideas after reading the bug information?
>
> Thanks
> Yan
>
>
>
>> The two most common problems are:
>> 1) autoconf re-runs due to time stamps or partially patched files
>> 2) a generated file is reported as missing
>>
>> In the first case, it will often be some error about a missing .h or some
>> other strange error about a header during compilation, and it is the result
>> of only having a partial file because it is being regenerated at the time.
>>
>> In the second case you just find the file's rule in the Makefile and, if it
>> is a multi-object rule, add an if statement to the target's recipe that
>> looks for the problem object and sleeps a bit. I have yet to see a case I
>> couldn't reproduce by following the strategy of forcing some extra delay.
>> You probably won't have to go to this length, but there was one time I even
>> wrote a C wrapper around a command to add some sleep controlled by an
>> environment variable to prove config.h was getting removed and regenerated.
>> Example:
>>
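>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <unistd.h>
>>
>> int main(int argc, char *const argv[]) {
>>     char *lookfor;
>>     if (argc >= 2) {
>>         /* First argument to watch for, taken from the environment. */
>>         lookfor = getenv("LOOKFORSLEEP");
>>         if (lookfor && strcmp(argv[1], lookfor) == 0) {
>>             if (argc >= 3 && strcmp(argv[2], "config.h") == 0) {
>>                 /* Remove config.h and stall to widen the race window. */
>>                 unlink("config.h");
>>                 printf("Special sleep on command %s\n", lookfor);
>>                 fflush(stdout); /* buffered output is lost across execv */
>>                 sleep(2);
>>             }
>>         }
>>     }
>>     /* Hand the unchanged argument vector to /bin/sh. */
>>     execv("/bin/sh", argv);
>>     perror("execv"); /* only reached if execv fails */
>>     return 1;
>> }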
>>
>>
>> Best of luck,
>> Jason.
>>
>

Patch

diff --git a/meta/recipes-core/ncurses/ncurses.inc b/meta/recipes-core/ncurses/ncurses.inc
index ae99e2c..6309b69 100644
--- a/meta/recipes-core/ncurses/ncurses.inc
+++ b/meta/recipes-core/ncurses/ncurses.inc
@@ -122,8 +122,17 @@  shell_do_install() {
          # Order of installation is important; widec installs a 'curses.h'
          # header with more definitions and must be installed last hence.
          # Compatibility of these headers will be checked in 'do_test()'.
+
+        # The install.data should run after install.libs, otherwise
+        # there would be a race issue in a very critical condition, since
+        # tic will be run by install.data, and tic needs libtinfo.so
+        # which would be regenerated by install.libs.
          oe_runmake -C narrowc ${_install_opts} \
-                install.data install.progs
+                install.progs
+
+        oe_runmake -C narrowc DESTDIR='${D}'  \
+                PKG_CONFIG_LIBDIR='${libdir}/pkgconfig' \
+                install.data

          ! ${ENABLE_WIDEC} || \
              oe_runmake -C widec ${_install_opts}