| Submitter | Richard Purdie |
|---|---|
| Date | April 26, 2012, 7:54 p.m. |
| Message ID | <1335470063.20130.32.camel@ted> |
| Download | mbox | patch |
| Permalink | /patch/26469/ |
| State | New |
| Headers | show |
Comments
On 4/26/12 2:54 PM, Richard Purdie wrote: > The existing shell script is a fork bomb and forks off hundreds of > grep/cur/wc calls as it reads from its input stream and iterates over > the file data table for each line of input. This patch replaces the > shell code with python code which doesn't exec anything and hence runs > much faster without the exec() overhead. This speeds up rpm packaging > considerably, as can be measured simply by timing it, or watching the > processor utilisation. Just an FYI, the intent was to replace this code completely with a patch to RPM when it got upgraded. Then RPM could read in the dependencies directly and not have to worry about hacky script solutions. Needless to say we're not yet at that point, so this is fine.. but hopefully we're getting closer. --Mark > Signed-off-by: Richard Purdie<richard.purdie@linuxfoundation.org> > --- > diff --git a/meta/classes/package_rpm.bbclass b/meta/classes/package_rpm.bbclass > index ffe3b31..d90976b 100644 > --- a/meta/classes/package_rpm.bbclass > +++ b/meta/classes/package_rpm.bbclass > @@ -1004,8 +1004,9 @@ python do_package_rpm () { > > # Construct per file dependencies file > def dump_filerdeps(varname, outfile, d): > - outfile.write("#!/bin/sh\n") > - outfile.write("\n# Dependency table\n") > + outfile.write("#!/usr/bin/env python\n\n") > + outfile.write("# Dependency table\n") > + outfile.write('deps = {\n') > for pkg in packages.split(): > dependsflist_key = 'FILE' + varname + 'FLIST' + "_" + pkg > dependsflist = (d.getVar(dependsflist_key, True) or "") > @@ -1018,7 +1019,7 @@ python do_package_rpm () { > file = file.replace("@tab@", "\t") > file = file.replace("@space@", " ") > file = file.replace("@at@", "@") > - outfile.write("#" + pkgd + file + "\t") > + outfile.write('"' + pkgd + file + '" : "') > for dep in depends_dict: > ver = depends_dict[dep] > if dep and ver: > @@ -1027,12 +1028,15 @@ python do_package_rpm () { > outfile.write(dep + " " + ver + " ") > else: > outfile.write(dep + " ") > - outfile.write("\n") > - outfile.write("\n\nwhile read file_name ; do\n") > - outfile.write("\tlength=$(echo \"#${file_name}\t\" | wc -c )\n") > - outfile.write("\tline=$(grep \"^#${file_name}\t\" $0 | cut -c ${length}- )\n") > - outfile.write("\tprintf \"%s\\n\" ${line}\n") > - outfile.write("done\n") > + outfile.write('",\n') > + outfile.write('}\n\n') > + outfile.write("import sys\n") > + outfile.write("while 1:\n") > + outfile.write("\tline = sys.stdin.readline().strip()\n") > + outfile.write("\tif not line:\n") > + outfile.write("\t\tsys.exit(0)\n") > + outfile.write("\tif line in deps:\n") > + outfile.write("\t\tprint(deps[line] + '\\n')\n") > > # OE-core dependencies a.k.a. RPM requires > outdepends = workdir + "/" + srcname + ".requires" > >
On Thu, 2012-04-26 at 15:34 -0500, Mark Hatle wrote: > On 4/26/12 2:54 PM, Richard Purdie wrote: > > The existing shell script is a fork bomb and forks off hundreds of > > grep/cur/wc calls as it reads from its input stream and iterates over > > the file data table for each line of input. This patch replaces the > > shell code with python code which doesn't exec anything and hence runs > > much faster without the exec() overhead. This speeds up rpm packaging > > considerably, as can be measured simply by timing it, or watching the > > processor utilisation. > > Just an FYI, the intent was to replace this code completely with a patch to RPM > when it got upgraded. Then RPM could read in the dependencies directly and not > have to worry about hacky script solutions. > > Needless to say we're not yet at that point, so this is fine.. but hopefully > we're getting closer. I agree with the objective. I've had the patch lying around for a while and in use in my builds so it was already written and doesn't seem to regress anything. It seems to make sense to include it for now until the rpm upgrade happens and we fix things properly. The patch also has a partner in crime: http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t2&id=c0cbf016f59b06f461961df4d6cdfcd0928c96ee but I suspect you'd dislike that one even if functionally it doesn't change much at the moment. Cheers, Richard
On 4/27/12 12:09 PM, Richard Purdie wrote: > On Thu, 2012-04-26 at 15:34 -0500, Mark Hatle wrote: >> On 4/26/12 2:54 PM, Richard Purdie wrote: >>> The existing shell script is a fork bomb and forks off hundreds of >>> grep/cur/wc calls as it reads from its input stream and iterates over >>> the file data table for each line of input. This patch replaces the >>> shell code with python code which doesn't exec anything and hence runs >>> much faster without the exec() overhead. This speeds up rpm packaging >>> considerably, as can be measured simply by timing it, or watching the >>> processor utilisation. >> >> Just an FYI, the intent was to replace this code completely with a patch to RPM >> when it got upgraded. Then RPM could read in the dependencies directly and not >> have to worry about hacky script solutions. >> >> Needless to say we're not yet at that point, so this is fine.. but hopefully >> we're getting closer. > > I agree with the objective. I've had the patch lying around for a while > and in use in my builds so it was already written and doesn't seem to > regress anything. It seems to make sense to include it for now until the > rpm upgrade happens and we fix things properly. > > The patch also has a partner in crime: > > http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t2&id=c0cbf016f59b06f461961df4d6cdfcd0928c96ee > > but I suspect you'd dislike that one even if functionally it doesn't > change much at the moment. Ya.. this other one sets a bad precedent, even though the functional behavior is the same currently. (There is one odd thing, the lack of the #file stuff in the first chunk, I'm surprised anything works since the dependency stuff is never translated back to on-disk filename format.) My longer term intention is to either write out the per-file data directly into the spec file so RPM can process it "per-file", or write the data to a data file that works along side the .spec file, but again it's only loaded once, instead of many times as it currently is. (The still TODO work of course is do something reasonable in ipkg and deb packages that add these missing per-file dependencies. I suspect we'll be limited in many ways because, AFAIK, deb and ipkg can't do dependencies on path names.) --Mark > Cheers, > > Richard >
Patch
diff --git a/meta/classes/package_rpm.bbclass b/meta/classes/package_rpm.bbclass index ffe3b31..d90976b 100644 --- a/meta/classes/package_rpm.bbclass +++ b/meta/classes/package_rpm.bbclass @@ -1004,8 +1004,9 @@ python do_package_rpm () { # Construct per file dependencies file def dump_filerdeps(varname, outfile, d): - outfile.write("#!/bin/sh\n") - outfile.write("\n# Dependency table\n") + outfile.write("#!/usr/bin/env python\n\n") + outfile.write("# Dependency table\n") + outfile.write('deps = {\n') for pkg in packages.split(): dependsflist_key = 'FILE' + varname + 'FLIST' + "_" + pkg dependsflist = (d.getVar(dependsflist_key, True) or "") @@ -1018,7 +1019,7 @@ python do_package_rpm () { file = file.replace("@tab@", "\t") file = file.replace("@space@", " ") file = file.replace("@at@", "@") - outfile.write("#" + pkgd + file + "\t") + outfile.write('"' + pkgd + file + '" : "') for dep in depends_dict: ver = depends_dict[dep] if dep and ver: @@ -1027,12 +1028,15 @@ python do_package_rpm () { outfile.write(dep + " " + ver + " ") else: outfile.write(dep + " ") - outfile.write("\n") - outfile.write("\n\nwhile read file_name ; do\n") - outfile.write("\tlength=$(echo \"#${file_name}\t\" | wc -c )\n") - outfile.write("\tline=$(grep \"^#${file_name}\t\" $0 | cut -c ${length}- )\n") - outfile.write("\tprintf \"%s\\n\" ${line}\n") - outfile.write("done\n") + outfile.write('",\n') + outfile.write('}\n\n') + outfile.write("import sys\n") + outfile.write("while 1:\n") + outfile.write("\tline = sys.stdin.readline().strip()\n") + outfile.write("\tif not line:\n") + outfile.write("\t\tsys.exit(0)\n") + outfile.write("\tif line in deps:\n") + outfile.write("\t\tprint(deps[line] + '\\n')\n") # OE-core dependencies a.k.a. RPM requires outdepends = workdir + "/" + srcname + ".requires"
The existing shell script is a fork bomb and forks off hundreds of grep/cur/wc calls as it reads from its input stream and iterates over the file data table for each line of input. This patch replaces the shell code with python code which doesn't exec anything and hence runs much faster without the exec() overhead. This speeds up rpm packaging considerably, as can be measured simply by timing it, or watching the processor utilisation. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> ---