Patchwork package_rpm.bbclass: Replace shell provides/requires script with python version

login
register
mail settings
Submitter Richard Purdie
Date April 26, 2012, 7:54 p.m.
Message ID <1335470063.20130.32.camel@ted>
Download mbox | patch
Permalink /patch/26469/
State New
Headers show

Comments

Richard Purdie - April 26, 2012, 7:54 p.m.
The existing shell script is a fork bomb and forks off hundreds of
grep/cur/wc calls as it reads from its input stream and iterates over
the file data table for each line of input. This patch replaces the
shell code with python code which doesn't exec anything and hence runs
much faster without the exec() overhead. This speeds up rpm packaging
considerably, as can be measured simply by timing it, or watching the
processor utilisation.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
Mark Hatle - April 26, 2012, 8:34 p.m.
On 4/26/12 2:54 PM, Richard Purdie wrote:
> The existing shell script is a fork bomb and forks off hundreds of
> grep/cur/wc calls as it reads from its input stream and iterates over
> the file data table for each line of input. This patch replaces the
> shell code with python code which doesn't exec anything and hence runs
> much faster without the exec() overhead. This speeds up rpm packaging
> considerably, as can be measured simply by timing it, or watching the
> processor utilisation.

Just an FYI, the intent was to replace this code completely with a patch to RPM 
when it got upgraded.  Then RPM could read in the dependencies directly and not 
have to worry about hacky script solutions.

Needless to say we're not yet at that point, so this is fine..  but hopefully 
we're getting closer.

--Mark

> Signed-off-by: Richard Purdie<richard.purdie@linuxfoundation.org>
> ---
> diff --git a/meta/classes/package_rpm.bbclass b/meta/classes/package_rpm.bbclass
> index ffe3b31..d90976b 100644
> --- a/meta/classes/package_rpm.bbclass
> +++ b/meta/classes/package_rpm.bbclass
> @@ -1004,8 +1004,9 @@ python do_package_rpm () {
>
>   	# Construct per file dependencies file
>   	def dump_filerdeps(varname, outfile, d):
> -		outfile.write("#!/bin/sh\n")
> -		outfile.write("\n# Dependency table\n")
> +		outfile.write("#!/usr/bin/env python\n\n")
> +		outfile.write("# Dependency table\n")
> +		outfile.write('deps = {\n')
>   		for pkg in packages.split():
>   			dependsflist_key = 'FILE' + varname + 'FLIST' + "_" + pkg
>   			dependsflist = (d.getVar(dependsflist_key, True) or "")
> @@ -1018,7 +1019,7 @@ python do_package_rpm () {
>   				file = file.replace("@tab@", "\t")
>   				file = file.replace("@space@", " ")
>   				file = file.replace("@at@", "@")
> -				outfile.write("#" + pkgd + file + "\t")
> +				outfile.write('"' + pkgd + file + '" : "')
>   				for dep in depends_dict:
>   					ver = depends_dict[dep]
>   					if dep and ver:
> @@ -1027,12 +1028,15 @@ python do_package_rpm () {
>   						outfile.write(dep + " " + ver + " ")
>   					else:
>   						outfile.write(dep + " ")
> -				outfile.write("\n")
> -		outfile.write("\n\nwhile read file_name ; do\n")
> -		outfile.write("\tlength=$(echo \"#${file_name}\t\" | wc -c )\n")
> -		outfile.write("\tline=$(grep \"^#${file_name}\t\" $0 | cut -c ${length}- )\n")
> -		outfile.write("\tprintf \"%s\\n\" ${line}\n")
> -		outfile.write("done\n")
> +				outfile.write('",\n')
> +		outfile.write('}\n\n')
> +		outfile.write("import sys\n")
> +		outfile.write("while 1:\n")
> +		outfile.write("\tline = sys.stdin.readline().strip()\n")
> +		outfile.write("\tif not line:\n")
> +		outfile.write("\t\tsys.exit(0)\n")
> +		outfile.write("\tif line in deps:\n")
> +		outfile.write("\t\tprint(deps[line] + '\\n')\n")
>
>   	# OE-core dependencies a.k.a. RPM requires
>   	outdepends = workdir + "/" + srcname + ".requires"
>
>
Richard Purdie - April 27, 2012, 5:09 p.m.
On Thu, 2012-04-26 at 15:34 -0500, Mark Hatle wrote:
> On 4/26/12 2:54 PM, Richard Purdie wrote:
> > The existing shell script is a fork bomb and forks off hundreds of
> > grep/cur/wc calls as it reads from its input stream and iterates over
> > the file data table for each line of input. This patch replaces the
> > shell code with python code which doesn't exec anything and hence runs
> > much faster without the exec() overhead. This speeds up rpm packaging
> > considerably, as can be measured simply by timing it, or watching the
> > processor utilisation.
> 
> Just an FYI, the intent was to replace this code completely with a patch to RPM 
> when it got upgraded.  Then RPM could read in the dependencies directly and not 
> have to worry about hacky script solutions.
> 
> Needless to say we're not yet at that point, so this is fine..  but hopefully 
> we're getting closer.

I agree with the objective. I've had the patch lying around for a while
and in use in my builds so it was already written and doesn't seem to
regress anything. It seems to make sense to include it for now until the
rpm upgrade happens and we fix things properly.

The patch also has a partner in crime:

http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t2&id=c0cbf016f59b06f461961df4d6cdfcd0928c96ee

but I suspect you'd dislike that one even if functionally it doesn't
change much at the moment.

Cheers,

Richard
Mark Hatle - April 27, 2012, 5:37 p.m.
On 4/27/12 12:09 PM, Richard Purdie wrote:
> On Thu, 2012-04-26 at 15:34 -0500, Mark Hatle wrote:
>> On 4/26/12 2:54 PM, Richard Purdie wrote:
>>> The existing shell script is a fork bomb and forks off hundreds of
>>> grep/cur/wc calls as it reads from its input stream and iterates over
>>> the file data table for each line of input. This patch replaces the
>>> shell code with python code which doesn't exec anything and hence runs
>>> much faster without the exec() overhead. This speeds up rpm packaging
>>> considerably, as can be measured simply by timing it, or watching the
>>> processor utilisation.
>>
>> Just an FYI, the intent was to replace this code completely with a patch to RPM
>> when it got upgraded.  Then RPM could read in the dependencies directly and not
>> have to worry about hacky script solutions.
>>
>> Needless to say we're not yet at that point, so this is fine..  but hopefully
>> we're getting closer.
>
> I agree with the objective. I've had the patch lying around for a while
> and in use in my builds so it was already written and doesn't seem to
> regress anything. It seems to make sense to include it for now until the
> rpm upgrade happens and we fix things properly.
>
> The patch also has a partner in crime:
>
> http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t2&id=c0cbf016f59b06f461961df4d6cdfcd0928c96ee
>
> but I suspect you'd dislike that one even if functionally it doesn't
> change much at the moment.

Ya..  this other one sets a bad precedent, even though the functional behavior 
is the same currently.  (There is one odd thing, the lack of the #file stuff in 
the first chunk, I'm surprised anything works since the dependency stuff is 
never translated back to on-disk filename format.)

My longer term intention is to either write out the per-file data directly into 
the spec file so RPM can process it "per-file", or write the data to a data file 
that works along side the .spec file, but again it's only loaded once, instead 
of many times as it currently is.

(The still TODO work of course is do something reasonable in ipkg and deb 
packages that add these missing per-file dependencies.  I suspect we'll be 
limited in many ways because, AFAIK, deb and ipkg can't do dependencies on path 
names.)

--Mark

> Cheers,
>
> Richard
>

Patch

diff --git a/meta/classes/package_rpm.bbclass b/meta/classes/package_rpm.bbclass
index ffe3b31..d90976b 100644
--- a/meta/classes/package_rpm.bbclass
+++ b/meta/classes/package_rpm.bbclass
@@ -1004,8 +1004,9 @@  python do_package_rpm () {
 
 	# Construct per file dependencies file
 	def dump_filerdeps(varname, outfile, d):
-		outfile.write("#!/bin/sh\n")
-		outfile.write("\n# Dependency table\n")
+		outfile.write("#!/usr/bin/env python\n\n")
+		outfile.write("# Dependency table\n")
+		outfile.write('deps = {\n')
 		for pkg in packages.split():
 			dependsflist_key = 'FILE' + varname + 'FLIST' + "_" + pkg
 			dependsflist = (d.getVar(dependsflist_key, True) or "")
@@ -1018,7 +1019,7 @@  python do_package_rpm () {
 				file = file.replace("@tab@", "\t")
 				file = file.replace("@space@", " ")
 				file = file.replace("@at@", "@")
-				outfile.write("#" + pkgd + file + "\t")
+				outfile.write('"' + pkgd + file + '" : "')
 				for dep in depends_dict:
 					ver = depends_dict[dep]
 					if dep and ver:
@@ -1027,12 +1028,15 @@  python do_package_rpm () {
 						outfile.write(dep + " " + ver + " ")
 					else:
 						outfile.write(dep + " ")
-				outfile.write("\n")
-		outfile.write("\n\nwhile read file_name ; do\n")
-		outfile.write("\tlength=$(echo \"#${file_name}\t\" | wc -c )\n")
-		outfile.write("\tline=$(grep \"^#${file_name}\t\" $0 | cut -c ${length}- )\n")
-		outfile.write("\tprintf \"%s\\n\" ${line}\n")
-		outfile.write("done\n")
+				outfile.write('",\n')
+		outfile.write('}\n\n')
+		outfile.write("import sys\n")
+		outfile.write("while 1:\n")
+		outfile.write("\tline = sys.stdin.readline().strip()\n")
+		outfile.write("\tif not line:\n")
+		outfile.write("\t\tsys.exit(0)\n")
+		outfile.write("\tif line in deps:\n")
+		outfile.write("\t\tprint(deps[line] + '\\n')\n")
 
 	# OE-core dependencies a.k.a. RPM requires
 	outdepends = workdir + "/" + srcname + ".requires"