Patchwork [1/1] sstate.bbclass: Improve sstate_installpkg performance

Submitter Mark Hatle
Date May 10, 2012, 11:13 p.m.
Message ID <8c6c132129d83dca4e5b88a642126901c32a78de.1336691544.git.mark.hatle@windriver.com>
Permalink /patch/27455/
State Accepted
Commit d9f655753fbdc8cbd8e705577430fed4f23732b3

Comments

Mark Hatle - May 10, 2012, 11:13 p.m.
In a pathological case, with lots of files to process, the sstate_installpkg
performance was very poor.  It iterated over each file and ran 3
individual sed commands per file.  Changing this to keep iterating
but running only a single command took about 1/3 of the time.

However, when looking at the corresponding sstate_hardcode_path
function, it was clear we could optimize this further.

Using the same encoding logic to specify only the minimal sed
operation necessary, and using xargs to avoid the per-file os.system call,
the install step was able to be performed in 13% of the original time.

Example timing numbers for perl:

3m7s original code
1m20s single sed, but iterating
0m26s using xargs and limited sed

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 meta/classes/sstate.bbclass |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)
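
As a quick check of the figures above, the three timings can be converted to seconds and compared. This is only arithmetic on the numbers quoted in the message; the helper name `to_seconds` is made up for the illustration.

```python
import re


def to_seconds(stamp: str) -> int:
    """Convert a string such as '3m7s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", stamp)
    if match is None:
        raise ValueError("unexpected timing format: %r" % stamp)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


# Timings for perl as quoted in the commit message.
timings = {
    "original code": "3m7s",
    "single sed, but iterating": "1m20s",
    "using xargs and limited sed": "0m26s",
}

baseline = to_seconds(timings["original code"])
ratios = {label: to_seconds(stamp) / baseline for label, stamp in timings.items()}

for label, ratio in ratios.items():
    print("%-30s %5.1f%% of original" % (label, ratio * 100))
```

On these numbers the single-sed variant runs in roughly 43% of the original time and the xargs variant in roughly 14%, which is about a sevenfold speedup.
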
Peter Seebach - May 11, 2012, 12:03 a.m.
On Thu, 10 May 2012 18:13:38 -0500
Mark Hatle <mark.hatle@windriver.com> wrote:

> +	sstate_hardcode_cmd = "sed -e 's:^:%s:g' %s | xargs %s" %
> (sstateinst, fixmefn, sstate_sed_cmd)

How confident are we that the file names can never have whitespace in
them?

-s
Richard Purdie - May 11, 2012, 12:17 a.m.
On Thu, 2012-05-10 at 18:13 -0500, Mark Hatle wrote:
> In a pathological case, with lots of files to process, the sstate_installpkg
> performance was very poor.  It iterated over each file and ran 3
> individual sed commands per file.  Changing this to keep iterating
> but running only a single command took about 1/3 of the time.
> 
> However, when looking at the corresponding sstate_hardcode_path
> function, it was clear we could optimize this further.
> 
> Using the same encoding logic to specify only the minimal sed
> operation necessary, and using xargs to avoid the per-file os.system call,
> the install step was able to be performed in 13% of the original time.
> 
> Example timing numbers for perl:
> 
> 3m7s original code
> 1m20s single sed, but iterating
> 0m26s using xargs and limited sed
> 
> Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
> ---
>  meta/classes/sstate.bbclass |   28 +++++++++++++++++++++-------
>  1 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> index a8c98e5..ad7d121 100644
> --- a/meta/classes/sstate.bbclass
> +++ b/meta/classes/sstate.bbclass
> @@ -174,18 +174,29 @@ def sstate_installpkg(ss, d):
>      bb.build.exec_func('sstate_unpack_package', d)
>  
>      # Fixup hardcoded paths
> +    #
> +    # Note: The logic below must match the reverse logic in
> +    # sstate_hardcode_path(d)
> +
>      fixmefn =  sstateinst + "fixmepath"
>      if os.path.isfile(fixmefn):
>          staging = d.getVar('STAGING_DIR', True)
>          staging_target = d.getVar('STAGING_DIR_TARGET', True)
>          staging_host = d.getVar('STAGING_DIR_HOST', True)
> -        fixmefd = open(fixmefn, "r")
> -        fixmefiles = fixmefd.readlines()
> -        fixmefd.close()
> -        for file in fixmefiles:
> -            os.system("sed -i -e s:FIXMESTAGINGDIRTARGET:%s:g %s" % (staging_target, sstateinst + file))
> -            os.system("sed -i -e s:FIXMESTAGINGDIRHOST:%s:g %s" % (staging_host, sstateinst + file))
> -            os.system("sed -i -e s:FIXMESTAGINGDIR:%s:g %s" % (staging, sstateinst + file))
> +
> +	if bb.data.inherits_class('native', d) or bb.data.inherits_class('nativesdk', d) or bb.data.inherits_class('crosssdk', d) or bb.data.inherits_class('cross-canadian', d):
> +		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIR:%s:g'" % (staging)
> +	elif bb.data.inherits_class('cross', d):
> +		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIRTARGET:%s:g; s:FIXMESTAGINGDIR:%s:g'" % (staging_target, staging)
> +	else:
> +		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIRHOST:%s:g'" % (staging_host)
> +
> +	# Add sstateinst to each filename in fixmepath, use xargs to efficiently call sed
> +	sstate_hardcode_cmd = "sed -e 's:^:%s:g' %s | xargs %s" % (sstateinst, fixmefn, sstate_sed_cmd)
> +
> +	print "Replacing fixme paths in sstate package: %s" % (sstate_hardcode_cmd)
> +	os.system(sstate_hardcode_cmd)
> +
>          # Need to remove this or we'd copy it into the target directory and may 
>          # conflict with another writer
>          os.remove(fixmefn)
> @@ -300,6 +311,9 @@ python sstate_cleanall() {
>  def sstate_hardcode_path(d):
>  	# Need to remove hardcoded paths and fix these when we install the
>  	# staging packages.
> +	#
> +	# Note: the logic in this function needs to match the reverse logic
> +	# in sstate_installpkg(ss, d)
>  
>  	staging = d.getVar('STAGING_DIR', True)
>  	staging_target = d.getVar('STAGING_DIR_TARGET', True)

I was thinking this looked familiar, it was the other half of the
problem we originally solved:

http://git.yoctoproject.org/cgit.cgi/poky/commit/meta/classes/sstate.bbclass?id=db94ad4cf32d8ce3e97a8287d1c89a58d008a142

:)

Cheers,

Richard
Richard Purdie - May 11, 2012, 12:18 a.m.
On Thu, 2012-05-10 at 19:03 -0500, Peter Seebach wrote:
> On Thu, 10 May 2012 18:13:38 -0500
> Mark Hatle <mark.hatle@windriver.com> wrote:
> 
> > +	sstate_hardcode_cmd = "sed -e 's:^:%s:g' %s | xargs %s" %
> > (sstateinst, fixmefn, sstate_sed_cmd)
> 
> How confident are we that the file names can never have whitespace in
> them?

Fairly since there are 101 other places this would have broken first.

The day autotools thinks about supporting that, we might start to think
about it too. Until then there is little point in worrying about it.

Cheers,

Richard
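
For illustration only, the whitespace concern can be reproduced without running anything. By default `xargs` splits its input on blanks and newlines, so a listed name containing a space would reach `sed` as two arguments. The sketch below approximates that splitting with `str.split()` (real `xargs` also interprets quotes and backslashes) and contrasts it with keeping each line as one argument in a list, which is how `subprocess.run` receives arguments when no shell is involved. The function names, the `max_args` batching limit and the sample paths are hypothetical and are not part of the patch.

```python
import subprocess
from typing import Iterable, Iterator, List


def split_like_default_xargs(listing: str) -> List[str]:
    """Rough approximation of default xargs splitting: any whitespace separates arguments."""
    return listing.split()


def split_per_line(listing: str) -> List[str]:
    """Treat every non-empty line as exactly one file name, spaces included."""
    return [line for line in listing.splitlines() if line]


def batched_argv(prefix: List[str], files: Iterable[str], max_args: int = 1000) -> Iterator[List[str]]:
    """Yield argv lists of the form prefix + [file, ...], with at most max_args files each."""
    batch: List[str] = []
    for name in files:
        batch.append(name)
        if len(batch) == max_args:
            yield prefix + batch
            batch = []
    if batch:
        yield prefix + batch


def run_sed_in_batches(expression: str, files: Iterable[str]) -> None:
    """Run sed once per batch; no shell is involved, so names are never re-split."""
    for argv in batched_argv(["sed", "-i", "-e", expression], files):
        subprocess.run(argv, check=True)


# Hypothetical fixmepath content: one relative file name per line.
listing = "usr/lib/libfoo.la\nusr/share/my docs/readme.txt\n"
print(split_like_default_xargs(listing))  # the second name is split in two
print(split_per_line(listing))            # the second name stays intact
```

Nothing in the thread suggests the class needed this; as noted above, other parts of the build would fail on such names first.
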

Patch

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index a8c98e5..ad7d121 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -174,18 +174,29 @@  def sstate_installpkg(ss, d):
     bb.build.exec_func('sstate_unpack_package', d)
 
     # Fixup hardcoded paths
+    #
+    # Note: The logic below must match the reverse logic in
+    # sstate_hardcode_path(d)
+
     fixmefn =  sstateinst + "fixmepath"
     if os.path.isfile(fixmefn):
         staging = d.getVar('STAGING_DIR', True)
         staging_target = d.getVar('STAGING_DIR_TARGET', True)
         staging_host = d.getVar('STAGING_DIR_HOST', True)
-        fixmefd = open(fixmefn, "r")
-        fixmefiles = fixmefd.readlines()
-        fixmefd.close()
-        for file in fixmefiles:
-            os.system("sed -i -e s:FIXMESTAGINGDIRTARGET:%s:g %s" % (staging_target, sstateinst + file))
-            os.system("sed -i -e s:FIXMESTAGINGDIRHOST:%s:g %s" % (staging_host, sstateinst + file))
-            os.system("sed -i -e s:FIXMESTAGINGDIR:%s:g %s" % (staging, sstateinst + file))
+
+	if bb.data.inherits_class('native', d) or bb.data.inherits_class('nativesdk', d) or bb.data.inherits_class('crosssdk', d) or bb.data.inherits_class('cross-canadian', d):
+		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIR:%s:g'" % (staging)
+	elif bb.data.inherits_class('cross', d):
+		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIRTARGET:%s:g; s:FIXMESTAGINGDIR:%s:g'" % (staging_target, staging)
+	else:
+		sstate_sed_cmd = "sed -i -e 's:FIXMESTAGINGDIRHOST:%s:g'" % (staging_host)
+
+	# Add sstateinst to each filename in fixmepath, use xargs to efficiently call sed
+	sstate_hardcode_cmd = "sed -e 's:^:%s:g' %s | xargs %s" % (sstateinst, fixmefn, sstate_sed_cmd)
+
+	print "Replacing fixme paths in sstate package: %s" % (sstate_hardcode_cmd)
+	os.system(sstate_hardcode_cmd)
+
         # Need to remove this or we'd copy it into the target directory and may 
         # conflict with another writer
         os.remove(fixmefn)
@@ -300,6 +311,9 @@  python sstate_cleanall() {
 def sstate_hardcode_path(d):
 	# Need to remove hardcoded paths and fix these when we install the
 	# staging packages.
+	#
+	# Note: the logic in this function needs to match the reverse logic
+	# in sstate_installpkg(ss, d)
 
 	staging = d.getVar('STAGING_DIR', True)
 	staging_target = d.getVar('STAGING_DIR_TARGET', True)
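
The placeholder handling in the patch can be sketched in plain Python, which also shows why `FIXMESTAGINGDIRTARGET` is replaced before `FIXMESTAGINGDIR` in the `cross` case: the shorter token is a prefix of the longer one. This is a simplified model, not the bbclass code. The `kind` argument stands in for the `bb.data.inherits_class` checks, `str.replace` stands in for `sed`, and the function names and example paths are invented.

```python
from typing import List, Tuple

# Recipe kinds that only need the plain STAGING_DIR placeholder restored.
NATIVE_KINDS = ("native", "nativesdk", "crosssdk", "cross-canadian")


def substitutions(kind: str, staging: str, staging_target: str,
                  staging_host: str) -> List[Tuple[str, str]]:
    """Return the (placeholder, replacement) pairs for one recipe kind, in order."""
    if kind in NATIVE_KINDS:
        return [("FIXMESTAGINGDIR", staging)]
    if kind == "cross":
        # Longer token first: FIXMESTAGINGDIR is a prefix of FIXMESTAGINGDIRTARGET.
        return [("FIXMESTAGINGDIRTARGET", staging_target),
                ("FIXMESTAGINGDIR", staging)]
    return [("FIXMESTAGINGDIRHOST", staging_host)]


def sed_expression(pairs: List[Tuple[str, str]]) -> str:
    """Join the pairs into a single sed script using ':' as the delimiter."""
    return "; ".join("s:%s:%s:g" % pair for pair in pairs)


def apply_pairs(pairs: List[Tuple[str, str]], text: str) -> str:
    """Apply the pairs one after another, in the given order."""
    for placeholder, replacement in pairs:
        text = text.replace(placeholder, replacement)
    return text


pairs = substitutions("cross", "/build/sysroots",
                      "/build/sysroots/target", "/build/sysroots/host")
sample = "libdir=FIXMESTAGINGDIRTARGET/usr/lib"

print(sed_expression(pairs))
print(apply_pairs(pairs, sample))
# Reversing the order corrupts the longer placeholder:
print(apply_pairs(list(reversed(pairs)), sample))
```

With the order used in the patch the sample line becomes `libdir=/build/sysroots/target/usr/lib`; with the order reversed it becomes `libdir=/build/sysrootsTARGET/usr/lib`, because the shorter placeholder consumes the front of the longer one.
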