| Submitter | Enrico Scholz |
|---|---|
| Date | Feb. 10, 2013, 12:41 p.m. |
| Message ID | <69be0b1aae50eb097172bcfa2e9aa02157f62436.1360499939.git.enrico.scholz@sigma-chemnitz.de> |
| Download | mbox | patch |
| Permalink | /patch/44419/ |
| State | Accepted |
| Commit | ec2aab09769f4b6817d74d2175afa2b7c7598750 |
| Headers | show |
Comments
On Sun, 2013-02-10 at 13:41 +0100, Enrico Scholz wrote: > oe.path.realpath() provides are common and more correct implementation > for resolving symlinks within sysroot. Use it. > > Old implementation suffered from lot of problems; e.g. > > * redundant code > > * calls 'os.stat()' which references files on host; this can give wrong > results about existing/non-existing and can cause EPERM (instead of > the catched ENONENT) exceptions > > * does not deal with special cases like '..' leaving the sysroot. Whilst these changes are good, they do come at a cost: post symlink package changes ./perfscript -c 8c22531e491e6b0cfffaaa80d6bc75db757fc1d1 49:38.46,17:12.15 pre symlink package changes ./perfscript -c 1a80329b3fcf23ecc23e409a260b9b2182652f65 48:16.33,13:39.97 So it added 1m20 to the overall build time, but more worryingly, added added nearly 3m30 to the time for: bitbake virtual/kernel -c cleansstate; bitbake virtual/kernel These tests are based on the script linked from https://wiki.yoctoproject.org/wiki/Performance_Test where the kernel test is this is the second number above list, overall build time is the first. Have you any time to look into this performance regression? FWIW, "bitbake virtual/kernel -c package -P -f" shows that package_fixsymlinks() takes a rather long time in the profile output. Cheers, Richard
Richard Purdie <richard.purdie@linuxfoundation.org> writes: >> Old implementation suffered from lot of problems; e.g. >> >> * redundant code >> >> * calls 'os.stat()' which references files on host; this can give wrong >> results about existing/non-existing and can cause EPERM (instead of >> the catched ENONENT) exceptions >> >> * does not deal with special cases like '..' leaving the sysroot. > > Whilst these changes are good, they do come at a cost: > > post symlink package changes > ./perfscript -c 8c22531e491e6b0cfffaaa80d6bc75db757fc1d1 > 49:38.46,17:12.15 > > pre symlink package changes > ./perfscript -c 1a80329b3fcf23ecc23e409a260b9b2182652f65 > 48:16.33,13:39.97 > > So it added 1m20 to the overall build time, but more worryingly, added > added nearly 3m30 to the time for: > > bitbake virtual/kernel -c cleansstate; bitbake virtual/kernel > > These tests are based on the script linked from > https://wiki.yoctoproject.org/wiki/Performance_Test where the kernel > test is this is the second number above list, overall build time is the > first. > > Have you any time to look into this performance regression? Well, patch replaced a call to os.path.realpath() with a more accurate variant honoring a sysroot. There must be more work be done to replace things previously solved by the operating system with custom (python) code. I will try to write a C implementation of oe.path.realpath() but this can take some time and I do not have an idea how to integrate it into oe. Enrico
On Tue, 2013-03-12 at 11:25 +0100, Enrico Scholz wrote: > Richard Purdie <richard.purdie@linuxfoundation.org> writes: > > >> Old implementation suffered from lot of problems; e.g. > >> > >> * redundant code > >> > >> * calls 'os.stat()' which references files on host; this can give wrong > >> results about existing/non-existing and can cause EPERM (instead of > >> the catched ENONENT) exceptions > >> > >> * does not deal with special cases like '..' leaving the sysroot. > > > > Whilst these changes are good, they do come at a cost: > > > > post symlink package changes > > ./perfscript -c 8c22531e491e6b0cfffaaa80d6bc75db757fc1d1 > > 49:38.46,17:12.15 > > > > pre symlink package changes > > ./perfscript -c 1a80329b3fcf23ecc23e409a260b9b2182652f65 > > 48:16.33,13:39.97 > > > > So it added 1m20 to the overall build time, but more worryingly, added > > added nearly 3m30 to the time for: > > > > bitbake virtual/kernel -c cleansstate; bitbake virtual/kernel > > > > These tests are based on the script linked from > > https://wiki.yoctoproject.org/wiki/Performance_Test where the kernel > > test is this is the second number above list, overall build time is the > > first. > > > > Have you any time to look into this performance regression? > > Well, patch replaced a call to os.path.realpath() with a more accurate > variant honoring a sysroot. There must be more work be done to replace > things previously solved by the operating system with custom (python) > code. > > I will try to write a C implementation of oe.path.realpath() but this > can take some time and I do not have an idea how to integrate it into > oe. The issue is not an interpreted language verses C, the problem is the shear number of syscalls. Each isdir() and islink() call means a stat call into the kernel. For "bitbake virtual/kernel -c package", I'm seeing 400,000 stat calls for 23,000 files. We know the filesystem doesn't change so we could cache the stat calls and hopefully hence the speed. I actually have some code I tried this with but its not working as well as I'd like so I need to do some further investigation about what is happening. Cheers, Richard
Richard Purdie <richard.purdie@linuxfoundation.org> writes: >> > Have you any time to look into this performance regression? >> >> Well, patch replaced a call to os.path.realpath() with a more accurate >> variant honoring a sysroot. There must be more work be done to replace >> things previously solved by the operating system with custom (python) >> code. >> >> I will try to write a C implementation of oe.path.realpath() but this >> can take some time and I do not have an idea how to integrate it into >> oe. > > The issue is not an interpreted language verses C, the problem is the > shear number of syscalls. Each isdir() and islink() call means a stat > call into the kernel. With try: - is_dir = os.path.isdir(file) + if assume_dir: + is_dir = True + else: + is_dir = os.path.isdir(file) except: in meta/lib/oe/path.py you should be able to reduce calls to isdir(). A - rtarget = oe.path.realpath(path, inst_root, True, assume_dir = True) + rtarget = oe.path.realpath(path, inst_root, False, assume_dir = True) in package.bbclass seems to be save (pkgfiles[] is filled by os.walk() so that locations should be already physical) and might improve performance too. But I do not have numbers how much you can save by these changes. Enrico
Patch
diff --git a/meta/classes/package.bbclass b/meta/classes/package.bbclass index b10e6f6..a74ec8a 100644 --- a/meta/classes/package.bbclass +++ b/meta/classes/package.bbclass @@ -741,7 +741,8 @@ python split_and_strip_files () { continue try: - s = os.stat(file) + ltarget = oe.path.realpath(file, dvar, False) + s = os.lstat(ltarget) except OSError, (err, strerror): if err != errno.ENOENT: raise @@ -752,11 +753,6 @@ python split_and_strip_files () { # If it's a symlink, and points to an ELF file, we capture the readlink target if os.path.islink(file): target = os.readlink(file) - if not os.path.isabs(target): - ltarget = os.path.join(os.path.dirname(file), target) - else: - ltarget = target - if isELF(ltarget): #bb.note("Sym: %s (%d)" % (ltarget, isELF(ltarget))) symlinks[file] = target @@ -1005,14 +1001,12 @@ python package_fixsymlinks () { rpath = path[len(inst_root):] pkg_files[pkg].append(rpath) try: - s = os.stat(path) + rtarget = oe.path.realpath(path, inst_root, True) + os.lstat(rtarget) except OSError, (err, strerror): if err != errno.ENOENT: raise - target = os.readlink(path) - if target[0] != '/': - target = os.path.join(os.path.dirname(path)[len(inst_root):], target) - dangling_links[pkg].append(os.path.normpath(target)) + dangling_links[pkg].append(os.path.normpath(rtarget[len(inst_root):])) newrdepends = {} for pkg in dangling_links:
oe.path.realpath() provides are common and more correct implementation for resolving symlinks within sysroot. Use it. Old implementation suffered from lot of problems; e.g. * redundant code * calls 'os.stat()' which references files on host; this can give wrong results about existing/non-existing and can cause EPERM (instead of the catched ENONENT) exceptions * does not deal with special cases like '..' leaving the sysroot. Signed-off-by: Enrico Scholz <enrico.scholz@sigma-chemnitz.de> --- meta/classes/package.bbclass | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-)