Patchwork [1/1] pseudo: Use fxstatat64 in unlinkat

login
register
mail settings
Submitter Andrei Gherzan
Date Aug. 2, 2012, 8 p.m.
Message ID <fe3056c0364345c518f66266a7f92c1116763156.1343937363.git.andrei@gherzan.ro>
Download mbox | patch
Permalink /patch/33715/
State New
Headers show

Comments

Andrei Gherzan - Aug. 2, 2012, 8 p.m.
This is done to avoid "Value too large for defined data type" error while trying to
remove a file > 2GB.

[Yocto #2881]

Signed-off-by: Andrei Gherzan <andrei@gherzan.ro>
---
 .../pseudo/files/unlinkat-use-fxstatat64.patch     |   30 ++++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_1.4.bb         |    5 ++--
 2 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
Richard Purdie - Aug. 2, 2012, 8:08 p.m.
Adding seebs to the cc since I suspect he may prefer to merge this
upstream or he may have feedback?

Cheers,

Richard

On Thu, 2012-08-02 at 23:00 +0300, Andrei Gherzan wrote:
> This is done to avoid "Value too large for defined data type" error while trying to
> remove a file > 2GB.
> 
> [Yocto #2881]
> 
> Signed-off-by: Andrei Gherzan <andrei@gherzan.ro>
> ---
>  .../pseudo/files/unlinkat-use-fxstatat64.patch     |   30 ++++++++++++++++++++
>  meta/recipes-devtools/pseudo/pseudo_1.4.bb         |    5 ++--
>  2 files changed, 33 insertions(+), 2 deletions(-)
>  create mode 100644 meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
> 
> diff --git a/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch b/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
> new file mode 100644
> index 0000000..987c4c8
> --- /dev/null
> +++ b/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
> @@ -0,0 +1,30 @@
> +pseudo: Use fxstatat64 in unlinkat
> +
> +This is done to avoid "Value too large for defined data type" error while trying to
> +remove a file > 2GB.
> +
> +Signed-off-by: Andrei Gherzan <andrei@gherzan.ro>
> +Upstream-Status: Pending
> +
> +Index: pseudo-1.4/ports/unix/guts/unlinkat.c
> +===================================================================
> +--- pseudo-1.4.orig/ports/unix/guts/unlinkat.c	2012-07-27 23:30:37.000000000 +0300
> ++++ pseudo-1.4/ports/unix/guts/unlinkat.c	2012-08-02 22:43:26.826645748 +0300
> +@@ -8,7 +8,7 @@
> +  */
> + 	pseudo_msg_t *msg;
> + 	int save_errno;
> +-	struct stat buf;
> ++	struct stat64 buf;
> + 	int old_db_entry;
> + 
> + #ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
> +@@ -31,7 +31,7 @@
> + #ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
> + 	rc = real_lstat(path, &buf);
> + #else
> +-	rc = real___fxstatat(_STAT_VER, dirfd, path, &buf, AT_SYMLINK_NOFOLLOW);
> ++	rc = real___fxstatat64(_STAT_VER, dirfd, path, &buf, AT_SYMLINK_NOFOLLOW);
> + #endif
> + 	if (rc == -1) {
> + 		return rc;
> diff --git a/meta/recipes-devtools/pseudo/pseudo_1.4.bb b/meta/recipes-devtools/pseudo/pseudo_1.4.bb
> index e1e1f6f..4c21172 100644
> --- a/meta/recipes-devtools/pseudo/pseudo_1.4.bb
> +++ b/meta/recipes-devtools/pseudo/pseudo_1.4.bb
> @@ -1,8 +1,9 @@
>  require pseudo.inc
>  
> -PR = "r12"
> +PR = "r13"
>  
> -SRC_URI = "http://www.yoctoproject.org/downloads/${BPN}/${BPN}-${PV}.tar.bz2"
> +SRC_URI = "http://www.yoctoproject.org/downloads/${BPN}/${BPN}-${PV}.tar.bz2 \
> +           file://unlinkat-use-fxstatat64.patch"
>  
>  SRC_URI[md5sum] = "bc04c6c9d13cfdb789ffc2f3cca9ab08"
>  SRC_URI[sha256sum] = "147fa7b177061a145d330b9e159529a185be94550f123c6acb0d3b75d480c5b4"
Peter Seebach - Aug. 2, 2012, 9:35 p.m.
On Thu, 2 Aug 2012 21:08:09 +0100
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> Adding seebs to the cc since I suspect he may prefer to merge this
> upstream or he may have feedback?

Ooh, thanks. Yeah, we need to fix this in upstream, but the patch as
provided will likely cause Strange Experiences.

See, if you look at a bit more of unlinkat:

#ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
        rc = real_lstat(path, &buf);
#else  
        rc = real___fxstatat(_STAT_VER, dirfd, path, &buf,
                             AT_SYMLINK_NOFOLLOW);
#endif 
        if (rc == -1) {
                return rc;
        }
        msg = pseudo_client_op_plain(OP_MAY_UNLINK, 0, -1, dirfd, path,
                                     &buf);

So, what's pseudo_client_op_plain? Well. On systems that don't
distinguish between stat64 and stat, it's just a #define for
pseudo_client_op. But on Linux, at least:

#define PSEUDO_STATBUF_64 1
#define PSEUDO_STATBUF struct stat64

And then pseudo_client_op_plain does:

                pseudo_stat64_from32(&buf64, buf);
                return pseudo_client_op(op, access, fd, dirfd, path,
                                        &buf64, oldpath); 

So unless it happens that the 'struct stat' and 'struct stat64'
structures have the same layout, this is going to end badly.
We need a more comprehensive fix. But the unlinkat() code is in
the unix port, which tries not to rely on the knowledge that it
can rely on a stat64 existing. (It gets complicated, because on
Darwin, file sizes have always been 64-bit, but inode sizes
changed, but their solution does not involve making everyone refer
to stat64 objects...)

So I'm pretty sure we have to do something a little fancier here,
but I am not 100% sure what the right answer is. Of course, the
Darwin support, which was put in because of all the people telling
me that oe-core would run on OS X as soon as pseudo ran there, has
become a little less exciting of late. :)

Alternatives:
* Check for errno 75 and treat it as harmless.
* Ensure that pseudo is built with -D_FILE_OFFSET_BITS=64 on Linux.

I am not sure whether that latter would work when using pseudo with
other applications not built that way. Or, alternatively, whether
it would work to use pseudo NOT built that way with 32 bit
applications which were; I hadn't realized it was a setting.  :)

So I'm all for fixing this, but the proposed patch looks to me like
it might break things, just because I have low confidence that
treating a stat64 structure as a stat32 would be safe...

-s
Peter Seebach - Aug. 2, 2012, 10:48 p.m.
On Thu, 2 Aug 2012 23:00:44 +0300
Andrei Gherzan <andrei@gherzan.ro> wrote:

> This is done to avoid "Value too large for defined data type" error
> while trying to remove a file > 2GB.
> 
> [Yocto #2881]

I thought "hey, that should be easy" and looked at this for a bit. Then
looked more. Then ran tests. Then looked more. Then grovelled through
man pages. Then hit up a guy I know at Apple for a system internals
question. Then looked at it more.

This one is HARD. See, the underlying problem is that, back at the dawn
of time, someone thought 32 bits was a reasonable file size. We know
that to be wrong now, but what to do?

The BSDs just jumped to 64-bit off_t back around 4.4BSD, so all the
BSD-derived stuff is fine. Except, of course, that OS X now has a new
problem because they also need to convert to 64-bit *inodes*, which
creates a change in the shape of a structure. But even then, their
solution is basically that the system defaults to mapping stat() to the
new type, and struct stat to the new type, and if you want the old ones
you can mess with #defines.

And Linux basically does this, too, except that it's *not* the default.
And furthermore, it affects file size. So if you are building for
32-bit, and you haven't defined _FILE_OFFSET_BITS=64, 'struct stat' has
a 32-bit off_t, and stat() expects that.

And in fact, the way this really gets resolved is that, if you define
that, and you write stat(), your code ends up linked against stat64().

So the problem is: On systems where both stat and stat64 exist, we have
to make that distinction, and things which don't use stat64 have to
call a different wrapper (pseudo_client_op_plain). And on some other
systems, there's no such thing as stat64 in view at all, so we can't
refer to that portably.

In fact, pseudo never cares about the size of a file. And it appears
that stat(2) will in fact fill in the whole buffer, then return -1 with
errno set to EOVERFLOW. So we could in theory just discount an errno -1
accompanied by EOVERFLOW. Except there's no *specification* that stat
is returning a valid stat buf except a few members might be truncated
or corrupt.

So, looking again at:

#ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
        rc = real_lstat(path, &buf);
#else
        rc = real___fxstatat(_STAT_VER, dirfd, path, &buf,
                             AT_SYMLINK_NOFOLLOW);
#endif
        if (rc == -1) {
                return rc;
        }
        msg = pseudo_client_op_plain(OP_MAY_UNLINK, 0, -1, dirfd, path,
                                     &buf);

The issue here is that both need to use stat64, or neither can. And
that we don't really want to have stat64 references in the nominally
portable code.

I think probably the right solution is to use the PSEUDO_STATBUF logic
that's already in ports, and expand on it a bit. I'm messing with:

#if PSEUDO_STATBUF_64
extern pseudo_msg_t *pseudo_client_op_plain(pseudo_op_t op, int access,
int fd, int dirfd, const char *path, const struct stat *buf, ...);
#define base_lstat real_lstat64
#define base_fstat real_fstat64
#define base_stat real_stat64
#define base_fstatat(dirfd, path, buf, flags)
real___fxstatat64(_STAT_VER, dirfd, path, buf, flags)
#else
#define pseudo_client_op_plain pseudo_client_op
#define base_lstat real_lstat 
#define base_fstat real_fstat 
#define base_stat real_stat
#define base_fstatat(dirfd, path, buf, flags)
real___fxstatat(_STAT_VER, dirfd, path, buf, flags)
#endif

with the intent being that the unix/guts/*.c would switch to using
PSEUDO_STATBUF instead of struct stat, and then all the calls to
real_fstat => base_fstat; this allows us to just use pseudo_client_op
consistently for most of these. So far as I can tell, nothing in that
directory expects to pass a struct stat of any kind back to the user,
with the arguable exception of ftw() and nftw(), both of which are
only passing around pointers to functions which take pointers to struct
stat, so they don't care.

-s
Peter Seebach - Aug. 6, 2012, 6:44 p.m.
On Thu, 2 Aug 2012 17:48:18 -0500
Peter Seebach <peter.seebach@windriver.com> wrote:

> I think probably the right solution is to use the PSEUDO_STATBUF logic
> that's already in ports, and expand on it a bit.

A followup:

I've got a test version of this tagged PSEUDO_1_4_1. I've got one
report about a cryptic error from tar (see yocto #2881 for some
discussion), but would be really interested in feedback from anyone who
is using 32-bit systems with this stuff.

-s

Patch

diff --git a/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch b/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
new file mode 100644
index 0000000..987c4c8
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/unlinkat-use-fxstatat64.patch
@@ -0,0 +1,30 @@ 
+pseudo: Use fxstatat64 in unlinkat
+
+This is done to avoid "Value too large for defined data type" error while trying to
+remove a file > 2GB.
+
+Signed-off-by: Andrei Gherzan <andrei@gherzan.ro>
+Upstream-Status: Pending
+
+Index: pseudo-1.4/ports/unix/guts/unlinkat.c
+===================================================================
+--- pseudo-1.4.orig/ports/unix/guts/unlinkat.c	2012-07-27 23:30:37.000000000 +0300
++++ pseudo-1.4/ports/unix/guts/unlinkat.c	2012-08-02 22:43:26.826645748 +0300
+@@ -8,7 +8,7 @@
+  */
+ 	pseudo_msg_t *msg;
+ 	int save_errno;
+-	struct stat buf;
++	struct stat64 buf;
+ 	int old_db_entry;
+ 
+ #ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
+@@ -31,7 +31,7 @@
+ #ifdef PSEUDO_NO_REAL_AT_FUNCTIONS
+ 	rc = real_lstat(path, &buf);
+ #else
+-	rc = real___fxstatat(_STAT_VER, dirfd, path, &buf, AT_SYMLINK_NOFOLLOW);
++	rc = real___fxstatat64(_STAT_VER, dirfd, path, &buf, AT_SYMLINK_NOFOLLOW);
+ #endif
+ 	if (rc == -1) {
+ 		return rc;
diff --git a/meta/recipes-devtools/pseudo/pseudo_1.4.bb b/meta/recipes-devtools/pseudo/pseudo_1.4.bb
index e1e1f6f..4c21172 100644
--- a/meta/recipes-devtools/pseudo/pseudo_1.4.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_1.4.bb
@@ -1,8 +1,9 @@ 
 require pseudo.inc
 
-PR = "r12"
+PR = "r13"
 
-SRC_URI = "http://www.yoctoproject.org/downloads/${BPN}/${BPN}-${PV}.tar.bz2"
+SRC_URI = "http://www.yoctoproject.org/downloads/${BPN}/${BPN}-${PV}.tar.bz2 \
+           file://unlinkat-use-fxstatat64.patch"
 
 SRC_URI[md5sum] = "bc04c6c9d13cfdb789ffc2f3cca9ab08"
 SRC_URI[sha256sum] = "147fa7b177061a145d330b9e159529a185be94550f123c6acb0d3b75d480c5b4"