About pseudo's chmod

Submitted by Robert Yang on Aug. 1, 2016, 8:57 a.m. | Patch ID: 128297

Details

Message ID 579F0EE8.3080009@windriver.com
State New
Headers show

Commit Message

Robert Yang Aug. 1, 2016, 8:57 a.m.
On 08/01/2016 04:42 PM, Seebs wrote:
> On 1 Aug 2016, at 0:57, Robert Yang wrote:
>
>> Sorry, the steps were wrong, they should be:
>> 1) Out of pseudo:
>>    $ umask 0022
>>    $ touch file1
>> 2) Under pseudo:
>>    $ ln file1 file2
>>    $ chmod 777 file2
>>    $ ls -l file1 file2
>>    We can see that both file1 and file2's mode is 0777.
>>    But if we remove file2:
>>    $ rm -f file2
>>    $ ls file1
>>    Now file1's permission is 0755, not 0777 or 0644, it should be 0777 here.
>
> The short answer is: If you aren't tracking a file in pseudo, we don't make
> promises about its behavior. Normally, we don't touch them, but if there's hard
> links to them that are being touched, well. And having a hard link that is
> tracked, and another that isn't, is probably impossible to support. I definitely
> don't want to keep database entries for files that have been deleted, that way
> lies madness and possible database corruption; for instance, if we do that, and
> you make a new file of the same type, it'll show up as having those permissions,
> with only a path-mismatch warning in the database to suggest what went wrong.
>
> I would say that the probable correct answer is either "make the original file
> be tracked" or "don't use hard links in this case".

Hi Peter,

How about we track the src when hardlink, for example:

$ ln oldpath newpath

Track both oldpath and newpath. The patch for pseudo is:

// Robert

>
> Note that, under older pseudo, the file would have ended up 0777. The behavior
> of silently masking out 022 from underlying filesystem permissions is entirely
> intentional. During some debugging quite a while back, we discovered a quirk in
> oe-core, plus a strange behavior of
> GNU tar, which between them resulted in some sstage directories getting
> unpacked with mode 0777.
>
> And we *really* don't want the build system generating files which other users
> can modify, especially not in stuff that's intended to go into a root
> filesystem. So stripping out those bits in the underlying filesystem is
> intentional.
>
> If you were actually root, yes, the original file would have its mode changed to
> 0777. But we should never be *caring* about the mode bits on
> raw files outside of pseudo, except that we want them to allow owner
> write permissions and not allow group or other write permissions. If the
> file's permissions matter to the build system or generated stuff, the
> file should be tracked by pseudo.
>
> -s

Patch hide | download patch | download mbox

diff --git a/ports/unix/guts/linkat.c b/ports/unix/guts/linkat.c
index ec27e47..521b8fa 100644
--- a/ports/unix/guts/linkat.c
+++ b/ports/unix/guts/linkat.c
@@ -62,6 +62,10 @@ 
           * if the thing linked is a symlink.
           */
          pseudo_client_op(OP_LINK, 0, -1, -1, newpath, &buf);
+        /*
+         * Track oldpath in case it isn't tracked.
+         */
+        pseudo_client_op(OP_LINK, 0, -1, -1, oldpath, &buf);

          errno = save_errno;


Comments

Seebs Aug. 1, 2016, 6:17 p.m.
On 1 Aug 2016, at 3:57, Robert Yang wrote:

> How about we track the src when hardlink, for example:

I don't think I like this.

Fundamentally, the problem here is making a hard link to a file outside 
of pseudo, and then modifying the file.

Consider, if you will, what happens if you replace "chmod 0777 file2"
with:
	# echo "# last line" >> file2

The problem here isn't pseudo's tracking; it's that we're making a hard 
link to a file, modifying the hard link, deleting the hard link, and 
expecting the original file to have the same attributes it had before
this all happened.

-s
Richard Purdie Aug. 1, 2016, 8:01 p.m.
On Mon, 2016-08-01 at 13:17 -0500, Seebs wrote:
> On 1 Aug 2016, at 3:57, Robert Yang wrote:
> 
> > How about we track the src when hardlink, for example:
> 
> I don't think I like this.
> 
> Fundamentally, the problem here is making a hard link to a file
> outside 
> of pseudo, and then modifying the file.
> 
> Consider, if you will, what happens if you replace "chmod 0777 file2"
> with:
> 	# echo "# last line" >> file2
> 
> The problem here isn't pseudo's tracking; it's that we're making a
> hard 
> link to a file, modifying the hard link, deleting the hard link, and 
> expecting the original file to have the same attributes it had before
> this all happened.

No, we're actually expecting it to retain the mode it was given via the
hardlink under pseudo. 

This is what a real world system would do and in this case we could
track it via pseudo since pseudo is loaded when the hardlink is
created. It would be unreasonable for pseudo to track all hardlinks but
tracking ones created under it does seem reasonable?

Cheers,

Richard
Seebs Aug. 1, 2016, 8:17 p.m.
On 1 Aug 2016, at 15:01, Richard Purdie wrote:

> No, we're actually expecting it to retain the mode it was given via 
> the
> hardlink under pseudo.
>
> This is what a real world system would do and in this case we could
> track it via pseudo since pseudo is loaded when the hardlink is
> created. It would be unreasonable for pseudo to track all hardlinks 
> but
> tracking ones created under it does seem reasonable?

Hmm. Well, strictly speaking, the link created under pseudo *does* get 
tracked. Hmm. But an implicit request to track also the thing linked to 
is not a horrible idea. Although you'd still be able to beat it:

	$ touch file1
	$ ln file1 file2
	$ pseudo
	# ln file2 file3
	# chmod 777 file3
	# rm file2 file3
	# ls -l file1

The general case of "find everything this link also refers to" is 
clearly out of scope. That said... Hmm. I think my main feeling is, if 
we want
to link to the file, and we want the changes to the linked file to
survive, we should probably either create that file under pseudo, or 
explicitly claim it with pseudo when we start wanting to do the 
tracking.
(You can trivially do this to a tree with chown -R root tree).

-s
Richard Purdie Aug. 1, 2016, 10:55 p.m.
On Mon, 2016-08-01 at 15:17 -0500, Seebs wrote:
> On 1 Aug 2016, at 15:01, Richard Purdie wrote:
> 
> > No, we're actually expecting it to retain the mode it was given via
> > the
> > hardlink under pseudo.
> > 
> > This is what a real world system would do and in this case we could
> > track it via pseudo since pseudo is loaded when the hardlink is
> > created. It would be unreasonable for pseudo to track all hardlinks
> > but
> > tracking ones created under it does seem reasonable?
> 
> Hmm. Well, strictly speaking, the link created under pseudo *does*
> get 
> tracked. Hmm. But an implicit request to track also the thing linked
> to 
> is not a horrible idea. Although you'd still be able to beat it:
> 
> 	$ touch file1
> 	$ ln file1 file2
> 	$ pseudo
> 	# ln file2 file3
> 	# chmod 777 file3
> 	# rm file2 file3
> 	# ls -l file1
> 
> The general case of "find everything this link also refers to" is 
> clearly out of scope.

Agreed.

>  That said... Hmm. I think my main feeling is, if we want
> to link to the file, and we want the changes to the linked file to
> survive, we should probably either create that file under pseudo, or 
> explicitly claim it with pseudo when we start wanting to do the 
> tracking.
> (You can trivially do this to a tree with chown -R root tree).

The trouble is that for speed, we do create trees of hardlinked files
and play games with those and sstate amongst other things. Its
obviously faster to do this than make physical copies of the files.
Given what I know of the code paths, I suspect that tracking the source
of a hardlink would make life much easier for us. Obviously we can go
and start adding "chown -R" calls everywhere but that seems a little
ugly to me and doesn't do performance any favours.

Is there any significant downside if we do track the source of
hardlinks?

Cheers,

Richard
Mark Hatle Aug. 1, 2016, 11:36 p.m.
On 8/1/16 5:55 PM, Richard Purdie wrote:
> On Mon, 2016-08-01 at 15:17 -0500, Seebs wrote:
>> On 1 Aug 2016, at 15:01, Richard Purdie wrote:
>>
>>> No, we're actually expecting it to retain the mode it was given via
>>> the
>>> hardlink under pseudo.
>>>
>>> This is what a real world system would do and in this case we could
>>> track it via pseudo since pseudo is loaded when the hardlink is
>>> created. It would be unreasonable for pseudo to track all hardlinks
>>> but
>>> tracking ones created under it does seem reasonable?
>>
>> Hmm. Well, strictly speaking, the link created under pseudo *does*
>> get 
>> tracked. Hmm. But an implicit request to track also the thing linked
>> to 
>> is not a horrible idea. Although you'd still be able to beat it:
>>
>> 	$ touch file1
>> 	$ ln file1 file2
>> 	$ pseudo
>> 	# ln file2 file3
>> 	# chmod 777 file3
>> 	# rm file2 file3
>> 	# ls -l file1
>>
>> The general case of "find everything this link also refers to" is 
>> clearly out of scope.
> 
> Agreed.
> 
>>  That said... Hmm. I think my main feeling is, if we want
>> to link to the file, and we want the changes to the linked file to
>> survive, we should probably either create that file under pseudo, or 
>> explicitly claim it with pseudo when we start wanting to do the 
>> tracking.
>> (You can trivially do this to a tree with chown -R root tree).
> 
> The trouble is that for speed, we do create trees of hardlinked files
> and play games with those and sstate amongst other things. Its
> obviously faster to do this than make physical copies of the files.
> Given what I know of the code paths, I suspect that tracking the source
> of a hardlink would make life much easier for us. Obviously we can go
> and start adding "chown -R" calls everywhere but that seems a little
> ugly to me and doesn't do performance any favours.
> 
> Is there any significant downside if we do track the source of
> hardlinks?

Would it makes sense to track the xattrs and linked files and such using some
type of inode reference (virtual or otherwise)?

Since in the case of a hard link, on a normal Linux style filesystem, there will
be a single inode that has a reference count higher then 1.  Thus you can know
the modes, xattrs, etc for that inode.. then the file points to the inode with
reference counts.  (this might require a rework on internal structures.. but
also might solve the problem.)

--Mark

> Cheers,
> 
> Richard
> 
> 
> 
>
Robert Yang Aug. 2, 2016, 1:52 a.m.
On 08/02/2016 06:55 AM, Richard Purdie wrote:
> On Mon, 2016-08-01 at 15:17 -0500, Seebs wrote:
>> On 1 Aug 2016, at 15:01, Richard Purdie wrote:
>>
>>> No, we're actually expecting it to retain the mode it was given via
>>> the
>>> hardlink under pseudo.
>>>
>>> This is what a real world system would do and in this case we could
>>> track it via pseudo since pseudo is loaded when the hardlink is
>>> created. It would be unreasonable for pseudo to track all hardlinks
>>> but
>>> tracking ones created under it does seem reasonable?
>>
>> Hmm. Well, strictly speaking, the link created under pseudo *does*
>> get
>> tracked. Hmm. But an implicit request to track also the thing linked
>> to
>> is not a horrible idea. Although you'd still be able to beat it:
>>
>> 	$ touch file1
>> 	$ ln file1 file2
>> 	$ pseudo
>> 	# ln file2 file3
>> 	# chmod 777 file3
>> 	# rm file2 file3
>> 	# ls -l file1
>>
>> The general case of "find everything this link also refers to" is
>> clearly out of scope.
>
> Agreed.
>
>>   That said... Hmm. I think my main feeling is, if we want
>> to link to the file, and we want the changes to the linked file to
>> survive, we should probably either create that file under pseudo, or
>> explicitly claim it with pseudo when we start wanting to do the
>> tracking.
>> (You can trivially do this to a tree with chown -R root tree).
>
> The trouble is that for speed, we do create trees of hardlinked files
> and play games with those and sstate amongst other things. Its
> obviously faster to do this than make physical copies of the files.
> Given what I know of the code paths, I suspect that tracking the source
> of a hardlink would make life much easier for us. Obviously we can go
> and start adding "chown -R" calls everywhere but that seems a little
> ugly to me and doesn't do performance any favours.
>
> Is there any significant downside if we do track the source of
> hardlinks?

AFAIK, no.

And when remove file2, but file1's permission is changed, it should
be considered as a bug.

// Robert

>
> Cheers,
>
> Richard
>
>
>
>
Seebs Aug. 2, 2016, 3:37 a.m.
On 1 Aug 2016, at 17:55, Richard Purdie wrote:

> The trouble is that for speed, we do create trees of hardlinked files
> and play games with those and sstate amongst other things.

Yeah. One concern would be ownership: If the things we're hardlinking to
weren't created/tracked by pseudo initially, they're going to show up as 
owned by the actual user doing the build. So hardlinks to them will 
also.

So it may be more reasonable to just create those things under pseudo
also, in which case they're already tracked.

> Its
> obviously faster to do this than make physical copies of the files.
> Given what I know of the code paths, I suspect that tracking the 
> source
> of a hardlink would make life much easier for us. Obviously we can go
> and start adding "chown -R" calls everywhere but that seems a little
> ugly to me and doesn't do performance any favours.

> Is there any significant downside if we do track the source of
> hardlinks?

Hmm. I'm not sure. It'll make things more complicated, but only 
marginally. The net impact would be fewer surprises in common cases, but 
the uncommon cases would get *really* weird. But it seems reasonable.
(At least for the case where there isn't an existing entry.) I'll
poke at this a bit, I think.

-s
Seebs Aug. 2, 2016, 3:39 a.m.
On 1 Aug 2016, at 18:36, Mark Hatle wrote:

> Would it makes sense to track the xattrs and linked files and such 
> using some
> type of inode reference (virtual or otherwise)?
>
> Since in the case of a hard link, on a normal Linux style filesystem, 
> there will
> be a single inode that has a reference count higher then 1.  Thus you 
> can know
> the modes, xattrs, etc for that inode.. then the file points to the 
> inode with
> reference counts.  (this might require a rework on internal 
> structures.. but
> also might solve the problem.)

Well, that *is* how we track xattrs. And everything else, we do use 
device and inode, but, we maintain one row for each path, and delete the 
individual path rows.

This gets back to one of the original design goals, which was to avoid
all the horrific things that happen to fakeroot because it's only 
tracking
device/inode. We *want* the multiple path entries so we can report 
apparent database problems.

(Side note: There's some performance optimizations that have reduced 
stability/robustness, and I have plans involving making pseudo smarter
about those, and/or allowing configuration to prefer being more 
cautious.)

-s
Seebs Aug. 2, 2016, 3:43 a.m.
On 1 Aug 2016, at 20:52, Robert Yang wrote:

> And when remove file2, but file1's permission is changed, it should
> be considered as a bug.

I'm not sure of that. My interpretation would be that hard linking under 
pseudo to files which weren't created under the same pseudo database is
user error; that's not how the database is intended to work. That said, 
it's pretty trivial to add the things to it.

Although I'd like to know more about the use cases for these, because it
occurs to me that the qualifier "same pseudo database" points out 
another
possible failure mode: Would any of those files that are being linked to
be getting linked to from *more than one* pseudo database? Because if 
they were, that would be a thing I haven't been planning for and I don't 
know
whether it'd work sanely.

-s
Robert Yang Aug. 2, 2016, 6:07 a.m.
On 08/02/2016 11:43 AM, Seebs wrote:
> On 1 Aug 2016, at 20:52, Robert Yang wrote:
>
>> And when remove file2, but file1's permission is changed, it should
>> be considered as a bug.
>
> I'm not sure of that. My interpretation would be that hard linking under pseudo
> to files which weren't created under the same pseudo database is
> user error; that's not how the database is intended to work. That said, it's
> pretty trivial to add the things to it.
>
> Although I'd like to know more about the use cases for these, because it
> occurs to me that the qualifier "same pseudo database" points out another

Currently, the problem in oe-core is:

       1) bitbake gzip
       2) Edit rpm-native or package.bbclass to make do_package re-run.
       3) bitbake gzip
       After the first build, build/version.c in gzip-dbg is 0444, but after
       the second build, it will be 0644, this is because do_package does:
       $ ln ${B}/version.c gzip-dbg/version.c,
       $ chmod 0444 gzip-dbg/version.c (it runs chmod 0644 on the real filesystem)
       And in the second build, the gzip-dbg/version.c will be removed and
       created again, so that stat() can't get 0444 but 0644 since
       ${B}/version.c is not tracked by pseudo.

// Robert

> possible failure mode: Would any of those files that are being linked to
> be getting linked to from *more than one* pseudo database? Because if they were,
> that would be a thing I haven't been planning for and I don't know
> whether it'd work sanely.
>
> -s
Robert Yang Aug. 2, 2016, 6:08 a.m.
On 08/02/2016 02:07 PM, Robert Yang wrote:
>
>
> On 08/02/2016 11:43 AM, Seebs wrote:
>> On 1 Aug 2016, at 20:52, Robert Yang wrote:
>>
>>> And when remove file2, but file1's permission is changed, it should
>>> be considered as a bug.
>>
>> I'm not sure of that. My interpretation would be that hard linking under pseudo
>> to files which weren't created under the same pseudo database is
>> user error; that's not how the database is intended to work. That said, it's
>> pretty trivial to add the things to it.
>>
>> Although I'd like to know more about the use cases for these, because it
>> occurs to me that the qualifier "same pseudo database" points out another
>
> Currently, the problem in oe-core is:
>
>        1) bitbake gzip
>        2) Edit rpm-native or package.bbclass to make do_package re-run.
>        3) bitbake gzip
>        After the first build, build/version.c in gzip-dbg is 0444, but after
>        the second build, it will be 0644, this is because do_package does:
>        $ ln ${B}/version.c gzip-dbg/version.c,
>        $ chmod 0444 gzip-dbg/version.c (it runs chmod 0644 on the real filesystem)
>        And in the second build, the gzip-dbg/version.c will be removed and
>        created again, so that stat() can't get 0444 but 0644 since
>        ${B}/version.c is not tracked by pseudo.


And please see the first email in the thread for more details.

// Robert

>
> // Robert
>
>> possible failure mode: Would any of those files that are being linked to
>> be getting linked to from *more than one* pseudo database? Because if they were,
>> that would be a thing I haven't been planning for and I don't know
>> whether it'd work sanely.
>>
>> -s
Seebs Aug. 2, 2016, 6:30 a.m.
On 2 Aug 2016, at 1:07, Robert Yang wrote:

> Currently, the problem in oe-core is:
>
>       1) bitbake gzip
>       2) Edit rpm-native or package.bbclass to make do_package re-run.
>       3) bitbake gzip
>       After the first build, build/version.c in gzip-dbg is 0444, but 
> after
>       the second build, it will be 0644, this is because do_package 
> does:
>       $ ln ${B}/version.c gzip-dbg/version.c,
>       $ chmod 0444 gzip-dbg/version.c (it runs chmod 0644 on the real 
> filesystem)
>       And in the second build, the gzip-dbg/version.c will be removed 
> and
>       created again, so that stat() can't get 0444 but 0644 since
>       ${B}/version.c is not tracked by pseudo.

Hmm. I'm a bit confused by this. Wouldn't the second build also do a
"chmod 0444" on gzip-dbg/version.c? Why is it doing that chmod the first
time and not the second? If it does it the second time, the fact that
the underlying file's mode changed won't matter.

But in this case... While I'm fine with modifying things to track the
file linked-to, it still feels like this is a usage error. 
Fundamentally,
we're unpacking a file (${B}/version.c), then linking to it, changing
the link in some way, deleting the link, and expecting the original file
to be unchanged. That doesn't seem right to me.

-s
Robert Yang Aug. 2, 2016, 6:44 a.m.
On 08/02/2016 02:30 PM, Seebs wrote:
> On 2 Aug 2016, at 1:07, Robert Yang wrote:
>
>> Currently, the problem in oe-core is:
>>
>>       1) bitbake gzip
>>       2) Edit rpm-native or package.bbclass to make do_package re-run.
>>       3) bitbake gzip
>>       After the first build, build/version.c in gzip-dbg is 0444, but after
>>       the second build, it will be 0644, this is because do_package does:
>>       $ ln ${B}/version.c gzip-dbg/version.c,
>>       $ chmod 0444 gzip-dbg/version.c (it runs chmod 0644 on the real filesystem)
>>       And in the second build, the gzip-dbg/version.c will be removed and
>>       created again, so that stat() can't get 0444 but 0644 since
>>       ${B}/version.c is not tracked by pseudo.
>
> Hmm. I'm a bit confused by this. Wouldn't the second build also do a
> "chmod 0444" on gzip-dbg/version.c? Why is it doing that chmod the first

Because the stat() gets 0644 on ${B}/version.c in the second run, so it
would run chmod 0644 rather than 0444 on gzip-dbg/version.c. And why it
gets 0644 ? pseudo's chmod automatically adds "w" on the real file, so:
if -e gzip-dbg/version.c; then
	${B}/version.c = 0444
else
	${B}/version.c = 0644
fi

And in the second run, gzip-dbg/version.c is removed and will be recreated,
so that it got 0644.

> time and not the second? If it does it the second time, the fact that
> the underlying file's mode changed won't matter.
>
> But in this case... While I'm fine with modifying things to track the

Thanks, I will send a patch for it.

> file linked-to, it still feels like this is a usage error. Fundamentally,
> we're unpacking a file (${B}/version.c), then linking to it, changing
> the link in some way, deleting the link, and expecting the original file
> to be unchanged. That doesn't seem right to me.

But that is what the real filesystem works without pseudo:
$ touch file1
$ ln file1 file2
$ chmod 777 file2
$ rm -f file2

file1 will be 777 on the real filesystem.

// Robert

>
> -s
Seebs Aug. 2, 2016, 6:50 a.m.
On 2 Aug 2016, at 1:44, Robert Yang wrote:

> Because the stat() gets 0644 on ${B}/version.c in the second run, so 
> it
> would run chmod 0644 rather than 0444 on gzip-dbg/version.c.

Why is it calling chmod at all? The goal is apparently to give
gzip-dbg/version.c the same mode that ${B}/version.c has. Since it's a
hard link, no chmod call is needed at all.

>> time and not the second? If it does it the second time, the fact that
>> the underlying file's mode changed won't matter.
>>
>> But in this case... While I'm fine with modifying things to track the
>
> Thanks, I will send a patch for it.

I already have a patch for this.

>> file linked-to, it still feels like this is a usage error. 
>> Fundamentally,
>> we're unpacking a file (${B}/version.c), then linking to it, changing
>> the link in some way, deleting the link, and expecting the original 
>> file
>> to be unchanged. That doesn't seem right to me.

> But that is what the real filesystem works without pseudo:
> $ touch file1
> $ ln file1 file2
> $ chmod 777 file2
> $ rm -f file2
>
> file1 will be 777 on the real filesystem.

Yes. But it seems that the mode the file is being changed to is 
dependant on the original reported mode. And that's the part that I'm 
confused by;
we're relying on the mode of the original file, but we're also changing
the mode of a hard link to it. This makes no sense to me.

So, I have a likely fix handy. (The difference between mine and the
version you proposed earlier is that, as I recall, yours does the LINK
operation on the original file unconditionally, which is in error; it
should only be done when no existing data was found in the database.) 
I'll double-check that again and go through looking for loose ends, 
because
I have a vague feeling that I'm overlooking a thing, but I haven't 
figured out what it would be yet.

-s
Robert Yang Aug. 2, 2016, 8:32 a.m.
On 08/02/2016 02:50 PM, Seebs wrote:
> On 2 Aug 2016, at 1:44, Robert Yang wrote:
>
>> Because the stat() gets 0644 on ${B}/version.c in the second run, so it
>> would run chmod 0644 rather than 0444 on gzip-dbg/version.c.
>
> Why is it calling chmod at all? The goal is apparently to give
> gzip-dbg/version.c the same mode that ${B}/version.c has. Since it's a
> hard link, no chmod call is needed at all.

It is worth trying, I will try to remove os.chown() and os.chmod() from
package.bbclass to see what will happen:

             if not cpath.islink(file):
                 os.link(file, fpath)
                 fstat = cpath.stat(file)
                 os.chmod(fpath, fstat.st_mode)
                 os.chown(fpath, fstat.st_uid, fstat.st_gid)

>
>>> time and not the second? If it does it the second time, the fact that
>>> the underlying file's mode changed won't matter.
>>>
>>> But in this case... While I'm fine with modifying things to track the
>>
>> Thanks, I will send a patch for it.
>
> I already have a patch for this.
>
>>> file linked-to, it still feels like this is a usage error. Fundamentally,
>>> we're unpacking a file (${B}/version.c), then linking to it, changing
>>> the link in some way, deleting the link, and expecting the original file
>>> to be unchanged. That doesn't seem right to me.
>
>> But that is what the real filesystem works without pseudo:
>> $ touch file1
>> $ ln file1 file2
>> $ chmod 777 file2
>> $ rm -f file2
>>
>> file1 will be 777 on the real filesystem.
>
> Yes. But it seems that the mode the file is being changed to is dependant on the
> original reported mode. And that's the part that I'm confused by;
> we're relying on the mode of the original file, but we're also changing
> the mode of a hard link to it. This makes no sense to me.
>
> So, I have a likely fix handy. (The difference between mine and the
> version you proposed earlier is that, as I recall, yours does the LINK
> operation on the original file unconditionally, which is in error; it

I had checked pseudo's source code, I didn't find any function that can check
this, and I appreciated that if you can fix it:-).

// Robert

> should only be done when no existing data was found in the database.) I'll
> double-check that again and go through looking for loose ends, because
> I have a vague feeling that I'm overlooking a thing, but I haven't figured out
> what it would be yet.
>
> -s
Mark Hatle Aug. 2, 2016, 3:12 p.m.
On 8/2/16 1:50 AM, Seebs wrote:
> On 2 Aug 2016, at 1:44, Robert Yang wrote:
> 
>> Because the stat() gets 0644 on ${B}/version.c in the second run, so 
>> it
>> would run chmod 0644 rather than 0444 on gzip-dbg/version.c.
> 
> Why is it calling chmod at all? The goal is apparently to give
> gzip-dbg/version.c the same mode that ${B}/version.c has. Since it's a
> hard link, no chmod call is needed at all.
> 
>>> time and not the second? If it does it the second time, the fact that
>>> the underlying file's mode changed won't matter.
>>>
>>> But in this case... While I'm fine with modifying things to track the
>>
>> Thanks, I will send a patch for it.
> 
> I already have a patch for this.
> 
>>> file linked-to, it still feels like this is a usage error. 
>>> Fundamentally,
>>> we're unpacking a file (${B}/version.c), then linking to it, changing
>>> the link in some way, deleting the link, and expecting the original 
>>> file
>>> to be unchanged. That doesn't seem right to me.
> 
>> But that is what the real filesystem works without pseudo:
>> $ touch file1
>> $ ln file1 file2
>> $ chmod 777 file2
>> $ rm -f file2
>>
>> file1 will be 777 on the real filesystem.
> 
> Yes. But it seems that the mode the file is being changed to is 
> dependant on the original reported mode. And that's the part that I'm 
> confused by;
> we're relying on the mode of the original file, but we're also changing
> the mode of a hard link to it. This makes no sense to me.
> 
> So, I have a likely fix handy. (The difference between mine and the
> version you proposed earlier is that, as I recall, yours does the LINK
> operation on the original file unconditionally, which is in error; it
> should only be done when no existing data was found in the database.) 
> I'll double-check that again and go through looking for loose ends, 
> because
> I have a vague feeling that I'm overlooking a thing, but I haven't 
> figured out what it would be yet.

A better way to describe the problem:

I create a file:

$ touch my-special-file
$ chmod 0444 my-special-file

I then run pseudo

$ pseudo ln my-special-file file1
$ stat file1
Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
$ pseudo rm -f file1
$ # remove pseudo database

At this point the original 'my-special-file' is now 0644, not 0444.  This is
because pseudo changes the perms on the file so that it could pretend to be root.

So if we repeat the last steps:

$ pseudo ln my-special-file file1
$ stat file1
Access: (0644/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
$ pseudo rm -f file1
$ # remove pseudo database

So the problem is -- pseudo is modifying the 'original' file perms, which on a
new instance of the pseudo database then gets inherited.  So we get
unpredictable results if this is the first time through -- or not the first time
through.

The problem is 'my-special-file' is never owned by pseudo -- but that isn't
really what the problem is -- the problem is because it's a hardlink -- and
pseudo is changing perms to emulate root -- it triggers a QA fault because the
file is 'changing'.

--Mark

> -s
>
Seebs Aug. 2, 2016, 7:16 p.m.
On 2 Aug 2016, at 3:32, Robert Yang wrote:

> It is worth trying, I will try to remove os.chown() and os.chmod() from
> package.bbclass to see what will happen:
>
>             if not cpath.islink(file):
>                 os.link(file, fpath)
>                 fstat = cpath.stat(file)
>                 os.chmod(fpath, fstat.st_mode)
>                 os.chown(fpath, fstat.st_uid, fstat.st_gid)

... Okay, I admit, I can't figure this one out. Can anyone explain
this?

Does python's os.link work across devices maybe?

-s
Ross Burton Aug. 2, 2016, 7:18 p.m.
On 2 August 2016 at 20:16, Seebs <seebs@seebs.net> wrote:

> Does python's os.link work across devices maybe?
>

The documentation at least says hard link explicitly, so presumably not.

Ross
Seebs Aug. 2, 2016, 7:19 p.m.
On 2 Aug 2016, at 10:12, Mark Hatle wrote:

> So the problem is -- pseudo is modifying the 'original' file perms, 
> which on a
> new instance of the pseudo database then gets inherited.  So we get
> unpredictable results if this is the first time through -- or not the 
> first time
> through.

Yeah. And the problem here is in part that we're calling chmod on a file 
linked to a file external to pseudo, but we're also checking that file's 
mode to find out what permissions we want it to have. Which is probably 
an error.

So, it does look like we need the tracking, but also if we stopped 
doing:

	stat file1 # obtain mode
	ln file1 file2
	chmod <mode> file2

we'd save significant time AND avoid the problem.

-s
Mark Hatle Aug. 2, 2016, 7:39 p.m.
On 8/2/16 2:19 PM, Seebs wrote:
> On 2 Aug 2016, at 10:12, Mark Hatle wrote:
> 
>> So the problem is -- pseudo is modifying the 'original' file perms, 
>> which on a
>> new instance of the pseudo database then gets inherited.  So we get
>> unpredictable results if this is the first time through -- or not the 
>> first time
>> through.
> 
> Yeah. And the problem here is in part that we're calling chmod on a file 
> linked to a file external to pseudo, but we're also checking that file's 
> mode to find out what permissions we want it to have. Which is probably 
> an error.
> 
> So, it does look like we need the tracking, but also if we stopped 
> doing:
> 
> 	stat file1 # obtain mode
> 	ln file1 file2
> 	chmod <mode> file2
> 
> we'd save significant time AND avoid the problem.
> 
> -s
> 

The alternative to the 'ln' is a 'cp' operation.  This is how it used to work
until optimizations were added a few releases ago.  It was observed that this
saves a large amount of space -- but it had the unintended consequence of
suddenly files are now changes as the do_install/do_package processing
manipulates files and modes change...

I almost wonder if in pseudo we should avoid changing any modes internal to the
file -- until something tries to write to the file.  (Can we even capture that
and try to fix it?  If not, what we're doing is likely the only answer.)

--Mark
Seebs Aug. 2, 2016, 7:53 p.m.
On 2 Aug 2016, at 14:39, Mark Hatle wrote:

> The alternative to the 'ln' is a 'cp' operation.  This is how it used 
> to work
> until optimizations were added a few releases ago.  It was observed 
> that this
> saves a large amount of space -- but it had the unintended consequence 
> of
> suddenly files are now changes as the do_install/do_package processing
> manipulates files and modes change...

Yeah, and I think if we just dropped the copy of permissions, this
would at least temporarily go away.

> I almost wonder if in pseudo we should avoid changing any modes 
> internal to the
> file -- until something tries to write to the file.  (Can we even 
> capture that
> and try to fix it?  If not, what we're doing is likely the only 
> answer.)

I don't think we usefully can.

Hmm.

So it occurs to me, there's only a few cases where we actually have any
reason to make on-filesystem permissions changes at all. Say, execute 
bits for executables. Otherwise, we could in theory just not bother 
writing
changes to the disk, because the disk permissions are irrelevant.

-s