Message ID | 20230824195332.851528-1-JPEWhacker@gmail.com |
---|---|
State | New |
Headers | show |
Series | [bitbake-devel,v2] fetch2: git: Check if clone directory is a git repo | expand |
On 24/08/2023 21:53, Joshua Watt wrote: > If the clone target directory exists and is a valid git repo, but the > git directory is not a child, it needs to be erased and re-cloned. One > example of how this can happen is if a clone creates the directory, but > then fails to actual clone and make it a git repository. This left-over > directory can be particularly problematic if the download directory is a > descent of some top layer git repo (e.g. the default with poky), as the > commands that operate on the remote later will then mangle the layers > git repository instead of the download git repo. > > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> > --- > bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- > 1 file changed, 29 insertions(+), 1 deletion(-) > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > index 2a3c06fe4e9..33895e09b29 100644 > --- a/bitbake/lib/bb/fetch2/git.py > +++ b/bitbake/lib/bb/fetch2/git.py > @@ -65,6 +65,7 @@ import fnmatch > import os > import re > import shlex > +import shutil > import subprocess > import tempfile > import bb > @@ -365,8 +366,35 @@ class Git(FetchMethod): > runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) > repourl = self._get_repo_url(ud) > > + needs_clone = False > + if os.path.exists(ud.clonedir): > + # The directory may exist, but not be the top level of a bare git > + # repository in which case it needs to be deleted and re-cloned. > + try: > + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > + except bb.fetch2.FetchError as e: > + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > + needs_clone = True > + else: > + toplevel = os.path.abspath(output.rstrip()) > + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > + # The top level Git directory must either be the clone directory > + # or a child of the clone directory. Any ancestor directory of > + # the clone directory is not valid as the Git directory (and > + # probably belongs to some other unrelated repository), so a > + # clone is required > + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > + needs_clone = True > + > + if needs_clone: > + shutil.rmtree(ud.clonedir) > + else: > + needs_clone = True > + > # If the repo still doesn't exist, fallback to cloning it > - if not os.path.exists(ud.clonedir): > + if needs_clone: > # We do this since git will use a "-l" option automatically for local urls where possible, > # but it doesn't work when git/objects is a symlink, only works when it is a directory. > if repourl.startswith("file://"): > -- > 2.33.0 > > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#14979): https://lists.openembedded.org/g/bitbake-devel/message/14979 > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782 > Group Owner: bitbake-devel+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com] > -=-=-=-=-=-=-=-=-=-=-=- > Is it possible to factor the try/except else to a function? Then we would not need the else, and would just return immediately and have a self contained code, which in my opinion is easier to read. I also think f-string would be more readable for runfetchcmd. f-string are available since python 3.6. Can you share the rationale for the else clause? I do not see anything inside the else clause that might raise a FetchError, thus the else clause is redundant. Another change I suggest is to move the rmtree inside the usual need_clone and ignore the error if the path does not exist. Removes another conditional. I also think using needs_clone to assert whether the rmtree should be done is confusing A different variable should be used or use a function's return value. All the code inside if os.path.exists(ud.clonedir): deals with the need for removal and as a side effect will trigger the clone in another part of the code.
On 25/08/2023 11:18, Paulo Neves wrote: > On 24/08/2023 21:53, Joshua Watt wrote: > >> If the clone target directory exists and is a valid git repo, but the >> git directory is not a child, it needs to be erased and re-cloned. One >> example of how this can happen is if a clone creates the directory, but >> then fails to actual clone and make it a git repository. This left-over >> directory can be particularly problematic if the download directory is a >> descent of some top layer git repo (e.g. the default with poky), as the >> commands that operate on the remote later will then mangle the layers >> git repository instead of the download git repo. >> >> Signed-off-by: Joshua Watt >> [<JPEWhacker@gmail.com>](mailto:JPEWhacker@gmail.com) >> --- >> bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- >> 1 file changed, 29 insertions(+), 1 deletion(-) >> >> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py >> index 2a3c06fe4e9..33895e09b29 100644 >> --- a/bitbake/lib/bb/fetch2/git.py >> +++ b/bitbake/lib/bb/fetch2/git.py >> @@ -65,6 +65,7 @@ import fnmatch >> import os >> import re >> import shlex >> +import shutil >> import subprocess >> import tempfile >> import bb >> @@ -365,8 +366,35 @@ class Git(FetchMethod): >> runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) >> repourl = self._get_repo_url(ud) >> >> + needs_clone = False >> + if os.path.exists(ud.clonedir): >> + # The directory may exist, but not be the top level of a bare git >> + # repository in which case it needs to be deleted and re-cloned. >> + try: >> + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel >> + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >> + except bb.fetch2.FetchError as e: >> + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) >> + needs_clone = True >> + else: >> + toplevel = os.path.abspath(output.rstrip()) >> + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') >> + # The top level Git directory must either be the clone directory >> + # or a child of the clone directory. Any ancestor directory of >> + # the clone directory is not valid as the Git directory (and >> + # probably belongs to some other unrelated repository), so a >> + # clone is required >> + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: >> + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) >> + needs_clone = True >> + >> + if needs_clone: >> + shutil.rmtree(ud.clonedir) >> + else: >> + needs_clone = True >> + >> # If the repo still doesn't exist, fallback to cloning it >> - if not os.path.exists(ud.clonedir): >> + if needs_clone: >> # We do this since git will use a "-l" option automatically for local urls where possible, >> # but it doesn't work when git/objects is a symlink, only works when it is a directory. >> if repourl.startswith( >> ["file://"](file://) >> ): >> -- >> 2.33.0 > > Is it possible to factor the try/except else to a function? Then we > would not need the else, and would just return immediately and have a > self contained code, which in my opinion is easier to read. I also think > f-string would be more readable for runfetchcmd. f-string are available > since python 3.6. > > Can you share the rationale for the else clause? I do not see anything > inside the else clause that might raise a FetchError, thus the else > clause is redundant. > > Another change I suggest is to move the rmtree inside the usual > need_clone and ignore the error if the path does not exist. Removes > another conditional. > > I also think using needs_clone to assert whether the rmtree should be > done is confusing A different variable should be used or use a > function's return value. All the code inside if > os.path.exists(ud.clonedir): deals with the need for removal and as a > side effect will trigger the clone in another part of the code. > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#14982): > https://lists.openembedded.org/g/bitbake-devel/message/14982 > Mute This Topic: > https://lists.openembedded.org/mt/100942876/4454782 > Group Owner: > bitbake-devel+owner@lists.openembedded.org > Unsubscribe: > https://lists.openembedded.org/g/bitbake-devel/unsub > [ > paulo@myneves.com > ] > -=-=-=-=-=-=-=-=-=-=-=- > Another change I suggest is to move the rmtree inside the usual need_clone and ignore the error if the path does not exist. Removes another conditional. Disregard this comment. It is incorrect.
On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote: > > On 24/08/2023 21:53, Joshua Watt wrote: > > If the clone target directory exists and is a valid git repo, but the > > git directory is not a child, it needs to be erased and re-cloned. One > > example of how this can happen is if a clone creates the directory, but > > then fails to actual clone and make it a git repository. This left-over > > directory can be particularly problematic if the download directory is a > > descent of some top layer git repo (e.g. the default with poky), as the > > commands that operate on the remote later will then mangle the layers > > git repository instead of the download git repo. > > > > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> > > --- > > bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- > > 1 file changed, 29 insertions(+), 1 deletion(-) > > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > > index 2a3c06fe4e9..33895e09b29 100644 > > --- a/bitbake/lib/bb/fetch2/git.py > > +++ b/bitbake/lib/bb/fetch2/git.py > > @@ -65,6 +65,7 @@ import fnmatch > > import os > > import re > > import shlex > > +import shutil > > import subprocess > > import tempfile > > import bb > > @@ -365,8 +366,35 @@ class Git(FetchMethod): > > runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) > > repourl = self._get_repo_url(ud) > > > > + needs_clone = False > > + if os.path.exists(ud.clonedir): > > + # The directory may exist, but not be the top level of a bare git > > + # repository in which case it needs to be deleted and re-cloned. > > + try: > > + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > > + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > > + except bb.fetch2.FetchError as e: > > + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > > + needs_clone = True > > + else: > > + toplevel = os.path.abspath(output.rstrip()) > > + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > > + # The top level Git directory must either be the clone directory > > + # or a child of the clone directory. Any ancestor directory of > > + # the clone directory is not valid as the Git directory (and > > + # probably belongs to some other unrelated repository), so a > > + # clone is required > > + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > > + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > > + needs_clone = True > > + > > + if needs_clone: > > + shutil.rmtree(ud.clonedir) > > + else: > > + needs_clone = True > > + > > # If the repo still doesn't exist, fallback to cloning it > > - if not os.path.exists(ud.clonedir): > > + if needs_clone: > > # We do this since git will use a "-l" option automatically for local urls where possible, > > # but it doesn't work when git/objects is a symlink, only works when it is a directory. > > if repourl.startswith("file://"): > > -- > > 2.33.0 > > > > > > -=-=-=-=-=-=-=-=-=-=-=- > > Links: You receive all messages sent to this group. > > View/Reply Online (#14979): https://lists.openembedded.org/g/bitbake-devel/message/14979 > > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782 > > Group Owner: bitbake-devel+owner@lists.openembedded.org > > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com] > > -=-=-=-=-=-=-=-=-=-=-=- > > > Is it possible to factor the try/except else to a function? Then we > would not need the else, and would just return immediately and have a > self contained code, which in my opinion is easier to read. I also think > f-string would be more readable for runfetchcmd. f-string are available > since python 3.6. > > Can you share the rationale for the else clause? I do not see anything > inside the else clause that might raise a FetchError, thus the else > clause is redundant. The else clause is pretty much the entire reason for this patch in the first place :) The problem is that the fetcher assumes that the simple existence of the download directory means it can can add a remote and clone there, but that is not true. If your clone directory is a subdirectory of another git repository, but not a git repository itself, bitbake will (incorrectly) add the remote the higher level git repos and fetch there, leaving it a mess. More visually, if your layout is like this: poky <-- This is a git repository build downloads git2 foo <-- This is NOT a git repository The old code would see that the foo directory existed, see that it was part of a git repo (poky!) add the remotes and start cloning objects into the poky git repo, which is not at all what we want. The else clause corrects this as it makes the fetcher check if "foo" is not only a git repo, but that the it is the toplevel of a git repo (or at least that it's --git-dir is not a parent of that directory). If it's not, it deletes it because it's not a valid place to clone things into anymore. > > Another change I suggest is to move the rmtree inside the usual > need_clone and ignore the error if the path does not exist. Removes > another conditional. > > I also think using needs_clone to assert whether the rmtree should be > done is confusing A different variable should be used or use a > function's return value. All the code inside if > os.path.exists(ud.clonedir): deals with the need for removal and as a > side effect will trigger the clone in another part of the code. > > >
On 25/08/2023 16:19, Joshua Watt wrote: > On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote: >> On 24/08/2023 21:53, Joshua Watt wrote: >>> If the clone target directory exists and is a valid git repo, but the >>> git directory is not a child, it needs to be erased and re-cloned. One >>> example of how this can happen is if a clone creates the directory, but >>> then fails to actual clone and make it a git repository. This left-over >>> directory can be particularly problematic if the download directory is a >>> descent of some top layer git repo (e.g. the default with poky), as the >>> commands that operate on the remote later will then mangle the layers >>> git repository instead of the download git repo. >>> >>> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> >>> --- >>> bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- >>> 1 file changed, 29 insertions(+), 1 deletion(-) >>> >>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py >>> index 2a3c06fe4e9..33895e09b29 100644 >>> --- a/bitbake/lib/bb/fetch2/git.py >>> +++ b/bitbake/lib/bb/fetch2/git.py >>> @@ -65,6 +65,7 @@ import fnmatch >>> import os >>> import re >>> import shlex >>> +import shutil >>> import subprocess >>> import tempfile >>> import bb >>> @@ -365,8 +366,35 @@ class Git(FetchMethod): >>> runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) >>> repourl = self._get_repo_url(ud) >>> >>> + needs_clone = False >>> + if os.path.exists(ud.clonedir): >>> + # The directory may exist, but not be the top level of a bare git >>> + # repository in which case it needs to be deleted and re-cloned. >>> + try: >>> + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel >>> + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) >>> + except bb.fetch2.FetchError as e: >>> + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) >>> + needs_clone = True >>> + else: >>> + toplevel = os.path.abspath(output.rstrip()) >>> + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') >>> + # The top level Git directory must either be the clone directory >>> + # or a child of the clone directory. Any ancestor directory of >>> + # the clone directory is not valid as the Git directory (and >>> + # probably belongs to some other unrelated repository), so a >>> + # clone is required >>> + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: >>> + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) >>> + needs_clone = True >>> + >>> + if needs_clone: >>> + shutil.rmtree(ud.clonedir) >>> + else: >>> + needs_clone = True >>> + >>> # If the repo still doesn't exist, fallback to cloning it >>> - if not os.path.exists(ud.clonedir): >>> + if needs_clone: >>> # We do this since git will use a "-l" option automatically for local urls where possible, >>> # but it doesn't work when git/objects is a symlink, only works when it is a directory. >>> if repourl.startswith("file://"): >>> -- >>> 2.33.0 >>> >>> >>> >>> >> Is it possible to factor the try/except else to a function? Then we >> would not need the else, and would just return immediately and have a >> self contained code, which in my opinion is easier to read. I also think >> f-string would be more readable for runfetchcmd. f-string are available >> since python 3.6. >> >> Can you share the rationale for the else clause? I do not see anything >> inside the else clause that might raise a FetchError, thus the else >> clause is redundant. > The else clause is pretty much the entire reason for this patch in the > first place :) > > The problem is that the fetcher assumes that the simple existence of > the download directory means it can can add a remote and clone there, > but that is not true. If your clone directory is a subdirectory of > another git repository, but not a git repository itself, bitbake will > (incorrectly) add the remote the higher level git repos and fetch > there, leaving it a mess. > > More visually, if your layout is like this: > > poky <-- This is a git repository > build > downloads > git2 > foo <-- This is NOT a git repository > > The old code would see that the foo directory existed, see that it was > part of a git repo (poky!) add the remotes and start cloning objects > into the poky git repo, which is not at all what we want. > > The else clause corrects this as it makes the fetcher check if "foo" > is not only a git repo, but that the it is the toplevel of a git repo > (or at least that it's --git-dir is not a parent of that directory). > If it's not, it deletes it because it's not a valid place to clone > things into anymore. I understand the conceptual point of the patch and I think it is valid. What i mean is that you do not need the else clause. The code under the else clause can continue inside the try clause and not change the logic. The try-else idiom is very unusual and represents a meaning which is not needed here: That the code inside the else should not trigger the except clause. >> Another change I suggest is to move the rmtree inside the usual >> need_clone and ignore the error if the path does not exist. Removes >> another conditional. >> >> I also think using needs_clone to assert whether the rmtree should be >> done is confusing A different variable should be used or use a >> function's return value. All the code inside if >> os.path.exists(ud.clonedir): deals with the need for removal and as a >> side effect will trigger the clone in another part of the code. >> >> >> > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#14984): https://lists.openembedded.org/g/bitbake-devel/message/14984 > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782 > Group Owner: bitbake-devel+owner@lists.openembedded.org > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com] > -=-=-=-=-=-=-=-=-=-=-=- >
On Thu, Aug 31, 2023 at 6:21 AM Paulo Neves <paulo@myneves.com> wrote: > > On 25/08/2023 16:19, Joshua Watt wrote: > > On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote: > >> On 24/08/2023 21:53, Joshua Watt wrote: > >>> If the clone target directory exists and is a valid git repo, but the > >>> git directory is not a child, it needs to be erased and re-cloned. One > >>> example of how this can happen is if a clone creates the directory, but > >>> then fails to actual clone and make it a git repository. This left-over > >>> directory can be particularly problematic if the download directory is a > >>> descent of some top layer git repo (e.g. the default with poky), as the > >>> commands that operate on the remote later will then mangle the layers > >>> git repository instead of the download git repo. > >>> > >>> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> > >>> --- > >>> bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- > >>> 1 file changed, 29 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py > >>> index 2a3c06fe4e9..33895e09b29 100644 > >>> --- a/bitbake/lib/bb/fetch2/git.py > >>> +++ b/bitbake/lib/bb/fetch2/git.py > >>> @@ -65,6 +65,7 @@ import fnmatch > >>> import os > >>> import re > >>> import shlex > >>> +import shutil > >>> import subprocess > >>> import tempfile > >>> import bb > >>> @@ -365,8 +366,35 @@ class Git(FetchMethod): > >>> runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) > >>> repourl = self._get_repo_url(ud) > >>> > >>> + needs_clone = False > >>> + if os.path.exists(ud.clonedir): > >>> + # The directory may exist, but not be the top level of a bare git > >>> + # repository in which case it needs to be deleted and re-cloned. > >>> + try: > >>> + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel > >>> + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) > >>> + except bb.fetch2.FetchError as e: > >>> + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) > >>> + needs_clone = True > >>> + else: > >>> + toplevel = os.path.abspath(output.rstrip()) > >>> + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') > >>> + # The top level Git directory must either be the clone directory > >>> + # or a child of the clone directory. Any ancestor directory of > >>> + # the clone directory is not valid as the Git directory (and > >>> + # probably belongs to some other unrelated repository), so a > >>> + # clone is required > >>> + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: > >>> + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) > >>> + needs_clone = True > >>> + > >>> + if needs_clone: > >>> + shutil.rmtree(ud.clonedir) > >>> + else: > >>> + needs_clone = True > >>> + > >>> # If the repo still doesn't exist, fallback to cloning it > >>> - if not os.path.exists(ud.clonedir): > >>> + if needs_clone: > >>> # We do this since git will use a "-l" option automatically for local urls where possible, > >>> # but it doesn't work when git/objects is a symlink, only works when it is a directory. > >>> if repourl.startswith("file://"): > >>> -- > >>> 2.33.0 > >>> > >>> > >>> > >>> > >> Is it possible to factor the try/except else to a function? Then we > >> would not need the else, and would just return immediately and have a > >> self contained code, which in my opinion is easier to read. I also think > >> f-string would be more readable for runfetchcmd. f-string are available > >> since python 3.6. > >> > >> Can you share the rationale for the else clause? I do not see anything > >> inside the else clause that might raise a FetchError, thus the else > >> clause is redundant. > > The else clause is pretty much the entire reason for this patch in the > > first place :) > > > > The problem is that the fetcher assumes that the simple existence of > > the download directory means it can can add a remote and clone there, > > but that is not true. If your clone directory is a subdirectory of > > another git repository, but not a git repository itself, bitbake will > > (incorrectly) add the remote the higher level git repos and fetch > > there, leaving it a mess. > > > > More visually, if your layout is like this: > > > > poky <-- This is a git repository > > build > > downloads > > git2 > > foo <-- This is NOT a git repository > > > > The old code would see that the foo directory existed, see that it was > > part of a git repo (poky!) add the remotes and start cloning objects > > into the poky git repo, which is not at all what we want. > > > > The else clause corrects this as it makes the fetcher check if "foo" > > is not only a git repo, but that the it is the toplevel of a git repo > > (or at least that it's --git-dir is not a parent of that directory). > > If it's not, it deletes it because it's not a valid place to clone > > things into anymore. > > I understand the conceptual point of the patch and I think it is valid. > What i mean is that you do not need the else clause. The code under the > else clause can continue inside the try clause and not change the logic. > The try-else idiom is very unusual and represents a meaning which is not > needed here: That the code inside the else should not trigger the except > clause. Ah, yes. I see what you are saying now > > >> Another change I suggest is to move the rmtree inside the usual > >> need_clone and ignore the error if the path does not exist. Removes > >> another conditional. > >> > >> I also think using needs_clone to assert whether the rmtree should be > >> done is confusing A different variable should be used or use a > >> function's return value. All the code inside if > >> os.path.exists(ud.clonedir): deals with the need for removal and as a > >> side effect will trigger the clone in another part of the code. > >> > >> > >> > > -=-=-=-=-=-=-=-=-=-=-=- > > Links: You receive all messages sent to this group. > > View/Reply Online (#14984): https://lists.openembedded.org/g/bitbake-devel/message/14984 > > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782 > > Group Owner: bitbake-devel+owner@lists.openembedded.org > > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com] > > -=-=-=-=-=-=-=-=-=-=-=- > > > >
diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py index 2a3c06fe4e9..33895e09b29 100644 --- a/bitbake/lib/bb/fetch2/git.py +++ b/bitbake/lib/bb/fetch2/git.py @@ -65,6 +65,7 @@ import fnmatch import os import re import shlex +import shutil import subprocess import tempfile import bb @@ -365,8 +366,35 @@ class Git(FetchMethod): runfetchcmd(fetch_cmd, d, workdir=ud.clonedir) repourl = self._get_repo_url(ud) + needs_clone = False + if os.path.exists(ud.clonedir): + # The directory may exist, but not be the top level of a bare git + # repository in which case it needs to be deleted and re-cloned. + try: + # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel + output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir) + except bb.fetch2.FetchError as e: + logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e) + needs_clone = True + else: + toplevel = os.path.abspath(output.rstrip()) + abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/') + # The top level Git directory must either be the clone directory + # or a child of the clone directory. Any ancestor directory of + # the clone directory is not valid as the Git directory (and + # probably belongs to some other unrelated repository), so a + # clone is required + if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir: + logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir) + needs_clone = True + + if needs_clone: + shutil.rmtree(ud.clonedir) + else: + needs_clone = True + # If the repo still doesn't exist, fallback to cloning it - if not os.path.exists(ud.clonedir): + if needs_clone: # We do this since git will use a "-l" option automatically for local urls where possible, # but it doesn't work when git/objects is a symlink, only works when it is a directory. if repourl.startswith("file://"):
If the clone target directory exists and is a valid git repo, but the git directory is not a child, it needs to be erased and re-cloned. One example of how this can happen is if a clone creates the directory, but then fails to actual clone and make it a git repository. This left-over directory can be particularly problematic if the download directory is a descent of some top layer git repo (e.g. the default with poky), as the commands that operate on the remote later will then mangle the layers git repository instead of the download git repo. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> --- bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-)