Re: [lmi] Unexpected git (mis)behavior


From: Vadim Zeitlin
Subject: Re: [lmi] Unexpected git (mis)behavior
Date: Fri, 13 Nov 2020 01:56:21 +0100

On Fri, 13 Nov 2020 00:15:59 +0000 Greg Chicares <gchicares@sbcglobal.net> 
wrote:

GC> Usually, if some git operation fails, there's a helpful error message
GC> that describes what's wrong and clearly suggests what to do about it.
GC> Here's a situation that occurred today, where that was not the case.
GC> 
GC> On our corporate redhat server, not in any chroot (with git-1.8.3.1,
GC> provided by EPEL), I had cloned the lmi repository some time ago,
GC> into /opt/lmi/src/lmi as usual. Today, Kim tried to update it, and
GC> git-pull failed:
GC> 
GC> /opt/lmi/src/lmi[0]$git pull
GC> remote: Enumerating objects: 117, done.
GC> remote: Counting objects: 100% (117/117), done.
GC> remote: Compressing objects: 100% (32/32), done.
GC> remote: Total 86 (delta 57), reused 77 (delta 48), pack-reused 0
GC> Unpacking objects: 100% (86/86), done.
GC> From https://github.com/let-me-illustrate/lmi
GC>    5724e39..b03a344  master     -> origin/master
GC> Updating 5724e39..b03a344
GC> error: unable to unlink old '.github/workflows/ci.yml' (Permission denied)

 The first puzzle is that I don't understand why this happened.

GC> The file that couldn't be unlinked had permissions "-rw-r--r--",

 But this is irrelevant, isn't it? It's the permissions of the parent
directory that matter.
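
 To illustrate the point, here is a minimal sketch of a check that looks
only at the parent directory. The name can_unlink is hypothetical, it is not
a Git or coreutils command, and the check ignores complications such as
sticky-bit directories and ACLs.

```shell
# Hypothetical helper: could the current user unlink this path?
# Only the parent directory is examined; the file's own mode is ignored.
can_unlink() {
    parent=$(dirname "$1")
    # Removing a directory entry needs write and search (execute)
    # permission on the directory that contains it.
    [ -w "$parent" ] && [ -x "$parent" ]
}

# Example use (path taken from the error message above):
#   can_unlink .github/workflows/ci.yml || ls -ld .github/workflows
```

 Note that the test succeeds for root regardless of the mode bits, so it is
only meaningful when run as the user who got the error.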

GC> which Kim used 'sudo' to reset to "-rw-rw-r--", also repairing
GC> undesired permissions on other files, thus:
GC>   sudo chgrp -R lmi /opt/lmi
GC>   sudo find /opt/lmi -type d -exec chmod g+s {} +
GC>   sudo chmod -R g=u /opt/lmi

 If Kim didn't have write access to the /opt/lmi/src/lmi/.github/workflows
directory, this would explain the error above, but I'm still not sure how
it could have happened. Do you understand this?

GC> When she then repeated the git-pull command, it failed:
GC> 
GC> /opt/lmi/src/lmi[0]$git pull
GC> Updating 5724e39..b03a344
GC> error: Your local changes to the following files would be overwritten by merge:

 I'm not completely sure what the state was at this moment ("git status"
could be used to examine it), but I'm almost certain that, assuming there
were no local changes before the pull, "git reset --hard" and "git clean
-fdx" would have fixed it (but, let me emphasize: both of these commands
are highly destructive and shouldn't be used lightly).
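
 Because these commands are destructive, one cautious way to use them is to
look first and delete second. This is only a sketch: discard_local_state is
a hypothetical name, and it assumes the current directory is inside the
working tree and that nothing in it is worth keeping.

```shell
# Hypothetical helper: show what would be lost, and discard it only
# when called with --yes.
discard_local_state() {
    git status --short      # modified tracked files and untracked files
    git clean -ndx          # dry run: lists what "git clean -fdx" would remove
    if [ "${1:-}" = "--yes" ]; then
        git reset --hard    # destructive: tracked files go back to HEAD
        git clean -fdx      # destructive: removes untracked and ignored files
    fi
}
```

 Calling it without arguments changes nothing; calling it with --yes also
removes ignored files such as build output, because of the -x option.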

[...long list of "modified" files snipped...]

GC> Apparently the first git-pull attempt brought those files up to date in
GC> the local tree, but didn't recognize that it had introduced or modified
GC> them: git seems to treat them as local modifications that we had made
GC> manually, although we surely didn't.

 I could try to reproduce the problem by setting up the permissions to
trigger an error during the update, just as happened above, if you'd like a
detailed explanation. Roughly speaking, though, I think Git must have failed
to synchronize the working copy with the index and then rolled back the
index changes, but left in place the changes to the working directory that
had already been made before the error. So the net effect was that some
files in the working directory ended up modified compared to the index, or
even new (untracked), in the case of files that didn't exist before the
update but were created during it.

 This could be argued to be a bug in Git, but I don't think it's easy, or
maybe even possible, to fix, as there is no "atomic update several files"
operation in the file system API, so Git just can't ensure that it has
updated all files or none of them. And I guess it tries to touch the file
system as little as possible if an unexpected error occurs, to avoid making
things even worse if something really catastrophic has happened. All this
is pure speculation on my part, of course, but I think it makes sense.
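
 The general effect can be shown without Git at all. The following is a toy
illustration, not Git's actual code: update_all is a hypothetical function
that copies files one at a time and stops at the first failure, which is
enough to leave a directory half updated.

```shell
# Hypothetical illustration: update several files one by one.
# Usage: update_all SRC_DIR DST_DIR NAME...
update_all() {
    src=$1
    dst=$2
    shift 2
    for name in "$@"; do
        # Stop at the first failure. Files copied before this point are
        # not rolled back, and files after it are never touched.
        cp "$src/$name" "$dst/$name" || return 1
    done
}
```

 If the second of three copies fails, the first file is already new and the
third is still old, which is the same kind of mixed state as the one
described above.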

GC> We've recovered now, by removing all the offending files:
GC>   rm `git diff --name-only`
GC>   rm `git ls-files --others`
GC> after which 'git pull' succeeded.

 This is more or less the equivalent of the Git commands I gave above, so
it indeed was the right thing to do.

GC> What surprised me is that git modified the local tree, then aborted but
GC> left the local modifications in place, and git-status didn't tell us about
GC> the problem that running git-pull had created,

 Sorry, I don't see the output of git-status in your message. I'm pretty
sure that it would have given an accurate assessment of the situation if
you had run it before the second git-pull. Did you really run it and not
see any changes?

GC> or give a clearer suggestion about how we should recover. "Please,
GC> commit your changes" is misleading because those weren't "our"
GC> changes--they were git's changes.

 But git-pull has no way to know about it. You could have arrived at
exactly the same situation by making these changes manually.

GC> It looks like there was a merge in progress that was terminated
GC> abruptly

 Yes, git-pull is git-fetch + git-merge (which is why I sometimes recommend
using these commands directly, just to understand what is happening), so
running it tried to do a (fast-forward) merge.
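
 Run separately, the two steps look roughly like this. It is a sketch:
pull_in_two_steps is a hypothetical name, and the defaults "origin" and
"master" are assumptions matching the output quoted above.

```shell
# Hypothetical helper: the two steps behind a fast-forward pull.
# Usage: pull_in_two_steps [REMOTE [BRANCH]]
pull_in_two_steps() {
    remote=${1:-origin}
    branch=${2:-master}
    # Step 1: download new objects and update the remote-tracking branch.
    # This does not touch the working directory.
    git fetch "$remote" || return 1
    # Step 2: move the current branch forward and update the working
    # directory; --ff-only refuses anything but a fast-forward.
    git merge --ff-only "$remote/$branch"
}
```

 Doing it this way makes it obvious which step failed: a permission problem
in the working directory can only show up in the second one.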

GC> (the "Aborting" message above), whereas in my prior experience
GC> git-status would have told us that a merge had failed, and suggested
GC> that we either resolve conflicts and git-add them and continue, or else
GC> run an '--abort' command that would have cleaned up the local
GC> modifications.

 I think "--abort" could have worked after the first pull, but almost
certainly not after the second one.

GC> I don't think this can have been caused by my customized hooks/post-checkout
GC> script

 No, it must have been due to wrong directory permissions -- but I still
don't know how that happened.

GC> I'm guessing that this is a git QoI issue: git would normally either clean
GC> up better after a failed pull, or give more helpful suggestions, but maybe
GC> it stumbles because of the (hidden) .github subdirectory.

 I'm quite certain that being hidden doesn't play any role here. Or, wait,
perhaps it does: Git certainly doesn't care, but would it be possible that
somebody ran a chgrp or chmod command using "*", instead of using "find
-exec"?
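
 The difference is easy to see in a scratch directory: with default shell
settings "*" does not match names starting with a dot, while find visits
them. Both function names below are hypothetical, and -mindepth/-maxdepth
are extensions provided by GNU and BSD find rather than by POSIX.

```shell
# Hypothetical helper: names in DIR that a bare "*" expands to.
list_with_glob() (
    cd "$1" || exit 1
    for entry in *; do
        # Skip the literal "*" left behind when nothing matches.
        if [ -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
)

# Hypothetical helper: names in DIR that find visits, hidden ones included.
list_with_find() {
    find "$1" -mindepth 1 -maxdepth 1 | sed 's|.*/||' | sort
}
```

 In a directory containing ".github" and "src", the first function prints
only "src", so a command like "chmod g+w *" would never reach .github.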

GC> I would have thought that that subdirectory would have had group=user
GC> permissions because
GC>   $git config --get-all core.sharedrepository
GC>   1
GC> but clearly it didn't.

 Yes, and this remains a mystery to me...

GC> Should I make some appropriate hook (post-checkout, perhaps) do this:
GC>   sudo chgrp -R lmi /opt/lmi
GC>   sudo find /opt/lmi -type d -exec chmod g+s {} +
GC>   sudo chmod -R g=u /opt/lmi
GC> in order to enforce those rules after every pull, so that this problem
GC> can't occur again?

 Normally it should never occur in the first place, so there should be no
need to correct it like this. I'd be curious if the command

        find /opt/lmi/src/lmi '!' -group lmi -o '(' -type d -a '!' -perm -g+s ')'

returns any files right now?
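
 For repeated use, the same check can be wrapped in a function. This is a
sketch under a few assumptions: check_shared_tree is a hypothetical name,
and it uses "-perm -g+s", which in GNU find matches when the setgid bit is
set whatever the other mode bits are (without the leading "-", -perm asks
for an exact mode match).

```shell
# Hypothetical helper: list everything under ROOT that breaks the sharing
# rules, i.e. anything not owned by GROUP, or any directory without setgid.
# Usage: check_shared_tree ROOT GROUP
check_shared_tree() {
    # -a binds more tightly than -o; the parentheses only make that explicit.
    find "$1" '!' -group "$2" -o '(' -type d -a '!' -perm -g+s ')'
}

# Example use:
#   check_shared_tree /opt/lmi/src/lmi lmi
```

 Empty output means that the tree already satisfies both rules.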

 Sorry for not being of much help,
VZ
