git is somewhat robust when killed in the middle. libgit2 seems somewhat less so. I'd


Ensure robustness when killed in the middle (gitoxide, open issue)

Byron commented on July 30, 2024

from gitoxide.

Comments (10)

joshtriplett commented on July 30, 2024

> For example, cloning into the same location will have one git process fail early as it races to 'creation of .git directory' only, instead of racing for moving the cloned repository into place. Interestingly, when killed with kill -9, git probably would be unable to recover such a partial clone as it couldn't differentiate a race from a dead process.

You could potentially handle that with file locks, which don't outlive the process holding them.


Byron commented on July 30, 2024

I absolutely agree, thanks for making that explicit.

How would you create multiple files atomically? This would be required to perfectly conclude a clone, which currently is done in a plumbing command by creating refs sequentially.

Would you go as far as to maintain an undo list to respond to interrupts properly?


joshtriplett commented on July 30, 2024

@Byron Catching interrupts won't handle kill -9, or a power failure, or similar.

You don't have to make entire operations atomic when that isn't possible. It's OK if an interrupted fetch leaves some refs created and others not; another fetch will clear that up. (Clones can be made atomic by doing them into a temporary directory and renaming that directory into place.) What must not happen is a ref getting created but the object it references not existing, or an incomplete pack existing where git looks for packs, or other cases where the repository is in an inconsistent state. There should always be an order you can perform operations in such that the repository remains consistent.
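The temporary-directory approach for clones could look roughly like the sketch below. It is only an illustration using the Rust standard library, not gitoxide's implementation: `publish_atomically` and the `populate` callback are hypothetical names, and the staging path is assumed to be unique per process (for example by including the PID) and to live on the same filesystem as the destination so that the rename is a single step.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Build the repository in `staging`, then publish it at `dest` with one rename.
/// `populate` stands in for the real clone work (hypothetical callback).
fn publish_atomically<F>(staging: &Path, dest: &Path, populate: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    // A staging directory left behind by a killed run was never published,
    // so it can simply be discarded and rebuilt.
    if staging.exists() {
        fs::remove_dir_all(staging)?;
    }
    fs::create_dir_all(staging)?;
    populate(staging)?;

    // Racing clones compete here, on the final rename, not on directory creation.
    // The rename fails if `dest` already exists as a non-empty directory.
    match fs::rename(staging, dest) {
        Ok(()) => Ok(()),
        Err(err) => {
            // The loser of the race cleans up its own staging directory.
            let _ = fs::remove_dir_all(staging);
            Err(err)
        }
    }
}
```

If the process dies before the rename, `dest` does not exist at all, so a later run cannot mistake a half-written clone for a finished one. Surviving a power failure would additionally need the files and directories to be synced to disk, which this sketch leaves out.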


Byron commented on July 30, 2024

I see, so rather than complicating things, try to design operations so they never corrupt the git repository and, ideally, recover automatically the next time the operation is run if resources have been leaked, such as some refs still being present after an interruption.

In the same vein, operations should probably be hardened against races, too. For example, cloning into the same location will have one git process fail early as it races to 'creation of .git directory' only, instead of racing for moving the cloned repository into place. Interestingly, when killed with kill -9, git probably would be unable to recover such a partial clone as it couldn't differentiate a race from a dead process. Thinking about it, since there is a special kind of temp file which is always cleaned up if a process is killed, that could possibly be used as a marker to differentiate this case as well.

Generally, when seeing the .git repository as a database, all best practices should certainly be applied to prevent corruption and allow (auto) recovery.


Byron commented on July 30, 2024

git-tempfile and git-lock will help for sure in keeping the repository consistent in cases that are not kill -9. It's still to be figured out how to respond to stray locks though, especially on the server side where doing so should probably be automated.
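To make the stray-lock problem concrete, here is a minimal sketch of the `<name>.lock` convention that git uses for files such as refs, written with the Rust standard library only. It does not use the git-tempfile or git-lock APIs; `lock_path` and `write_locked` are hypothetical helpers for illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// `<target>.lock`, the lock file that sits next to the file being updated.
fn lock_path(target: &Path) -> PathBuf {
    let mut name = target.as_os_str().to_owned();
    name.push(".lock");
    PathBuf::from(name)
}

/// Replace `target` with `contents`: write to the lock file, then rename it into place.
fn write_locked(target: &Path, contents: &[u8]) -> io::Result<()> {
    let lock = lock_path(target);
    // `create_new` fails if the lock file already exists, so only one writer can win.
    // A lock owned by someone else is left untouched by the early return.
    let mut file = OpenOptions::new().write(true).create_new(true).open(&lock)?;
    let written = file.write_all(contents).and_then(|()| file.sync_all());
    // Close the handle before renaming (some platforms refuse to rename open files).
    drop(file);
    let result = written.and_then(|()| fs::rename(&lock, target));
    if result.is_err() {
        // Our own failed attempt must not leave a lock behind.
        let _ = fs::remove_file(&lock);
    }
    result
}
```

A process killed between creating the lock and the rename leaves `target` intact, so the repository stays consistent, but the `.lock` file survives and every later writer fails with `AlreadyExists`. That leftover is the stray lock that still needs a policy.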


kim commented on July 30, 2024

It is potentially interesting to note that git.git repacks refs when more than one is updated in a transaction (for --atomic support). Last time I checked, libgit2 doesn’t do that, making ref transactions not actually atomic.
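The reason a single packed file helps can be sketched as follows: when all updated refs live in one file, one rename publishes them together. This is a simplified illustration, not git's or libgit2's code. `write_packed_refs` is a hypothetical helper, and the format is reduced to `<oid> <refname>` lines, without the header and peeled-tag entries of a real packed-refs file.

```rust
use std::collections::BTreeMap;
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Write every ref in `refs` (name -> hex object id) into `<git_dir>/packed-refs`.
fn write_packed_refs(git_dir: &Path, refs: &BTreeMap<String, String>) -> io::Result<()> {
    let lock = git_dir.join("packed-refs.lock");
    let target = git_dir.join("packed-refs");

    // BTreeMap iterates in key order, so the entries come out sorted by ref name.
    let mut out = String::new();
    for (name, oid) in refs {
        out.push_str(&format!("{oid} {name}\n"));
    }

    // Exclusive creation of the lock file serializes concurrent writers.
    let mut file = OpenOptions::new().write(true).create_new(true).open(&lock)?;
    let written = file.write_all(out.as_bytes()).and_then(|()| file.sync_all());
    drop(file);
    // One rename makes all ref updates visible together, or none of them.
    let result = written.and_then(|()| fs::rename(&lock, &target));
    if result.is_err() {
        let _ = fs::remove_file(&lock);
    }
    result
}
```

With one loose file per ref, by contrast, a kill between two renames leaves some refs updated and others not.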


Byron commented on July 30, 2024

@kim And on top of that, one has to write the reflog, which isn't contained in the packed-refs file and consists of a file per ref. I will still have to see how exactly that is locked, but I am hopeful it's made in a way to not completely tank performance.


kim commented on July 30, 2024

Oh my, yes, reflogs. I do think they’re a performance sink, which is why they are disabled by default for bare repos. I did, however, end up recently forcing creation on select refs and installing inotify watches for cache invalidation purposes (I cannot watch the refdb itself, because a maintenance pack-refs would generate false remove events, and the volume of events would just be too high).

Guess you can see now why I keep crying “reftable” ;)


Byron commented on July 30, 2024

> Oh my yes reflogs. I do think they’re a performance sink, which is why they are disabled by default for bare repos. I did, however, end up recently forcing creation on select refs and installing inotify watches for cache invalidation purposes (I cannot watch the refdb itself, because a maintenance pack refs would generate false remove events, and the volume of events would just be too high).

Super interesting, thanks for sharing! I checked it against my current sketch for transactions and am glad this would naturally be supported. I also noted that in bare repos no reflog is created otherwise, which wasn't on my radar yet.

> Guess you can see now why I keep crying “reftable” ;)

There is no way around it on the server for sure. Since it operates in blocks, it's probably less likely to be overwhelmed by the amount of filesystem events that it produces. But whatever it's going to be, there is no escape if there are too many events: it's either the files backend or the reftable.


kim commented on July 30, 2024

> if there are too many events

I misspoke there; it's more related to the notify crate not allowing one to set event masks, for portability reasons, and also to the inherent raciness of watching directories recursively.

