Comments (2)
Overwriting anything in the dest fs on a read seems bad
From: Jeff Inman [email protected]
Sent: Wednesday, June 1, 2016 11:11:20 AM
To: pftool/pftool
Subject: [pftool/pftool] restart is conflicted for copy vs. compare of MarFS files (#26)
The pftool compare task uses CTM just like the copy task. During comparison, worker_comparelist() calls update_chunk(), which invokes worker_update_chunk(). I think the reasoning must be that long "deep" file comparisons (i.e. with '-M') ought to be able to restart, so they should maintain CTM as they go. Furthermore, if someone attempted to compare a source with an incomplete copy (which would have copy-related CTM), the surviving CTM built during the copy task would just mark parts of the file as not needing comparison, but the other parts (which do not exist yet) could still correctly indicate a "mismatch" when the read() on them fails. (Actually, you'd get a non-fatal read-error on each such chunk.)
The problem that crops up with chunked comparisons involving MarFS destinations is that worker_update_chunk() causes us to overwrite the chunk-info in the MDFS file. We might overwrite with correct info, but this still changes the mtime of the file, which causes MD comparisons for the remaining source/destination chunks to fail. [The destination is then re-adjusted to again have correct MD, when pftool finishes the comparison.] The upshot is that chunked comparisons of correctly copied MarFS files appear to show a "mismatch" on all but the first chunk.
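The mtime side effect described above is ordinary POSIX behavior and can be reproduced without MarFS. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name rewrite_changes_mtime() and the payload are made up for illustration and are not part of pftool or libmarfs. It backdates a file's mtime, overwrites the file with identical bytes, and reports whether the mtime moved.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not pftool code).
 * Returns 1 if rewriting identical bytes changed the mtime,
 * 0 if it did not, and -1 on any I/O error. */
static int rewrite_changes_mtime(void) {
    char path[] = "/tmp/mtime_demo_XXXXXX";
    const char payload[] = "chunk-info";
    /* Backdated atime and mtime, so the demo needs no sleep(). */
    struct timespec old_times[2] = { {1000, 0}, {1000, 0} };
    struct stat before, after;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    if (write(fd, payload, sizeof payload) != (ssize_t)sizeof payload)
        goto done;
    if (futimens(fd, old_times) != 0 || fstat(fd, &before) != 0)
        goto done;

    /* Overwrite with exactly the same bytes: content is unchanged,
     * but the write still marks the mtime for update. */
    if (pwrite(fd, payload, sizeof payload, 0) != (ssize_t)sizeof payload)
        goto done;
    if (fstat(fd, &after) != 0)
        goto done;

    result = (after.st_mtime != before.st_mtime);

done:
    close(fd);
    unlink(path);
    return result;
}
```

On a typical local filesystem the helper returns 1, which is why writing "correct" chunk-info is still enough to make later metadata comparisons disagree.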
The simplest solution would be to disable restarts for chunked "deep" comparisons. A better, not-much-harder approach would be to make worker_update_chunk() aware of the difference between copy and compare tasks, so that write_chunkinfo() in libmarfs is called only in the copy case. Another alternative would be to do the MD comparisons only when looking at the first chunk. This seems less good, because it still means messing with the MDFS file during comparisons.
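The "better" approach can be sketched roughly as follows. This is an illustration only, not pftool's code: task_kind, side_effects, may_touch_destination() and update_chunk_sketch() are hypothetical names, and the counters merely stand in for the real CTM update and the real write_chunkinfo() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task kinds; pftool's real work-type names may differ. */
typedef enum { TASK_COPY, TASK_COMPARE } task_kind;

/* Hypothetical counters standing in for the real side effects. */
typedef struct {
    int ctm_updates;       /* chunk recorded as done in CTM (restart state) */
    int chunkinfo_writes;  /* destination MDFS file rewritten */
} side_effects;

/* Only a copy is allowed to modify destination metadata. */
static bool may_touch_destination(task_kind kind) {
    return kind == TASK_COPY;
}

/* Stand-in for the per-chunk update path. */
static void update_chunk_sketch(task_kind kind, side_effects *fx) {
    assert(fx != NULL);

    /* CTM is maintained for both tasks, so either one can restart. */
    fx->ctm_updates++;

    /* This is where the write_chunkinfo() call would sit:
     * reached for copies, skipped for compares. */
    if (may_touch_destination(kind))
        fx->chunkinfo_writes++;
}
```

Keeping the decision in one predicate means any other destination-modifying call on the same path can reuse the same check.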
from pftool.
Fixed. When doing compares, we don't touch the destination. This is the "better" solution mentioned above. There was another hitch: update_stats() was also getting called for compares. That call is now skipped for compares as well.
We still maintain CTM for compares, though. That means "deep" chunked compares can be restarted.
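To illustrate what restartable chunked compares mean in practice, here is a minimal in-memory sketch. It is not how pftool stores CTM: ctm_sketch, MAX_CHUNKS and compare_chunks() are hypothetical, and the buffers stand in for source and destination files. Chunks already recorded as done are skipped, and each chunk processed in this run is recorded so that a later run can skip it too.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHUNKS 64

/* Hypothetical in-memory stand-in for per-file restart state. */
typedef struct {
    bool done[MAX_CHUNKS];  /* done[i] is true once chunk i was processed */
} ctm_sketch;

/* Compare src and dst chunk by chunk, skipping chunks already marked done.
 * Returns the number of chunks compared in this run and sets *mismatch
 * if any of them differ. In this sketch a chunk is marked done whether
 * or not it matched. */
static size_t compare_chunks(const unsigned char *src,
                             const unsigned char *dst,
                             size_t len, size_t chunk_size,
                             ctm_sketch *ctm, bool *mismatch) {
    size_t compared = 0;
    *mismatch = false;
    if (chunk_size == 0)
        return 0;

    size_t chunk_count = (len + chunk_size - 1) / chunk_size;
    if (chunk_count > MAX_CHUNKS)
        chunk_count = MAX_CHUNKS;  /* sketch limit, not a real constraint */

    for (size_t i = 0; i < chunk_count; i++) {
        if (ctm->done[i])
            continue;  /* handled by an earlier, interrupted run */

        size_t offset = i * chunk_size;
        size_t n = (len - offset < chunk_size) ? len - offset : chunk_size;

        if (memcmp(src + offset, dst + offset, n) != 0)
            *mismatch = true;

        ctm->done[i] = true;  /* record progress for a possible restart */
        compared++;
    }
    return compared;
}
```

With four chunks of which two are already marked done, a restarted run compares only the remaining two, and a further run compares none.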