Comments (8)

wez commented on April 23, 2024

FWIW, I believe that @dturner-tw already has a working implementation of git + watchman.

To answer your question about since queries, yes, this is one of the core features in watchman.
We track changes along with an abstract clock identifier that ticks as changes are observed.
The watchman service can maintain a symbolic cursor that tracks a change for your specific tool.

For example, you can choose a cursor name like n:mytool (make sure that you don't pick a name that collides with another tool). The first time you issue this query, you'll get information about all the files in the entire tree:

["query", "/path/to/root", {
   "since": "n:mytool",
   "fields": ["name"]
}]

When you issue that query a second time, you'll get just the changes since the last time.

In some cases (if watchman got restarted, or you overflowed inotify kernel limits), watchman will tell you that some files have changed, even if they haven't really, to ensure that you don't miss changes.
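
A minimal Python sketch of the pattern above, assuming the `watchman` binary is on `PATH` and reads a JSON command from stdin when given `-j`. The helper names (`build_since_query`, `run_query`, `changed_paths`) are made up for illustration and are not part of watchman. It builds the same query as above and treats `is_fresh_instance` as the signal that the result covers the whole tree rather than an incremental change set:

```python
import json
import subprocess


def build_since_query(root, cursor="n:mytool", fields=("name",)):
    """Build the JSON command for a since-query against a named cursor."""
    return ["query", root, {"since": cursor, "fields": list(fields)}]


def run_query(command):
    """Send one JSON command to watchman on stdin (assumes `watchman -j`)."""
    result = subprocess.run(
        ["watchman", "-j"],
        input=json.dumps(command),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def changed_paths(response):
    """Return (full_rescan, names) from a query response.

    full_rescan is True when watchman reports a fresh instance, in which
    case the caller should not treat the list as incremental.
    """
    names = []
    for entry in response.get("files", []):
        # Entries may be bare strings or objects depending on the
        # requested fields, so handle both shapes.
        names.append(entry if isinstance(entry, str) else entry["name"])
    return bool(response.get("is_fresh_instance", False)), names
```

A caller would run `changed_paths(run_query(build_since_query("/path/to/root")))` and fall back to a full scan whenever the first element of the result is true.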

You can read a bit more about this stuff here:

https://facebook.github.io/watchman/docs/cmd/query.html
https://facebook.github.io/watchman/docs/file-query.html
https://facebook.github.io/watchman/docs/clockspec.html

from watchman.

pclouds commented on April 23, 2024

Hi,

Yes I know about David's watchman support. I kinda compete with him in this :)

How does a cursor name show me new files since its time? Suppose I have file A already when I register n:mytool, then I delete A and recreate A (editors and compilers do that) and add B. When I query watchman I'd expect to see B only, not A. I can't rely on cclock because cclock would be updated because of the recreation.

$ ls
a
$ ../watchman watch `pwd`
{
    "version": "3.0.0",
    "watch": "<my path>"
}
$ echo '[ "query", "<my path>", { "since" : "n:mytool" } ]' | ../watchman query -j 
{
    "version": "3.0.0",
    "clock": "c:1415665145:5806:1:3",
    "is_fresh_instance": true,
    "files": [
        {
            "name": "a",
            "size": 0,
            "mode": 33188,
            "new": true,
            "exists": true
        }
    ]
}
$ echo '[ "query", "/home/pclouds/w/watchman/z", { "since" : "n:mytool" } ]' | ../watchman query -j 
{
    "version": "3.0.0",
    "clock": "c:1415665145:5806:1:7",
    "is_fresh_instance": false,
    "files": []
}
$ rm a 
$ echo 3>a
$ echo 3>b
$ ls
a  b
$ echo '[ "query", "/home/pclouds/w/watchman/z", { "since" : "n:mytool" } ]' | ../watchman query -j
{
    "version": "3.0.0",
    "clock": "c:1415665145:5806:1:14",
    "is_fresh_instance": false,
    "files": [
        {
            "name": "b",
            "size": 0,
            "mode": 33188,
            "new": true,
            "exists": true
        },
        {
            "name": "a",
            "size": 0,
            "mode": 33188,
            "new": true,
            "exists": true
        }
    ]
}

wez commented on April 23, 2024

From the perspective of the kernel and the filesystem, A is a new file here so that is what watchman is reporting to you. What watchman is saying is that something about A changed since you last looked. You can use that signal as a way to figure out whether you need to open the file and look at its content.

pclouds commented on April 23, 2024

I agree it could be done outside watchman. But that's less efficient. For short-lived programs like git, we would need to keep the list of all files somewhere on disk in order to determine if a file is "new" or not, and pay the I/O cost for this file list (David did this). But we use watchman to reduce I/O in the first place.
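
To make the trade-off concrete, here is a rough Python sketch of the client-side bookkeeping being described. It is only an illustration: the function name `classify` and the idea of keeping the known names in a plain set are assumptions, not how git or David's implementation works. The client has to carry `known` between runs (on disk or in another daemon), which is exactly the cost in question:

```python
def classify(known, reported):
    """Split watchman-reported entries into new, changed, and deleted names.

    known:    set of names the client saw on its previous run (client state).
    reported: list of dicts with at least "name" and "exists", shaped like
              the "files" entries in the query output earlier in the thread.
    Returns (new, changed, deleted, updated_known).
    """
    new, changed, deleted = [], [], []
    updated = set(known)
    for entry in reported:
        name = entry["name"]
        if not entry["exists"]:
            # Only report a deletion for names we actually knew about.
            if name in known:
                deleted.append(name)
            updated.discard(name)
        elif name in known:
            # Watchman may flag this as "new" after a delete and recreate,
            # but from the client's point of view it is an existing path.
            changed.append(name)
        else:
            new.append(name)
            updated.add(name)
    return new, changed, deleted, updated
```

With the session above, `known = {"a"}` and the third query's result would put `b` in the new list and `a` in the changed list, even though watchman marks both as `"new": true`.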

Or we could add yet another daemon to keep the whole file list in memory (so no I/O penalty), duplicating what watchman already keeps in memory. I could go with this, but I was hoping that maybe watchman can be extended somehow to let the user attach some custom attributes to its file list. I can try to work out something if you're interested in this option. Otherwise I think we can close this issue.

pclouds commented on April 23, 2024

Sorry I can't stop thinking about this. Another option, instead of custom attributes, is to support clockspec in the "find" command. The user would need to ask for this in advance (e.g. at "query" time) so we can make a snapshot of the file list (basically one more linked list per cursor/clock in struct watchman_file).

sunshowers commented on April 23, 2024

That sounds far too expensive -- O(number of cursors/clocks * number of files) -- and I'm pretty sure you're going to have to resolve Watchman's file list against your own anyway. Having your own daemon sounds like the correct approach.

wez commented on April 23, 2024

I've thought about custom attributes in watchman in the past, but what it boils down to is that Watchman can't know enough about any specific use case to make optimal choices about slurping data out of files.
This makes it likely that any effort to save I/O in the client will result in a net increase in redundant or unnecessary I/O in Watchman. The next logical step from there is "well, let's add a plugin or extensibility framework so that we can run code in the process itself when files change", and that poses some other challenges: security of dynamic loading if we expose this only in C, embedding scripting language(s) instead of using C, ensuring that those functions return quickly enough that we don't underflow the notification stream, and so on.

I'm not saying that we can't do any of these things, but one of the reasons that we haven't needed to thus far is that the client typically knows more about the nature of the watched root and can make smarter choices about when to incur the I/O cost, and that gain is bigger than we're likely to see by making changes in the watchman service.

The find command is a simple legacy command to find files. The since command is the corresponding simple legacy command to do the same with a clockspec. We recommend that you use the query command; both find and since are internally implemented in terms of query, so you'd be saving some small translation overhead.
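
As a rough illustration of that recommendation, the Python sketch below rewrites a legacy `since` command as a `query` command. The helper `since_as_query` is hypothetical, and the sketch assumes both forms return the same default fields when no `fields` list is given; it does not reproduce watchman's internal translation:

```python
def since_as_query(root, clockspec):
    """Express a legacy `since` lookup as a `query` command."""
    return ["query", root, {"since": clockspec}]


# Legacy form: ["since", <root>, <clockspec>]
legacy = ["since", "/path/to/root", "n:mytool"]

# Same request expressed through `query`.
modern = since_as_query(legacy[1], legacy[2])
```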

pclouds commented on April 23, 2024

I finally agree there's no elegant way to put this in watchman. I guess I'll have to live with another daemon. Thank you for making watchman.
