saracen / lfscache
LFS Cache is a caching Git LFS proxy.
License: MIT License
Does lfscache support distributed deployment? We expect thousands of concurrent container accesses to this cache. Alternatively, can I deploy lfscache on multiple machines and then use Nginx as a reverse proxy? Would this approach be feasible for achieving a distributed setup?
Wondering if there is any appetite for TLS support on the cache? I have worked around this myself using a load balancer for my use case, but since net/http is already in there, I did a quick test of just having it as a boolean config (with a static path) and using the built in listen with TLS, see here:
https://github.com/comerford/lfscache/blob/master/main.go#L61
Basic test worked, would just need some sane way to specify the cert paths in the options rather than static config, but will leave that to someone with more actual Go experience.
P.S. Super useful little program - thanks!
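One way the certificate paths could be made configurable is sketched below. The flag names --tls-cert and --tls-key are hypothetical (lfscache does not define them), and the helper names are illustrative only. The sketch just shows the decision between http.ListenAndServe and http.ListenAndServeTLS; main only reports which mode would be chosen so that it exits immediately.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"net/http"
	"os"
)

// tlsConfigured reports whether TLS should be used. Both paths must be set
// together; setting only one is treated as a configuration error.
func tlsConfigured(certFile, keyFile string) (bool, error) {
	switch {
	case certFile == "" && keyFile == "":
		return false, nil
	case certFile == "" || keyFile == "":
		return false, errors.New("both --tls-cert and --tls-key must be provided")
	default:
		return true, nil
	}
}

// listen starts a plain or TLS listener depending on the supplied paths.
// It blocks until the server stops.
func listen(addr, certFile, keyFile string, h http.Handler) error {
	useTLS, err := tlsConfigured(certFile, keyFile)
	if err != nil {
		return err
	}
	if useTLS {
		return http.ListenAndServeTLS(addr, certFile, keyFile, h)
	}
	return http.ListenAndServe(addr, h)
}

func main() {
	addr := flag.String("http-addr", ":9876", "listen address")
	// Hypothetical flags, not part of lfscache.
	cert := flag.String("tls-cert", "", "path to a PEM certificate")
	key := flag.String("tls-key", "", "path to a PEM private key")
	flag.Parse()

	useTLS, err := tlsConfigured(*cert, *key)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("would listen on %s (TLS: %v)\n", *addr, useTLS)
	// A real server would now call listen(*addr, *cert, *key, handler).
}
```

Requiring both paths together avoids silently falling back to plain HTTP when only one of them is given.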
Looked at the code, you expect http or https instead of just plain github.com/...
Hey,
Your tool is quite nice, especially since downloading from AWS is ridiculously slow for us lately.
It would be very useful in our local network, with multiple machines using multiple GitHub repositories, if we could just have a static DNS entry from e.g. github-cloud.s3.amazonaws.com to the local lfscache machine. That way all personal access tokens stay valid, laptops moving to other networks don't have to change their repositories' LFS path every time, etc.
For that it would be necessary to be able to configure git-lfs to use the full original URL as with github.com, so /org/repo.git/info/lfs which would be used by lfscache instead of the --url parameter (the machine with lfscache would have to not use the local dns and then access the same HTTP Location URL it got from the request). Additionally, since it would only be necessary to cache certain repos, supplying a list of URLs for which the current cache is used would be useful and then all other requests would simply be passed through.
So, not sure if this is something you would want to add and if it's too much work, but let me know what you think.
--url github.com/org/repo.git/info/lfs
specifies the URL down to the "repo.git" level. Is there a way to configure it as "--url github.com", so that different project clients automatically access their respective remote LFS servers? For example, the LFS server for "org/repo1.git" is "github.com/org/repo1.git/info/lfs", and the LFS server for "org/repo2.git" is "github.com/org/repo2.git/info/lfs".
I am trying to build and modify this project. I am getting a build error; I guess it is related to the Go version, but I am not sure.
I would appreciate a hint about the error.
Thanks for this important project.
vagrant@devbox:~/go$ go get github.com/saracen/lfscache
github.com/saracen/lfscache/server
src/github.com/saracen/lfscache/server/server.go:170:64: unknown field 'ErrorHandler' in struct literal of type httputil.ReverseProxy
vagrant@devbox:~/go$ go version
go version go1.10.4 linux/amd64
Currently, the proxy cannot be used with GitHub; the following error messages are output:
Error: fatal: Unencrypted HTTP is not supported for GitHub. Ensure the repository remote URL is using HTTPS.
Error: fatal: could not read Username for 'http://localhost:4224': terminal prompts disabled
We have an LFS repo with 60,000+ files. I am seeing an issue where some files are downloaded and cached correctly, but the program crashes after a number of files.
Output looks like this:
level=info ts=2020-06-23T12:06:03.1992193Z event=fetched oid=9230f90f3b3f461400dd00fdb3c2257c22332466bb9ad0c8476e7a23d46c458a took=567.9984ms downloaded=45180/45180 rate="852 KB/s"
level=info ts=2020-06-23T12:06:03.2002201Z event=served oid=9230f90f3b3f461400dd00fdb3c2257c22332466bb9ad0c8476e7a23d46c458a source=fresh took=570.0011ms size=45180 rate="79 KB/s"
level=info ts=2020-06-23T12:06:03.21022Z event=serving oid=63b5bca91e6758cd576111976ed5c69887e2a57b074db0aca53a5843ebecc9ff source=fresh
level=info ts=2020-06-23T12:06:03.21022Z event=fetching oid=63b5bca91e6758cd576111976ed5c69887e2a57b074db0aca53a5843ebecc9ff
panic: runtime error: index out of range [-9223372036854775808]
goroutine 130 [running]:
github.com/git-lfs/git-lfs/tools/humanize.FormatByteRate(0xb07c, 0x0, 0x9f61a0, 0x0)
/home/travis/gopath/pkg/mod/github.com/git-lfs/[email protected]+incompatible/tools/humanize/humanize.go:160 +0x318
github.com/saracen/lfscache/server.(*Server).fetch.func1(0xc0001926f0, 0xc0008efee0, 0xc0000d4180, 0xc000188130, 0x40, 0xbfb499eacc970048, 0x40e92108d, 0x9f61a0, 0xb07c, 0xc0008effd0)
/home/travis/gopath/src/github.com/saracen/lfscache/server/server.go:405 +0x7d
github.com/saracen/lfscache/server.(*Server).fetch(0xc0000d4180, 0x272f0240, 0xc0001925a0, 0xc000188130, 0x40, 0xc0000a4000, 0x50, 0xb07c, 0xc000192540, 0x0, ...)
/home/travis/gopath/src/github.com/saracen/lfscache/server/server.go:445 +0x584
created by github.com/saracen/lfscache/server.(*Server).serve
/home/travis/gopath/src/github.com/saracen/lfscache/server/server.go:354 +0x6b3
Update 1:
Version is: 0.1.3, commit 848d795, built at 2020-06-20T14:23:14Z
I have a git repo that uses LFS. The LFS server is a totally different machine.
github: [email protected]/.......
lfs https://machine.domain:9999/artifactory/mystore
it was started with
/lfscache --url https://machine.domain:1339/artifactory/git_lfs_store/info/lfs --directory /srv/docker-helper/shares/cache/gitcache/lfs --http-addr=172.17.0.1:9876
The error
level=error ts=2019-12-12T00:41:04.943901096Z event=proxying request=https://machine.domain:1339/artifactory/git_lfs_store/info/lfs/objects/batch err="tls: first record does not look like a TLS handshake"
But I do not see network traffic going to the lfs-store machine.
Q1> In the URL, do I need .../info/lfs? I think it might work better without the added /info/lfs.
Q2> looking at the server code I see
'''
http: req.TLS == nil
'''
The incoming request is HTTP, but the outgoing request needs TLS.
It would look like this: clone-machine : http <-> lfsproxy : https <-> lfs-store
Can one server instance act as a cache for multiple projects?
This works 👍:
--url https://<server>/depot.git/info/lfs
git config lfs.url http://<user>:<token>@localhost:8080
But then I wanted to move the project-specific part depot.git/info/lfs to lfs.url and do the same for other projects (all hosted on the same server, of course). However, this seems to make my git-lfs fall back to the "origin" remote instead of lfs.url:
--url https://<server>
git config lfs.url http://<user>:<token>@localhost:8080/depot.git/info/lfs
I do see a small amount of network traffic going to the lfscache server so it could be that my git-lfs is making requests but that lfscache can't find the binaries and that this makes my git-lfs fall back to "origin".
Could this feature be added?
Loving this tool. We'd like to try using it with our private repositories, but it seems to fail with a 403 error when we use it with our HTTPS credentials and a personal access token, which works when we git clone directly (i.e. without using lfscache as a proxy).
Can you tell us how to use lfscache to proxy private LFS repositories on github?
Using a github.com personal access token in place of <password>, we run the lfscache server with:
lfscache_0.1.0_darwin_amd64/lfscache --url <username>:<password>@github.com/<org>/<repo>.git/info/lfs --directory /tmp/lfstest --http-addr=:9876
Configure our git lfs url:
git config --global --replace-all lfs.url "http://localhost:9876/"
Then attempt to clone with:
git clone git@github.com:<org>/<repo>.git
Our lfscache server outputs:
level=error ts=2019-01-03T15:32:37.504463Z event=proxying request=https://<username>:<password>@github.com/<org>/<repo>.git/info/lfs/objects/batch err="remote server responded with 403 status code"
However, the following command works, which suggests that the issue lies with lfscache:
git clone https://<username>:<password>@github.com/<org>/<repo>.git
Thanks,
Tom
Hey!
I am trying to set up caching for a private repository we host on Github. I saw that there was an issue filed in the past, but the author does not go into a lot of detail on how to set up such a scenario properly: #1
I am trying to set up authentication with a Personal Access Token / PAT. As I understand it, lfscache transparently passes through authentication, so one would provide it on the client, rather than when starting lfscache.
I also tried the avenue via gh auth login and not passing username:PAT when setting lfs.url, same result.
On the cache host, I am getting this error message when I try to pull:
level=error ts=2022-10-01T09:47:07.026010538Z event=proxying request=https://github.com/[org]/[repo]/info/lfs/objects/batch err="remote server responded with 403 status code"
These are the steps I take:
On the lfscache host:
./lfscache --url https://github.com/[org]/[repo]/info/lfs --directory lfs --http-addr=:9876
On the client:
git init
git remote add origin https://[username]:[PAT]@github.com/[org]/[repo].git
git config lfs.url http://[username]:[PAT]@[lfscache-ip]:9876
Please advise.
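One way to check the assumption that credentials supplied on the client reach the upstream is to reproduce the pass-through in isolation. The sketch below uses Go's stock httputil.ReverseProxy as a stand-in for lfscache, which is an assumption; lfscache's own request handling may differ. The names creds and credentialsSeenUpstream are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// creds records what the upstream server saw in the Authorization header.
type creds struct {
	user, token string
	ok          bool
}

// credentialsSeenUpstream sends a request with Basic credentials through a
// reverse proxy and reports what the upstream server received.
func credentialsSeenUpstream(user, token string) (creds, error) {
	seen := make(chan creds, 1)
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, t, ok := r.BasicAuth()
		select {
		case seen <- creds{u, t, ok}:
		default: // keep only the first observation
		}
	}))
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		return creds{}, err
	}
	proxy := httptest.NewServer(httputil.NewSingleHostReverseProxy(target))
	defer proxy.Close()

	req, err := http.NewRequest(http.MethodPost, proxy.URL+"/objects/batch", nil)
	if err != nil {
		return creds{}, err
	}
	req.SetBasicAuth(user, token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return creds{}, err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return creds{}, fmt.Errorf("proxy returned status %d", resp.StatusCode)
	}
	return <-seen, nil
}

func main() {
	c, err := credentialsSeenUpstream("username", "PAT")
	if err != nil {
		panic(err)
	}
	fmt.Printf("upstream saw user=%q, credentials present=%v\n", c.user, c.ok)
}
```

If the upstream in such a test sees the credentials but the real server still answers 403, the token's scopes or the upstream URL are the next things to compare against a direct clone.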
We're trying to configure lfscache in a high availability mode, using 2 or more replicas. One replica works just fine, but once a second is added, we see git LFS client errors:
LFS: Client error: http://localhost:8000/_lfs_cache/77ec36d03f6209383a19162c683732df373d444b689698b50924d8815be485d4
LFS: Client error: http://localhost:8000/_lfs_cache/777c309f83d684467825cbc5ab81354fa62e455c70ad748fec80efa4af5ccb99
Sample logs:
12:04:08.710310 trace git-lfs: tq: starting transfer adapter "basic"
12:04:08.711549 trace git-lfs: HTTP: GET http://localhost:8000/_lfs_cache/6ae41d077e8556623c7cdcbaac1bab58058e247732c28d81e98e62d16bc6fd95
12:04:08.714053 trace git-lfs: HTTP: 200
12:04:08.715580 trace git-lfs: HTTP: GET http://localhost:8000/_lfs_cache/12ff0790a4e9c7cefb9d3a78feb13f94d8c70463344e4e2f2c7082f4bf6e231a
12:04:08.718284 trace git-lfs: HTTP: 400
12:04:08.719041 trace git-lfs: tq: refusing to retry "12ff0790a4e9c7cefb9d3a78feb13f94d8c70463344e4e2f2c7082f4bf6e231a", too many retries (8)
12:04:08.719088 trace git-lfs: tq: refusing to retry "12ff0790a4e9c7cefb9d3a78feb13f94d8c70463344e4e2f2c7082f4bf6e231a", too many retries (8)
12:04:08.720393 trace git-lfs: HTTP: GET http://localhost:8000/_lfs_cache/77ec36d03f6209383a19162c683732df373d444b689698b50924d8815be485d4
12:04:08.723638 trace git-lfs: HTTP: 400
12:04:08.724289 trace git-lfs: tq: refusing to retry "77ec36d03f6209383a19162c683732df373d444b689698b50924d8815be485d4", too many retries (8)
12:04:08.724324 trace git-lfs: tq: refusing to retry "77ec36d03f6209383a19162c683732df373d444b689698b50924d8815be485d4", too many retries (8)
12:04:08.725621 trace git-lfs: HTTP: GET http://localhost:8000/_lfs_cache/777c309f83d684467825cbc5ab81354fa62e455c70ad748fec80efa4af5ccb99
12:04:08.728185 trace git-lfs: HTTP: 400
12:04:08.728749 trace git-lfs: tq: refusing to retry "777c309f83d684467825cbc5ab81354fa62e455c70ad748fec80efa4af5ccb99", too many retries (8)
12:04:08.728761 trace git-lfs: tq: refusing to retry "777c309f83d684467825cbc5ab81354fa62e455c70ad748fec80efa4af5ccb99", too many retries (8)
The lfscache server produces no error logs, so it's difficult to track this down.
It looks like the signature verification of the X-Lfs-Signature header is failing. The HMAC key is randomly generated for each instance, so a batch request served by one lfscache instance would not be valid against the other replicas.
I think being able to provide the hmac key as a CLI argument or environment variable would resolve this. Also, it would be nice if lfscache printed an error to the logs in this case.
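The idea can be sketched as follows. The signing scheme shown here (HMAC-SHA256 over the object ID, hex-encoded) and the environment variable name LFSCACHE_HMAC_KEY are assumptions for illustration; lfscache's actual signature format may differ. The point is only that replicas sharing a key accept each other's signatures, while replicas with independent random keys do not.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// sign returns a hex-encoded HMAC-SHA256 of the object ID under key.
func sign(key []byte, oid string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(oid))
	return hex.EncodeToString(mac.Sum(nil))
}

// verify checks a signature using a constant-time comparison.
func verify(key []byte, oid, sig string) bool {
	got, err := hex.DecodeString(sig)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(oid))
	return hmac.Equal(mac.Sum(nil), got)
}

func main() {
	// LFSCACHE_HMAC_KEY is a hypothetical variable name.
	key := []byte(os.Getenv("LFSCACHE_HMAC_KEY"))
	if len(key) == 0 {
		key = []byte("example-shared-key") // placeholder for the demo only
	}
	oid := "77ec36d03f6209383a19162c683732df373d444b689698b50924d8815be485d4"

	sig := sign(key, oid) // issued by replica A in the batch response
	fmt.Println("replica B, same key:     ", verify(key, oid, sig))
	fmt.Println("replica B, different key:", verify([]byte("other-key"), oid, sig))
}
```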