uptake / cran-server
Self-hosted R package repository
License: BSD 3-Clause "New" or "Revised" License
Need to add a LICENSE file to the root level.
It would be great to get the UI stuff working with a build tool like Webpack
A few other projects let you specify a subset of CRAN to mirror. This might be a desirable feature.
Tests currently run at the package root and create a /src/contrib directory there.
Consider support for installing old versions of packages with devtools: https://github.com/r-lib/devtools/blob/master/R/install-version.r#L45
Currently the only way to do it is to go to the path directly,
install.packages("http://cran.mycompany.com/src/contrib/mypackage_0.3.2.tar.gz", repos = NULL)
@ntdef @ngparas I think it would be valuable to create one or more "milestones" here and attach open issues to them. This has been really helpful for @jayqi, @bburns632, and me on pkgnet.
This project should be documented on RTD!
This involves:
.readthedocs.yml, e.g. https://github.com/jameslamb/doppel-cli/blob/master/.readthedocs.yml
Dockerfile installs are happening separate from the setup.py.
The code is modular enough to support other backends, but to swap out a backend you would actually have to modify the codebase as-is. Ideally, you should be able to point to a Python class at startup and use that as the storage backend.
It's probably worthwhile to look into how Flask/Gunicorn does this when looking for the app object.
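One way this could work is sketched below, assuming the backend is named with a Gunicorn-style "module:ClassName" string. The function name load_backend is hypothetical and not part of cran-server, and a standard-library class stands in for a real storage class so the sketch runs anywhere.

```python
import importlib


def load_backend(spec):
    """Resolve a 'package.module:ClassName' string to the object it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:ClassName', got {spec!r}")
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError:
        raise ImportError(f"{module_name!r} has no attribute {attr!r}") from None


# Stand-in for a storage class; a real deployment would name its own class here.
backend_cls = load_backend("collections:OrderedDict")
```

The string could come from an environment variable or a command-line flag, so swapping the backend would not require editing the codebase.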
Support other cloud storage besides aws s3
Unit tests should run first, then integration.
Per TODO removed in #71 .
The __iter__ method on FileStorage should exclude PACKAGES when listing the main directory.
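A minimal sketch of that behavior follows. FileStorageSketch is a hypothetical stand-in, not the project's FileStorage class, and it assumes the storage lists a single directory on disk.

```python
import os
import tempfile


class FileStorageSketch:
    """Hypothetical stand-in for FileStorage; lists one directory."""

    def __init__(self, directory):
        self.directory = directory

    def __iter__(self):
        for name in sorted(os.listdir(self.directory)):
            # PACKAGES is the repository index, not a package artifact.
            if name == "PACKAGES":
                continue
            yield name


with tempfile.TemporaryDirectory() as root:
    for name in ("PACKAGES", "httr_1.3.1.tar.gz"):
        open(os.path.join(root, name), "w").close()
    listed = list(FileStorageSketch(root))
```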
Needs some tests that go beyond unit tests.
Because the fileobj isn't copied into a buffer local to the Package instance, we require resets outside of the class, e.g. https://github.com/UptakeOpenSource/cran-server/blob/master/cranserver/server.py#L88
The constructor should copy the fileobj into a new BytesIO.
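The idea might look like the sketch below. PackageSketch is a hypothetical class rather than the project's Package, and it assumes the incoming fileobj is positioned at the start of the upload.

```python
import io


class PackageSketch:
    """Hypothetical stand-in for Package that owns a private copy of the upload."""

    def __init__(self, fileobj):
        # Copy the bytes once so later reads never depend on the caller's stream.
        self._buffer = io.BytesIO(fileobj.read())

    def read(self):
        # Rewind the private buffer internally; callers never need to reset anything.
        self._buffer.seek(0)
        return self._buffer.read()


upload = io.BytesIO(b"fake tarball bytes")
pkg = PackageSketch(upload)
first = pkg.read()
second = pkg.read()
```

The trade-off is that the whole upload is held in memory, which is probably acceptable for typical source tarballs.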
We need some good library api docs.
Consider recording metrics on package downloads
It would be great if cran-server supported something like MRAN snapshots https://mran.microsoft.com/documents/rro/reproducibility
You shouldn't have to reach into lib to get classes. All the relevant public ones should be exported.
Can you add a note in the README that makes it explicit that there's a UI bundled with this? Right now if I just read the README I'd think this just runs the CRAN service.
Can you add something like "navigate to <host>:8080 in your browser to see the UI..."
@ngparas what do you think about swapping out Vue for straight Jinja2 templates? We could probably achieve the same effect.
See docs here.
There are some environment variables in server.py that affect deployment but aren't properly documented.
I hooked up our gh-pages to look at the docs/ folder on master. My suspicions were confirmed... markdown files don't get automagically rendered there.
https://uptakeopensource.github.io/cran-server/why_cran_server.md opens as a download link :/
I think we need to render the HTML equivalent of that file and check it into the repo in the docs/ folder. Thoughts?
The default configuration (local filesystem) used in the quickstart can have inconsistent behavior when installing packages from R.
Following the instructions in the quickstart, I uploaded httr version 1.3.1 to test installs. I can start an R session and run
> install.packages('httr', repos = c('http://localhost:8080'))
and, seemingly at random, it will either succeed or fail after not finding the bin/macosx PACKAGES file.
There are no R-related tags for this project
Consider supporting a sqlite backend instead of writing to the PACKAGES text file directly
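A rough sketch of what a sqlite-backed index could look like is below. The table layout and the function names are assumptions for illustration only, and just the Package and Version fields are stored; a real index would carry more fields.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a hypothetical package index database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        " name TEXT NOT NULL,"
        " version TEXT NOT NULL,"
        " PRIMARY KEY (name, version))"
    )
    return conn


def add_package(conn, name, version):
    # Using the connection as a context manager commits or rolls back the write.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO packages (name, version) VALUES (?, ?)",
            (name, version),
        )


def render_packages(conn):
    """Render the rows as PACKAGES-style records separated by blank lines."""
    rows = conn.execute(
        "SELECT name, version FROM packages ORDER BY name, version"
    ).fetchall()
    records = [f"Package: {name}\nVersion: {version}" for name, version in rows]
    return ("\n\n".join(records) + "\n") if records else ""
```

With this shape the PACKAGES text becomes a derived view, so the server could generate it on request instead of editing a file in place. Note that the ORDER BY here sorts versions as text, not as version numbers.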
I think it would be awesome to have a vignette describing how to use this on AWS's Elastic Container Service. Ideally this would cover:
Thoughts?
Right now some files lying around in the repo will be copied in when building the container.
For background see Do Not Ignore .dockerignore
Need to add requests to this list.
Test the S3 storage backend against S3.
The library should be properly documented.
Add pycodestyle to the CI setup for this project to prevent PRs from introducing style issues and to document the preferred style for this project.
See this example PR for how to add this type of check to your build. Basically, you need to run the pycodestyle command on the repo, and you can optionally configure what it checks with a tox.ini file.
Docs here: http://pycodestyle.pycqa.org/en/latest/intro.html
server.py relies on the R CMD build tarball naming convention when receiving uploads and doesn't do much in the way of validation. We should at least check that uploads conform to this convention.
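A minimal sketch of such a check is below, assuming the convention is `<name>_<version>.tar.gz`, with the name starting with a letter and the version made of integers separated by "." or "-". The function name parse_tarball_name is hypothetical and not part of server.py.

```python
import re

# Package name: starts with a letter, then letters, digits, or dots, not ending in a dot.
# Version: at least two integers separated by "." or "-".
TARBALL_RE = re.compile(
    r"(?P<name>[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9])"
    r"_(?P<version>[0-9]+(?:[.-][0-9]+)+)"
    r"\.tar\.gz"
)


def parse_tarball_name(filename):
    """Return (name, version) if filename follows the convention, else None."""
    match = TARBALL_RE.fullmatch(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")
```

An upload handler could reject the request when this returns None, before anything is written to storage.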
Consider adding a /delete route to the server. Currently the only way to remove packages is to manually edit the PACKAGES file and manually remove the artifact from s3/the file system.
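The index half of a delete could be sketched as below. This assumes PACKAGES holds "Field: value" records separated by blank lines; remove_package is a hypothetical helper, and removing the artifact itself from S3 or the file system is left out.

```python
def remove_package(packages_text, name, version):
    """Return PACKAGES text without the record matching name and version."""
    kept = []
    for record in packages_text.strip().split("\n\n"):
        if not record.strip():
            continue
        fields = {}
        for line in record.splitlines():
            # Skip continuation lines (leading whitespace) and lines without a field.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields.get("Package") == name and fields.get("Version") == version:
            continue
        kept.append(record.strip())
    return ("\n\n".join(kept) + "\n") if kept else ""


index = "Package: httr\nVersion: 1.3.1\n\nPackage: httr\nVersion: 1.4.0\n"
remaining = remove_package(index, "httr", "1.3.1")
```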
Locking logic is currently implemented in server.py. We should consider pushing this logic into the storage instead of the server itself. For example, suppose we used something like sqlite or redis to handle the package metadata instead of writing to the PACKAGES file directly. We would prefer those handle locking over doing it ourselves, and it would allow multiple instances of the server container to run concurrently.
This issue addresses the TODO left in the code here:
https://github.com/UptakeOpenSource/cran-server/blob/master/cran-server/server.py#L102
Looks like the AWS S3 version was not updated to meet the new storage API.
Need to add some more description before jumping right into the Quick Start section.
First tests should target the file system storage backend.
Add build status badge to README.
Boto3 seems rather large when all we need to do is touch S3. I've been looking at tinyS3 which seems pretty neat.
FileStorage class currently assumes the existence of a ./src/contrib folder. It should create it if the folder doesn't exist on init.
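A small sketch of that fix follows, using a hypothetical FileStorageSketch class with an assumed root argument, not the project's actual constructor signature.

```python
import os
import tempfile


class FileStorageSketch:
    """Hypothetical stand-in for FileStorage that ensures its directory exists."""

    def __init__(self, root="."):
        self.contrib_dir = os.path.join(root, "src", "contrib")
        # exist_ok=True makes this safe to call when the folder is already there.
        os.makedirs(self.contrib_dir, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    storage = FileStorageSketch(root)
    created = os.path.isdir(storage.contrib_dir)
    # A second init against the same root should not raise.
    FileStorageSketch(root)
```

Accepting the root as a parameter would also let tests point the storage at a temporary directory.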
This repo needs some kind of CI. I'm not sure of the best way to test it, but we should at least set up a skeleton with Travis to test that the service can be run and basic functionality works.
Will make it easier to welcome new contributors to help out!
Add support for authentication and roles. As an added bonus, it would be nice to support OAuth if possible.
Issue #9 is dependent on this.
Need to get a code coverage badge on here.
Right now this project only supports packages stored under /src/. To offer a full-featured CRAN repo we need to support binary packages.
I don't have a Windows environment to test with but we should procure one.
Apparently install.packages supports multicore processing. We should test that cran-server doesn't break when a client uses the option.
It would go a long way to add a benefits / why this is awesome section in the README.
Some questions might be:
A flashy example would add value too.
I don't think we need to inherit from the conda image.
See this line
https://github.com/UptakeOpenSource/cran-server/blob/999b7b8ff5d5b0418f8feccf76711d11912a8541/Dockerfile#L1