r-hub / cransays
Creates an Overview of CRAN Incoming Submissions :mailbox_with_mail:
Home Page: https://r-hub.github.io/cransays/articles/dashboard.html
License: Other
https://cransays.itsalocke.com/articles/dashboard.html
The 'CRAN review workflow' link in the sidebar is broken. For some reason, it can't be clicked.
whilst finding a way to guess the year probably from the difference between the current time and the submission time, taking timezones into account.
cf https://github.com/lockedata/cransays/blob/master/R/take-snapshot.R#L99
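A minimal sketch of the year-guessing idea, assuming the listing gives month, day, hour and minute but no year, and assuming the server times are UTC (neither is confirmed here). The function name and the one-day tolerance are hypothetical, not part of cransays:

```typescript
// Hypothetical helper, not cransays code. `month` is 1-12.
// Assumption: all timestamp fields are expressed in UTC.
function guessSubmissionTime(
  month: number,
  day: number,
  hour: number,
  minute: number,
  now: Date = new Date(),
  toleranceMs: number = 24 * 60 * 60 * 1000 // slack for clock or timezone skew
): Date {
  const year = now.getUTCFullYear();
  const candidate = new Date(Date.UTC(year, month - 1, day, hour, minute));
  // A submission cannot come from the future: if the candidate lies further
  // ahead of "now" than the tolerance, assume it belongs to the previous year.
  if (candidate.getTime() - now.getTime() > toleranceMs) {
    return new Date(Date.UTC(year - 1, month - 1, day, hour, minute));
  }
  return candidate;
}
```

With this rule, a "Dec 30" entry seen on 2 January is attributed to the previous year, while an entry a few hours ahead of the current time stays in the current year.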
I know cransays is not really meant to deliver code, but I have some code to merge all the CSV files of the history branch, and I think it would be helpful to others (and to me) if it were documented here.
The code efficiently merges files with different headers (previous iterations of the code took 30 minutes; now it runs in just 1).
I think it has no dependencies and wouldn't need to be run or tested, but it could help others who want to analyze the data.
Let me know if it would be helpful/appropriate and I will create a PR with the code.
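For illustration only (this is not the code offered above), merging CSV snapshots whose headers differ could be sketched as follows. The sketch takes the union of all column names in first-seen order and fills missing cells with an empty string. It assumes simple CSV content with no quoted fields or embedded commas, and all names are hypothetical:

```typescript
type Row = Record<string, string>;

// Naive parser. Assumption: no quoted fields, embedded commas or newlines.
function parseCsv(text: string): { header: string[]; rows: Row[] } {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return { header: [], rows: [] };
  const header = lines[0].split(",");
  const rows = lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Row = {};
    header.forEach((name, i) => {
      row[name] = cells[i] ?? "";
    });
    return row;
  });
  return { header, rows };
}

// Merge several CSV texts into one table with the union of their columns.
function mergeCsv(texts: string[]): { header: string[]; rows: Row[] } {
  const header: string[] = [];
  const seen = new Set<string>();
  const rows: Row[] = [];
  for (const text of texts) {
    const parsed = parseCsv(text);
    for (const name of parsed.header) {
      if (!seen.has(name)) {
        seen.add(name);
        header.push(name);
      }
    }
    for (const row of parsed.rows) rows.push(row);
  }
  // Give every row the full set of columns, using "" for missing values.
  const filled = rows.map((row) => {
    const out: Row = {};
    for (const name of header) out[name] = row[name] ?? "";
    return out;
  });
  return { header, rows: filled };
}
```

Each file is parsed once and rows are only completed at the end, so the cost stays linear in the total number of cells.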
Right now the sorting follows character order. Ideally, it would be sorted by number of days while keeping the current "pretty" display ("number of days days ago").
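One way to get this, sketched here under assumptions, is to keep a numeric column for sorting next to the display label. The type and function names are hypothetical, and the actual dashboard is built in R, so this only illustrates the principle:

```typescript
interface Submission {
  pkg: string;
  submittedAt: Date;
}

interface AgeRow {
  pkg: string;
  days: number; // numeric sort key
  label: string; // what the reader sees
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between submission and `now`.
function daysSince(submittedAt: Date, now: Date): number {
  return Math.floor((now.getTime() - submittedAt.getTime()) / MS_PER_DAY);
}

function prettyAge(days: number): string {
  return days === 1 ? "1 day ago" : `${days} days ago`;
}

// Sort on the number, display the label.
function toSortedRows(subs: Submission[], now: Date): AgeRow[] {
  return subs
    .map((s) => {
      const days = daysSince(s.submittedAt, now);
      return { pkg: s.pkg, days, label: prettyAge(days) };
    })
    .sort((a, b) => a.days - b.days);
}
```

Sorting the labels as strings would place "10 days ago" before "2 days ago"; sorting on `days` avoids that while the displayed text is unchanged.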
I'm still undecided since it'd mean unpacking the .tar.gz to find the URL/BugReports. If we did that, we'd need to cache results somehow in order not to unpack all .tar.gz every hour.
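A rough sketch of such a cache, keyed by tarball file name so that each .tar.gz is unpacked at most once. The extraction step is passed in as a function because how the URL/BugReports fields would be read is not decided here. All names are hypothetical, and saving the snapshot between hourly runs (for instance as a JSON file) is left out:

```typescript
type Links = { url: string | null; bugReports: string | null };

// Wraps a (hypothetical) `extract` function with a cache keyed by tarball name.
// `initial` can hold entries saved by a previous run.
function makeCachedLookup(
  extract: (tarball: string) => Links,
  initial: Record<string, Links> = {}
) {
  const cache = new Map<string, Links>();
  for (const key of Object.keys(initial)) cache.set(key, initial[key]);

  return {
    // Returns cached links, unpacking only on the first request per tarball.
    lookup(tarball: string): Links {
      const hit = cache.get(tarball);
      if (hit !== undefined) return hit;
      const links = extract(tarball);
      cache.set(tarball, links);
      return links;
    },
    // Plain object that could be serialized and reloaded on the next run.
    snapshot(): Record<string, Links> {
      const out: Record<string, Links> = {};
      cache.forEach((value, key) => {
        out[key] = value;
      });
      return out;
    },
  };
}
```

Since a tarball name includes the package version, a resubmission under a new version gets a new key and is unpacked again. A resubmission that reuses the same file name would be served from the cache.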
Then using the information:
It seems there are two versions of the dashboard. One is updated, but not the other.
Correct: https://lockedata.github.io/cransays/articles/dashboard.html
Not updated ? https://itsalocke.com/cransays/articles/dashboard
Cc @Bisaloo
Opening this issue so we have a public & central place to discuss this matter.
Having the historical data on a branch seems suboptimal:
The cleanest option is probably to store this data in an actual database, hosted on an external service. This makes sense since we're not actually changing the file contents, just adding new files, and therefore don't need a Version Control System. But:
Another simpler (albeit imperfect) option would be to store the historical data in a distinct GitHub repository. This uses tools we already know, is free, public & easy to find.
@Bisaloo
There are now two new folders, "newbies" and "waiting".
Cc @Bisaloo
Today on the r-pkg-devel mailing list there was a question about the meaning of the folders/labels
Reference email from UweLigges: https://stat.ethz.ch/pipermail/r-package-devel/2022q2/008084.html (and see the whole thread)
And there is an email from Ben Bolker highlighting the difference with foghorn: fmichonneau/foghorn#42
The dashboard has been stuck at 2019-05-23 11:41 UTC+0000 for a while now.
The directory archive is used on the CRAN incoming FTP but is not described in the dashboard's status descriptions.
Description from R Journal 01/2018 (https://journal.r-project.org/archive/2018-1/cran.pdf) is:
archive: reject the package, if the package does not pass the checks cleanly and the problems are not likely to be false positives.
Right now, it is a bit obscure for anyone unfamiliar with the submission process what each folder means. If I'm a package maintainer, I can see that my package has moved to pretest, but what does it mean?
My understanding is that there is no official documentation about the meaning of each folder (it may even depend on each maintainer?), so this may be a bit difficult.
The diagram from https://github.com/edgararuiz/cran-stages could help here but I'm not sure under which license it's been released.
In particular, maybe foghorn::cran_incoming() can help make the code here more elegant? h/t @fmichonneau
Can you please add the incoming time to the history branch CSV snapshots?
Those currently only include snapshot_time, which is constant for all of them.
I am trying to use the GitHub API to build an alternative frontend for the data.
Thank you for your time.
Edit: For reference I am doing this (TypeScript):
async function fetchCranSays() {
  // fetch the last commit by the actions bot on the history branch
  const reCommits: any[] = await fetch(
    "https://api.github.com/repos/lockedata/cransays/commits?sha=history&author=actions-user&per_page=1"
  ).then((response) => response.json());
  const commitSha = reCommits[0].sha;
  console.log(commitSha);

  // fetch the commit details (for the CSV filename)
  const reCommitExt: any = await fetch(
    "https://api.github.com/repos/lockedata/cransays/commits/" + commitSha
  ).then((response) => response.json());
  const csvFilename = reCommitExt.files[0].filename;
  console.log(csvFilename);

  // fetch the CSV file; the contents API returns it base64-encoded
  const reCsv: any = await fetch(
    "https://api.github.com/repos/lockedata/cransays/contents/" +
      csvFilename +
      "?ref=history"
  ).then((response) => response.json());
  const csv = atob(reCsv.content);
  console.log(csv);
}
Appears to be last updated 2019-12-13 14:46 UTC+0000.
Brilliant site, by the way, looks great and very insightful for submitters. Cheers!!!
On the dashboard, it is written:
pending: the CRAN maintainers are waiting for an action on your side. You should check your emails!
Yet, for the lightr package, which is currently in pending, I got the following email:
Dear maintainer,
package lightr_1.2.tar.gz has been auto-processed and is pending a manual inspection. A CRAN team member will typically respond to you within the next 5 working days. For technical reasons you may receive a second copy of this message when a team member triggers a new check.
Instead of a big table, have a table with columns reflecting the diagram in https://github.com/edgararuiz/cran-stages
It seems like the dashboard is not being updated. A look at the GitHub Actions page indicates that it is due to an automated deactivation of the cron job. I hope you can fix it and have your great service up and running again. Thanks for providing this service to the community!
I can't go on the dashboard
https://cransays.itsalocke.com/articles/dashboard.html
nor the website
https://cransays.itsalocke.com
Is this just me ?
remove repetition of code
have a better format for the human subfolders (DSok/ -> DS/ok)
for each line add a direct link to the corresponding folder
Hi,
great work on the dashboard.
I built a similar website, but with a different approach:
Website:
https://nx10.github.io/cransubs/
Repos:
https://github.com/nx10/cransubs
https://github.com/nx10/cransubs-server
It would be great to hear what you think.
Feel free to close this issue anytime.
Just wanted to resurface this suggestion to use ftp timestamps (instead of action time or in addition of it):
#36 (comment)
Currently, the CRAN incoming FTP server is polled once an hour:
cransays/.github/workflows/render-dashboard.yml
Lines 10 to 11 in b0cc818
Have you considered increasing this to, say, two or four times an hour? I doubt it would make a big dent in the total amount of traffic that the CRAN server sees. It might even help decrease the traffic by moving someone who's tracking their package manually to looking at CRANsays instead - once an hour is not enough for such use.
UPDATE: I see that https://nx10.github.io/cransubs/ is updated once every ten minutes.
UPDATE 2: It's updated only when someone accesses it, and I guess at most every 10 minutes.
Hi! Many thanks for providing this useful report!
I also hope that the new tracking history will help provide some insight into the process. I have already set up a reminder to analyse it in 2021, when more data will have accumulated.
I attempted to replicate the idea, but with the currently available packages, at llrs/cranis, to provide a more complete view of the time between submission and appearance on CRAN (and of package removals and reappearances).
I have trouble mimicking the GHA setup: llrs/cranis#1. Maybe someone could explain how it works or help with the setup. Thanks again!
At the moment, we don't provide a value for defaultSorted in the reactable() call.
This results in the default ordering being the data.frame ordering: results are ordered by folder first.
I'm not completely sure this matches the expectation of visitors. This is particularly visible at the moment because the human folder has contained quite a high number of packages for many months. Users who want to check the status of their recently submitted package have to scroll past it to get the info they need.
@Bisaloo @mitchelloharawild do you want me to add your ORCID ID in the DESCRIPTION?
do we want to get notified if the table has 0 row?
I suspect #86 wasn't the right fix since artifacts will always overwrite the previous ones. If jobs timings make it so that jobs from two different runs pull the same artifact, there will be nothing to commit.
Originally posted by @Bisaloo in #53 (comment)
I can pull the switch. I just want to let all potential contributors know that they should update their git setup.
The tidyverse blog has a great post about how you deal with this change using convenience functions from the usethis package: https://www.tidyverse.org/blog/2021/10/renaming-default-branch/
Background and upsides/downsides discussed here: r-lib/actions#597.
The idea to revive this proposal here comes from the realization that we are using a severely outdated version of JamesIves/github-pages-deploy-action.
The r-lib/actions maintainers identified it was not a good fit for the pkgdown action because it doesn't play well with the development: mode: auto of pkgdown, but I wonder if it would be good to have here.
Do you see any issues with making the switch?
Currently there can be a negative time since submission which doesn't seem right.
Why: For Science!
How: Google Big Query or something as part of the build
Pros: Insight into CRAN
Cons: Could be judgey