Comments (5)
Good idea. I don't think it is difficult to implement. You want to help with it? :)
A new SQL (plpgsql) procedure is needed here: https://github.com/metacran/cranlogs.app/blob/master/db/proc.sql
from cranlogs.
Hmm...well, I'm not sure I know enough SQL/JSON to be of much help. Algorithmically, it would seem to require:
1. Get the names of all CRAN packages
2. Run `cran_downloads` on that list
3. Calculate quantiles
2 and 3 are straightforward. 1 is clearly possible, but I wouldn't know how to do it through the SQL/JSON interface. Or perhaps there's a more efficient approach than all this?
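The three steps above could be sketched in R roughly as follows. This is only a sketch under assumptions, not how cranlogs does or should do it: it takes the package list from `utils::available.packages()` (so only current CRAN packages, and it needs network access), it queries in batches in case the API limits the number of packages per request (the batch size of 100 is an untested guess), and the helper names `all_cran_packages()`, `total_downloads()` and `download_quantiles()` are made up for illustration.

```r
# Step 1: names of all current CRAN packages (needs network access).
# Assumption: available.packages() is an acceptable source for the list.
all_cran_packages <- function(repos = "https://cloud.r-project.org") {
  rownames(utils::available.packages(repos = repos))
}

# Step 2: total downloads per package over a period, queried in batches.
# Assumption: batch_size = 100 is an arbitrary choice.
# `fetch` is only there so the function can be tried with a stand-in.
total_downloads <- function(packages, ..., batch_size = 100,
                            fetch = cranlogs::cran_downloads) {
  batches <- split(packages, ceiling(seq_along(packages) / batch_size))
  counts <- lapply(batches, function(b) {
    d <- fetch(packages = b, ...)
    # Sum the daily counts for each package in this batch
    vapply(split(d$count, as.character(d$package)),
           function(x) sum(as.numeric(x)), numeric(1))
  })
  unlist(unname(counts))
}

# Step 3: quantile cut points of the per-package totals.
download_quantiles <- function(totals, probs = c(0.5, 0.9, 0.99)) {
  stats::quantile(totals, probs = probs)
}

# Not run (needs network access and the cranlogs package):
# pkgs <- all_cran_packages()
# totals <- total_downloads(pkgs, when = "last-week")
# download_quantiles(totals)
```

This makes one request per batch, so it is probably much slower than a server-side procedure would be.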
from cranlogs.
EDIT 2021-11-30: Answer to a different question below ... (I've updated it to say `fraction` instead of `quantile`)
Since you can get the total download count for all packages by passing `packages = NULL` ("... for a sum of downloads for all packages."), you could use that for your denominator. Here's the gist:
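# Daily download counts for `packages`, with each count also expressed as a
# fraction of that day's total downloads across all CRAN packages.
cran_download_fraction <- function(packages, ...) {
  counts <- cranlogs::cran_downloads(packages = packages, ...)
  # packages = NULL returns the sum of downloads for all packages
  total <- cranlogs::cran_downloads(packages = NULL, ...)
  z <- lapply(total$date, FUN = function(.date) {
    x <- subset(counts, date == .date)
    y <- subset(total, date == .date)
    x$fraction <- x$count / y$count
    x[, c("date", "count", "fraction", "package")]
  })
  z <- do.call(rbind, z)
  rownames(z) <- NULL
  z
}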
Example:
pkgs <- c("rlang", "digest")
stats <- cran_download_fraction(pkgs, from = "2021-11-10", to = "2021-11-12")
stats
#>         date count    fraction package
#> 1 2021-11-10 86060 0.010044005   rlang
#> 2 2021-11-10 36999 0.004318129  digest
#> 3 2021-11-11 86956 0.011273038   rlang
#> 4 2021-11-11 36907 0.004784650  digest
#> 5 2021-11-12 78391 0.011641753   rlang
#> 6 2021-11-12 32555 0.004834704  digest
stats <- cran_download_fraction(pkgs, when = "last-week")
head(stats)
#>         date count    fraction package
#> 1 2021-11-17 87119 0.011624874   rlang
#> 2 2021-11-17 36247 0.004836681  digest
#> 3 2021-11-18 86853 0.012107869   rlang
#> 4 2021-11-18 37356 0.005207668  digest
#> 5 2021-11-19 72217 0.011277519   rlang
#> 6 2021-11-19 30428 0.004751684  digest
Suggestion
Add an argument `fraction = FALSE` to `cran_downloads()` and do the above calculations internally. Maybe `fraction = TRUE` could even be the default?
Limitation: The above only gives the download fraction per day. Anyone who wishes to calculate the download fraction for a longer time period, say, per week or per month, will have to do something else.
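For longer periods, one option is to sum the daily counts within each period first and divide afterwards, since averaging the daily fractions would weight quiet and busy days equally. A minimal sketch, assuming data frames with the columns shown above (`date`, `count`, `package`); the function name `download_fraction_by_period()` and its `period` argument are made up for illustration and are not part of cranlogs:

```r
# counts: data frame with columns date, count, package (per-package daily counts)
# total:  data frame with columns date, count (all-package daily counts)
# period: a format() string defining the bucket, e.g. "%Y-%m" for months
#         or "%G-W%V" for ISO 8601 year and week
download_fraction_by_period <- function(counts, total, period = "%Y-%m") {
  counts$period <- format(as.Date(counts$date), period)
  total$period <- format(as.Date(total$date), period)

  # Sum within each period before dividing
  x <- aggregate(count ~ period + package, data = counts, FUN = sum)
  y <- aggregate(count ~ period, data = total, FUN = sum)
  names(y)[names(y) == "count"] <- "total"

  z <- merge(x, y, by = "period")
  z$fraction <- z$count / z$total
  z <- z[order(z$period, z$package), c("period", "package", "count", "fraction")]
  rownames(z) <- NULL
  z
}

# Not run (needs network access and the cranlogs package):
# pkgs <- c("rlang", "digest")
# counts <- cranlogs::cran_downloads(packages = pkgs, from = "2021-10-01", to = "2021-11-30")
# total <- cranlogs::cran_downloads(packages = NULL, from = "2021-10-01", to = "2021-11-30")
# download_fraction_by_period(counts, total, period = "%Y-%m")
```

Both calls must cover the same date range, otherwise the numerator and denominator refer to different days.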
from cranlogs.
Well, this isn't really returning quantiles (or at least, not what I had in mind). `rlang` might represent 1.2% of all downloads on 2021-11-17, but I would assume that places it in the 99th percentile among all CRAN packages.
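To make the distinction concrete, here is a toy illustration with made-up numbers (not real CRAN data): a package can account for a small fraction of all downloads and still sit at the very top of the distribution.

```r
# Hypothetical download counts for 1000 packages (invented for illustration):
# 999 packages with 100 downloads each and one popular package with 1200.
counts <- c(rep(100, 999), popular = 1200)

# Fraction of all downloads that go to the popular package
fraction <- counts[["popular"]] / sum(counts)

# Percentile rank: share of packages with a count <= the popular package's
percentile <- stats::ecdf(counts)(counts[["popular"]])

round(fraction, 4)
#> [1] 0.0119
percentile
#> [1] 1
```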
from cranlogs.
Doh! Fair point. I have no idea what I was thinking. I've updated my comment to say 'fraction' instead of 'quantile'.
from cranlogs.