breck7 / pldb
PLDB: a Programming Language DataBase
Home Page: https://pldb.io
title PLDB Readme
import rootHeader.scroll
printTitle

# A Programming Language Database

wideColumns 1

#### View this readme as HTML
 https://pldb.io/readme.html

import code/ciBadges.scroll

PLDB is a public domain ScrollSet and website containing over 135,000 facts about over 4,000 programming languages.

This repo contains the entire ScrollSet, code, and website for https://pldb.io.

## To download the data
The entire ScrollSet is ready to analyze in popular formats. Full documentation is here: https://pldb.io/csv.html

- As CSV: https://pldb.io/pldb.csv
- As TSV: https://pldb.io/pldb.tsv
- As JSON: https://pldb.io/pldb.json

## To add a new language
### Local Method:
- Clone the repo locally.
- Create a new Scroll file `concepts/[newId].scroll`.
- Use the Designer if you need autocomplete help (recommended): https://sdk.scroll.pub/designer#url%20https%3A%2F%2Fpldb.io%2Fpldb.parsers%0AprogramUrl%20https%3A%2F%2Fpldb.io%2Fconcepts%2Ftxt.scroll Designer
- Send a Pull Request

### Web Method:
- Fork this repo
- Visit https://github.com/[yourGithubUserName]/pldb/new/main/concepts
- Use the Designer if you need autocomplete help (recommended): https://sdk.scroll.pub/designer#url%20https%3A%2F%2Fpldb.io%2Fpldb.parsers%0AprogramUrl%20https%3A%2F%2Fpldb.io
- Send a Pull Request

## To update a language
Edit the corresponding `concepts/*.scroll` file and send a pull request.

## To add a new measure
Update the file `code/measures.parsers` and add at least 1 measurement to a concept in `concepts` and send a pull request.

## To build the site locally
code
 git clone https://github.com/breck7/pldb
 cd pldb
 # Required to run this during first install only.
 npm i -g cloc
 # Required to run this on fresh checkout and when upgrading from an old checkout or periodically when there are new releases
 npm install .
 # (Optional) Run tests
 npm run test
 npm run build
 # After you make changes and before you commit make sure to run:
 npm run format

## To explore this repo
The most important folder is `concepts`, which contains the ScrollSet (a file for each concept). The file `code/measures.parsers` contains the Parsers (schema) for the ScrollSet.

You can see the `cloc` language stats on this repo at https://pldb.io/pages/about.html.

import citation.scroll

All sources for PLDB can be found here: https://pldb.io/pages/acknowledgements.html

endColumns

import footer.scroll
We need to make it easier for people to add content:
73706e5
It looks like the goaccess cronjob is now taking too long and causing some spikes.
There is no reason for us to reparse everything each time; there is an incremental option:
PROCESSING LOGS INCREMENTALLY
https://goaccess.io/man
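A minimal sketch of what an incremental cronjob invocation might look like, assuming a goaccess version that supports the `--persist`, `--restore`, and `--db-path` options described in the incremental-processing section of the man page. The log path, database directory, report path, and the `COMBINED` log format are all hypothetical placeholders, not the values used on the PLDB server.

```python
def goaccess_incremental_cmd(log_path, db_path, report_path, first_run=False):
    """Build an argv list for an incremental goaccess run.

    All paths are hypothetical. On the first run there is nothing to
    restore, so only --persist is passed; later runs load the stored
    data with --restore and save the updated state with --persist.
    """
    cmd = [
        "goaccess",
        log_path,
        "--log-format=COMBINED",  # assumption: the server writes combined-format logs
        f"--db-path={db_path}",   # directory where goaccess keeps its on-disk data
        "--persist",
        "-o",
        report_path,
    ]
    if not first_run:
        cmd.append("--restore")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(goaccess_incremental_cmd(
        "/var/log/nginx/access.log", "/var/lib/goaccess", "/var/www/report.html")))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the cron script once the paths are adjusted.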
alfred.pldb:
https://medium.com/@nikitavoloboev/writing-alfred-workflows-in-go-2a44f62dc432
References an IDE for mac users
alcor.pldb:
Seems to have a few extra sentences copy-pasted from Wikipedia in the summary section.
algobox.pldb:
Wikipedia page https://en.wikipedia.org/wiki/Algoboxn does not seem to exist. There are no other links.
database/things/alpha-programming-language.pldb and database/things/alpha.pldb :
Seem to be duplicates: Each file contains new information not present in the other.
from Kokaiinum
https://www.reddit.com/r/ProgrammingLanguages/comments/x2m24s/comment/imrf4j5/?context=3
If you're taking issue reports, one I noticed - "Cish" and "SuperForth" are apparently the same language (SuperForth renamed to Cish).
Also the examples for BEEF appear to be those of BeefLang (although I must admit I've no idea what actual BEEF looks like)
https://scroll.pub/ has it done well.
Spot on feedback to address when upgrading features stuff:
https://news.ycombinator.com/item?id=32628257
`I understand this is pretty much WIP, but still, it's too unorganized to be anything useful. I thought the most interesting to be features page[1], which is nearly empty, and this effort in taxonomy is rather too complicated to be crowd-sourced without supervision. For example, let's take a look at traits[2] and mixins[3]. There are a couple of issues here. First off, why it's 2 different pages? There's no real difference between a trait in PHP, and mixin in… well, no languages except for Racket actually have a syntactic construct called "mixin", but I guess modules in Ruby or Julia are close enough. Scala also has something that's called "traits", and it's also basically the same thing, but with caveats.
On the other hand, D has both "mixins" and "traits", but these are completely different features, and these "traits" have nothing to do with traits in Scala or PHP. So if somebody were to make a comprehensive list of features of D in this DB, should these "traits" appear on the same page as PHP and Scala traits (which are mixins)?
Furthermore, unlike PHP, Scala, Ruby or Julia — Python's "mixins" aren't just mixins with a different name. It's not even clear if it has mixins at all. There's something people call a "mixin" in Python, but these are just classes, so you cannot really say "yes". However, Python has multiple inheritance, which makes "mixins" borderline pointless: classes are (or can be used as) mixins, if you have multiple inheritance! Templates in some languages can be used this way as well.
Which brings us to the next issue — it's not clear, if a language should be marked as having a feature if it comes built-in, explicitly, or if a feature can be implemented in it. Does every language have a semaphore? I cannot remember any where it couldn't be implemented (that would be weird), but I cannot remember any where it's an explicit feature construct either (well, arguably, maybe some SQL-extensions?).
All this isn't to say that the current list is bad. All the questions above can be answered in any way, and it's up to a "researcher" which definition to use in order to actually get a useful taxonomy. It's a non-trivial job.`
[1] - https://pldb.com/lists/features.html
[2] - https://pldb.com/languages/traits-feature.html
[3] - https://pldb.com/languages/mixin-feature.html
[4] - https://pldb.com/languages/semaphores-feature.html
The files `database/things/accent-programming-language.pldb` and `database/things/accent.pldb` seem to be possible duplicates, as both list https://en.wikipedia.org/wiki/Rational_Synergy#History as a reference.
Possibly related Issue #78
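Possible duplicates like these could be flagged automatically by looking for files that cite the same URL. The sketch below is only an illustration: the function name, the in-memory `files` mapping, and the sample file contents are hypothetical, and a shared link is a hint to review by hand, not proof of duplication.

```python
import re
from collections import defaultdict

URL_RE = re.compile(r"https?://\S+")


def shared_references(files):
    """Map each URL to the sorted list of files citing it, keeping only
    URLs that appear in more than one file.

    `files` maps a file name to its text content.
    """
    by_url = defaultdict(set)
    for name, text in files.items():
        for url in URL_RE.findall(text):
            by_url[url].add(name)
    return {url: sorted(names) for url, names in by_url.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = {
        "accent-programming-language.pldb":
            "reference https://en.wikipedia.org/wiki/Rational_Synergy#History",
        "accent.pldb":
            "reference https://en.wikipedia.org/wiki/Rational_Synergy#History",
        "alumina.pldb": "documentation https://docs.alumina-lang.net/",
    }
    for url, names in shared_references(sample).items():
        print(url, names)
```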
The single type is obviously very limiting. Should improve the other ways of tagging things. A lot can probably be automated/AI.
Hello, just saw your project on HN and it looks very interesting :) One thing I immediately noticed is that pldb seems to be using bigquery github data to show repository count of different languages. Sadly that bigquery dataset seems to be quite limited, and there's a much better way to find the number of repositories written in a specific language:
# Use per_page=1 so that we don't waste much bandwidth
$ curl "https://api.github.com/search/repositories?q=language:nim&per_page=1"
{
"total_count": 8013,
"incomplete_results": false,
"items": [
{
<omitted for readability purposes>
}
]
}
The actual count is in the `total_count` field, and it counts only unique repositories (it doesn't count forks). If you want to also count forks (but I don't think it'd be a good idea) you can do
$ curl "https://api.github.com/search/repositories?q=language:nim+fork:true&per_page=1"
{
"total_count": 18320,
"incomplete_results": false,
"items": [
{
<omitted for readability purposes>
}
]
}
I don't know if these results are 100% exact, but they seem much closer to reality than the BigQuery count.
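The same lookup could be scripted roughly as follows. This is a sketch using only the Python standard library; the function names and the `User-Agent` string are made up for illustration, and it assumes the response keeps the `total_count` shape shown in the curl output above. Unauthenticated search requests are rate limited, so a real importer would probably need a token and some pacing.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_API = "https://api.github.com/search/repositories"


def build_search_url(language, include_forks=False):
    """Build the search URL; per_page=1 keeps the response small."""
    query = f"language:{language}"
    if include_forks:
        query += " fork:true"
    return f"{SEARCH_API}?{urlencode({'q': query, 'per_page': 1})}"


def parse_total_count(body):
    """Extract total_count from a search response body (a JSON string)."""
    return int(json.loads(body)["total_count"])


def fetch_repo_count(language, include_forks=False):
    """Query the API and return the repository count (needs network access)."""
    request = Request(
        build_search_url(language, include_forks),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "pldb-repo-count-sketch",  # hypothetical client name
        },
    )
    with urlopen(request, timeout=30) as response:
        return parse_total_count(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_repo_count("nim"))
```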
Here's what the page looks like now:
https://pldb.com/posts/buildPublicDomain.html
Obviously got some issues.
Anyone know `<video>` tags and can come up with a fix(es)? Pull requests wanted!!!
(and of course we have to use the video tag and host these videos ourself, obviously). Cannot use a 3rd party video service.
2 people have mentioned it:
https://news.ycombinator.com/item?id=32621392
https://www.reddit.com/r/ProgrammingLanguages/comments/x2m24s/comment/imqyscm/?context=3
important we get the history correct.
Hi, I like your site that I just discovered.
Regarding Julia, its package manager, the repository moved, so your info at:
https://codelani.com/posts/does-every-programming-language-have-a-central-package-repository.html
i.e. not only 1,906 (not sure even at the time, don't recall when the move happened).
i.e. you could substitute (if you need accurate numbers)
https://github.com/JuliaRegistries/General
or juliahub.com (for user-friendly access):
for https://julialang.org/packages/
everywhere.
Locally building and testing with cloc requires at least 3 GB of RAM according to informal tests.
Should the build and test scripts be changed to warn people who want to download and build the project?
Note: Without cloc, the memory requirement for building pldb seems to be quite a bit less.
Related Pull request: #87
EDIT LOG: Made the text description clearer.
adding the branch for the github.pldb.com mirror slowed down git pull. figure that out.
(probably just change:)
pldb/.github/workflows/buildGithubDotPldbDotCom.yaml
should start adding more and better links on how these languages are related.
i think Diarmuid Pigott's HOPL really pioneered this. does anyone know him? he would definitely be the expert here i think.
It needs to be improved.
For visual languages, a picture is worth a thousand words. PLDB now shows a screenshot for visual languages when that keyword is present. Example: https://pldb.com/languages/scratch.html
For every visual language, let's take our own nice screenshot of using the language in action.
See the below commit which added 2 examples:
b367296
site/screenshots/[pldbId].png
screenshot https://pldb.com/screenshots/explorer.png
Run `npm run format` before committing.
Right now, for things like Wikipedia the grammar asks for the full URL, but for things like Reddit it just asks for the subreddit id. For example: `subreddit Python`
The URL is the clear way to go. It's a little bit of redundancy, but it makes each pldb file more useful on its own. And it's clearer for a new contributor what needs to be added (always just a URL, never need to look up the encoding/decoding scheme).
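A migration from ids to full URLs could be a one-line-at-a-time rewrite. The sketch below assumes the key is literally `subreddit` (as in the example above) and that the target URL shape is `https://reddit.com/r/<id>`; both the function name and that URL shape are assumptions, not the grammar's actual definition.

```python
def migrate_subreddit_line(line):
    """Rewrite `subreddit <id>` to `subreddit <full url>`; leave other lines alone."""
    key, _, value = line.partition(" ")
    value = value.strip()
    if key != "subreddit" or not value:
        return line
    if value.startswith(("http://", "https://")):
        return line  # already a full URL
    return f"subreddit https://reddit.com/r/{value}"


if __name__ == "__main__":
    print(migrate_subreddit_line("subreddit Python"))
```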
Example:
https://pldb.com/languages/alumina.html: documentation https://docs.alumina-lang.net/
@celtic-coder you thinking what I'm thinking?
When I visit: https://api.github.com/repos/breck7/pldb/contributors, it seems to be pretty unstable (some days contributors disappear).
Perhaps
Line 218 in 1361617
I suggest the data entry "github" should possibly be named something else, like "repository". There are other places like gitlab.com, so naming a data key after just one platform (no matter how popular) when quite a few others are used is just a bit weird.
Currently fewer than 3,000. This one should be a relatively easy one to 3x.
According to Wikipedia, it's a company name.
According to their homepage, they have C and Basic compiler products.
Hi Breck (@breck7),
Back in mid-July, you added the leachim6 importer (8fdd117) for the "hello-world" programs.
Might it also be possible to create an importer for the Sample Programs in Every Language, a collection started in 2018 by Jeremy Grifski (@jrg94) as part of The Renegade Programmer project? At the end of July, the repo contained 162 languages with 597 code snippets.
Kind Regards,
Liam
This is because there is a file in the repo named `nul.lani`, and `nul` is a reserved word in Windows.
Because your instructions say c# should be called `c-sharp` to get around filesystem limitations, I suggest the same for `nul.lani`.
PR incoming.
john@LAPTOP-PE9BBGOJ MINGW64 ~/projects
$ git clone https://github.com/StoneCypher/codelani.git
Cloning into 'codelani'...
remote: Enumerating objects: 5964, done.
remote: Counting objects: 100% (299/299), done.
remote: Compressing objects: 100% (268/268), done.
remote: Total 5964 (delta 27), reused 250 (delta 12), pack-reused 5665
Receiving objects: 100% (5964/5964), 1.14 MiB | 1.06 MiB/s, done.
Resolving deltas: 100% (626/626), done.
error: invalid path 'database/nul.lani'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
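A check like the following could catch such names before they are committed. It is a sketch, not part of the repo: the function names are hypothetical, and the reserved list is the standard set of Windows device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9), which Windows rejects regardless of letter case or file extension.

```python
from pathlib import PurePosixPath

WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def is_windows_reserved(path):
    """Return True if the file name clashes with a Windows device name.

    Only the part before the first dot matters, so `nul.lani` is
    rejected just like a bare `nul`.
    """
    name = PurePosixPath(path).name
    stem = name.split(".", 1)[0]
    return stem.rstrip(" ").upper() in WINDOWS_RESERVED


def find_reserved(paths):
    """Return the subset of paths that cannot be checked out on Windows."""
    return [p for p in paths if is_windows_reserved(p)]


if __name__ == "__main__":
    print(find_reserved(["database/nul.lani", "database/c-sharp.lani"]))
```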
can probably delete half that code and should provide instructions on what the best patterns have turned out to be
example: https://pldb.com/languages/explorer.html demoVideo https://www.youtube.com/watch?v=0l2QWH-iV3k
Edit: Sorry, Enter submitted the form without a description 😄
curl https://pldb.com
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number
I tried on 2 different machines.
Maybe http could be left available, and not provide the redirect (so we can still use the site when https isn't available).
Second edit: Removing the redirect headers, those appear to come from my ISP.
Hi Breck (@breck7),
In the "Written In" section of the https://edit.pldb.com/pages/acknowledgements.html page, the individual URLs contain the BASE_URL replacement word, which has either not been replaced or is not required in its current form.
For example, the JavaScript link is: https://edit.pldb.com/pages/BASE_URL/languages/javascript.html
Since I'm guessing that the URL should be either https://edit.pldb.com/languages/javascript.html or https://pldb.com/languages/javascript.html, perhaps a replacement has already taken place?
It would appear that the WRITTEN_IN_TABLE is being replaced correctly, but that any internal replacements are being left "as-is". Should this replacement happen recursively, or does it need to occur at another point in the processing of the page?
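One way to handle nested placeholders is to repeat the substitution until the text stops changing. The sketch below illustrates that idea only; it is not how the site builder is actually written, and the function name, the `max_passes` guard, and the sample values are hypothetical.

```python
def expand_placeholders(template, replacements, max_passes=10):
    """Apply the replacements repeatedly until the text stops changing.

    A single pass can leave placeholders behind when a replacement value
    itself contains another placeholder (for example a table whose links
    include BASE_URL). max_passes guards against cyclic definitions.
    """
    text = template
    for _ in range(max_passes):
        expanded = text
        for key, value in replacements.items():
            expanded = expanded.replace(key, value)
        if expanded == text:
            return text
        text = expanded
    raise ValueError("placeholders did not settle; possible cycle")


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    values = {
        "BASE_URL": "https://pldb.com",
        "WRITTEN_IN_TABLE": '<a href="BASE_URL/languages/javascript.html">JavaScript</a>',
    }
    print(expand_placeholders("WRITTEN_IN_TABLE", values))
```

The alternative raised in the question, substituting BASE_URL after all tables have been generated, avoids the loop but depends on getting the order of replacements right.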
Kind Regards,
Liam
Thanks JS!
It looks like you're allowing anyone to make changes to the site, and any user can impersonate anyone. I don't see anything obviously malicious, although changes like 2bdc8cd look wrong.
Have you considered adding an authentication mechanism? Since the content is on GitHub, you could probably use that for auth.
Hi there,
fantastic project with lots of useful information on programming languages!
One suggestion for the GitHub repository. Please add a LICENSE file, then the License information is visible on the right.
Thanks!
-Thomas
Those files are a mistake. I think all that information should be moved to the grammar files, and then we should have a `/site/features/` folder, and a `buildFeaturesPagesCommand()` in SiteBuilder that generates those pages. That would make the code a lot clearer and fix a number of things.
Ccs, a scripting language for infoblox netmri. Here’s a link to official documentation https://www.infoblox.com/wp-content/uploads/infoblox-eval-download-netmri-NetMRI_CCS_Scripting_Guide.pdf
probably want to add binary + text
https://github.com/breck7/pldb/blob/main/site/lists/creators.tree
every creator should be able to add their:
`appeared` (birth year)
Hi Breck (@breck7),
The list All Languages states that the PLDB has 4,058 languages:
Doing a search from the home page, with nothing in the search box, gives a different count for the languages:
https://edit.pldb.com/search?q=#
On my local copy of the repo, when I search the .pldb files in the /things/ folder for the "title" keyword, which the CSV Documentation says has 100% coverage, then I get the same result:
There is, however, a further issue with the search results. When I search on the page for the $ language, for example, I get two matches rather than one (the URL https://pldb.com/languages/dollar-sign.html is the same for both).
Here is the first match:
... and here is the second:
Something is getting repeated in the search results, which can be seen about halfway down the page:
I am guessing that the 4,671 count is correct, but how the results are being displayed as well as the difference with the "All Languages" list would need further investigation.
Kind Regards,
Liam
Google scholar lists 3750 articles citing the main Julia paper (https://scholar.google.com/scholar?cites=12373977815425691465&as_sdt=40000005&sciodt=0,22&hl=en) and semantic scholar shows 38000 papers with Julia as a keyword since 2012, and of the first 10 pages, all appear to be Julia papers.
Also, github shows 14000 repositories with julia code https://github.com/search?q=language%3AJulia&type=Repositories&ref=advsearch&l=Julia&l=.
I'm also pretty sure the number of downloads is wrong given that https://www.hpcwire.com/2021/01/13/julia-update-adoption-keeps-climbing-is-it-a-python-challenger/ lists 9 million downloads in 2020.