paperswithcode / sota-extractor
The SOTA extractor pipeline
License: Apache License 2.0
Hello,
I am not sure whether this is a bug or a feature:
https://paperswithcode.com/sota/visual-question-answering-on-gqa-test2019
Is it OK that the agent is repeated? I was assuming there would be only one agent entry if all other fields (paperUrl, date, source, etc.) are the same.
Thank you
Similar to #25, check https://paperswithcode.com/sota/question-answering-on-squad11, but found:
Or do I just wait for the parser to find it? It's https://arxiv.org/abs/1912.07390
If you have a look at nlpprogress.json, the URLs sometimes contain part of the surrounding markdown.
Have a look here: https://github.com/paperswithcode/sota-extractor/blob/master/data/tasks/nlpprogress.json#L326
or here:
https://github.com/paperswithcode/sota-extractor/blob/master/data/tasks/nlpprogress.json#L718
It appears that the manifest is missing at least one file necessary to build from the sdist for version 0.0.10. You're in good company; about 5% of other projects updated in the last year are also missing files.
+ /tmp/venv/bin/pip3 wheel --no-binary sota-extractor -w /tmp/ext sota-extractor==0.0.10
Looking in indexes: http://10.10.0.139:9191/root/pypi/+simple/
Collecting sota-extractor==0.0.10
Downloading http://10.10.0.139:9191/root/pypi/%2Bf/2c9/40e98af84fb9b/sota-extractor-0.0.10.tar.gz (22 kB)
ERROR: Command errored out with exit status 1:
command: /tmp/venv/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-wheel-vx3qe4ay/sota-extractor/setup.py'"'"'; __file__='"'"'/tmp/pip-wheel-vx3qe4ay/sota-extractor/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-wheel-vx3qe4ay/sota-extractor/pip-egg-info
cwd: /tmp/pip-wheel-vx3qe4ay/sota-extractor/
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-wheel-vx3qe4ay/sota-extractor/setup.py", line 26, in <module>
install_requires=io.open("requirements.txt").read().splitlines(),
FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
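The traceback shows setup.py reading requirements.txt at build time, while the 0.0.10 sdist does not ship that file. Assuming the project relies on setuptools' default sdist rules, one common fix is to declare the file in MANIFEST.in and re-cut the release:

```
# MANIFEST.in — make sure the sdist ships the files setup.py reads
include requirements.txt
```

Running the check-manifest tool locally before uploading can catch this class of problem.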
In the "Image Classification" leaderboard on ImageNet, specific metrics, such as "Hardware Burden" and "Operations per network pass", were present in the HTML webpage source but did not appear in the actual ranking dataframes. Although these metrics are not visible on the webpage, they can be located in the "evaluation-table-data" element within the script block of the website source. Specifically, these missing results can be found in the "metrics" and "raw_metrics" attributes of the accompanying JSON table.
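The hidden metrics described above can be recovered by parsing the embedded JSON directly. This is a minimal sketch; the "evaluation-table-data" element id and the "metrics"/"raw_metrics" attribute names follow the description in this issue and are assumptions about the page layout, not a stable API:

```python
import json
import re

def parse_evaluation_table(html):
    """Extract the rows embedded in the evaluation-table-data script
    block, merging the visible "metrics" with the hidden "raw_metrics"
    (e.g. "Hardware Burden", "Operations per network pass")."""
    match = re.search(
        r'<script[^>]*id="evaluation-table-data"[^>]*>(.*?)</script>',
        html, re.DOTALL)
    if match is None:
        return []
    rows = json.loads(match.group(1))
    # Merge the two dicts so hidden metrics sit alongside visible ones.
    return [{**row.get("metrics", {}), **row.get("raw_metrics", {})}
            for row in rows]
```

Feeding it the page source of a leaderboard should then yield one dict per ranked model, including the metrics that never reach the rendered table.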
How the hierarchical classification of the Task is constructed, and can the Task data set be open sourced for download? Thanks!
Choose one SOTA table where it's easy to acquire the papers from arxiv (NOTE: can use the pwc database to translate titles into arxiv IDs).
Then, process all the papers using the pipeline from #1 and see if there is a way of clustering them according to overlap, or any other language cues.
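The clustering step above could start from something as simple as token overlap. A minimal sketch, assuming the paper texts come out of the #1 pipeline as plain strings; the 0.3 default threshold is a placeholder to tune against a real SOTA table:

```python
def jaccard(a, b):
    """Token-set overlap between two paper texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def cluster_by_overlap(texts, threshold=0.3):
    """Greedy single-link clustering of paper texts by Jaccard overlap.

    Returns a list of clusters, each a list of indices into `texts`.
    """
    clusters = []
    for i, text in enumerate(texts):
        for cluster in clusters:
            # Join the first cluster containing a sufficiently similar text.
            if any(jaccard(text, texts[j]) >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

A TF-IDF or embedding-based similarity would likely separate language cues better, but this gives a cheap baseline to see whether overlap-based grouping is viable at all.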
Similar to #25, check https://paperswithcode.com/sota/machine-translation-on-wmt2014-english-german, but found:
A few days ago, I moved a demo Colab notebook in my perceiver-io GitHub repository (from notebooks/inference_examples.ipynb to examples/inference.ipynb), but I found that the "Quickstart in Colab" link at https://paperswithcode.com/paper/perceiver-io-a-general-architecture-for#code still points to the old location (same here and here). What can I do to get this link updated?
Similar to #20, it would be cool if we could parse out "Open in GitHub Codespaces" links in a paper's associated README, and display those as deep links. Here's the docs for the badge, and a sample repo.
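Detecting those badges could be a small regex pass over the README. A sketch under the assumption that the badge target uses the codespaces.new URL shape described in the badge docs; the exact pattern should be verified against real repos:

```python
import re

# Assumed badge target shape: https://codespaces.new/<owner>/<repo>[...]
CODESPACES_LINK = re.compile(
    r"https://codespaces\.new/[\w.-]+/[\w.-]+[^\s)]*"
)

def find_codespaces_links(readme_text):
    """Return all 'Open in GitHub Codespaces' target URLs in a README."""
    return CODESPACES_LINK.findall(readme_text)
```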
Create a function that takes as input a LaTeX file, and extracts all tables in a consistent format.
Perhaps the right output format is a list of rows, as this is how tables are specified within LaTeX.
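A minimal sketch of that function, returning each table as a list of rows of cell strings. It deliberately ignores multicolumn/multirow, nested tabulars, and escaped delimiters inside cells, all of which a real parser would have to handle:

```python
import re

def extract_tables(latex_source):
    """Extract every tabular environment from a LaTeX file as a list of
    tables, each a list of rows, each row a list of cell strings."""
    tables = []
    for body in re.findall(
            r"\\begin\{tabular\}.*?\n(.*?)\\end\{tabular\}",
            latex_source, re.DOTALL):
        rows = []
        for line in body.split(r"\\"):  # rows end with \\
            line = re.sub(r"\\hline", "", line).strip()
            if line:
                rows.append([cell.strip() for cell in line.split("&")])
        tables.append(rows)
    return tables
```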
Similar to #25, check https://paperswithcode.com/sota/object-detection-on-coco, but found:
Similar to #25, check https://paperswithcode.com/sota/machine-translation-on-wmt2014-english-french, but found:
Hi, I just noticed that in the case of https://paperswithcode.com/sota/visual-question-answering-on-gqa-test2019, the agent lxmert-adv-txt appears twice (which is correct per the JSON), but the year shown is 2020, even though there is no date in evaluation-tables.json (paper_date is null):
{
  "code_links": [],
  "metrics": {
    "Accuracy": "61.12",
    "Binary": "78.07",
    "Consistency": "91.13",
    "Distribution": "5.55",
    "Open": "46.16",
    "Plausibility": "84.8",
    "Validity": "96.36"
  },
  "model_links": [],
  "model_name": "lxmert-adv-txt",
  "paper_date": null,
  "paper_title": "",
  "paper_url": "",
  "uses_additional_data": false
},
{
  "code_links": [],
  "metrics": {
    "Accuracy": "61.1",
    "Binary": "77.99",
    "Consistency": "91.08",
    "Distribution": "5.52",
    "Open": "46.19",
    "Plausibility": "84.82",
    "Validity": "96.36"
  },
  "model_links": [],
  "model_name": "lxmert-adv-txt",
  "paper_date": null,
  "paper_title": "",
  "paper_url": "",
  "uses_additional_data": false
}
Where do you get the year from?
Thank you
I love paperswithcode, and know that I would love to contribute to it and add new features if the website was open source.
I would be very interested in adding more details about specific datasets (e.g. links/where they're hosted), along with possibly showing extra details like the affiliation of specific papers.
I'd also love to help work on a better system for automatically extracting results from new papers instead of relying 100% on crowdsourcing.
I think paperswithcode is a very valuable tool and could become an integral part of the ML community if you make it fully open source, encourage contributions, and focus on the features that the community wants.
Thanks!
I am not sure when this feature was introduced, but I love that Colab notebooks for models on Papers with Code are automatically referenced immediately below their repos (example). I am assuming there is some logic on the website that looks for colab.research.google.com links in models' READMEs? What would a user need to do to add a similar link for a demo of a model in Hugging Face's Spaces, or in a reproducible Codespaces or Replicate.ai instance?
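The guessed logic could be sketched as a README scanner like the one below. The host list and the scanning approach are assumptions for illustration, not the site's actual implementation:

```python
import re

# Hypothetical demo-platform hosts to scan for; extendable per platform.
DEMO_HOSTS = {
    "colab.research.google.com": "Colab",
    "huggingface.co/spaces": "Hugging Face Spaces",
    "replicate.ai": "Replicate",
}

def find_demo_links(readme_text):
    """Map each recognized demo platform to the URLs found in a README."""
    urls = re.findall(r"https?://[^\s)\"'<>]+", readme_text)
    found = {}
    for url in urls:
        for host, name in DEMO_HOSTS.items():
            if host in url:
                found.setdefault(name, []).append(url)
    return found
```

If the site does something along these lines, supporting Spaces or Replicate demos would mostly be a matter of adding entries to the host table.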
Thank you!