GithubHelp home page GithubHelp logo

lczd / portuguese_wsc Goto Github PK

View Code? Open in Web Editor NEW

This project forked from gabimelo/portuguese_wsc

0.0 1.0 0.0 97.58 MB

Solver for Winograd Schema Challenge like questions in Portuguese.

License: MIT License

Dockerfile 0.10% Makefile 0.10% HTML 19.35% Shell 0.12% Jupyter Notebook 76.44% Python 3.90%

portuguese_wsc's Introduction

portuguese_wsc

Currently under development

Solver for Winograd Schema Challenge in Portuguese. Portuguese translations for original Winograd Schema Challenge are also being proposed here.

@article{DBLP:journals/corr/abs-1806-02847,
  author    = {Trieu H. Trinh and
               Quoc V. Le},
  title     = {A Simple Method for Commonsense Reasoning},
  journal   = {CoRR},
  volume    = {abs/1806.02847},
  year      = {2018},
  url       = {http://arxiv.org/abs/1806.02847},
  archivePrefix = {arXiv},
  eprint    = {1806.02847},
  timestamp = {Mon, 13 Aug 2018 16:46:22 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1806-02847},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Project Setup

Python 3.7

  • This project has not been tested in machines without CUDA GPUs available.

  • A Dockerfile is available, and may be used with docker build -t wsc_port . followed by nvidia-docker run -it -v $PWD/models:/code/models wsc_port.

  • The Dockerfile contains a few different options for running, which can be selected by commenting and uncommenting the final sections of it.

  • For running outside of Docker container, Conda is required

  • To create the conda environment: conda env create -f environment.yml

  • Makefile contains some of the commands used to run the code. These commands must be run from inside the environment.

    • to setup the environment for running the project: make dev_init. This command also makes sure make processed_data is run, which prepares data needed to train model
    • running make corpus will speed up first run of code (but is not necessary)
    • make train trains a model
    • make winograd_test runs evaluation of Winograd Schema Challenge
    • make generate runs language model for generation of text
  • Code runs for both English and Portuguese cases, and this setting is controlled by the variable PORTUGUESE in src.consts

  • Run tests with make tests, which is equivalent to pytest --cov=src tests/. Use pytest --cov=src --cov-report=html tests/ for generation of HTML test report. Needs pytest and pytest-cov packages. If there are import errors, should run pip install -e . to locally install package from source code.


Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`.
├── README.md          <- The top-level README for developers using this project.
├── environment.yml    <- Contains project's requirements, generated from Anaconda environment.
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported.
│
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── githooks           <- Contains githooks scripts being used for development. Git hook directory for repo needs to be set to this folder.
│
├── models             <- Trained and serialized models, model predictions, or model summaries.
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting.
│
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module.
│   │
│   └── scripts           
└── tests              <- Tests module, using Pytest.

Project based on the cookiecutter data science project template. #cookiecutterdatascience

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.