okfn-brasil / jarbas

🎩 API for information and suspicions about reimbursements by Brazilian congresspeople

Home Page: https://jarbas.serenata.ai/

jarbas's Introduction

Jarbas is part of Serenata de Amor main repo now.

jarbas's People

Contributors

anaschwendler, antonioj-mattos, brunoarueira, caduvieira, caiocarrara, cassiobotaro, cuducos, daneoshiga, decko, dgallinari, fcevado, feliperuhland, giovanisleite, gomex, guilhermeslucas, igorrozani, irio, jtemporal, juhhcarmona, ltouro, lucaslm, magnobiet, marabesi, matheushf, paulohp, pedrommone, petriuslima, pyup-bot, rogeriochaves, sboydd

jarbas's Issues

Session freezing

When running a query by party, for example, the query freezes while browsing through the result pages. It becomes impossible to navigate between pages and not even the side menu works, so it is necessary to clear the URL and start over. It then freezes again after 3-4 queries.

Cleanup: remove old models, serializers, views, tests etc.

Once #52 is done, we can do some house keeping:

  • Remove old models (Document, Receipt)
  • Remove old serializers (DocumentSerializer, ReceiptSerializer)
  • Remove old views (DocumentViewset, ReceiptViewSet)
  • Remove OldLoadCommand
  • Rename suppliers to company
    • model
    • views
    • serializers
    • tests
    • urls
    • documentation
    • front-end
  • Rename NewReceipt object to Receipt (at models.py and subsequent views)
  • Rename command loadsuppliers to companies
  • Use RetrieveAPIView instead of Viewset for companies
  • Remove Django Admin
  • Remove mentions to loadsuppliers and AMAZON_DATASETS_DATE
    • README.md
    • .env and contrib/.env.sample
    • settings.py

Add pagination

When the API returns more than 25 results, Jarbas should be able to fetch next pages (Load more button in the bottom maybe).
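
A minimal sketch of how a client could walk through the pages, assuming each API response carries a `next` URL and a `results` list (the shape seen in the paginated responses quoted elsewhere in these issues). The function name `iter_results` and the injected `fetch` callable are hypothetical, chosen so the example runs without a network connection.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_results(first_url: str, fetch: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every record, following `next` links until there are none left.

    `fetch` is any callable that takes a URL and returns the decoded JSON
    page as a dict (hypothetical; e.g. a thin wrapper around an HTTP client).
    """
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")  # None on the last page ends the loop


# Fake two-page API used only to exercise the function offline.
FAKE_PAGES = {
    "/api/document/": {
        "next": "/api/document/?page=2",
        "results": [{"id": 1}, {"id": 2}],
    },
    "/api/document/?page=2": {"next": None, "results": [{"id": 3}]},
}

records = list(iter_results("/api/document/", FAKE_PAGES.__getitem__))
```

A "Load more" button would do the same thing one step at a time: keep the last `next` URL and request it when clicked.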

Add Memcached to Docker pipeline

The backend is ready to use Memcached, but that has not been implemented in the Docker pipeline. As we intend to use Docker for deploy/provision, this is a pending task.

Settings to Memcached access point (e.g. CACHE_LOCATION=localhost:11211) are to be set as envvars or in the .env.
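
As a rough illustration, the settings module could derive its cache configuration from that variable. This is only a sketch: the helper `build_caches` is hypothetical, and the backend path assumes a recent Django with the pymemcache-based backend, which may differ from what the project actually uses.

```python
import os
from typing import Dict, Mapping


def build_caches(env: Mapping[str, str] = os.environ) -> Dict[str, Dict[str, str]]:
    """Return a Django-style CACHES dict based on CACHE_LOCATION."""
    location = env.get("CACHE_LOCATION")
    if not location:
        # No Memcached configured: fall back to Django's no-op cache backend.
        return {"default": {"BACKEND": "django.core.cache.backends.dummy.DummyCache"}}
    return {
        "default": {
            # Assumption: pymemcache-based backend from recent Django versions.
            "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
            "LOCATION": location,
        }
    }


# In settings.py this would read from the real environment instead.
CACHES = build_caches({"CACHE_LOCATION": "localhost:11211"})
```

In a Docker setup the value would presumably point at the Memcached service's hostname rather than localhost.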

Migrate application & database to the same server

Currently Jarbas is a slow application for four reasons:

  • Loading time is terrible because it is hosted on a free tier at Heroku; this means Heroku might have to wake up the application when a user reaches it
  • No CDN for assets/static files; everything is served by Heroku (with WhiteNoise)
  • Searching is slow because the Heroku free tier only allows 10k rows in the database, so we're hosting the app at Heroku but the database at AWS
  • Given the cost-benefit, paying for a decent server doesn't seem to be a priority for Serenata de Amor at this point

Adapt front-end to be more stateless

Currently, passing information around (Language basically everywhere, parentId for RelatedTable etc.) has proved too verbose (and adds unnecessary complexity, e.g. with Html.map).

This design is more promising.

Write tests for commands

We have two custom Django commands that are untested, both in jarbas/core/management/commands/:

  • loaddatasets.py
  • loadsuppliers.py

Maybe a quick way to do it is to mock a CSV response with a few records, call the command (check test_static_files.py) and assert that the records from the mocked CSV are in the database.

But surely unit tests would be welcomed too ; )

Add CONTRIBUTING.md

It would be nice to have a CONTRIBUTING.md in order to know how to best contribute to the project.

Add .editorconfig file

In order to keep everything consistent for everyone, we need to add this beautiful piece of configuration.

Automate deploy with Docker

Now we have a Docker structure, so we can think about how to use that to automate the deploy.

Ideally a piece of code to automate provision is much appreciated ; )

Load the datasets into our Postgres

cc @lucavgobbi @pablov

A new loaddatasets is online:

$ python manage.py loaddatasets --help
usage: manage.py loaddatasets [-h] [--version] [-v {0,1,2,3}]
                              [--settings SETTINGS] [--pythonpath PYTHONPATH]
                              [--traceback] [--no-color] [--start START]
                              [--limit LIMIT]

Load Serenata de Amor datasets into the database

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --start START, -s START
                        Where the load should start (skips first LIMIT
                        documents)
  --limit LIMIT, -l LIMIT
                        Limit the number of documents to load

Basically, with the optional arguments --start and --limit we can split the task of loading tons of data into our Postgres. For example, python manage.py loaddatasets --start 1000 --limit 50 will only create/update records 1000 through 1049.

I once estimated we have ~2.7 million records. Running it locally is slower, but the script tends to run for a longer time. Running it at Heroku is quicker, but I guess Heroku drops the connection after a while.

I'll try to write a Python or shell script with a for loop to run this loaddatasets 1k records by 1k records at Heroku. But I'm open to new ideas ; )
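
One possible shape for that loop, as a sketch only. The helpers `batches`, `command_for` and `run_all` are hypothetical names; the example assumes --start counts from 0, and the command would need whatever prefix is used to run it remotely.

```python
import subprocess
from typing import Iterator, List, Tuple


def batches(total: int, size: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield (start, limit) pairs that cover `total` records in chunks of `size`."""
    for start in range(0, total, size):
        yield start, min(size, total - start)


def command_for(start: int, limit: int) -> List[str]:
    """Build the loaddatasets call for one batch."""
    return [
        "python", "manage.py", "loaddatasets",
        "--start", str(start),
        "--limit", str(limit),
    ]


def run_all(total: int, size: int = 1000) -> None:
    """Run every batch in sequence, stopping at the first failure."""
    for start, limit in batches(total, size):
        subprocess.run(command_for(start, limit), check=True)
```

Because each batch is a separate process, a dropped connection only costs the batch that was running, and the loop can be resumed from that start value.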

Make receipts command faster

Multithreading and handling multiple requests in parallel are the key (it's the Reimbursement.get_receipt_url() from #76 that handles the HTTP requests)
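
A minimal sketch of the idea using a thread pool from the standard library. The helper `fetch_all` and the stand-in `fake_receipt_url` are hypothetical; in the real command the callable would be whatever wraps Reimbursement.get_receipt_url().

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def fetch_all(
    items: Iterable[T],
    fetch: Callable[[T], Optional[str]],
    max_workers: int = 8,
) -> List[Optional[str]]:
    """Call `fetch` for every item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, items))


# Stand-in for the real HTTP lookup, so the example runs offline.
def fake_receipt_url(document_id: int) -> Optional[str]:
    if document_id % 2:
        return f"https://example.org/receipts/{document_id}.pdf"
    return None  # pretend no receipt was found


urls = fetch_all([1, 2, 3], fake_receipt_url)
```

Threads suit this case because the work is dominated by waiting on HTTP responses; the right `max_workers` would depend on how many parallel requests the remote server tolerates.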

Store company activities as JSONField

Nowadays each Company stores its main and secondary activities in a separate model (Activity) using a ManyToManyField.

I don't think it's necessary, as we won't query by a company's activity (at least not that frequently). Also, this design makes the python manage.py companies <companies.xz> command extremely slow because we cannot use bulk_create with relational tables (or can we?).

Create dashboard with reports

Hi,
I don't know if jarbas is the appropriate project for this functionality, but I suggest creating a dashboard containing all the reports submitted to the Chamber (https://twitter.com/cuducos/status/818434229314928641) and showing, for each real invested in the project, how much was returned to society.
This functionality is inspired by this article: http://www1.folha.uol.com.br/mercado/2017/01/1846255-jornalismo-investigativo-gera-lucro-para-a-sociedade-diz-diretor-de-stanford.shtml

Enable HTTPS

I've never done that — is letsencrypt.org a good idea? I feel like I'd like to pair with someone else to get that up and running because all this is new to me.

Feature request: Data aggregation

One way to quickly explore huge amounts of data is through data aggregation. For instance, what are all the CNPJ/CPFs found in expenses? Which deputies spent the most money?

Does this API aim to provide such a feature? If you intend to address this in any other fashion, please let me know.

New UI/UX

Now Jarbas is changing. From an internal tool for Serenata it will become the public API for the general public. And this impacts UI/UX.

In terms of UX we might change the landing page to offer two ways to start a search, to start browsing our data, maybe two tabs or something like that (sorry about the pt-BR):

And in terms of UI, @tatianasb is working on it.

This is what we have in mind, but we are open to discussing how to better conduct these changes (the technical part is discussed in #52). I just ask you to be down-to-earth and precise in further comments and suggestions: we have a deadline with our 1296 bosses, which is mid-January ; )

Add sample data files for development

It would be nice to have some sample files to import into the database in order to do development of the app (elm, css, etc). It does not need to be the latest data, just some data to work on.

Painful development with docker

When I was developing in the Docker environment, I felt that development was painful because of the following issues:

  • Folders are copied when Docker starts, so modifications aren't applied "on the fly"
  • The Elm container exits after executing gulp for the first time
  • The Jarbas container runs collectstatic after the update by the Elm container, so I had to run the command manually every time gulp executes

Automate deploy (and, eventually, provision)

The Digital Ocean droplet was set manually and the only piece of software that makes deploy easier is a git hook. Yet, nginx and gunicorn processes are owned by root — that's not cool.

Something like Ansible might help.


Personally I don't have devops know-how to fix the users and processes issues, and I've never used Ansible — so any help is appreciated. Surely I can share the application specific logic and help anyone willing to close that issue ; )

Better `loaddatasets`

Use argparse optional arguments to limit the number of rows to be imported and to set where to start. E.g. $ python manage.py loaddatasets --start 200 --import 5000 would load 5k records, skipping the first 200 from the datasets.
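
A sketch of how those two options might be wired up with argparse. The helpers `build_parser` and `select_rows` are hypothetical names. One detail worth noting: `import` is a Python keyword, so the value is stored under a different attribute name through `dest`.

```python
import argparse
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="loaddatasets")
    parser.add_argument("--start", type=int, default=0, help="rows to skip")
    # `import` is a Python keyword, so store the value under another name.
    parser.add_argument("--import", dest="limit", type=int, default=None,
                        help="number of rows to load")
    return parser


def select_rows(rows: Iterable[str], start: int, limit: Optional[int]) -> Iterator[str]:
    """Skip the first `start` rows, then yield at most `limit` rows."""
    stop = None if limit is None else start + limit
    return islice(rows, start, stop)


args = build_parser().parse_args(["--start", "2", "--import", "3"])
selected: List[str] = list(
    select_rows((f"row-{n}" for n in range(10)), args.start, args.limit)
)
```

Using `islice` means the skipped rows are still read but never processed, which keeps the loader working on a streamed file.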

0 results found when clicking to the second page

  1. Go to https://jarbas.datasciencebr.com/#/year/2016
  2. Click to the second page of results.

This error is printed in the inspector.

app.js:1 ApiFail: "BadPayload \"Expecting an Int at _.results[1].term_id but instead got: null\" { status = { code = 200, message = \"OK\" }, headers = Dict.fromList [(\"Allow\",\"GET, HEAD, OPTIONS\"),(\"Cache-Control\",\"max-age=600\"),(\"Content-Type\",\"application/json\"),(\"Date\",\"Wed, 04 Jan 2017 17:48:34 GMT\"),(\"Expires\",\"Wed, 04 Jan 2017 17:58:34 GMT\"),(\"Last-Modified\",\"Wed, 04 Jan 2017 17:48:34 GMT\"),(\"P3P\",\"CP=\\\"ALL DSP COR PSAa PSDa OUR NOR ONL UNI COM NAV\\\"\"),(\"Server\",\"nginx/1.10.0 (Ubuntu)\"),(\"Vary\",\"Accept, Cookie\"),(\"X-Frame-Options\",\"SAMEORIGIN\")], url = \"https://jarbas.datasciencebr.com/api/reimbursement/?format=json&page=2&year=2016\", body = \"{\\\"count\\\":178153,\\\"next\\\":\\\"http://jarbas.datasciencebr.com/api/reimbursement/?format=json&page=3&year=2016\\\",\\\"previous\\\":\\\"http://jarbas.datasciencebr.com/api/reimbursement/?format=json&year=2016\\\",\\\"results\\\":[{\\\"all_net_values\\\":[55.09],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":55.09,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/763/2016/6157575.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":55.09,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":763,\\\"document_id\\\":6157575,\\\"congressperson_id\\\":73535,\\\"congressperson_name\\\":\\\"JORGE TADEU MUDALEN\\\",\\\"congressperson_document\\\":363,\\\"party\\\":\\\"DEM\\\",\\\"state\\\":\\\"SP\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":13,\\\"subquota_description\\\":\\\"Congressperson meal\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"SERVIÇO NACIONAL DE APRENDIZAGEM COMERCIAL  - 
SENAC\\\",\\\"cnpj_cpf\\\":\\\"33469172001644\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"068933\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341200,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[2190.0],\\\"all_reimbursement_numbers\\\":[5615],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":2190.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/3150/2016/6158244.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":2190.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":3150,\\\"document_id\\\":6158244,\\\"congressperson_id\\\":null,\\\"congressperson_name\\\":\\\"PSOL\\\",\\\"congressperson_document\\\":null,\\\"party\\\":\\\"\\\",\\\"state\\\":\\\"\\\",\\\"term_id\\\":null,\\\"term\\\":0,\\\"subquota_id\\\":5,\\\"subquota_description\\\":\\\"Publicity of parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"DIMENSÃO LETREIROS E PLACAS LTDA - 
ME\\\",\\\"cnpj_cpf\\\":\\\"38068847000180\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"0572\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341480,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[6368.1],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":6368.1,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/1383/2016/6156713.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":6368.1,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":1383,\\\"document_id\\\":6156713,\\\"congressperson_id\\\":74688,\\\"congressperson_name\\\":\\\"LUIZ SÉRGIO\\\",\\\"congressperson_document\\\":313,\\\"party\\\":\\\"PT\\\",\\\"state\\\":\\\"RJ\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JOSE FERNANDO XIMENES 
ROCHA\\\",\\\"cnpj_cpf\\\":\\\"36954381772\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11/16\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1340979,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[1000.0],\\\"all_reimbursement_numbers\\\":[5616],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":1000.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2820/2016/6158251.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":1000.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2820,\\\"document_id\\\":6158251,\\\"congressperson_id\\\":171623,\\\"congressperson_name\\\":\\\"FABIO REIS\\\",\\\"congressperson_document\\\":178,\\\"party\\\":\\\"PMDB\\\",\\\"state\\\":\\\"SE\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":5,\\\"subquota_description\\\":\\\"Publicity of parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"FM TOBIAS BARRETO ALMEIDA REIS 
LTDA\\\",\\\"cnpj_cpf\\\":\\\"03826865000108\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"1168\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341485,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[4000.0],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":4000.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2339/2016/6157534.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":4000.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2339,\\\"document_id\\\":6157534,\\\"congressperson_id\\\":160641,\\\"congressperson_name\\\":\\\"PROFESSORA MARCIVANIA\\\",\\\"congressperson_document\\\":15,\\\"party\\\":\\\"PCdoB\\\",\\\"state\\\":\\\"AP\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"ADAN BRYAN NAVEGANTES DE 
SOUZA\\\",\\\"cnpj_cpf\\\":\\\"94049246287\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341171,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[1685.0],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":1685.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2405/2016/6156975.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":1685.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2405,\\\"document_id\\\":6156975,\\\"congressperson_id\\\":160538,\\\"congressperson_name\\\":\\\"BOHN GASS\\\",\\\"congressperson_document\\\":499,\\\"party\\\":\\\"PT\\\",\\\"state\\\":\\\"RS\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JAIR ROQUE 
DALL'ALBA\\\",\\\"cnpj_cpf\\\":\\\"46464263072\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1340963,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[4650.0],\\\"all_reimbursement_numbers\\\":[5615],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":4650.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2912/2016/6157154.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":4650.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2912,\\\"document_id\\\":6157154,\\\"congressperson_id\\\":112437,\\\"congressperson_name\\\":\\\"JOÃO MARCELO SOUZA\\\",\\\"congressperson_document\\\":76,\\\"party\\\":\\\"PMDB\\\",\\\"state\\\":\\\"MA\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":120,\\\"subquota_description\\\":\\\"Automotive vehicle renting or charter\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JV EVENTOS E TURISMO LTDA\\\",\\\"cnpj_cpf\\\":\\\"08194075000162\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"100678\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341034,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}}]}\" }"

API: /api/ is pointing to localhost

Now it's: {"document":"http://127.0.0.1:8001/api/document/"}

It should be: {"document":"http://jarbas.datasciencebr.com/api/document/"}

Pagination links also have this bug.

RFC: Ability to filter reimbursements by a range of 'issue_date's

The other day I was playing with the flight tickets dataset and was able to identify tickets for international flights (basically check if the total_net_value is > 4k).

With that info we'd be able to identify a period of time when the congressperson was abroad and investigate reimbursements in Brazil that require the deputy to be physically present (like taxi and meals).

I believe we have more important things to cover before getting to the point of automating the extraction of "period away from Brazil" data points. BUT I'm willing to manually check some of those reimbursements to ensure it is worth the trouble to automate this in rosie in the future. For that, it would be super helpful if jarbas could provide a search combining the following filters:

  • issue_date_start
  • issue_date_end
  • congressperson_id (applicant_id might be enough as well)
  • subquota_number

Dunno if it would be useful for other use cases so feel free to 👎 this issue 😄
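
To make the request concrete, here is a sketch of what a combined search could look like from the client side. The four parameter names are the ones proposed above and are not confirmed to exist in the API; the helper `search_url` and the example values are illustrative only.

```python
from urllib.parse import urlencode

# Reimbursement endpoint as it appears elsewhere in these issues.
BASE_URL = "https://jarbas.datasciencebr.com/api/reimbursement/"


def search_url(congressperson_id: int, subquota_number: int,
               issue_date_start: str, issue_date_end: str) -> str:
    """Build a search URL that combines the four proposed filters."""
    params = {
        "issue_date_start": issue_date_start,  # proposed, ISO date
        "issue_date_end": issue_date_end,      # proposed, ISO date
        "congressperson_id": congressperson_id,
        "subquota_number": subquota_number,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = search_url(73535, 13, "2016-11-01", "2016-11-30")
```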

Issue with 'Document value' presentation

While going through http://jarbas.datasciencebr.com/#/party/PMDB, I noticed that the values of 'Document value' are presented like ##.###. As we use English, it should be like ##.###,##: 1) use '.' as the thousands separator; 2) use ',' as the decimal separator; 3) use only 2 decimal places. When I spot a Document value of "60.000", I interpret it as a spending of R$ 6000,00 instead of R$ 60,00 as it is on the receipt.
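
A small sketch of the formatting this issue asks for ('.' for thousands, ',' for decimals, two decimal places). The function name `format_document_value` is hypothetical, and the front-end itself is written in Elm, so this only illustrates the rule.

```python
def format_document_value(value: float) -> str:
    """Format a number with '.' for thousands and ',' for decimals, 2 places."""
    english = f"{value:,.2f}"  # e.g. 6368.1 -> "6,368.10"
    # Swap the two separators via a temporary placeholder.
    return english.replace(",", "_").replace(".", ",").replace("_", ".")


examples = [format_document_value(v) for v in (60.0, 6368.1, 1234567.891)]
```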

Error "core_document" does not exist

Facing this error when running the command:
docker-compose run --rm jarbas python manage.py loaddatasets
I tried with and without sudo and the same error occurs.

Here is the error:
django.db.utils.ProgrammingError: relation "core_document" does not exist

I noticed that the file db.sqlite3 is not created.

as @cuducos pointed out:
Is your .env overriding docker-compose.yml's DATABASE_URL by any chance?

In my local .env I just replaced the line 5:
DATABASE_URL=sqlite:///db.sqlite3
by this:
DATABASE_URL=postgres://jarbas:mysecretpassword@db/jarbas

Same error.

Add Chamber of Deputies request for information

Since we've been auditing some suspicious cases found by Rosie, it would be nice if we could see information related to the requests we made to the Chamber of Deputies, like protocol number and status, on the Jarbas front-end. That way we can see whether or not a document has an open request.

Expose more resources through the API (Jarbas)

It is more like a question than an actual issue but here it goes. Are you guys planning on exposing more resources through the API? The reason I am asking this is because as of right now the only method available is /api/document which lists Document objects. The thing is, each of the documents listed have attributes that could be referred to by their id's, state and party are examples of such. Then we could have both /api/states and /api/parties, respectively, to list the states and the parties.

The reason I am asking this is because I am planning on writing a wrapper in either Objective-C or Java to consume this information. By doing this someone else (or even me) could use it to write an app similar to 'Meu deputado' that could list the expenses, and their details, for any given congressperson.

Here are my suggestions for new methods:
/api/states
/api/parties
/api/congresspeople

With at least these new methods I could start working on something more meaningful.

Let me know what you guys think. I really appreciate what you guys are doing and would like to contribute somehow.

Allow more results per page

It's a bit limiting to provide only 10 results per page. I think it would be useful to allow something like:

GET /api/document/?year=2015&_limit=100

What do you think?

Add docker image

Sometimes we don't want to install all the dependencies and just want to run something like docker run jarbas and get everything working.

I'd like to write the Dockerfile, but I want to know whether there is something particular about the project or whether I can build it as a usual Python application.

Jarbas and machine learning results

I'm documenting here the next steps of Jarbas in terms of development, i.e. it's a big issue, not a tiny and specific one.

We have a new player in Serenata de Amor operation that is Rosie. She's our robot, she'll tell us what she thinks of each reimbursement. Basically for each hypothesis of our roadmap Rosie will:

  1. Say that for each reimbursement, uniquely identified by a certain combination of year, applicant_id and document_id, there is a probability of irregularity/immorality/illegality (i.e. 0 < probability < 1).
  2. Send an HTTP POST request to Jarbas with this probability and some meta information about the hypothesis this probability refers to.
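
The two steps above could be sketched as follows. Everything specific here is an assumption for illustration: the endpoint URL, the payload field names beyond year, applicant_id and document_id, the helper `build_request` and the example probability. The request is only built, not sent.

```python
import json
from urllib.request import Request


def build_request(url: str, year: int, applicant_id: int, document_id: int,
                  probability: float, hypothesis: str) -> Request:
    """Build (but do not send) the POST request for one reimbursement."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    payload = {
        # These three fields identify the reimbursement.
        "year": year,
        "applicant_id": applicant_id,
        "document_id": document_id,
        "probability": probability,
        "hypothesis": hypothesis,  # hypothetical name for the meta information
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and illustrative values.
request = build_request("https://example.org/api/probability/",
                        2016, 763, 6157575, 0.87, "meal_price_outlier")
```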

Thus Jarbas should be able to: (removed in favor of this list)

Conceptually this means that Jarbas is moving from an internal tool that helps people engaged in Serenata de Amor to a public data visualization platform that shares our machine learning results with the world. A lot of changes (refactor, renaming, endpoints etc.) are expected 🎉
