okfn-brasil / jarbas

🎩 API for information and suspicions about reimbursements by Brazilian congresspeople

Home Page: https://jarbas.serenata.ai/

jarbas's Introduction

Jarbas is now part of the Serenata de Amor main repo.

jarbas's Issues

Settings instructions are relevant for Docker users too

The Settings section is relevant to Docker users too; we should reorganize that in the README.md. Thanks @danizavtz

Make receipts command faster

Multithreading and handling multiple requests in parallel are the key (it's the Reimbursement.get_receipt_url() from #76 that handles the HTTP requests)
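
A minimal sketch of the parallel approach, assuming a thread pool is acceptable for the I/O-bound HTTP calls. The helper name `fetch_receipt_urls` and the `fetch` callable are hypothetical; `fetch` stands in for whatever wraps `Reimbursement.get_receipt_url()`.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_receipt_urls(reimbursements, fetch, max_workers=8):
    """Call fetch(item) for every reimbursement using a pool of threads.

    `fetch` is a hypothetical stand-in for the function that performs the
    HTTP request. Results are returned in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, reimbursements))


if __name__ == "__main__":
    # Fake fetcher so the sketch runs without network access.
    def fake_fetch(document_id):
        return f"https://example.org/receipts/{document_id}.pdf"

    print(fetch_receipt_urls([1, 2, 3], fake_fetch))
```

Threads suit this case because the work is mostly waiting on the network; the pool size would need tuning against what the remote server tolerates.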

Dates presented are off by one in relation to the date of the receipt

Looks like something in the client side based on the API response 🤔

Applying good patterns to project layout

So, in many projects that I have seen, the structure used is:

Maybe this could be an issue to fix?

Check if receipt URL exist before showing the link

Use argparse optional arguments to limit the number of rows to be imported and to set where to start. E.g. $ python manage.py loaddatasets --start 200 --import 5000 would load 5k records, skipping the first 200 from the datasets.

Add Chamber of Deputies request for information

Since we've been auditing some suspicious cases found by Rosie, it would be nice if we could see information related to the requests we made to the Chamber of Deputies, like protocol number and status, on the Jarbas front-end. That way we can see whether or not a document has an open request.

Break Internationalization module into smaller modules

The Internationalization module got too big (more than 600 LOC) and it might be a good idea to break it into smaller modules.

Cleanup: remove old models, serializers, views, tests etc.

Once #52 is done, we can do some housekeeping:

Supplier endpoint returning 404 when CNPJ not found

Example: http://jarbas.datasciencebr.com/api/suplier/02012862000160

Add CONTRIBUTING.md

It would be nice to have a CONTRIBUTING.md in order to know how to best contribute to the project.

Feature request: Data aggregation

One way to quickly explore huge amounts of data is through data aggregation. For instance, what are all the CNPJ/CPFs found in expenses? Which deputies spent the most money?

Does this API aim to provide such a feature? If you intend to address this in any other fashion, please let me know.

0 results found when clicking to the second page

Go to https://jarbas.datasciencebr.com/#/year/2016
Click to the second page of results.

This error is printed in the inspector.

app.js:1 ApiFail: "BadPayload \"Expecting an Int at _.results[1].term_id but instead got: null\" { status = { code = 200, message = \"OK\" }, headers = Dict.fromList [(\"Allow\",\"GET, HEAD, OPTIONS\"),(\"Cache-Control\",\"max-age=600\"),(\"Content-Type\",\"application/json\"),(\"Date\",\"Wed, 04 Jan 2017 17:48:34 GMT\"),(\"Expires\",\"Wed, 04 Jan 2017 17:58:34 GMT\"),(\"Last-Modified\",\"Wed, 04 Jan 2017 17:48:34 GMT\"),(\"P3P\",\"CP=\\\"ALL DSP COR PSAa PSDa OUR NOR ONL UNI COM NAV\\\"\"),(\"Server\",\"nginx/1.10.0 (Ubuntu)\"),(\"Vary\",\"Accept, Cookie\"),(\"X-Frame-Options\",\"SAMEORIGIN\")], url = \"https://jarbas.datasciencebr.com/api/reimbursement/?format=json&page=2&year=2016\", body = \"{\\\"count\\\":178153,\\\"next\\\":\\\"http://jarbas.datasciencebr.com/api/reimbursement/?format=json&page=3&year=2016\\\",\\\"previous\\\":\\\"http://jarbas.datasciencebr.com/api/reimbursement/?format=json&year=2016\\\",\\\"results\\\":[{\\\"all_net_values\\\":[55.09],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":55.09,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/763/2016/6157575.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":55.09,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":763,\\\"document_id\\\":6157575,\\\"congressperson_id\\\":73535,\\\"congressperson_name\\\":\\\"JORGE TADEU MUDALEN\\\",\\\"congressperson_document\\\":363,\\\"party\\\":\\\"DEM\\\",\\\"state\\\":\\\"SP\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":13,\\\"subquota_description\\\":\\\"Congressperson meal\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"SERVIÇO NACIONAL DE APRENDIZAGEM COMERCIAL  - 
SENAC\\\",\\\"cnpj_cpf\\\":\\\"33469172001644\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"068933\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341200,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[2190.0],\\\"all_reimbursement_numbers\\\":[5615],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":2190.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/3150/2016/6158244.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":2190.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":3150,\\\"document_id\\\":6158244,\\\"congressperson_id\\\":null,\\\"congressperson_name\\\":\\\"PSOL\\\",\\\"congressperson_document\\\":null,\\\"party\\\":\\\"\\\",\\\"state\\\":\\\"\\\",\\\"term_id\\\":null,\\\"term\\\":0,\\\"subquota_id\\\":5,\\\"subquota_description\\\":\\\"Publicity of parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"DIMENSÃO LETREIROS E PLACAS LTDA - 
ME\\\",\\\"cnpj_cpf\\\":\\\"38068847000180\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"0572\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341480,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[6368.1],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":6368.1,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/1383/2016/6156713.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":6368.1,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":1383,\\\"document_id\\\":6156713,\\\"congressperson_id\\\":74688,\\\"congressperson_name\\\":\\\"LUIZ SÉRGIO\\\",\\\"congressperson_document\\\":313,\\\"party\\\":\\\"PT\\\",\\\"state\\\":\\\"RJ\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JOSE FERNANDO XIMENES 
ROCHA\\\",\\\"cnpj_cpf\\\":\\\"36954381772\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11/16\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1340979,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[1000.0],\\\"all_reimbursement_numbers\\\":[5616],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":1000.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2820/2016/6158251.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":1000.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2820,\\\"document_id\\\":6158251,\\\"congressperson_id\\\":171623,\\\"congressperson_name\\\":\\\"FABIO REIS\\\",\\\"congressperson_document\\\":178,\\\"party\\\":\\\"PMDB\\\",\\\"state\\\":\\\"SE\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":5,\\\"subquota_description\\\":\\\"Publicity of parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"FM TOBIAS BARRETO ALMEIDA REIS 
LTDA\\\",\\\"cnpj_cpf\\\":\\\"03826865000108\\\",\\\"document_type\\\":0,\\\"document_number\\\":\\\"1168\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341485,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}},{\\\"all_net_values\\\":[4000.0],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":4000.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2339/2016/6157534.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":4000.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2339,\\\"document_id\\\":6157534,\\\"congressperson_id\\\":160641,\\\"congressperson_name\\\":\\\"PROFESSORA MARCIVANIA\\\",\\\"congressperson_document\\\":15,\\\"party\\\":\\\"PCdoB\\\",\\\"state\\\":\\\"AP\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"ADAN BRYAN NAVEGANTES DE 
SOUZA\\\",\\\"cnpj_cpf\\\":\\\"94049246287\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341171,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[1685.0],\\\"all_reimbursement_numbers\\\":[5617],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":1685.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2405/2016/6156975.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":1685.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2405,\\\"document_id\\\":6156975,\\\"congressperson_id\\\":160538,\\\"congressperson_name\\\":\\\"BOHN GASS\\\",\\\"congressperson_document\\\":499,\\\"party\\\":\\\"PT\\\",\\\"state\\\":\\\"RS\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":1,\\\"subquota_description\\\":\\\"Maintenance of office supporting parliamentary activity\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JAIR ROQUE 
DALL'ALBA\\\",\\\"cnpj_cpf\\\":\\\"46464263072\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"11\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1340963,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":null},{\\\"all_net_values\\\":[4650.0],\\\"all_reimbursement_numbers\\\":[5615],\\\"all_reimbursement_values\\\":null,\\\"document_value\\\":4650.0,\\\"probability\\\":null,\\\"receipt\\\":{\\\"url\\\":\\\"http://www.camara.gov.br/cota-parlamentar/documentos/publ/2912/2016/6157154.pdf\\\",\\\"fetched\\\":true},\\\"remark_value\\\":0.0,\\\"total_net_value\\\":4650.0,\\\"total_reimbursement_value\\\":null,\\\"year\\\":2016,\\\"applicant_id\\\":2912,\\\"document_id\\\":6157154,\\\"congressperson_id\\\":112437,\\\"congressperson_name\\\":\\\"JOÃO MARCELO SOUZA\\\",\\\"congressperson_document\\\":76,\\\"party\\\":\\\"PMDB\\\",\\\"state\\\":\\\"MA\\\",\\\"term_id\\\":55,\\\"term\\\":2015,\\\"subquota_id\\\":120,\\\"subquota_description\\\":\\\"Automotive vehicle renting or charter\\\",\\\"subquota_group_id\\\":0,\\\"subquota_group_description\\\":\\\"\\\",\\\"supplier\\\":\\\"JV EVENTOS E TURISMO LTDA\\\",\\\"cnpj_cpf\\\":\\\"08194075000162\\\",\\\"document_type\\\":1,\\\"document_number\\\":\\\"100678\\\",\\\"issue_date\\\":\\\"2016-11-30\\\",\\\"month\\\":11,\\\"installment\\\":0,\\\"batch_number\\\":1341034,\\\"passenger\\\":\\\"\\\",\\\"leg_of_the_trip\\\":\\\"\\\",\\\"suspicions\\\":{\\\"meal_price_outlier\\\":false,\\\"suspicious_traveled_speed_day\\\":false,\\\"over_monthly_subquota_limit\\\":false}}]}\" }"

Jarbas and machine learning results

I'm documenting here the next steps of Jarbas in terms of development, i.e. it's a big issue, not a tiny and specific one.

We have a new player in Serenata de Amor operation that is Rosie. She's our robot, she'll tell us what she thinks of each reimbursement. Basically for each hypothesis of our roadmap Rosie will:

Say that for each reimbursement, uniquely identified by a certain combination of year, applicant_id and document_id, there is a probability of irregularity/immorality/illegality (i.e. 0 < probability < 1).
Send an HTTP POST request to Jarbas with this probability and some meta information about the hypothesis this probability refers to.
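
To make the two steps above concrete, here is a sketch of the kind of JSON body Rosie could send. The function name and the payload field names are assumptions for illustration only; the issue does not define the actual schema or endpoint.

```python
import json


def build_probability_payload(year, applicant_id, document_id,
                              probability, hypothesis):
    """Build a JSON body for one reimbursement (hypothetical schema).

    The reimbursement is identified by year, applicant_id and document_id;
    the probability must satisfy 0 < probability < 1.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return json.dumps(
        {
            "year": year,
            "applicant_id": applicant_id,
            "document_id": document_id,
            "probability": probability,
            "hypothesis": hypothesis,  # meta information, name assumed
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    print(build_probability_payload(2016, 763, 6157575, 0.42,
                                    "meal_price_outlier"))
```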

Thus Jarbas should be able to: (removed in favor of this list)

Conceptually this means that Jarbas is moving from an internal tool to help people engaged in Serenata de Amor to a public data visualization platform to share our artificial intelligence and machine learning results with the world. A lot of changes (refactor, renaming, endpoints etc.) are expected 🎉

Update to Elm 0.18

Allow URL navigation & permalink to each document

Show widget of Google Street View

When accessing a document, it would be awesome to also be able to navigate the street where the expense was made.

Depends on #2.

Add pagination

When the API returns more than 25 results, Jarbas should be able to fetch next pages (Load more button in the bottom maybe).

Store company activities as JSONField

Nowadays each Company stores its main and secondary activities in a separate model (Activity) using a ManyToManyField.

I don't think it's necessary as we won't query by a company activity (at least not that frequently). Also this design makes the python manage.py companies <companies.xz> extremely slow because we cannot use bulk_create with relational tables (or can we?).

Automate deploy with Docker

Now we have a Docker structure, so we can think about how to use that to automate the deploy.

Ideally a piece of code to automate provision is much appreciated ; )

Add link to variable descriptions

As proposed by @josircg and discussed here.

Error "core_document" does not exist

Facing this error when running the command:
docker-compose run --rm jarbas python manage.py loaddatasets
Tried with and without sudo and the same error occurs.

Here is the error:
django.db.utils.ProgrammingError: relation "core_document" does not exist

I noticed that the file db.sqlite3 is not created.

as @cuducos pointed out:
Is your .env overriding docker-compose.yml's DATABASE_URL by any chance?

In my local .env I just replaced line 5:
DATABASE_URL=sqlite:///db.sqlite3
by this:
DATABASE_URL=postgres://jarbas:mysecretpassword@db/jarbas

same error.

Add Memcached to Docker pipeline

The backend is ready to use Memcached, but that has not been implemented in the Docker pipeline. As we intend to use Docker for deploy/provision, this is a pending task.

Settings to Memcached access point (e.g. CACHE_LOCATION=localhost:11211) are to be set as envvars or in the .env.
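
A sketch of how the settings module could pick up that variable. `CACHE_LOCATION` comes from the text above; the helper name is hypothetical, and the backend path is an assumption that depends on the Django version in use (older releases shipped a different Memcached backend class).

```python
import os


def memcached_settings(environ=os.environ):
    """Return a Django-style CACHES dict built from CACHE_LOCATION.

    Falls back to localhost:11211 when the variable is not set.
    The BACKEND path is an assumption; check it against your Django version.
    """
    location = environ.get("CACHE_LOCATION", "localhost:11211")
    return {
        "default": {
            "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
            "LOCATION": location,
        }
    }


if __name__ == "__main__":
    # Inside Docker the host would be the service name, e.g. "memcached:11211".
    print(memcached_settings({"CACHE_LOCATION": "memcached:11211"}))
```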

Make reimbursements command load a specific file/path

Right now it loads the date from jarbas/settings.py and downloads the file from S3, but I think it can read Rosie's output in a similar way to the irregularities command.

For DRY-sake LoadCommand could manage that.

Load supplier information based on the CNPJ

New UI/UX

Now Jarbas is changing. From an internal tool for Serenata it will become the public API for the general public. And this impacts UI/UX.

In terms of UX we might change the landing page to offer two ways to start a search, to start browsing our data, maybe two tabs or something like that (sorry about the pt-BR):

And in terms of UI, @tatianasb is working on it.

This is what we have in mind, but we are open to discussing how to better conduct these changes (the technical part is discussed in #52). I just ask you to be down-to-earth and precise in further comments and suggestions; we have a deadline with our 1296 bosses, which is mid-January ; )

Jarbas is crashing in Chrome & Firefox

Create dashboard with reports

Hi,
I don't know if Jarbas is the appropriate project for this functionality, but I suggest creating a dashboard containing all reports submitted to the Chamber (https://twitter.com/cuducos/status/818434229314928641) and showing, for each real invested in the project, how much was returned to society.
This functionality is inspired by this article: http://www1.folha.uol.com.br/mercado/2017/01/1846255-jornalismo-investigativo-gera-lucro-para-a-sociedade-diz-diretor-de-stanford.shtml

Issue with 'Document value' presentation

By going through http://jarbas.datasciencebr.com/#/party/PMDB, I noticed the values of 'Document value' are presented like ##.###. As we use English, it should be like ##.###,##: 1) use '.' as the thousands separator; 2) use ',' as the decimal separator; 3) use only 2 decimal places. When I see a Document value of "60.000", I read it as a spending of R$ 6000,00 instead of the R$ 60,00 shown on the receipt.
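
The three formatting rules requested above can be sketched as follows. This is an illustration in Python with a hypothetical helper name, not the front-end code, which would need its own implementation.

```python
def format_document_value(value):
    """Format a number as ##.###,## with exactly two decimal places.

    Uses '.' as the thousands separator and ',' as the decimal separator.
    """
    english = f"{value:,.2f}"  # comma thousands, dot decimals
    # Swap the two separators via a temporary placeholder.
    return english.replace(",", "_").replace(".", ",").replace("_", ".")


if __name__ == "__main__":
    for amount in (60, 1234.5, 6000):
        print(format_document_value(amount))
```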

Write tests for commands

We have two custom Django commands that are untested, both in jarbas/core/management/commands/:

loaddatasets.py
loadsuppliers.py

Maybe a quick way to do it is to mock a CSV response with a few records, call the command (check test_static_files.py) and assert that the records from the mocked CSV are in the database.

But surely unit tests would be welcomed too ; )

Adapt front-end to be more stateless

Nowadays passing information (Language basically everywhere, parentId for RelatedTable etc.) has proved too verbose (and adds unnecessary complexity — e.g. with Html.map).

This design is more promising.

Enable HTTPS

I've never done that — is letsencrypt.org a good idea? I feel like I'd like to pair with someone else to get that up and running because all this is new to me.

Make State search field case insensitive

API: /api/ is pointing to localhost

Now it's: {"document":"http://127.0.0.1:8001/api/document/"}

It should be: {"document":"http://jarbas.datasciencebr.com/api/document/"}

Pagination links also have this bug.

Migrate application & database to the same server

Currently Jarbas is a slow application for four reasons:

Loading time is terrible because it is hosted in a free tier at Heroku; this means Heroku might have to wake up the application when a user reaches it
No CDN for assets/static files; everything is being served by Heroku (with WhiteNoise)
Searching is slow because the Heroku free tier only allows 10k rows in the database, thus we're hosting the app at Heroku but the database at AWS
Given the cost-benefit, paying for a decent server doesn't seem to be a priority for Serenata de Amor at this point

Split front-end (website) and backend (API endpoints)

Load the datasets into our Postgres

cc @lucavgobbi @pablov

A new loaddatasets is online:

$ python manage.py loaddatasets --help
usage: manage.py loaddatasets [-h] [--version] [-v {0,1,2,3}]
                              [--settings SETTINGS] [--pythonpath PYTHONPATH]
                              [--traceback] [--no-color] [--start START]
                              [--limit LIMIT]

Load Serenata de Amor datasets into the database

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --start START, -s START
                        Where the load should start (skips first LIMIT
                        documents)
  --limit LIMIT, -l LIMIT
                        Limit the number of documents to load

Basically with the optional arguments --start and --limit we can split the task of loading tons of data to our Postgres. For example, python manage.py loaddatasets --start 1000 --limit 50 will only create/update the records number 1000 until 1049.
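
The slicing behaviour described above can be sketched with `itertools.islice`. The function name `select_rows` is hypothetical and only illustrates the --start/--limit semantics; it is not the command's actual code.

```python
from itertools import islice


def select_rows(rows, start=0, limit=None):
    """Skip the first `start` rows, then yield at most `limit` rows.

    With limit=None every remaining row is yielded.
    """
    stop = None if limit is None else start + limit
    return islice(rows, start, stop)


if __name__ == "__main__":
    # Stand-in for the dataset rows: a range of record numbers.
    picked = list(select_rows(range(5000), start=1000, limit=50))
    print(picked[0], picked[-1], len(picked))
```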

I once estimated we have ~2.7 million records. Running it locally is slower, but the script tends to run for a longer time. Running it at Heroku is quicker, but I guess Heroku drops the connection after a while.

I'll try to write a Python or shell script with a for loop to run this loaddatasets 1k records by 1k records at Heroku. But I'm open to new ideas ; )
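
One possible shape for that loop, as a sketch: generate one loaddatasets invocation per batch. The helper name is hypothetical, and how each command is actually executed (locally, or through a Heroku one-off process) is left open as an assumption.

```python
def batch_commands(total, batch_size=1000):
    """Yield one loaddatasets argument list per batch of records."""
    for start in range(0, total, batch_size):
        limit = min(batch_size, total - start)  # last batch may be smaller
        yield ["python", "manage.py", "loaddatasets",
               "--start", str(start), "--limit", str(limit)]


if __name__ == "__main__":
    # Print the commands; each list could instead be handed to
    # subprocess.run() once the execution environment is decided.
    for command in batch_commands(2500):
        print(" ".join(command))
```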

Painful development with docker

When I was developing in the Docker environment, I felt that development was painful because of the following issues:

Folders are copied when Docker starts, so modifications aren't picked up on the fly
The Elm container exits after executing gulp for the first time
The Jarbas container runs collectstatic after the update by the Elm container, so I had to run the command manually every time gulp executes

Session freezing

When running a query by party, for example, the query freezes while navigating through the pages, making it impossible to move between pages; not even the side menu works. It is necessary to clear the URL and start again, and it freezes again after 3-4 queries.

Allow more results per page

It's a bit limiting to provide only 10 results per page. I think it would be useful to allow something like:

GET /api/document/?year=2015&_limit=100

What do you think?

Change DB to one that supports ~3 million rows

@lucavgobbi and @pablov suggested Amazon EC2 (free for 12 months). If someone can set up this instance and send in PVT the connection details, I can set this up in the server.

Add docker image

Sometimes we don't want to install all the dependencies and just want to run something like docker run jarbas and get everything working.

I want to write the Dockerfile, but I'd like to know whether there is something particular about the project or whether I can treat it as a usual Python application.

Search by CEAP document # is not a good idea…

… because it looks like document_id is not unique 😮

@Irio, as this repo emerged from an issue you raised, how would you like to search for documents and company info here?

Add sample data files for development

It would be nice to have some sample files to import into the database in order to do development of the app (elm, css, etc). It does not need to be the latest data, just some data to work on.

RFC: Ability to filter reimbursements by a range of 'issue_date's

The other day I was playing with the flight tickets dataset and was able to identify tickets for international flights (basically check if the total_net_value is > 4k).

With that info we'd be able to identify a period of time when the congressperson was abroad and investigate reimbursements in Brazil that require the deputy to be physically present (like taxi and meals).

I believe we have more important things to cover before getting to the point of automating the extraction of "period away from Brazil" data points BUT I'm willing to manually check some of those reimbursements to ensure it is worth the trouble to automate this in rosie in the future and for that it would be super helpful if jarbas could provide a search for the following filters in combination:

issue_date_start
issue_date_end
congressperson_id (applicant_id might be enough as well)
subquota_number
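
A sketch of how a client might combine those filters into one request URL. The parameter names are the ones proposed above and are not guaranteed to be supported by the API; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def reimbursement_search_url(base_url, **filters):
    """Append the given filters (skipping None values) as a query string."""
    params = {key: value for key, value in sorted(filters.items())
              if value is not None}
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


if __name__ == "__main__":
    print(reimbursement_search_url(
        "https://jarbas.datasciencebr.com/api/reimbursement/",
        issue_date_start="2016-11-01",
        issue_date_end="2016-11-30",
        applicant_id=763,
    ))
```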

Dunno if it would be useful for other use cases so feel free to 👎 this issue 😄

Make file name a required argument for irregularities command (not an option)

And as soon as #68 is done, for reimbursement too.

Add tests for Elm

Add .editorconfig file

To keep things consistent for everyone, we need to add this beautiful piece of configuration.

Automate deploy (and, eventually, provision)

The Digital Ocean droplet was set manually and the only piece of software that makes deploy easier is a git hook. Yet, nginx and gunicorn processes are owned by root — that's not cool.

Something like Ansible might help.

Personally I don't have devops know-how to fix the users and processes issues, and I've never used Ansible — so any help is appreciated. Surely I can share the application specific logic and help anyone willing to close that issue ; )

Expose more resources through the API (Jarbas)

It is more like a question than an actual issue, but here it goes. Are you guys planning on exposing more resources through the API? The reason I am asking is that, as of right now, the only method available is /api/document, which lists Document objects. The thing is, each of the documents listed has attributes that could be referred to by their ids; state and party are examples of such. Then we could have both /api/states and /api/parties, respectively, to list the states and the parties.

The reason I am asking this is because I am planning on writing a wrapper in either objective-c or java to consume this information. By doing this someone else (or even me) could use it to write an app similar to 'Meu deputado' that could list the expenses, and its details, for any given congressperson.

Here are my suggestions for new methods:
/api/states
/api/parties
/api/congresspeople

With at least these new methods I could start working on something more meaningful.

Let me know what you guys think. I really appreciate what you guys are doing and would like to contribute somehow.
