eubr-bigsea / citrus

License: Apache License 2.0

JavaScript 14.72% HTML 0.16% Vue 83.80% Dockerfile 0.03% Shell 0.01% SCSS 0.95% CSS 0.03% MDX 0.29%

citrus's Issues

PCA and Locality-sensitive hashing are only working with 'Vectorize attribute(s)'

Workflow id: 90.

===============================

Invalid or missing parameters: requirement failed: Column petallength must be of type struct<type:tinyint,size:int,indices:array,values:array> but was actually decimal(5,1).
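
The error above suggests that the operation expects a single vector-typed column rather than a plain numeric one. As a rough illustration only, the pure-Python sketch below shows what a "Vectorize attribute(s)" step does conceptually: it packs several numeric attributes into one feature vector per row. The function `vectorize_rows`, the `features` column name, and the sample rows are hypothetical and are not taken from Lemonade or Spark.

```python
from decimal import Decimal


def vectorize_rows(rows, columns, output="features"):
    """Return copies of the rows with the given numeric columns packed into one list."""
    vectorized = []
    for row in rows:
        new_row = dict(row)
        # Decimal (or int) values are converted to float so the vector is homogeneous.
        new_row[output] = [float(row[name]) for name in columns]
        vectorized.append(new_row)
    return vectorized


# Hypothetical sample data; decimal values mirror the decimal(5,1) type in the error.
rows = [
    {"petallength": Decimal("1.4"), "petalwidth": Decimal("0.2")},
    {"petallength": Decimal("4.7"), "petalwidth": Decimal("1.4")},
]
result = vectorize_rows(rows, ["petallength", "petalwidth"])
```

If this reading is right, operations such as PCA could either require the vectorizing step explicitly in the interface or apply an equivalent step themselves when they receive plain numeric columns.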

Delete button is not working (on Firefox?)

The delete button (to delete a dataset or a workflow) is not working on Firefox (not sure if this issue can happen on other browsers).

This button is working for Opera and Chrome.

Standard Scaler ("Escalador Padrão", Z-score) does not work without the Feature Assembler ("Definidor de Feature")

Workflow 61.

Statistical summary

Workflow 46.

I set up the workflow for the statistical summary.
Running the workflow fails with a "Valor obrigatório ausente" (required value missing) error, even though there is no attribute left to fill in. The "Atributos (vazio=todos)" (attributes, empty = all) field is empty, so all attributes should be considered.

Convert categorical to numeric

The "Converter Categórico para Numerico" (convert categorical to numeric) operation only works when used together with "Vetorizar Atributo" (vectorize attribute, i.e. the feature assembler).

Read data disabled

The "Ler dados" (read data) operation appears disabled in the left sidebar.

Random Forest Error

When running Random Forest for classification (workflow id: 53), it returns the following error:

Traceback (most recent call last):
File "/usr/local/juicer/juicer/spark/spark_minion.py", line 438, in _perform_execute
self._state)
File "/usr/local/juicer/juicer/transpiler.py", line 296, in transpile
using_stdout, workflow, deploy, export_notebook)
File "/usr/local/juicer/juicer/transpiler.py", line 220, in generate_code
v = template.render(env_setup)
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 1008, in render
return self.environment.handle_exception(exc_info, True)
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 780, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/juicer/juicer/spark/templates/operation.tmpl", line 56, in top-level template code
{%- handleinstance instance %}
File "/usr/local/juicer/juicer/util/template_util.py", line 53, in handle
raise(JuicerException(msg), None, sys.exc_info()[2])
File "/usr/local/juicer/juicer/util/template_util.py", line 40, in _handle
return caller()
File "/usr/local/lib/python2.7/dist-packages/jinja2/runtime.py", line 579, in _invoke
rv = self._func(*arguments)
File "/usr/local/juicer/juicer/spark/templates/operation.tmpl", line 119, in template
{{instance.generate_code().strip() | indent(width=8, indentfirst=False)}}
File "/usr/local/juicer/juicer/spark/ml_operation2.py", line 40, in generate_code
return "\n".join([algorithm_code, model_code])
JuicerException: Erro de tipo analisando modelo (template) para inst\xe2ncia 8739f50e-a193-4e7c-8c5e-9a2fa38e418b RandomForestModelOperation

Save workflow image as PDF

Lemonade should allow saving workflow images in PDF (or another vector format). This would help when viewing large workflows.

Error in the Map ("Mapa") operation

Workflow 42: an error occurred when running the workflow.

Operation copy

It would be nice to have an option to copy an operation, either with a right-click or with 'Ctrl + C' followed by 'Ctrl + V'.

SVM error 2: when applying its model

This is an error associated with the SVM classifier (I am creating a new issue because I could not reopen the previous one).

I tried to apply its model, but it did not work.

The workflow id is 87. I am trying to create a base workflow for all the classification models that I am testing right now, differing only in the classification algorithm itself.

====================

Traceback (most recent call last):
File "/usr/local/juicer/juicer/spark/spark_minion.py", line 460, in _perform_execute
self._emit_event(room=job_id, namespace='/stand'))
File "/tmp/juicer_app_87_87_420.py", line 617, in main
task_futures['aada130d-81f3-4d90-875f-309c000171de'].result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 462, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/thread.py", line 63, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/juicer_app_87_87_420.py", line 616, in <lambda>
lambda: evaluate_model_5(spark_session, cached_state, emit_event))
File "/tmp/juicer_app_87_87_420.py", line 362, in evaluate_model_5
parent_result = task_futures['c5f52d1a-39f3-45e8-9af4-eb0fadc7cfaa'].result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 462, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/thread.py", line 63, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/juicer_app_87_87_420.py", line 355, in <lambda>
lambda: apply_model_4(spark_session, cached_state, emit_event))
File "/tmp/juicer_app_87_87_420.py", line 323, in apply_model_4
out3 = model2.transform(sd15, params)
NameError: global name 'model2' is not defined

"Saving as image" is adding other things in the generated image

When using the "Save as image" option, I noticed that it adds some extra lines to the output image:

Workflow 71 does not have this line above "Converter categórico para numérico".

Autosaving

As a suggestion, the system could save the workflow automatically for the user (e.g., every five minutes).

Alternatively, it could show a message warning of possible data loss when the user leaves the current environment for another:
"Do you want to save the workflow before leaving? [Yes/No]"

Workflow copy

It would be nice to have an option to copy a workflow. This would avoid duplicated work in some cases.

Deleted workflows and datasets are appearing on the landing page

Keyboard shortcuts

It would be nice to use the key 'Del' to delete an operation.

The same for 'Ctrl + Z' to undo and 'Ctrl + Shift + Z' to redo.

Left sidebar subgroup opened simultaneously

When a user opens a subgroup to see all the operations inside it and a subgroup of another group is already open, the interface opens both together.

In the attached picture, the groups "Manipulação de dados" and "Aprendizado de máquina" are open. When the user chooses to open the subgroup "Geral" under "Manipulação de dados", the subgroups "Geral" and "Agrupamento" open simultaneously.

Error in the Load Model ("Carregar Modelo") operation

Workflow number 75.

Apparently the error is caused by a problem with the Save Model ("Salvar Modelo") operation.

Random Forest Classifier - Minors

The word 'impureza' is misspelled in the hyper-parameter with this name:

Recommendation: Replace 'Inpureza Gini' with 'Coeficiente Gini'.

=================

The hyper-parameter 'Number of Trees' (or 'Número de árvores') does not appear. I believe the parameter 'Impureza' is repeated and the second occurrence actually stands for 'Number of Trees'.

Confusion matrix

The generated confusion matrix (in the classification scenario) should be translated to Portuguese when the user is in the PT-BR environment.

Attributes are not being loaded

After connecting an operation to Read data ("Ler dados"), the dataset's attributes do not appear.

'Saving as' is not working

The 'Saving as' option in the workflow is not working on Firefox.

Besides, under 'Saving as', only the sub-option 'As a image (a download window will appear)' works on Google Chrome. The remaining sub-options ('New name (a copy of the workflow will be created, but not loaded)' and 'As a template workflow') do not work at all.

Charts are not appearing

Workflow 32

The workflow ran without errors, but the chart was not shown in the results.

Filter operation does not work

Workflow 31

I loaded the dataset GlobalLandTemperaturesByCountry.csv.
I added the operation "Filtrar por função 1" (filter by function) and set it to filter Country="Brazil".
I verified in another tool that the filter works, but Lemonade returned nothing.

Percentual splitting should show what is training and what is testing.

The weights should indicate what part is training and what part is testing.

It is a bit confusing. So far, I don't know which part of the data is set to be the training set:

Colors palette

While looking for a set of colors that guarantees accessibility for users, I found this:

The default palette of Highcharts is designed with accessibility in mind, so that any two neighbor colors are tested for different types of color blindness

source: https://www.highcharts.com/docs/chart-concepts/accessibility

The default color palette of Highcharts already takes accessibility into account.

@waltersf do you think a new set of colors is necessary?

If it is, implementing a custom color palette in Highcharts is very simple. There are some alternatives, as we can see in this post, for example:

Color your productivity: Asana now adjusts for color-blindness

Local Outlier Factor (LOF) Error

Workflow 64:

Traceback (most recent call last):
File "/usr/local/juicer/juicer/spark/spark_minion.py", line 460, in _perform_execute
self._emit_event(room=job_id, namespace='/stand'))
File "/tmp/juicer_app_64_64_280.py", line 301, in main
task_futures['5aff6923-3441-4dd0-a1d4-ec3bb903ac2d'].result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 462, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/thread.py", line 63, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/juicer_app_64_64_280.py", line 300, in <lambda>
lambda: sort_2(spark_session, cached_state, emit_event))
File "/tmp/juicer_app_64_64_280.py", line 230, in sort_2
parent_result = task_futures['9feaab42-c0e5-4cc5-9b09-38db2d64e511'].result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 462, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/thread.py", line 63, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/juicer_app_64_64_280.py", line 228, in <lambda>
lambda: outlier_detection_1(spark_session, cached_state, emit_event))
File "/tmp/juicer_app_64_64_280.py", line 142, in outlier_detection_1
algorithm = LocalOutlierFactor(minPts=5)
File "/usr/local/spark/python/pyspark/__init__.py", line 105, in wrapper
return func(self, **kwargs)
File "/usr/local/juicer/juicer/spark/ext/__init__.py", line 156, in __init__
self.uid)
File "/usr/local/spark/python/pyspark/ml/wrapper.py", line 63, in _new_java_obj
return java_obj(*java_args)
TypeError: 'JavaPackage' object is not callable

Map needs to be redrawn when it is used in a tab component

The map does not work when it is inside a tab component. The tab change event needs to be detected and handled so that the map is redrawn.

Operations are not enabled by default

When an operation is added to the workflow, it comes disabled.
Example: the Map ("Mapa") operation.

Error when clicking on the dataset preview

Options of "Filtrar por função" (filter by function)

The "Expressão para filtro" (filter expression) field should not be required, since the operation works without it being filled in.

The field label is in English ("Filtro").

SVM error

When running SVM for classification (workflow id: 39), it returns the following error:

Traceback (most recent call last):
File "/usr/local/juicer/juicer/spark/spark_minion.py", line 438, in _perform_execute
self._state)
File "/usr/local/juicer/juicer/transpiler.py", line 296, in transpile
using_stdout, workflow, deploy, export_notebook)
File "/usr/local/juicer/juicer/transpiler.py", line 108, in generate_code
class_name = self.operations[task['operation']['slug']]
KeyError: u'svm-classification-model'
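
The `KeyError` above comes from a dictionary lookup by operation slug. As a hedged sketch only, the lookup could be wrapped so that an unregistered slug produces a readable message instead of a bare `KeyError`. The names `resolve_operation`, `UnsupportedOperationError`, `DummyOperation`, and the registered slug `random-forest-classifier-model` are hypothetical; only the slug `svm-classification-model` and the `task['operation']['slug']` access come from the traceback.

```python
class UnsupportedOperationError(Exception):
    """Raised when a workflow task refers to an operation slug that is not registered."""


def resolve_operation(operations, task):
    """Look up the class registered for the task's operation slug."""
    slug = task["operation"]["slug"]
    try:
        return operations[slug]
    except KeyError:
        known = ", ".join(sorted(operations)) or "(none)"
        raise UnsupportedOperationError(
            "Operation '{}' is not supported. Registered operations: {}".format(
                slug, known
            )
        )


class DummyOperation(object):
    """Placeholder standing in for a real operation class (hypothetical)."""


# Hypothetical registry that lacks the SVM slug, as in the reported error.
operations = {"random-forest-classifier-model": DummyOperation}
task = {"operation": {"slug": "svm-classification-model"}}

try:
    resolve_operation(operations, task)
    message = None
except UnsupportedOperationError as exc:
    message = str(exc)
```

With a message like this, the interface could tell the user which operation is unsupported rather than showing a raw traceback.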

Double-clicking an operation in the execution area shows a red popup.

Double-clicking an operation in the execution area shows a red popup with information that is irrelevant to the user.

Change the name of 'Perceptron' to 'Multi-layer perceptron'

The current name is giving the wrong idea of the learning algorithm.

Scatter plot ("Gráfico de dispersão") error

The operation treats "Atributo usado para séries" (attribute used for series) as a required field and does not run. However, if an attribute is added and then removed, the validation no longer applies.

Adding the attribute:

After removing the attribute, the validation is ignored:

Clear the search filter

I applied a filter to search for a workflow,
but the filter persists after refreshing the page or clicking the Workflows menu.

Dataset Airplane_Crashes_and_Fatalities_Since_1908

Although the dataset works in the preview, in the Read data ("Ler dados") operation of workflow 48 the data are read as null.

Sometimes, Lemonade is not executing workflows

I am not sure whether Lemonade's system is overloaded or there is another reason, but sometimes it takes too long to run a simple workflow (id 52), such as the one in the following figure:

Download results of the operation

Lemonade currently only shows the first 50 records of an operation's results.

However, sometimes we would like to check the results for the entire dataset.

Therefore, Lemonade should have an option to download all results of a given operation.
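
One way such a download could work is to serialize every result row as CSV instead of only the preview. The sketch below is a standard-library illustration under that assumption; `rows_to_csv`, the column names, and the generated rows are hypothetical and do not reflect how Lemonade stores results.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize all rows (not just a preview) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Rows are consumed one at a time, so a generator works as input.
        writer.writerow(row)
    return buffer.getvalue()


header = ["id", "prediction"]
# Hypothetical result set with more rows than the 50-record preview.
rows = ([i, i % 2] for i in range(120))
csv_text = rows_to_csv(header, rows)
```

For large results, writing to a file or a streamed HTTP response instead of an in-memory buffer would avoid holding the whole export in memory.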

Save data ("Salvar dados") error

Workflow 5:

Voting classifier is not included in the menu

It does not appear in the operation search either.

Validation on disabled operations

Workflow 31

The operation is disabled, and running the workflow raises errors on the disabled operations.

Errors:

Percentual splitting is not returning the correct number of instances.

I set the Percentual splitting operation to give me half of the samples for training and the other half for testing.

My dataset has 150 samples/examples. It should return 75 for each set. However, it returned 85 for the test.

The workflow id is 86. I used it to test the perceptron algorithm.

Next, the confusion matrix:
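
On the split sizes reported above: one possible explanation, stated here as an assumption and not confirmed by the report, is that each row is assigned to a side independently with the given probability. Spark's `DataFrame.randomSplit` works this way, so the resulting sizes are only approximately proportional to the weights. The sketch below contrasts that behavior with a shuffle-and-cut split that always yields exact sizes. Both functions are illustrative and are not Lemonade's implementation.

```python
import random


def bernoulli_split(rows, train_fraction, seed):
    """Assign each row to training independently with probability train_fraction."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < train_fraction else test).append(row)
    return train, test


def exact_split(rows, train_fraction, seed):
    """Shuffle a copy of the rows and cut at a fixed index, so sizes are exact."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(150))  # same size as the dataset in this report
approx_train, approx_test = bernoulli_split(rows, 0.5, seed=42)
exact_train, exact_test = exact_split(rows, 0.5, seed=42)
```

The per-row approach does not guarantee a 75/75 division of 150 rows, while the shuffle-and-cut approach does. The trade-off is that the exact version needs the total row count before it can cut.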

Perceptron error

When running Perceptron for classification (workflow id: 52), it returns the following error:

File "/usr/local/juicer/juicer/spark/spark_minion.py", line 438, in _perform_execute
self._state)
File "/usr/local/juicer/juicer/transpiler.py", line 296, in transpile
using_stdout, workflow, deploy, export_notebook)
File "/usr/local/juicer/juicer/transpiler.py", line 220, in generate_code
v = template.render(env_setup)
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 1008, in render
return self.environment.handle_exception(exc_info, True)
File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py", line 780, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/juicer/juicer/spark/templates/operation.tmpl", line 56, in top-level template code
{%- handleinstance instance %}
File "/usr/local/juicer/juicer/util/template_util.py", line 53, in handle
raise(JuicerException(msg), None, sys.exc_info()[2])
File "/usr/local/juicer/juicer/util/template_util.py", line 40, in _handle
return caller()
File "/usr/local/lib/python2.7/dist-packages/jinja2/runtime.py", line 579, in _invoke
rv = self._func(*arguments)
File "/usr/local/juicer/juicer/spark/templates/operation.tmpl", line 119, in template
{{instance.generate_code().strip() | indent(width=8, indentfirst=False)}}
File "/usr/local/juicer/juicer/spark/ml_operation2.py", line 40, in generate_code
return "\n".join([algorithm_code, model_code])
JuicerException: Erro de tipo analisando modelo (template) para inst\xe2ncia 317dd992-b802-4921-8cab-cd18b79c0cbc PerceptronModelOperation

Dataset upload

After uploading the dataset, Lemonade should go back to the 'Data Sources' page, showing a message of success such as "Your dataset was successfully uploaded."

Depending on the person who is using the system, receiving a message and going back to the 'Data Sources' page may be more intuitive.

Error in the Save Model ("Salvar Modelo") operation

Workflow number 76.

Error:

Traceback (most recent call last):
File "/usr/local/juicer/juicer/spark/spark_minion.py", line 460, in _perform_execute
self._emit_event(room=job_id, namespace='/stand'))
File "/tmp/juicer_app_76_76_368.py", line 844, in main
task_futures['d88149bf-2c42-47e7-a387-692d8fb43996'].result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 462, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/thread.py", line 63, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/juicer_app_76_76_368.py", line 841, in <lambda>
lambda: save_model_4(spark_session, cached_state, emit_event))
File "/tmp/juicer_app_76_76_368.py", line 485, in save_model_4
_save_model(model, path, name)
File "/tmp/juicer_app_76_76_368.py", line 474, in save_model
register_model('http://limonero:23402', model_payload, '123456')
File "/usr/local/juicer/juicer/service/limonero_service.py", line 80, in register_model
raise RuntimeError(("Error saving model: {})").format(r.text))
RuntimeError: Erro salvando model: {"status": "ERROR", "message": "Validation error", "errors": {"workflow_id": ["Missing data for required field."], "job_id": ["Missing data for required field."], "task_id": ["Missing data for required field."]}}
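
The validation error above reports that `workflow_id`, `job_id`, and `task_id` are missing from the payload sent to the model registration service. As a hedged sketch, the payload could be checked before the request is sent, so that the failure surfaces with a clear message on the client side. The helper names, the `name` and `path` fields, and the sample values are hypothetical; only the three required field names come from the error message.

```python
REQUIRED_FIELDS = ("workflow_id", "job_id", "task_id")


def missing_fields(payload, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or None in the payload."""
    return [name for name in required if payload.get(name) is None]


def build_model_payload(name, path, workflow_id=None, job_id=None, task_id=None):
    """Build a registration payload, failing early if an identifier is missing."""
    payload = {
        "name": name,  # hypothetical field
        "path": path,  # hypothetical field
        "workflow_id": workflow_id,
        "job_id": job_id,
        "task_id": task_id,
    }
    missing = missing_fields(payload)
    if missing:
        raise ValueError(
            "Model payload is missing required fields: " + ", ".join(missing)
        )
    return payload


# Illustrative values only.
complete = build_model_payload(
    "my-model", "/tmp/my-model", workflow_id=1, job_id=2, task_id="task-a"
)

try:
    build_model_payload("my-model", "/tmp/my-model")
    error = None
except ValueError as exc:
    error = str(exc)
```

A check like this does not fix the root cause, which is presumably that the generated code does not pass these identifiers along, but it would make the problem visible before the HTTP call.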