
helmholtzai-consultants-munich / data-centric-platform


A tool for all-kinds segmentation in microscopy imaging which encourages data centric approaches

License: BSD 3-Clause "New" or "Revised" License

Python 97.07% Shell 2.64% Dockerfile 0.29%

data-centric-platform's Introduction

Data Centric Platform

A data centric platform for all-kinds segmentation in microscopy imaging


How to use this?

This repo includes a client and server side for using our data centric platform. The client and server communicate via the bentoml library. The client interacts with the server every time we run model inference or training. For full functionality of the software the server should be running, either locally or remotely. To install and start the server side follow the instructions described in DCP Server Installation & Launch.

To run the client GUI follow the instructions described in DCP Client Installation & Launch.

DCP handles all kinds of segmentation tasks! Try it out if you need to do:

  • Instance segmentation
  • Semantic segmentation
  • Multi-class instance segmentation

Toy data

This repo includes the data/ directory with some toy data which you can use as the Uncurated dataset folder. You can create (empty) folders for the other two directories required in the welcome window and start playing around.

Enabling data centric development

Our platform encourages the use of data centric practices. With the user-friendly client interface you can:

  • Detect and remove outliers from your training data: only confirmed samples are used to train our models
  • Detect and correct labeling errors: edit labels with the integrated napari visualisation tool
  • Establish consensus: multiple annotators can review a label before the curated version is passed on to train the model
  • Focus on data curation: no interaction with model parameters is needed during training and inference

Get more with less!

data-centric-platform's People

Contributors

christinab12, donatella-cea, francesco-campi, georgii-helmholtz, gerome-v, hpelin, isramekki0, korenmary, neuronflow, nickdelgrosso, shvardhan1994, volomos


data-centric-platform's Issues

Update documentation

  • Run client in readme and docs: dcp-client -m/--mode local/remote
  • Add badges for readthedocs in client and server readmes

[server] UNet: make export of instance masks optional

Currently the UNet model output is post-processed by default to also produce instance masks. Change this by adding a 'post-processing' argument to the eval config. If false, only the class labels mask is returned; if true, the instance mask is returned as well.
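The requested behaviour could look roughly like the sketch below. It is only an illustration: the function name `format_unet_output`, the output keys, and the `make_instances` callable are hypothetical and are not taken from the DCP code base. Only the 'post-processing' key comes from the issue text.

```python
import numpy as np

def format_unet_output(class_mask, eval_config, make_instances):
    """Return the class mask, plus an instance mask only when requested.

    `make_instances` is a hypothetical callable that turns a class mask
    into an instance mask; the real post-processing step may differ.
    """
    outputs = {"class_mask": class_mask}
    # Assumed default: instance masks are opt-in when the key is missing.
    if eval_config.get("post-processing", False):
        outputs["instance_mask"] = make_instances(class_mask)
    return outputs

# Stand-in post-processing used only for this demonstration.
def binary_foreground(mask):
    return (mask > 0).astype(np.int32)

class_mask = np.array([[0, 1], [2, 0]])
only_classes = format_unet_output(class_mask, {"post-processing": False}, binary_foreground)
with_instances = format_unet_output(class_mask, {"post-processing": True}, binary_foreground)
```

Keeping the flag in the eval config means the client does not need to change: it receives whatever masks the server was configured to return.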

Address the circus/tornado "ConflictError: arbiter is already running"

I can't remember if we were getting this before, but after I hit train, the model trains and is saved successfully, and then I get this error:
Traceback (most recent call last):
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/site-packages/tornado/ioloop.py", line 921, in _run
    val = self.callback()
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/site-packages/circus/util.py", line 1038, in wrapper
    raise ConflictError("arbiter is already running %s command"
circus.exc.ConflictError: arbiter is already running arbiter_stop command
Traceback (most recent call last):
  File "/Users/christina.bukas/Documents/AI_projects/code/data-centric-platform/src/server/dcp_server/main.py", line 31, in <module>
    main()
  File "/Users/christina.bukas/Documents/AI_projects/code/data-centric-platform/src/server/dcp_server/main.py", line 18, in main
    subprocess.run([
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/subprocess.py", line 507, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/subprocess.py", line 1126, in communicate
    self.wait()
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/subprocess.py", line 1189, in wait
    return self._wait(timeout=timeout)
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/subprocess.py", line 1917, in _wait
    (pid, sts) = self._try_wait(0)
  File "/Users/christina.bukas/opt/anaconda3/envs/dc-tool/lib/python3.9/subprocess.py", line 1875, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)

Originally posted by @christinab12 in #27 (comment)

Fix rescaling

Since the rescaling for masks was changed, an error occurs and the final mask is not the same size as the original image. This can be reproduced with non-square images.
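One way to guard against this kind of mismatch is to resize the mask back using the original image's (height, width) pair explicitly, rather than a single scale factor. The sketch below is a generic nearest-neighbour resize written with NumPy only. It is not the rescaling code used in DCP, and the name `resize_mask_nearest` is hypothetical.

```python
import numpy as np

def resize_mask_nearest(mask, out_shape):
    """Resize a 2D label mask to out_shape = (height, width).

    Nearest-neighbour sampling is used so that no new label values
    are invented by interpolation.
    """
    in_h, in_w = mask.shape
    out_h, out_w = out_shape
    # Map every output row/column back to its source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return mask[rows[:, None], cols[None, :]]

# Non-square example: a mask predicted at 4x6 for an image of 10x15 pixels.
small_mask = np.arange(24).reshape(4, 6) % 3
restored = resize_mask_nearest(small_mask, (10, 15))
```

A regression test that compares `restored.shape` with the original image shape for a non-square input would catch the reported behaviour.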

DOCUMENTATION

  • check that every function and class is documented
  • go over documentation in readmes
  • include readthedocs

[server] Split models.py

This is getting too big. Create a subfolder named 'models' and place all models in separate scripts in here.

[server] Error on train with CellposePatchCNN in front-end-dev branch

After clicking train model the following error appears:

File "/data-centric-platform/src/server/dcp_server/utils.py", line 168, in get_centered_patches
class_l = int(np.unique(mask_class[mask[:,:,0]==l]))
TypeError: only size-1 arrays can be converted to Python scalars

This happens after editing masks with napari viewer
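The TypeError means that `np.unique(...)` did not return exactly one value for that instance, which is plausible after manual editing if a single instance ends up covering pixels with more than one class label. A possible workaround, sketched here under that assumption, is to pick the most frequent class per instance. The majority vote is a substitute technique, not the project's method, and the helper name `instance_class` is hypothetical.

```python
import numpy as np

def instance_class(mask_class, instance_mask, label):
    """Return the most frequent class value inside one instance.

    Returns None when the instance label does not occur in the mask.
    """
    values, counts = np.unique(mask_class[instance_mask == label], return_counts=True)
    if values.size == 0:
        return None
    return int(values[np.argmax(counts)])

# Instance 1 covers three pixels: two of class 1 and one of class 2.
mask_class = np.array([[1, 1, 2], [0, 0, 0]])
instance_mask = np.array([[1, 1, 1], [0, 0, 0]])
print(instance_class(mask_class, instance_mask, 1))  # 1
```

Whether a majority vote is the right policy is a design decision; raising a clear error that names the offending instance would be an equally valid alternative.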

Image name substring search

If we have 1.tiff and 11.tiff, generate labels, and then open 1.tiff, the masks 1_seg.tiff and 11_seg.tiff are both opened.
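This behaviour is what a substring test would produce, since "1" is contained in "11_seg.tiff". A minimal sketch of an exact match on the file stem follows. The function name `find_segmentations` is hypothetical, and it assumes every segmentation is named as the image stem followed by one fixed suffix (default `_seg`).

```python
from pathlib import Path

def find_segmentations(image_name, candidates, seg_suffix="_seg"):
    """Return only the candidates whose stem is exactly <image stem><suffix>."""
    expected_stem = Path(image_name).stem + seg_suffix
    return [c for c in candidates if Path(c).stem == expected_stem]

files = ["1.tiff", "11.tiff", "1_seg.tiff", "11_seg.tiff"]
print(find_segmentations("1.tiff", files))  # ['1_seg.tiff']
```

If several suffixes are in use for the same image, the comparison would need to accept a set of expected stems instead of one.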

Compatibility with BentoML 1.1.10

The bug was encountered on the client side, with both the client and server using BentoML 1.1.10. We know it works with 1.0.16.

Reformulate description in the readme

Here:
Uncurated dataset path: This folder should initially contain all images of your dataset. They may or may not be accompanied by corresponding segmentations, but if they are, the segmentations should have the same filename as the image followed by the ending defined in setup/seg_name_string, defined in server/dcp_server/config.cfg (default extension is _seg)

Suggestion:
Uncurated Dataset Path: This folder is intended to store all images of your dataset. These images may be accompanied by corresponding segmentations. If present, segmentation files should share the same filename as their associated image, appended with a suffix as specified in 'setup/seg_name_string', defined in 'server/dcp_server/config.cfg' (default: '_seg').

[client] Allow for removal of outlier from dataset

If an outlier is detected that affects model training, it should be possible to delete it from any folder.
The idea is to add a "remove from dataset" button in the napari window. This can happen at any stage (curated, uncurated, in progress).

make versioning dynamic

This could be done dynamically instead.

Maybe the dynamic list on line 23 needs to be extended with 'version', together with:

[tool.setuptools.dynamic]
version = {file = "src/server/VERSION"}

However, dynamic versioning could be difficult for both server and client within the same repo.

Originally posted by @mahyar-HelmholtzAI in #8 (comment)

Bug in napari window: changing mask colors in the label segmentation mask.

Description:
After making changes to the instance segmentation mask by adding new objects, the colors of existing objects in the label mask are being reset to their initial colors.

Steps to Reproduce:
Open the napari window in dcp.
Add a new object to the instance segmentation mask.
Manually change the color of the added object in the label mask.
Add another new object to the instance segmentation mask.
The color of the first object will reset to the initial color.

Expected Behavior:
The color of the objects in the label mask should persist even after adding new objects to the instance segmentation mask.

[server] Make config_semantic run

  • Currently getting the error:

    segmentationclasses.py", line 42, in segment_image
      self.model.eval_config['segmentor']['z_axis'] = z_axis
    KeyError: 'segmentor'

Segmentation classes should not depend on the segmentor or the classifier.
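One way to remove the hard dependency on a 'segmentor' sub-config is to route the assignment through a small helper that tolerates both layouts. The sketch below is only an assumption about how the eval config might be shaped: the name `set_z_axis` is hypothetical, and it assumes that a config without a 'segmentor' entry keeps `z_axis` at the top level.

```python
def set_z_axis(eval_config, z_axis):
    """Store z_axis in the 'segmentor' sub-config if present, else at top level.

    The config is modified in place and returned for convenience.
    """
    # Fall back to the top-level dict when there is no 'segmentor' section.
    target = eval_config.get("segmentor", eval_config)
    target["z_axis"] = z_axis
    return eval_config

nested = set_z_axis({"segmentor": {}, "classifier": {}}, 0)
flat = set_z_axis({"compute_instance": True}, 2)
```

A cleaner long-term option would be for each model to expose its own setter, so the segmentation classes never read the config layout directly.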
