ewasr's Introduction

eWaSR - an embedded-compute-ready maritime obstacle detection network

Luxonis, ViCOS

Matija Teršek, Lojze Žust, Matej Kristan

[paper] [BibTeX] [weights]

The official PyTorch implementation of the embedded-compute-ready WaSR (eWaSR) network [1]. The repository contains scripts for training and running the network, as well as weights pretrained on the MaSTr1325 [2] dataset.

eWaSR example

About eWaSR

eWaSR is an embedded-compute-ready variant of WaSR [3] that follows recent advances in transformer-based lightweight networks. Compared to WaSR, it reduces inference time by more than 10x with a negligible loss in detection accuracy.

eWaSR Architecture

Setup

Requirements: Python >= 3.6, PyTorch, PyTorch Lightning (for training)

Install the dependencies provided in requirements.txt.

pip install -r requirements.txt

Pretrained models

The currently available pretrained model weights are listed below. All models are trained on the MaSTr1325 [2] dataset.

model              | backbone  | IMU | url
ewasr_resnet18_imu | ResNet-18 | ✓   | weights
ewasr_resnet18     | ResNet-18 |     | weights

Export

You can export a pretrained model to ONNX and to a blob compatible with the OAK-D device.

python3 export.py \
--architecture ewasr_resnet18_imu \
--weights-file pretrained/ewasr_resnet18_imu.pth \
--output_dir output

Use --onnx_only to export only the ONNX file.
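
Before converting to a blob, you can sanity-check the exported ONNX file on the host. Below is a minimal sketch using onnxruntime, assuming the non-IMU ewasr_resnet18 export with a single image input of shape 1x3x384x512 and a prediction output of shape 1x3x96x128 (the shapes reported by the ONNX metadata); the file path is illustrative.

# Minimal sanity check of an exported ONNX model (sketch, not part of the repo).
# Assumes a single image input (1x3x384x512) and a prediction output (1x3x96x128).
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("output/ewasr_resnet18.onnx")
input_name = session.get_inputs()[0].name
dummy_image = np.random.rand(1, 3, 384, 512).astype(np.float32)

outputs = session.run(None, {input_name: dummy_image})
print(outputs[0].shape)  # expected: (1, 3, 96, 128), per-class logits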

Model training

  1. Download and prepare the MaSTr1325 dataset (images and GT masks). If you plan to use the IMU-enabled model, also download the IMU masks.
  2. Edit the dataset configuration files (configs/mastr1325_train.yaml, configs/mastr1325_val.yaml) so that they correctly point to the dataset directories.
  3. Use train.py to train the network.
export CUDA_VISIBLE_DEVICES=0,1,2,3 # GPUs to use
python train.py \
--train_config configs/mastr1325_train.yaml \
--val_config configs/mastr1325_val.yaml \
--model_name my_ewasr \
--validation \
--batch_size 4 \
--epochs 50

Model architectures

By default, the ResNet-18, IMU-enabled version of eWaSR is used in training. To select a different model architecture, use the --model argument. The repository also supports training the models from the official WaSR implementation. Currently implemented model architectures:

model              | backbone   | IMU
ewasr_resnet18_imu | ResNet-18  | ✓
ewasr_resnet18     | ResNet-18  |
wasr_resnet101_imu | ResNet-101 | ✓
wasr_resnet101     | ResNet-101 |
wasr_resnet50_imu  | ResNet-50  | ✓
wasr_resnet50      | ResNet-50  |
deeplab            | ResNet-101 |

Logging and model weights

A log directory with the specified model name will be created inside the output directory. Model checkpoints and training logs will be stored there. At the end of training, the model weights are also exported to a weights.pth file inside this directory.
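
If you want to inspect the exported weights outside of the provided scripts, the sketch below may help; it assumes weights.pth is a plain PyTorch state_dict and uses an illustrative path.

# Sketch: inspect the exported weights.pth (assumes it is a plain state_dict).
import torch

state_dict = torch.load("output/my_ewasr/weights.pth", map_location="cpu")
print(f"{len(state_dict)} tensors in the checkpoint")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))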

Logged metrics (loss, validation accuracy, validation IoU) can be inspected using tensorboard.

tensorboard --logdir output/logs/model_name

Model inference

To run model inference using pretrained weights, use the predict.py script. A sample dataset config file (configs/examples.yaml) is provided to run the examples from the examples directory.

# export CUDA_VISIBLE_DEVICES=-1 # CPU only
export CUDA_VISIBLE_DEVICES=0 # GPU to use
python predict.py \
--dataset_config configs/examples.yaml \
--model ewasr_resnet18_imu \
--weights path/to/model/weights.pth \
--output_dir output/predictions

Predictions will be stored as color-coded masks in the specified output directory.
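
If you want to visually compare a prediction with its input image, a minimal OpenCV overlay sketch is shown below; the file names are hypothetical and should be replaced with your own.

# Sketch: overlay a color-coded prediction mask on its input image.
# File names are hypothetical; adjust them to your examples and output_dir.
import cv2

image = cv2.imread("examples/frame_0001.jpg")
mask = cv2.imread("output/predictions/frame_0001.png")

# Resize the mask to the image size in case the two differ, then blend.
mask = cv2.resize(mask, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)
overlay = cv2.addWeighted(image, 0.6, mask, 0.4, 0)
cv2.imwrite("output/predictions/frame_0001_overlay.png", overlay)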

Citation

If you use this code, please cite our paper:

@article{tersek2023ewasr,
  author = {Ter\v{s}ek, Matija and \v{Z}ust, Lojze and Kristan, Matej},
  title = {eWaSR -- An Embedded-Compute-Ready Maritime Obstacle Detection Network},
  journal = {Sensors},
  year = {2023},
  volume = {23},
  number = {12},
  pages = {5386},
  doi = {10.3390/s23125386},
}

References

[1] Teršek, M., Žust, L., & Kristan, M. (2023). eWaSR -- An Embedded-Compute-Ready Maritime Obstacle Detection Network. Sensors, 23(12), 5386

[2] Bovcon, B., Muhovič, J., Perš, J., & Kristan, M. (2019). The MaSTr1325 dataset for training deep USV obstacle detection models. 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

[3] Bovcon, B., & Kristan, M. (2021). WaSR--A Water Segmentation and Refinement Maritime Obstacle Detection Network. IEEE Transactions on Cybernetics

Code

Code based on the following amazing repositories:

All repositories included are Apache-2.0 licensed. Please refer to each repository for the individual licenses.

License

This repository, including pre-trained weights, is licensed under Apache-2.0.

ewasr's Issues

Export error

Hi everyone!

I tried to run export.py but got some errors. In the end, I could get the blob model by using the online blobconverter. The error I got is:

python3 export.py --architecture ewasr_resnet18_imu --weights-file models/ewasr_resnet18_imu.pth --output-dir output
/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
  warnings.warn(
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

ONNX stored at: output/ewasr_resnet18_imu.onnx
Downloading /home/jm/.cache/blobconverter/ewasr_resnet18_imu_openvino_2022.1_6shave.blob...
{
    "exit_code": 1,
    "message": "Command failed with exit code 1, command: /app/venvs/venv2022_1/bin/python /app/model_compiler/openvino_2022.1/converter.py --precisions FP16 --output_dir /tmp/blobconverter/6f2c9fed7eeb4019a2d2c3c3c8eeedf0 --download_dir /tmp/blobconverter/6f2c9fed7eeb4019a2d2c3c3c8eeedf0 --name ewasr_resnet18_imu --model_root /tmp/blobconverter/6f2c9fed7eeb4019a2d2c3c3c8eeedf0",
    "stderr": "usage: main.py [options]\nmain.py: error: unrecognized arguments: --mean_values image[123.675,116.28,103.53],imu[0,0,0] --scale_values image[58.395,57.12,57.375],imu[1,1,1] --output prediction\n",
    "stdout": "========== Converting ewasr_resnet18_imu to IR (FP16)\nConversion command: /app/venvs/venv2022_1/bin/python -- /app/venvs/venv2022_1/bin/mo --framework=onnx --data_type=FP16 --output_dir=/tmp/blobconverter/6f2c9fed7eeb4019a2d2c3c3c8eeedf0/ewasr_resnet18_imu/FP16 --model_name=ewasr_resnet18_imu --input= --reverse_input_channels '--mean_values image[123.675,116.28,103.53],imu[0,0,0]' '--scale_values image[58.395,57.12,57.375],imu[1,1,1]' '--output prediction' --data_type=FP16 --input_model=/tmp/blobconverter/6f2c9fed7eeb4019a2d2c3c3c8eeedf0/ewasr_resnet18_imu/FP16/ewasr_resnet18_imu.onnx\n\nFAILED:\newasr_resnet18_imu\n"
}
Traceback (most recent call last):
  File "/home/jm/Programming/CollisionAvoidence/mods-yolov5/segmentation/eWaSR/export.py", line 102, in <module>
    main()
  File "/home/jm/Programming/CollisionAvoidence/mods-yolov5/segmentation/eWaSR/export.py", line 99, in main
    export(args)
  File "/home/jm/Programming/CollisionAvoidence/mods-yolov5/segmentation/eWaSR/export.py", line 81, in export
    blob_path_temp = blobconverter.from_onnx(
  File "/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/blobconverter/__init__.py", line 424, in from_onnx
    return compile_blob(blob_name=Path(model_name).stem, req_data={"name": Path(model_name).stem}, req_files=files, data_type=data_type, **kwargs)
  File "/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/blobconverter/__init__.py", line 318, in compile_blob
    response.raise_for_status()
  File "/opt/anaconda/anaconda3/envs/yolo/lib/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: BAD REQUEST for url: https://blobconverter.luxonis.com/compile?version=2022.1&no_cache=False

Thanks in advance!

eWaSR Script for OAK

Hello @tersekmatija,

Thanks for the awesome project!

I'm trying to run ewasr_resnet18.blob on my OAK-D camera. To do this, I first exported the model using export.py and then created a script which follows the general idea of predict.py.

Nevertheless, the outputs of the OAK-D inference differ substantially from those of predict.py (run on my PC). Can you please help me out? We can surely put the code in the repository later ;)

My code:

from pathlib import Path
import cv2
import depthai as dai
import numpy as np


# Parameters
dir = "/your/directory/to/ewasr_resnet18.blob"
fps = 20

shape_rgb = (3, 384, 512)

# Load model
nnBlobPath = str((Path(__file__).parent / Path(dir)).resolve().absolute())
if not Path(nnBlobPath).exists():
    import sys
    raise FileNotFoundError(f'Required file/s not found, please run "{sys.executable} install_requirements.py"')

pipeline = dai.Pipeline()

# Define RGB camera
camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setPreviewSize(shape_rgb[2], shape_rgb[1])
camRgb.setFps(fps)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setInterleaved(False)
camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.BGR)
camRgb.setFp16(True)  # Model requires FP16 input

# Define neural network
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath(nnBlobPath)
nn.setNumInferenceThreads(2)
camRgb.preview.link(nn.input)

# xout for rgb and neural network
xoutNN = pipeline.create(dai.node.XLinkOut)
xoutNN.setStreamName("nn")
nn.out.link(xoutNN.input)

xoutRgb = pipeline.create(dai.node.XLinkOut)
xoutRgb.setStreamName("rgb")
nn.passthrough.link(xoutRgb.input)


def interpol_mask(frame, height, width):
    # Interpolation using CV2
    mask = cv2.resize(frame, (width, height), interpolation=cv2.INTER_LINEAR)
    return mask


with dai.Device(pipeline) as device:
    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    previewQueue = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    networkQueue = device.getOutputQueue(name="nn", maxSize=4, blocking=False)

    while True:
        inRgb = previewQueue.get()
        in_nn = networkQueue.tryGet()

        # Model needs FP16 so we have to convert color frame back to U8 on the host
        frame = np.array(inRgb.getData()).view(np.float16).reshape(shape_rgb).transpose(1, 2, 0).astype(np.uint8).copy()
        # Prediction mask with extended image size
        pred_masks = np.zeros((shape_rgb[1], shape_rgb[2], 3))
        if in_nn is not None:
            # Get segmentation mask from nn and interpolate
            pred_origin = np.reshape(in_nn.getLayerFp16('prediction'), (3, 96, 128))
            pred_mask_1, pred_mask_2, pred_mask_3 = interpol_mask(pred_origin[0], shape_rgb[1], shape_rgb[2]), \
                                                    interpol_mask(pred_origin[1], shape_rgb[1], shape_rgb[2]), \
                                                    interpol_mask(pred_origin[2], shape_rgb[1], shape_rgb[2])
            pred_masks = np.zeros((shape_rgb[1], shape_rgb[2], 3))
            pred_masks[:, :, 0], pred_masks[:, :, 1], pred_masks[:, :, 2] = pred_mask_1, pred_mask_2, pred_mask_3

            # Argmax to get a single channel mask, then transform it to 3 channels to allow color overlay
            pred_masks_arg = np.argmax(pred_masks, axis=2)
            pred_masks[pred_masks_arg == 0] = [247, 195, 37]
            pred_masks[pred_masks_arg == 1] = [41, 167, 224]
            pred_masks[pred_masks_arg == 2] = [90, 75, 164]

        if pred_masks is not None:
            # convert to uint8 to use addWeighted
            pred_masks = pred_masks.astype(np.uint8)
            frame_mix = cv2.addWeighted(frame, 0.3, pred_masks, 0.7, 0)
            cv2.imshow("mask ", frame_mix)
        cv2.imshow("rgb", frame)

        if cv2.waitKey(1) == ord('q'):
            break

Thanks in addition!

Weird results on OAK-1

Hey, first I'd like to say awesome work! I've been waiting a while for this to become viable, and I've got an old OAK-1 to test it out on.

I've converted the non-IMU ONNX to a blob, since the OAK-1 doesn't have an IMU (compiled to 6 shaves). I'm not entirely sure how the final overlay works in predict.py, so I've just set up a normalization pass and rendered the output on top of the input to see if I get roughly the right results. It does seem sort of right, but it's also quite odd:

[overlay screenshots]

This is just me recording a monitor showing videos, so the moiré patterns may be causing something (unintentional adversarial attack? 😛), but the model very consistently outputs at this 8x6 resolution instead of a full 128x96 segmentation. It's not super clear what size the input and output images should be, but the ONNX file lists the first layer as [1,3,384,512] and the prediction layer as [1,3,96,128], so that's how I've set up the inference test script:

#!/usr/bin/env python3

import cv2
import depthai as dai
import numpy as np

pipeline = dai.Pipeline()
pipeline.setOpenVINOVersion(version = dai.OpenVINO.VERSION_2021_4)

detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("ewasr_resnet18.blob/ewasr_resnet18_openvino_2021.4_6shave.blob")
detection_nn.setNumPoolFrames(4)
detection_nn.input.setBlocking(False)
detection_nn.setNumInferenceThreads(2)

cam = pipeline.create(dai.node.ColorCamera)
cam.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
cam.setPreviewSize(512, 384)
cam.setInterleaved(False)
cam.setFps(5)
cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)

xout_cam = pipeline.create(dai.node.XLinkOut)
xout_cam.setStreamName("cam")

xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn")

cam.preview.link(detection_nn.input)
detection_nn.passthrough.link(xout_cam.input)
detection_nn.out.link(xout_nn.input)

def scale_image(image):
    # Calculate the minimum and maximum color values
    min_val = np.min(image)
    max_val = np.max(image)

    # Scale the image so that the minimum value becomes black and maximum becomes white
    scaled_image = (255 * (image - min_val) / (max_val - min_val)).astype(np.uint8)

    return scaled_image

with dai.Device(pipeline) as device:

    q_cam = device.getOutputQueue("cam", 4, blocking=False)
    q_nn = device.getOutputQueue(name="nn", maxSize=4, blocking=False)

    while True:
        in_frame = q_cam.get()
        in_nn = q_nn.get()

        frame = in_frame.getCvFrame()
        layer_data = np.array(in_nn.getLayerFp16("prediction"))

        # Reshape the output to the expected shape (1x3x96x128)
        final_image = layer_data.reshape((1, 3, 96, 128)).astype(np.uint8)
        final_image = np.transpose(final_image, (2, 3, 1, 0))  # Reshape to (96, 128, 3, 1)
        final_image = final_image[:, :, :, 0]  # Remove the extra dimension

        final_image = cv2.resize(final_image, (frame.shape[1], frame.shape[0]))  # Resize the mask to match the camera frame size
        final_image = cv2.normalize(final_image.astype(np.uint8), None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)

        result = cv2.addWeighted(frame,0.6,final_image,0.4,0)

        # Display the result
        cv2.imshow('Overlay Result', result)
        cv2.imshow("Output", final_image)

        if cv2.waitKey(1) == ord('q'):
            break

Any ideas?

I'm not sure if something isn't quite set up right, or if it's something to do with the OAK-1 in general as I see in the paper that you've only tested on the OAK-D. It would be cool to see the code used for that test setup 😄

Adding more classes

Dear @tersekmatija

Thanks for sharing your work and code. We have tested it with different images and situations, and it seems robust, giving us the expected results.

I would like to extend the number of classes to include boat and buoy.
Is it possible to add these two new classes? Which approach do you suggest?

Regards

Add license to project

Really like the project! Great work!

I would love to use it in my project, so I would appreciate it if it had a license!

export the pre-trained model to ONNX and blob compatible with OAK-D device

Awesome work! Looking forward to testing it using an OAK-D camera!
I am encountering an issue while trying to convert the model to ONNX.

Traceback (most recent call last):
  File "eWaSR\export.py", line 106, in <module>
    main()
  File "eWaSR\export.py", line 103, in main
    export(args)
  File "eWaSR\export.py", line 57, in export
    torch.onnx.export(
  File "env_python_obstacle_detection\Lib\site-packages\torch\onnx\utils.py", line 516, in export
    _export(
  File "env_python_obstacle_detection\Lib\site-packages\torch\onnx\utils.py", line 1613, in _export
    graph, params_dict, torch_out = _model_to_graph(
                                    ^^^^^^^^^^^^^^^^
  File "env_python_obstacle_detection\Lib\site-packages\torch\onnx\utils.py", line 1139, in _model_to_graph
    graph = _optimize_graph(
            ^^^^^^^^^^^^^^^^
  File "env_python_obstacle_detection\Lib\site-packages\torch\onnx\utils.py", line 677, in _optimize_graph
    graph = _C._jit_pass_onnx(graph, operator_export_type)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "env_python_obstacle_detection\Lib\site-packages\torch\onnx\utils.py", line 1967, in _run_symbolic_function
    raise errors.UnsupportedOperatorError(
torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::_upsample_bilinear2d_aa' to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.

The issue happens while the script attempts to export the model to ONNX. I did find a related issue mentioning that changing opset_version=12 to 11 would fix it, but no luck.
The issue persists with or without the IMU.
Lib versions are:

  1. torch 2.2.1+cpu
  2. onnx 1.15.0
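
For context, aten::_upsample_bilinear2d_aa is the operator PyTorch emits for antialiased bilinear resizing (antialias=True), which the ONNX exporter cannot handle at opset 12. The sketch below reproduces the same failure outside eWaSR, assuming that is indeed where the operator comes from.

# Sketch: reproduce the unsupported-operator error with a standalone module.
# Assumption: the operator originates from an antialiased bilinear resize.
import torch
import torch.nn.functional as F

class AntialiasedResize(torch.nn.Module):
    def forward(self, x):
        return F.interpolate(x, size=(96, 128), mode="bilinear", antialias=True)

# Raises torch.onnx.errors.UnsupportedOperatorError for aten::_upsample_bilinear2d_aa
torch.onnx.export(AntialiasedResize(), torch.randn(1, 3, 384, 512), "resize.onnx", opset_version=12)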

inference speed is slow

Hello, @tersekmatija,
I'm sorry to bother you, but I have some questions I'd like to ask.
When I train the eWaSR model on the LaRS dataset, training is very fast, but inference is very slow. Compared with WaSR, the inference speed does not show a big advantage, which is inconsistent with your paper: there is a big gap from the reported tenfold speedup. What could be the reason?

Prediction with external dataset

Hi everyone!

I'm trying to run predict.py as shown here:

python predict.py \
--dataset_config configs/examples.yaml \
--model ewasr_resnet18 \
--weights path/to/model/weights.pth \
--output_dir output/predictions

I downloaded the weights and referenced the path to my images in configs/examples.yaml. Nevertheless, I cannot run the model on my own images, in contrast to the wasr repository.

Could it be that lines 59-65 of the predict function in predict.py don't provide any option for external datasets?

Thanks in advance!

Impossible to train without using IMU masks

Hello all, I'm pretty new to AI, so I'm not fully familiar with its configuration.

The problem

I can launch a training with the mastr1325 dataset and all my dependencies seem okay. However:
I have many datasets to use for training (plus the mastr1325 one), but my datasets don't have any IMU masks for the images. From what I have read in the README.md, there are models that use the IMU and models that don't. Even by choosing a "non-IMU" model, I can't run the train.py script without having a complete folder of IMU masks set in the config files.

Launching a training without the IMU

My directory

I have created four folders in the eWaSR root directory (same level as train.py, predict.py, etc.):

  • images, the mastr1325 images
  • gt_masks, the mastr1325 ground truth annotations
  • imu_masks, the mastr1325 IMU masks
  • empty_folder, the name speaks for itself, for testing purposes

Config files

mastr1325_train.yaml

image_dir: ../images
mask_dir: ../gt_masks
imu_dir: ../empty_folder
image_list: train_images.txt

mastr1325_val.yaml

image_dir: ../images
mask_dir: ../gt_masks
imu_dir: ../imu_masks
image_list: val_images.txt

I left the imu_dir of this file pointing to the real IMU masks for the validation process.

Modifications to the files

models.py (starting at line 31)

    elif model_name.startswith('wasr_resnet18'):
        model = wasr_deeplabv2_resnet18(num_classes=num_classes, imu=False) # imu=imu
    elif model_name.startswith('ewasr'):
        backbone = model_name.split("_")[1].split("_")[0]
        model = ewasr(num_classes = num_classes, imu = False, backbone=backbone, **kwargs) # imu=imu
    else:
        raise ValueError('Unknown model: %s' % model_name)

As I want to use ewasr_resnet18, I changed the imu argument of the associated calls from imu to False.

train.py (line 27)

MODEL = "ewasr_resnet18"#'wasr_resnet18_imu'

Command

python3 train.py --train_config configs/mastr1325_train.yaml --val_config configs/mastr1325_val.yaml \
--model_name my_ewasr --validation --batch_size 4 --epochs 2 --model ewasr_resnet18

Results

Namespace(batch_size=4, enricher='SS', epochs=2, focal_loss_scale='labels', gpus='auto', learning_rate=1e-06, log_steps=20, lr_decay_pow=0.9, mixer='CCCCSS', model='ewasr_resnet18', model_name='my_ewasr', momentum=0.9, monitor_metric='val/loss', monitor_metric_mode='min', no_augmentation=False, no_separation_loss=False, num_classes=3, output_dir='output', patience=None, precision=32, pretrained=True, pretrained_weights=None, project=False, random_seed=None, resume_from=None, separation_loss_lambda=0.01, train_config='configs/mastr1325_train.yaml', val_config='configs/mastr1325_val.yaml', validation=True, weight_decay=1e-06, workers=8)
/home/user/.local/lib/python3.8/site-packages/lightning_fabric/utilities/seed.py:40: No seed found, seed set to 3616557247
Seed set to 3616557247
/home/user/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/user/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
/home/user/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
/home/user/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:740: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() if nvml_count < 0 else nvml_count
Invalid MIT-MAGIC-COOKIE-1 key
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name         | Type          | Params
-----------------------------------------------
0 | model        | WaSR          | 60.3 M
1 | val_accuracy | PixelAccuracy | 0     
2 | val_iou_0    | ClassIoU      | 0     
3 | val_iou_1    | ClassIoU      | 0     
4 | val_iou_2    | ClassIoU      | 0     
-----------------------------------------------
60.3 M    Trainable params
0         Non-trainable params
60.3 M    Total params
241.013   Total estimated model params size (MB)
Sanity Checking DataLoader 0:   0%|                       | 0/2 [00:00<?, ?it/s]/home/user/.local/lib/python3.8/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
  warnings.warn(
/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/data.py:77: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 4. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.
Epoch 0:   0%|                                          | 0/324 [00:00<?, ?it/s]Traceback (most recent call last):
  File "train.py", line 155, in <module>
    main()
  File "train.py", line 151, in main
    train_wasr(args)
  File "train.py", line 144, in train_wasr
    trainer.fit(model, train_dl, val_dl)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 989, in _run
    results = self._run_stage()
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1035, in _run_stage
    self.fit_loop.run()
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
    self.advance()
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 359, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 136, in run
    self.advance(data_fetcher)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 202, in advance
    batch, _, __ = next(data_fetcher)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/fetchers.py", line 127, in __next__
    batch = super().__next__()
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/loops/fetchers.py", line 56, in __next__
    batch = next(self.iterator)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/combined_loader.py", line 326, in __next__
    out = next(self._iterator)
  File "/home/user/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/combined_loader.py", line 74, in __next__
    out[i] = next(self.iterators[i])
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/home/user/.local/lib/python3.8/site-packages/torch/_utils.py", line 694, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/user/Documents/Semseg/eWaSR/datasets/mastr.py", line 89, in __getitem__
    imu_mask = np.array(Image.open(imu_path))
  File "/home/user/.local/lib/python3.8/site-packages/PIL/Image.py", line 3243, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/Documents/Semseg/eWaSR/empty_folder/0546.png'

Epoch 0:   0%|          | 0/324 [00:00<?, ?it/s]  

My question

Is it really possible to run a training without IMU masks or is it a deprecated feature from WaSR?

Additional question

I've also tried reverse engineering the code, and in mastr.py I saw a few things like:

# line 51
self.imu_dir = (self.dataset_dir / Path(data['imu_dir'])).resolve() if 'imu_dir' in data else None
...
# line 85
        if self.imu_dir is not None:
            imu_path = str(self.imu_dir / ('%s.png' % img_name))
            imu_mask = np.array(Image.open(imu_path))
            data['imu_mask'] = imu_mask

So this file seems to read the 'imu_dir' line in the .yaml files and get the path from which it opens each mask.

# line 102
        features = {'image': img}
        labels = {}

        if self.include_original:
            features['image_original'] = torch.from_numpy(img_original.transpose(2,0,1))

        if 'segmentation' in data:
            labels['segmentation'] = torch.from_numpy(data['segmentation'].transpose(2,0,1))

        if 'imu_mask' in data:
            features['imu_mask'] = torch.from_numpy(data['imu_mask'].astype(bool))

So I thought that if I set the 'imu_dir' lines in the .yaml files to null, no 'imu_dir' entry would be used, and so no IMU masks.

Config files

mastr1325_train.yaml

image_dir: ../images
mask_dir: ../gt_masks
imu_dir: null
image_list: train_images.txt

mastr1325_val.yaml

image_dir: ../images
mask_dir: ../gt_masks
imu_dir: null
image_list: val_images.txt

Modifications to the files

models.py (starting at line 31)

    elif model_name.startswith('wasr_resnet18'):
        model = wasr_deeplabv2_resnet18(num_classes=num_classes, imu=False) # imu=imu
    elif model_name.startswith('ewasr'):
        backbone = model_name.split("_")[1].split("_")[0]
        model = ewasr(num_classes = num_classes, imu = False, backbone=backbone, **kwargs) # imu=imu
    else:
        raise ValueError('Unknown model: %s' % model_name)

As I want to use ewasr_resnet18, I changed the imu argument of the associated calls from imu to False.

train.py (line 27)

MODEL = "ewasr_resnet18"#'wasr_resnet18_imu'

Command

python3 train.py --train_config configs/mastr1325_train.yaml --val_config configs/mastr1325_val.yaml \
--model_name my_ewasr --validation --batch_size 4 --epochs 2 --model ewasr_resnet18

Results

Namespace(batch_size=4, enricher='SS', epochs=2, focal_loss_scale='labels', gpus='auto', learning_rate=1e-06, log_steps=20, lr_decay_pow=0.9, mixer='CCCCSS', model='ewasr_resnet18', model_name='my_ewasr', momentum=0.9, monitor_metric='val/loss', monitor_metric_mode='min', no_augmentation=False, no_separation_loss=False, num_classes=3, output_dir='output', patience=None, precision=32, pretrained=True, pretrained_weights=None, project=False, random_seed=None, resume_from=None, separation_loss_lambda=0.01, train_config='configs/mastr1325_train.yaml', val_config='configs/mastr1325_val.yaml', validation=True, weight_decay=1e-06, workers=8)
/home/user/.local/lib/python3.8/site-packages/lightning_fabric/utilities/seed.py:40: No seed found, seed set to 1347808435
Seed set to 1347808435
Traceback (most recent call last):
  File "train.py", line 155, in <module>
    main()
  File "train.py", line 151, in main
    train_wasr(args)
  File "train.py", line 102, in train_wasr
    train_ds = MaSTr1325Dataset(args.train_config, transform=transform,
  File "/home/user/Documents/Semseg/eWaSR/datasets/mastr.py", line 51, in __init__
    self.imu_dir = (self.dataset_dir / Path(data['imu_dir'])).resolve() if 'imu_dir' in data else None
  File "/usr/lib/python3.8/pathlib.py", line 1042, in __new__
    self = cls._from_parts(args, init=False)
  File "/usr/lib/python3.8/pathlib.py", line 683, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/usr/lib/python3.8/pathlib.py", line 667, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I thought that because I set 'imu_dir' to null, it would be None in Python, so no 'imu_dir' attribute would be created in the MaSTr1325Dataset class. But again, it does not work.
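
For reference, the TypeError above matches how YAML handles null: the key is still present in the parsed dict, just with the value None, so a check like 'imu_dir' in data still succeeds and Path(None) is attempted. A small sketch illustrating the difference:

# Sketch: "imu_dir: null" keeps the key present (with value None), so the
# membership check in mastr.py line 51 is still True and Path(None) raises.
import yaml

with_null = yaml.safe_load("imu_dir: null\nimage_dir: ../images")
without_key = yaml.safe_load("image_dir: ../images")

print('imu_dir' in with_null)    # True:  the IMU branch still runs
print('imu_dir' in without_key)  # False: the IMU branch would be skipped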

Is it possible to run eWaSR training without IMU masks, and if yes, how?

Thank you for taking my request into consideration.
