DeepSeismic

Deep Learning for Seismic Imaging and Interpretation

License: MIT

This repository shows you how to perform seismic imaging and interpretation on Azure. It empowers geophysicists and data scientists to run seismic experiments using state-of-the-art DSL-based PDE solvers and segmentation algorithms on Azure.

The repository provides sample notebooks, data loaders for seismic data, utilities, and out-of-the-box ML pipelines, organized as follows:

  • sample notebooks: found in the examples folder, these are standard Jupyter notebooks that highlight how to use the codebase by walking the user through a set of pre-made examples
  • experiments: runnable Python scripts that train and test (score) our machine learning models, located in the experiments folder. The models themselves are swappable: a single train script can run a different model on the same dataset simply by swapping out the configuration file that defines the model.
  • pip-installable utilities: the cv_lib and interpretation packages (more info below), which are used by both the sample notebooks and the experiments mentioned above

DeepSeismic currently focuses on Seismic Interpretation (mainly facies classification) with experimental code provided around Seismic Imaging in the contrib folder.

(The original README includes a GIF illustrating what the repo offers.)

Quick Start

Our repo is Docker-enabled and we provide a Dockerfile which you can use to quickly demo our codebase. If you are in a hurry and just can't wait to run our code, follow the Docker README to build and run the repo from the Dockerfile.
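If you just want a feel for the flow, here is a minimal sketch; the Dockerfile path and image tag below are assumptions, so treat the Docker README as authoritative:

# minimal sketch - paths and tags are illustrative, see the Docker README
git clone https://github.com/microsoft/seismic-deeplearning.git
cd seismic-deeplearning
docker build -t deepseismic -f docker/Dockerfile .
# run with GPU access (requires the NVIDIA container toolkit)
docker run --rm -it --gpus all deepseismic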

For developers, we offer a more hands-on Quick Start below.

Dev Quick Start

There are two ways to get started with the DeepSeismic codebase, which currently focuses on Interpretation:

  • if you'd like to get an idea of how our interpretation (segmentation) models are used, simply review the demo notebook
  • to run the code, you'll need to set up a compute environment (which includes setting up a GPU-enabled Linux VM and installing the appropriate Anaconda Python packages) and download the datasets you'd like to work with - detailed steps are provided in the Interpretation section below.

If you run into any problems, chances are your problem has already been solved in the Troubleshooting section.

The notebook is designed to run in demo mode by default, using a pre-trained model, in under 5 minutes on any reasonable deep learning GPU such as an NVIDIA K80/P40/P100/V100/TitanV.

Azure Machine Learning

Azure Machine Learning enables you to train and deploy machine learning models and pipelines at scale, using open-source Python frameworks such as PyTorch, TensorFlow, and scikit-learn. To get started using the code in this repository with Azure Machine Learning, refer to the Azure Machine Learning How-to.

Interpretation

For seismic interpretation, the repository provides extensible machine learning pipelines that show how you can leverage state-of-the-art segmentation algorithms (UNet, SEResNet, HRNet) for seismic interpretation. We currently support rectangular data, i.e. 2D and 3D seismic images which form a rectangle in 2D. We also provide utilities for converting SEG-Y data with rectangular boundaries into numpy arrays, where everything outside the boundary is padded to produce a rectangular 3D numpy volume.
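As a rough illustration of the padding idea (this is not the repo's utility, just a sketch on a made-up ragged section):

# illustration only: pad ragged traces out to the rectangular boundary
import numpy as np

traces = [np.ones(4), np.ones(6), np.ones(5)]            # unequal lengths
width = max(len(t) for t in traces)
volume = np.stack([np.pad(t, (0, width - len(t))) for t in traces])
print(volume.shape)                                      # (3, 6)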

To run the examples available in the repo, follow the instructions below to:

  1. Set up the environment
  2. Download the data sets
  3. Run example notebooks and scripts

Setting up Environment

Follow the instructions below to read about compute requirements and install required libraries.

Compute environment

We recommend using a virtual machine to run the example notebooks and scripts. Specifically, you will need a GPU powered Linux machine, as this repository is developed and tested on Linux only. The easiest way to get started is to use the Azure Data Science Virtual Machine (DSVM) for Linux (Ubuntu). This VM will come installed with all the system requirements that are needed to create the conda environment described below and then run the notebooks in this repository.

For this repo, we recommend selecting a multi-GPU Ubuntu VM of type Standard_NC12, which is powered by NVIDIA Tesla K80 GPUs (NCv2- and NCv3-series VMs offer P100 and V100 GPUs, respectively) and can be found in most Azure regions.

NOTE: For users new to Azure, your subscription may not come with a quota for GPUs. You may need to go into the Azure portal to increase your quota for GPU VMs. Learn more about how to do this here: https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits.

Package Installation

To install the packages contained in this repository, navigate to the directory where you cloned the DeepSeismic repo and run:

conda env create -f environment/anaconda/local/environment.yml

This will create the appropriate conda environment to run experiments. If you run into problems with this step, see the troubleshooting section.

Next, you will need to install the common package for interpretation:

conda activate seismic-interpretation
pip install -e interpretation

Then you will also need to install cv_lib which contains computer vision related utilities:

pip install -e cv_lib

Both packages are installed in developer mode with the -e flag. This means that to update them, you simply go to the repo folder and pull the appropriate commit or branch.
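For example:

# packages are installed with `pip install -e`, so pulling new code
# updates the installed packages in place - no reinstall needed
cd DeepSeismic
git pull    # or: git checkout <branch-or-commit>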

During development, in case you need to update the environment due to a conda env file change, you can run

conda env update --file environment/anaconda/local/environment.yml

from the root of DeepSeismic repo.

Dataset download and preparation

This repository provides examples of how to run seismic interpretation on the publicly available, annotated Dutch F3 seismic dataset, which is about 2.2GB in size.

Please make sure you have enough disk space to download the datasets.

We have experiments and notebooks which use either one dataset or the other; depending on which experiment or notebook you want to run, you'll need to download the corresponding dataset. We suggest you start by looking at the demo notebook, which requires the Dutch F3 dataset.

Dutch F3 dataset prep

To download the Dutch F3 dataset for 2D experiments, please follow the data download instructions at this GitHub repository (section Dataset). Alternatively, you can use the download script:

data_dir="$HOME/data/dutch"
mkdir -p "${data_dir}"
./scripts/download_dutch_f3.sh "${data_dir}"

The download script also automatically creates any subfolders in ${data_dir} which are needed by the data preprocessing scripts. At this point, your ${data_dir} directory should contain a data folder that looks like this:

data
├── splits
├── test_once
│   ├── test1_labels.npy
│   ├── test1_seismic.npy
│   ├── test2_labels.npy
│   └── test2_seismic.npy
└── train
    ├── train_labels.npy
    └── train_seismic.npy

To prepare the data for the experiments (e.g. split into train/val/test), please run the following script:

# change working directory to scripts folder
cd scripts

# For patch-based experiments
python prepare_dutchf3.py split_train_val patch --data_dir=${data_dir}/data --label_file=train/train_labels.npy --output_dir=splits \
--stride=50 --patch_size=100 --split_direction=both

# For section-based experiments
python prepare_dutchf3.py split_train_val section --data_dir=${data_dir}/data --label_file=train/train_labels.npy --output_dir=splits --split_direction=both

# go back to repo root
cd ..

Refer to the script itself for more argument options.
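If prepare_dutchf3.py exposes the same Python Fire CLI as the training scripts (see the Configuration Files section below), you can also inspect its arguments directly from the command line:

# list available commands and their arguments
python prepare_dutchf3.py --help
python prepare_dutchf3.py split_train_val patch --help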

Bring Your Own Data [BYOD]

Bring your own SEG-Y data

If you want to train these models using your own seismic and label data, the files will need to be prepped and converted to npy files. Typically, the segyio library can be used to open SEG-Y files that follow the standard, but more often than not there are non-standard settings or missing traces that will cause segyio to fail. If this happens with your data, see the notebooks and scripts referenced below for help preparing your data files.
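As a minimal sketch of the fallback approach (the file names here are hypothetical, and this is not the repo's exact converter), you can open the file with segyio's ignore_geometry option and stack the raw traces into a numpy array yourself:

# sketch: convert a non-standard SEG-Y file to a .npy array with segyio
import numpy as np
import segyio

# ignore_geometry=True allows opening files whose headers do not
# describe a regular inline/crossline grid
with segyio.open("survey.segy", ignore_geometry=True) as f:
    data = np.stack([np.asarray(t, dtype=np.float32) for t in f.trace])

np.save("survey_seismic.npy", data)   # shape: (n_traces, n_samples)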

Penobscot example

We also offer starter code to convert the Penobscot dataset (available here) into the tensor format used by the Dutch F3 dataset - once converted, you can run Penobscot through the same mechanisms as Dutch F3. The rough sequence of steps is:

conda activate seismic-interpretation
cd scripts
wget -o /dev/null -O dataset.h5 "https://zenodo.org/record/3924682/files/dataset.h5?download=1"
# convert penobscot
python byod_penobscot.py --filename dataset.h5 --outdir <where to output data>
# preprocess for experiments
python prepare_dutchf3.py split_train_val patch --data_dir=<outdir from the previous step> --label_file=train/train_labels.npy --output_dir=splits --stride=50 --patch_size=100 --split_direction=both --section_stride=100

Run Examples

Notebooks

We provide example notebooks under examples/interpretation/notebooks/ to demonstrate how to train seismic interpretation models and evaluate them on the Penobscot and F3 datasets.

Make sure to run the notebooks in the conda environment we previously set up (seismic-interpretation). To register the conda environment in Jupyter, please run:

python -m ipykernel install --user --name seismic-interpretation
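You can confirm the kernel was registered with:

jupyter kernelspec list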

Optional: if you plan to develop a notebook, you can install the Black formatter with the following commands:

conda activate seismic-interpretation
jupyter nbextension install https://github.com/drillan/jupyter-black/archive/master.zip --user
jupyter nbextension enable jupyter-black-master/jupyter-black

This adds a Black formatter button to your notebook toolbar; clicking it automatically formats the notebook cell you're in.

Experiments

We also provide scripts for a number of experiments we conducted using different segmentation approaches. These experiments are available under experiments/interpretation and can be used as examples. Within each experiment, start from the train.sh and test.sh scripts, which invoke the corresponding Python scripts train.py and test.py. Take a look at the experiment configurations (see the Configuration Files section below) for experiment options and modify them if necessary.

This release currently supports local and distributed training on the Dutch F3 dataset.

Please note that we use NVIDIA's NCCL library to enable distributed training. Please follow the installation instructions here to install NCCL on your system.

Configuration Files

We use the YACS configuration library to manage configuration options for the experiments. There are three ways to pass arguments to the experiment scripts (e.g. train.py or test.py):

  • default.py - The project config file default.py is a one-stop reference point for all configurable options and provides sensible defaults for all arguments. If no arguments are passed to the train.py or test.py script (e.g. python train.py), the arguments are loaded from default.py. Please take a look at default.py to familiarize yourself with the experiment arguments used by the script you run.

  • yml config files - YAML configuration files under configs/ are typically created one per experiment. These are meant for repeatable experiment runs and reproducible settings. Each configuration file only overrides the options that change in that experiment (i.e. options loaded from default.py during an experiment run are overridden by arguments loaded from the yml file). As an example, to use a yml configuration file with the training script, run:

    python train.py --cfg "configs/seresnet_unet.yaml"
    
  • command line - Finally, options can be passed in through the options argument, and these override arguments loaded from the configuration file. We created CLIs for all our scripts (using the Python Fire library), so you can pass these options via command-line arguments, like so (a minimal sketch of this precedence follows the list):

    python train.py DATASET.ROOT "/home/username/data/dutch/data" TRAIN.END_EPOCH 10
    
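As a minimal sketch of this precedence (the keys below mirror the repo's configs, but the authoritative list lives in default.py):

# sketch of YACS option precedence: defaults < yml file < command line
from yacs.config import CfgNode as CN

_C = CN()
_C.TRAIN = CN()
_C.TRAIN.END_EPOCH = 300                  # sensible default
_C.DATASET = CN()
_C.DATASET.ROOT = "/data/dutch"

cfg = _C.clone()
cfg.merge_from_file("configs/seresnet_unet.yaml")   # yml overrides defaults
cfg.merge_from_list(["TRAIN.END_EPOCH", 10])        # CLI overrides yml
cfg.freeze()                                        # make the config immutable

Note that merge_from_file raises a KeyError for any yml key not already defined in the defaults, so every option must first exist in default.py.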

Training

We run an aggressive cosine annealing schedule which starts with a higher learning rate (LR) and gradually lowers it to zero over approximately 60 epochs; at that point we raise the LR back to its original value and lower it again over the next ~60 epochs. This process repeats 5 times, for 60*5=300 training epochs in total across 5 cycles, and the model with the best frequency-weighted IoU is snapshotted to disk during each cycle. We suggest consulting the TensorBoard logs to see which training cycle produced the best model, and using that model for scoring.

For multi-GPU training, we run a linear burn-in (warm-up) LR schedule before starting the 5 cosine cycles; after that, training proceeds the same way as for a single GPU.
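A sketch of this schedule using PyTorch's warm-restart scheduler (the repo's actual implementation may differ in its details):

# sketch: 5 cosine cycles of ~60 epochs each, restarting at the base LR
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(10, 6)                    # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=60)  # 60-epoch cycles

for epoch in range(300):                          # 5 cycles x 60 epochs
    # ... train one epoch, snapshot the best-FWIoU model within each cycle ...
    scheduler.step()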

Pretrained Models

There are two types of pre-trained models used by this repo:

  1. models pre-trained on non-seismic computer vision datasets, which we fine-tune for the seismic domain by re-training on seismic data
  2. models which we have already trained on seismic data - these are downloaded automatically by our code when needed (again, see the demo notebook above for how this is done).

Viewers (optional)

For seismic interpretation (segmentation), if you want to visualize cross-sections of a 3D volume (both the input velocity model and the segmented output) you can use segyviewer. To install and use segyviewer, please follow the instructions below.

segyviewer

To install segyviewer run:

conda create -n segyviewer python=2.7
conda activate segyviewer
conda install -c anaconda pyqt=4.11.4
pip install segyviewer

To visualize cross-sections of a 3D volume, you can run segyviewer like so:

segyviewer "${HOME}/home/username/data/dutch/data.segy"

Benchmarks

Dense Labels

This section contains benchmarks of different algorithms for seismic interpretation on 3D seismic datasets with densely-annotated data. We currently only support single-GPU Dutch F3 dataset benchmarks with this release.

Dutch F3

| Source | Experiment | PA | FW IoU | MCA | Single V100 (16GB) GPU training time | Four V100 (16GB) GPUs training time |
| --- | --- | --- | --- | --- | --- | --- |
| Alaudah et al. | Section-based | 0.905 | 0.817 | 0.832 | N/A | N/A |
| Alaudah et al. | Patch-based | 0.852 | 0.743 | 0.689 | N/A | N/A |
| DeepSeismic | Patch-based+fixed | 0.892 | 0.811 | 0.759 | 18h 42min | 7h 24min |
| DeepSeismic | Patch-based+fixed+skip | 0.909 | 0.839 | 0.802 | 19h 01min | 7h 39min |
| DeepSeismic | SEResNet UNet+section depth | 0.928 | 0.872 | 0.866 | ~9 days | 35h 54min |
| DeepSeismic | HRNet(patch)+section_depth (experimental) | 0.926 | 0.869 | 0.873 | ~10 days | 43h 9min |

Note: these are single-run performance numbers and we expect results to fluctuate between runs, i.e. some variability is to be expected, but with this codebase performance should be close to these numbers.

Reproduce benchmarks

To reproduce the benchmarks, navigate to the experiments folder, where each experiment is split into its own folder. To run the Dutch F3 experiment, navigate to the dutchf3_patch folder. There you'll find a training script, train.sh, which runs training for any configuration you pass in. If your machine has multiple GPUs, you can run distributed training using the distributed training script train_distributed.sh. Once training has finished, run the test.sh script, making sure you specify the path to the best-performing model from your training run, either by passing it in as an argument or by altering the YACS config file.
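A sketch of that flow (the script arguments and the TEST.MODEL_PATH config key are assumptions here - check train.sh, test.sh, and the YACS config in the experiment folder for the real names):

cd experiments/interpretation/dutchf3_patch
./train.sh                       # single-GPU training
# ./train_distributed.sh         # multi-GPU training
# score, pointing the config at the best snapshot from training:
python test.py --cfg configs/seresnet_unet.yaml TEST.MODEL_PATH /path/to/best_model.pth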

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

Submitting a Pull Request

We try to keep the repo in a clean state, which means we only enable read access to it - read access still allows you to submit a PR or an issue. To do so, fork the repo and submit a PR from a branch in your fork into our staging branch.
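For example:

# fork microsoft/seismic-deeplearning on GitHub first, then:
git clone https://github.com/<your-username>/seismic-deeplearning.git
cd seismic-deeplearning
git checkout -b my-fix
# ...commit your changes, push to your fork, and open a PR against
# the upstream 'staging' branch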

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Build Status

| Build | Branch | Status |
| --- | --- | --- |
| Legal Compliance | staging | Build Status |
| Legal Compliance | master | Build Status |
| Core Tests | staging | Build Status |
| Core Tests | master | Build Status |

Troubleshooting

For Data Science Virtual Machine conda package installation issues, first locate the Anaconda installation on the DSVM, for example by running:

which python

A typical output will be:

someusername@somevm:/projects/DeepSeismic$ which python
/anaconda/envs/py35/bin/python

which indicates that the Anaconda folder is /anaconda. We'll refer to this location in the instructions below, but you should update the commands according to your local Anaconda folder.

Data Science Virtual Machine conda package installation errors

It could happen that you don't have sufficient permissions to run conda commands or install packages into the Anaconda packages directory. To remedy the situation, please run the following commands:

rm -rf /anaconda/pkgs/*
sudo chown -R $(whoami) /anaconda

After these commands complete, try installing the packages again.

Data Science Virtual Machine conda package installation warnings

It could happen that while creating the conda environment defined by environment/anaconda/local/environment.yml on an Ubuntu DSVM, one can get multiple warnings like so:

WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /anaconda/pkgs/ipywidgets-7.5.1-py_0/site-packages/ipywidgets-7.5.1.dist-info/LICENSE.  Please remove this file manually (you may need to reboot to free file handles)  

If this happens, similarly to the instructions above, stop the conda environment creation (type Ctrl+C) and then recursively change the ownership of the /anaconda directory from root to the current user by running:

sudo chown -R $USER /anaconda

After this command completes, try creating the conda environment from environment/anaconda/local/environment.yml again.

Model training or scoring is not using GPU

To see if the GPU is being used while your model is being trained or used for inference, run

nvidia-smi

and confirm that you see your Python process using the GPU.

If not, you may want to try reverting to an older version of CUDA for use with PyTorch (by default we use CUDA 10). After the environment has been set up, activate it with conda activate seismic-interpretation and run the following command:

conda install pytorch torchvision cudatoolkit=9.2 -c pytorch

To test whether this setup worked, open ipython right afterwards and execute the following code:

import torch
torch.cuda.is_available() 

The output should say True. If the output is still False, you may want to try setting the CUDA_VISIBLE_DEVICES environment variable to specify the device manually - to test this, start a new ipython session and type:

import os
os.environ['CUDA_VISIBLE_DEVICES']='0'
import torch                                                                                  
torch.cuda.is_available() 

The output should say True this time. If it does, you can make the change permanent by adding:

export CUDA_VISIBLE_DEVICES=0

to your $HOME/.bashrc file.

GPU out of memory errors

You should be able to see how much GPU memory your process is using by running:

nvidia-smi

and check whether this amount is close to the physical memory limit specified by the GPU manufacturer.

If you're getting close to the memory limit, you may want to lower the batch size in the model configuration file, specifically the TRAIN.BATCH_SIZE_PER_GPU and VALIDATION.BATCH_SIZE_PER_GPU settings.
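Since the scripts accept YACS overrides on the command line (see Configuration Files above), you can also lower the batch sizes without editing the file; the values below are just examples:

python train.py --cfg configs/seresnet_unet.yaml TRAIN.BATCH_SIZE_PER_GPU 8 VALIDATION.BATCH_SIZE_PER_GPU 8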

How to resize Data Science Virtual Machine disk

  1. Go to the Azure Portal and find your virtual machine by typing its name in the search bar at the very top of the page.

  2. In the Overview panel on the left-hand side, click the Stop button to stop the virtual machine.

  3. Next, select Disks in the same panel on the left-hand side.

  4. Click the Name of the OS Disk - you'll be navigated to the Disk view. From this view, select Configuration on the left-hand side and then increase Size in GB and hit the Save button.

  5. Navigate back to the Virtual Machine view in Step 2 and click the Start button to start the virtual machine.

Contributors

maxkazmsft, microsoftopensource, msftgits, yalaudah


Issues

Add Troubleshooting section for DSVM warnings

"Data Science Virtual Machine conda package installation warnings". Address "WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /data/anaconda/pkgs/ipywidgets-7.5.1-py_0/site-packages/ipywidgets-7.5.1.dist-info/LICENSE. Please remove this file manually (you may need to reboot to free file handles)
"

specify python version upon set up of conda env

This prevents notebooks from switching Python versions, which shows up in diffs:
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
"version": "3.6.7"

In the README, we need to pin the Python version when running "conda env create".

MODEL.PRETRAINED is not defined as key in default.py

(seismic-interpretation) rijai@rijaibugbash:~/DeepSeismic/experiments/interpretation/penobscot/local$ python train.py --cfg=configs/hrnet.yaml
Traceback (most recent call last):
File "train.py", line 292, in
fire.Fire(run)
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/fire/core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
target=component.name)
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "train.py", line 88, in run
update_config(config, options=options, config_file=cfg)
File "/data/home/rijai/DeepSeismic/experiments/interpretation/penobscot/local/default.py", line 107, in update_config
cfg.merge_from_file(config_file)
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/yacs/config.py", line 213, in merge_from_file
self.merge_from_other_cfg(cfg)
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
_merge_a_into_b(cfg_other, self, self, [])
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/yacs/config.py", line 460, in _merge_a_into_b
_merge_a_into_b(v, b[k], root, key_list + [k])
File "/data/anaconda/envs/seismic-interpretation/lib/python3.6/site-packages/yacs/config.py", line 473, in _merge_a_into_b
raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.PRETRAINED'

Improve penobscot download instructions

[From Richin]
I have to add sudo to the commands below to get them to work:

data_dir='/data/penobscot'
mkdir "$data_dir"
./scripts/download_penobscot.sh "$data_dir"

Maybe the docs point this out, but it's not clear - note that the specified download location (e.g. /data/penobscot) should be created beforehand and configured with appropriate write permissions.

indicate dataset sizes in the README

Indicate the download sizes of the Penobscot and Dutch F3 datasets in the "Dataset download and preparation" section of the README so that people can see how much storage they will need beforehand.

Support SEGY files that do not have geometry - consistent support of segy files

The current code uses segyio.tools.cube() to extract a volume of data from SEG-Y files, which can then easily be chunked up. However, most SEG-Y files do not have geometry in them that allows segyio to infer the dimensions of the volume. In the case where geometry is not present, you need to manually infer the dimensions by reading through the trace headers and constructing the volume manually.

script HRNET model download and add supporting documentation

The HRNET model download needs to be scripted. wget https://1drv.ms/u/s!Aus8VCZ_C_33dKvqI6pBZlifgJk either errors out due to ! being interpreted by the shell, or --follow-links doesn't work with OneDrive.

Otherwise one needs to download the model separately and then upload it to the VM, which is cumbersome - instead, script up wget to pretend that it's a web browser, similar to:

wget --header 'Host: microsoft-my.sharepoint.com' --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8' --header 'Accept-Language: en-GB,en;q=0.5' --referer 'https://microsoft-my.sharepoint.com/personal/your_data_file&originalPath=...' --header 'Cookie: ...' --header 'Upgrade-Insecure-Requests: 1' 'https://microsoft-my.sharepoint.com/personal/maxkaz_microsoft_com/_layouts/15/download.aspx?your_data_file_here' --output-document 'msft.tgz'

Suggested improvements to HRNet notebook, facilitates better UX

[from Patrick]
In HRNet_demo_notebook:

  • might be that the last two tensorboard commands should come before training, so that training progress gets shown (otherwise I think the last two lines won't run until the training is done).
  • could contain more high-level info; generally, if it's meant as the first thing for the user to execute, this notebook could be a bit more visually appealing.
  • could add instructions on how people can bring their own dataset
  • could visualize what the results of the trained model look like, and maybe show how to save the model (so it can be deployed later)
  • training takes a long time; could point out how long, or print how many epochs it's running
  • is there a way to see how the model performs (e.g. visually) without having to wait for the training to finish?

Improve description of the repo in README

[From Patrick]
• Not fully clear from a first glance what functionality the repo supports/contains. Could e.g. add a table early on which points out what the functionality is.
• Could also describe the approaches in 1-2 sentences each.
• Could also mention training/inference speed, and recommend which algorithm to use first.

README example data path is not consistent

python scripts/prepare_penobscot.py split_inline --data-dir=/data/penobscot --val-ratio=.1 --test-ratio=.2 uses /data/penobscot, but the previous steps guide you to download the data to $HOME/data/penobscot, so if you just copy and paste, it throws an exception.

enhance Azure-supporting documentation

  • Update README to reference the DSVM as the preferred running environment, or provide directions to install conda.
  • Update README to mention GPU machines as the preferred running environment.
