
hku-dasc7606-a1's People

Contributors

haibao-yu, ilnehc, ilnehc96


hku-dasc7606-a1's Issues

Confusion on Validation Dataset

I am confused: the validation loss hovers at some level (0.6-0.75), but the mAP keeps growing (from 0.3 to 0.48). I am not experienced in CV and did not change the model structure. Judging from the validation loss alone, I would have thought training was already overfitting if this were an NLP or RL task, but after checking the mAP I was quite surprised.

Is that because the model does not tackle the difficult images and only handles the easier ones?

How to get 'test_bbox_results.json'

I found that the 'test_bbox_results.json' file is referenced in the test_submission.py test file. At what stage is this file generated? Is it produced automatically by the program, or does it need to be generated manually?

# load results in COCO evaluation tool
    coco_true = dataset_test.coco
    coco_pred = coco_true.loadRes('test_bbox_results.json')
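For context, pycocotools' loadRes expects a JSON list of detections in the standard COCO result format, so some script has to write that file before test_submission.py can read it back. A minimal sketch of producing such a file (the field names follow the COCO result spec; the detection values here are made up):

```python
import json

def dump_coco_results(detections, out_path="test_bbox_results.json"):
    # detections: a list of dicts in COCO result format, e.g.
    # {"image_id": 1, "category_id": 3, "bbox": [x, y, w, h], "score": 0.95}
    with open(out_path, "w") as f:
        json.dump(detections, f)

dump_coco_results([{"image_id": 1, "category_id": 3,
                    "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.95}])
```

In this repo the file is presumably written by the test/evaluation script when it runs the model over the test set.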

Question about understanding losses.py

I'm confused by some code in losses.py at line 78:
IoU = calc_iou(anchors[0, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations

  1. If I understand correctly, the first dimension of anchors indexes the examples in the batch, and bbox_annotation is the j-th example in the batch. Would it then make more sense to write
    IoU = calc_iou(anchors[j, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations?

  2. For calc_iou, which is the part we need to complete, we're required to calculate the IoU between two boxes, but here we are calculating an IoU matrix between num_anchors anchors and num_annotations annotations. So are a and b in calc_iou batches of boxes?
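Regarding question 2: such a function typically does take two sets of boxes and return a num_anchors x num_annotations matrix. A plain-Python sketch of a pairwise IoU (the assignment's version is presumably vectorized with tensors; the box format here is assumed to be [x1, y1, x2, y2]):

```python
def pairwise_iou(boxes_a, boxes_b):
    # Returns a len(boxes_a) x len(boxes_b) matrix where
    # result[i][j] = IoU(boxes_a[i], boxes_b[j]); boxes are [x1, y1, x2, y2].
    ious = []
    for ax1, ay1, ax2, ay2 in boxes_a:
        area_a = (ax2 - ax1) * (ay2 - ay1)
        row = []
        for bx1, by1, bx2, by2 in boxes_b:
            area_b = (bx2 - bx1) * (by2 - by1)
            # Intersection rectangle, clamped to zero when boxes don't overlap
            iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
            ih = max(0.0, min(ay2, by2) - max(ay1, by1))
            inter = iw * ih
            union = area_a + area_b - inter
            row.append(inter / union if union > 0 else 0.0)
        ious.append(row)
    return ious
```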

File compression expected to take another 23 hours

Dear Teachers, yesterday when I tried to compress my HKU-DASC7606-A1 folder into a zip file and then rename it to my UID, the estimated time remaining suddenly showed another 23 hours just for the file compression. May I ask what I should do?

Question about model submission

Hi TAs,

The assignment description says "Models, in the format of model checkpoint link (model_link.txt)". Does this mean we need to upload our final model to Google Drive and put the link in the txt file, just like the train/test datasets?

Thanks!

GPU not utilized during model training, resulting in slow performance

I have added the CUDA path to ~/.bashrc and entered GPU-interactive mode. However, during model training the CPU is being used, leading to extremely slow training. This issue occurs consistently after the initial successful training run, which took approximately 1 hour under default settings.

Is it because installing TensorBoard modified the PyTorch version?
I ran
conda install pytorch torchvision -c pytorch
pip install tensorboard
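A common cause of this symptom is that the reinstall pulled in a CPU-only PyTorch build. You can check with `python -c "import torch; print(torch.__version__, torch.cuda.is_available())"`; CPU-only wheels typically carry a "+cpu" suffix in the version string. A tiny sketch of that suffix check (the tagging convention is PyTorch's, but verify against your actual install):

```python
def is_cpu_only_build(version_string):
    # PyTorch builds without CUDA support are usually tagged "+cpu",
    # e.g. "2.1.0+cpu"; CUDA builds look like "2.1.0+cu118" or plain "2.1.0".
    return "+cpu" in version_string
```

If torch.cuda.is_available() returns False, reinstalling PyTorch with the CUDA-enabled command for your CUDA version should restore GPU training.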

Training gets killed automatically mid-epoch

I don't know why, but every time I run a file on the GPU farm it gets killed in the middle of an epoch, even after modifying the losses.py file (I suspected the tensors were the cause). I was wondering if anyone else has encountered a similar problem, and would like to know how to fix it!


Program killed unexpectedly

I'm encountering an issue when running the command "python train.py…". Each time I attempt to execute this script, the process is terminated unexpectedly with a "Killed" message. Has anyone encountered a similar issue, or does anyone have suggestions on how to resolve it?
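A bare "Killed" with no Python traceback usually means the operating system's OOM killer stopped the process for exhausting RAM, so reducing the batch size or image resolution is a common first step. As a rough illustration (a lower bound only, since activations, gradients, and optimizer state dominate the real footprint, and the 608x608 resolution here is just an example), the input tensors alone scale like:

```python
def input_batch_bytes(batch_size, channels, height, width, bytes_per_elem=4):
    # Bytes for one float32 input batch; peak training memory is many
    # times larger due to activations, gradients, and optimizer state.
    return batch_size * channels * height * width * bytes_per_elem

print(input_batch_bytes(8, 3, 608, 608))  # 8 RGB images at 608x608
```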

Meaning of (x_1,y_1,x_2,y_2) in bbox annotation.

In the dataloader.py file, we're required to transform the bbox annotation, but I'm not sure about the meaning of the representation. Do (x, y)/(x1, y1) mean the top-left corner of the bbox and (x2, y2) the bottom-right corner, which would imply that h is a negative number? Is this a typical convention for bbox annotation?

TODO: Please substitute the "?" to transform annotations from [x, y, w, h] to [x1, y1, x2, y2]
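For reference, COCO stores boxes as [x, y, w, h] with (x, y) the top-left corner; because image coordinates put the y-axis pointing down, w and h are both positive. A minimal sketch of the standard conversion (not necessarily the assignment's exact code):

```python
def xywh_to_xyxy(box):
    # [x, y, w, h] (top-left corner + size) -> [x1, y1, x2, y2] (two corners)
    x, y, w, h = box
    return [x, y, x + w, y + h]

print(xywh_to_xyxy([10, 20, 30, 40]))
```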

Runtime so long that the school GPU automatically disconnects

After filling in the blanks, I tried to run
python train.py --coco_path ./data --output_path ./output --depth 50 --epochs 20
It takes so long that the school GPU session automatically disconnects (this happened again when I reconnected and ran the command a second time). What should I do?


Question about hyperparameter tuning

There are a lot of hyperparameters in the RetinaNet model, such as $\alpha$ and $\gamma$ in the focal loss, the model depth, the learning rate, etc. However, it takes roughly 5 minutes to train an epoch and about 4 hours to train 40 epochs. Tuning by trial and error is really expensive, and we only have a 100-hour HKU GPU Farm quota. So how should we tune these hyperparameters, with methods like grid search? And how do we draw conclusions from such a limited number of experiments?

I'd appreciate any suggestions or references!
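One budget-friendly alternative to a full grid is random search (Bergstra & Bengio, 2012): sample a fixed number of configurations from the grid and train only those. A small sketch (the hyperparameter names and values below are illustrative, not the assignment's):

```python
import itertools
import random

def sample_configs(grid, budget, seed=0):
    # Enumerate the full cartesian grid, then randomly keep `budget` configs.
    keys = sorted(grid)
    all_cfgs = [dict(zip(keys, vals))
                for vals in itertools.product(*(grid[k] for k in keys))]
    return random.Random(seed).sample(all_cfgs, min(budget, len(all_cfgs)))

grid = {"alpha": [0.25, 0.5], "gamma": [1.0, 2.0], "lr": [1e-5, 1e-4]}
configs = sample_configs(grid, budget=3)  # train only 3 of the 8 combinations
```

Another way to stretch the quota is to evaluate each sampled config for only a few epochs first, then train just the best one or two to completion (a successive-halving-style schedule).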

A question about focal loss

In losses.py, we're required to finish the Focal Loss part from line 107 to line 117.

focal_weight = "?"

bce = "?"

cls_loss = focal_weight * bce

However, I'm a bit confused about bce, since we are handling multiple classes. Are we supposed to use CCE (categorical cross-entropy) instead?

Another question is about evaluation: since the evaluation metric (the AP score) is wrapped inside COCOeval, I can't see it very clearly. Is the model's performance on category classification taken into consideration in the evaluation? Also, does the model's ability to correctly classify the category influence its performance at correctly localizing the object?

Thanks a lot!!
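For context, RetinaNet's focal loss (Lin et al., 2017) treats each class as an independent binary problem, so it uses per-class binary cross-entropy with a modulating factor rather than a softmax CCE. A scalar sketch for a single anchor/class pair (not the assignment's tensorized code; the alpha and gamma defaults follow the paper):

```python
import math

def focal_loss_term(p, target, alpha=0.25, gamma=2.0):
    # p: predicted probability for this class; target: 1.0 (positive) or 0.0.
    p_t = p if target == 1.0 else 1.0 - p          # prob of the true outcome
    alpha_t = alpha if target == 1.0 else 1.0 - alpha
    bce = -math.log(max(p_t, 1e-12))               # binary cross-entropy
    focal_weight = alpha_t * (1.0 - p_t) ** gamma  # down-weights easy examples
    return focal_weight * bce
```

The modulating factor is what makes it "focal": confidently correct predictions contribute almost nothing, so the loss focuses on hard examples.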

FileNotFoundError: [Errno 2] No such file or directory: "./data\\'val'.json"

When I tried to run
(retinanet) C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1>python test.py --coco_path ./data --checkpoint_path ./output/model_final.pt --depth 50 --set_name 'val'
the following error appeared. I guess it is related to the string formatting of the file location (there is actually a val.json inside the data directory). What should I do?
loading annotations into memory...
Traceback (most recent call last):
  File "test.py", line 68, in <module>
    main()
  File "test.py", line 39, in main
    dataset_test = CocoDataset(parser.coco_path, set_name=parser.set_name,
  File "C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1\retinanet\dataloader.py", line 37, in __init__
    self.coco = COCO(os.path.join(self.root_dir, self.set_name + '.json'))
  File "C:\ProgramData\Anaconda3\envs\retinanet\lib\site-packages\pycocotools\coco.py", line 81, in __init__
    with open(annotation_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: "./data\\'val'.json"
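The quotes surviving inside the path are the clue: unlike POSIX shells, cmd.exe does not strip single quotes, so argparse receives the literal string 'val' and the quotes end up in the filename. A small sketch of what the script then builds (joining with os.path.join, as dataloader.py does):

```python
import os

set_name = "'val'"  # what --set_name 'val' delivers under cmd.exe
bad_path = os.path.join("./data", set_name + ".json")
print(bad_path)  # the quotes are baked into the filename, so open() fails
```

Dropping the quotes (--set_name val) should make the script look for ./data\val.json instead.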

files organization

After we run
git clone https://github.com/ilnehc96/HKU-DASC7606-A1.git
cd HKU-DASC7606-A1
and are told 'After downloading and extracting the dataset, you should put all files following the structure below:', may I ask several questions about ambiguities in the structure?

Firstly, inside the outermost 'HKU-DASC7606-A1', besides the required retinanet, train.py, test.py, test_submission.py, vis.py and README.md, there should also be two directories, namely 'output' and 'sources' (the latter with a vis.png inside).

Secondly, the directory 'image' inside 'data' is actually named 'images' in the original trainval Google Drive given to us. Should we rename 'images' to 'image'?

Lastly, from the instruction 'If your student id is 12345, then the compressed file for submission on Moodle should be organized as follows:', it seems that the final submission's file structure is very different from the given files. May I ask what the respective functions and contents of the (optional) README.md, (optional) source code for bonus, (optional) bonus_model_link.txt, (optional bonus) bonus_test_bbox_results.json, the source code, and model_link.txt should be? I know we are required to improve the baseline model with our own configuration, but I don't know where I should put the modifications, or how the newly created files to be submitted differ from one another.

For 'All the completed codes.', should we put the whole given 'HKU-DASC7606-A1' directory in the same location as report.pdf, model_link.txt, test_bbox_results.json, ... and zip them together into 3035691248.zip/.tar/.tar.gz?

What does 'Models, in the format of model checkpoint link (model_link.txt) due to the limitation on submission file size.' mean? (Isn't the model defined as 'class ResNet(nn.Module):' ... inside the original model.py? How can we submit a model through a txt file instead of defining it as a class inside the original Python file?)

After producing test_bbox_results.json, may we reproduce bonus_test_bbox_results.json simply by editing the code in the original Python files and then renaming the resulting test_bbox_results.json to bonus_test_bbox_results.json?
