HKU DASC 7606 Assignment 1 (Computer Vision), 2023-24 Spring
I am confused that the validation loss hovers at some level (0.6-0.75), but the mAP keeps growing (from 0.3 to 0.48). I am not good at CV and didn't change the model structure. If I were doing an NLP or RL task, I would conclude from the validation loss that the model is already overfitting, but after checking the mAP I was quite surprised.
Is that because the model doesn't tackle the difficult images and only handles the easier ones?
I found that the 'test_bbox_results.json' file is referenced in the test_submission.py test file. At what stage is this file generated? Is it generated automatically by the program, or does it need to be generated manually?
# load results in COCO evaluation tool
coco_true = dataset_test.coco
coco_pred = coco_true.loadRes('test_bbox_results.json')
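For reference, the snippet above follows the standard pycocotools flow: loadRes wraps a detection results file into a COCO results object. A minimal sketch of the evaluation that typically comes next (variable names beyond those in the snippet are my own):

from pycocotools.cocoeval import COCOeval

# score the loaded predictions against the ground truth on the bbox task
coco_eval = COCOeval(coco_true, coco_pred, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints the AP/AR table, including mAP@[.5:.95]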
I'm confused by some code in losses.py, at line 78.
IoU = calc_iou(anchors[0, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations
If I understand correctly, the first dimension of anchors represents different examples in the batch, and bbox_annotation is the j-th example in the batch. Then wouldn't it make more sense to be
IoU = calc_iou(anchors[j, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations
?
For the code of calc_iou, which is the part we need to complete, we're required to calculate the IoU between two boxes. Here, though, we are calculating the IoU between num_anchors anchors and num_annotations annotations, so are a and b in the function calc_iou batches of boxes?
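As far as I can tell, yes: a and b would be (N, 4) and (M, 4) tensors of boxes in (x1, y1, x2, y2) format, and the function returns an N x M matrix of pairwise IoUs via broadcasting. A minimal sketch of what such a function could look like (not necessarily the intended solution):

import torch

def calc_iou(a, b):
    # a: (N, 4) anchors, b: (M, 4) annotations, both as (x1, y1, x2, y2)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])  # (N,)
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])  # (M,)
    # pairwise intersection via broadcasting: (N, 1) against (1, M) gives (N, M)
    iw = torch.clamp(torch.min(a[:, None, 2], b[None, :, 2]) - torch.max(a[:, None, 0], b[None, :, 0]), min=0)
    ih = torch.clamp(torch.min(a[:, None, 3], b[None, :, 3]) - torch.max(a[:, None, 1], b[None, :, 1]), min=0)
    intersection = iw * ih
    union = area_a[:, None] + area_b[None, :] - intersection
    return intersection / torch.clamp(union, min=1e-8)  # (N, M) IoU matrix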
Hi TAs,
The assignment description says "Models, in the format of model checkpoint link (model_link.txt)". Does this mean we need to upload our final model to Google Drive and put the link in the txt file, just like the train/test datasets?
Thanks!
I have trained my model, but it is too large to upload. Which tool or method should I use to upload the model?
I have added the CUDA path to ~/.bashrc and entered the GPU-interactive mode. However, during model training the CPU is being used, leading to extremely slow training. This issue has occurred consistently since the initial successful training run, which took approximately 1 hour under default settings.
Is it because I installed TensorBoard, which modified the PyTorch version?
I ran
conda install pytorch torchvision -c pytorch
pip install tensorboard
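One quick way to check whether those install steps pulled in a CPU-only PyTorch build (just a diagnostic sketch):

import torch

print(torch.__version__)          # a '+cpu' suffix suggests a CPU-only wheel
print(torch.version.cuda)         # None for CPU-only builds
print(torch.cuda.is_available())  # must be True for GPU training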
I'm encountering an issue when running the command "python train.py…". Each time I attempt to execute this script, the process gets terminated unexpectedly with a "Killed" message. Has anyone encountered a similar issue, or does anyone have suggestions on how to resolve this?
In the dataloader.py file, we're required to transform the bbox annotation, but I'm not sure about the meaning of the representation. Does (x, y), i.e. (x1, y1), mean the top-left corner of the bbox and (x2, y2) the bottom-right corner, which would imply that h is a negative number? Is this a typical convention for bbox annotation?
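For what it's worth, COCO annotations store a box as (x, y, w, h), where (x, y) is the top-left corner and w, h are both positive; since image coordinates grow downward, the bottom-right corner is at (x + w, y + h), so h never needs to be negative. A minimal sketch of the usual conversion to corner format, assuming that is what dataloader.py expects (bbox is a hypothetical variable holding one annotation):

# COCO format: [x, y, w, h], (x, y) = top-left corner, w >= 0, h >= 0
x, y, w, h = bbox
x1, y1 = x, y          # top-left corner
x2, y2 = x + w, y + h  # bottom-right corner (y2 > y1 because y points down)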
There are a lot of hyperparameters in the RetinaNet model, such as …
It'd be appreciated if you could give some suggestions or references!
When I opened
http://localhost:8888/?token=e8516d502ae2d327646f6d932275ebf458f41f4805c2cf4c
or http://127.0.0.1:8888/?token=e8516d502ae2d327646f6d932275ebf458f41f4805c2cf4c
on my local device, it asked me for a token or password and I could not get in. This bug has been happening occasionally.
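In case it helps, a common workaround (assuming a standard Jupyter Notebook server) is to list the running servers on the machine where Jupyter was launched and copy the current token from the printed URL:

jupyter notebook list   # or 'jupyter server list' on newer versions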
In losses.py, we're required to finish the Focal Loss part, from line 107 to line 117.
focal_weight = "?"
bce = "?"
cls_loss = focal_weight * bce
However, I'm a little bit confused about bce, since we are handling multiple classes. Are we supposed to use CCE (categorical cross-entropy) instead?
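For context, RetinaNet treats classification as K independent per-class sigmoid outputs trained with binary cross-entropy, rather than a softmax trained with CCE, so bce here is element-wise BCE. A minimal sketch of the focal-loss term under that reading (classification and targets are assumed names for the per-anchor sigmoid probabilities and the 0/1 labels):

import torch

alpha, gamma = 0.25, 2.0  # the standard RetinaNet settings
# p_t: probability the model assigns to the true label of each entry
p_t = targets * classification + (1.0 - targets) * (1.0 - classification)
alpha_factor = targets * alpha + (1.0 - targets) * (1.0 - alpha)
focal_weight = alpha_factor * (1.0 - p_t) ** gamma
bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))
cls_loss = focal_weight * bce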
Another question is about evaluation: since the evaluation metric (the AP score) is wrapped inside COCOeval, I can't see it very clearly. Is the model's performance on category classification taken into consideration in the evaluation? And does the model's ability to correctly classify the category influence the measured performance of correctly localizing the object?
Many thanks!!
When I tried to run
(retinanet) C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1>python test.py --coco_path ./data --checkpoint_path ./output/model_final.pt --depth 50 --set_name 'val'
the following error appeared; I guess it is related to the string formatting of the file location. (There is actually a val.json inside the data directory.) What should I do?
loading annotations into memory...
Traceback (most recent call last):
  File "test.py", line 68, in <module>
    main()
  File "test.py", line 39, in main
    dataset_test = CocoDataset(parser.coco_path, set_name=parser.set_name,
  File "C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1\retinanet\dataloader.py", line 37, in __init__
    self.coco = COCO(os.path.join(self.root_dir, self.set_name + '.json'))
  File "C:\ProgramData\Anaconda3\envs\retinanet\lib\site-packages\pycocotools\coco.py", line 81, in __init__
    with open(annotation_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: "./data\'val'.json"
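If I read the error correctly, the Windows shell is passing the single quotes through literally, so the code looks for a file named 'val'.json (quotes included). Running the same command without the quotes around the set name should fix it:

python test.py --coco_path ./data --checkpoint_path ./output/model_final.pt --depth 50 --set_name val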
After we git clone https://github.com/ilnehc96/HKU-DASC7606-A1.git and cd HKU-DASC7606-A1, we are told: 'After downloading and extracting the dataset, you should put all files following the structure below:'. May I ask several questions regarding ambiguities in that structure?
Firstly, inside the outermost 'HKU-DASC7606-A1', besides the required retinanet, train.py, test.py, test_submission.py, vis.py and README.md, there should also be two directories, namely 'output' and 'sources' (the latter with a vis.png inside).
Secondly, the directory 'image' inside 'data' is actually named 'images' in the original trainval Google Drive folder given to us. Should we rename 'images' to 'image'?
Lastly, from the instruction 'If your student id is 12345, then the compressed file for submission on Moodle should be organized as follows:', it seems that our final submission format and file structure is very different from the given files.
May I ask what functions and content we should put, correspondingly, in the (optional) README.md, the (optional) source code for bonus, the (optional) bonus_model_link.txt, the (optional bonus) bonus_test_bbox_results.json, your source code, and model_link.txt?
I know we are required to improve the baseline model with our own configuration, but I don't know where I should put the modifications, or what the difference is between the various newly created files to be submitted.
For 'All the completed codes', should we put the whole given 'HKU-DASC7606-A1' directory in the same location as report.pdf, model_link.txt, test_bbox_results.json, ... and zip them together into 3035691248.zip / .tar / .tar.gz?
What does it mean by 'Models, in the format of model checkpoint link (model_link.txt) due to the limitation on submission file size.'? (Aren't models stored as 'class ResNet(nn.Module): ...' inside the original model.py? How can we submit a model via a txt file instead of defining it as a class inside the original Python file?)
After producing test_bbox_results.json, may we produce bonus_test_bbox_results.json simply by editing the code in the original Python files, and then renaming the resulting test_bbox_results.json to bonus_test_bbox_results.json?
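(In case it helps other readers: my understanding is that model_link.txt does not define a model at all; the model class stays in the Python sources, the trained weights are saved as a checkpoint file, and the txt file only holds a download link because the checkpoint is too big for Moodle. A sketch of that flow, assuming the usual torch checkpoint workflow, where retinanet stands in for whatever model object train.py builds:

import torch

# training saves the learned weights as a binary checkpoint, e.g.:
torch.save(retinanet.state_dict(), 'output/model_final.pt')
# upload output/model_final.pt to cloud storage such as Google Drive,
# then paste the shareable download link as the only line of model_link.txt
)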