HKU DASC 7606 Assignment 1 (Computer Vision), 2023-24 Spring
I am confused that the validation loss hovers at some level (0.6-0.75), but the mAP keeps growing (from 0.3 to 0.48). I am not good at CV and didn't change the model structure. If I were doing an NLP or RL task, I would conclude from the validation loss that the model is already overfitting, but after checking the mAP I was quite surprised.
Is that because the model doesn't tackle the difficult images and only handles the easier ones?
I found that the 'test_bbox_results.json' file is referenced in the test_submission.py test file. At what stage is this file generated? Is it generated automatically by the program, or does it need to be generated manually?
# load results in COCO evaluation tool
coco_true = dataset_test.coco
coco_pred = coco_true.loadRes('test_bbox_results.json')
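For reference, the snippet above follows the standard pycocotools flow: loadRes wraps a detection results file into a COCO results object. A minimal sketch of the evaluation that typically comes next (variable names beyond those in the snippet are my own):

from pycocotools.cocoeval import COCOeval

# score the loaded predictions against the ground truth on the bbox task
coco_eval = COCOeval(coco_true, coco_pred, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints the AP/AR table, including mAP@[.5:.95]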
I'm confused by some code in losses.py, at line 78.
IoU = calc_iou(anchors[0, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations
If I understand correctly, the first dimension of anchors represents different examples in the batch, and bbox_annotation is the j-th example in the batch. Then wouldn't it make more sense to be
IoU = calc_iou(anchors[j, :, :], bbox_annotation[:, :4]) # num_anchors x num_annotations
?
For the code of calc_iou, which is the part we need to complete, we're required to calculate the IoU between two boxes. Here, though, we are calculating the IoU between num_anchors anchors and num_annotations annotations, so are a and b in the function calc_iou batches of boxes?
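As far as I can tell, yes: a and b would be (N, 4) and (M, 4) tensors of boxes in (x1, y1, x2, y2) format, and the function returns an N x M matrix of pairwise IoUs via broadcasting. A minimal sketch of what such a function could look like (not necessarily the intended solution):

import torch

def calc_iou(a, b):
    # a: (N, 4) anchors, b: (M, 4) annotations, both as (x1, y1, x2, y2)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])  # (N,)
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])  # (M,)
    # pairwise intersection via broadcasting: (N, 1) against (1, M) gives (N, M)
    iw = torch.clamp(torch.min(a[:, None, 2], b[None, :, 2]) - torch.max(a[:, None, 0], b[None, :, 0]), min=0)
    ih = torch.clamp(torch.min(a[:, None, 3], b[None, :, 3]) - torch.max(a[:, None, 1], b[None, :, 1]), min=0)
    intersection = iw * ih
    union = area_a[:, None] + area_b[None, :] - intersection
    return intersection / torch.clamp(union, min=1e-8)  # (N, M) IoU matrix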
Hi TAs,
The assignment description says "Models, in the format of model checkpoint link (model_link.txt)". Does this mean we need to upload our final model to Google Drive and put the link in the txt file, just like the train/test datasets?
Thanks!
I have trained my model, but it is too large to upload. Which tool or method should I use to upload the model?
I have added the CUDA path to ~/.bashrc and entered the GPU-interactive mode. However, during model training the CPU is being used, leading to extremely slow training. This issue has occurred consistently since the initial successful training run, which took approximately 1 hour under default settings.
Is it because I installed TensorBoard, which modified the PyTorch version?
I ran
conda install pytorch torchvision -c pytorch
pip install tensorboard
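One quick way to check whether those install steps pulled in a CPU-only PyTorch build (just a diagnostic sketch):

import torch

print(torch.__version__)          # a '+cpu' suffix suggests a CPU-only wheel
print(torch.version.cuda)         # None for CPU-only builds
print(torch.cuda.is_available())  # must be True for GPU training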
I'm encountering an issue when running the command "python train.py…". Each time I attempt to execute this script, the process gets terminated unexpectedly with a "Killed" message. Has anyone encountered a similar issue, or does anyone have suggestions on how to resolve this?
In the dataloader.py file, we're required to transform the bbox annotation, but I'm not sure about the meaning of the representation. Does (x, y), i.e. (x1, y1), mean the top-left corner of the bbox and (x2, y2) the bottom-right corner, which would imply that h is a negative number? Is this a typical convention for bbox annotation?
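For what it's worth, COCO annotations store a box as (x, y, w, h), where (x, y) is the top-left corner and w, h are both positive; since image coordinates grow downward, the bottom-right corner is at (x + w, y + h), so h never needs to be negative. A minimal sketch of the usual conversion to corner format, assuming that is what dataloader.py expects (bbox is a hypothetical variable holding one annotation):

# COCO format: [x, y, w, h], (x, y) = top-left corner, w >= 0, h >= 0
x, y, w, h = bbox
x1, y1 = x, y          # top-left corner
x2, y2 = x + w, y + h  # bottom-right corner (y2 > y1 because y points down)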
There are a lot of hyperparameters in the RetinaNet model, such as …
It'd be appreciated if you could give some suggestions or references!
When I opened
http://localhost:8888/?token=e8516d502ae2d327646f6d932275ebf458f41f4805c2cf4c
or http://127.0.0.1:8888/?token=e8516d502ae2d327646f6d932275ebf458f41f4805c2cf4c
on my local device, it asked me for a token or password and I could not get in. This bug has been happening occasionally.
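In case it helps, a common workaround (assuming a standard Jupyter Notebook server) is to list the running servers on the machine where Jupyter was launched and copy the current token from the printed URL:

jupyter notebook list   # or 'jupyter server list' on newer versions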
In losses.py, we're required to finish the Focal Loss part, from line 107 to line 117.
focal_weight = "?"
bce = "?"
cls_loss = focal_weight * bce
However, I'm a little bit confused about bce, since we are handling multiple classes. Are we supposed to use CCE (categorical cross-entropy) instead?
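For context, RetinaNet treats classification as K independent per-class sigmoid outputs trained with binary cross-entropy, rather than a softmax trained with CCE, so bce here is element-wise BCE. A minimal sketch of the focal-loss term under that reading (classification and targets are assumed names for the per-anchor sigmoid probabilities and the 0/1 labels):

import torch

alpha, gamma = 0.25, 2.0  # the standard RetinaNet settings
# p_t: probability the model assigns to the true label of each entry
p_t = targets * classification + (1.0 - targets) * (1.0 - classification)
alpha_factor = targets * alpha + (1.0 - targets) * (1.0 - alpha)
focal_weight = alpha_factor * (1.0 - p_t) ** gamma
bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))
cls_loss = focal_weight * bce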
Another question is about evaluation: since the evaluation metric (the AP score) is wrapped inside COCOeval, I can't see it very clearly. Is the model's performance on category classification taken into consideration in the evaluation? And does the model's ability to correctly classify the category influence the measured performance of correctly localizing the object?
Many thanks!!
When I tried to run
(retinanet) C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1>python test.py --coco_path ./data --checkpoint_path ./output/model_final.pt --depth 50 --set_name 'val'
the following error appeared; I guess it is related to the string formatting of the file location. (There is actually a val.json inside the data directory.) What should I do?
loading annotations into memory...
Traceback (most recent call last):
  File "test.py", line 68, in <module>
    main()
  File "test.py", line 39, in main
    dataset_test = CocoDataset(parser.coco_path, set_name=parser.set_name,
  File "C:\Users\user\Desktop\academics\DASC7606\HKU-DASC7606-A1\retinanet\dataloader.py", line 37, in __init__
    self.coco = COCO(os.path.join(self.root_dir, self.set_name + '.json'))
  File "C:\ProgramData\Anaconda3\envs\retinanet\lib\site-packages\pycocotools\coco.py", line 81, in __init__
    with open(annotation_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: "./data\'val'.json"
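If I read the error correctly, the Windows shell is passing the single quotes through literally, so the code looks for a file named 'val'.json (quotes included). Running the same command without the quotes around the set name should fix it:

python test.py --coco_path ./data --checkpoint_path ./output/model_final.pt --depth 50 --set_name val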
After we git clone https://github.com/ilnehc96/HKU-DASC7606-A1.git and cd HKU-DASC7606-A1, we are told: 'After downloading and extracting the dataset, you should put all files following the structure below:'. May I ask several questions regarding ambiguities in that structure?
Firstly, inside the outermost 'HKU-DASC7606-A1', besides the required retinanet, train.py, test.py, test_submission.py, vis.py and README.md, there should also be two directories, namely 'output' and 'sources' (the latter with a vis.png inside).
Secondly, the directory 'image' inside 'data' is actually named 'images' in the original trainval Google Drive folder given to us. Should we rename 'images' to 'image'?
Lastly, from the instruction 'If your student id is 12345, then the compressed file for submission on Moodle should be organized as follows:', it seems that our final submission format and file structure is very different from the given files.
May I ask what functions and content we should put, correspondingly, in the (optional) README.md, the (optional) source code for bonus, the (optional) bonus_model_link.txt, the (optional bonus) bonus_test_bbox_results.json, your source code, and model_link.txt?
I know we are required to improve the baseline model with our own configuration, but I don't know where I should put the modifications, or what the difference is between the various newly created files to be submitted.
For 'All the completed codes', should we put the whole given 'HKU-DASC7606-A1' directory in the same location as report.pdf, model_link.txt, test_bbox_results.json, ... and zip them together into 3035691248.zip / .tar / .tar.gz?
What does it mean by 'Models, in the format of model checkpoint link (model_link.txt) due to the limitation on submission file size.'? (Aren't models stored as 'class ResNet(nn.Module): ...' inside the original model.py? How can we submit a model via a txt file instead of defining it as a class inside the original Python file?)
After producing test_bbox_results.json, may we produce bonus_test_bbox_results.json simply by editing the code in the original Python files, and then renaming the resulting test_bbox_results.json to bonus_test_bbox_results.json?
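(In case it helps other readers: my understanding is that model_link.txt does not define a model at all; the model class stays in the Python sources, the trained weights are saved as a checkpoint file, and the txt file only holds a download link because the checkpoint is too big for Moodle. A sketch of that flow, assuming the usual torch checkpoint workflow, where retinanet stands in for whatever model object train.py builds:

import torch

# training saves the learned weights as a binary checkpoint, e.g.:
torch.save(retinanet.state_dict(), 'output/model_final.pt')
# upload output/model_final.pt to cloud storage such as Google Drive,
# then paste the shareable download link as the only line of model_link.txt
)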