benathi / fastswa-semi-sup
Improving Consistency-Based Semi-Supervised Learning with Weight Averaging
I read main.py, which is quite lengthy. I saw many places where the swa_nets are instantiated and updated from the main model, but I couldn't see anywhere in the code where the swa_nets make any adjustment to the main model during training.
Am I missing something here?
As I understand it, the mean teacher model keeps an exponential moving average of the weights at every training step and adds a consistency loss, while SWA only averages the weights across epochs, after the cyclic changes of the learning rate.
Am I correct?
Thanks for your work!
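For what it's worth, the two averaging schemes can be sketched roughly like this (a minimal illustration, not the repo's actual code; `ema_update`, `swa_update`, and the model names are made up for the example):

```python
import copy

import torch
import torch.nn as nn


def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999) -> None:
    """Mean Teacher style: exponential moving average of the student, every step."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.data.mul_(alpha).add_(s_p.data, alpha=1 - alpha)


def swa_update(swa_model: nn.Module, model: nn.Module, n_averaged: int) -> None:
    """SWA style: equal-weight running mean, collected only at chosen epochs."""
    for a_p, p in zip(swa_model.parameters(), model.parameters()):
        a_p.data.mul_(n_averaged / (n_averaged + 1)).add_(p.data / (n_averaged + 1))


# The EMA teacher feeds back into training through the consistency loss,
# whereas the SWA average is typically evaluation-only and never adjusts
# the main model -- which would explain what you observed in the code.
model = nn.Linear(4, 2)
teacher = copy.deepcopy(model)
swa_model = copy.deepcopy(model)
ema_update(teacher, model)
swa_update(swa_model, model, n_averaged=1)
```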
When I run cifar10_mt_cnn_short_n4k.py, the test accuracy is 0. Please help me.
Epoch: [10][0/479] Time 0.587 (0.587) Data 0.172 (0.172) Class 0.2557 (0.2557) Cons 0.5538 (0.5538) Prec@1 63.000 (63.000) Prec@5 90.000 (90.000)
Epoch: [10][10/479] Time 0.761 (0.746) Data 0.000 (0.047) Class 0.1798 (0.1807) Cons 0.3751 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][20/479] Time 0.760 (0.753) Data 0.000 (0.041) Class 0.1572 (0.1806) Cons 0.3082 (0.4412) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][30/479] Time 0.761 (0.756) Data 0.000 (0.039) Class 0.1211 (0.1828) Cons 0.5274 (0.4509) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][40/479] Time 0.762 (0.757) Data 0.000 (0.038) Class 0.1246 (0.1880) Cons 0.4772 (0.4541) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][50/479] Time 0.762 (0.758) Data 0.000 (0.038) Class 0.2377 (0.1924) Cons 0.4411 (0.4581) Prec@1 72.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][60/479] Time 0.761 (0.759) Data 0.000 (0.037) Class 0.1787 (0.1910) Cons 0.4363 (0.4700) Prec@1 75.000 (70.000) Prec@5 93.000 (93.000)
Epoch: [10][70/479] Time 0.760 (0.759) Data 0.000 (0.037) Class 0.1548 (0.1868) Cons 0.3258 (0.4645) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][80/479] Time 0.763 (0.759) Data 0.000 (0.037) Class 0.1739 (0.1863) Cons 0.4593 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][90/479] Time 0.762 (0.759) Data 0.000 (0.036) Class 0.2345 (0.1873) Cons 0.4418 (0.4580) Prec@1 66.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][100/479] Time 0.763 (0.760) Data 0.000 (0.036) Class 0.1874 (0.1891) Cons 0.3540 (0.4589) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][110/479] Time 0.760 (0.760) Data 0.000 (0.036) Class 0.1395 (0.1875) Cons 0.4130 (0.4591) Prec@1 84.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][120/479] Time 0.761 (0.760) Data 0.000 (0.036) Class 0.1363 (0.1885) Cons 0.3363 (0.4606) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][130/479] Time 0.762 (0.760) Data 0.000 (0.036) Class 0.2359 (0.1899) Cons 0.4791 (0.4646) Prec@1 66.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][140/479] Time 0.833 (0.764) Data 0.000 (0.036) Class 0.1649 (0.1883) Cons 0.3886 (0.4649) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][150/479] Time 0.759 (0.766) Data 0.000 (0.036) Class 0.1434 (0.1889) Cons 0.5803 (0.4672) Prec@1 78.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][160/479] Time 0.760 (0.766) Data 0.000 (0.036) Class 0.1944 (0.1891) Cons 0.4242 (0.4648) Prec@1 75.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][170/479] Time 0.760 (0.765) Data 0.000 (0.036) Class 0.2158 (0.1881) Cons 0.5003 (0.4673) Prec@1 72.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][180/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1671 (0.1885) Cons 0.5318 (0.4698) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][190/479] Time 0.759 (0.765) Data 0.000 (0.036) Class 0.1947 (0.1882) Cons 0.4729 (0.4685) Prec@1 72.000 (70.000) Prec@5 87.000 (94.000)
Epoch: [10][200/479] Time 0.761 (0.765) Data 0.000 (0.036) Class 0.2463 (0.1888) Cons 0.5507 (0.4716) Prec@1 60.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][210/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1725 (0.1880) Cons 0.3732 (0.4706) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][220/479] Time 0.759 (0.764) Data 0.000 (0.036) Class 0.1645 (0.1883) Cons 0.5199 (0.4715) Prec@1 75.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][230/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1921 (0.1876) Cons 0.4276 (0.4689) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][240/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1287 (0.1864) Cons 0.4631 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][250/479] Time 0.763 (0.764) Data 0.000 (0.036) Class 0.1925 (0.1863) Cons 0.5085 (0.4710) Prec@1 69.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][260/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.2457 (0.1871) Cons 0.3768 (0.4716) Prec@1 63.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][270/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1760 (0.1857) Cons 0.6240 (0.4730) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][280/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1457 (0.1865) Cons 0.5166 (0.4723) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][290/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1611 (0.1861) Cons 0.3313 (0.4715) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][300/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1698 (0.1861) Cons 0.5398 (0.4724) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][310/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1490 (0.1861) Cons 0.5581 (0.4732) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][320/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1857 (0.1857) Cons 0.4457 (0.4720) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][330/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1504 (0.1848) Cons 0.3339 (0.4715) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][340/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.1625 (0.1853) Cons 0.5476 (0.4709) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][350/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1536 (0.1852) Cons 0.5105 (0.4720) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][360/479] Time 0.759 (0.763) Data 0.000 (0.036) Class 0.1773 (0.1850) Cons 0.5541 (0.4724) Prec@1 66.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][370/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.2138 (0.1846) Cons 0.4766 (0.4709) Prec@1 69.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][380/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1604 (0.1844) Cons 0.5685 (0.4721) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][390/479] Time 0.762 (0.763) Data 0.000 (0.036) Class 0.0984 (0.1842) Cons 0.4666 (0.4716) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][400/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1564 (0.1846) Cons 0.5406 (0.4712) Prec@1 84.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][410/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2175 (0.1840) Cons 0.4788 (0.4713) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][420/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1360 (0.1836) Cons 0.4539 (0.4710) Prec@1 81.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][430/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1436 (0.1836) Cons 0.4669 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][440/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.2510 (0.1835) Cons 0.4651 (0.4703) Prec@1 66.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][450/479] Time 0.762 (0.763) Data 0.000 (0.035) Class 0.1897 (0.1837) Cons 0.5153 (0.4711) Prec@1 72.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][460/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2872 (0.1840) Cons 0.4851 (0.4720) Prec@1 63.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][470/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1542 (0.1840) Cons 0.5021 (0.4736) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
--- training epoch in 365.2766942977905 seconds ---
Evaluating the primary model:
Test: [0/79] Time 0.508 (0.508) Data 0.116 (0.116) Class 0.9628 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [10/79] Time 0.190 (0.218) Data 0.000 (0.011) Class 0.6180 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/79] Time 0.189 (0.205) Data 0.000 (0.006) Class 1.1369 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [30/79] Time 0.189 (0.200) Data 0.000 (0.004) Class 1.4853 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [40/79] Time 0.189 (0.197) Data 0.000 (0.003) Class 0.9658 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [50/79] Time 0.191 (0.196) Data 0.000 (0.002) Class 0.4800 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [60/79] Time 0.189 (0.195) Data 0.000 (0.002) Class 0.9950 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [70/79] Time 0.189 (0.194) Data 0.000 (0.002) Class 0.2665 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Hi,
I notice that the PyTorch version of Mean Teacher has different parameter settings from the TensorFlow version.
For example, the learning rate is 0.1 rather than 0.003, which is what other methods such as temporal ensembling use. The ramp-up and ramp-down epochs also differ from the TensorFlow version. But I notice that learning rate = 0.003, ramp-up epochs = 80, and ramp-down epochs = 50 are used in many other semi-supervised methods.
Does this mean that if we use the parameter settings from the TensorFlow version, we can't achieve good accuracy?
Thanks.
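For reference, the ramp-up/ramp-down you mention is usually a sigmoid ramp-up of the loss weight/learning rate and a cosine ramp-down of the learning rate, along these lines (a sketch only; the exact constants and schedule in this repo may differ):

```python
import math


def sigmoid_rampup(epoch: float, rampup_length: float = 80) -> float:
    """Ramp from ~0 to 1 over the first `rampup_length` epochs via exp(-5 (1-x)^2)."""
    if rampup_length == 0:
        return 1.0
    x = min(max(epoch, 0.0), rampup_length) / rampup_length
    return math.exp(-5.0 * (1.0 - x) ** 2)


def cosine_rampdown(epoch: float, total_epochs: float, rampdown_length: float = 50) -> float:
    """Cosine decay of the LR multiplier from 1 to 0 over the last `rampdown_length` epochs."""
    start = total_epochs - rampdown_length
    if epoch < start:
        return 1.0
    x = (epoch - start) / rampdown_length
    return 0.5 * (math.cos(math.pi * x) + 1.0)


# Example combination, as in Mean Teacher-style training loops:
# lr = base_lr * sigmoid_rampup(epoch) * cosine_rampdown(epoch, total_epochs=180)
```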
Hi,
In the code, the max translation is -4 to 4, but the paper says [-2, 2] in Section A.6, Table 7. Did the experimental results in the paper use the [-4, 4] range, as in the code?
Thanks.
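Whichever range the experiments used, the augmentation itself is just a random pixel shift with border padding; a minimal sketch (illustrative only, not the repo's actual transform class):

```python
import numpy as np


def random_translate(img: np.ndarray, max_shift: int = 2, rng=None) -> np.ndarray:
    """Shift an HxWxC image by up to +/- max_shift pixels, reflect-padding the borders."""
    rng = rng or np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Pad on all sides so any shift in [-max_shift, max_shift] stays in bounds.
    padded = np.pad(
        img,
        ((max_shift, max_shift), (max_shift, max_shift), (0, 0)),
        mode="reflect",
    )
    h, w = img.shape[:2]
    y0, x0 = max_shift + dy, max_shift + dx
    return padded[y0 : y0 + h, x0 : x0 + w]
```

Changing `max_shift` from 2 to 4 doubles the maximum displacement, which is enough to shift CIFAR-10 results slightly, so the discrepancy is worth clarifying.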
Hi,
While trying to run the training scripts, I ran into some issues caused by incompatibilities between the versions of some packages.
Could you please provide the required packages and their versions?
Thank you in advance.