benathi / fastswa-semi-sup
Improving Consistency-Based Semi-Supervised Learning with Weight Averaging
I read main.py, which is quite lengthy. I saw many places where the swa_nets are instantiated and updated from the main model, but I couldn't see anywhere in the code where the swa_nets make any adjustment to the main model during training.
Am I missing something here?
As I understand it, the mean teacher model keeps an exponential moving average of the weights at every training step and adds a consistency loss, while SWA only averages the weights across epochs, after the cyclic changes of the learning rate.
Am I correct?
Thanks for your work!
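For what it's worth, the two averaging schemes can be sketched roughly like this (a minimal illustration, not the repo's actual code; `ema_update`, `swa_update`, and the model names are made up for the example):

```python
import copy

import torch
import torch.nn as nn


def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999) -> None:
    """Mean Teacher style: exponential moving average of the student, every step."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.data.mul_(alpha).add_(s_p.data, alpha=1 - alpha)


def swa_update(swa_model: nn.Module, model: nn.Module, n_averaged: int) -> None:
    """SWA style: equal-weight running mean, collected only at chosen epochs."""
    for a_p, p in zip(swa_model.parameters(), model.parameters()):
        a_p.data.mul_(n_averaged / (n_averaged + 1)).add_(p.data / (n_averaged + 1))


# The EMA teacher feeds back into training through the consistency loss,
# whereas the SWA average is typically evaluation-only and never adjusts
# the main model -- which would explain what you observed in the code.
model = nn.Linear(4, 2)
teacher = copy.deepcopy(model)
swa_model = copy.deepcopy(model)
ema_update(teacher, model)
swa_update(swa_model, model, n_averaged=1)
```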
When I run cifar10_mt_cnn_short_n4k.py, the test accuracy is 0. Please help me.
Epoch: [10][0/479] Time 0.587 (0.587) Data 0.172 (0.172) Class 0.2557 (0.2557) Cons 0.5538 (0.5538) Prec@1 63.000 (63.000) Prec@5 90.000 (90.000)
Epoch: [10][10/479] Time 0.761 (0.746) Data 0.000 (0.047) Class 0.1798 (0.1807) Cons 0.3751 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][20/479] Time 0.760 (0.753) Data 0.000 (0.041) Class 0.1572 (0.1806) Cons 0.3082 (0.4412) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][30/479] Time 0.761 (0.756) Data 0.000 (0.039) Class 0.1211 (0.1828) Cons 0.5274 (0.4509) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][40/479] Time 0.762 (0.757) Data 0.000 (0.038) Class 0.1246 (0.1880) Cons 0.4772 (0.4541) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][50/479] Time 0.762 (0.758) Data 0.000 (0.038) Class 0.2377 (0.1924) Cons 0.4411 (0.4581) Prec@1 72.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][60/479] Time 0.761 (0.759) Data 0.000 (0.037) Class 0.1787 (0.1910) Cons 0.4363 (0.4700) Prec@1 75.000 (70.000) Prec@5 93.000 (93.000)
Epoch: [10][70/479] Time 0.760 (0.759) Data 0.000 (0.037) Class 0.1548 (0.1868) Cons 0.3258 (0.4645) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][80/479] Time 0.763 (0.759) Data 0.000 (0.037) Class 0.1739 (0.1863) Cons 0.4593 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][90/479] Time 0.762 (0.759) Data 0.000 (0.036) Class 0.2345 (0.1873) Cons 0.4418 (0.4580) Prec@1 66.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][100/479] Time 0.763 (0.760) Data 0.000 (0.036) Class 0.1874 (0.1891) Cons 0.3540 (0.4589) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][110/479] Time 0.760 (0.760) Data 0.000 (0.036) Class 0.1395 (0.1875) Cons 0.4130 (0.4591) Prec@1 84.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][120/479] Time 0.761 (0.760) Data 0.000 (0.036) Class 0.1363 (0.1885) Cons 0.3363 (0.4606) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][130/479] Time 0.762 (0.760) Data 0.000 (0.036) Class 0.2359 (0.1899) Cons 0.4791 (0.4646) Prec@1 66.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][140/479] Time 0.833 (0.764) Data 0.000 (0.036) Class 0.1649 (0.1883) Cons 0.3886 (0.4649) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][150/479] Time 0.759 (0.766) Data 0.000 (0.036) Class 0.1434 (0.1889) Cons 0.5803 (0.4672) Prec@1 78.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][160/479] Time 0.760 (0.766) Data 0.000 (0.036) Class 0.1944 (0.1891) Cons 0.4242 (0.4648) Prec@1 75.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][170/479] Time 0.760 (0.765) Data 0.000 (0.036) Class 0.2158 (0.1881) Cons 0.5003 (0.4673) Prec@1 72.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][180/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1671 (0.1885) Cons 0.5318 (0.4698) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][190/479] Time 0.759 (0.765) Data 0.000 (0.036) Class 0.1947 (0.1882) Cons 0.4729 (0.4685) Prec@1 72.000 (70.000) Prec@5 87.000 (94.000)
Epoch: [10][200/479] Time 0.761 (0.765) Data 0.000 (0.036) Class 0.2463 (0.1888) Cons 0.5507 (0.4716) Prec@1 60.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][210/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1725 (0.1880) Cons 0.3732 (0.4706) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][220/479] Time 0.759 (0.764) Data 0.000 (0.036) Class 0.1645 (0.1883) Cons 0.5199 (0.4715) Prec@1 75.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][230/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1921 (0.1876) Cons 0.4276 (0.4689) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][240/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1287 (0.1864) Cons 0.4631 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][250/479] Time 0.763 (0.764) Data 0.000 (0.036) Class 0.1925 (0.1863) Cons 0.5085 (0.4710) Prec@1 69.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][260/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.2457 (0.1871) Cons 0.3768 (0.4716) Prec@1 63.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][270/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1760 (0.1857) Cons 0.6240 (0.4730) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][280/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1457 (0.1865) Cons 0.5166 (0.4723) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][290/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1611 (0.1861) Cons 0.3313 (0.4715) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][300/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1698 (0.1861) Cons 0.5398 (0.4724) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][310/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1490 (0.1861) Cons 0.5581 (0.4732) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][320/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1857 (0.1857) Cons 0.4457 (0.4720) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][330/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1504 (0.1848) Cons 0.3339 (0.4715) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][340/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.1625 (0.1853) Cons 0.5476 (0.4709) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][350/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1536 (0.1852) Cons 0.5105 (0.4720) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][360/479] Time 0.759 (0.763) Data 0.000 (0.036) Class 0.1773 (0.1850) Cons 0.5541 (0.4724) Prec@1 66.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][370/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.2138 (0.1846) Cons 0.4766 (0.4709) Prec@1 69.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][380/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1604 (0.1844) Cons 0.5685 (0.4721) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][390/479] Time 0.762 (0.763) Data 0.000 (0.036) Class 0.0984 (0.1842) Cons 0.4666 (0.4716) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][400/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1564 (0.1846) Cons 0.5406 (0.4712) Prec@1 84.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][410/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2175 (0.1840) Cons 0.4788 (0.4713) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][420/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1360 (0.1836) Cons 0.4539 (0.4710) Prec@1 81.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][430/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1436 (0.1836) Cons 0.4669 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][440/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.2510 (0.1835) Cons 0.4651 (0.4703) Prec@1 66.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][450/479] Time 0.762 (0.763) Data 0.000 (0.035) Class 0.1897 (0.1837) Cons 0.5153 (0.4711) Prec@1 72.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][460/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2872 (0.1840) Cons 0.4851 (0.4720) Prec@1 63.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][470/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1542 (0.1840) Cons 0.5021 (0.4736) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
--- training epoch in 365.2766942977905 seconds ---
Evaluating the primary model:
Test: [0/79] Time 0.508 (0.508) Data 0.116 (0.116) Class 0.9628 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [10/79] Time 0.190 (0.218) Data 0.000 (0.011) Class 0.6180 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/79] Time 0.189 (0.205) Data 0.000 (0.006) Class 1.1369 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [30/79] Time 0.189 (0.200) Data 0.000 (0.004) Class 1.4853 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [40/79] Time 0.189 (0.197) Data 0.000 (0.003) Class 0.9658 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [50/79] Time 0.191 (0.196) Data 0.000 (0.002) Class 0.4800 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [60/79] Time 0.189 (0.195) Data 0.000 (0.002) Class 0.9950 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [70/79] Time 0.189 (0.194) Data 0.000 (0.002) Class 0.2665 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Hi,
I notice that the PyTorch version of Mean Teacher has different parameter settings from the TensorFlow version.
For example, the learning rate is 0.1 rather than 0.003, which is what other methods such as temporal ensembling use. The ramp-up and ramp-down epochs also differ from the TensorFlow version. But I notice that learning rate = 0.003, ramp-up epochs = 80, and ramp-down epochs = 50 are used in many other semi-supervised methods.
Does this mean that if we use the parameter settings from the TensorFlow version, we can't achieve good accuracy?
Thanks.
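For reference, the ramp-up/ramp-down you mention is usually a sigmoid ramp-up of the loss weight/learning rate and a cosine ramp-down of the learning rate, along these lines (a sketch only; the exact constants and schedule in this repo may differ):

```python
import math


def sigmoid_rampup(epoch: float, rampup_length: float = 80) -> float:
    """Ramp from ~0 to 1 over the first `rampup_length` epochs via exp(-5 (1-x)^2)."""
    if rampup_length == 0:
        return 1.0
    x = min(max(epoch, 0.0), rampup_length) / rampup_length
    return math.exp(-5.0 * (1.0 - x) ** 2)


def cosine_rampdown(epoch: float, total_epochs: float, rampdown_length: float = 50) -> float:
    """Cosine decay of the LR multiplier from 1 to 0 over the last `rampdown_length` epochs."""
    start = total_epochs - rampdown_length
    if epoch < start:
        return 1.0
    x = (epoch - start) / rampdown_length
    return 0.5 * (math.cos(math.pi * x) + 1.0)


# Example combination, as in Mean Teacher-style training loops:
# lr = base_lr * sigmoid_rampup(epoch) * cosine_rampdown(epoch, total_epochs=180)
```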
Hi,
In the code, the max translation is -4 to 4, but the paper says [-2, 2] in Section A.6, Table 7. Did the experimental results in the paper use the [-4, 4] range, as in the code?
Thanks.
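Whichever range the experiments used, the augmentation itself is just a random pixel shift with border padding; a minimal sketch (illustrative only, not the repo's actual transform class):

```python
import numpy as np


def random_translate(img: np.ndarray, max_shift: int = 2, rng=None) -> np.ndarray:
    """Shift an HxWxC image by up to +/- max_shift pixels, reflect-padding the borders."""
    rng = rng or np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Pad on all sides so any shift in [-max_shift, max_shift] stays in bounds.
    padded = np.pad(
        img,
        ((max_shift, max_shift), (max_shift, max_shift), (0, 0)),
        mode="reflect",
    )
    h, w = img.shape[:2]
    y0, x0 = max_shift + dy, max_shift + dx
    return padded[y0 : y0 + h, x0 : x0 + w]
```

Changing `max_shift` from 2 to 4 doubles the maximum displacement, which is enough to shift CIFAR-10 results slightly, so the discrepancy is worth clarifying.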
Hi,
While trying to run the training scripts, I ran into some issues caused by incompatibilities between the versions of some packages.
Could you please provide the required packages and their versions?
Thank you in advance.