
fastswa-semi-sup's People

Contributors

andrewgordonwilson, benathi


fastswa-semi-sup's Issues

The main difference between mean_teacher and swa

As I understand it, the Mean Teacher model averages the weights with an exponential moving average at every training step and adds a consistency loss, while SWA averages the weights only across epochs, after the cyclic changes of the learning rate.

Am I correct?
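For what it's worth, that is the usual way to contrast the two schemes. A minimal sketch of the distinction, with weights represented as flat lists of floats rather than the repo's actual PyTorch modules (the function names here are illustrative, not this codebase's API):

```python
def ema_update(teacher, student, alpha=0.999):
    """Mean Teacher: exponential moving average of the student weights,
    applied after *every* training step (plus a consistency loss on the
    predictions, not shown here)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def swa_update(swa_weights, weights, n_averaged):
    """SWA: equal-weight running average over snapshots, applied only at
    selected epochs (e.g. at the end of each learning-rate cycle).
    `n_averaged` is how many snapshots are already in the average."""
    k = n_averaged
    return [(sw * k + w) / (k + 1) for sw, w in zip(swa_weights, weights)]
```

So the two differ in both the averaging rule (exponential vs. equal-weight) and the cadence (per step vs. per selected epoch).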

Test Acc is 0.

Thanks for your work!
When I run cifar10_mt_cnn_short_n4k.py, the test accuracy is 0. Please help me.

Epoch: [10][0/479] Time 0.587 (0.587) Data 0.172 (0.172) Class 0.2557 (0.2557) Cons 0.5538 (0.5538) Prec@1 63.000 (63.000) Prec@5 90.000 (90.000)
Epoch: [10][10/479] Time 0.761 (0.746) Data 0.000 (0.047) Class 0.1798 (0.1807) Cons 0.3751 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][20/479] Time 0.760 (0.753) Data 0.000 (0.041) Class 0.1572 (0.1806) Cons 0.3082 (0.4412) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][30/479] Time 0.761 (0.756) Data 0.000 (0.039) Class 0.1211 (0.1828) Cons 0.5274 (0.4509) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][40/479] Time 0.762 (0.757) Data 0.000 (0.038) Class 0.1246 (0.1880) Cons 0.4772 (0.4541) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][50/479] Time 0.762 (0.758) Data 0.000 (0.038) Class 0.2377 (0.1924) Cons 0.4411 (0.4581) Prec@1 72.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][60/479] Time 0.761 (0.759) Data 0.000 (0.037) Class 0.1787 (0.1910) Cons 0.4363 (0.4700) Prec@1 75.000 (70.000) Prec@5 93.000 (93.000)
Epoch: [10][70/479] Time 0.760 (0.759) Data 0.000 (0.037) Class 0.1548 (0.1868) Cons 0.3258 (0.4645) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][80/479] Time 0.763 (0.759) Data 0.000 (0.037) Class 0.1739 (0.1863) Cons 0.4593 (0.4601) Prec@1 72.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][90/479] Time 0.762 (0.759) Data 0.000 (0.036) Class 0.2345 (0.1873) Cons 0.4418 (0.4580) Prec@1 66.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][100/479] Time 0.763 (0.760) Data 0.000 (0.036) Class 0.1874 (0.1891) Cons 0.3540 (0.4589) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][110/479] Time 0.760 (0.760) Data 0.000 (0.036) Class 0.1395 (0.1875) Cons 0.4130 (0.4591) Prec@1 84.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][120/479] Time 0.761 (0.760) Data 0.000 (0.036) Class 0.1363 (0.1885) Cons 0.3363 (0.4606) Prec@1 81.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][130/479] Time 0.762 (0.760) Data 0.000 (0.036) Class 0.2359 (0.1899) Cons 0.4791 (0.4646) Prec@1 66.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][140/479] Time 0.833 (0.764) Data 0.000 (0.036) Class 0.1649 (0.1883) Cons 0.3886 (0.4649) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][150/479] Time 0.759 (0.766) Data 0.000 (0.036) Class 0.1434 (0.1889) Cons 0.5803 (0.4672) Prec@1 78.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][160/479] Time 0.760 (0.766) Data 0.000 (0.036) Class 0.1944 (0.1891) Cons 0.4242 (0.4648) Prec@1 75.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][170/479] Time 0.760 (0.765) Data 0.000 (0.036) Class 0.2158 (0.1881) Cons 0.5003 (0.4673) Prec@1 72.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][180/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1671 (0.1885) Cons 0.5318 (0.4698) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][190/479] Time 0.759 (0.765) Data 0.000 (0.036) Class 0.1947 (0.1882) Cons 0.4729 (0.4685) Prec@1 72.000 (70.000) Prec@5 87.000 (94.000)
Epoch: [10][200/479] Time 0.761 (0.765) Data 0.000 (0.036) Class 0.2463 (0.1888) Cons 0.5507 (0.4716) Prec@1 60.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][210/479] Time 0.762 (0.765) Data 0.000 (0.036) Class 0.1725 (0.1880) Cons 0.3732 (0.4706) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][220/479] Time 0.759 (0.764) Data 0.000 (0.036) Class 0.1645 (0.1883) Cons 0.5199 (0.4715) Prec@1 75.000 (70.000) Prec@5 90.000 (94.000)
Epoch: [10][230/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1921 (0.1876) Cons 0.4276 (0.4689) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][240/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1287 (0.1864) Cons 0.4631 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][250/479] Time 0.763 (0.764) Data 0.000 (0.036) Class 0.1925 (0.1863) Cons 0.5085 (0.4710) Prec@1 69.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][260/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.2457 (0.1871) Cons 0.3768 (0.4716) Prec@1 63.000 (70.000) Prec@5 93.000 (94.000)
Epoch: [10][270/479] Time 0.761 (0.764) Data 0.000 (0.036) Class 0.1760 (0.1857) Cons 0.6240 (0.4730) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][280/479] Time 0.760 (0.764) Data 0.000 (0.036) Class 0.1457 (0.1865) Cons 0.5166 (0.4723) Prec@1 75.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][290/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1611 (0.1861) Cons 0.3313 (0.4715) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][300/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1698 (0.1861) Cons 0.5398 (0.4724) Prec@1 69.000 (70.000) Prec@5 96.000 (94.000)
Epoch: [10][310/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1490 (0.1861) Cons 0.5581 (0.4732) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][320/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1857 (0.1857) Cons 0.4457 (0.4720) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][330/479] Time 0.763 (0.763) Data 0.000 (0.035) Class 0.1504 (0.1848) Cons 0.3339 (0.4715) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][340/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.1625 (0.1853) Cons 0.5476 (0.4709) Prec@1 75.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][350/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1536 (0.1852) Cons 0.5105 (0.4720) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][360/479] Time 0.759 (0.763) Data 0.000 (0.036) Class 0.1773 (0.1850) Cons 0.5541 (0.4724) Prec@1 66.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][370/479] Time 0.760 (0.763) Data 0.000 (0.036) Class 0.2138 (0.1846) Cons 0.4766 (0.4709) Prec@1 69.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][380/479] Time 0.761 (0.763) Data 0.000 (0.036) Class 0.1604 (0.1844) Cons 0.5685 (0.4721) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][390/479] Time 0.762 (0.763) Data 0.000 (0.036) Class 0.0984 (0.1842) Cons 0.4666 (0.4716) Prec@1 81.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][400/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1564 (0.1846) Cons 0.5406 (0.4712) Prec@1 84.000 (71.000) Prec@5 90.000 (94.000)
Epoch: [10][410/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2175 (0.1840) Cons 0.4788 (0.4713) Prec@1 69.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][420/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1360 (0.1836) Cons 0.4539 (0.4710) Prec@1 81.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][430/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.1436 (0.1836) Cons 0.4669 (0.4699) Prec@1 75.000 (71.000) Prec@5 96.000 (94.000)
Epoch: [10][440/479] Time 0.760 (0.763) Data 0.000 (0.035) Class 0.2510 (0.1835) Cons 0.4651 (0.4703) Prec@1 66.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][450/479] Time 0.762 (0.763) Data 0.000 (0.035) Class 0.1897 (0.1837) Cons 0.5153 (0.4711) Prec@1 72.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][460/479] Time 0.759 (0.763) Data 0.000 (0.035) Class 0.2872 (0.1840) Cons 0.4851 (0.4720) Prec@1 63.000 (71.000) Prec@5 93.000 (94.000)
Epoch: [10][470/479] Time 0.761 (0.763) Data 0.000 (0.035) Class 0.1542 (0.1840) Cons 0.5021 (0.4736) Prec@1 78.000 (71.000) Prec@5 96.000 (94.000)
--- training epoch in 365.2766942977905 seconds ---
Evaluating the primary model:
Test: [0/79] Time 0.508 (0.508) Data 0.116 (0.116) Class 0.9628 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [10/79] Time 0.190 (0.218) Data 0.000 (0.011) Class 0.6180 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/79] Time 0.189 (0.205) Data 0.000 (0.006) Class 1.1369 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [30/79] Time 0.189 (0.200) Data 0.000 (0.004) Class 1.4853 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [40/79] Time 0.189 (0.197) Data 0.000 (0.003) Class 0.9658 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [50/79] Time 0.191 (0.196) Data 0.000 (0.002) Class 0.4800 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [60/79] Time 0.189 (0.195) Data 0.000 (0.002) Class 0.9950 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [70/79] Time 0.189 (0.194) Data 0.000 (0.002) Class 0.2665 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)

* Prec@1 0.000 Prec@5 0.000
Evaluating the EMA model:
Test: [0/79] Time 0.320 (0.320) Data 0.130 (0.130) Class 0.7387 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [10/79] Time 0.189 (0.202) Data 0.000 (0.012) Class 0.3110 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/79] Time 0.191 (0.196) Data 0.000 (0.006) Class 1.0565 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [30/79] Time 0.189 (0.194) Data 0.000 (0.004) Class 1.4122 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [40/79] Time 0.189 (0.193) Data 0.000 (0.003) Class 1.0135 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [50/79] Time 0.189 (0.192) Data 0.000 (0.003) Class 0.5472 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [60/79] Time 0.190 (0.192) Data 0.000 (0.002) Class 0.7229 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [70/79] Time 0.191 (0.191) Data 0.000 (0.002) Class 0.4157 (0.0000) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
* Prec@1 0.000 Prec@5 0.000
--- validation in 30.15329122543335 seconds ---

Issues related to python package version incompatibility

Hi,

While trying to run the training scripts I ran into problems caused by incompatibilities between the versions of some packages.

Could you please provide the required packages and their versions?

Thank you in advance.

I can't see how SWA is doing anything.

I read main.py, which is quite lengthy. I saw many places where swa_nets are instantiated and updated from the main model, but I couldn't see anywhere in the code where the swa_nets make any adjustment to the main model during training.

Am I missing something here?
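That reading is consistent with how SWA is designed: the averaging is one-way. The averaged copy accumulates snapshots of the training model's weights and is only read at evaluation time; it never writes back into the model being trained, so you would not expect to find such code. A minimal sketch of that one-way flow (the class and attribute names are illustrative, not this repo's actual API; weights are flat float lists for simplicity):

```python
class SWAAverager:
    """Keeps an equal-weight running average of model snapshots.

    The average is only ever *read* (at evaluation time); it never
    feeds back into the weights being trained.
    """

    def __init__(self, weights):
        self.avg = list(weights)  # first snapshot
        self.n = 1                # number of snapshots averaged so far

    def update(self, weights):
        # One-way flow: training weights -> running average, never the reverse.
        self.avg = [(a * self.n + w) / (self.n + 1)
                    for a, w in zip(self.avg, weights)]
        self.n += 1
```

The training model's trajectory is unaffected; the payoff is that the averaged copy typically sits closer to the center of the loss basin than any single snapshot.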

The training hyperparameters don't match the settings of other papers.

Hi,
I noticed that the PyTorch version of Mean Teacher uses different hyperparameter settings from the TensorFlow version.

For example, the learning rate is 0.1 rather than the 0.003 used by other methods such as temporal ensembling. The ramp-up and ramp-down epochs also differ from the TensorFlow version. I notice that learning rate = 0.003, ramp-up = 80 epochs, and ramp-down = 50 epochs are used in many other semi-supervised methods.

Does this mean that if we use the hyperparameters from the TensorFlow version, we can't achieve good accuracy?

Thanks.
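For readers unfamiliar with the schedule the question refers to: the convention in the Mean Teacher / temporal-ensembling line of work is to scale the base learning rate by a smooth ramp-up factor at the start of training and a ramp-down factor at the end. A sketch of that shape, using the numbers from the question (base LR 0.003, ramp-up 80, ramp-down 50); the exact curve shapes vary between implementations, and the total-epoch count here is an assumption for illustration:

```python
import math


def learning_rate(epoch, base_lr=0.003, rampup=80, rampdown=50, total=300):
    """Illustrative ramp-up/ramp-down LR schedule (not this repo's code).

    Sigmoid-shaped ramp-up over the first `rampup` epochs, Gaussian-style
    ramp-down over the last `rampdown` epochs, full `base_lr` in between.
    """
    if epoch < rampup:
        phase = 1.0 - epoch / rampup
        factor = math.exp(-5.0 * phase * phase)    # rises smoothly to 1
    elif epoch > total - rampdown:
        phase = (epoch - (total - rampdown)) / rampdown
        factor = math.exp(-12.5 * phase * phase)   # decays smoothly to ~0
    else:
        factor = 1.0
    return base_lr * factor
```

The fast-SWA setup instead relies on a cyclical schedule late in training, which is why its base learning rate and ramp parameters differ from the TensorFlow Mean Teacher defaults.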
