luoyetx / mx-lsoftmax

mxnet version of Large-Margin Softmax Loss for Convolutional Neural Networks.

License: BSD 3-Clause "New" or "Revised" License

Shell 1.03% Python 43.70% C++ 27.29% Cuda 27.98%
large-margin-softmax mxnet

mx-lsoftmax's Issues

nan value

I reimplemented this in TensorFlow, but found it hard to train.
It very easily produces NaN values.
How can I avoid this?
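
One frequent source of NaN in reimplementations of angular-margin losses is the step that recovers the angle: rounding can push the computed cosine slightly outside [-1, 1], where `acos` is undefined, and a zero-norm feature leads to a division by zero. Whether that is the cause in this report is not known. A minimal sketch in plain Python, where the helper names `safe_cos`, `safe_angle` and the `eps` value are assumptions made for illustration:

```python
import math

def safe_cos(w, x, eps=1e-12):
    """Cosine of the angle between vectors w and x, clamped to [-1, 1]."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    # Guard against a zero norm before dividing.
    cos_t = dot / max(norm_w * norm_x, eps)
    # Guard against rounding pushing the value outside acos's domain.
    return max(-1.0, min(1.0, cos_t))

def safe_angle(w, x):
    """Angle in [0, pi] between w and x; never raises a domain error."""
    return math.acos(safe_cos(w, x))
```

The same two guards (a floor on the norm product and a clamp on the cosine) carry over to tensor code in any framework.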

The loss suddenly becomes NaN

I set the parameters as follows:
beta: 1000
margin: 4
scale: 0.9997
beta_min: 5

After some iterations, the cross-entropy loss suddenly becomes NaN. I am using the C++ layer compiled with MXNet. Does anyone have an idea what causes this and how to solve it?

Error while running mnist.py

Hi! I have a problem when running mnist.py. Parameters: gpu=0, op-impl='py'.
`Namespace(batch_size=128, beta=100.0, beta_min=0, gpu=0, lr=0.01, margin=1, model_prefix='model/mnist', no_lsoftmax=False, num_epoch=20, op_impl='py', profile=False, scale=0.99, test=True, train=True)
[18:38:20] src/io/iter_mnist.cc:94: MNISTIter: load 60000 images, shuffle=1, shape=(128,1,28,28)
[18:38:21] src/io/iter_mnist.cc:94: MNISTIter: load 10000 images, shuffle=1, shape=(128,1,28,28)
Error in LSoftmax.infer_type: Traceback (most recent call last):
File "/home/sasha/mxnet/python/mxnet/operator.py", line 650, in infer_type_entry
types = [_DTYPE_MX_TO_NP[tensor_types[i]] for i in range(n_in)]
KeyError: -1

[18:38:21] /home/sasha/mxnet/dmlc-core/include/dmlc/./logging.h:304: [18:38:21] src/operator/custom/./custom-inl.h:231: Check failed: reinterpret_cast(info_->callbacks[kCustomOpPropInferType])( types.size(), types.data(), info_->contexts[kCustomOpPropInferType])

Stack trace returned 10 entries:
[bt] (0) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fce50b52bdc]
[bt] (1) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(ZNK5mxnet2op12CustomOpProp9InferTypeEPSt6vectorIiSaIiEES5_S5+0xf17) [0x7fce516edc37]
[bt] (2) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x12d4fa3) [0x7fce51787fa3]
[bt] (3) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x266ef7a) [0x7fce52b21f7a]
[bt] (4) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x2670282) [0x7fce52b23282]
[bt] (5) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x26710ee) [0x7fce52b240ee]
[bt] (6) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x32c) [0x7fce52b4d75c]
[bt] (7) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x3c9) [0x7fce51b5f9e9]
[bt] (8) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm4pass9InferTypeENS_5GraphESt6vectorIiSaIiEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1d8) [0x7fce51bb74c8]
[bt] (9) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x750) [0x7fce51baff10]

Traceback (most recent call last):
File "mnist.py", line 201, in <module>
train()
File "mnist.py", line 89, in train
epoch_end_callback=mx.callback.do_checkpoint(args.model_prefix))
File "/home/sasha/mxnet/python/mxnet/module/base_module.py", line 459, in fit
for_training=True, force_rebind=force_rebind)
File "/home/sasha/mxnet/python/mxnet/module/module.py", line 399, in bind
state_names=self.state_names)
File "/home/sasha/mxnet/python/mxnet/module/executor_group.py", line 214, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/home/sasha/mxnet/python/mxnet/module/executor_group.py", line 310, in bind_exec
shared_group))
File "/home/sasha/mxnet/python/mxnet/module/executor_group.py", line 586, in bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/home/sasha/mxnet/python/mxnet/symbol.py", line 1433, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (128, 1L, 28L, 28L)
softmax_label: (128,)
Error in operator custom0: [18:38:21] src/operator/custom/./custom-inl.h:231: Check failed: reinterpret_cast(info_->callbacks[kCustomOpPropInferType])( types.size(), types.data(), info_->contexts[kCustomOpPropInferType])

Stack trace returned 10 entries:
[bt] (0) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fce50b52bdc]
[bt] (1) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(ZNK5mxnet2op12CustomOpProp9InferTypeEPSt6vectorIiSaIiEES5_S5+0xf17) [0x7fce516edc37]
[bt] (2) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x12d4fa3) [0x7fce51787fa3]
[bt] (3) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x266ef7a) [0x7fce52b21f7a]
[bt] (4) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x2670282) [0x7fce52b23282]
[bt] (5) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(+0x26710ee) [0x7fce52b240ee]
[bt] (6) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x32c) [0x7fce52b4d75c]
[bt] (7) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x3c9) [0x7fce51b5f9e9]
[bt] (8) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm4pass9InferTypeENS_5GraphESt6vectorIiSaIiEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1d8) [0x7fce51bb74c8]
[bt] (9) /home/sasha/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x750) [0x7fce51baff10]
`

How to run c++ code?

Hi, I am new to MXNet and would like to know how to run your code (the C++ version). Could you help me, please?

L-Softmax not building

I tried using the LSoftmax files but got the errors shown below:

perator/lsoftmax.o
src/operator/lsoftmax.cc:30:3: error: stray '\302' in program
   <title>insightface/lsoftmax.cc at master · deepinsight/insightface · GitHub</title>
   ^
src/operator/lsoftmax.cc:30:3: error: stray '\267' in program
src/operator/lsoftmax.cc:30:3: error: stray '\302' in program
src/operator/lsoftmax.cc:30:3: error: stray '\267' in program
src/operator/lsoftmax.cc:159:10: warning: missing terminating ' character
     <!-- '"` --><!-- </textarea></xmp> --></option></form><form class="js-site-search-form" data-scope-type="Repository" data-scope-id="102057483" data-scoped-search-url="/deepinsight/insightface/search" data-unscoped-search-url="/search" action="/deepinsight/insightface/search" accept-charset="UTF-8" method="get"><input name="utf8" type="hidden" value="&#x2713;" />
          ^
src/operator/lsoftmax.cc:159:5: error: missing terminating ' character
     <!-- '"` --><!-- </textarea></xmp> --></option></form><form class="js-site-search-form" data-scope-type="Repository" data-scope-id="102057483" data-scoped-search-url="/deepinsight/insightface/search" data-unscoped-search-url="/search" action="/deepinsight/insightface/search" accept-charset="UTF-8" method="get"><input name="utf8" type="hidden" value="&#x2713;" />
     ^
src/operator/lsoftmax.cc:506:69: error: stray '#' in program
         <td id="LC7" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&quot;</span>./lsoftmax-inl.h<span class="pl-pds">&quot;</span></span></td>
                                                                     ^
src/operator/lsoftmax.cc:812:10: warning: missing terminating ' character
     <!-- '"` --><!-- </textarea></xmp> --></option></form><form class="js-jump-to-line-form" action="" accept-charset="UTF-8" method="get"><input name="utf8" type="hidden" value="&#x2713;" />
          ^
src/operator/lsoftmax.cc:812:5: error: missing terminating ' character
     <!-- '"` --><!-- </textarea></xmp> --></option></form><form class="js-jump-to-line-form" action="" accept-charset="UTF-8" method="get"><input name="utf8" type="hidden" value="&#x2713;" />
     ^
src/operator/lsoftmax.cc:864:5: error: stray '\342' in program
     You can’t perform that action at this time.
     ^
src/operator/lsoftmax.cc:864:5: error: stray '\200' in program
src/operator/lsoftmax.cc:864:5: error: stray '\231' in program
src/operator/lsoftmax.cc:7:1: error: expected unqualified-id before '<' token
 <!DOCTYPE html>
 ^
src/operator/lsoftmax.cc:506:150: error: expected unqualified-id before '<' token
         <td id="LC7" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&quot;</span>./lsoftmax-inl.h<span class="pl-pds">&quot;</span></span></td>
                                                                                                                                                      ^
src/operator/lsoftmax.cc:506:200: error: expected unqualified-id before '<' token
         <td id="LC7" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&quot;</span>./lsoftmax-inl.h<span class="pl-pds">&quot;</span></span></td>
                                                                                                                                                                                                        ^
Makefile:431: recipe for target 'build/src/operator/lsoftmax.o' failed
make: *** [build/src/operator/lsoftmax.o] Error 1

@luoyetx It would be helpful if anyone could explain this error.
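
The compiler output quotes HTML markup (`<!DOCTYPE html>`, `<title>... GitHub</title>`) from inside `src/operator/lsoftmax.cc`, which suggests the file on disk is a saved GitHub web page rather than the raw C++ source. A small heuristic check along these lines can confirm that before rebuilding; the helper name `looks_like_html` and the 20-line window are assumptions made for illustration:

```python
def looks_like_html(text, max_lines=20):
    """Heuristic: True if one of the first lines opens an HTML document."""
    for line in text.splitlines()[:max_lines]:
        if line.strip().lower().startswith(("<!doctype html", "<html")):
            return True
    return False
```

For example, `looks_like_html(open("src/operator/lsoftmax.cc").read())` returning `True` would mean the file should be fetched again as raw source.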

doubt for c++ api

In the following code, where are the two parameters (data, label) defined in lsoftmax-inl.h, and how does Python know about them?
fc4 = mx.sym.LSoftmax(data=embedding, label=label, num_hidden=10, beta=args.beta, margin=args.margin, scale=args.scale, beta_min=args.beta_min, verbose=True)

Make beta play exactly the same role as lambda

The current implementation uses beta to weight f_i_yi = |w_yi||x_i|cos(mt), instead of using lambda to weight f_i_yi = |w_yi||x_i|cos(t), because lambda is a keyword in Python and cannot be used as a variable name. As described in the Large-Margin Softmax paper, we want to gradually reduce lambda to 0, which here means gradually increasing beta to a number as large as 1000. That is kind of weird. I am considering using beta exactly as lambda, to weight the original f_i_yi = |w_yi||x_i|cos(t), and adding a scale parameter to the op to gradually reduce it to 0 during training.
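
The weighting described in the paper can be sketched as follows. The functions `psi` and `target_logit` follow the paper's definitions, with `lam` standing in for lambda since `lambda` is reserved in Python. The `decay` schedule is only an assumption about how a `scale` parameter and a floor such as `beta_min` might be applied per iteration, not a description of what the operator actually does:

```python
import math

def psi(theta, m):
    """psi(theta) = (-1)^k * cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m]."""
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k

def target_logit(norm_w, norm_x, theta, m, lam):
    """Blend of the plain and the margin logit; lam = 0 gives pure L-Softmax."""
    plain = norm_w * norm_x * math.cos(theta)
    margin = norm_w * norm_x * psi(theta, m)
    return (lam * plain + margin) / (1.0 + lam)

def decay(lam, scale, lam_min):
    """Hypothetical schedule: shrink lam by `scale` each step, floored at lam_min."""
    return max(lam * scale, lam_min)
```

With a large `lam` the target logit is close to the ordinary softmax logit, and as `lam` shrinks the margin term takes over, which is why starting large and decaying eases optimization.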

Derivative of f with respect to x

Hello,
Could you help explain why we need to calculate the derivative of f with respect to x (df / dx)?

I am confused by x. My understanding is that we only need to calculate df / dw.

Thanks
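
A general point about backpropagation may help with this question: x is itself the output of earlier layers, so the gradients of their weights are obtained from df / dx through the chain rule. A toy scalar sketch, in which the two-weight network and the names `w1`, `w2`, `a` are hypothetical and not taken from the repository:

```python
def forward(w1, w2, a):
    """Two stacked scalar layers: x = w1 * a, then f = w2 * x."""
    x = w1 * a   # feature produced by the earlier layer
    f = w2 * x   # output of the last layer
    return x, f

def grads(w1, w2, a):
    """Gradients of f with respect to w1, w2 and x."""
    df_dw2 = w1 * a        # equals x; needs no df/dx
    df_dx = w2             # gradient passed back to the earlier layer
    df_dw1 = df_dx * a     # chain rule: df/dw1 = df/dx * dx/dw1
    return df_dw1, df_dw2, df_dx
```

Without `df_dx`, the earlier weight `w1` would receive no gradient at all, so only the last layer could be trained.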

Why this particular construction?

Hi --

I was wondering where you got the idea for the specific construction of the L-softmax. It seems like maybe you could achieve a similar goal by enforcing a margin like

norm(W) * norm(x) * (m * cos(theta) - m + 1)

instead of

norm(W) * norm(x) * cos(m * theta)

as you do in the paper.

The former seems simpler because you don't have to worry about constructing a psi function that behaves well for all values of theta, m doesn't have to be integer valued, etc. Also, in the paper, the gradient of psi is 0 at pi/2, which AFAICT is an undesirable side effect of the choice of psi. Is that right, or is there some reason that grad psi(pi/2) should be 0?

The proposed alternative above would have the same shape as cos in [0, pi] but with a range of [1 - 2m, 1], which seems maybe more natural.

Thoughts? Am I missing something? Did you try this and it stunk in practice?

Thanks
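
The two candidates can be compared numerically. In this sketch `psi` follows the piecewise definition from the L-Softmax paper, `linear_margin` is the alternative proposed in this issue, and `slope` is a hypothetical central-difference helper added only for the comparison:

```python
import math

def psi(theta, m):
    """Paper's psi: (-1)^k * cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m]."""
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k

def linear_margin(theta, m):
    """Alternative from this issue: m * cos(theta) - m + 1."""
    return m * math.cos(theta) - m + 1

def slope(fn, theta, m, h=1e-6):
    """Central-difference estimate of d fn / d theta."""
    return (fn(theta + h, m) - fn(theta - h, m)) / (2 * h)
```

Both functions equal 1 at theta = 0 and 1 - 2m at theta = pi. For m = 4 the slope of `psi` at pi/2 is approximately 0, while the slope of `linear_margin` there is approximately -4, which matches the observation about the flat spot.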

Why the directions of digits in visualization are exactly the same?

Not only in your README, but also in my training results, the directions of the digits in the visualization are exactly the same. I think the directions should depend on the random initialization to some extent. Is there something in your code that causes this?

Thx~
