
Comments (11)

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Thanks for reporting this. I'll work on it immediately.

Original comment by [email protected] on 11 Jun 2012 at 6:24

  • Changed state: Started

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Fix released with 1.2.2.

Original comment by [email protected] on 11 Jun 2012 at 8:10

  • Changed state: Fixed

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Thank you for fixing this so quickly!!!

Original comment by [email protected] on 12 Jun 2012 at 6:55

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Hi again,

I seem to be running into another problem concerning the wait-interrupt part of 
RpcClient.callBlockingMethod. This seems to go into a busy loop of some sort 
taking all CPU. 

I'm prototyping a distributed system with many server nodes that are called 
from one client and I'm testing how to recover from random server or client 
crashes and restarts over a not necessarily reliable network. This latest 
problem seems to occur when the client crashes and tries to re-open a 
connection using a client port that is already open on the server side. Or at 
least I get a java.io.IOException: DuplexTcpServer CONNECT_RESPONSE indicated 
error ALREADY_CONNECTED. I've attached a screenshot of the callstack for two 
threads, although this is probably not very helpful. I can try to debug this 
further, can you tell me where the thread waiting in callBlockingMethod is 
interrupted from?

Br,
-M-

Original comment by [email protected] on 12 Jun 2012 at 12:56

Attachments:

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
The ALREADY_CONNECTED issue can occur under load, when the server has not yet 
finished "discarding" a client which has crashed, for whatever reason. It could 
be that the server's TCP stack has not noticed the disconnect before the client 
reconnects, since nothing has been sent to the client in the meantime. The idea 
is that a client retries the connect with a smallish sleep time between 
attempts, and eventually it should reconnect cleanly. I preferred this approach 
to the alternative, which is more dangerous ( denial-of-service ): kicking out 
the existing connected client when a new one presents the same identity. If you 
want to avoid the identity problem a bit, you could pass the <processId> as 
part of the client's identity, which should still be unique after a crash. ( 
http://stackoverflow.com/questions/35842/process-id-in-java )
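
A rough sketch of the <processId> idea follows. It derives a process ID from the JVM runtime name and appends it to an identity string. The class name, the `identity` helper, and the "host:port/pid" layout are assumptions made for illustration and are not part of protobuf-rpc-pro. The "pid@hostname" form of the runtime name is common on mainstream JVMs but is not guaranteed.

```java
import java.lang.management.ManagementFactory;

final class ClientIdentity {
    /**
     * Returns the part of the JVM runtime name before '@'.
     * On common JVMs this is the process ID; the format is not guaranteed,
     * so the whole name is returned if no '@' is present.
     */
    static String processId() {
        String runtimeName = ManagementFactory.getRuntimeMXBean().getName();
        int at = runtimeName.indexOf('@');
        return at > 0 ? runtimeName.substring(0, at) : runtimeName;
    }

    /** Builds a hypothetical identity string (layout is an assumption). */
    static String identity(String host, int port) {
        return host + ":" + port + "/" + processId();
    }

    public static void main(String[] args) {
        // A restarted client gets a new process, so the identity differs
        // from the one the server may still hold for the crashed client.
        System.out.println(identity("client-host", 4711));
    }
}
```

With an identity like this, a restarted client would present a different name than the crashed one, so the server would not treat it as a duplicate of the stale connection.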

If the client still cannot reconnect after, say, 90s ( the usual operating 
system TCP stack parameter for socket close wait, or something like this ), 
then it's probably because of a bug. In this case a stack trace of the server 
side would be great.
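
The retry-with-sleep approach described above could look roughly like the sketch below. It is not library code: `connectWithRetry` and its parameters are hypothetical, and the actual connect call is passed in as a `Callable` so that no protobuf-rpc-pro API has to be assumed. The 90 s limit in the usage example mirrors the figure mentioned above.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class ReconnectRetry {
    /**
     * Calls connect until it succeeds or maxWaitMillis has elapsed,
     * sleeping sleepMillis between failed attempts. The last IOException
     * is rethrown once the deadline has passed.
     */
    static <T> T connectWithRetry(Callable<T> connect, long sleepMillis, long maxWaitMillis)
            throws Exception {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        IOException last = null;
        while (true) {
            try {
                return connect.call();
            } catch (IOException e) {
                last = e; // e.g. "CONNECT_RESPONSE indicated error ALREADY_CONNECTED"
            }
            if (System.nanoTime() - deadline >= 0) {
                throw last; // still failing after the deadline: likely a real bug
            }
            Thread.sleep(sleepMillis); // an interrupt propagates to the caller
        }
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // Simulated connect: fails twice, then succeeds.
        String channel = connectWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new IOException("CONNECT_RESPONSE indicated error ALREADY_CONNECTED");
            }
            return "connected";
        }, 50, 90_000);
        System.out.println(channel + " after " + attempts[0] + " attempts");
    }
}
```

If the deadline passes and the last exception is rethrown, that is the point at which a server-side stack trace becomes useful.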

About the wait-interrupt part of RpcClient.callBlockingMethod: I'm looking at 
this code and think it's not great, but I haven't been able to reproduce your 
problem. It's not clear to me whether it's the server or the client which ends 
up in the tight loop. If you could send me the stack trace of the affected JVM, 
I'd appreciate it. Send it to [email protected] if you have privacy concerns 
about attaching it to the bug comments.

Thanks, and I hope to resolve this soon. Peter.

Original comment by [email protected] on 12 Jun 2012 at 6:30

  • Changed state: Started

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Looking at your VisualVM CPU chart, it does not tell me that 
callBlockingMethod is caught in a tight CPU loop, but rather that the methods 
are simply waiting about 1s per call for the remote answer to come back. It is 
just like the nio "select", which is waiting all the time and shows up as 100% 
CPU use. It's deceiving, but to be honest I don't think there is a problem here. 

If you used the non-blocking call variant, you wouldn't see the "high" CPU 
as in the picture, but your answers from "listNodeContent" wouldn't come back 
any faster from the remote side.


Original comment by [email protected] on 12 Jun 2012 at 7:32

  • Changed state: Fixed

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Hi,

Thank you for your answers. I'll explain my system a bit more: The system is 
not really under heavy load, and when I crash the client there's more than 30 
sec before I restart it, and I still get the ALREADY_CONNECTED. But this might 
be due to the TCP stack as you explained, I'll look into this in more detail 
tomorrow.

As to your second comment, I get the busy loop on the client side and it never 
gets out, i.e. nothing happens on the server side and it seems the client gets 
interrupted every second and then goes back to sleep. Is there a way of 
breaking out of this in my code? To set some kind of delay for missing answers? 
Anyway, I'll try the non-blocking variant to see if it changes anything.

Br,
-M-

Original comment by [email protected] on 12 Jun 2012 at 7:58

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
You're right, the more I look at the picture the more confused I get. If there 
were only 2 calls but 169 calls of wait/interrupt, there's something very wrong. 
I don't think there is much point in looping around once the thread has been 
interrupted, so I intend to make a fix that at least logs the interrupted 
exception and exits the loop.
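
A minimal sketch of that kind of fix is shown below. It is an illustration only, not the actual RpcClient code: `BlockingResult` and its methods are hypothetical names. One general way such a loop can spin is when the catch block restores the interrupt flag and then loops again, because `wait()` then throws immediately on every pass. Whether that was the cause here is not established in this thread.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class BlockingResult<T> {
    private static final Logger LOG = Logger.getLogger(BlockingResult.class.getName());

    private final Object lock = new Object();
    private T value;
    private boolean done;

    /** Called by the thread that receives the response. */
    void set(T v) {
        synchronized (lock) {
            value = v;
            done = true;
            lock.notifyAll();
        }
    }

    /** Blocks until a result is set; on interrupt, logs and exits the loop. */
    T await() throws IOException {
        synchronized (lock) {
            while (!done) { // the loop only guards against spurious wakeups
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    LOG.log(Level.WARNING, "interrupted while waiting for response", e);
                    Thread.currentThread().interrupt(); // keep the flag for callers
                    throw new IOException("interrupted while waiting for response", e);
                }
            }
            return value;
        }
    }

    public static void main(String[] args) {
        BlockingResult<String> result = new BlockingResult<>();
        Thread caller = Thread.currentThread();
        new Thread(caller::interrupt).start(); // simulate an interrupt, no response
        try {
            result.await();
        } catch (IOException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

Throwing out of the loop hands the decision back to the caller, which can retry, time out, or report the failure instead of waiting again.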


Original comment by [email protected] on 12 Jun 2012 at 8:05

  • Changed state: Started

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
So I made a new release, 1.2.3, which should fix the wait/notify issue. 
Br, Peter.

Original comment by [email protected] on 12 Jun 2012 at 9:45

  • Changed state: Fixed

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024
Thanks! As far as I've been able to test this today, it seems to work fine. 
I'll let you know if I run into new problems.
Br,
-M-

Original comment by [email protected] on 13 Jun 2012 at 1:23

from protobuf-rpc-pro.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 18, 2024

Original comment by [email protected] on 18 Nov 2012 at 6:49

  • Changed state: Done

from protobuf-rpc-pro.
