
Comments (11)

wilkinsona commented on April 28, 2024

Thanks very much, @lared.

I'm guessing your mailing list thread is this one. @markt-asf can this be improved in Tomcat or is there anything we can do in Boot to eliminate the unwanted delay?

from spring-boot.

markt-asf commented on April 28, 2024

Most of this comes down to user configuration. Ideally, don't change the default shutdown address. If there is a valid reason for changing it, use a specific IP address rather than 0.0.0.0 or ::. If the user does use one of the "any local address" options then Tomcat has to try and guess a valid IP address it can connect to and while we can probably improve the way that works, I suspect there are always going to be some network configurations that cause confusion.

Ignore the above. I'm mixing up shutdown and the Acceptor. Insufficient caffeine.

If the Connector is bound to 0.0.0.0 or :: then Tomcat has to try and guess an IP address it can connect to in order to unlock the Acceptor.

We'll have to have a look and see what we can do to improve the selection algorithm when the connector is bound to 0.0.0.0.

One configuration option that should improve things is to use a specific address setting in the Connector.


wilkinsona commented on April 28, 2024

Thanks, Mark. We're just leaning on Tomcat's defaults here so an improvement to the selection algorithm would certainly be welcome.

For those following this issue and looking to work around Tomcat's default behavior, you can set the server.address property. This will map down to setting the address on the connector's endpoint and stop it from using the wildcard address.


wilkinsona commented on April 28, 2024

We try to align the behavior of @SpringBootTest as closely as possible with running an application's main method. With this in mind, I don't think we should use a different server.address or management.server.address for tests. Furthermore, this problem isn't test-specific. The same 10s delay may occur when shutting down a service that has been run via its main method. That too may be unwanted so I think it's better to fix this at source.


wilkinsona commented on April 28, 2024

I can't reproduce this on macOS 13.6.4.

@lared, if the web server is taking a long time to stop, it may be a Tomcat problem. You could try enabling debug logging for Tomcat (org.apache.tomcat, org.apache.catalina, and org.apache.coyote) to see if you can identify the point where the delay is occurring. If Tomcat's logging doesn't reveal anything, you could also try taking some thread dumps of the test worker JVM during the 12 second delay to see where the process is stuck.
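For a plain embedded Tomcat that logs through java.util.logging, the suggestion above could be sketched as follows. The class name `TomcatDebugLogging` and the `enable` helper are hypothetical, introduced only for illustration; the three logger names come from the comment. In a Spring Boot application the same thing is usually done with `logging.level.*` properties instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: raises the java.util.logging level for Tomcat's packages.
final class TomcatDebugLogging {

    // Logger names taken from the comment above.
    static final String[] TOMCAT_LOGGERS = {
        "org.apache.tomcat", "org.apache.catalina", "org.apache.coyote"
    };

    // JUL only holds loggers weakly, so keep strong references to the configured ones.
    private static final List<Logger> CONFIGURED = new ArrayList<>();

    static void enable(Level level) {
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(level); // ConsoleHandler defaults to INFO and would drop FINE/FINER
        for (String name : TOMCAT_LOGGERS) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(level);
            logger.addHandler(handler);
            logger.setUseParentHandlers(false); // avoid a second copy via the root handler
            CONFIGURED.add(logger);
        }
    }

    public static void main(String[] args) {
        enable(Level.FINER);
        Logger.getLogger("org.apache.tomcat.util.net.Demo").finer("FINER output is now visible");
    }
}
```

Calling `TomcatDebugLogging.enable(Level.FINER)` before starting the server should be enough to surface messages logged at FINE and FINER by classes under those packages.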


lared commented on April 28, 2024

You're right.

When the server shuts down, it's stopping all of the endpoints. In tomcat-embed-core 10.1.18, in org.apache.tomcat.util.net.NioEndpoint, in method stopInternal, there's a lovely hardcoded wait of 10s for the acceptor to shut down:

    @Override
    public void stopInternal() {
        if (!paused) {
            pause();
        }
        if (running) {
            running = false;
            acceptor.stop(10); // hardcoded wait of up to 10 seconds for the acceptor
        [...]
    }

Inside of org.apache.tomcat.util.net.Acceptor it sets a flag and waits for the acceptor loop to shut down, but it reliably fails to do so until the wait period elapses:

    public void stop(int waitSeconds) {
        stopCalled = true;
        if (waitSeconds > 0) {
            try {
                if (!stopLatch.await(waitSeconds, TimeUnit.SECONDS)) {
                    log.warn(sm.getString("acceptor.stop.fail", getThreadName()));
                }
            } catch (InterruptedException e) {
                log.warn(sm.getString("acceptor.stop.interrupted", getThreadName()), e);
            }
        }
    }
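
The effect of that timed latch wait can be reproduced in isolation. This is a minimal sketch using only the JDK, not Tomcat's code: `StopLatchDemo` and its `stop` method are hypothetical names, and the timeout is shortened to milliseconds so the demo finishes quickly. If nothing ever counts the latch down, `await` returns `false` only after the full timeout has elapsed, which is where the fixed delay comes from.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the acceptor's stop logic, reduced to the latch wait.
final class StopLatchDemo {

    /** Returns true if the latch was released before the timeout elapsed. */
    static boolean stop(CountDownLatch stopLatch, long timeout, TimeUnit unit) {
        try {
            return stopLatch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates an acceptor loop that never notices the stop request.
        CountDownLatch neverReleased = new CountDownLatch(1);
        long start = System.nanoTime();
        boolean stopped = stop(neverReleased, 200, TimeUnit.MILLISECONDS);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("stopped=" + stopped + ", waited about " + elapsedMs + " ms");
    }
}
```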

This explains perfectly why it's almost always 12s of wait when it was ~2ish seconds prior. I'll take it to the Tomcat team and see what they have to say about it. @wilkinsona thanks for pointing me in the right direction!


wilkinsona commented on April 28, 2024

If the discussion with the Tomcat team happens somewhere that's linkable, could you please add a link to it here so that others can follow along?


lared commented on April 28, 2024

@wilkinsona it's on mailing lists so I'll spare the pain for now, but I can explain what was going on.

This is absolutely unrelated to Spring in any way, shape, or form. It's purely a core Tomcat thing, and it comes down to how Tomcat (IMO incorrectly) treats machines with multiple NICs.

What went wrong

When an endpoint gets shut down, you have to shut down all the threads which are part of the connector. In particular, you need to shut down the Acceptor thread.

The Acceptor thread is close to guaranteed to be blocked in the accept() method. The only reliable and non-destructive way of waking it up is to make a connection to it.

This is done in Tomcat here: https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/AbstractEndpoint.java#L1394

This is relatively simple - you get the address you are listening on, and you connect to that socket, sending it some random OPTIONS request. If this doesn't work for whatever reason, you will get stuck on that latch I linked above until 10s elapses.
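
The wake-up technique itself can be sketched with plain JDK sockets. This is not Tomcat's implementation: `UnlockAcceptDemo` and `stopAcceptor` are hypothetical names, the server is bound to loopback for simplicity, and the sketch only connects without sending a request. A thread blocks in `accept()`, a throwaway client connection unblocks it, and the latch confirms that the thread finished.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: wake a thread blocked in accept() by connecting to its socket.
final class UnlockAcceptDemo {

    /** Returns true if the acceptor thread finished within waitMs after the unlock connection. */
    static boolean stopAcceptor(int connectTimeoutMs, int waitMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            CountDownLatch stopLatch = new CountDownLatch(1);
            Thread acceptor = new Thread(() -> {
                try (Socket ignored = server.accept()) { // blocks until a client connects
                    // Woken up; a real acceptor would check its stop flag here.
                } catch (IOException e) {
                    // The server socket was closed while waiting.
                } finally {
                    stopLatch.countDown();
                }
            }, "demo-acceptor");
            acceptor.setDaemon(true);
            acceptor.start();

            // The "unlock" connection. If this address were unreachable, connect()
            // would time out and the latch wait below would run to its full length.
            try (Socket unlock = new Socket()) {
                unlock.connect(new InetSocketAddress(loopback, server.getLocalPort()), connectTimeoutMs);
            }
            return stopLatch.await(waitMs, TimeUnit.MILLISECONDS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("acceptor stopped cleanly: " + stopAcceptor(2000, 2000));
    }
}
```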

What I noticed is that in my case the request was timing out (it was kind of difficult to spot because you need a very detailed logging level to see this):

Feb 25, 2024 7:13:10 PM org.apache.tomcat.util.net.AbstractEndpoint unlockAccept
FINE: Caught exception trying to unlock accept on port [0]
java.net.SocketTimeoutException: Connect timed out
	at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
	at java.base/java.net.Socket.connect(Socket.java:633)
	at org.apache.tomcat.util.net.AbstractEndpoint.unlockAccept(AbstractEndpoint.java:1124)
	at org.apache.tomcat.util.net.NioEndpoint.unlockAccept(NioEndpoint.java:390)
	at org.apache.tomcat.util.net.AbstractEndpoint.pause(AbstractEndpoint.java:1394)
	at org.apache.coyote.AbstractProtocol.pause(AbstractProtocol.java:678)
	at org.apache.catalina.connector.Connector.pause(Connector.java:963)
	at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:484)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
	at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:974)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
	at org.apache.catalina.startup.Tomcat.stop(Tomcat.java:447)
	at SlowStop.main(SlowStop.java:14)

Why did this happen?

For reasons unknown to me, instead of connecting to the actual address (loopback in the case of any tests), Tomcat looks through all the interfaces and picks the first one that meets some criteria:

https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/AbstractEndpoint.java#L1164

Feb 25, 2024 7:13:08 PM org.apache.tomcat.util.net.AbstractEndpoint unlockAccept
FINER: About to unlock socket for:/[fc00:f853:ccd:e793:0:0:0:1%br-e26d1e697a66]:37591

In my scenario, due to iteration order, this is a bridge network I had defined in Docker. It has no routes defined to it, so connection attempts fail.
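
To see which interfaces and addresses the JVM enumerates on a given machine, and in what order, something like the following can help. It is a rough diagnostic using only the JDK's `NetworkInterface` API and does not reproduce Tomcat's selection criteria; `InterfaceAddresses` and `list` are hypothetical names.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic: lists every interface address in JVM enumeration order.
final class InterfaceAddresses {

    static List<String> list() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // older JDKs may return null when no interface exists
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                    result.add(nic.getName()
                            + " up=" + nic.isUp()
                            + " loopback=" + address.isLoopbackAddress()
                            + " " + address.getHostAddress());
                }
            }
        } catch (SocketException e) {
            throw new IllegalStateException("Could not enumerate network interfaces", e);
        }
        return result;
    }

    public static void main(String[] args) {
        list().forEach(System.out::println);
    }
}
```

A non-loopback bridge interface (such as a `br-...` Docker network) appearing early in this output would be consistent with the behavior described above, although the order Tomcat actually uses may differ.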

How to know if you are affected?

I created a minimal, reproducible (if you are affected) example here: https://github.com/lared/tomcat-acceptor-not-stopping-cleanly

All you need to do is clone it and run ./no_gradle.sh. If you see that the unlock socket address is not localhost, you are affected.

What is the most likely issue?

In my scenario, it was an empty Docker network with no routing defined. One docker network rm made it go away. If you need those, well, good luck.


quaff commented on April 28, 2024

"don't change the default shutdown address."

FYI, I encountered this with vanilla apache-tomcat-9.0.86.tar.gz.


quaff commented on April 28, 2024

@lared You can work around it by adding -Djava.net.preferIPv4Stack=true to the JVM args.


lared commented on April 28, 2024

Thank you all for the help! Overriding the IP address helped immediately.

On Spring Boot's side, would it not make sense to default server.address, management.server.address and similar to 127.0.0.1 in the context of tests with a non-mock web environment? That would be a breaking change, but I think for most people that would be the expected behavior, regardless of the address selection algorithm on shutdown or the web server being used in general. I don't think there are that many people running external services against those test instances, but you know the product best.

