GithubHelp home page GithubHelp logo

Comments (19)

Smackd0wn avatar Smackd0wn commented on June 4, 2024 1

Not sure if the same issue as the original poster, but let's try this with the command line tool:
time curl -vZm 3 'http://127.0.0.1:1234/[1-50]'
I have a non-responding TCP server running on port 1234 with socat tcp-l:1234,fork -. curl behaved as expected, i.e., opened 50 connections in parallel and timed out after 3 seconds.

However, run this:
time curl -vZm 3 --parallel-max 25 'http://127.0.0.1:1234/[1-50]'
and it took 24 seconds to finish. But it should be 6 seconds: the first batch of 25 connections should time out after 3 seconds, and then goes the second batch of 25 connections, which times out after another 3 seconds.

Interestingly, curl behaves as expected (timed out after 6 seconds) if any of the following is met:

  • curl_multi_setopt(multi, CURLMOPT_PIPELINING, 0)
  • --parallel-immediate which does curl_easy_setopt(curl, CURLOPT_PIPEWAIT, 0)
  • curl release <8.1.0 (e.g., 8.0.1), i.e., this behaviour is probably a regression

from curl.

Smackd0wn avatar Smackd0wn commented on June 4, 2024 1

Is this a known issue ? Any latest release has a fix delivered for this ? How do I use this option and still make the timeout happen in the configured time.

Probably not a known issue. I think using the PIPEWAIT option should not interfere with timeout, so it's actually a bug of (lib)curl, not your usage of the library. I don't think the latest release (8.7.1) fixed this issue.

Unfortunately I have not yet found the root cause of the issue. But my test commands above using the curl command line tool should make it very easy to reproduce the problem. And I believe a fix for the command line tool probably will fix your problem as well.

from curl.

bagder avatar bagder commented on June 4, 2024 1

Can you check if #13276 makes anything better?

from curl.

bagder avatar bagder commented on June 4, 2024 1

there was a crash

I believe I've fixed that in an update

But, there was evident drift in timeout before this crash happened.

@Smackd0wn's curl -Z cmdline test went from 24 seconds to 6 seconds for me, so it does have some impact in my tests.

As we don't have a recipe for your case @rahulKrishnaM, it is harder for me to debug that.

from curl.

bagder avatar bagder commented on June 4, 2024 1

I merged #13276 as a first step towards a fix.

Now I offer #13371 as second step, which seems to take it further.

from curl.

bagder avatar bagder commented on June 4, 2024

can you provide a complete stand-alone example that reproduces the problem?

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

can you provide a complete stand-alone example that reproduces the problem?

Hi @bagder, I didn't try for a fully stand-alone program since I would have to replace the eventloop part with another publicly available one and integrate with the curl logic, we are using our own eventloop mechanism which is already in place. But the curl piece of code is exactly the same what I have captured.

I was just concerned if there is anything wrong in my expectations or any aspect in the curl configurations I am setting could add to the delay we are seeing in my experiments.

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

Interestingly, curl behaves as expected (timed out after 6 seconds) if any of the following is met:

  • curl_multi_setopt(multi, CURLMOPT_PIPELINING, 0)
  • --parallel-immediate which does curl_easy_setopt(curl, CURLOPT_PIPEWAIT, 0)
  • curl release <8.1.0 (e.g., 8.0.1), i.e., this behaviour is probably a regression

Hi @Smackd0wn, I will try out option 1 and 2 above which you suggested just to be sure if it's the same thing happening for me as well. However, regarding point 3, are you saying the delay shouldn't be seen for curl versions below 8.1.0, because the curl version I am using is 7.84.0, where I am seeing this issue.

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

Hi @Smackd0wn, seems like you were spot on!

I tried out a few more combinations and was able to see that if i DISABLE CURLOPT_PIPEWAIT, things are working as expected, connection_timeout is happening at the exact configured time (2.5seconds as below).

With curl_easy_setopt(curl, CURLOPT_PIPEWAIT, 0):

{"2024-04-02T07:40:12.633122","Starting a connection :"http://192.168.1.8:9001/code/1"}
{"2024-04-02T07:40:13.632655","Starting a connection :"http://192.168.1.8:9001/code/2"}
{"2024-04-02T07:40:14.632545","Starting a connection :"http://192.168.1.8:9001/code/3"}
{"2024-04-02T07:40:15.585072","Starting a connection :"http://192.168.1.8:9001/code/4"}
{"2024-04-02T07:40:16.632501","Starting a connection :"http://192.168.1.8:9001/code/5"}
{"2024-04-02T07:40:17.632728","Starting a connection :"http://192.168.1.8:9001/code/6"}
{"2024-04-02T07:40:18.633872","Starting a connection :"http://192.168.1.8:9001/code/7"}
{"2024-04-02T07:40:19.633042","Starting a connection :"http://192.168.1.8:9001/code/8"}
{"2024-04-02T07:40:20.633137","Starting a connection :"http://192.168.1.8:9001/code/9"}
{"2024-04-02T07:40:21.634283","Starting a connection :"http://192.168.1.8:9001/code/10"}
{"2024-04-02T07:40:15.182251","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/1","result":28} --> 2.54923s
{"2024-04-02T07:40:16.181521","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/2","result":28} --> 2.54895s
{"2024-04-02T07:40:17.183281","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/3","result":28} --> 2.5515s
{"2024-04-02T07:40:18.134850","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/4","result":28} --> 2.54983s
{"2024-04-02T07:40:19.185437","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/5","result":28} --> 2.55308s
{"2024-04-02T07:40:20.182822","Connection timeout after 2550 ms","connection":"http://192.168.1.8:9001/code/6","result":28} --> 2.55021s
{"2024-04-02T07:40:21.183944","Connection timeout after 2549 ms","connection":"http://192.168.1.8:9001/code/7","result":28} --> 2.55018s
{"2024-04-02T07:40:22.183992","Connection timeout after 2550 ms","connection":"http://192.168.1.8:9001/code/8","result":28} --> 2.55118s
{"2024-04-02T07:40:23.183388","Connection timeout after 2550 ms","connection":"http://192.168.1.8:9001/code/9","result":28} --> 2.5504s
{"2024-04-02T07:40:24.186057","Connection timeout after 2551 ms","connection":"http://192.168.1.8:9001/code/10","result":28} --> 2.5519s

However, once i ENABLE CURLOPT_PIPEWAIT, the connection_timeout starts drifting, as captured in the original ticket description.

Is this a known issue ? Any latest release has a fix delivered for this ? How do I use this option and still make the timeout happen in the configured time.

from curl.

Smackd0wn avatar Smackd0wn commented on June 4, 2024

After running git bisect, the issue is likely caused by commit 843b3ba by @bagder

Here is my understanding of the problem in the curl tool:

  1. Because HTTP connection may be multiplexed, a non-responding HTTP server will keep all new connections in a "may be multiplexed later" state.

    curl/lib/url.c

    Lines 962 to 969 in df3d5ef

    if(IsMultiplexingPossible(data, needle)) {
    if(bundle->multiuse == BUNDLE_UNKNOWN) {
    if(data->set.pipewait) {
    infof(data, "Server doesn't support multiplex yet, wait");
    *waitpipe = TRUE;
    CONNCACHE_UNLOCK(data);
    return FALSE; /* no reuse */
    }
  2. With pipewait on, Curl_connect() returns error CURLE_NO_CONNECTION_AVAILABLE.

    curl/lib/url.c

    Lines 3669 to 3722 in df3d5ef

    if(waitpipe)
    /* There is a connection that *might* become usable for multiplexing
    "soon", and we wait for that */
    connections_available = FALSE;
    else {
    /* this gets a lock on the conncache */
    struct connectbundle *bundle =
    Curl_conncache_find_bundle(data, conn, data->state.conn_cache);
    if(max_host_connections > 0 && bundle &&
    (bundle->num_connections >= max_host_connections)) {
    struct connectdata *conn_candidate;
    /* The bundle is full. Extract the oldest connection. */
    conn_candidate = Curl_conncache_extract_bundle(data, bundle);
    CONNCACHE_UNLOCK(data);
    if(conn_candidate)
    Curl_disconnect(data, conn_candidate, FALSE);
    else {
    infof(data, "No more connections allowed to host: %zu",
    max_host_connections);
    connections_available = FALSE;
    }
    }
    else
    CONNCACHE_UNLOCK(data);
    }
    if(connections_available &&
    (max_total_connections > 0) &&
    (Curl_conncache_size(data) >= max_total_connections)) {
    struct connectdata *conn_candidate;
    /* The cache is full. Let's see if we can kill a connection. */
    conn_candidate = Curl_conncache_extract_oldest(data);
    if(conn_candidate)
    Curl_disconnect(data, conn_candidate, FALSE);
    else {
    infof(data, "No connections available in cache");
    connections_available = FALSE;
    }
    }
    if(!connections_available) {
    infof(data, "No connections available.");
    conn_free(data, conn);
    *in_connect = NULL;
    result = CURLE_NO_CONNECTION_AVAILABLE;
    goto out;
    }
  3. Because of that error, the handle is set to the PENDING state.

    curl/lib/multi.c

    Lines 1930 to 1943 in df3d5ef

    result = Curl_connect(data, &async, &connected);
    if(CURLE_NO_CONNECTION_AVAILABLE == result) {
    /* There was no connection available. We will go to the pending
    state and wait for an available connection. */
    multistate(data, MSTATE_PENDING);
    /* add this handle to the list of connect-pending handles */
    Curl_llist_insert_next(&multi->pending, multi->pending.tail, data,
    &data->connect_queue);
    /* unlink from the main list */
    unlink_easy(multi, data);
    result = CURLE_OK;
    break;
    }
  4. The aforementioned commit (843b3ba) unlinks PENDING items from the main list. So that it won't be checked during the next run, until all pending handles are added back by process_pending_handles(), triggered by some events.

Prior to that commit, pending handles were not removed from the main list. If I comment out the following line, the bug goes away in my case.

curl/lib/multi.c

Lines 1939 to 1940 in df3d5ef

/* unlink from the main list */
unlink_easy(multi, data);

This change is between 8.0.1 and 8.1.0. So I am less certain it is the same issue encountered by @rahulKrishnaM who uses 7.84.0.

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

Hi @Smackd0wn, I tried moving to curl version 8.1.2 and applying the patch you mentioned above. I still see the same observation as in the ticket description. Also, reattempted the same disabling CURLOPT_PIPEWAIT, and I see timeouts happening at the configured time as expected.

from curl.

Smackd0wn avatar Smackd0wn commented on June 4, 2024

Can you check if #13276 makes anything better?

It caused segmentation fault in multi_done() (backtrace to multi_handle_timeout() and multi_runsingle()), probably because of the removal of data->conn check.

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

Hi @bagder, I tried out the patch, but I think I still see the drift. And, after some time there was a crash, with bt as below (as @Smackd0wn has reported).

Program terminated with signal SIGSEGV, Segmentation fault.
#0  multi_done (data=data@entry=0x5c0ef1940000, status=status@entry=CURLE_OPERATION_TIMEDOUT, premature=premature@entry=true) at multi.c:684
684     multi.c: No such file or directory.
[Current thread is 1 (Thread 0x734999fd1a00 (LWP 30))]
(gdb) bt
#0  multi_done (data=data@entry=0x5c0ef1940000, status=status@entry=CURLE_OPERATION_TIMEDOUT, premature=premature@entry=true) at multi.c:684
#1  0x000073499d51d7ed in multi_handle_timeout (data=data@entry=0x5c0ef1940000, now=now@entry=0x7ffcbccdb440,
    stream_error=stream_error@entry=0x7ffcbccdb3e6, result=result@entry=0x7ffcbccdb3e8, connect_timeout=connect_timeout@entry=false) at multi.c:1670
#2  0x000073499d51db06 in multi_runsingle (multi=0x5c0ef17ee1c0, nowp=0x7ffcbccdb440, data=0x5c0ef1940000) at multi.c:1929
#3  0x000073499d51f794 in multi_socket (multi=multi@entry=0x5c0ef17ee1c0, checkall=checkall@entry=false, s=<optimized out>, ev_bitmask=<optimized out>,    running_handles=0x7ffcbccdb4dc) at multi.c:3258
#4  0x000073499d51f8bf in curl_multi_socket_action (multi=0x5c0ef17ee1c0, s=<optimized out>, ev_bitmask=<optimized out>,    running_handles=<optimized out>) at multi.c:3380
<---application timer processing in eventloop--->
(gdb) f 0
#0  multi_done (data=data@entry=0x5c0ef1940000, status=status@entry=CURLE_OPERATION_TIMEDOUT, premature=premature@entry=true) at multi.c:684
684     in multi.c
(gdb) p conn
$1 = (struct connectdata *) 0x0

But, there was evident drift in timeout before this crash happened.

from curl.

Smackd0wn avatar Smackd0wn commented on June 4, 2024

there was a crash

I believe I've fixed that in an update

But, there was evident drift in timeout before this crash happened.

@Smackd0wn's curl -Z cmdline test went from 24 seconds to 6 seconds for me, so it does have some impact in my tests.

Thank you very much for the fix. Indeed it fixed the segfault and improved the timeout behaviour. Now timeout without any connections is mostly correct. However, when I increased the numbers and ran curl -vZm 1 'http://127.1:1234/[1-500]' for several times, the command sometimes took 12 or 11 seconds instead of 10. When it's 12 or 11, some requests log Operation timed out after 2000 milliseconds with 0 bytes received instead of ~1000 milliseconds.

from curl.

bagder avatar bagder commented on June 4, 2024

I believe the reason for this is related to the optimization I did to fix #2369, which is not really optimal when where are many transfers pending due to PIPEWAIT... I have few ideas to work on.

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

I'm yet to try out with the latest changes @bagder. However, when I said the drift was still there, I am trying out with a higher timeout so that the variation is evident. I tried out with 25 seconds as connection_timeout, where I could see the variations as:
25.0496, 24.0507, 48.1019, 72.1 before the program crashed.

This is the pattern of logs when timeouts happen.

"Failed to connect to 192.168.1.8 port 9001 after 25048 ms: Timeout was reached","connection":"http://192.168.1.8:9001/code/1","result":28}
"Failed to connect to 192.168.1.8 port 9001 after 0 ms: Operation was aborted by an application callback","connection":"http://192.168.1.8:9001/code/2","result":42}
"Failed to connect to 192.168.1.8 port 9001 after 25048 ms: Timeout was reached","connection":"http://192.168.1.8:9001/code/3","result":28}
"Connection timed out after 25000 milliseconds","connection":"http://192.168.1.8:9001/code/4","result":28}

from curl.

bagder avatar bagder commented on June 4, 2024

@rahulKrishnaM please try again with current master and let us know if there is any difference

from curl.

rahulKrishnaM avatar rahulKrishnaM commented on June 4, 2024

Hi @bagder I need to check this. Will get back with the latest commit changes how my testing goes.

from curl.

bagder avatar bagder commented on June 4, 2024

Presumed to be fixed.

from curl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.