amqp's People

Contributors

adinapoli, alaendle, alain-odea-vgh, devwout, fegu, frincon, gseitz, himura, hreinhardt, joehealy, jwiegley, lemastero, michaelklishin, moysesb, nikomi, ocharles, peterbecich, qnikst, sarahhodne, tiredpixel, woffs

amqp's Issues

coMaxChannel, or some other way to limit the number of simultaneous threads

I'm writing a Yesod app, and part of my architecture is that a background worker process consumes "job" messages from an AMQP queue. Jobs are pushed onto the queue at an uneven and unpredictable rate. I.e. sometimes 1000 jobs will suddenly be pushed, while often 5 or 10 minutes will pass without any being pushed.

The callback I pass to consumeQueue performs a synchronous http callout and also runs database queries before and after each callout. So each callback thread needs to own a database connection for a while (up to a few seconds).

I'm running into situations where the background worker process is spinning off a bunch of callback threads in very quick succession, and pretty soon there are more concurrent callback threads than there are database connections. (At least, that's what I think is happening.) So a bunch of the threads end up dying, each reporting the following SqlError:

SqlError {sqlState = "", sqlExecStatus = FatalError, sqlErrorMsg = "FATAL: remaining connection slots are reserved for non-replication superuser connections\n", sqlErrorDetail = "", sqlErrorHint = ""}

My sense is that the right way to solve this problem is to limit the max number of channels in use at any one time. I'm not actually sure if that's right -- in my mind, 1 channel means 1 thread consuming 1 message (and passing it to my callback) at a time. So if that's not right, please correct me. But if it is right... are you close to implementing coMaxChannel? =) Or can you think of any other way to approach this situation?

Thanks!!
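Independently of any library option, one way to keep the number of callbacks that hold a database connection below the pool size is a counting semaphore around the expensive part of the callback. The sketch below uses only base; withSlot and runJobs are hypothetical names, and the threadDelay stands in for the HTTP callout and database work. It is an illustration of the idea under those assumptions, not something the library provides.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)
import Control.Monad (forM, forM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Run an action while holding one slot of the semaphore.
-- bracket_ releases the slot even if the action throws.
withSlot :: QSem -> IO a -> IO a
withSlot sem = bracket_ (waitQSem sem) (signalQSem sem)

-- Hypothetical driver: start `jobs` threads, let at most `limit` of them do
-- their work at once, and return the highest concurrency observed.
runJobs :: Int -> Int -> IO Int
runJobs limit jobs = do
  sem <- newQSem limit
  running <- newIORef (0 :: Int)
  peak <- newIORef (0 :: Int)
  dones <- forM [1 .. jobs] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      withSlot sem $ do
        n <- atomicModifyIORef' running (\r -> (r + 1, r + 1))
        atomicModifyIORef' peak (\p -> (max p n, ()))
        threadDelay 1000 -- stand-in for the callout and the queries
        atomicModifyIORef' running (\r -> (r - 1, ()))
      putMVar done ()
    pure done
  forM_ dones takeMVar
  readIORef peak

main :: IO ()
main = do
  peak <- runJobs 3 20
  putStrLn ("peak concurrent jobs: " ++ show peak)
```

In a real consumer, the semaphore would be created once with the size of the database pool and the callback body wrapped in withSlot.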

openConnection/closeConnection not thread safe

I noticed that if multiple connections to the same amqp server are opened on separate threads, one thread calling closeConnection causes all of the connections to stop working.

I don't plan to have per-thread connections in production (this arose when running tests in parallel) but it strikes me as something that would be good to fix, or at least document.

`ackEnv envelope` kills query consumer when channel is in `NoAck` mode

In my app, I mistakenly called ackEnv on a message received on a channel in NoAck mode. The application did not crash, but the query consumer was silently dropped. It took me an hour to figure out what had happened. Is this a bug?

I am using Stack resolver lts-9.2, so I am at amqp-0.15.1.

Should exceptions in a consumer close the channel?

Exceptions in a consumer are caught by amqp, so the program continues to run. However, the consuming channel is now unable to actually do any work: subsequent messages cause:

"ERROR: channel not open 1"

It seems bad to leave the program in a running state, so I think we should either propagate the error up to the application (so it entirely crashes), close the channel, or unregister the consumer.

Thoughts?

Add From/ToFieldValue classes

Thanks again for your AMQP library!

Here is something I added to my client which you might find useful:

type FieldEntry = (Text, FieldValue)

toFieldEntry ∷ ToFieldValue a ⇒ T.Text → a → FieldEntry
toFieldEntry k v = (k, toFieldValue v)

This relies on classes From/ToFieldValue:

class FromFieldValue a where
    fromFieldValue ∷ FieldValue → Maybe a

instance FromFieldValue () where
    fromFieldValue FVVoid = Just ()
    fromFieldValue _      = Nothing

instance FromFieldValue Bool where
    fromFieldValue (FVBool b) = Just b
    fromFieldValue _          = Nothing

instance FromFieldValue Int where
    fromFieldValue (FVInt8  i) = Just (fromIntegral i)
    fromFieldValue (FVInt16 i) = Just (fromIntegral i)
    fromFieldValue (FVInt32 i) = Just (fromIntegral i)
    fromFieldValue (FVInt64 i) = Just (fromIntegral i)
    fromFieldValue _      = Nothing

instance FromFieldValue ByteString where
    fromFieldValue (FVByteArray b) = Just b
    fromFieldValue _               = Nothing

instance FromFieldValue T.Text where
    fromFieldValue (FVString t) = Just t
    fromFieldValue _            = Nothing

instance FromFieldValue Float where
    fromFieldValue (FVFloat f)  = Just f
    fromFieldValue (FVDouble f) = Just (double2Float f)
    fromFieldValue _            = Nothing

instance FromFieldValue Double where
    fromFieldValue (FVDouble f) = Just f
    fromFieldValue (FVFloat f)  = Just (float2Double f)
    fromFieldValue _            = Nothing

and

class ToFieldValue a where
    toFieldValue ∷ a → FieldValue

instance ToFieldValue () where
    toFieldValue = const FVVoid

instance ToFieldValue Bool where
    toFieldValue = FVBool

instance ToFieldValue Int where
    toFieldValue = FVInt64 ∘ fromIntegral

instance ToFieldValue ByteString where
    toFieldValue = FVByteArray

instance ToFieldValue T.Text where
    toFieldValue = FVString

instance ToFieldValue Float where
    toFieldValue = FVFloat

instance ToFieldValue Double where
    toFieldValue = FVDouble

The integer handling could definitely be improved but this is all I needed for the moment.

Please feel free to add this to your AMQP library if you think it is useful.

Automate testing

I don't see any automated tests for the project. Are there any?

I'd be happy to contribute some (there are many examples that can be stolen from the Bunny and Langohr test suites) if you can recommend a testing library that's well suited for testing asynchronous workflows.

No way to declare a queue with headers

RabbitMQ has support for TTL and dead letter exchanges, which are specified as headers when declaring a queue. amqp supports specifying headers when binding a queue, but not when declaring one, so it is not possible to create queues with a dead letter exchange or TTL.

Heartbeats

I notice the Heartbeat field of ConnectionOpts, and the note that says it is for future use. RabbitMQ changed heartbeat defaults in the initial tune of the handshake recently. RabbitMQ v2.x suggests no heartbeat, while RabbitMQ v3.x suggests heartbeat. The server will send heartbeats and expects no reply. This helps a lot when one has stateful firewalls between a server and long-running client, as the firewalls typically have an idle timeout before severing the connection. This issue is also the reason RabbitMQ changed defaults.

Will this library actively change the suggested heartbeat value in the tune handshake (i.e. cancel heartbeats)?

Does this library support simply receiving the heartbeats and doing nothing with them, or will an incoming heartbeat cause an error? If so, I am motivated to develop this feature and make a pull request.

This issue should, in any case, be clarified in the docs.

Introduce ConnectionParams

The more complete the implementation becomes, the more parameters need to be passed into the various openConnection functions. I wanted to work on heartbeats next and I don't want to introduce openConnection''' just for passing in the heartbeat delay parameter, and then later on maxFrameSize etc.

How about adding something like ConnectionParams (for lack of a better name on my part), which captures host, port, vhost, SASL, max frame size, and heartbeat delay? Not exporting the data constructor would allow more fields to be added later. By exporting functions like

withHeartbeatDelay :: Int -> ConnectionParams -> ConnectionParams

one could then easily build up the connection param value like so:

withHeartbeatDelay 10 $ withMaxFrameSize 1337 defaultConnectionParams

I think this would in the end allow for gracefully deprecating the current openConnection[']{0,2} functions in favor of an overall cleaner API.

WDYT?
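To make the proposal concrete, here is a minimal sketch of such a parameter record with setter functions. The field names, the defaults, and the module header in the comment are all hypothetical and do not describe the library's actual API.

```haskell
-- In a real module the constructor would be kept abstract, for example:
--   module Params (ConnectionParams, defaultConnectionParams,
--                  withHeartbeatDelay, withMaxFrameSize) where
-- so that fields can be added later without breaking callers.

-- Hypothetical record; the field names are illustrative only.
data ConnectionParams = ConnectionParams
  { cpHost :: String
  , cpPort :: Int
  , cpVHost :: String
  , cpMaxFrameSize :: Maybe Int
  , cpHeartbeatDelay :: Maybe Int
  } deriving (Show, Eq)

-- Assumed defaults: local broker on the standard AMQP port, nothing tuned.
defaultConnectionParams :: ConnectionParams
defaultConnectionParams = ConnectionParams
  { cpHost = "127.0.0.1"
  , cpPort = 5672
  , cpVHost = "/"
  , cpMaxFrameSize = Nothing
  , cpHeartbeatDelay = Nothing
  }

-- Each setter overrides one field and leaves the rest untouched.
withHeartbeatDelay :: Int -> ConnectionParams -> ConnectionParams
withHeartbeatDelay d p = p { cpHeartbeatDelay = Just d }

withMaxFrameSize :: Int -> ConnectionParams -> ConnectionParams
withMaxFrameSize n p = p { cpMaxFrameSize = Just n }

main :: IO ()
main = print (withHeartbeatDelay 10 $ withMaxFrameSize 1337 defaultConnectionParams)
```

Because the setters are plain functions from ConnectionParams to ConnectionParams, they compose in any order, and a single openConnection variant taking the record would be enough.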

Channel thread not killed when connection is closed while a subscriber callback is executing

When a connection is closed, an exception is thrown in the channel threads, so they are killed. https://github.com/hreinhardt/amqp/blob/master/Network/AMQP/Internal.hs#L285

However, when a subscriber callback is running, this exception gets caught.
https://github.com/hreinhardt/amqp/blob/master/Network/AMQP/Internal.hs#L546

This means the channel thread will not get killed, and any registered ChannelExceptionHandlers will not get run.

The catching of callback errors should probably be more selective. Should it ignore ThreadKilled and AMQPException? Or should finaliser errors get tagged and ignored explicitly in the callback catch?
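One possible shape for a more selective catch, using only base: log synchronous exceptions from the callback, but re-throw anything that belongs to the asynchronous exception hierarchy (which includes ThreadKilled). runCallback is a hypothetical name. Whether the library's own exception types would need an extra case is an open question that this sketch does not answer.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception
  ( AsyncException (ThreadKilled), SomeAsyncException, SomeException
  , catch, fromException, throwIO, try )
import System.IO (hPutStrLn, stderr)

-- Run a callback. Synchronous exceptions are logged and swallowed;
-- asynchronous ones (ThreadKilled and friends) are passed on unchanged.
runCallback :: IO () -> IO ()
runCallback cb = cb `catch` \(e :: SomeException) ->
  case fromException e of
    Just (ae :: SomeAsyncException) -> throwIO ae
    Nothing -> hPutStrLn stderr ("callback threw exception: " ++ show e)

main :: IO ()
main = do
  -- An ordinary error is logged and does not escape.
  runCallback (throwIO (userError "boom"))
  -- ThreadKilled is not swallowed.
  r <- try (runCallback (throwIO ThreadKilled))
  case r of
    Left ThreadKilled -> putStrLn "async exception propagated"
    Left other -> putStrLn ("unexpected exception: " ++ show other)
    Right () -> putStrLn "async exception was swallowed"
```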

Would it make sense to have an "addChannelClosedHandler"?

Hi @hreinhardt ,

In production it would be nice to know and take action every time a channel is closed, for any reason. Since each channel connection is created in a separate thread, we are out of luck at the moment:

https://github.com/hreinhardt/amqp/blob/master/Network/AMQP/Internal.hs#L589

The library is rightfully using finally here, so all it would take is:

  • A function very similar to addConnectionClosedHandler, which would allow us to register new handlers
  • Call all the registered handlers in the finally block, to make sure we get notified.

A perhaps even better solution would be to leverage forkFinally:

http://hackage.haskell.org/package/base-4.7.0.2/docs/Control-Concurrent.html#v:forkFinally

This would allow us to expose two new functions, addChannelClosedHandler and addChannelClosedExceptionHandler. Simply pattern matching on the result given back from forkFinally would allow us to call one group of handlers rather than the other. This would have the benefit of letting us react differently depending on whether the channel is closed "naturally" or an exception caused it to be closed. Bear in mind, though, that forkFinally is only available from base 4.7 onwards, so a bit of CPP would be needed to support GHC versions prior to 7.8.x.
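The forkFinally idea could look roughly like this. The Handlers record and runChannelThread are hypothetical stand-ins rather than the library's internals; the sketch only shows how matching on the Either result selects one group of handlers.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO)
import Data.IORef

-- Hypothetical registry holding the two groups of handlers.
data Handlers = Handlers
  { onClosed :: IORef [IO ()]
  , onClosedException :: IORef [SomeException -> IO ()]
  }

-- Run the channel loop in its own thread. When it ends, call the handlers
-- matching the way it ended. The returned MVar is filled afterwards.
runChannelThread :: Handlers -> IO () -> IO (MVar ())
runChannelThread hs body = do
  finished <- newEmptyMVar
  _ <- forkFinally body $ \result -> do
    case result of
      Right () -> readIORef (onClosed hs) >>= sequence_
      Left e -> readIORef (onClosedException hs) >>= mapM_ ($ e)
    putMVar finished ()
  pure finished

main :: IO ()
main = do
  hs <- Handlers
    <$> newIORef [putStrLn "channel closed normally"]
    <*> newIORef [\e -> putStrLn ("channel closed by exception: " ++ show e)]
  runChannelThread hs (pure ()) >>= takeMVar
  runChannelThread hs (throwIO (userError "boom")) >>= takeMVar
```

A production version would also have to decide what happens when a handler itself throws; here that would leave the MVar empty.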

Do you think all of this would be overkill?
Thanks a lot!

Alfredo

No Channel.Close_ok sent when channel is closed from server

When the server raises a soft error, it sends a Channel.Close message and expects a Channel.Close_ok response, which is not sent.

Then, when trying to open a new channel on the same connection, the server sends CHANNEL_ERROR - second 'channel.open' seen, which causes the connection to be closed.

According to the specification (https://www.rabbitmq.com/resources/specs/amqp0-9-1.pdf), section 2.3.7, it is mandatory to reply with the Channel.Close_ok command whenever either peer sends a Channel.Close command.

How to reproduce:

#!/usr/bin/env stack
-- stack script --resolver lts-13.26 --package amqp
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Network.AMQP
import Control.Exception (catch)
import Control.Concurrent (threadDelay)
import Control.Monad (void)


main :: IO ()
main = do
  conn <- openConnection "127.0.0.1" "/" "guest" "guest"
  chan <- openChannel conn
  confirmSelect chan False

  -- The exchange does not exist, so the server sends a Channel.Close
  publishMsg chan "not-existing-exchange" "" newMsg {msgBody = "some body"}
  threadDelay 1000000
  catch (void $ publishMsg chan "not-existing-exchange" "" newMsg {msgBody = "some body"})
    (\ (_ :: AMQPException) -> return ())

  -- So the channel is closed, open new one
  chan <- openChannel conn
  return ()

Expected behavior: Open second channel should not throw an exception.
Current behavior: The operation fails with CHANNEL_ERROR - second 'channel.open' seen

Lack of type safety is disturbing

Coming at this library from a functional programmer perspective (and ignoring the typing and conventions inherent in the RabbitMQ community), the library feels a bit too untyped.

Would you accept a patch? I'm thinking of additions along the lines of:

newtype Queue = Queue Text
newtype RoutingKey = RoutingKey Text
instance IsString Queue where fromString  = Queue . fromString
instance IsString RoutingKey where fromString = RoutingKey . fromString
instance Show Queue where show (Queue q) = show q
instance Show RoutingKey where show (RoutingKey k) = show k
-- Other instances as needed or is sensible (read, eq, ord, etc)
toText :: (Show a, IsString b) => a -> b  -- or substituting `Text` for type var `b`
toText = fromString . show

After this ground work we can have more structure to our calls, which should also make the library easier to learn and use. For example:

declareQueue :: Channel -> QueueOpts -> IO (Queue, Int, Int)
bindQueue :: Channel -> Queue -> Exchange -> RoutingKey -> IO ()
publishMsg :: Channel -> Exchange -> RoutingKey -> Message -> IO ()

And thanks to OverloadedStrings most if not all of the examples will work fine.

availability of reason for connection closing in ConnectionClosedHandler

Is there a possibility to get the string of ConnectionClosedException String inside the handler when using addConnectionClosedHandler? Maybe there needs to be another signature for that, with (e -> IO ()) instead of just IO (). Or is there another possibility to get the reason string for closed connections?

Lower minimum bound on `text` to `0.11.2.0`

Would it be possible to extend the minimum text bound to 0.11.2.0 instead of 0.11.2.3? The reason I request this is because I would like to use amqp >= 0.4 on Debian stable and the Haskell platform packaged for that system comes with text == 0.11.2.0.

Design: Exception handling (swallowing?) in subscriber callback.

I'm unhappy with the exception handling of the subscriber callback. Before looking into the code I expected that an unhandled exception would propagate to the uncaught exception handler, especially since in IO a lot of unforeseeable things can happen that client code just cannot handle in a meaningful way (think of out-of-memory situations).

CE.catches (subscriber (msg, env))
[
CE.Handler (\(e::ChanThreadKilledException) -> CE.throwIO $ cause e),
CE.Handler (\(e::CE.SomeException) -> hPutStrLn stderr $ "AMQP callback threw exception: " ++ show e)
]

But with the current design, client code needs to wrap the subscriber callback with catches, something like:

        consumerIdentifier <- AMQP.consumeMsgs ch name AMQP.Ack (\(msg, env) -> do ...
          `CE.catches`
          [
            CE.Handler (\(e::AMQP.ChanThreadKilledException) -> CE.throwIO e),  
            -- need to catch and rethrow
            -- exposes implementation details - why should a client care about ChanThreadKilledException? what if things change?
            CE.Handler (\(e::CE.SomeException) -> 
             -- what next? The subscriber can't do anything useful here by itself!
             -- I see two possibilities here
             -- 1) get the uncaught exception handler and raise it manually (isn't this strange?)
             -- 2) use posix signals to initiate a graceful shutdown
              raiseSignal sigTERM)
          ])

All in all this workaround seems unnecessary to me - because if the subscriber callback is able to handle some exceptions in a meaningful way it should do so and no exception will reach your code; if unhandled exceptions occur these should be treated as meant - and not be silently swallowed.

I know this has its pros and cons, that removing the exception handling would be a breaking change, and that it would need some work to ensure proper clean-up, but I really believe it could contribute to more stable clients in the long term.

And please don't get me wrong: I'm really happy with the AMQP library and appreciate the work of all contributors. I just want to discuss this issue because, at least to me, the actual behaviour was unexpected, so I thought it might be worth considering if a major update or redesign of the API happens.

consumeMessages hangs when committing transaction

I have this simple consumer:

 consumeMsgs chan "ping-queue" Ack $ \(msg, env) -> do
   txSelect chan
   LBS.putStrLn $ msgBody msg
   ackEnv env
   txCommit chan

The queue contains a number of messages.

When I run this program, I see that only one message is being printed to the console. RabbitMQ management console shows that all the messages are in unacked state.

Removing txCommit prints all the messages to the console (still unacked, of course).

This makes me think that txCommit within consumeMsgs doesn't work.

At the same time this code works fine and does not hang after committing a transaction:

  msg <- getMsg chan Ack "ping-queue"
  case msg of
    Nothing -> LBS.putStrLn $ "!! NOTHING !!"
    Just (msg, env) -> do
      LBS.putStrLn $ msgBody msg
      ackEnv env
      txCommit chan

Support Consumer Cancellation Notification

First of all, thanks for this AMQP library!

In addition to the cool things the library already supports I would suggest Consumer Cancellation Notification as described in https://www.rabbitmq.com/consumer-cancel.html

I think to support this the following would be needed:

  • clientProperties in function start_ok within openConnection'' would need to add key "capabilities" with the value being a table containing "consumer_cancel_notify" = FVBool True (see https://www.rabbitmq.com/consumer-cancel.html#capabilities)
  • the channelReceiver would need a function handleAsync (SimpleMethod (Basic_cancel ... to handle the event and dispatch the registered Consumer Cancellation Notification handlers
  • a new function addConsumerCancellationHandler to register such a callback

I hope I have not missed something vital.

What do you think?

DNS lookup

Not really an issue, just a question.

I'm using your module on Heroku, where my RabbitMQ service is provided by CloudAMQP. Since you require that host be an IP string, I'm doing a DNS lookup on Heroku's CLOUDAMQP_URL environment variable before I call openConnection. Am I overlooking built-in library functionality that should obviate the need for that manual lookup step?

Thanks!

Implement closeChannel

Network.AMQP mentions that there is currently no channel.close support. I'm not sure why and see no reason for a client to not implement this operation (even though channels are fairly rarely closed explicitly).

Would be happy to look into contributing this.

Why is publishing on the same channel thread while consuming a problem?

Hi,

I understand the solution of forking a new thread to do any complex consuming that involves further using the channel where we are consuming messages. What I don't fully understand is the reason why this is needed.

I understand it has to do with deadlocking the channel. In the case of publishing a new message, the operation might block if the channel is in flow mode. But why would this be a deadlock? Is it because the channel thread is blocked and thus it cannot receive the signal to be released from flow mode? Or is it a more subtle internal thing? Or is it just that other consumers from the same channel would get blocked, which is generally something you wouldn't want, but not necessarily a deadlock?

Thank you for your time!

Bug in `amqp-0.5`

When I upgrade my code from 0.4.2 to 0.5 I get the following mysterious error when connecting to a message queue:

Data.Binary.Get.runGet at position 0: demandInput: not enough bytes

I'm limited for time at the moment, so I haven't had time to narrow down this test case from within the application that this occurred in, but my best guess from studying the diff between 0.4.2 and 0.5 is that this exception probably originated in your internal readFrame function.

Accessing BasicNack and BasicAck

AckType is defined in Network.AMQP.Internal, but does not seem to be exported externally. Is there a way to get access to it to be able to write a confirmation listener effectively?

Documentation clarification regarding threads

I'm sorry for raising this as an issue, perhaps the maintainer would prefer an email for future issues such as this?

A common minimum initial call sequence is
openConnection
openChannel
declareQueue
consumeMsgs

Some programs are multi-threaded. It should perhaps be pointed out that while this works fine:
forkIO $
openConnection
openChannel
declareQueue
consumeMsgs

and this works fine:
openConnection
forkIO $
openChannel
declareQueue
consumeMsgs

this just silently fails (no errors, just does not work):
openConnection
openChannel
forkIO $
declareQueue
consumeMsgs

The reason being, I suppose, that each channel gets its own thread by the amqp library. However, this consequence of the separate channel threads is not obvious from the documentation.

On a side note, after addConnectionClosedHandler was added in v0.2.7 and qos in v0.4.0, it is now possible to make quite solid AMQP clients with this library.

[docs] closeConnection notice

closeConnection should be called before the program or thread exits; otherwise messages might not be sent. I think this should be reflected in the function's documentation comment.

Losing connection but no exception thrown

I'm writing a worker consuming RabbitMQ messages. For that I'm establishing a connection and subscribe to a queue with consumeMsgs - like in the example in the Network.AMQP docs.

If I then run the program and shut down RabbitMQ, I do not see any exception getting thrown. Shouldn't I get a ConnectionClosedException? I need to detect this case so I can try to reconnect.

consumeMsgs forever

In a long running service what is the correct way to wait for the consumeMsgs function?

The examples in the documentation use getLine or threadDelay, and for my application I haven't found a better way than using forever $ threadDelay 1000000, but it seems a bit of a hack.

Might be something obvious in the use of Control.Concurrent that I'm missing.

withChannel :: Connection -> (Channel -> IO a) -> IO a
withChannel conn =
  bracket (openChannel conn) closeChannel

consumeMaterials :: Config -> Connection -> (Material -> IO ()) -> IO ()
consumeMaterials (Config {..}) conn f =
  withChannel conn $ \chan -> do
    void $ declareQueue chan newQueue { queueName = cfgMqQueue }
    void $ consumeMsgs chan cfgMqQueue Ack (upsertMessage f)
    forever $ threadDelay 1000000

upsertMessage :: (Material -> IO ()) -> (Message, Envelope) -> IO ()
upsertMessage f (msg, env) = do
  logStr $ "upsertMessage: " ++ show msg
  void $ f $ fromMessage msg
  ackEnv env
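A common alternative to forever $ threadDelay is to block the main thread on an MVar that some handler fills when the service should stop. The sketch below uses only base; signalDone and waitUntilDone are hypothetical names, and a helper thread stands in for a handler that the library would call when the connection closes.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, readMVar, tryPutMVar)
import Control.Monad (void)

-- Signal that the service should stop. tryPutMVar never blocks, so calling
-- this more than once (or from several handlers) is harmless.
signalDone :: MVar () -> IO ()
signalDone done = void (tryPutMVar done ())

-- Block the calling thread, without polling, until signalDone was called.
waitUntilDone :: MVar () -> IO ()
waitUntilDone = readMVar

main :: IO ()
main = do
  done <- newEmptyMVar
  -- With amqp, the signal would presumably be registered on the connection
  -- via addConnectionClosedHandler (exact arguments are an assumption here).
  -- A helper thread stands in for that handler in this sketch.
  _ <- forkIO (threadDelay 200000 >> signalDone done)
  waitUntilDone done
  putStrLn "connection closed, shutting down"
```

In consumeMaterials above, the final forever $ threadDelay 1000000 line would then become a call to waitUntilDone.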

Use of channels and forkIO

The documentation for consumeMsgs says: "[..] DO NOT perform any request on chan inside the callback (however, you CAN perform requests on other open channels inside the callback, though I wouldn't recommend it). Functions that can safely be called on chan are ackMsg, ackEnv, rejectMsg, recoverMsgs. If you want to perform anything more complex, it's a good idea to wrap it inside forkIO."

When implementing RPC with AMQP, the server needs to send the reply on the client's reply-queue. I understand that this should not be done on the same chan (and have seen that it will fail). However, I do not understand the rationale for the parts "though I wouldn't recommend it" and "it's a good idea to wrap it inside forkIO". Why would you not recommend putting a heavy time-consuming calculation inside the callback (and then send the reply on a separate channel)? Why the need for forkIO?

We should update the docs to reflect the reasons why.

I am currently running an RPC server in this way, and it works ok.

Add withConnection function

Lately I've worked a lot with RabbitMQ, unfortunately with Kotlin/Arrow. One "pattern"* that proved to be very useful are with... functions that make sure connections (and channels in Kotlin) are closed automatically under all circumstances.
This reduces boilerplate and hides the connection management logic from the user.

http-client uses this approach to handle connections, too: https://hackage.haskell.org/package/http-client-0.7.4/docs/Network-HTTP-Client.html#v:withConnection

withConnection for amqp could look like withConnection :: String -> Text -> Text -> Text -> (Connection -> IO a) -> IO a.

One could go even one step further and define withConnectedChannel :: String -> Text -> Text -> Text -> (Channel -> IO a) -> IO a, assuming that most users work with one connection and one channel.

I would happily volunteer to implement this, but thought it might make sense to first discuss this topic.

*) Unfortunately I don't know what it's really called.
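In Haskell this is usually built on bracket from Control.Exception. The sketch below uses a fake connection type so that it runs without a broker; FakeConnection, openFake, closeFake and withFakeConnection are hypothetical names. With the real library the body would presumably be bracket (openConnection host vhost user pass) closeConnection.

```haskell
import Control.Exception (SomeException, bracket, throwIO, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Stand-in for a real connection: a flag recording whether it is open.
newtype FakeConnection = FakeConnection (IORef Bool)

openFake :: String -> IO FakeConnection
openFake _host = FakeConnection <$> newIORef True

closeFake :: FakeConnection -> IO ()
closeFake (FakeConnection ref) = writeIORef ref False

isOpen :: FakeConnection -> IO Bool
isOpen (FakeConnection ref) = readIORef ref

-- bracket runs closeFake whether the action returns or throws.
withFakeConnection :: String -> (FakeConnection -> IO a) -> IO a
withFakeConnection host = bracket (openFake host) closeFake

main :: IO ()
main = do
  seen <- newIORef Nothing
  r <- try $ withFakeConnection "127.0.0.1" $ \conn -> do
    writeIORef seen (Just conn)
    throwIO (userError "callback failed")
  case r of
    Left e -> putStrLn ("action failed: " ++ show (e :: SomeException))
    Right () -> putStrLn "action succeeded"
  -- Check the connection that was handed to the failing action.
  mconn <- readIORef seen
  case mconn of
    Just conn -> isOpen conn >>= \o -> putStrLn ("still open: " ++ show o)
    Nothing -> putStrLn "action never ran"
```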

Impossible to create named connections

It seems that it is impossible to create a connection that will show up in RabbitMQ management with a name. Other bindings/libraries allow for that, e.g. RabbitMQ.Client (.Net library). Am I just missing how to do it (cannot find it by reading source code or documentation) or is it missing?

Should return published message's sequence number when in confirm mode

This could be done either by having publishMsg return it, or via a new wrapping function, since changing publishMsg would be a breaking change to the API.

Not being able to match the outgoing message with the delivery-tag in the addConfirmationListener method prevents additional, message-specific logic from being performed, e.g. marking messages as having been published in an external system.
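Until the library exposes the number, a client-side counter is one workaround. In confirm mode the broker numbers publishes on a channel starting from 1, so a wrapper that counts every publish can predict the tag. The sketch below uses only base; SeqCounter and publishNumbered are hypothetical names, and it assumes that every publish on the channel goes through the wrapper after confirm mode was enabled.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, newMVar)
import Data.Word (Word64)

-- Holds the sequence number of the last publish; 0 means none yet.
newtype SeqCounter = SeqCounter (MVar Word64)

newSeqCounter :: IO SeqCounter
newSeqCounter = SeqCounter <$> newMVar 0

-- Run a publish action and return the sequence number it was assigned.
-- The MVar serialises publishers, so numbering and publishing stay in step;
-- if the publish action throws, the counter is left unchanged.
publishNumbered :: SeqCounter -> IO () -> IO Word64
publishNumbered (SeqCounter var) publish =
  modifyMVar var $ \lastSeq -> do
    publish
    let next = lastSeq + 1
    pure (next, next)

main :: IO ()
main = do
  counter <- newSeqCounter
  a <- publishNumbered counter (putStrLn "publishing first message")
  b <- publishNumbered counter (putStrLn "publishing second message")
  print (a, b)
```

The returned number could then be stored next to the message in the external system and looked up when the confirmation listener reports a delivery tag.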

bug in amqp0.5

v0.5 works fine with RabbitMQ 2.8.4, but when trying to connect to RabbitMQ 3.1.5 I get amqptest: <socket: 212>: hGetBuf: failed (Unknown error). I made a small test program called amqptest to narrow it down and also instrumented the amqp lib with some debug logging. The error happens after sending three frames to the server during login/setup.

flow function crashes consumer.

I tried extending the example provided in the library by making it possible to pause processing of messages, but the flow function causes the connection to be closed for some reason. Here's my code:

{-# LANGUAGE OverloadedStrings #-}
import Network.AMQP
import Data.Binary
import Data.Binary.Generic
import qualified Data.ByteString.Lazy.Char8 as L
import Control.Concurrent    

main = do
  conn <- openConnection "127.0.0.1" "/" "guest" "guest"
  chan <- openChannel conn
  consumeMsgs chan "myQueue" Ack myCallback
  pauser chan False
  closeConnection conn
  L.putStrLn "connection closed"

pauser chan p = do
  ch <- getLine
  if ch == "p"
    then do
    flow chan p 
    pauser chan (not p)
    else
    return ()

myCallback :: (Message, Envelope) -> IO ()
myCallback (msg, env) = do
    L.putStrLn $ L.append "received message: " (msgBody msg)
    ackEnv env

Whenever I hit "p" and enter, I would expect this code to stop receiving messages. However, it throws an INTERNAL_ERROR exception instead.

Publisher Confirms and thread safety on Channel.

I'm working on adding support for publisher confirms to amqp. In principle it is possible to implement this feature and still preserve Channel's thread safety, but that seems to me to be an unnecessary compromise given that no one should be using a channel from multiple threads anyway.

Would it be ok to make a Channel not thread-safe when the publisher confirm mode is selected?

Dangerous uses of forkIO

Hi!

While going down a rabbit hole together with @vidocco, we found that the amqp library uses the dangerous forkIO.

Our issue is as follows:
We have a worker executable that should watch for certain messages and "deal" with them.
This worker executable should keep running at all times.
However, consumeMsgs is non-blocking, so we use something like takeMVar to block until a certain variable is filled by our connectionClosedHandler.
So far so good, but this has the following problem:
If the handler ever stops receiving messages without closing the connection, then our worker will hang indefinitely. (Any advice on how to do this more robustly would be appreciated.)

We recently had an incident where the worker stopped processing messages without crashing, so @vidocco and I went to investigate.
Our hypothesis, after reading the amqp code, is that the following might have happened:

The channelReceiver seems to be the function that is responsible for handling incoming messages and is started using forkFinally, which uses forkIO.
This means that if this thread ever stops looping (because it throws an exception for example), the main thread will not notice this and the connection will not be closed either.

Here is an example to show that that's how forkIO works:

#!/usr/bin/env stack
-- stack --resolver lts-15.15 script
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent
main :: IO ()
main = do
  putStrLn "Starting our 'server'."
  forkIO $ do            
    putStrLn "Serving..."
    threadDelay 1_000_000
    putStrLn "Oh no, about to crash!"
    threadDelay 1_000_000
    putStrLn "Aaaargh"
    undefined
  threadDelay 5_000_000
  putStrLn "Still running, eventhough we crashed"
  threadDelay 5_000_000                 
  putStrLn "Ok that's enough of that, stopping here."

Which outputs:

$ ./test.hs
Starting our 'server'.
Serving...
Oh no, about to crash!
Aaaargh
test.hs: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
  undefined, called at /home/syd/test/test.hs:17:5 in main:Main
Still running, eventhough we crashed
Ok that's enough of that, stopping here.

We haven't found which part of the channelReceiver might have thrown the exception that may have caused the handler thread to stop, but we found some usages of error that may have been the culprit.

In short: forkIO is dangerous and should be avoided. You can use async together with link instead.
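For comparison, a link-like behaviour can be approximated with base alone: fork with forkFinally and re-throw a failure in the parent thread. forkLinked is a hypothetical helper, not part of amqp or async. Unlike async's link, this sketch also forwards ThreadKilled when the child is killed deliberately.

```haskell
import Control.Concurrent (ThreadId, forkFinally, myThreadId, threadDelay)
import Control.Exception (ErrorCall (..), SomeException, throwIO, throwTo, try)

-- Fork a thread. If it dies with an exception, throw that exception to the
-- thread that called forkLinked, so the failure cannot go unnoticed.
forkLinked :: IO () -> IO ThreadId
forkLinked action = do
  parent <- myThreadId
  forkFinally action $ \result -> case result of
    Left e -> throwTo parent e
    Right () -> pure ()

main :: IO ()
main = do
  r <- try $ do
    _ <- forkLinked (threadDelay 100000 >> throwIO (ErrorCall "receiver crashed"))
    threadDelay 5000000
    putStrLn "finished waiting without seeing an exception"
  case r of
    Left e -> putStrLn ("main thread saw: " ++ show (e :: SomeException))
    Right () -> putStrLn "worker failure went unnoticed"
```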

Deadlock on calling cancelConsumer.

I'm calling cancelConsumer from another thread while consuming messages. I would expect this call to succeed, but it blocks indefinitely. Please note that the call succeeds if there are no in-flight messages. After reading #84: is this a related problem? As far as I understand it, the usage of request inside cancelConsumer could be a problem? I'm keeping this short for now, but please let me know if you need further information.

Should print all connection exceptions if connection fails

In Internal.hs, the following code tries all host/port combinations in series:

    connect ((host, port) : rest) = do
        ctx <- Conn.initConnectionContext
        result <- CE.try (Conn.connectTo ctx $ Conn.ConnectionParams
                              { Conn.connectionHostname  = host
                              , Conn.connectionPort      = port
                              , Conn.connectionUseSecure = tlsSettings
                              , Conn.connectionUseSocks  = Nothing
                              })
        either
            (\(ex :: CE.SomeException) -> do
                connect rest)
            (return)
            result
    connect [] = CE.throwIO $ ConnectionClosedException $ "Could not connect to any of the provided brokers: " ++ show (coServers connOpts)

Since this swallows all exceptions without printing them, if there's a deeper problem involved, the only thing the user will see is "Cannot connect to ", with no clue as to why. In my case, it was because I was missing netbase in my Ubuntu Docker image, so it wasn't a connection problem, but something much deeper (which the Exception clearly indicated).

I suggest capturing all these exceptions to a list, and if no connection is possible, print out the whole list, with each exception paired to the host/port that caused it.
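The suggestion could be sketched like this, using only base: try each target in order and, if all of them fail, return every exception paired with the target that caused it. tryEach is a hypothetical helper, and fakeConnect stands in for the real connection attempt.

```haskell
import Control.Exception (SomeException, throwIO, try)

-- Try each target in turn. Return the first successful result, or the full
-- list of failures (in the order they were attempted) if none succeeds.
tryEach :: (target -> IO conn) -> [target] -> IO (Either [(target, SomeException)] conn)
tryEach connectTo = go []
  where
    go failures [] = pure (Left (reverse failures))
    go failures (t : rest) = do
      result <- try (connectTo t)
      case result of
        Right conn -> pure (Right conn)
        Left ex -> go ((t, ex) : failures) rest

-- Stand-in for a connection attempt that always fails.
fakeConnect :: (String, Int) -> IO ()
fakeConnect (host, _) = throwIO (userError ("cannot resolve " ++ host))

main :: IO ()
main = do
  r <- tryEach fakeConnect [("broker-a", 5672), ("broker-b", 5672)]
  case r of
    Right () -> putStrLn "connected"
    Left failures -> do
      putStrLn "Could not connect to any of the provided brokers:"
      mapM_ (\(t, ex) -> putStrLn ("  " ++ show t ++ ": " ++ show ex)) failures
```

Like the original code, this catches SomeException, which also covers asynchronous exceptions; a stricter version might re-throw those instead of recording them.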
