tokio-rs / mio
Metal I/O library for Rust.
License: MIT License
I'm considering moving away from using std::net types in favor of re-implementing them in Mio. Mio started off using this strategy, then attempted to use std::net types when available. Mio introduced NonBlock<_> in order to differentiate between blocking std sockets and non-blocking std sockets.
I am considering going back to the original strategy in order to achieve a few things.
First, I would like to get mio working on stable. Unfortunately, Error::from_os_error is not stable, which means that it is impossible to create an error value without allocating. This makes me wary of relying on std::net types in the future to be able to support the features that mio wants to.
Second, I have put a lot more thought into Windows support. My opinions re: Windows support have changed over time. I thought that it would be best to leave Windows support to a different library; however, I started writing this library, spent a lot of time with the Windows IOCP APIs, and I believe that it would be possible to support Windows equivalently with some minor (backwards-compatible) API tweaks. However, to do this, all the network types would need to be implemented in mio.
Also, if mio owns all the types, there wouldn't be much use for NonBlock<_> anymore, since all the types in mio could be non-blocking by default, but this is an optional change.
So, the strategy would be:
- Move TcpStream & TcpListener into mio
- Drop std::io::Error in favor of mio::Error
- Remove NonBlock<_> (optional)
It isn't worth the overhead, and it has some quirks.
I have been switching mio to use the net types provided by std (TcpListener, TcpStream, UdpSocket, etc.). However, this brings up the issue that it isn't really possible to tell whether a socket is in non-blocking mode or not.
I have been considering decorating with a struct to indicate that the underlying socket is in non-blocking mode. However, I am unsure of the exact name. A few options:
NonBlocking<TcpStream>: A couple of problems are that the marker type is relatively long and will add noise to code. Also, it would be very close to the NonBlock<T> type that is used as a return value for a non-blocking operation.
Try<TcpStream>: Short, and mirrors the Try* traits that are used to add non-blocking operations (for example TryRead, TryWrite, etc.). However, Try is a very vague name, so it may be too fuzzy.
Poll<TcpStream>: Fits in conceptually, but would require renaming the current Poll type.
So, the way this would work is that you could take a blocking socket and turn it non-blocking:
TcpStream::non_blocking(self) -> Nb<TcpStream> // where Nb is a placeholder
All of the various Try* traits would be implemented for Nb<SocketType> directly.
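To make the proposal concrete, here is a minimal sketch of the marker type. Nb is the issue's own placeholder name, and the TryRead trait shown is illustrative only, not necessarily mio's exact definition.

```rust
use std::io::{self, Read};

// Placeholder wrapper: the inner socket is assumed to already be in
// non-blocking mode.
struct Nb<T>(T);

trait TryRead {
    // Ok(None) signals that the operation would block.
    fn try_read(&mut self, buf: &mut [u8]) -> io::Result<Option<usize>>;
}

impl<T: Read> TryRead for Nb<T> {
    fn try_read(&mut self, buf: &mut [u8]) -> io::Result<Option<usize>> {
        match self.0.read(buf) {
            Ok(n) => Ok(Some(n)),
            Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => Ok(None),
            Err(e) => Err(e),
        }
    }
}
```

The point of the wrapper is purely type-level: only sockets that have been switched to non-blocking mode expose the Try* operations.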
Another option is to not use typing to differentiate between blocking & non-blocking sockets. Thoughts?
cc @wycats @rrichardson @reem @vhbit @rozaliev @fhartwig @vberger @vbuslov
It is a violation of the most fundamental layers-of-abstraction rules. It also violates any other notion of abstraction. Event loops don't "connect". Sockets "connect". It's driving me fucking crazy.
Some types are not exported in the public API of mio, but are used as return types of some functions in the public API. They are visible in the generated documentation, but are not links.
So far I found:
- EventLoopError, used in EventLoopResult
- TimerResult, which is the return type of EventLoop::timeout
- IoHandle: not sure about this one. It appears in various generics in the lib, but leaving it private effectively forbids using them with non-mio structures.

If so, we should integrate; if not, we should delete the branch.
c-ares provides async DNS resolution. Investigate using it with mio.
kqueue definitely supports timers as an event via EVFILT_TIMER; I haven't checked epoll.
What are the pros/cons of using EventLoop::timeout over "native" timer events?
MIO should provide a high performance, scalable timer API. Coarse grained time resolution is OK given that the use case is IO.
An appropriate strategy would be to use hashed timing wheels (Varghese & Lauck). Netty uses this approach using a tick size of 100ms. It is worth noting that, if the max timeout used is 60 seconds and the tick size is 100ms, a wheel with 1024 slots (rounded to the closest power of 2) would allow the timer to have O(1) performance characteristics for registering a timeout, clearing a timeout, and triggering a timeout.
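As a sketch of the slot math behind such a wheel, assuming the numbers from the paragraph above (100ms ticks, 1024 slots); the names here are illustrative, not mio's actual API:

```rust
const TICK_MS: u64 = 100;
const SLOTS: usize = 1024; // power of two, so `& (SLOTS - 1)` replaces `%`

struct Wheel {
    slots: Vec<Vec<u64>>, // each slot holds the ids of timeouts expiring there
    current: usize,       // slot the cursor is currently on
}

impl Wheel {
    fn new() -> Wheel {
        Wheel { slots: vec![Vec::new(); SLOTS], current: 0 }
    }

    // O(1) registration: round the timeout up to whole ticks and hash the
    // expiration tick into a slot.
    fn register(&mut self, id: u64, timeout_ms: u64) -> usize {
        let ticks = (timeout_ms + TICK_MS - 1) / TICK_MS;
        let slot = (self.current + ticks as usize) & (SLOTS - 1);
        self.slots[slot].push(id);
        slot
    }
}
```

Clearing a timeout is likewise O(1) if each registration keeps its slot index, and triggering amounts to draining the slot the cursor advances onto each tick.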
The core reactor is a single-threaded construct. It sleeps for a bit while waiting for events, processes them, and loops. It is often necessary to communicate with the reactor by sending it messages. If the reactor is currently waiting for events, it should wake up and process the received messages.
This system has two components: a queue and a wakeup strategy. Eventually, the queue could be pluggable, but initially it will be a lock-free MPMC queue (provided by Rust). The queue does not provide a blocking mechanism. Writes to it will fail if the queue is full, and reads don't need to block since the reactor already has a blocking mechanism via its IO polling.
When the reactor is asleep, it should be woken up. There are a number of ways to do this, and the most efficient option available on the current platform should be used. The simplest and most portable option is to use a pipe: the reactor listens on the read end, and when a message is enqueued while the reactor needs to be woken up, a byte is written to the write end. On Linux, a more efficient mechanism is available with eventfd.
An extra optimization can be performed by avoiding the wakeup mechanism if the reactor is not currently sleeping. This can be achieved using an AtomicInt that represents the reactor state. Before the reactor sleeps, it checks the atomic to see if there are any available messages; if there are, it does not block and processes the messages immediately.
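The optimization above can be sketched as follows, using today's AtomicUsize in place of the AtomicInt mentioned in the issue; the state names and protocol are illustrative, not mio's actual implementation:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const AWAKE: usize = 0;
const SLEEPING: usize = 1;

struct ReactorState(AtomicUsize);

impl ReactorState {
    fn new() -> ReactorState { ReactorState(AtomicUsize::new(AWAKE)) }

    // Producer side: after enqueueing a message, flip the state back to
    // AWAKE and only write the wakeup byte if the reactor was sleeping.
    fn sender_needs_wakeup(&self) -> bool {
        self.0.swap(AWAKE, Ordering::AcqRel) == SLEEPING
    }

    // Reactor side: announce the sleep, then re-check the queue; if a
    // message raced in, abort the sleep and process it immediately.
    fn try_sleep(&self, queue_is_empty: impl Fn() -> bool) -> bool {
        self.0.store(SLEEPING, Ordering::Release);
        if queue_is_empty() {
            true // safe to block in the poller
        } else {
            self.0.store(AWAKE, Ordering::Release);
            false // messages arrived; skip blocking
        }
    }
}
```

The key property is that a sender only pays for the pipe/eventfd write when the reactor has actually declared itself asleep.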
I've taken the echo server example from the test directory and made a stripped-down version which simply accepts and holds onto TCP connections from clients.
I wrote a python client to make 10,000 connections. The client reports that all of the connections were made successfully. However, the server only reports 9,872 clients. I tried adding stdio::flush() after the print statement to ensure that this isn't an output buffering issue. I wrote a separate client in NodeJS to do the same thing to rule out something specific to the python socket library and saw the same results.
There's a very solid chance that I'm simply misusing the library! However, I thought it might be worthwhile to open a ticket.
plan? timeline?
Thank you.
For certain input, next_power_of_two returns unspecified values. For certain input, the behavior of heap::allocate is undefined.
io::read uses a while loop which assumes that the data is stream-oriented and doesn't need to be separated on segment boundaries.
It reads data into a buffer as long as there is data and buffer space available. This means there is no delineation between messages, which can be catastrophic. read() and recv() will only ever return, at most, a single segment, and a buffer should always be large enough to read one segment.
This can be circumvented by building a Buf that puts each read into a separate buffer, or one that will return false from has_remaining if it already has a message, but then you're getting into the problems with the edge-triggered model (#45).
My recommendation would be to make read only ever consume one message. It could return something which indicates that there might be additional data remaining.
I wrote an example here: https://github.com/12sidedtech/mio/blob/master/src/io.rs#L82
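To make "one message per read" concrete, here is a sketch that assumes a hypothetical 2-byte big-endian length prefix as the framing format (the issue itself does not fix any particular framing):

```rust
// Extract at most one complete message from the buffered bytes.
fn take_one_message(buf: &[u8]) -> Option<&[u8]> {
    if buf.len() < 2 {
        return None; // length prefix not complete yet
    }
    let len = u16::from_be_bytes([buf[0], buf[1]]) as usize;
    if buf.len() < 2 + len {
        return None; // message body not complete yet
    }
    Some(&buf[2..2 + len]) // exactly one message, no matter how much follows
}
```

Under this scheme the read loop hands each framed message to the handler individually instead of blurring segment boundaries.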
This is important so that we can pass pending io registrations through the system in a higher level event loop.
The method accept of TcpAcceptor is not accessible.
I get the following error:
$ rustc --version
rustc 0.13.0-nightly (1add4dedc 2014-10-10 18:47:03 +0000)
$ cargo build
Fresh nix v0.0.1 (https://github.com/carllerche/nix-rust#5510805c)
Fresh mio v0.0.1 (https://github.com/carllerche/mio.git#b9b308fc)
Compiling test1 v0.0.1 (file:///home/ayosec/Rust/tests-mio)
/home/ayosec/Rust/tests-mio/src/main.rs:44:32: 44:40 error: type `mio::net::tcp::TcpAcceptor` does not implement any method in scope named `accept`
/home/ayosec/Rust/tests-mio/src/main.rs:44 let sock = self.acceptor.accept().unwrap().unwrap();
With this code:
struct TestHandler {
acceptor: TcpAcceptor,
}
// [...]
// Bind and register the socket
let acceptor = socket.bind(&addr).unwrap().listen(256u).unwrap();
event_loop.register(&acceptor, Token(0)).unwrap();
event_loop.run(TestHandler::new(acceptor)).ok().expect("Failed to run EventLoop");
// [...]
// Handler implementation. readable have to accept the connection
impl Handler<uint, ()> for TestHandler {
fn readable(&mut self, event_loop: &mut TestEventLoop, token: Token, _: ReadHint) {
// ...
let sock = self.acceptor.accept().unwrap().unwrap();
Is there anything wrong in the code?
I see that accept is defined in the mio::io module, but that module is private.
Work in progress #51
There isn't much point to using a generic for the token that represents a socket because:
Because of this, it makes more sense for Token to be a newtype around uint.
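The suggested shape, as a sketch (with usize standing in for the uint of the era; the accessor name is illustrative):

```rust
// A plain newtype: no generic parameter, just an opaque index the event
// loop hands back to the handler.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
struct Token(usize);

impl Token {
    fn as_usize(self) -> usize { self.0 }
}
```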
We need some examples of how to write a TCP server :)
Thx.
Before attempting to write the generic token patch, I looked through the git history and realised that you actually dropped this capability in 49e99b3. While I'm not feeling confident at all about this decision and hope you'll revisit it one day, currently I suggest making it consistent and dropping the generic T argument from Timer and EventLoop. It's of little use without support for tokens in the events, and it's extremely confusing. It required me to look through the whole source code (and git history!) to finally understand why EventLoop accepts a type argument like Token, but then uses the unrelated type Token that is just uint (and IMO should be dropped in favor of just using uint).
On a related note, I saw a discussion in Rust upstream about using uint and int (can't find the link now), where using int and uint was considered bad practice in most cases. I think it's better to use specifically sized types.
TL;DR I suggest dropping T from Timer and EventLoop and making all tokens just uXX where XX is a size, possibly platform-dependent.
Following rust-lang/rust#19274, mpmc_bounded_queue is no longer in the libstd public API of Rust. Thus mio no longer compiles on the latest Rust, and I don't see any immediate fix.
Hey there,
So, I've been meaning to write this for a while. I tried writing an application using mio about 2 weeks ago (you can see the results here). Essentially, it's a demultiplexing proxy: it accepts an incoming connection, reads up to 512 bytes, and tries to determine the protocol. From there, it should connect to a backend server and proxy data back and forth.
Some general thoughts:
mio's Buf was a bit awkward (see here).
Overall, the project isn't finished, but writing it was very fun. Thanks for all the work, folks!
Signals, how do they work?
The goal of this proposal is to make it possible to write programs using mio that can successfully handle Unix signals.
Unfortunately, signal handling is somewhat tricky, made even more complex by the way Unix delivers signals to multi-threaded programs.
The basics:
- Most signals can be caught and handled (SIGKILL and SIGSTOP cannot be caught).
- Some signals are delivered synchronously to the thread that triggered them (SIGSEGV, SIGFPE); the rest may be delivered to any thread that has not masked them.
- The sigwaitinfo function allows a thread to synchronously wait for a set of signals.
- signalfd creates a file descriptor that can be used to receive signals (and can be waited on with epoll_wait).
- A helper thread can use sigwaitinfo to emulate signalfd.
At a high level, the goal for mio is to allow a consumer of a reactor to register interest in signals, to be delivered to a mio handler.
This means that programs that want to use this facility will ignore signals on all threads, and we will use sigwaitinfo or signalfd to allow the reactor to register interest in signals. This also means that only one reactor can be interested in signals; otherwise, an arbitrary interested reactor would receive the signals.
If a program uses this facility without successfully ignoring signals, signals may be delivered to random user threads instead of the reactor.
To make it easy for users to successfully ignore signals across their entire program, a new function, mio::initialize, is added. Programs should run this function at the beginning of their main function, to ensure that any subsequent threads ignore signals, ensuring that the mio reactor gets notified of signals.
mio::Handler gets a new callback:
pub trait Handler<T: Token, M: Send> {
// ...
fn signal(&mut self, reactor: &mut Reactor<T, M>, info: Siginfo);
}
The information provided to the handler will be the subset of siginfo_t that is reliably supported on Linux and Darwin. It will be organized by signal type; in particular, fields related to SIGCHLD will be grouped together.
In order to ensure that signal notifications are sent to the reactor loop, we need to:
It is in theory possible to use this facility without control over signal masking, but that will mean that signals can be missed if they get dispatched to another thread. For programs that want to handle signals, this is very unlikely to be desirable.
EINTR
When a thread receives a signal during a blocking system call, the system call may return with an EINTR error.
Typically, this means that system calls must guard themselves against this possibility and attempt to retry the call in the case of EINTR. There is even a (not fully reliable) sigaction flag, SA_RESTART, that can be used to help with this problem.
For mio internals, this problem should not exist:
- Signals are delivered through the signal handler method, not sigaction.
- Only the thread that waits for signals (via signalfd or sigwaitinfo) needs to unmask them.
For programs that want to sanely handle signals with mio, this problem also should not exist:
It is true that this strategy requires some control over the process, or at least the thread running the reactor. However, if a user of mio wants to be notified of signals, they will have to ensure that mio, and only mio, can receive signals. Otherwise, signals may be silently dropped.
Also, if people have a use-case for using mio in a thread that cannot have signals masked, we can revisit handling internal-to-mio EINTRs by retrying. At the moment, it seems very strange to need to allow the same thread that runs the mio reactor to register a sigaction and receive the signal through a facility other than mio, but I may be missing something obvious!
Linux (signalfd)
On Linux, we simply use signalfd to accept all signals, after having masked all signals on the thread. We then register the signalfd with epoll. When epoll_wait returns, if the FD was readable, we consume signals via read, get the associated signalfd_siginfo structures, and call the handler.
Other platforms (without signalfd)
In the interest of simplicity, on platforms without signalfd, we will emulate it using a separate thread and self-pipe:
- The helper thread calls sigwaitinfo in a loop to listen for signals.
- It writes the received siginfo_t structures into the pipe.
It seems possible to implement the functionality without a separate thread using kqueue. If that proves important, we can investigate more platform-specific options.
As a minimal example:
extern crate mio;
use mio::TryRead;
fn main() {
let listener = ::mio::tcp::listen(&"127.0.0.1:8000".parse().unwrap()).unwrap();
let mut socket;
loop {
match listener.accept() {
Ok(Some(s)) => { socket = s; break; }
Ok(None) => {},
Err(e) => panic!("{}", e)
}
}
let mut buffer = ::mio::buf::RingBuf::new(1024);
loop {
match socket.read(&mut buffer) {
Ok(Some(n)) => println!("Read {} bytes.", n),
Ok(None) => {},
Err(e) => panic!("{}", e)
}
}
}
Connecting to the server described above works (I'm doing it with nc from the command line), and writing text to it works (I get several "Read n bytes." lines), but once the client closes the connection, it keeps looping on "Read 0 bytes." forever.
Shouldn't socket.read(..) return something like Err(EndOfFile), or something like that?
If not, how do you check whether the connection has been closed by the client?
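For what it's worth, the usual convention for readable sockets (sketched below with std::io::Read; the helper name is hypothetical) is that a successful read of 0 bytes means the peer closed the connection, so the caller has to check for it rather than expect an Err:

```rust
use std::io::Read;

// Returns Ok(false) once the peer has hung up, so the read loop can stop.
fn still_open<R: Read>(sock: &mut R, buf: &mut [u8]) -> std::io::Result<bool> {
    match sock.read(buf)? {
        0 => Ok(false), // peer hung up: stop the read loop
        _ => Ok(true),  // got data: keep going
    }
}
```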
Following the latest Rust changes, EventLoop is no longer considered Send by the compiler. This effectively prevents initialising an EventLoop and then spawning a new thread to run it.
The fix might be as simple as adding
unsafe impl<T: 'static, M: Send> Send for EventLoop<T, M> {}
but it can probably be handled better on the various subtypes of the library.
The question is whether Poll is a useful abstraction on its own or whether it should be rolled into EventLoop.
I know that @rrichardson had an opinion about this, and @Divius submitted a PR to implement Iterator on Poll (#43).
Writing to a closed socket will raise SIGPIPE. This should be suppressed by installing a dummy handler when starting a reactor for the first time.
There are a couple fundamental flaws in using Edge Triggered without One Shot.
Data Loss due to non notification
The first and most major is that if the read() in response to a read event doesn't consume all of the data from the socket's buffer, the process will never be notified about that data again, which means the read event won't fire again and the remaining data will never be read.
Starvation
The model of needing to consume all data out of a socket can lead to starvation of other sockets if an extreme amount of data is being sent into one socket. A better model would be to consume a small amount of data from each socket, in turn, and go back to epoll_wait.
There are two solutions to this:
Level Triggered
This is easier for people to get right, as you'll always be notified if data is present (in the case of read) or the socket is ready (in the case of write). The downside is it can be spammy: you don't need to be constantly notified of writable events if you don't care about writing. A web server is a great example; it only needs to write after it receives a request. The solution, obviously, is to allow an application to un-register its interest in writing when it doesn't care about writing.
Edge Triggered with One Shot
This is the recommended approach. It also requires frequent re-registration, but epoll_ctl is a very cheap syscall (cheaper than read() returning EAGAIN).
This would require every reader and writer to re-register their interest in reading and writing after every epoll_wait call that returns events for that FD.
This solves both of the above problems. epoll_wait will return FDs that have data waiting, regardless of whether the data has been resting a while or has recently arrived.
IMO, the best approach is to let the user specify if they want edge triggered vs level triggered. But regardless of that model, there should definitely be a reregister function.
Support UDP sockets.
Hoping for a patch from @pvachon :)
In #3, @rrichardson suggested that it would be useful for handlers to know when the remote side of a socket hangs up.
The epoll backend can support this via the EPOLLHUP and EPOLLRDHUP flags, but other backends do not expose this directly (and instead require the user to attempt to read from the fd to discover that the other side hung up). If this information is directly available, it can be useful as an optimization, even in portable code.
Since we want mio to make it easy to write portable code, it would be better not to expose this as an additional callback on Handler. If we did that, people who wanted to write portable code would have to write duplicate code in backend-specific handlers.
Instead, I suggest that we add a new parameter to the readable callback that provides backend-specific information when it's available. Since its goal is to make it easy to optimize portable code in general, and it cannot be relied on everywhere, I suggest an enum called ReadableHint.
enum ReadableHint {
Data, // data is available
Hup, // the other side of the socket hung up
Error, // there was an error
Unknown // the backend does not provide the hint
}
Document MIO for use by others
Currently it seems impossible to e.g. call send_to from the TrySendTo trait, because the inherent member function of UdpSocket shadows the definition. I could only call it via UFCS: TrySendTo::send_to(self, ...).
impl NonBlock<TcpListener> {
pub fn accept(&self) -> io::Result<Option<NonBlock<TcpStream>>> {
net::accept(as_io(self), true)
.map(|fd| Some(FromFd::from_fd(fd)))
.or_else(io::to_non_block)
}
}
Why is this a Result<Option<NonBlock<TcpStream>>> instead of a Result<NonBlock<TcpStream>>?
Could we have some documentation on what these return values mean? When does it return Ok(None)? How is this different from Err(e)?
Uses in examples:
I think the convention was to only give names starting with as_ to functions that do absolutely nothing in terms of the generated code (after inlining), so the function should probably be named into_nonblock.
Turns out they're not! Repro:
extern crate mio;
use mio::net::udp::UdpSocket;
use mio::*;
use std::time::duration::Duration;
const LISTENER: Token = Token(0);
const TIMEOUT: Token = Token(1);
fn main() {
// Create an event loop
let mut event_loop = EventLoop::<Token, u64>::new().unwrap();
// Register Interest
let listener = UdpSocket::bind("127.0.0.1:12345").unwrap();
event_loop.register(&listener, LISTENER).unwrap(); // Token lets us distinguish.
// Increments
let incrementer = event_loop.channel();
for i in 0..5 {
incrementer.send(i).unwrap();
}
// Decrements
event_loop.timeout(TIMEOUT, Duration::milliseconds(250)).unwrap();
// Start it
let sender = UdpSocket::bind("127.0.0.1:12346").unwrap();
event_loop.run(&mut BearHandler {
count: 0,
listener: listener,
sender: sender
}).unwrap();
}
struct BearHandler {
sender: UdpSocket,
listener: UdpSocket,
count: u64,
}
impl Handler<Token, u64> for BearHandler {
fn readable(&mut self, _reactor: &mut EventLoop<Token, u64>, _token: Token, _hint: ReadHint) {
let mut buffer = buf::RingBuf::new(1024);
// Drain socket, otherwise infinite loop!
net::TryRecv::recv_from(&self.listener, &mut buffer.writer()).unwrap();
self.count -= 1;
println!("Decremented, Total: {}", self.count);
}
fn timeout(&mut self, reactor: &mut EventLoop<Token, u64>, _token: Token) {
self.sender.send_to(&[0], "127.0.0.1:12345").unwrap();
// Reset
reactor.timeout(TIMEOUT, Duration::milliseconds(250)).unwrap();
println!("Timeout");
}
fn notify(&mut self, _reactor: &mut EventLoop<Token, u64>, msg: u64) {
self.count += msg;
println!("Increment by: {}, Total: {}", msg, self.count);
}
}
The bindings and utility functions are valuable to those writing other extensions, like support for reading/writing from files.
Here is a description - it is fixed in 10.10 today, but still exists on 10.9 and earlier.
It seems that on kqueue, having an additional OOB hint may be critical.
Is it possible for mio to be the low-level layer of hyper and rust-websocket?
The current Rust low-level networking makes it difficult for web apps to support c10k (it uses too many threads). I wonder, if mio became the low-level connection layer, whether that problem would be solved.
Thank you.
Improve the error messages for low-level system errors (aka "making sense of errno").
I believe many of the syscalls we make can return EINTR, which means we need to retry them in a lovely userspace loop. Most systems I've worked with have a function fn retry<T>(|| -> MioResult<T>) -> MioResult<T> that loops until the errno isn't EINTR. We should do this too.
To be truly close to the metal, stdlib must be dropped. Drop stdlib. :)
Even though bytes is updated, since the version is hard-coded we are not receiving the update. This also requires doing a new release of bytes.