I'm not a number, I'm a free file descriptor!!1 (our protagonist promptly disappears down a wormhole)
Welcome to the research / everything repo for my talk at !!con 2016!
You can find my slides and presenter notes in three formats:
The talk is online too: YouTube.
First, two terms: a "file descriptor" is a number referring to an entry in the process's file table. Those entries point to global "file descriptions" in the kernel. If you're confused now, these are the terms that POSIX uses. I'm very sorry.
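To make the descriptor/description split concrete, here is a minimal sketch (not code from this repo) that uses only the Rust standard library. It assumes a UNIX-like system; the function name `shared_offset_demo` and the scratch file name are made up for illustration. `File::try_clone` hands back a second descriptor for the same file description, so the file offset moves for both.

```rust
use std::env;
use std::fs::{self, OpenOptions};
use std::io::{self, Seek, SeekFrom, Write};
use std::os::unix::io::AsRawFd;
use std::process;

/// Returns (original fd, duplicate fd, offset as seen through the duplicate).
/// Hypothetical helper, for illustration only.
fn shared_offset_demo() -> io::Result<(i32, i32, u64)> {
    // Hypothetical scratch path; unlinked right away so nothing is left behind.
    let path = env::temp_dir().join(format!("fd-demo-{}", process::id()));
    let mut original = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(&path)?;
    fs::remove_file(&path)?;

    // A second descriptor (a new number) for the same file description.
    let mut duplicate = original.try_clone()?;

    // Writing through the first descriptor advances the shared offset...
    original.write_all(b"hello")?;
    // ...which the second descriptor observes without ever writing itself.
    let offset = duplicate.seek(SeekFrom::Current(0))?;

    Ok((original.as_raw_fd(), duplicate.as_raw_fd(), offset))
}
```

Calling `shared_offset_demo()` from a `main` should yield two different descriptor numbers and an offset of 5: two entries in the process's table, one description behind them.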
Now that the technical details are out of the way, here's what is happening: My colleague Nelson once told me over lunch how UNIX domain sockets are super weird in Linux: You can very easily write a program that runs the kernel out of file descriptions, and even out of memory!
This intrigued me, so I decided to write up some test programs to do exactly this.
The way this works is that the tests here create many throwaway file
descriptors (on Linux, this uses `memfd_create`; everywhere else,
it just unlinks `mkstemp`ed files), and then send them into a UNIX
domain socket pair, closing the files afterwards.
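The core move above (send a descriptor into a socket pair, then close it) can be sketched without any crates. This is not the code from this repo, which uses nix; it is a hedged illustration that declares `sendmsg`/`recvmsg` by hand. Big assumption: the struct layouts and constants below are written out for 64-bit Linux only, and all type and function names (`MsgHdr`, `FdCmsg`, `send_fd`, `recv_fd`, `round_trip_demo`) are made up for this example. It also uses the unlinked-temp-file fallback rather than `memfd_create`, and needs a recent Rust toolchain for the `unsafe extern` syntax.

```rust
use std::env;
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::mem;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::{AsRawFd, FromRawFd, RawFd};
use std::os::unix::net::UnixDatagram;
use std::process;
use std::ptr;

// Hand-written mirrors of the C structs. ASSUMPTION: 64-bit Linux layout only.
#[allow(dead_code)]
#[repr(C)]
struct IoVec { base: *mut c_void, len: usize }

#[allow(dead_code)]
#[repr(C)]
struct MsgHdr {
    name: *mut c_void, namelen: u32,
    iov: *mut IoVec, iovlen: usize,
    control: *mut c_void, controllen: usize,
    flags: c_int,
}

// A cmsghdr followed by exactly one descriptor, padded to 24 bytes.
#[allow(dead_code)]
#[repr(C)]
struct FdCmsg { len: usize, level: c_int, kind: c_int, fd: c_int, pad: c_int }

const SOL_SOCKET: c_int = 1; // Linux value
const SCM_RIGHTS: c_int = 1; // Linux value
const CMSG_LEN_ONE_FD: usize = 20; // 16-byte header + one 4-byte descriptor

unsafe extern "C" {
    fn sendmsg(fd: c_int, msg: *const MsgHdr, flags: c_int) -> isize;
    fn recvmsg(fd: c_int, msg: *mut MsgHdr, flags: c_int) -> isize;
}

// Builds a message header with one data buffer and one control buffer.
fn header(iov: &mut IoVec, cmsg: &mut FdCmsg) -> MsgHdr {
    MsgHdr {
        name: ptr::null_mut(), namelen: 0,
        iov: iov as *mut IoVec, iovlen: 1,
        control: cmsg as *mut FdCmsg as *mut c_void,
        controllen: mem::size_of::<FdCmsg>(),
        flags: 0,
    }
}

/// Sends one descriptor (plus a dummy data byte) as an SCM_RIGHTS control message.
fn send_fd(sock: &UnixDatagram, fd: RawFd) -> io::Result<()> {
    let mut byte = [0u8; 1];
    let mut iov = IoVec { base: byte.as_mut_ptr() as *mut c_void, len: 1 };
    let mut cmsg = FdCmsg { len: CMSG_LEN_ONE_FD, level: SOL_SOCKET, kind: SCM_RIGHTS, fd, pad: 0 };
    let msg = header(&mut iov, &mut cmsg);
    if unsafe { sendmsg(sock.as_raw_fd(), &msg, 0) } < 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Receives one descriptor and wraps it in a File that owns it.
fn recv_fd(sock: &UnixDatagram) -> io::Result<File> {
    let mut byte = [0u8; 1];
    let mut iov = IoVec { base: byte.as_mut_ptr() as *mut c_void, len: 1 };
    let mut cmsg = FdCmsg { len: 0, level: 0, kind: 0, fd: -1, pad: 0 };
    let mut msg = header(&mut iov, &mut cmsg);
    if unsafe { recvmsg(sock.as_raw_fd(), &mut msg, 0) } < 0 {
        return Err(io::Error::last_os_error());
    }
    if cmsg.level != SOL_SOCKET || cmsg.kind != SCM_RIGHTS || cmsg.fd < 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "no descriptor in message"));
    }
    Ok(unsafe { File::from_raw_fd(cmsg.fd) })
}

/// Parks a throwaway file in a socket pair, closes it, and reads it back out.
fn round_trip_demo() -> io::Result<String> {
    let (tx, rx) = UnixDatagram::pair()?;
    // Hypothetical scratch path, unlinked immediately.
    let path = env::temp_dir().join(format!("fd-in-flight-{}", process::id()));
    let mut file = OpenOptions::new().read(true).write(true).create(true).truncate(true).open(&path)?;
    fs::remove_file(&path)?;
    file.write_all(b"still alive")?;

    send_fd(&tx, file.as_raw_fd())?;
    drop(file); // our descriptor is closed; the description lives on inside the socket

    let mut received = recv_fd(&rx)?;
    received.seek(SeekFrom::Start(0))?;
    let mut text = String::new();
    received.read_to_string(&mut text)?;
    Ok(text)
}
```

The point of `round_trip_demo` is the `drop(file)` in the middle: between the send and the receive, the process holds no descriptor for that file, yet the file description is still alive in the socket's queue. Repeating the send-and-close step in a loop is what lets in-flight descriptions pile up past the per-process limit.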
On Linux, this lets you get 500 file descriptors past the per-process file limit, or more (depending on how many FDs you send in a single control message; I opted to send a single one because that made the type signatures easier).
Of course, that's not enough! We want pathological behavior! No, we DEMAND it! OK, fine. Here's what you do then: You take the ends of this UNIX domain socket pair (the "inner ring"), and then send them down another UNIX domain socket pair (the "outer ring"). Then you close that old inner ring, make a new one and start from the top, until the kernel runs out of file description entries in its file table.
Wheeeee! This works on Linux, because it has a socket GC (do check out that file, the change log at the top is amazing). When you close a pair of UNIX domain sockets, Linux recursively traverses UNIX domain sockets from open roots, marking all reachable FDs as open and then closing the ones it can't reach.
On Mac OS X (and other BSDs, apparently), this trick doesn't work, because as soon as you close a pair of sockets, the FDs contained in the socket pair are closed. This is eminently reasonable, but ever so slightly less fun than the machinery Linux has.
You can run the Linux kernel out of memory! My test programs make one
new temp FD for each message being sent, until the global file table
runs over. But you can also create a single throwaway file
descriptor and then use `dup` to create
copies of that single global file description entry! You'll be able to
store many more messages in rings (and will probably have to recurse
ring storage a bit), but at some point the kernel itself might run out
of memory. Wheeeeeeeeeeeeeee!
Another super fun thing to do is use UNIX domain sockets for those
temporary FDs: I use `memfd`s because they don't trigger recursive
garbage collection slowdown as much as if you used UNIX sockets. You
can slow everything down by a lot with just a few levels of UNIX
domain sockets stored in each other. I didn't do the math but I
believe this is accidentally (or I guess we're doing it intentionally)
quadratic.
The source code for my research is in the file-descriptor-fun directory.
I wrote these test programs in Rust (1.8.0), using nix for the UNIX interactions. The programs are all done as integration-style tests in the tests directory.
You can browse the source (if anything isn't clear, I'm very sorry! I'm pretty new to Rust, and I coded under a bit of time pressure. I'll try to improve it if you file an issue!)
There are some API docs for the ring buffer in the github pages of this repo.
See the README.md in file-descriptor-fun for details on how to run / build this.
- Nelson for the initial inspiration, continual guidance and handholding through the various problems I encountered.
- Julia, for the feedback that made this presentation way more fun than it was initially.
- Kamal for this post on nix that motivated me to write the research programs for this talk in Rust - I would probably never have finished writing this in C with all my hair still attached to my head.
- Last but definitely not least, Veronika for always encouraging me <3