moby / libnetwork
networking for containers
License: Apache License 2.0
When we reexec in order to move the interfaces to the network namespace, we don't provide the serialized interfaces to the child process. Create a pipe and pass one end to the child process as fd 3 (this will require a small refactoring of reexec.go to support passing ExtraFiles).
This should explain how the bridge driver is used, what configuration options there are, etc. Think of it as a lite version of what would appear on docs.docker.com.
Take inspiration from the libcontainer code for veth, but implement it using the vishvananda/netlink library.
Libnetwork provides all the necessary abstraction to provide networking for containers.
This includes current Single-Host Docker Bridge mode.
By bringing the Simplebridge driver to feature parity with the Docker Bridge, the Docker project can start using libnetwork as a library for various networking use cases (via plugins).
They provide NAT-like functionality that is network-specific, so it makes sense to have them in libnetwork.
There are options that network drivers will require which are driver-specific globals. Currently the only way to pass options to a driver is on a per-network basis.
The libnetwork test code has revealed that the simple bridge driver does not clean up the veth interfaces in the endpoint delete operation, leaving stale interfaces on the bridge.
Also, the bridge driver currently expects only one endpoint. It should instead store one endpoint per sandbox.
This PR fixes the above two issues.
As the interface & datamodel currently stand, Endpoints can be created via the libnetwork public interface Network, and contain a reference to a SandboxInfo. These SandboxInfos hold references to a list of Interfaces. When creating a new Endpoint, the current bridge driver always creates a fresh SandboxInfo and always populates it with a single Interface.
The datamodel implies an endpoint can have multiple Interfaces, but I don't understand why - is this intentional? Is the intention that these SandboxInfos should be shared in some way?
I'm also assuming we want to support a container being connected to multiple networks, across potentially different drivers. Each driver will potentially want to offer different default gateways and nameservers, but it is only meaningful to have a single default gateway per container, and it's typically only meaningful to have a single nameserver provider. (You can have multiple nameservers, but only for redundancy - the OS doesn't query them unless there is a failure, not a negative response, AFAIU.)
Given this, when starting a container the behaviour around which default gateway to choose, and which nameserver to insert, is currently undefined.
Is my understanding correct? Could you please clarify how this is supposed to be used?
Thanks
Tom
Bring it to at least 80%
(This does not include driver/bridge coverage which is greater than 80% already)
According to the current API, I see that you pass the "name" of a network while creating it in libnetwork. libnetwork in turn provides only the uuid to the driver.
e.g: docker network create -d ovn --name=foo
For multi-host networking, from a different host, if someone does a "docker network ls", will libnetwork somehow share state and list both the name and uuid?
If it does indeed share state, how is it going to store that persistent information?
libnetwork_test.go: createTestNetwork
network, err := controller.NewNetwork(networkType, networkName,
libnetwork.NetworkOptionGeneric(netOption))
the caller passes the "netOption"; one example is in "TestBridge":
netOption := options.Generic{
"BridgeName": bridgeName,
"AddressIPv4": subnet,
"FixedCIDR": cidr,
"FixedCIDRv6": cidrv6,
"EnableIPv6": true,
"EnableIPTables": true,
"EnableIPMasquerade": true,
"EnableICC": true,
"AllowNonDefaultBridge": true}
the function "parseNetworkOptions" in bridge.go will parse the option this way:
genericData, ok := option[netlabel.GenericData]
These do not match, so the requested bridge will not be created.
the solution is to change libnetwork_test.go: createTestNetwork:
netGenericOption := make(map[string]interface{})
netGenericOption[netlabel.GenericData] = netOption
network, err := controller.NewNetwork(networkType, networkName,
	libnetwork.NetworkOptionGeneric(netGenericOption))
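The mismatch above can be reproduced in a few lines. The sketch below is a simplified stand-in for the bridge driver's parser, assuming only that it looks options up under the netlabel.GenericData key (the constant value and the helper names here are illustrative, not the real package):

```go
package main

import (
	"errors"
	"fmt"
)

// GenericData stands in for the netlabel.GenericData key the bridge
// driver looks up; the actual constant value differs.
const GenericData = "io.docker.network.generic"

// Generic mirrors options.Generic: an untyped bag of driver options.
type Generic map[string]interface{}

var errMissingGeneric = errors.New("no generic data in network options")

// parseNetworkOptions only succeeds when the driver options are nested
// under the GenericData key, which is exactly why passing the raw
// option map from the test leaves the bridge unconfigured.
func parseNetworkOptions(option Generic) (Generic, error) {
	genericData, ok := option[GenericData]
	if !ok {
		return nil, errMissingGeneric
	}
	cfg, ok := genericData.(Generic)
	if !ok {
		return nil, errors.New("generic data has unexpected type")
	}
	return cfg, nil
}

func main() {
	netOption := Generic{"BridgeName": "docker0"}

	// Raw map, as the old createTestNetwork did: the lookup fails.
	if _, err := parseNetworkOptions(netOption); err != nil {
		fmt.Println("raw options rejected:", err)
	}

	// Wrapped under GenericData, as the fix does: the lookup succeeds.
	wrapped := Generic{GenericData: netOption}
	cfg, err := parseNetworkOptions(wrapped)
	if err != nil {
		panic(err)
	}
	fmt.Println("parsed BridgeName:", cfg["BridgeName"])
}
```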
This is a development and debugging tool
--dns=[] : Set custom dns servers for the container
--net="bridge" : Set the Network mode for the container
'bridge': creates a new network stack for the container on the docker bridge
'none': no networking for this container
'container:<name|id>': reuses another container network stack
'host': use the host network stack inside the container
--add-host="" : Add a line to /etc/hosts (host:IP)
--mac-address="" : Sets the container's Ethernet device's MAC address
There are many places across libnetwork where returned errors are just strings with no type associated with them. This makes it practically impossible for the caller to compare the error and take any remedy.
This issue is opened to address the problem across the libnetwork project.
I am a little confused about the life cycle.
What does 'docker network join CONTAINERUUID NETWORKNAME -l foo=bar' call in terms of libnetwork apis?
Does this call the following 2 one after the other?:
network.CreateEndpoint(NETWORKNAME, labels(foo=bar))
-> driver.CreateEndpoint(nid, eid, labels(foo=bar))
endpoint.join(CONTAINERUUID, labels(foo=bar))
-> driver.Join(nid, eid, sboxkey, labels(foo=bar))
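If the answer is yes, the sequence can be sketched with a stub driver that records the order of calls. The method names follow the question's sketch (CreateEndpoint then Join); the signatures and the endpoint-id/sandbox-key derivations are simplified stand-ins, not the real driver API:

```go
package main

import "fmt"

// driverCalls records the order of driver invocations.
var driverCalls []string

func driverCreateEndpoint(nid, eid string, labels map[string]string) {
	driverCalls = append(driverCalls, "CreateEndpoint")
}

func driverJoin(nid, eid, sboxKey string, labels map[string]string) {
	driverCalls = append(driverCalls, "Join")
}

// networkJoin models what `docker network join CONTAINERUUID NETWORKNAME -l foo=bar`
// would translate to if it is indeed the two calls back to back.
func networkJoin(containerID, networkID string, labels map[string]string) {
	eid := "ep-" + containerID // hypothetical endpoint id
	driverCreateEndpoint(networkID, eid, labels)
	driverJoin(networkID, eid, "/var/run/netns/"+containerID, labels)
}

func main() {
	networkJoin("c1", "n1", map[string]string{"foo": "bar"})
	fmt.Println(driverCalls) // the driver sees CreateEndpoint, then Join
}
```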
As mentioned in issue #71 the behavior of multiple endpoints in a sandbox is undefined. This issue tracks discussion on how to make it more defined. One proposal from @mrjana is below.
Yes, CNM allows multiple endpoints from the same container to connect to different networks (the CNM Network), and we are thinking of introducing certain driver-generic config to the Endpoint. One of them could possibly indicate whether this is a primary or a secondary endpoint in the container. This hasn't been completely discussed yet, so please take it FWIW.
It would be nice to be able to assign static IP addresses to Docker containers (for web servers, for instance) as a built-in feature. I have a temporary solution here with pipework, but it is sub-optimal as docker inspect --format '{{ .NetworkSettings.IPAddress }}' "$@" no longer works.
The interest for this feature can be seen at moby/moby#6743
Add support for host configurations where firewalld is the master iptables manager
The plugins subsystem will need a way to register a plugin (RPC stub) as a network driver.
Do you plan for the Overlay Driver to be released at the same time as Docker 1.7.0?
In the overlay driver, which will you use: the Linux bridge or OVS?
If libnetwork is to be considered useful independent of Docker, it should not depend on the Docker plugin system. However, this conflicts with the design goal of not having driver-related types in the libnetwork API.
To wit: one very simple design would be to have a libnetwork.RegisterRemoteDriver procedure, which adds a driverapi.Driver implementation to the list of available drivers; but this trivially compromises the "no driver types in API" goal.
Since drivers cannot be supplied to libnetwork, it must reach out and get them from somewhere. The plugin implementation (at this time) has a procedure for registering a handler for a plugin type -- e.g., "NetworkDriver" -- and this can be done during libnetwork's initialisation. However, to do so libnetwork needs to import the plugin package and use its types and handler registration mechanism.
This (near-)cycle can be broken by introducing interfaces which the plugin system implements or can be adapted to implement. The difficulty then is how does Docker (or some other application) supply its implementation to libnetwork?
One alternative might be to add a means for supplying such an implementation. For example, smuggling it through a special case of ConfigureNetworkDriver; or, a southbound API procedure.
The difficulty here is that it is needed by the remote driver initialisation, and that is locked up behind libnetwork's initialisation. So there would need to be a complicated dance to make sure things are initialised in the correct order (supply plugin implementation to remote package; initialise libnetwork) and there would be no good way to enforce the ordering.
Another alternative is to recognise that remote drivers are a special case, take the remote package initialisation out of libnetwork initialisation, and let people hook them together separately. This would require an additional API procedure in NetworkController, but it needn't use driver types, only the plugin adapter interfaces.
I am open to other suggestions, including that of just depending on docker/pkg/plugins despite the downsides.
The CI runs a battery of tests, but there is no easy way to test beforehand if a given PR will be green from CI perspective.
We should add a Dockerfile and/or a Makefile providing a single command to run all tests.
The Port Allocator is supposed to be a reflection of the operating system construct and must be in sync with the OS view of the allocated ports.
The current port allocator reflects only the software db of that construct.
Also, the port allocator must provide allocation for individual network namespaces, not just for the global space.
Remove capitalization of identifiers (function names) in comments in interface.go
Depending on ipallocator brings in docker/docker/daemon/networkdriver as a whole, and docker/libcontainer/netlink with it.
Per https://docs.docker.com/reference/api/docker_remote_api_v1.18/#inspect-a-container
The current API only shows a single MAC and IP per container.
There's not much point in having the whole driver subsystem unless one can actually create a network. I don't see that as part of moby/moby#13060.
If it is a requirement to not add new commands to docker (e.g., docker network create), then perhaps it can be part of the syntax of the --net argument. For example,
docker run --net=weave:mynet
where weave refers to the driver, and mynet is a network created with that driver (if it doesn't already exist).
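Parsing such a driver-qualified value is straightforward. The sketch below splits on the first colon; note this is purely illustrative, and a real parser would have to reserve the existing --net values (container:<name|id> already uses a colon, plus bridge, none, host) before treating the prefix as a driver name:

```go
package main

import (
	"fmt"
	"strings"
)

// parseNetFlag splits a hypothetical driver-qualified --net value:
// "weave:mynet" selects the weave driver and the mynet network, while
// a bare value keeps today's meaning (an empty driver is returned).
func parseNetFlag(value string) (driver, network string) {
	if i := strings.IndexByte(value, ':'); i >= 0 {
		return value[:i], value[i+1:]
	}
	return "", value
}

func main() {
	d, n := parseNetFlag("weave:mynet")
	fmt.Printf("driver=%s network=%s\n", d, n)

	d, n = parseNetFlag("bridge")
	fmt.Printf("driver=%s network=%s\n", d, n)
}
```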
--ip=IP_ADDRESS
--mtu=BYTES
My understanding after reading the README.md and ROADMAP.md is that libnetwork package will be included statically in the docker daemon. Any future plugin will be between libnetwork and the drivers. i.e., if the plugin takes the form of a REST API, then this will make network.go (https://github.com/docker/libnetwork/blob/master/network.go) call REST APIs to the bridge driver to Create network, create endpoint etc.
Is the above understanding correct? Or is it that the Docker daemon will make a REST API call to libnetwork which will sit as a separate binary?
Docker 1.6 supports Labels for Container and Image objects.
There has been a lot of discussions and requests to support a similar labels concept for Networks and Endpoints. This enhancement is raised to address this requirement.
Currently it is expected that driver.CreateEndpoint() returns the sandbox.info structure, which needs to include the ip, gateway, veth_inside, and veth_outside information (where veth_inside is "The name that will be assigned to the interface once moved inside a network namespace").
As I see it, the driver cannot return that information because it does not know how many other endpoints will be added to the container in question. What is the thought process of the designers here?
It's my understanding that libnetwork keeps references to the network objects it creates; however, I can't see an interface for retrieving or iterating over these objects. Is it expected that the caller (i.e., the docker daemon) keeps a duplicate list of networks and uses it to look up network objects for deleting and listing them? Or would you be receptive to adding Get() and Walk() methods to the NetworkController interface?
A similar question arises for NetworkDrivers. I understand libnetwork is under development right now, and am happy to put together some code for this (and do a PR).
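The proposed additions could look like the sketch below. The Controller and Network types here are minimal stand-ins, not the real libnetwork interfaces; only the Get/Walk shape is the point:

```go
package main

import "fmt"

// Network is a minimal stand-in for libnetwork's network object.
type Network struct {
	ID   string
	Name string
}

// Controller sketches the proposed Get/Walk additions.
type Controller struct {
	networks map[string]*Network
}

// Get returns a network by id, or nil if unknown.
func (c *Controller) Get(id string) *Network {
	return c.networks[id]
}

// Walk visits every network; returning true from fn stops the walk.
func (c *Controller) Walk(fn func(*Network) bool) {
	for _, nw := range c.networks {
		if fn(nw) {
			return
		}
	}
}

func main() {
	c := &Controller{networks: map[string]*Network{
		"n1": {ID: "n1", Name: "frontend"},
		"n2": {ID: "n2", Name: "backend"},
	}}

	fmt.Println(c.Get("n1").Name)

	// Look a network up by name without the caller keeping its own list.
	var found *Network
	c.Walk(func(nw *Network) bool {
		if nw.Name == "backend" {
			found = nw
			return true
		}
		return false
	})
	fmt.Println(found.ID)
}
```

With Walk in place, listing and name-based lookup no longer require the daemon to mirror libnetwork's internal state.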
This will be needed in order for us to remain compatible with links in Docker 1.7.
The following is a starter for 10: foo to resolve to host foo on the local network, and bar.foo to resolve to host foo on the bar network. Extra information about exposed ports isn't necessary as we have open communication between the two, although it would be advisable to filter it. As it stands, --link and its iptables magic will work today. My concern is that networks don't offer much beyond IP connectivity between hosts and therefore it's going to be tricky to sell their value. I don't think it's a tenable situation to release the concept of multiple networks in 1.7 while requiring the following in order for things to work....
docker network create myapp
docker run -itd --net=myapp --name=foo busybox
docker run -itd --net=myapp --name=bar busybox
# this doesn't work
docker exec foo ping bar
# so I have to do this
docker stop bar && docker rm bar
docker run -itd --net=myapp --link foo:foo --name bar busybox
It's my understanding that in libnetwork, creating an endpoint calls the underlying driver, where IPs are likely to be allocated, advertised, routed, etc. In our PoC implementation, we found it useful to defer this action to container start; i.e., non-running containers would not have "plugged" interfaces. This allows network drivers to not advertise these containers, allows them to potentially reuse IP addresses, and generally makes all the allocation decisions lazy.
Does this sound like a reasonable ask? I am happy to prepare code for this. I'd envisage there being Plug and Unplug methods in the Endpoint interface, and having these return the SandboxInfo struct instead of Network.CreateEndpoint.
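A minimal sketch of the lazy-allocation idea, assuming the Plug/Unplug names proposed above (the Endpoint type and the naive address allocator below are purely illustrative, not real libnetwork code):

```go
package main

import "fmt"

// Endpoint sketches the proposal: creating the endpoint only records
// intent, and no address is allocated until Plug runs at container start.
type Endpoint struct {
	name string
	ip   string
}

// nextIP is a deliberately naive allocator, just enough for the demo.
var nextIP = 1

// Plug performs the deferred work: allocate an address; a real driver
// would also advertise and route it here.
func (e *Endpoint) Plug() {
	e.ip = fmt.Sprintf("172.17.0.%d", nextIP)
	nextIP++
}

// Unplug releases the address so it can be reused by other endpoints.
func (e *Endpoint) Unplug() {
	e.ip = ""
	nextIP-- // trivial reuse policy, for the sketch only
}

func main() {
	ep := &Endpoint{name: "web"}
	fmt.Println("before start, ip =", ep.ip) // still unallocated

	ep.Plug()
	fmt.Println("after start, ip =", ep.ip)

	ep.Unplug()
	fmt.Println("after stop, ip =", ep.ip)
}
```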
Currently libcontainer creates and sets up the interfaces inside the container in the execution workflow.
It would be nice to replace the code at https://github.com/docker/libcontainer/blob/master/network_linux.go with libnetwork.
In the design document it's stated that networks and endpoints are "global scope within a cluster".
What is a cluster?
What are the semantics associated with these being global? For example: if I try to create a network "foo" on one host, then delete "foo" on another host, should I expect this to succeed, or to report an error?
What specifically is required of drivers to support these globally-scoped objects?
Libnetwork will operate on the Options ONLY if the key matches any of the well-known Labels defined in the net-labels package.
I can't find any other references to this package.
There is a TODO in the Sandbox.Info code (https://github.com/docker/libnetwork/blob/043a23ae122fa79bd51dfe6ee7feed2e1645a91c/sandbox/sandbox.go#L48) but nothing in the design docs/wiki about how this will be implemented.
For my use case, I need to be able to create the following kinds of routes in the sandbox
ip route add [GATEWAY_IP] dev [DEVICE]
ip route add default via [GATEWAY_IP] dev [DEVICE]
Will that be possible?
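One way the sandbox could carry these requests is a static-route entry per endpoint. The type and field names below are hypothetical (the real TODO is unimplemented); the Command method just renders the equivalent ip route invocations from the question, whereas a real sandbox would program the route via netlink rather than shelling out:

```go
package main

import "fmt"

// StaticRoute sketches an entry the sandbox could carry so a driver
// can request both route forms; the field names are illustrative.
type StaticRoute struct {
	Destination string // empty means the default route
	NextHop     string
	Device      string
}

// Command renders the equivalent `ip route add` invocation.
func (r StaticRoute) Command() string {
	if r.Destination == "" {
		return fmt.Sprintf("ip route add default via %s dev %s", r.NextHop, r.Device)
	}
	return fmt.Sprintf("ip route add %s dev %s", r.Destination, r.Device)
}

func main() {
	link := StaticRoute{Destination: "172.17.0.1", Device: "eth0"}
	def := StaticRoute{NextHop: "172.17.0.1", Device: "eth0"}
	fmt.Println(link.Command())
	fmt.Println(def.Command())
}
```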
The user passes a "name" to the libnetwork APIs (e.g., a network name or an endpoint name). libnetwork does not pass the same name to the driver but passes a uuid instead. IMO, the "name" is information from the user that should ideally reach the driver, but currently libnetwork blocks it.
It would be nice if libnetwork passes this information to the driver too along with the uuid. This way, the driver can keep a relationship between the uuid that libnetwork generates and the name that the user provides.
Usecases:
The usecase is for hybrid networks where a docker container, a VM, and a physical machine can all reach each other. IMO, this provides a good transition path from current workloads to pure-docker workloads. In this case, a network would have to be pre-created outside docker, but we can make docker aware of this network with 'docker network create'.
Even assuming that the driver has to live with this restriction, a 'docker network join' would look unwieldy and confusing looking something like:
docker network join CONTAINER NETWORK --label ENDPOINT=OS-UUID --label NETWORK=OS-UUID
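If libnetwork did pass the name through, the driver-side bookkeeping is simple. The sketch below (the nameIndex type and CreateNetwork signature are hypothetical, not real driver API) keeps the uuid-to-name relationship in both directions:

```go
package main

import "fmt"

// nameIndex sketches how a driver could keep the user-supplied name
// alongside the uuid libnetwork generates.
type nameIndex struct {
	byUUID map[string]string // uuid -> user name
	byName map[string]string // user name -> uuid
}

func newNameIndex() *nameIndex {
	return &nameIndex{byUUID: map[string]string{}, byName: map[string]string{}}
}

// CreateNetwork models the driver call with the extra name argument.
func (idx *nameIndex) CreateNetwork(uuid, name string) {
	idx.byUUID[uuid] = name
	idx.byName[name] = uuid
}

func main() {
	idx := newNameIndex()
	idx.CreateNetwork("uuid-1", "foo") // illustrative ids

	// The driver can now resolve in both directions, e.g. to map a
	// pre-created OS network named "foo" onto libnetwork's uuid.
	fmt.Println(idx.byName["foo"])
	fmt.Println(idx.byUUID["uuid-1"])
}
```

This would make the unwieldy label-based join above unnecessary, since the driver could correlate the OS-side object by name on its own.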
Automate tests on new PRs (with the additional constraint that tests need root)
Support for both variants of -P and -p