antrea-io / antrea

Kubernetes networking based on Open vSwitch

Home Page: https://antrea.io

License: Apache License 2.0

Makefile 0.25% Shell 4.59% Go 94.14% Dockerfile 0.11% HCL 0.05% PowerShell 0.71% Smarty 0.05% Jinja 0.09%
kubernetes cni cncf networking security

antrea's Introduction

Antrea

Overview

Antrea is a Kubernetes networking solution intended to be Kubernetes native. It operates at Layer 3/4 to provide networking and security services for a Kubernetes cluster, leveraging Open vSwitch as the networking data plane.

Open vSwitch is a widely adopted high-performance programmable virtual switch; Antrea leverages it to implement Pod networking and security features. For instance, Open vSwitch enables Antrea to implement Kubernetes Network Policies in a very efficient manner.

Prerequisites

Antrea has been tested with Kubernetes clusters running version 1.19 or later.

  • NodeIPAMController must be enabled in the Kubernetes cluster.
    When deploying a cluster with kubeadm, the --pod-network-cidr <cidr> option must be specified. Alternatively, the NodeIPAM feature of the Antrea Controller should be enabled and configured.
  • Open vSwitch kernel module must be present on every Kubernetes node.

Getting Started

Getting started with Antrea is very simple, and takes only a few minutes. See how it's done in the Getting started document.

Contributing

The Antrea community welcomes new contributors. We are waiting for your PRs!

Community

Also check out @ProjectAntrea on Twitter!

Features

  • Kubernetes-native: Antrea follows best practices to extend the Kubernetes APIs and provide familiar abstractions to users, while also leveraging Kubernetes libraries in its own implementation.
  • Powered by Open vSwitch: Antrea relies on Open vSwitch to implement all networking functions, including Kubernetes Service load-balancing, and to enable hardware offloading in order to support the most demanding workloads.
  • Run everywhere: Run Antrea in private clouds, public clouds and on bare metal, and select the appropriate traffic mode (with or without overlay) based on your infrastructure and use case.
  • Comprehensive policy model: Antrea provides a comprehensive network policy model, which builds upon Kubernetes Network Policies with new features such as policy tiering, rule priorities, cluster-level policies, and Node policies. Refer to the Antrea Network Policy documentation for a full list of features.
  • Windows Node support: Thanks to the portability of Open vSwitch, Antrea can use the same data plane implementation on both Linux and Windows Kubernetes Nodes.
  • Multi-cluster networking: Federate multiple Kubernetes clusters and benefit from a unified data plane (including multi-cluster Services) and a unified security posture. Refer to the Antrea Multi-cluster documentation to get started.
  • Troubleshooting and monitoring tools: Antrea comes with CLI and UI tools which provide visibility and diagnostics capabilities (packet tracing, policy analysis, flow inspection). It exposes Prometheus metrics and supports exporting network flow information to collectors and analyzers.
  • Network observability and analytics: Antrea + Theia enable fine-grained visibility into the communication among Kubernetes workloads. Theia provides visualization for Antrea network flows in Grafana dashboards, and recommends Network Policies to secure the workloads.
  • Network Policies for virtual machines: Antrea-native policies can be enforced on non-Kubernetes Nodes including VMs and baremetal servers. Project Nephe implements security policies for VMs across clouds, leveraging Antrea-native policies.
  • Encryption: Encryption of inter-Node Pod traffic with IPsec or WireGuard tunnels.
  • Easy deployment: Antrea is deployed by applying a single YAML manifest file.

To explore more Antrea features and their usage, check the Getting started document and user guides in the Antrea documentation folder. Refer to the Changelogs for a detailed list of features introduced for each version release.

Adopters

For a list of Antrea Adopters, please refer to ADOPTERS.md.

Roadmap

We are adding features very quickly to Antrea. Check out the list of features we are considering on our Roadmap page. Feel free to throw your ideas in!

License

Antrea is licensed under the Apache License, version 2.0

antrea's People

Contributors

abhiraut, antoninbas, antrea-bot, atish-iaf, ceclinux, dependabot[bot], dreamtalen, dyanngg, gran-vmv, graysonwu, heanlan, hjiajing, hongliangl, jainpulkit22, jianjuns, ksamoray, luolanzone, lzhecheng, mengdie-song, qiyueyao, ruicao93, srikartati, tnqn, weiqiangt, wenqiq, wenyingd, xinshuyang, xliuxu, zhangyw18, zyiou

antrea's Issues

"make fmt" fails on Mac OS after making one of the "docker-*" targets

Describe the bug
Running make fmt fails on Mac OS after running make docker-test-unit because too many Go source files are found when evaluating GO_FILES:

GO_FILES        := $(shell find . -name '*.go')

This is because make docker-test-unit creates a .cache folder into which all the Go dependencies are pulled. find descends into this directory and finds too many files, producing a shell argument list that exceeds the system limit.

To Reproduce
Run make docker-test-unit followed by make fmt on Mac OS.

Expected
make fmt should invoke gofmt on the Go source files.

Actual behavior

===> Formatting Go files <===
make: /bin/sh: Argument list too long
make: *** [fmt] Error 1

Versions:
N/A

Additional context
A simple fix would be to exclude the .cache folder from the Go source file search.
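
The exclusion itself would belong in the Makefile's find invocation. As an illustration of the same pruning idea in Go, here is a minimal sketch that walks a tree and skips a directory by name. The names goFiles and demoTree are hypothetical, and the in-memory tree only stands in for a real checkout.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// goFiles returns every *.go file under fsys, without descending into any
// directory whose base name equals skipDir (for example ".cache").
func goFiles(fsys fs.FS, skipDir string) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == skipDir {
				return fs.SkipDir // prune the whole subtree
			}
			return nil
		}
		if path.Ext(p) == ".go" {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}

// demoTree is a hypothetical in-memory layout standing in for a checkout
// after the dependency cache has been populated.
func demoTree() fs.FS {
	return fstest.MapFS{
		"cmd/agent/main.go":     {Data: []byte("package main")},
		"pkg/util/util.go":      {Data: []byte("package util")},
		".cache/mod/dep/dep.go": {Data: []byte("package dep")},
		"Makefile":              {Data: []byte("")},
	}
}

func main() {
	files, err := goFiles(demoTree(), ".cache")
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```

Pruning at the directory level, rather than filtering paths afterwards, keeps the walk from ever touching the cached dependencies.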

Building image failed intermittently

Describe the bug
make build sometimes fails because of a missing intermediate image: the image was present at Step 6/19 but was missing when files were copied from it at Step 15/19. I am not sure whether I am the only one hitting this problem or whether it is a multi-stage Dockerfile bug.

Here is the failure output:

# make build
===> Building Antrea bins and antrea-ubuntu Docker image <===
docker build -t antrea-ubuntu -f build/images/Dockerfile.build.ubuntu .
Sending build context to Docker daemon  4.493MB
Step 1/19 : FROM ubuntu:18.04 as cni-binaries
 ---> 775349758637
Step 2/19 : RUN apt-get update &&     apt-get install -y --no-install-recommends wget ca-certificates
 ---> Using cache
 ---> 9e6ffbddf095
Step 3/19 : RUN mkdir -p /opt/cni/bin &&     wget -q -O - https://dl.k8s.io/network-plugins/cni-plugins-amd64-v0.7.5.tgz | tar xz -C /opt/cni/bin ./host-local
 ---> Using cache
 ---> 96eb7fde3052
Step 4/19 : FROM ubuntu:18.04 as ovs-debs
 ---> 775349758637
Step 5/19 : RUN apt-get update &&     apt-get install -y --no-install-recommends wget ca-certificates build-essential fakeroot graphviz             bzip2 autoconf automake debhelper dh-autoreconf libssl-dev libtool openssl procps             python-all python-twisted-conch python-zopeinterface python-six libunbound-dev
 ---> Using cache
 ---> 7fa6559fe58f
Step 6/19 : RUN wget -q -O - https://www.openvswitch.org/releases/openvswitch-2.11.1.tar.gz  | tar xz -C /tmp &&     cd /tmp/openvswitch* && DEB_BUILD_OPTIONS='parallel=8 nocheck' fakeroot debian/rules binary &&     cd /tmp && mkdir ovs-debs &&     mv libopenvswitch_*.deb openvswitch-common_*.deb openvswitch-switch_*.deb python-openvswitch_*.deb        openvswitch-ipsec_*.deb ovs-debs/
 ---> Using cache
 ---> 18001c975fc6
Step 7/19 : FROM golang:1.12 as antrea-build
 ---> bc0268f5ce47
Step 8/19 : COPY . /antrea
 ---> d18a47a3a2f7
Step 9/19 : WORKDIR /antrea
 ---> Running in 3be2a70493bd
Removing intermediate container 3be2a70493bd
 ---> c5c007137991
Step 10/19 : RUN make bin
 ---> Running in d916da84d8aa
GOBIN=/antrea/bin go install  -ldflags ' -X github.com/vmware-tanzu/antrea/pkg/version.Version=v0.0.1 -X github.com/vmware-tanzu/antrea/pkg/version.GitSHA=c46f4d8 -X github.com/vmware-tanzu/antrea/pkg/version.GitTreeState=dirty -X github.com/vmware-tanzu/antrea/pkg/version.ReleaseStatus=unreleased' github.com/vmware-tanzu/antrea/cmd/...
...
Removing intermediate container d916da84d8aa
 ---> 57b8ec227476
Step 11/19 : FROM ubuntu:18.04
18.04: Pulling from library/ubuntu
Digest: sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
Status: Downloaded newer image for ubuntu:18.04
 ---> 775349758637
Step 12/19 : LABEL maintainer="Antrea <[email protected]>"
 ---> Using cache
 ---> 0f558a7f094f
Step 13/19 : LABEL description="A docker image to deploy the Antrea CNI. Takes care of building the Antrea binaries as part of building the image."
 ---> Using cache
 ---> fbaf9a99359a
Step 14/19 : USER root
 ---> Using cache
 ---> e51ce7d43067
Step 15/19 : COPY --from=ovs-debs /tmp/ovs-debs/* /tmp/ovs-debs/
invalid from flag value ovs-debs: No such image: sha256:18001c975fc64558008b0e3bb8346cebc0a9e3f1249ecdacda7e109df346c827
Makefile:141: recipe for target 'build-ubuntu' failed
make: *** [build-ubuntu] Error 1

To Reproduce
make build reproduces it, but not every time.

Expected
make build should succeed consistently.

Actual behavior
make build failed intermittently.

Versions:
Please provide the following information:

  • Antrea version (Docker image tag).
    0.0.1

Additional context

go mod tidy is broken with go 1.13

Describe the bug
Failed to execute go mod tidy with go 1.13.4

go: finding github.com/vmware/octant v0.9.1
github.com/vmware-tanzu/antrea/cmd/antrea-octant-plugin imports
        github.com/vmware/octant/pkg/navigation tested by
        github.com/vmware/octant/pkg/navigation.test imports
        github.com/vmware/octant/pkg/store/fake: module github.com/vmware/octant@latest found (v0.9.1, replaced by github.com/vmware-tanzu/[email protected]), but does not contain package github.com/vmware/octant/pkg/store/fake

To Reproduce
Execute go mod tidy

Expected
It should tidy the requirements.

Actual behavior
It failed.

Versions:
Please provide the following information:

  • Antrea version (Docker image tag): 0.0.1

Do an audit of Go dependencies

According to @timothysc some entries in go.sum are surprising. They may have been pulled-in by things like the code generator and are not actually required to run the Antrea code.

We can try removing the go.sum file and running go mod tidy again.

Review verbosity of log messages

Some log messages probably use the default verbosity level of 0 but should only be logged at higher verbosity levels. For example, I was looking at some recent agent logs for e2e tests and saw the following:

I1104 18:52:26.096285       1 monitor.go:143] Updating agent monitor CRD &{{ } {antrea-agent-6st6m   /apis/clusterinformation.crd.antrea.io/v1beta1/antreaagentinfos/antrea-agent-6st6m c3277bfc-bbed-4325-8f17-59364d5b0530 415142 1 2019-11-04 18:51:26 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} UKNOWN {Pod kube-system antrea-agent-6st6m    } {Node  k8s-node-master    } [10.10.0.0/24] { br-int map[0:3 10:3 100:1 110:2 20:3 30:2 31:6 40:2 50:1 60:1 70:3 80:3 90:1]} 0 []}
I1104 18:53:26.102658       1 monitor.go:143] Updating agent monitor CRD &{{ } {antrea-agent-6st6m   /apis/clusterinformation.crd.antrea.io/v1beta1/antreaagentinfos/antrea-agent-6st6m c3277bfc-bbed-4325-8f17-59364d5b0530 415250 2 2019-11-04 18:51:26 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} UKNOWN {Pod kube-system antrea-agent-6st6m    } {Node  k8s-node-master    } [10.10.0.0/24] { br-int map[0:3 10:3 100:1 110:2 20:3 30:2 31:6 40:2 50:1 60:1 70:3 80:3 90:1]} 0 []}
I1104 18:54:26.109453       1 monitor.go:143] Updating agent monitor CRD &{{ } {antrea-agent-6st6m   /apis/clusterinformation.crd.antrea.io/v1beta1/antreaagentinfos/antrea-agent-6st6m c3277bfc-bbed-4325-8f17-59364d5b0530 415250 2 2019-11-04 18:51:26 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} UKNOWN {Pod kube-system antrea-agent-6st6m    } {Node  k8s-node-master    } [10.10.0.0/24] { br-int map[0:3 10:3 100:1 110:2 20:3 30:2 31:6 40:2 50:1 60:1 70:3 80:3 90:1]} 0 []}
I1104 18:55:26.113783       1 monitor.go:143] Updating agent monitor CRD &{{ } {antrea-agent-6st6m   /apis/clusterinformation.crd.antrea.io/v1beta1/antreaagentinfos/antrea-agent-6st6m c3277bfc-bbed-4325-8f17-59364d5b0530 415250 2 2019-11-04 18:51:26 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} UKNOWN {Pod kube-system antrea-agent-6st6m    } {Node  k8s-node-master    } [10.10.0.0/24] { br-int map[0:3 10:3 100:1 110:2 20:3 30:2 31:6 40:2 50:1 60:1 70:3 80:3 90:1]} 0 []}

I think @abhiraut wanted to take a look at this.

Except field in NetworkPolicy is not supported

Describe the bug
Except field in NetworkPolicy is not supported

To Reproduce
Create a NetworkPolicy with an ipBlock that includes the except field.

Expected
IP addresses in the except CIDR should not be allowed.

Actual behavior
IP addresses in the except CIDR are allowed.

Versions:
Please provide the following information:

  • Antrea version (Docker image tag): 0.0.1
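
To make the expected semantics concrete, here is a minimal sketch of how an ipBlock peer with an except list is meant to match: an address is allowed only if it lies inside cidr and outside every except prefix. The function name ipBlockAllows and the sample addresses are illustrative assumptions, not Antrea code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ipBlockAllows reports whether ip matches an ipBlock peer: it must fall
// inside cidr and outside every prefix listed in except.
func ipBlockAllows(cidr string, except []string, ip string) (bool, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false, err
	}
	block, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, err
	}
	if !block.Contains(addr) {
		return false, nil
	}
	for _, e := range except {
		ex, err := netip.ParsePrefix(e)
		if err != nil {
			return false, err
		}
		if ex.Contains(addr) {
			return false, nil // carved out by an except entry
		}
	}
	return true, nil
}

func main() {
	// Hypothetical policy: allow 10.0.0.0/16 except 10.0.1.0/24.
	for _, ip := range []string{"10.0.0.5", "10.0.1.5", "192.168.0.1"} {
		ok, err := ipBlockAllows("10.0.0.0/16", []string{"10.0.1.0/24"}, ip)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s allowed=%v\n", ip, ok)
	}
}
```

With the bug described above, the second address would be treated like the first one.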

Add lastHeartbeatTime to UI

Describe what you are trying to solve
To identify Antrea Controller and Agent status via UI.

Describe the solution you have in mind
Get lastHeartbeatTime from monitoring CRDs,
then add a new column named "Last Heartbeat Time" for both controller
and agent on UI to show controller and agent health status.

The column should be updated every 60 seconds if controller/agent runs fine,
as a result, we can easily identify controller/agent health status via UI with this column.

Describe how your solution impacts user flows
Users can identify Antrea Controller and Agent status via UI by checking this new column.

Describe the main design/architecture of your solution

  1. Get lastHeartbeatTime from condition field of monitoring CRDs
  2. Add a new column named "Last Heartbeat Time" for controller and agent information tables.
  3. Populate the new column with lastHeartbeatTime.

Consolidate developer-facing documentation

Documents like docs/manual-installation.md, docs/mocks.md, and docs/monitoring-crd.md should be moved to their own subdirectory, with its own README file.

Also consider automating the steps in docs/manual-installation.md using a script, if these steps are important for developer workflow.

"make build" downloads go packages repeatedly

Describe the bug
"make build" downloads Go packages repeatedly, which slows down the build by several minutes.

To Reproduce
Run "make build" on Linux.

Expected
Maybe we should just build the binary directly on Linux, as before, or consider leveraging the Go cache in the Docker build too. Otherwise, adding several minutes to the build on Linux is not worth the ability to build on Windows/OSX.

Actual behavior
"make build" downloads go packages repeatedly

Versions:
Please provide the following information:

  • Antrea version (Docker image tag): 0.0.1

Curate / simplify the README to make it more user-friendly

  • see the Contour README for a good example
  • add the "Overview" keyword for the first section and simplify it; assume readers are already familiar with such things as "CNI", otherwise they would not have landed there
  • "Features" section: unless something is very different from other CNI, just add a link in the README to that list of features. Or put it at the end of the README. Do not be redundant with ROADMAP.

Ability to run Kubernetes conformance tests in an automated fashion

Describe the problem/challenge you have
We currently do not run the network conformance tests: https://github.com/kubernetes/kubernetes/tree/master/test/e2e/network

Describe the solution you'd like
We should be able to run the applicable tests - and in particular network policy tests - either as part of CI (if they can be run in a reasonable amount of time) or on-demand (nightly / for each tagged release) on our CI testbed. Once the project matures a little, we should not be able to make a new release if there is a regression in the conformance tests.

Octant UI plugin for monitoring the Antrea components runtime information and health status

Describe what you are trying to solve
To display Antrea components runtime information and health status on UI

Describe the solution you have in mind
Write an Octant plugin to display the information gathered by monitoring CRDs (AntreaControllerInfo and AntreaAgentInfo)

Describe how your solution impacts user flows
Users can check the runtime information and health status of Antrea components by clicking "Antrea Information" in the navigation bar; tables with detailed information for both the Antrea Controller and the Antrea Agent are then shown in the UI.

Describe the main design/architecture of your solution
Antrea-Octant-Plugin will query K8s API server to get monitoring CRDs, and
display the content of monitoring CRDs by different UI components provided by Octant.

Alternative solutions that you considered
N/A

Test plan
Test on the live testbed to check UI display.

Additional context
N/A

Add and populate field lastHeartbeatTime to monitoring CRDs

Describe what you are trying to solve
Add and populate lastHeartbeatTime to show Antrea Controller and Agent status by CRD

Describe the solution you have in mind
Add lastHeartbeatTime to monitoring CRDs (AntreaControllerInfo and AntreaAgentInfo).
Monitoring CRDs will be updated every 60 seconds with lastHeartbeatTime, so we can easily know
the status of Controller/Agent by checking if lastHeartbeatTime is expected.

Describe how your solution impacts user flows
Users will know the general status of Controller and Agent by the field lastHeartbeatTime in AntreaControllerInfo and AntreaAgentInfo.

Describe the main design/architecture of your solution

  1. Add a new field lastHeartbeatTime in the condition of AntreaControllerInfo and AntreaAgentInfo.
  2. Create/Update AntreaControllerInfo and AntreaAgentInfo with the timestamp.
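
As a sketch of how a consumer might interpret the field, the check below treats a component as healthy while its last heartbeat is no older than a few update intervals. The 60-second interval comes from the proposal above; the function name heartbeatFresh and the maxMissed tolerance are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// updateInterval mirrors the 60-second refresh period proposed above.
const updateInterval = 60 * time.Second

// heartbeatFresh reports whether a heartbeat of the given age is still
// acceptable. maxMissed is a hypothetical tolerance: the number of update
// intervals that may elapse before the component is considered unhealthy.
func heartbeatFresh(age time.Duration, maxMissed int) bool {
	return age <= time.Duration(maxMissed)*updateInterval
}

func main() {
	// Stand-in for a lastHeartbeatTime read from a monitoring CRD.
	last := time.Now().Add(-90 * time.Second)
	age := time.Since(last)
	fmt.Println("fresh with tolerance 2:", heartbeatFresh(age, 2))
	fmt.Println("fresh with tolerance 1:", heartbeatFresh(age, 1))
}
```

Allowing more than one missed interval avoids flagging a component as unhealthy because of a single delayed update.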

Alternative solutions that you considered
N/A

Test plan
Test it on the live testbed.

Additional context
N/A

Antrea Controller API endpoint is not secured

Describe the bug
The Antrea Controller API endpoint is not secured: anonymous clients can get the computed networkpolicy resources.

To Reproduce

  1. Deploy antrea-controller deployment and service
  2. Access the service without providing any token or certificate; the request is accepted

Expected
Only users/serviceaccounts that have permission of reading resources under "networkpolicy.antrea.io" can get computed networkpolicy resources.

Actual behavior
Anonymous clients can get computed networkpolicy resources.

Versions:
Please provide the following information:

  • Antrea version (Docker image tag).
    0.0.1
  • Kubernetes version (use kubectl version). If your Kubernetes components have different versions, please provide the version for all of them.
    1.16.2
  • Container runtime: which runtime are you using (e.g. containerd, cri-o, docker) and which version are you using?
  • Linux kernel version on the Kubernetes Nodes (uname -r).
  • If you chose to compile the Open vSwitch kernel module manually instead of using the kernel module built into the Linux kernel, which version of the OVS kernel module are you using? Include the output of modinfo openvswitch for the Kubernetes Nodes.

Additional context
Add any other context about the problem here, such as Antrea logs, kubelet logs, etc.

(Please consider pasting long output into a GitHub gist or any other pastebin.)

OVS module loading failure for RHEL and CentOS Nodes

Describe the bug
modprobe of initContainer will fail on RHEL or CentOS Nodes.

To Reproduce
Deploy Antrea on RHEL or CentOS Nodes, and initContainer will fail.

Expected
Loading OVS module by initContainer for RHEL and CentOS Nodes.

Actual behavior
modprobe in the initContainer returns an error about binary execution.

Versions:
Please provide the following information:

  • Antrea version v0.0.1

Support IPSec ESP for tunnel traffic

Describe what you are trying to solve
Enable IPSec ESP on VXLAN or Geneve tunnels across Nodes.

Describe the solution you have in mind
OVS supports IPSec tunnels, but flow-based tunnels no longer work with them; instead, a separate tunnel port needs to be created for each remote Node with IPSec configured.
We can use StrongSwan as the IKE for IPSec key exchange, and might start from using PSK (pre-shared key) for IKE authentication. The PSK value can be saved in a K8s Secret.
We will need to run the StrongSwan daemon and the OVS IPSec daemon on every Node.
These two daemons can be run in a separate container of the Antrea Agent DaemonSet.

Describe how your solution impacts user flows
There will be an option in the Antrea Agent config file to enable IPSec tunnels. A K8s Secret for saving the IKE PSK also needs to be created to enable IPSec tunnels.

Test plan
Will add e2e tests that test IPSec traffic across Nodes.

  • Enhance OVSDB client to support creating IPSec tunnel ports (#132)
  • Enable IPSec encryption of tunnel traffic (#209)
  • Allow IPSec encryption only for GRE tunnel (#329)
  • Add documentation for IPSec

Use GitHub actions to do unit test

Describe what you are trying to solve
Use GitHub actions to do unit test.

Describe the solution you have in mind
Add a GitHub Actions workflow file to enable this feature.

Describe how your solution impacts user flows
By using this, we can save more resources for CI to do e2e tests.
Each push would trigger the GitHub action.
When a pull request is created, the action result (Failed or Success) is shown on it.

Describe the main design/architecture of your solution
Empty.

Alternative solutions that you considered
Empty.

Test plan
Empty.

Additional context
An example.

Support Kubernetes NetworkPolicy

Describe what you are trying to solve
Make Antrea support Kubernetes NetworkPolicy.

Describe the solution you have in mind
Refer to https://github.com/vmware-tanzu-private/antrea/blob/master/docs/architecture.md#networkpolicy

Describe how your solution impacts user flows
User can create Kubernetes NetworkPolicy and expect they are enforced by Antrea.

Describe the main design/architecture of your solution
Refer to https://github.com/vmware-tanzu-private/antrea/blob/master/docs/architecture.md#networkpolicy

  • Add a common ram-based storage interface (f8d6306)
  • Implement Antrea Network Policy store (b9df736)
  • Implement Antrea APIServer (1e6d6b0)
  • Implement antrea-agent side Network Policy controller which watches computed internal Network Policy resources from the Antrea Controller API endpoint and invokes the OpenFlow interfaces (#53, #55)
  • Implement antrea-controller side Network Policy controller which watches K8s NetworkPolicy, Pod and Namespace resources and computes the internal resources consumed by antrea-agent (#54, #82)
  • Implement OpenFlow interfaces that program OVS to enforce NetworkPolicy (#57)

Test plan

Additional context

Leverage Openflow go binding instead of command line

Describe the problem/challenge you have
Currently Antrea uses the ovs-ofctl command to install OpenFlow entries for connectivity and network policy. Because each command starts a new process, this is inefficient.

Describe the solution you'd like
I will introduce a Go binding for installing OpenFlow entries. Antrea opens a Unix domain socket connection to OVS and uses it to send OpenFlow messages.
The Go binding should follow the existing OpenFlow control interface in Antrea, and the OpenFlow entries realized on OVS should be the same as when using the command line.
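
To illustrate what sending OpenFlow messages over a Unix domain socket involves at the lowest level, here is a minimal sketch that encodes an OpenFlow 1.3 HELLO header and writes it to a socket. It is not the proposed binding: the names helloMessage and sendHello are hypothetical, and the socket path mentioned in the comment is an assumption about a typical OVS installation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
	"os"
)

const (
	ofVersion13 = 0x04 // OpenFlow 1.3 wire version
	ofTypeHello = 0x00 // OFPT_HELLO
	ofHeaderLen = 8    // every OpenFlow message starts with an 8-byte header
)

// helloMessage encodes a bare OpenFlow header: version, type, length and
// transaction ID, with multi-byte fields in network byte order.
func helloMessage(xid uint32) []byte {
	msg := make([]byte, ofHeaderLen)
	msg[0] = ofVersion13
	msg[1] = ofTypeHello
	binary.BigEndian.PutUint16(msg[2:4], ofHeaderLen)
	binary.BigEndian.PutUint32(msg[4:8], xid)
	return msg
}

// sendHello dials the given Unix domain socket and writes one HELLO.
func sendHello(socketPath string, xid uint32) error {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(helloMessage(xid))
	return err
}

func main() {
	fmt.Printf("% x\n", helloMessage(1))
	// Only dial when a socket path is passed on the command line, e.g. a
	// bridge management socket such as /var/run/openvswitch/br-int.mgmt
	// (this path is an assumption, not taken from the issue).
	if len(os.Args) > 1 {
		if err := sendHello(os.Args[1], 1); err != nil {
			fmt.Fprintln(os.Stderr, "send failed:", err)
		}
	}
}
```

A long-lived connection of this kind is what removes the per-command process start that the ovs-ofctl approach pays for.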

Describe how your solution impacts user flows
N/A

Curate "Getting started" section in README

  • reduce the size of the section and consider having a link to a separate page (e.g. something like https://projectcontour.io/getting-started/)
  • most users will be using kubeadm, so assume that: make kubeadm the default and show which parameters to pass to kubeadm.
  • many people want to try the project first and see if it works before digging in more, the kubectl apply command (pointing to an online Antrea manifest) should come first

Kubelet errors since host netns file doesn't exist

After deploying antrea on a cluster (provisioned with Cluster API + kubeadm), pods were stuck in ContainerCreating state.

From the kubelet I saw:

Warning  FailedCreatePodSandBox  2m35s                kubelet, target-cluster01-md-0-7d99bd4955-xf6qp  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9056817dafde39da20dba740f2b60f5f057f5a61f5ea0d56bdbd5da388ed978b": failed to Statfs "/host/var/run/netns/cni-47e6e75f-07a2-9890-3c38-95b9f8f3771c": no such file or directory
Warning  FailedCreatePodSandBox  2m35s                kubelet, target-cluster01-md-0-7d99bd4955-xf6qp  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9056817dafde39da20dba740f2b60f5f057f5a61f5ea0d56bdbd5da388ed978b": failed to Statfs "/host/var/run/netns/cni-47e6e75f-07a2-9890-3c38-95b9f8f3771c": no such file or directory

From antrea-agent I saw similar logs:

E1104 19:03:43.791611       1 pod_configuration.go:215] Failed to open netns with /host/var/run/netns/cni-90c11885-925e-c670-8bbd-3727281ddae1: failed to Statfs "/host/var/run/netns/cni-90c11885-925e-c670-8bbd-3727281ddae1": no such file or directory
E1104 19:03:43.791622       1 server.go:351] Failed to configure container 2bf8333eafa19313d398bc07afeaa0496d11d01ab4eb9dd5d3f7485fb7e76d83 interface: failed to Statfs "/host/var/run/netns/cni-90c11885-925e-c670-8bbd-3727281ddae1": no such file or directory
W1104 19:03:43.791628       1 server.go:317] CmdAdd has failed, and try to rollback

I tried host path mounting /var/run/netns to the antrea agent but that didn't fix it.

Running antrea agent on the host breaks

While running antrea agent on the host it fails immediately with the following error:

Nov 18 05:53:30 antrea-vm1 antrea-agent[7305]: F1118 05:53:30.088121 7305 main.go:56] Error running agent: error creating Antrea client: unable to load in-cluster configuration, ANTREA_SERVICE_HOST and ANTREA_SERVICE_PORT must be defined

Antrea needs a process for security issue

Describe the bug
Not a software bug, but a missing element for project governance.

To Reproduce
N/A

Expected
Contributors should have clear indication concerning the process for reporting, fixing, and disclosing security issues.

Actual behavior
There is no process at the moment.

Versions:
N/A

Additional context
N/A

Cleanup Makefile targets

We should probably put the most "important" targets together at the top and move helper targets further down.

@salv-orlando also pointed out that informational messages sometimes use ==> X <== and sometimes use ===> X <===, and we should probably make them uniform.

go get fails with an "invalid version" error

Describe the bug
go get fails with an "invalid version" error

To Reproduce
Run:

go get github.com/vmware-tanzu/antrea

Expected
The repository to be downloaded / cloned

Actual behavior

$ go get github.com/vmware-tanzu/antrea
go get: github.com/vmware-tanzu/[email protected] requires
	github.com/vmware/[email protected]: invalid version: unknown revision 000000000000

Upload base Docker image with pre-built OVS to docker repo

Describe what you are trying to solve
Building the Antrea Docker image currently takes upwards of 10 minutes since the OVS userspace daemons have to be built from source.

Describe the solution you have in mind
Move this code (https://github.com/vmware-tanzu-private/antrea/blob/master/build/images/Dockerfile.build.ubuntu#L19) to a separate Dockerfile. Build the image locally and push it once to the antrea Dockerhub organization as antrea/openvswitch:2.11.1. The Antrea Docker image can then use that image as a base. The steps will be documented in case we need to upload a different Docker image for OVS in the future. I am open to suggestions for which tag to use for the OVS Docker image.

Describe how your solution impacts user flows
End users are not affected. For developers & CI, this means the Antrea image can be built much faster.

Describe the main design/architecture of your solution
N/A

Alternative solutions that you considered
N/A

Test plan
N/A

Additional context
N/A

Get OVS Version from database for Monitoring CRD to show

Describe what you are trying to solve
Let the monitoring CRD show OVS version information.

Describe the solution you have in mind
Get the ovs_version from OVSDB table Open_vSwitch, then add ovs_version to CRD AntreaAgentInfo.

Describe how your solution impacts user flows
N/A

Describe the main design/architecture of your solution

  1. Add a query interface to get ovs_version from OVSDB.
  2. Add the field ovs_version to monitoring CRD AntreaAgentInfo.
  3. Populate ovs_version for AntreaAgentInfo.
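
For step 1, the query can be expressed as an OVSDB JSON-RPC "transact" request that selects the ovs_version column from the Open_vSwitch table. The sketch below only builds the request body; the type and function names are hypothetical, and sending it over the OVSDB connection and decoding the reply are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// selectOp is one OVSDB "select" operation. Structs are used instead of
// maps so that the JSON field order is stable.
type selectOp struct {
	Op      string        `json:"op"`
	Table   string        `json:"table"`
	Where   []interface{} `json:"where"`
	Columns []string      `json:"columns"`
}

// rpcRequest is a JSON-RPC 1.0 request as used by the OVSDB protocol.
type rpcRequest struct {
	Method string        `json:"method"`
	Params []interface{} `json:"params"`
	ID     int           `json:"id"`
}

// ovsVersionRequest builds a transact request that reads ovs_version from
// every row of the Open_vSwitch table in the Open_vSwitch database.
func ovsVersionRequest(id int) ([]byte, error) {
	req := rpcRequest{
		Method: "transact",
		Params: []interface{}{
			"Open_vSwitch", // database name
			selectOp{
				Op:      "select",
				Table:   "Open_vSwitch",
				Where:   []interface{}{}, // no condition: match all rows
				Columns: []string{"ovs_version"},
			},
		},
		ID: id,
	}
	return json.Marshal(req)
}

func main() {
	body, err := ovsVersionRequest(0)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The Open_vSwitch table normally holds a single row, so the reply should carry one ovs_version value to copy into AntreaAgentInfo.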

Update the mailing list addresses

We have created the following googlegroups:

  • projectantrea as the main mailing list, including where users can ask for help
  • projectantrea-dev as the open mailing list to discuss development activities (probably not as useful as Github issues)
  • projectantrea-announce as a restricted & curated mailing list (everyone can subscribe, only a few people can join) where we only post announcements for releases & events
  • projectantrea-maintainers to reach out to the maintainers privately

Document OVS Datapath Flows

Describe the problem/challenge you have
I think a diagram of the OVS data path flows managed by Antrea would be really useful to users (and developers). This would make debugging issues, developing features and fixing bugs way easier.

Describe the solution you'd like
A page that illustrates the data path of a packet going through the OpenFlow tables managed by Antrea.

Anything else you would like to add?

Revisit API groups naming

For example currently we have clusterinformation.crd.antrea.io . We will need to make this uniform with other vmware-tanzu repos (e.g. should end with a vmware-tanzu.com domain).

Document and image for deploying octant and antrea-octant-plugin

Describe what you are trying to solve
Add document and image for deploying octant and antrea-octant-plugin.

Describe the solution you have in mind
Add a document and a Docker image to help users easily deploy octant and antrea-octant-plugin.

Describe how your solution impacts user flows
Users can deploy octant and install antrea-octant-plugin as a Pod or as a process, depending on their needs.

Describe the main design/architecture of your solution

  1. Provide a separate Docker image, antrea-octant, to deploy octant together with antrea-octant-plugin.
  2. Add instructions to the documentation so that users can easily deploy octant together with antrea-octant-plugin in different ways.

Alternative solutions that you considered
N/A

Test plan
N/A

Additional context
N/A

Remove stale gateway routes on antrea-agent startup

Describe what you are trying to solve
At the moment, the Node route controller simply starts watching for Node updates from the K8s API server. For new Nodes, the appropriate flows are added to the vSwitch (by calling InstallNodeFlows) and the appropriate route is added to the gateway interface on the host. On agent restart, we replay flows & routes for all Nodes; however, we do not clean up stale routes on the gateway, which seems like something worth doing.

Describe the solution you have in mind
When starting the Node route controller, after syncing-up the caches in the Run method, we should list all the Nodes using the nodeLister object and take care of removing stale routes by comparing the set of desired routes with the set of existing routes.

Describe how your solution impacts user flows
N/A

Describe the main design/architecture of your solution
All the logic will be included in a single Go method (method on the Controller object) called reconcile. The method will be called synchronously from the Run method, after syncing-up the caches (so that nodeLister is up-to-date) and before starting the workers. reconcile will install missing routes and remove stale routes. It may also replay the necessary flows with InstallNodeFlows, or let the workers be responsible for doing it.
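
The core comparison in reconcile could look roughly like the sketch below. It is not the actual Antrea implementation: `diffRoutes` is a hypothetical helper, routes are represented only by their destination CIDR strings, and the calls that would list Nodes with nodeLister or program routes on the gateway are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// diffRoutes is a hypothetical helper. It compares the desired route
// destinations (one per Node) with the destinations currently installed
// on the gateway, and returns the routes to install and the routes to remove.
func diffRoutes(desired, existing []string) (missing, stale []string) {
	desiredSet := make(map[string]struct{}, len(desired))
	for _, cidr := range desired {
		desiredSet[cidr] = struct{}{}
	}
	existingSet := make(map[string]struct{}, len(existing))
	for _, cidr := range existing {
		existingSet[cidr] = struct{}{}
	}

	// Desired but not installed: must be added.
	for cidr := range desiredSet {
		if _, ok := existingSet[cidr]; !ok {
			missing = append(missing, cidr)
		}
	}
	// Installed but no longer desired: stale, must be removed.
	for cidr := range existingSet {
		if _, ok := desiredSet[cidr]; !ok {
			stale = append(stale, cidr)
		}
	}

	// Sort so the result does not depend on map iteration order.
	sort.Strings(missing)
	sort.Strings(stale)
	return missing, stale
}

func main() {
	// Example values only; real input would come from the Node list
	// and from the routes found on the gateway interface.
	desired := []string{"10.10.0.0/24", "10.10.1.0/24"}
	existing := []string{"10.10.1.0/24", "10.10.9.0/24"}

	missing, stale := diffRoutes(desired, existing)
	fmt.Println("install:", missing)
	fmt.Println("remove:", stale)
}
```

Keeping the comparison as a pure function, separate from the code that touches the host routing table, would make it easy to unit test independently of the e2e test.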

Alternative solutions that you considered
Because the reconcile method is called synchronously, it could increase the initialization time of the Node route controller for large clusters with 1000s of Nodes. Workers will not be started until reconcile completes. I believe this is unlikely to be an issue, but if it turns out to be one, we should consider 1) provisioning multiple goroutines to perform route reconciliation, and/or 2) performing reconciliation in the background while the workers are running. Option 2) would definitely increase the code complexity, since some synchronization mechanism would then be required. At the moment, I believe it is better to stick with the simpler solution described above.

Test plan
I will write an e2e test to test that stale routes are correctly removed. I will simulate a stale route by adding a dummy route to the gateway interface and restarting the Agent.

Additional context
N/A

Document the CI pipeline

Describe which sets of tests are being run, along with which linters are run. The document should also indicate how to debug CI failures.

Add Jenkins CI

When the repo becomes public, some (or all) of the CI can be transitioned to a public CI provider such as Travis CI.
