
mesos-consul's Introduction


Overview

Join the chat at https://gitter.im/CiscoCloud/mantl

Mantl is a modern, batteries-included platform for rapidly deploying globally distributed services.


Features

Core Components

  • Consul for service discovery
  • Vault for managing secrets
  • Mesos cluster manager for efficient resource isolation and sharing across distributed services
  • Marathon for cluster management of long running containerized services
  • Kubernetes for managing, organizing, and scheduling containers
  • Terraform deployment to multiple cloud providers
  • Docker container runtime
  • Traefik for proxying external traffic
  • mesos-consul for populating Consul service discovery with Mesos tasks
  • Mantl API for easily installing supported Mesos frameworks on Mantl
  • Mantl UI, a beautiful administrative interface to Mantl

Addons

  • ELK Stack for log collection and analysis
  • GlusterFS for container volume storage
  • Docker Swarm for clustering Docker daemons between networked hosts
  • etcd, distributed key-value store for Calico
  • Calico, a new kind of virtual network
  • collectd for metrics collection
  • Chronos, a distributed task scheduler
  • Kong for managing APIs

See the addons/ directory for the most up-to-date information.

Goals

  • Security
  • High availability
  • Rapid immutable deployment (with Terraform + Packer)

Architecture

The base platform contains control nodes that manage the cluster and any number of agent nodes. Containers automatically register themselves into DNS so that other services can locate them.


Control Nodes

The control nodes manage a single datacenter. Each control node runs Consul for service discovery, Mesos and Kubernetes leaders for resource scheduling and Mesos frameworks like Marathon.

The Consul Ansible role will automatically bootstrap and join multiple Consul nodes. The Mesos role will provision highly available Mesos and ZooKeeper environments when more than one node is provisioned.

Agent Nodes

Agent nodes launch containers and other Mesos- or Kubernetes-based workloads.

Edge Nodes

Edge nodes are responsible for proxying external traffic into services running in the cluster.

Getting Started

All development is done on the master branch. Tested, stable versions are identified via git tags. To get started, you can clone or fork this repo:

git clone https://github.com/mantl/mantl.git

To use a stable version, use git tag to list the stable versions:

git tag
0.1.0
0.2.0
...
1.2.0


git checkout 1.2.0

A Vagrantfile is provided that provisions everything on a few VMs. To use it, first ensure that your system has at least 2GB of RAM free, then run:

vagrant up

Note:

  • There is no support for Windows at this time; however, support is planned.
  • Use the latest version of Vagrant for best results. Version 1.8 is required.
  • There is no support for the VMware Fusion Vagrant provider; hence the provider is set to VirtualBox in the Vagrantfile.

Software Requirements

The only requirements for running Mantl are working installations of Terraform and Ansible (or Vagrant, if you're deploying to VMs). See the "Development" sections for requirements for developing Mantl.

Deploying on multiple servers

Please refer to the Getting Started Guide, which covers cloud deployments.

Documentation

All documentation is located at http://docs.mantl.io.

To build the documentation locally, run:

sudo pip install -r requirements.txt
cd docs
make html

Roadmap

Mesos Frameworks

  • Marathon
  • Kafka
  • Riak
  • Cassandra
  • Elasticsearch
  • HDFS
  • Spark
  • Storm
  • Chronos
  • MemSQL

Note: The most up-to-date list of Mesos frameworks that are known to work with Mantl is always in the mantl-universe repo.

Security

  • Manage Linux user accounts
  • Authentication and authorization for Consul
  • Authentication and authorization for Mesos
  • Authentication and authorization for Marathon
  • Application load balancer (based on Traefik)
  • Application dynamic firewalls (using consul template)

Operations

  • Logging (with the ELK stack)
  • Metrics (with the collectd addon)
  • In-service upgrade with rollback
  • Autoscaling of worker nodes
  • Self maintaining system (log rotation, etc)
  • Self healing system (automatic failed instance replacement, etc)

Supported Platforms

Community Supported Platforms

Please see milestones for more details on the roadmap.

Development

If you're interested in contributing to the project, install Terraform and the Python modules listed in requirements.txt and follow the Getting Started instructions. To build the docs, enter the docs directory and run make html. The docs will be output to _build/html.

Good issues to start with are marked with the low hanging fruit tag.

To keep your fork up to date:

1. Clone your fork:

git clone git@github.com:YOUR-USERNAME/mantl.git

2. Add the original repository as a remote in your fork:

cd into/cloned/fork-repo
git remote add upstream git://github.com/mantl/mantl.git
git fetch upstream

3. Update your fork from the original repo to keep up with its changes:

git pull upstream master

Getting Support

If you encounter any issues, please open a GitHub issue against the project. We review issues daily.

We also have a gitter chat room. Drop by and ask any questions you might have. We'd be happy to walk you through your first deployment.

Cisco Intercloud Services provides support for OpenStack based deployments of Mantl.

License

Copyright © 2015 Cisco Systems, Inc.

Licensed under the Apache License, Version 2.0 (the "License").

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

mesos-consul's People

Contributors

annaken, atongen, awdrius, barthoda, brianhicks, bryanstephens, chrisaubuchon, corebug, cpredmann, dylanmei, ernestas-poskus, ersitzt, isaacd9, jnonon, kamaradclimber, mnp, naemono, rncry, ryane, stevendborrelli, theaxiom, thomasvincent


mesos-consul's Issues

dns failing inside the container

We're using local Consul agents to provide DNS for all the nodes in our cluster. This is set up via resolv.conf and works without any problems both on the host and inside all other running containers.

mesos-consul, however, cannot seem to resolve hosts (namely ZooKeeper). I can ping ZooKeeper on the host and it resolves, and I can start an Ubuntu container and ping ZooKeeper from inside it. I've tried changing the entrypoint of mesos-consul to /bin/sh, and pinging our ZooKeeper hosts fails inside the container. Any ideas?

mesos-consul acts up when mesos-agent is not responding

I have a big Mesos cluster with 200 participating nodes.

I am seeing the following issues with mesos-consul when one or more mesos-slaves are not working:

  1. It doesn't de-register those mesos-slaves.
  2. Since some of the slaves are not responding, it doesn't update the services properly in Consul. For example, when I scale apps down in Marathon, it doesn't update the members for that Consul service.

The only way I am able to get it to work is to first resolve the issue with the mesos-slave that it thinks is part of the cluster, and then restart mesos-consul. Only then does it update the members properly from Marathon to Consul.

I think we need to make mesos-consul more robust so that it can handle these scenarios in production, because Consul services are the source of truth. If the Consul services do not reflect the right members from Marathon, it defeats the whole purpose of using mesos-consul.

-Imran
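
One way to read the robustness request above is that a failure to reach one agent should not block updates for the others. The following is only a minimal Python sketch of that idea, not mesos-consul's actual code (which is written in Go). The `fetch_state` callable, the agent names, and the state shape are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collect_agent_states(
    agents: Iterable[str],
    fetch_state: Callable[[str], dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Fetch state from every agent, isolating per-agent failures.

    fetch_state is a hypothetical callable that returns one agent's state
    or raises an exception (for example on a timeout).
    """
    states: Dict[str, dict] = {}
    unreachable: List[str] = []
    for agent in agents:
        try:
            states[agent] = fetch_state(agent)
        except Exception:
            # Record the failure and keep going so that healthy agents
            # are still processed in this refresh cycle.
            unreachable.append(agent)
    return states, unreachable


def fake_fetch(agent: str) -> dict:
    # Stand-in for a real HTTP call; "agent-2" simulates a dead agent.
    if agent == "agent-2":
        raise TimeoutError("agent did not respond")
    return {"tasks": [f"{agent}-task"]}


states, unreachable = collect_agent_states(
    ["agent-1", "agent-2", "agent-3"], fake_fetch
)
print(sorted(states), unreachable)
```

The list of unreachable agents could then drive de-registration of the services that were last seen on them, which is the other half of the report.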

Failed to Connect to 127.0.0.1:2181

Hello Chris,

I am trying to run the mesos-consul Docker image using the following command:

docker run -e "--zk=zk://myip:2181/mesos" -t ciscocloud/mesos-consul

The Docker instance is trying to connect to 127.0.0.1:2181 instead of myip:2181. I wonder if I am passing the zk parameter the right way (I also tried zk=zk://myip:2181/mesos, which didn't work either). Please let me know if anything needs to be updated.

thanks

disable chronos services

Hi,
Is there a way to disable the Chronos framework with mesos-consul? I have finished Chronos tasks showing up as services in Consul. I don't have a use for them, and they are polluting the Consul services namespace.

-Imran

Registering private IPs thru mesos-consul

I am provisioning containers on a Mesos framework using Calico (a multi-host networking solution). Calico assigns a separate IP from my defined IP pool to each container that I provision through it.

The issue is that when my container is registered through mesos-consul, I see the host IP of the container in Consul. Is there a way to register the container's IP through mesos-consul instead, so that Consul reports the container's IP rather than the host IP?

Thanks
Lax

Integrate with the mesos-dns state, records, and labels packages

The mesos-dns state parser and some of its other generators do a reasonably good job of parsing state.json and exposing the key datapoints for service discovery.

It would be ideal, in my mind, for effort to not be split across the two projects to support building these primitives. Perhaps an opportunity for collaboration exists (and a reduction of overall effort) by importing those parsers and simply shipping the output to consul?

mesos-consul can't detect all services/containers

I am running consul servers through progrium/consul Docker image on all three of my mesos-master nodes. When I start mesos-consul, it can register the mesos service but cannot register any of the containers started using Marathon. All I see on the Consul UI is consul and mesos services.

Wrong ENTRYPOINT or no binary found for ciscocloud/mesos-consul:v0.3-ubuntu

docker run ciscocloud/mesos-consul:v0.3-ubuntu --zk=zk://X.X.X.X:2181/mesos
Unable to find image 'ciscocloud/mesos-consul:v0.3-ubuntu' locally
v0.3-ubuntu: Pulling from docker.io/ciscocloud/mesos-consul

9231a146a3b3: Pull complete
616b550dc233: Pull complete
dd59d9976e39: Pull complete
4e9c8310e711: Pull complete
fd0a3f117a78: Pull complete
81f58fa75605: Pull complete
d9841a21fe6e: Pull complete
Digest: sha256:56d3faee8561c8bc53c5cae8fcb20862402ed0232fe0695b5532f6d6a942a234
Status: Downloaded newer image for docker.io/ciscocloud/mesos-consul:v0.3-ubuntu
exec: "/bin/mesos/consul": stat /bin/mesos/consul: no such file or directory
Error response from daemon: Cannot start container 4d582f9067b91c279c033af2b55d66301c7b2ea561af0389cc551ee96f295760: [8] System error: exec: "/bin/mesos/consul": stat /bin/mesos/consul: no such file or directory

So I guess either the build failed, the binary is actually somewhere else or the ENTRYPOINT is wrong?

mesos-consul doesn't de-register inactive services after restart

When mesos-consul starts up, it does not know about any of the services it may have previously registered. This can result in orphaned, seemingly duplicate services in consul. mesos-consul needs a way to identify services that it had previously registered so that it can properly deregister them if they are no longer active.
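
Service IDs quoted elsewhere on this page look like `mesos-consul:<ip>:<name>:<port>`. Assuming that prefix can be treated as an ownership marker (an assumption, not a documented guarantee), the clean-up described above could be sketched in Python as a set difference between what Consul holds and what is currently active. The function name and sample IDs are hypothetical.

```python
from typing import Iterable, List, Set

# Assumed ownership marker, based on the service IDs quoted on this page.
OWNER_PREFIX = "mesos-consul:"


def stale_service_ids(registered: Iterable[str], active: Set[str]) -> List[str]:
    """Return IDs that carry the owner prefix but are no longer active.

    Services without the prefix are never touched, so entries registered
    by other tools stay in place.
    """
    return sorted(
        sid
        for sid in registered
        if sid.startswith(OWNER_PREFIX) and sid not in active
    )


registered = [
    "mesos-consul:10.0.0.1:web:31000",  # left over from a previous run
    "mesos-consul:10.0.0.1:web:31001",  # still running
    "consul",                           # not owned by the bridge
]
active = {"mesos-consul:10.0.0.1:web:31001"}
print(stale_service_ids(registered, active))
```

Because ownership is derived from the ID itself, this approach needs no local state to survive a restart.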

can't compile under ubuntu

Attempting to compile under Ubuntu (to fix the search domain 'feature' of Alpine Linux) fails with the following error:

root@81ada8270ab0:/go/src/github.com/CiscoCloud/mesos-consul# go get
# github.com/CiscoCloud/mesos-consul/mesos/zkdetect
mesos/zkdetect/client.go:253: undefined: zk.StateSyncConnected

Dockerfile used:

FROM ubuntu:14.04

MAINTAINER Chris Aubuchon <[email protected]>

COPY . /go/src/github.com/CiscoCloud/mesos-consul

RUN apt-get update -y
RUN apt-get install -y golang git mercurial
#RUN cd /go/src/github.com/CiscoCloud/mesos-consul \
#   && export GOPATH=/go \
#   && go get \
#   && go build -o /bin/mesos-consul \
#   && rm -rf /go \
#   && apt-get remove golang git

ENTRYPOINT [ "/bin/mesos-consul" ]

Consul in HA mode

Just a small question: is mesos-consul an actual implementation of HashiCorp's Consul, or just a bridge into an existing Consul cluster? If it is an implementation of HashiCorp's Consul, how can one put multiple Consul servers into a cluster node? mesos-consul.json looks like a single-instance implementation.

Thanks in advance.

Alex

Build failing - Directory not empty

Trying to build mesos-consul, unfortunately getting errors:

[root@node236 mesos-consul]# docker build -t mesos-consul .
Sending build context to Docker daemon 331.8 kB
Step 1 : FROM gliderlabs/alpine:3.1
 ---> 49658ac01bf6
Step 2 : MAINTAINER Chris Aubuchon <[email protected]>
 ---> Running in 1080bd0e3558
 ---> aa6d2bb3d8ab
Removing intermediate container 1080bd0e3558
Step 3 : COPY . /go/src/github.com/CiscoCloud/mesos-consul
 ---> 65772d7cdaf8
Removing intermediate container a657cd2bc73f
Step 4 : RUN apk add --update go git mercurial  && cd /go/src/github.com/CiscoCloud/mesos-consul        && export GOPATH=/go    && go get  && go build -o /bin/mesos-consul         && rm -rf /go   && apk del --purge go git mercurial
 ---> Running in 48f821a34d3c
fetch http://alpine.gliderlabs.com/alpine/v3.1/main/x86_64/APKINDEX.tar.gz
(1/21) Installing go (1.3.3-r1)
(2/21) Installing run-parts (4.4-r0)
(3/21) Installing openssl (1.0.1s-r0)
(4/21) Installing lua5.2-libs (5.2.3-r0)
(5/21) Installing lua5.2 (5.2.3-r0)
(6/21) Installing lua5.2-posix (32-r1)
(7/21) Installing ca-certificates (20141019-r0)
(8/21) Installing libssh2 (1.4.3-r1)
(9/21) Installing curl (7.39.0-r0)
(10/21) Installing expat (2.1.0-r1)
(11/21) Installing pcre (8.36-r2)
(12/21) Installing git (2.2.1-r0)
(13/21) Installing libbz2 (1.0.6-r3)
(14/21) Installing libffi (3.0.13-r0)
(15/21) Installing gdbm (1.11-r0)
(16/21) Installing ncurses-terminfo-base (5.9-r3)
(17/21) Installing ncurses-libs (5.9-r3)
(18/21) Installing readline (6.3-r3)
(19/21) Installing sqlite-libs (3.8.10.2-r1)
(20/21) Installing python (2.7.9-r0)
(21/21) Installing mercurial (3.2.2-r0)
Executing busybox-1.22.1-r15.trigger
Executing ca-certificates-20141019-r0.trigger
OK: 173 MiB in 36 packages
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/hooks': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/info': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/logs/refs/heads': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/logs/refs/remotes/origin': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/logs/refs/remotes': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/logs/refs': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/logs': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/objects/pack': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/objects': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/refs/heads': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/refs/remotes/origin': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/refs/remotes': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git/refs': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/.git': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/config': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/consul': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/mesos': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/registry': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/state': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul/ubuntu': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud/mesos-consul': Directory not empty
rm: can't remove '/go/src/github.com/CiscoCloud': Directory not empty
rm: can't remove '/go/src/github.com': Directory not empty
rm: can't remove '/go/src': Directory not empty
rm: can't remove '/go': Directory not empty
The command '/bin/sh -c apk add --update go git mercurial       && cd /go/src/github.com/CiscoCloud/mesos-consul        && export GOPATH=/go        && go get       && go build -o /bin/mesos-consul        && rm -rf /go   && apk del --purge go git mercurial' returned a non-zero code: 1
[root@node236 mesos-consul]# 

Any help?

Thanks,

Alex

Stale services in Consul

After multiple deployments, I have a single service running in Marathon:

But unfortunately, Consul has some stale info in it:

The only app per Marathon should be running on 31263 but you can see that Consul is populated with some old tasks:

Any ideas ?

Thanks a lot in advance!

ip and hostname confusion

Hi,

I get a lot of messages saying [WARN] master changed to 10.76.102.60. In fact, each check leads to this message, even if the Mesos leader does not change.
Here is a complete log, which keeps repeating:

2015/07/02 09:35:37 [INFO] Zookeeper leader: <zookeeper hostname>
2015/07/02 09:35:37 [INFO] reloading from master <mesos master hostname>
2015/07/02 09:35:37 [WARN] master changed to 10.76.102.60

I dove into the code to find the source of those logs, and it led me to
https://github.com/CiscoCloud/mesos-consul/blob/master/mesos/mesos.go#L71-L81

With ip, port := m.getLeader() you get the hostname (the value is named ip, but it is a hostname) and the port of the Mesos leader. It is displayed in this line: [INFO] reloading from master

Then you compare this ip with the one from the Mesos API:
if rip := leaderIP(sj.Leader); rip != ip
Here is the problem: I manually made the call that mesos-consul makes to get the sj object.

There is a leader key, but the associated value is an IP, not a hostname.
The comparison always fails because it compares an IP with a hostname.

I tried launching Mesos with --hostname <ip address of the host>, and the problem seems to be resolved.

I'm not sure whether this qualifies as a bug, but at least it could be mentioned in the documentation.
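
The mismatch described above comes from comparing a hostname with an IP address as plain strings. A common remedy is to normalize both sides to an IP before comparing. The Python sketch below illustrates that idea only; it is not the project's Go code. The names `to_ip` and `same_leader` and the host `mesos-master.example` are hypothetical, and the resolver is injectable so that the example runs without real DNS.

```python
import ipaddress
import socket
from typing import Callable


def to_ip(host: str, resolve: Callable[[str], str] = socket.gethostbyname) -> str:
    """Return host unchanged if it is an IP literal, otherwise resolve it."""
    try:
        return str(ipaddress.ip_address(host))
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        return resolve(host)


def same_leader(a: str, b: str,
                resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Compare two leader addresses after normalizing both to IPs."""
    return to_ip(a, resolve) == to_ip(b, resolve)


# Fake resolver standing in for DNS; the hostname is made up.
fake_dns = {"mesos-master.example": "10.76.102.60"}
print(same_leader("mesos-master.example", "10.76.102.60", fake_dns.__getitem__))
```

With this normalization, a leader reported once by hostname and once by IP compares as equal, so the "master changed" warning would only appear on a real change.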

Service found. Not registering.

None of the services I launch ever get registered with consul. Here are the logs:

time="2015-10-14T16:21:04Z" level=info msg="Running parseState"
time="2015-10-14T16:21:04Z" level=debug msg="Running RegisterHosts"
time="2015-10-14T16:21:04Z" level=info msg="Host found. Comparing tags: ([agent], [agent])"
time="2015-10-14T16:21:04Z" level=info msg="Host found. Comparing tags: ([leader master], [leader master])"
time="2015-10-14T16:21:04Z" level=debug msg="Done running RegisterHosts"
time="2015-10-14T16:21:04Z" level=debug msg="Service found. Not registering: mesos-consul:127.0.0.1:service1:13254"
time="2015-10-14T16:21:04Z" level=debug msg="Service found. Not registering: mesos-consul:127.0.0.1:service2:49394"
time="2015-10-14T16:21:04Z" level=debug msg="Service found. Not registering: mesos-consul:127.0.0.1:service3:18796"

I'm launching with this command:

docker run -i ciscocloud/mesos-consul \
   --zk=zk://zk-1.service.consul:2181/mesos \
   --refresh=15s \
   --log-level=debug \
   --consul

Question - This project VS marathon-consul

This is not an issue, but a question.

I am trying to wrap my head around why someone would use this mesos-consul bridge over your other project, the marathon-consul bridge.

Is it simply for those that use mesos but do not use marathon?

Question: mesos-consul interaction with marathon folder

The Marathon framework recently introduced folders to group applications.

The effect on the Mesos task is that its name is prefixed with the name of the folder and an underscore. For instance:

/my/group/application

will have tasks named:

my_group_application.16836b3f-d40e-11e5-b006-c4346bb5c9dc 

Consul recommends using DNS-compatible names, which do not include underscores.

What would be your recommendation for coping with those names?
Could we consider a way to rewrite task names to replace underscores?
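
The rewrite asked about above can be sketched in a few lines of Python. This is an illustration under assumptions, not existing mesos-consul behavior: `dns_safe_name` is a hypothetical helper, everything after the first dot is assumed to be the instance suffix, and hyphens are assumed to be an acceptable replacement for underscores.

```python
import re


def dns_safe_name(task_id: str) -> str:
    """Turn a Marathon-style task ID into a DNS-friendly label."""
    # Drop the instance suffix after the first dot, if present (assumption).
    name = task_id.split(".", 1)[0].lower()
    # Replace every run of characters outside [a-z0-9-] with one hyphen.
    name = re.sub(r"[^a-z0-9-]+", "-", name)
    # DNS labels must not start or end with a hyphen.
    return name.strip("-")


print(dns_safe_name("my_group_application.16836b3f-d40e-11e5-b006-c4346bb5c9dc"))
```

A rewrite like this is lossy (for example, `my_app` and `my-app` would collide), so it would probably need to be opt-in.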

Non-routable IP Address w/ docker

Perhaps my setup is wrong but I'm trying to figure out how to get the proper IP address registered in consul. When a new service is registered in Marathon it gets assigned a docker IP address:

   statuses:[  
      {  
         labels:[  
            {  
               key:"Docker.NetworkSettings.IPAddress",
               value:"172.17.0.14"
            }
         ],
         state:"TASK_RUNNING",
         timestamp:1444843301.64405
      }
   ]

The problem is that 172.17.0.14 gets registered in Consul, but I won't actually be able to access that IP from outside the host. What I need is for it to register the IP address of the host so that I can access it. How are you guys handling this? Is there a better way of setting this up, or have I missed the boat completely?

ERROR: logging before flag.Parse: W1028

I just pulled the latest version from Docker Hub and this is what I'm getting:

Oct 28 14:15:07 localhost.localdomain systemd[1]: Started Mesos Consul bridge.
Oct 28 14:15:07 localhost.localdomain docker[28458]: time="2015-10-28T14:15:07Z" level=info msg="Using zookeeper: zk://10.0.2.15:2181/mesos"
Oct 28 14:15:07 localhost.localdomain docker[28458]: 2015/10/28 14:15:07 Connected to 10.0.2.15:2181
Oct 28 14:15:07 localhost.localdomain docker[28458]: 2015/10/28 14:15:07 Authenticated: id=94767720597880839, timeout=40000
Oct 28 14:15:07 localhost.localdomain docker[28458]: ERROR: logging before flag.Parse: W1028 14:15:07.802896       1 detect.go:338] Leading master is using a Protobuf binary format when registering with Zookeeper (info_0000000001): this will be deprecated as of Mesos 0.24 (see MESOS-2340).
Oct 28 14:15:07 localhost.localdomain docker[28458]: time="2015-10-28T14:15:07Z" level=info msg="Done waiting for initial leader information from Zookeeper."

At this point it just hangs. We're currently running mesos v0.23.0 (running mesosphere dcos).

Configuration Help

I am trying to set up mesos-consul in our environment and am having some trouble with configuration. In our setup I am running mesos-consul via Marathon. I finally got the health check working with help from @gusnuf. It appears that I had to run the container with network: HOST instead of BRIDGE.

Now the next challenge is for my apps to show up in Consul. The first question I have:

Do I have to update anything with my Docker daemon?

Question about consul service registration with regards to ports and docker

I currently have an app running in a Docker container on Mesos, scheduled with Marathon, along with the mesos-consul bridge.

The current Marathon app configuration uses bridge networking and allows Mesos/Marathon to select whatever port is available for the host port, but the Docker container itself is bound to 8080:

{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "sarlindo/wildfly-app",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0, "servicePort": 0, "protocol": "tcp" }
      ]
    }
  },
  "id": "wildfly",
  "cmd": "/opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0 -bmanagement=0.0.0.0",
  "instances": 1,
  "cpus": 0.3,
  "mem": 256
}

Now, when this service gets registered with consul by the mesos-consul bridge, I see it being registered to the following ip/port.

172.17.0.4:31657

Now the ip here is the internal docker ip and not the host and the port number is the host port that mesos/marathon assigned.

The issue now is I can't get to this service because inside the docker container the port is actually 8080.

Is this the way this is supposed to work? Or am I doing something wrong here?
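
The address reported above mixes the container's IP with the host's port. With Docker bridge networking and a port mapping, the two consistent pairings are host IP with host port, and container IP with container port. The Python sketch below only spells out those pairings. The `endpoints` helper and the host IP 10.0.0.5 are hypothetical; the container IP and ports come from the report above.

```python
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]


def endpoints(
    host_ip: str,
    container_ip: str,
    mappings: List[Dict[str, int]],
) -> Dict[str, List[Endpoint]]:
    """List the consistent (ip, port) pairs for bridge-mode port mappings."""
    return {
        # Reachable from other machines: host address plus published port.
        "via_host": [(host_ip, m["hostPort"]) for m in mappings],
        # Reachable only where the bridge network is routable.
        "via_container": [(container_ip, m["containerPort"]) for m in mappings],
    }


result = endpoints(
    host_ip="10.0.0.5",          # made-up host address
    container_ip="172.17.0.4",   # from the report above
    mappings=[{"containerPort": 8080, "hostPort": 31657}],
)
print(result)
```

A registration of 172.17.0.4:31657 matches neither list, which is consistent with the service being unreachable at that address.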

should consul have any port?

I built Consul in Docker and run it on Marathon. I'm a little confused: should Consul expose some port, like mesos-dns does, and should I change /etc/resolv.conf to point the nameserver to Consul's IP?
I also see that consul-template commands look like this:

$ consul-template \
    -consul 127.0.0.1:8500 \
    -template "/tmp/template.ctmpl:/var/www/nginx.conf:service nginx restart" \
    -retry 30s \
    -once

Does port 8500 belong to Consul?

Question about health checks

I'd like to register health checks on services declared by mesos-consul. This would avoid relying on the aliveness (and speed) of mesos-consul to clean up dead instances and would leverage Consul health checking instead.

Mesos already has some health checks (command health checks) and might get HTTP health checks (https://reviews.apache.org/r/36816), but I don't know at which endpoint we could see them.

Do other users do that? Would it be possible to register health checks?

Cleaning up old failed tasks?

This is more of a question than an issue.

I was wondering why I was getting duplicate service IDs for the same service, even though they came from an old deployment of the service.

From here:

https://github.com/CiscoCloud/mesos-consul/blob/master/mesos/mesos.go#L150-L158

It appears that only running tasks are investigated.

I'm seeing this in my setup:

http://myhost:8500/v1/catalog/service/nginx-test
[
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31678","ServiceName":"nginx-test","ServiceTags":["nginx-test","mmontgomery","dev"],"ServiceAddress":"192.168.200.2","ServicePort":31678,"ServiceEnableTagOverride":false,"CreateIndex":1330,"ModifyIndex":1330},
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31679","ServiceName":"nginx-test","ServiceTags":["nginx-test","mmontgomery","dev"],"ServiceAddress":"192.168.200.2","ServicePort":31679,"ServiceEnableTagOverride":false,"CreateIndex":1331,"ModifyIndex":1331},
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31889","ServiceName":"nginx-test","ServiceTags":["nginx-test","mmontgomery","dev"],"ServiceAddress":"192.168.200.4","ServicePort":31889,"ServiceEnableTagOverride":false,"CreateIndex":1108,"ModifyIndex":1108},
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31890","ServiceName":"nginx-test","ServiceTags":["nginx-test","mmontgomery","dev"],"ServiceAddress":"192.168.200.4","ServicePort":31890,"ServiceEnableTagOverride":false,"CreateIndex":1109,"ModifyIndex":1109},
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31902","ServiceName":"nginx-test","ServiceTags":[],"ServiceAddress":"192.168.200.3","ServicePort":31902,"ServiceEnableTagOverride":false,"CreateIndex":1043,"ModifyIndex":1043},
{"Node":"iadmesos1.here.com","Address":"10.57.128.114","ServiceID":"mesos-consul:10.57.128.114:nginx-test:31903","ServiceName":"nginx-test","ServiceTags":[],"ServiceAddress":"192.168.200.3","ServicePort":31903,"ServiceEnableTagOverride":false,"CreateIndex":1044,"ModifyIndex":1044}
]

Only one of these tasks is actually active.

Does de-registration not remove these?

Thanks for any further insight you can give.
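
To compare entries like the ones above against what Marathon reports, it can help to split each ServiceID into its parts. The Python sketch below assumes the `owner:host:name:port` layout seen in the catalog output above and an IPv4 host; the layout is inferred from that output and is not a documented format. `ServiceID` and `parse_service_id` are hypothetical names.

```python
from typing import NamedTuple


class ServiceID(NamedTuple):
    owner: str
    host: str
    name: str
    port: int


def parse_service_id(service_id: str) -> ServiceID:
    """Split 'owner:host:name:port' into its four fields.

    Assumes an IPv4 host; the port is taken from the right so that a
    service name containing ':' would still parse.
    """
    owner, rest = service_id.split(":", 1)
    host_and_name, port = rest.rsplit(":", 1)
    host, name = host_and_name.split(":", 1)
    return ServiceID(owner, host, name, int(port))


parsed = parse_service_id("mesos-consul:10.57.128.114:nginx-test:31678")
print(parsed)
```

Grouping the parsed entries by name and comparing their ports with the ports of running tasks shows which registrations are left over.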

Connect to Zookeeper fails

Hi,

getting this error:

2015/10/27 11:29:31 Connected to mesos01.bva.nu:2181
2015/10/27 11:29:31 Authenticated: id=94756170276864056, timeout=40000
2015/10/27 11:31:31 [ERROR] Timed out waiting for initial ZK detection

Seems like it can connect and then later times out. Any ideas?

Don't register tasks without listening ports

Registering tasks without any listening ports looks very deliberate in the code, but for us it only produces noise that we would like to avoid. Blacklisting works fine but requires keeping a list of task name patterns for batch jobs up to date. I would rather have a command-line flag to disable registration of all tasks without ports. WDYT?
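
The requested behavior amounts to a filter applied before registration. The Python sketch below shows the idea under stated assumptions: `require_ports` stands in for the proposed command-line flag, which does not exist in mesos-consul as far as this page shows, and the task dictionaries are a made-up shape.

```python
from typing import Dict, List


def registrable_tasks(tasks: List[Dict], require_ports: bool) -> List[Dict]:
    """Return the tasks that should be registered.

    With require_ports set, tasks with a missing or empty port list are
    skipped; otherwise every task is kept.
    """
    if not require_ports:
        return list(tasks)
    return [task for task in tasks if task.get("ports")]


tasks = [
    {"name": "web", "ports": [31000]},
    {"name": "batch-job", "ports": []},  # no listening ports
    {"name": "cron"},                    # ports key missing entirely
]
names = [t["name"] for t in registrable_tasks(tasks, require_ports=True)]
print(names)
```

Keeping the default at "register everything" would preserve the current behavior for users who rely on it.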

mesos-consul not registering all instances into Consul

Here is what I am doing. I am using Mesos IP-per-task (through Calico), where each container is given a separate IP. I had a case where multiple instances of a task were deployed on the same node, with each instance assigned an IP of its own (10.100.0.179, 10.100.0.180). So far so good, but when it comes to registering those instances in Consul, mesos-consul registers only one of the instances (10.100.0.179) and skips the other.

Is there a known limitation with mesos-consul?

Also, whenever I register an IP-per-container service through mesos-consul, I see two entries with the same IP in Consul: one with the port the service is listening on and one with port 0.
For example, if my container has the IP 10.100.0.155 and listens on port 8000, Consul sees two entries:
10.100.0.155:8000
10.100.0.155:0

Why does mesos-consul register an entry with port 0? I tried mesos-ip-order set to 'docker,mesos,host' and also to just 'docker'. Both times I got an entry with port 0.

Do you plan to do Canaries?

Hi,

We have a project, https://github.com/eBayClassifiedsGroup/PanteraS (PaaS in a box),
where we use registrator, but since it is hard to get any feature into registrator :)
we would like to switch to something more maintainable.

We were able to do canaries thanks to the possibility of registering with a different name
(SERVICE_NAME in ENV in the Marathon deployment plan).

Do you plan to do something similar, using tags or env, so that the Consul service name can differ from the Marathon ID?

How to ignore some IPs that should not be in the HAProxy config

I run a Mesos, Marathon, and Docker solution.
But I have two kinds of apps: one is HTTP/REST, the other is our private RPC app.
The RPC apps' discovery and load balancing are handled by another gateway.
So I want to ignore these RPC apps when generating the HAProxy config file.

Can anyone help?

Thanks a lot!

Comma-separated values for labels

The problem: As a Mesos/Marathon user I would like to set a label containing config in Consul and recover it using Traefik. Traefik supports comma-separated values, like traefik.frontend.entryPoints=http,https in https://docs.traefik.io/toml/#marathon-backend, but mesos-consul splits it into
"ServiceTags": [
"traefik.frontend.entryPoints=http",
"https"
]
because of hard-coded comma in https://github.com/CiscoCloud/mesos-consul/blob/fa88da9f6ac102a281def5381790f399df2910d5/mesos/mesos.go#L88

Possible solutions:

  • Configurable separator, so I can use another symbol, like |
  • Support escaped comma, like gliderlabs/registrator#401 (they had the same problem)

I'm available to send a PR as soon as a maintainer decides what is better for the project.
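
Both proposed solutions can be covered by one small splitting routine. The Python sketch below is only an illustration of the proposal, not the project's Go implementation: `split_tags` is a hypothetical helper with a configurable separator, and a backslash is assumed as the escape character.

```python
from typing import List


def split_tags(value: str, sep: str = ",", escape: str = "\\") -> List[str]:
    """Split value on sep, treating escape+sep as a literal separator."""
    tags: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == escape and i + 1 < len(value) and value[i + 1] == sep:
            # Escaped separator: keep it as part of the current tag.
            current.append(sep)
            i += 2
            continue
        if ch == sep:
            # Unescaped separator: close the current tag.
            tags.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    tags.append("".join(current))
    return tags


print(split_tags(r"traefik.frontend.entryPoints=http\,https,web"))
print(split_tags("traefik.frontend.entryPoints=http,https|web", sep="|"))
```

Unescaped input splits exactly as before, so existing labels would keep working under either option.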

Services with HOST networking

Anybody have any tips on how to get these to register specific host ports and not the mapped ports that Marathon assigns from the EXPOSE declaration?

Duplicated Consul services

I have one app in Marathon, two tasks/instances for it on 2 different Mesos slaves, yet I see 4 services for it:

Whitelisting tasks

When using a Spark framework, many entries are created for "task-0", "task-1", and so on. These are entries I didn't want. But when using the Kafka framework, "broker-1", "broker-2" are exactly what I want. I'm interested in a way to whitelist on some level, either by the framework, a role, and/or some string pattern.
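
Whitelisting by string pattern, one of the options mentioned above, can be sketched with regular expressions. The Python below is a hypothetical illustration: the `whitelisted` helper and the pattern are made up, and filtering by framework or role would need extra task metadata that is not modeled here.

```python
import re
from typing import Iterable, List


def whitelisted(names: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Keep only the task names that match at least one pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if any(rx.search(n) for rx in compiled)]


names = ["task-0", "task-1", "broker-1", "broker-2"]
print(whitelisted(names, [r"^broker-\d+$"]))
```

In this sketch an empty pattern list matches nothing, so a real flag would need to define whether "no whitelist" means "register everything".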

Dots on key names

Hi,

Currently, I have a setup with mesos + consul + consul-template and have been using mesos-consul to connect everything together.

I ran into an issue with Consul having keys registered with dots. For instance, web.nginx-v13 becomes invalid in consul-template due to the dot in its name.

Apparently, it has been fixed with #19 but not released.

Any chance of releasing 0.2.1 with the fixes from master? I would appreciate it.

Thanks
Ricardo

consul service deregistration fails if the mesos slave is down

It seems that it tries to connect to the node that "used" to run the task to deregister the service from it.

What if you run a multi-master cluster consul setup and with local agents running on each box and you lost the slave that was running the service? The service is never deregistered and when running with log level DEBUG it seems that it will try forever to deregister the service by continuing to hit the old IP address of where the service was running.
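
Consul's HTTP API includes a catalog-level deregistration endpoint, `PUT /v1/catalog/deregister`, which goes through the servers rather than through the agent on the node that ran the service. Whether mesos-consul uses it or should use it is not established by this page. The Python sketch below only builds such a request so that its shape is visible; `catalog_deregister_request`, the address, and the node name are hypothetical.

```python
import json
from typing import Tuple


def catalog_deregister_request(
    consul_addr: str, node: str, service_id: str
) -> Tuple[str, str, bytes]:
    """Build (method, url, body) for a catalog-level service deregistration.

    consul_addr is the host:port of any reachable Consul agent or server;
    node is the Consul node name the service was registered under.
    """
    url = f"http://{consul_addr}/v1/catalog/deregister"
    body = json.dumps({"Node": node, "ServiceID": service_id}).encode()
    return "PUT", url, body


method, url, body = catalog_deregister_request(
    "127.0.0.1:8500",                   # made-up address of a live agent
    "agent-7",                          # made-up node name of the lost slave
    "mesos-consul:10.0.0.7:web:31000",  # made-up service ID
)
print(method, url, body.decode())
```

If the lost node's agent comes back, it may re-register its services, so this only helps for nodes that are really gone.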

Allow the consul addr to be configured

When I try to run mesos-consul I'm unable to specify the location of Consul; it seems to want to connect to Consul as if it were running on a Mesos master node. It would be great if I could specify consul-addr 192.168.1.2. It looks like consul-port is already supported.

Add tags to mesos-consul itself ?

Is there a possibility to add a tag to the mesos service itself ?

The issue is that if you run multiple Mesos clusters with a single shared Consul cluster, both environments get mixed.

Assuming a dev Mesos cluster:

dev.mesos.service.consul

And a prod Mesos cluster:

prod.mesos.service.consul

It would be great to be able to override the actual service name, mesos being really generic and also taking in account where you already have a mesos service in Consul.

Thoughts ?
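
The names above follow Consul's DNS scheme, in which an optional tag is placed in front of `<service>.service.<domain>`. The short Python sketch below shows how a per-cluster tag or an overridden service name would surface in DNS. `consul_dns_name` is a hypothetical helper, and the `mesos-prod` name is made up.

```python
from typing import Optional


def consul_dns_name(
    service: str, tag: Optional[str] = None, domain: str = "consul"
) -> str:
    """Build the DNS name Consul serves for a service, optionally by tag."""
    prefix = f"{tag}." if tag else ""
    return f"{prefix}{service}.service.{domain}"


print(consul_dns_name("mesos", tag="dev"))
print(consul_dns_name("mesos", tag="prod"))
print(consul_dns_name("mesos-prod"))
```

Tags keep one shared service name with per-cluster filters, while a name override gives each cluster its own service entry.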
