With this project you can launch your own MongoDB cluster with up to thousands of shards on Google Compute Engine with just a few commands.
- Easy to configure, with `ansible` YAML
- Set any number of shards
- Latest MongoDB version on the 4.0.x Ubuntu sources list, which is `4.0.12`. You can change it to 4.2, or any other version, in a single place in `gcp_common.yml` and it will set the whole cluster to that version (see the sketch after this list)
- `XFS` file system for the replication nodes' storage
- Authentication enabled with a `KeyFile` in all nodes
- Password protected with a root user
- Configured for simple vertical scaling
- 2 replication nodes and 1 arbiter per shard
- 1 config server
- 1 mongos
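For example, pinning the MongoDB version typically comes down to a single variable in `gcp_common.yml`; the variable name below is illustrative, check the file for the real one:

```yaml
# gcp_common.yml (sketch; the actual variable name may differ)
# Changing this value re-targets every node in the cluster
mongodb_version: "4.0.12"   # e.g. switch to a 4.2.x release here
```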
To use it you will need:

- `Ansible`, `Python` and the `google-auth` Python library
- A Google Cloud account with billing enabled for the APIs
- A JSON service account file with the required permissions
- A Compute Engine project
- Edit the `inventory/inventory.base.yml` file, changing `PROJECT` to your Google Compute Engine project
- Change the `username` and `password`
- Set your desired zone; it's configured for Frankfurt's `europe-west3-b`
- Change the Ansible SSH user, which should be your configured user in Google Compute Engine, so the SSH keys are copied automatically when the instances are created
NOTE: You don't have to edit the rest of the configuration in `inventory.base.yml`, but its contents are used to generate the inventory file that will actually be used, i.e. `inventory.yml`.
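The fields you would typically touch look roughly like this; the key names are illustrative and the real file may be structured differently:

```yaml
# inventory/inventory.base.yml (sketch of the fields you normally edit)
all:
  vars:
    project: PROJECT                 # your Google Compute Engine project id
    zone: europe-west3-b             # Frankfurt; set your desired zone
    username: mongouser              # MongoDB root user
    password: "123456"               # change this!
    ansible_ssh_user: your_gce_user  # so SSH keys are copied on creation
```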
- Configure the project with the `configure.py` script:
```sh
$ python3 configure.py --shards 2 --id cl7
```
- Place your service account JSON file, downloaded from the Google Cloud console, in `~/gcp_sa.json`; you can change the path by editing the `inventory.gcp.yml` file.
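If you prefer the CLI over the console download, something like this creates the key file in the expected location; the service account email here is a placeholder:

```sh
# Create a key for an existing service account and save it where the
# inventory expects it (~/gcp_sa.json); replace the email with yours
$ gcloud iam service-accounts keys create ~/gcp_sa.json \
    --iam-account=my-sa@PROJECT.iam.gserviceaccount.com
```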
The `configure.py` call above will configure everything: the first parameter is the number of shards, and the second one is the cluster identifier, which can't be more than 3 characters long, e.g. `cl8` or any other combination of up to 3 characters.
- Run the launch script:

```sh
$ ./launch.sh
```
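The launch script essentially chains the two playbooks described below; a sketch of what it runs (the actual file may do more, e.g. regenerate the inventory first):

```sh
#!/bin/sh
# Rough equivalent of launch.sh, based on the steps described below
ansible-playbook ./launch_nodes.yml -i inventory    # create the GCE instances and disks
ansible-playbook ./launch_cluster.yml -i inventory  # provision the MongoDB cluster
```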
- The `configure.py` Python script configures the `inventory` variables and the Google Compute inventory.
- The launch script launches the nodes with the Ansible command `ansible-playbook ./launch_nodes.yml -i inventory`, which:
  - Launches the instances based on the Ubuntu 18.04 image
  - Creates the disks, etc.
You can edit the `launch_nodes.yml` file if you would like a different image, type of disk, instance type, etc. For production you'd need a better instance type, i.e. not `f1-micro`, and larger SSD disks, i.e. `"pd-ssd"` instead of `"pd-standard"`, etc.
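As an illustration of where those settings live, an instance task in `launch_nodes.yml` will look roughly like this if it uses Ansible's `gcp_compute_instance` module; the names and values are illustrative, check the actual playbook:

```yaml
# Sketch of an instance definition; the real task in launch_nodes.yml may differ
- name: Launch a cluster node
  gcp_compute_instance:
    name: cl7-0-n                     # node names follow the cluster id
    machine_type: f1-micro            # for production use e.g. n1-highmem-2
    zone: europe-west3-b
    project: PROJECT
    auth_kind: serviceaccount
    service_account_file: ~/gcp_sa.json
    disks:
      - auto_delete: true
        boot: true
        initialize_params:
          source_image: projects/ubuntu-os-cloud/global/images/family/ubuntu-1804-lts
          disk_type: pd-standard      # "pd-ssd" for production
          disk_size_gb: 10
    network_interfaces:
      - access_configs:
          - name: External NAT        # gives the node an external IP
            type: ONE_TO_ONE_NAT
    state: present
```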
- Sets up the whole MongoDB cluster with `ansible-playbook ./launch_cluster.yml -i inventory`. This playbook imports all the set-up tasks in the right order, which are:
  - `gcp_common.yml` - common provisioning for all nodes
  - `gcp_xfs.yml` - set up the XFS disks
  - `t_gcp_node_p.yml` - provision the replica set nodes
  - `t_gcp_arbiter.yml` - provision the shard arbiters
  - `t_gcp_cluster.yml` - create the replica sets in the shards
  - `t_gcp_node_auth.yml` - enable auth in the nodes
  - `t_gcp_arb_auth.yml` - enable auth in the arbiters
  - `t_gcp_config.yml` - provision the config server
  - `t_gcp_mongos.yml` - provision the mongos server
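For reference, the auth steps roughly amount to adding a `security` section to each node's `mongod.conf` plus creating the root user; a minimal sketch, with an illustrative key file path:

```yaml
# Fragment of each node's mongod.conf once auth is enabled (sketch)
security:
  keyFile: /etc/mongodb-keyfile   # same key distributed to every node
  authorization: enabled
```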
After all of this runs, you can go to your Google Compute Engine page, find the mongos instance's external IP address, connect and check the sharding status. For example:
```
$ mongo --host IP_ADDRESS:27017 \
    -u 'mongouser' -p '123456' --authenticationDatabase 'admin'
MongoDB shell version v4.0.11
MongoDB server version: 4.0.12
mongos> sh.status()
  shards:
        { "_id" : "shrs0", "host" : "shrs0/cl8-0-ab.c.PROJECT.internal:27017,cl8-0-n.c.PROJECT.internal:27017,cl8-2-n.c.PROJECT.internal:27017", "state" : 1 }
        { "_id" : "shrs1", "host" : "shrs1/cl8-1-ab.c.PROJECT.internal:27017,cl8-1-n.c.PROJECT.internal:27017,cl8-3-n.c.PROJECT.internal:27017", "state" : 1 }
  active mongoses:
        "4.0.12" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        { "_id" : "config", "primary" : "config", "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
```
Output trimmed for clarity; replace `IP_ADDRESS` with your mongos external IP.
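From that same mongos session you can start sharding data; `mydb` and `people` below are hypothetical names:

```
mongos> sh.enableSharding("mydb")
mongos> sh.shardCollection("mydb.people", { "_id": "hashed" })
```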
The cluster is configured in a way that lets you reboot the nodes (after stopping all transactions, of course), re-scale the instances, e.g. from `f1-micro` to `n1-highmem-2`, reboot them and enjoy the new computing power, or the other way around.
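With the `gcloud` CLI, re-scaling a node would look something like this; the instance name and zone are the ones from the examples above:

```sh
# Stop the node, change its machine type, then start it again
$ gcloud compute instances stop cl8-0-n --zone europe-west3-b
$ gcloud compute instances set-machine-type cl8-0-n \
    --machine-type n1-highmem-2 --zone europe-west3-b
$ gcloud compute instances start cl8-0-n --zone europe-west3-b
```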
Something doesn't work? Open an issue.
Contributing? YES
LICENSE: MIT