
(Reading notes) How Soccer Players Would Do Stream Joins

These are my reading notes on the paper "How Soccer Players Would Do Stream Joins" (SIGMOD 2011).

Contributions

  • Proposing a new stream join algorithm called handshake join, inspired by the way soccer players line up and shake hands, which can take advantage of multi-core processors.

Background

  • Sliding-window joins and the three-step procedure devised by Kang et al.
  • CellJoin

Strategies

  • Converting the original control-flow problem into a data-flow representation
  • Processing units only interact with their immediate neighbors
  • Processing all comparisons locally
  • Immediate scan
  • Proposing asynchronous message queues and two-phase forwarding to solve the missed-join-pair problem
  • Autonomic load balancing to handle skew

Drawback

  • The join results are not produced in order

My English blog is open!

This is my English blog

Why open an English blog?

I am a new CS PhD student, and my English is poor. Because of the pressure to publish papers, I must improve my ability to read and write English papers, so I will practice in this way.

What will I write in this blog?

I will write technical articles in this blog. When I read papers and find something interesting, I will record it here. This is practice in expressing ideas in English, and I will keep at it.

I hope I can improve my English this way! Come on!
This is my previous blog: MiracleMa's Chinese Blog

How to use rbd-mirror to achieve remote disaster recovery?

Description

Since the Jewel release, Ceph has supported remote disaster recovery through rbd-mirror. You can use rbd-mirror to synchronize data between your local cluster and a remote cluster.

Introduction

Background

Ceph is strongly consistent internally, so you cannot deploy a single Ceph cluster across regions. We therefore need a mechanism that provides:

  • disaster recovery
  • global block device distribution

Internal realization

rbd-mirror is a daemon that runs in both clusters. It is responsible for synchronizing data from the primary image/pool to the non-primary image/pool, and it relies on a new RBD feature, journaling.
In general, a write first goes to the journal and then to the local image; the rbd-mirror daemon on the other cluster replays the journal onto the remote image. When an image/pool is primary it can be written, while the remote copy is locked (unwritable).
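
The rbd journal subcommands can be used to look at this journal directly; a minimal sketch, using a hypothetical image rbd/demo (and assuming your rbd version ships these subcommands):

# show the journal object backing the image
rbd journal info --pool rbd --image demo
# show the registered journal clients and their replay positions
rbd journal status --pool rbd --image demo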

Practice

Environment

We will configure two-way rbd-mirror replication in image mode.
We need two clusters, so I use 16 machines running CentOS 7, with 8 machines per cluster. One cluster is local and the other is remote. The host of the local cluster is tstore04 and the host of the remote cluster is tstore12.

Configuration files and keyring

First, prepare the configuration files and keyrings.
On tstore04 (local):

scp /etc/ceph/ceph.conf tstore12:/etc/ceph/local.conf
scp /etc/ceph/ceph.client.admin.keyring tstore12:/etc/ceph/local.client.admin.keyring
cp /etc/ceph/ceph.conf /etc/ceph/local.conf
cp /etc/ceph/ceph.client.admin.keyring /etc/ceph/local.client.admin.keyring

On tstore12 (remote):

scp /etc/ceph/ceph.conf tstore04:/etc/ceph/remote.conf
scp /etc/ceph/ceph.client.admin.keyring tstore04:/etc/ceph/remote.client.admin.keyring
cp /etc/ceph/ceph.conf /etc/ceph/remote.conf
cp /etc/ceph/ceph.client.admin.keyring /etc/ceph/remote.client.admin.keyring

Then give the files the correct ownership on both hosts:

chown ceph:ceph -R /etc/ceph

We can check whether the configuration is successful:

root@tstore04:~/ceph# ceph mon stat --cluster local
e1: 1 mons at {tstore04=192.168.50.14:6789/0}, election epoch 3, quorum 0 tstore04
root@tstore04:~/ceph# ceph mon stat --cluster remote
e1: 1 mons at {tstore12=192.168.50.22:6789/0}, election epoch 3, quorum 0 tstore12

Now the two clusters can communicate with each other through the cluster names local and remote.

Create pools and add peer

We now create a pool in each cluster from tstore04, using the local and remote cluster names:

ceph osd pool create pool1 100 100 --cluster local
ceph osd pool create pool1 100 100 --cluster remote

Next we need to set the mirror mode of the pool, which can be either pool or image. Pool mode means that rbd-mirror synchronizes all images in the pool between the two clusters, while image mode means that rbd-mirror only synchronizes explicitly enabled images.
We will set image mode on both pools:

rbd mirror pool enable pool1 image --cluster local
rbd mirror pool enable pool1 image --cluster remote

After that, we add each cluster as a peer of the other, using:

rbd mirror pool peer add <pool-name> <client-name>@<cluster-name>

Here cluster-name is local or remote, and the client-name we choose is admin (matching the keyring).
So we add the peers as follows:

rbd mirror pool peer add pool1 client.admin@remote --cluster local
rbd mirror pool peer add pool1 client.admin@local --cluster remote

We can check whether the configuration is correct:

[root@tstore04 ceph]# rbd mirror pool info pool1 --cluster local
Mode: image
Peers: 
  UUID                                  NAME   CLIENT       
  a050a0f5-9448-43f2-872f-87c394083871 remote client.admin
[root@tstore04 ceph]# rbd mirror pool info pool1 --cluster remote
Mode: image
Peers: 
  UUID                                  NAME  CLIENT       
  8d7b3fa4-be44-4e25-b0b7-cf4bdb62bf10 local client.admin

If the output looks like the above, the configuration is correct.
We can query the state of the pool as follows:

[root@tstore04 ceph]# rbd mirror pool status pool1
health: OK
images: 0 total

The pool health is OK, which means the configuration is right; if it is WARNING, something is wrong.
Once the pool contains mirrored images, the 0 becomes the number of images, followed by their states.
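
If your rbd version supports the --verbose flag for this command (an assumption; check rbd help mirror pool status), per-image details can also be listed in one call:

rbd mirror pool status pool1 --verbose --cluster local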

Create image and data synchronization

Now we will enable mirroring for an image, so we first need to create one:

rbd create --size 100G pool1/image0 --image-feature exclusive-lock,journaling

The journaling feature is required.
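
If an image already exists without journaling, the feature can usually be added afterwards; a minimal sketch with a hypothetical image name image1 (journaling depends on exclusive-lock, so enable that first):

rbd feature enable pool1/image1 exclusive-lock
rbd feature enable pool1/image1 journaling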
Then we enable mirroring:

rbd mirror image enable pool1/image0

We also need to start the rbd-mirror daemon on both hosts:

systemctl start ceph-rbd-mirror@admin

After that, data synchronization starts.
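
To make the daemon start again after a reboot (assuming a standard systemd-managed Ceph installation), the unit can also be enabled on both hosts:

systemctl enable ceph-rbd-mirror@admin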
We can examine the mirroring state:

[root@tstore04 ~]# rbd mirror image status pool1/image0 --cluster remote
image-1:
  global_id:   dabdbbed-7c06-4e1d-b860-8dd104509565
  state:       up+replaying
  description: replaying, master_position=[object_number=2, tag_tid=2, entry_tid=3974], mirror_position=[object_number=3, tag_tid=2, entry_tid=2583], entries_behind_master=1391
  last_update: 2018-08-21 13:54:22
[root@tstore04 ~]# rbd mirror image status pool1/image0 --cluster local
image0:
  global_id:   746d97c7-3b7d-4344-b75e-ffbf0d635265
  state:       up+stopped
  description: local image is primary
  last_update: 2018-08-21 13:55:17

We can see that the state of the primary image is up+stopped and the state of the non-primary image is up+replaying.
We can also check whether the data has been synchronized:

[root@tstore12 cluster]# rbd info pool1/image0 
rbd image 'image0':
	size 102400 MB in 25600 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.48f0a74b0dc51
	format: 2
	features: exclusive-lock, journaling
	flags: 
	create_timestamp: Tue Aug 21 14:21:12 2018
	journal: 48f0a74b0dc51
	mirroring state: enabled
	mirroring global id: 746d97c7-3b7d-4344-b75e-ffbf0d635265
	mirroring primary: false

Now the data has been synchronized. We can use watch ceph df to see the number of objects in each pool and the pool usage. The pool holding the primary image will have 4 more objects than the pool holding the non-primary image; these extra objects are the journal_data objects.
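
For example, to watch both clusters at once (a simple sketch; the interval is arbitrary):

watch -n 5 "ceph df --cluster local; echo; ceph df --cluster remote"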

How to achieve recovery

When both clusters are healthy, the primary image is writable and the non-primary image is not: we can read and write the block device mapped from the primary image, but we cannot read or write the block device mapped from the non-primary image.
With the journaling feature enabled, the image cannot be mapped with the kernel rbd client, so we must use rbd-nbd:

# map the image as an nbd block device (prints the device name, e.g. /dev/nbd0)
rbd nbd map pool1/image0
# create a filesystem on it and mount it
mkfs -t xfs /dev/nbd0
mount -o sync,discard /dev/nbd0 ./test
# when finished (after unmounting), unmap the device
rbd nbd unmap /dev/nbd0

We can use od -x /dev/nbd0 to check the raw content of the block device and verify that the data matches.
If we want to read and write the data in the non-primary image, we need to demote the primary image and promote the non-primary image.

rbd mirror image demote pool1/image0 --cluster local
rbd mirror image promote pool1/image0 --cluster remote

After that, the non-primary image becomes primary, and the old primary image falls into a split-brain state.
So we need to resync its data from the new primary (run this on the cluster holding the split-brain image):

rbd mirror image resync pool1/image0

Then the image will resynchronize data from the primary image.
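
Putting the pieces together, a planned failover and failback might look roughly like this (a sketch built only from the commands above; which side ends up in split-brain depends on the exact sequence, so check rbd mirror image status before resyncing):

# planned failover: demote the current primary, then promote the peer
rbd mirror image demote pool1/image0 --cluster local
rbd mirror image promote pool1/image0 --cluster remote
# ...use the image on the remote cluster (map it there with rbd-nbd)...
# failback: demote the remote copy and promote the local one again
rbd mirror image demote pool1/image0 --cluster remote
rbd mirror image promote pool1/image0 --cluster local
# if an image reports split-brain afterwards, resync it from the current primary
rbd mirror image resync pool1/image0 --cluster remote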

That is how to use rbd-mirror to achieve remote disaster recovery.

Reference

Ceph Documentation
rbd-mirror technical insider
rbd-mirror configuration guide
