
Sync MySQL data into Elasticsearch

License: MIT License


go-mysql-elasticsearch's Introduction

go-mysql-elasticsearch is a service syncing your MySQL data into Elasticsearch automatically.

It first uses mysqldump to fetch the existing data, then syncs data incrementally via the binlog.

Call for Committer/Maintainer

Sorry, I don't have enough time to maintain this project fully. If you like this project and want to help improve it continuously, please contact me by email ([email protected]).

Requirement: in the email, please cover at least the following, to convince me that we can work together:

  • Your GitHub ID.
  • Your previous contributions to go-mysql-elasticsearch, including PRs or issues.
  • The reason you believe you can improve go-mysql-elasticsearch.

Install

  • Install Go (1.9+) and set your GOPATH
  • go get github.com/siddontang/go-mysql-elasticsearch; it will print some messages to the console, which you can safely ignore. :-)
  • cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
  • make
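
Put together, the steps above look like this (a minimal sketch, assuming a standard GOPATH layout):

go get github.com/siddontang/go-mysql-elasticsearch    # console messages here can be ignored
cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
make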

How to use?

  • Create the table in MySQL.
  • Create the associated Elasticsearch index, document type and mappings if possible; if not, Elasticsearch will create these automatically (see the sketch after this list).
  • Configure the base settings; see the example config river.toml.
  • Set the MySQL source in the config file; see Source below.
  • Customize the MySQL to Elasticsearch mapping rules in the config file; see Rule below.
  • Start ./bin/go-mysql-elasticsearch -config=./etc/river.toml and enjoy it.
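
For step 2, creating the index and mapping up front might look like this (a minimal sketch against a local ES 5.x; the blog index/type and the title field are placeholders):

curl -XPUT 'http://127.0.0.1:9200/blog' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "blog": {
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}'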

Notice

  • MySQL supported version < 8.0
  • ES supported version < 6.0
  • binlog format must be row (see the my.cnf sketch after this list).
  • binlog row image must be full for MySQL; you may lose some field data if you update PK data in MySQL with the minimal or noblob binlog row image. MariaDB only supports the full row image.
  • You cannot alter the table format at runtime.
  • A MySQL table to be synced must have a PK (primary key); a multi-column PK is allowed now, e.g. if the PK is (a, b), we will use "a:b" as the key. The PK data is used as the "id" in Elasticsearch, and you can also configure the id's constituent parts from other columns.
  • You should create the associated mappings in Elasticsearch first. Using the default mapping is not a wise decision; you must know how to search accurately.
  • mysqldump must exist on the same node as go-mysql-elasticsearch; if it doesn't, go-mysql-elasticsearch will try to sync via binlog only.
  • Don't change too many rows at the same time in one SQL statement.
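
The binlog requirements above correspond to MySQL server settings roughly like this (a sketch; the server id and log file names are placeholders you should adapt):

[mysqld]
server-id        = 1
log_bin          = mysql-bin
binlog_format    = ROW
binlog_row_image = FULL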

Source

In go-mysql-elasticsearch, you must decide which tables to sync into Elasticsearch in the source config.

The format in the config file is as follows:

[[source]]
schema = "test"
tables = ["t1", t2]

[[source]]
schema = "test_1"
tables = ["t3", t4]

schema is the database name, and tables lists the tables that need to be synced.

If you want to sync every table in the database, you can use the asterisk (*):

[[source]]
schema = "test"
tables = ["*"]

# When using an asterisk, no other table may be listed alongside it
# tables = ["*", "table"]

Rule

By default, go-mysql-elasticsearch uses the MySQL table name as the Elasticsearch index and type name, and the MySQL field name as the Elasticsearch field name.
E.g. for a table named blog, the default index and type in Elasticsearch are both named blog; for a field named title, the default Elasticsearch field name is also title.

Notice: go-mysql-elasticsearch uses the lower-cased name for the ES index and type. E.g. if your table is named BLOG, the ES index and type are both named blog.

A rule lets you change this name mapping. The rule format in the config file is:

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"
parent = "parent_id"
id = ["id"]

    [rule.field]
    mysql = "title"
    elastic = "my_title"

In the example above, we will use a new index and type both named "t" instead of the default "t1", and "my_title" instead of the original field name "title".

Rule field types

To map a MySQL column onto a different Elasticsearch type, you can define the field type as follows:

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"

    [rule.field]
    # This maps column title to the Elasticsearch field my_title
    title="my_title"

    # This maps column title to the Elasticsearch field my_title and uses the array type
    title="my_title,list"

    # This maps column title to the Elasticsearch field title and uses the array type
    title=",list"

    # If the created_time field type is "int" and you want to convert it to the "date" type in ES:
    created_time=",date"

Modifier "list" will translates a mysql string field like "a,b,c" on an elastic array type '{"a", "b", "c"}' this is specially useful if you need to use those fields on filtering on elasticsearch.

Wildcard table

go-mysql-elasticsearch only lets you specify which tables to sync, but sometimes, if you split a big table into many sub tables, say 1024 (table_0000, table_0001, ... table_1023), it is very tedious to write a rule for every one of them.

go-mysql-elasticsearch supports wildcard tables, e.g.:

[[source]]
schema = "test"
tables = ["test_river_[0-9]{4}"]

[[rule]]
schema = "test"
table = "test_river_[0-9]{4}"
index = "river"
type = "river"

"test_river_[0-9]{4}" is a wildcard table definition, which represents "test_river_0000" to "test_river_9999", at the same time, the table in the rule must be same as it.

At the above example, if you have 1024 sub tables, all tables will be synced into Elasticsearch with index "river" and type "river".

Parent-Child Relationship

One-to-many joins (parent-child relationships in Elasticsearch) are supported. Simply set the parent property to the parent field's name.

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"
parent = "parent_id"

Note: you should set up the relationship by creating the mapping manually; a sketch follows.
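
Creating such a mapping manually might look like this (a sketch using the pre-6.0 _parent syntax; t_parent is an assumed parent type name):

curl -XPUT 'http://127.0.0.1:9200/t' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "t_parent": {},
    "t": {
      "_parent": { "type": "t_parent" }
    }
  }
}'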

Filter fields

You can use filter to sync only specified fields, like:

[[rule]]
schema = "test"
table = "tfilter"
index = "test"
type = "tfilter"

# Only sync following columns
filter = ["id", "name"]

In the above example, we will only sync the MySQL table tfilter's columns id and name to Elasticsearch.

Ignore table without a primary key

When you sync a table without a primary key, you will see the error message below:

schema.table must have a PK for a column

You can ignore such tables in the configuration, like:

# Ignore table without a primary key
skip_no_pk_table = true

Elasticsearch Pipeline

You can use an Ingest Node pipeline to pre-process documents before indexing, e.g. decoding JSON strings, merging fields and more.

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "_doc"

# pipeline id
pipeline = "my-pipeline-id"

Note: you must create the pipeline manually, and this requires Elasticsearch >= 5.0. A sketch follows.
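
Creating the pipeline might look like this (a sketch; the json processor and the payload field names are assumptions for illustration):

curl -XPUT 'http://127.0.0.1:9200/_ingest/pipeline/my-pipeline-id' -H 'Content-Type: application/json' -d '
{
  "description": "decode a JSON string column before indexing",
  "processors": [
    { "json": { "field": "payload", "target_field": "payload_obj" } }
  ]
}'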

Why not other rivers?

Although there are other MySQL rivers for Elasticsearch, like elasticsearch-river-jdbc and elasticsearch-river-mysql, I still wanted to build a new one in Go. Why?

  • Customization: I want to decide which tables are synced, the associated index and type names, and even the field names in Elasticsearch.
  • Incremental updates via binlog, resuming from the last sync position when the service starts again.
  • A common sync framework, not only for Elasticsearch but also for other stores, like memcached, Redis, etc.
  • Wildcard table support: we have many sub tables, like table_0000 - table_1023, but want to use a single Elasticsearch index and type.

Todo

  • MySQL 8
  • ES 6
  • Statistics.

Donate

If you like the project and want to buy me a cola, you can do so through:

PayPal or WeChat Pay.
Feedback

go-mysql-elasticsearch is still in development, and we will try to use it in production later. Any feedback is very welcome.

Email: [email protected]

go-mysql-elasticsearch's People

Contributors

adriacidre, beauli, coseyo, eaglechen, henter, honst, keithtt, m2shad0w, mxlxm, perfectacle2, siddontang, spid37, sputnik13, wangxiangustc, xhbpiao, yaoguais


go-mysql-elasticsearch's Issues

[Bug] go-mysql-elasticsearch can't sync data from MySQL instantly!

I have already installed it successfully, and synced the data successfully once, like:
[root@5b9dbaaa148a go-mysql-elasticsearch]# ./bin/go-mysql-elasticsearch -config=./etc/river.toml
[2016/06/23 10:22:23] dump.go:95 [Info] skip dump, use last binlog replication pos (mysql-bin.000001, 106)
[2016/06/23 10:22:23] sync.go:15 [Info] start sync binlog at (mysql-bin.000001, 106)
[2016/06/23 10:22:23] status.go:52 [Info] run status http server 10.8.5.101:12800
[2016/06/23 10:22:23] sync.go:46 [Info] rotate binlog to (mysql-bin.000001, 106)
^C[2016/06/23 10:22:50] river.go:249 [Info] closing river
[2016/06/23 10:22:50] canal.go:159 [Info] close canal

But when I delete data via SQL, like:
mysql> select * from cc;
+----+--------------------+---------+---------------------+
| id | name | status | modified_at |
+----+--------------------+---------+---------------------+
| 1 | laoyang360 | ok | 0000-00-00 00:00:00 |
| 2 | test002 | ok | 2016-06-23 06:16:42 |
| 3 | dlulaoyang | ok | 0000-00-00 00:00:00 |
| 4 | huawei | ok | 0000-00-00 00:00:00 |
| 5 | jdbc_test_update08 | ok | 0000-00-00 00:00:00 |
| 7 | test7 | ok | 0000-00-00 00:00:00 |
| 8 | test008 | ok | 0000-00-00 00:00:00 |
| 9 | test9 | ok | 0000-00-00 00:00:00 |
| 10 | test10 | deleted | 0000-00-00 00:00:00 |
| 11 | test1111 | ok | 2016-06-23 04:10:00 |
| 12 | test012 | ok | 2016-06-23 04:21:56 |
+----+--------------------+---------+---------------------+
11 rows in set (0.01 sec)

mysql>
mysql>
mysql> delete from cc where id = 11;
Query OK, 1 row affected (0.05 sec)

mysql> delete from cc where id = 12;
Query OK, 1 row affected (0.02 sec)

mysql> select * from cc;
+----+--------------------+---------+---------------------+
| id | name | status | modified_at |
+----+--------------------+---------+---------------------+
| 1 | laoyang360 | ok | 0000-00-00 00:00:00 |
| 2 | test002 | ok | 2016-06-23 06:16:42 |
| 3 | dlulaoyang | ok | 0000-00-00 00:00:00 |
| 4 | huawei | ok | 0000-00-00 00:00:00 |
| 5 | jdbc_test_update08 | ok | 0000-00-00 00:00:00 |
| 7 | test7 | ok | 0000-00-00 00:00:00 |
| 8 | test008 | ok | 0000-00-00 00:00:00 |
| 9 | test9 | ok | 0000-00-00 00:00:00 |
| 10 | test10 | deleted | 0000-00-00 00:00:00 |
+----+--------------------+---------+---------------------

but go-mysql-elasticsearch doesn't execute the deletes successfully, just like:
[root@5b9dbaaa148a go-mysql-elasticsearch]# ./bin/go-mysql-elasticsearch -config=./etc/river.toml
[2016/06/23 10:31:33] status.go:52 [Info] run status http server 10.8.5.101:12800
[2016/06/23 10:31:33] dump.go:107 [Info] try dump MySQL and parse
[2016/06/23 10:31:33] dump.go:113 [Info] dump MySQL and parse OK, use 0.08 seconds, start binlog replication at (mysql-bin.000001, 288)
[2016/06/23 10:31:33] sync.go:15 [Info] start sync binlog at (mysql-bin.000001, 288)
[2016/06/23 10:31:33] sync.go:46 [Info] rotate binlog to (mysql-bin.000001, 288)

and the ES data for search_all is:
{

"took": 1,
"timed_out": false,
"_shards": {
    "total": 8,
    "successful": 8,
    "failed": 0
},
"hits": {
    "total": 11,
    "max_score": 1,
    "hits": [
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "3",
            "_score": 1,
            "_source": {
                "id": 3,
                "modified_at": "0000-00-00 00:00:00",
                "name": "dlulaoyang",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "8",
            "_score": 1,
            "_source": {
                "id": 8,
                "modified_at": "0000-00-00 00:00:00",
                "name": "test008",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "9",
            "_score": 1,
            "_source": {
                "id": 9,
                "modified_at": "0000-00-00 00:00:00",
                "name": "test9",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "11",
            "_score": 1,
            "_source": {
                "id": 11,
                "modified_at": "2016-06-23 03:10:00",
                "name": "test1111",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "1",
            "_score": 1,
            "_source": {
                "id": 1,
                "modified_at": "0000-00-00 00:00:00",
                "name": "laoyang360",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "2",
            "_score": 1,
            "_source": {
                "id": 2,
                "modified_at": "2016-06-23 05:16:42",
                "name": "test002",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "4",
            "_score": 1,
            "_source": {
                "id": 4,
                "modified_at": "0000-00-00 00:00:00",
                "name": "huawei",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "12",
            "_score": 1,
            "_source": {
                "id": 12,
                "modified_at": "2016-06-23 03:21:56",
                "name": "test012",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "7",
            "_score": 1,
            "_source": {
                "id": 7,
                "modified_at": "0000-00-00 00:00:00",
                "name": "test7",
                "status": "ok"
            }
        }
        ,
        {
            "_index": "goriver",
            "_type": "goriver_t",
            "_id": "5",
            "_score": 1,
            "_source": {
                "id": 5,
                "modified_at": "0000-00-00 00:00:00",
                "name": "jdbc_test_update08",
                "status": "ok"
            }
        }
    ]
}

}

As you know, the deleted data still exists in ES.
Can you tell me why? Thanks!

I want to sync MySQL data instantly for insert, update and delete operations!

Thank you!

Can I get a log?

After setting up ./etc/river.toml I ran:
./bin/go-mysql-elasticsearch -config=./etc/river.toml
but I got nothing, and nothing was reported to ES. Can you print some error logs?

Couldn't execute 'FLUSH TABLES WITH READ LOCK': Access denied for user 'root'@'%' (using password: YES) (1045)

[root@ip-172-31-20-242 go-mysql-elasticsearch]# ./bin/go-mysql-elasticsearch -config=./etc/river.toml
[2015/10/20 21:55:29] dump.go:107 [Info] try dump MySQL and parse
[2015/10/20 21:55:29] status.go:52 [Info] run status http server 127.0.0.1:12800
mysqldump: Couldn't execute 'FLUSH TABLES WITH READ LOCK': Access denied for user 'root'@'%' (using password: YES) (1045)
[2015/10/20 21:55:29] canal.go:138 [Error] canal dump mysql err: exit status 2
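
For reference, the mysqldump step needs the RELOAD privilege (for FLUSH TABLES WITH READ LOCK) in addition to the replication privileges the README mentions; granting them might look like this (a sketch; user, host and password are placeholders):

GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT
    ON *.* TO 'river'@'%' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;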

Will it work with AWS RDS MySQL?

Cannot find package "encoding"….

Hello,
When I try to "make" I get the following error and not sure how to proceed (I'm not a developer - at all…lol). I do not have an "encoding" folder anywhere.
Thanks for any help!
Rob

root@jerSite:~/go/src/github.com/siddontang/go-mysql-elasticsearch# make
godep go build -o bin/go-mysql-elasticsearch ./cmd/go-mysql-elasticsearch
Godeps/_workspace/src/github.com/BurntSushi/toml/encoding_types.go:10:2: cannot find package "encoding" in any of:
/usr/src/pkg/encoding (from $GOROOT)
/root/go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/encoding (from $GOPATH)
/root/go/src/encoding
godep: go exit status 1
make: *** [build-elasticsearch] Error 1

No data synced

The program runs perfectly, but no data is synced. Below I attached a screenshot.
[image]

I am using XAMPP. This is my river.toml configuration:

# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = ""

# Elasticsearch address
es_addr = "127.0.0.1:9200"

# Path to store data, like master.info, and dump MySQL data
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
mysqldump = "mysqldump"

# MySQL data source
[[source]]
schema = "test"

# Only below tables will be synced into Elasticsearch.
# "test_river_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
#tables = ["test_river", "test_river_[0-9]{4}"]
tables = ["a"]

# Below is for special rule mapping
[[rule]]
schema = "test"
table = "a"
index = "users"
type = "user"

    # title is MySQL test_river field name, es_title is the customized name in Elasticsearch
    [rule.field]
    # This will map column title to elastic search es_title
    title="es_title"
    # This will map column tags to elastic search my_tags and use array type
    tags="my_tags,list"
    # This will map column keywords to elastic search keywords and use array type
    keywords=",list"

# wildcard table rule, the wildcard table must be in source tables
[[rule]]
schema = "test"
table = "test_river_[0-9]{4}"
index = "river"
type = "river"

# title is MySQL test_river field name, es_title is the customized name in Elasticsearch
# [[rule.fields]]
# mysql = "title"
# elastic = "es_title"

#table name change to a

one-to-many join question

Hi,
I played with this library and managed to import data into Elastic, very cool, thanks. But I didn't manage to store nested indexes. I'd like to ask for advice; maybe someone could point out what I'm doing wrong.
So here I have two cases in ./etc/river.toml file:

  1. when the relation is in the same schema
  2. when the relation is in a different schema and is related to the first schema
# MySQL data source
[[source]]
schema = "blog"
tables = ["users", "posts"]

# MySQL data source
[[source]]
schema = "gallery"
tables = ["user_gallery", "images"]

[[rule]]
schema = "blog"
table = "users"
index = "users"
type = "user"

[[rule]]
schema = "blog"
table = "posts"
index = "posts"
type = "post"
parent = "user_id"

[[rule]]
schema = "gallery"
table = "user_gallery"
index = "galleries"
type = "gallery"
parent = "user_id"

[[rule]]
schema = "gallery"
table = "images"
index = "images"
type = "image"
parent = "user_gallery_id"

So with this I've managed to store users, but it didn't create the nested dependencies, and it didn't store the other types of index; it only created them.

Off topic:

I configured a Data Import Handler for Solr, and there I explicitly described which field relates to which, and from what data source. So maybe I've missed something.

Getting error when restart sync

Hello, I got the log below:

[2016/06/17 15:25:15] sync.go:15 [Info] start sync binlog at (mysql-bin.000001, 4969960)
[2016/06/17 15:25:15] status.go:52 [Info] run status http server 127.0.0.1:12800[2016/06/17 15:25:15] sync.go:46 [Info] rotate binlog to (mysql-bin.000001, 4969960)
[2016/06/17 15:25:15] sync.go:50 [Error] handle rows event error ERROR 1146 (42S02): Table 'test.geo' doesn't exist
[2016/06/17 15:25:15] canal.go:146 [Error] canal start sync binlog err: ERROR 1146 (42S02): Table 'test.geo' doesn't exist

I had synchronized my data before, and the index is on one table of a database. Then I inserted some data into another table and dropped that table after shutting down the sync program. When I restarted the sync program, it got this error. The program found the dump data and just replayed the binlog. However, the binlog is recorded in order, and the drop record comes after the insert records. This causes the problem, and the program stops; records after the drop record can't be synchronized.

I think it is important that synchronization not be affected by operations on other tables.

In the end, I had to delete the previous dump data, dump the data again, and restart go-mysql-elasticsearch. That worked well.

server id was not set sync error

Hi, I got the following errors and don't know how to handle them:

2016/09/13 10:07:25 binlogstreamer.go:47: [error] close sync with err: ERROR 1236 (HY000): Misconfigured master - server id was not set
2016/09/13 10:07:25 canal.go:151: [error] canal start sync binlog err: ERROR 1236 (HY000): Misconfigured master - server id was not set
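
Error 1236 here is reported by the MySQL master itself: replication (which go-mysql-elasticsearch relies on) requires a server id to be configured on the master. A sketch of the fix in my.cnf (the id value is a placeholder):

[mysqld]
server-id = 1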

What does type mean in the rule part?

I don't understand the meaning of type in the rule part. If I have already created a mapping JSON, can I just put its name in the index part?

go test failed with parent test

Hi @EagleChen

Running go test fails on the parent test:

FAIL: river_extra_test.go:92: riverTestSuite.TestRiverWithParent

river_extra_test.go:103:
    s.testElasticExtraExists(c, "1", "1", true)
river_extra_test.go:86:
    c.Assert(r.Code, Equals, http.StatusOK)
... obtained int = 404
... expected int = 200

I used the siddontang/mysql image and the newest elasticsearch:latest Docker image to test.
Can you help me find out why it failed? Thank you.

What is the inner HTTP status address?

I am running into the following error:

./bin/go-mysql-elasticsearch -config=./etc/river.toml 
[2015/07/06 18:59:41] status.go:52 [Info] run status http server 127.0.0.1:12800
[2015/07/06 18:59:41] dump.go:107 [Info] try dump MySQL and parse
[2015/07/06 18:59:41] canal.go:133 [Error] canal dump mysql err: exit status 2

I realized that I didn't change the 'Inner HTTP status address' in ./etc/river.toml. But what does that mean exactly?

Filtering table field support?

This is noted as a missing feature in the README, but after taking a look at the code I've found that it is a trivial change, as in:
sync.go:197

func (r *River) makeInsertReqData(...) {
    ...
    if name != "-" {
        req.Data[name] = r.makeReqColumnData(&c, values[i])
    }
    ...

and similar for makeUpdateReqData

This way I can filter out a table field in the rule's field config, like:

[[rule]]

    [rule.field]
    password = "-"

So my question is: does this carry any risk?

replication required?

Hi, I'm trying to use this on a remote database on which I do not have replication privileges. I can use mysqldump just fine, however. Is there something I can do to make this work without replication?

undefined: atomic.SwapInt32

I get this error when running make:

godep go build -o bin/go-mysql-elasticsearch ./cmd/go-mysql-elasticsearch

github.com/siddontang/go/sync2

Godeps/_workspace/src/github.com/siddontang/go/sync2/semaphore.go:48: undefined: atomic.SwapInt32
Godeps/_workspace/src/github.com/siddontang/go/sync2/semaphore.go:59: undefined: atomic.SwapInt32
godep: go exit status 2
make: *** [build-elasticsearch] Error 1

I have never used Go before, so I'm not sure how to go about fixing it.

go1.1.1 linux/amd64

I encountered the following errors

panic: runtime error: comparing uncomparable type []uint8

goroutine 18 [running]:
github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeUpdateReqData(0xc2080dfc70, 0xc208160000, 0xc2080dfb80, 0xc2080ea2a0, 0xe, 0xe, 0xc2080ea380, 0xe, 0xe)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:189 +0x1d9
github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeUpdateRequest(0xc2080dfc70, 0xc2080dfb80, 0xc2080b4270, 0x2, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:124 +0x552
github.com/siddontang/go-mysql-elasticsearch/river.(*rowsEventHandler).Do(0xc20802c020, 0xc2080b4300, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:38 +0x64d
github.com/siddontang/go-mysql/canal.(*Canal).travelRowsEventHandler(0xc20803e3c0, 0xc2080b4300, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/handler.go:32 +0x13a
github.com/siddontang/go-mysql/canal.(*Canal).handleRowsEvent(0xc20803e3c0, 0xc2080b42a0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/sync.go:86 +0x2e3
github.com/siddontang/go-mysql/canal.(*Canal).startSyncBinlog(0xc20803e3c0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/sync.go:49 +0x6ed
github.com/siddontang/go-mysql/canal.(*Canal).run(0xc20803e3c0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/canal.go:144 +0x1a1
created by github.com/siddontang/go-mysql/canal.(*Canal).Start
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/canal.go:129 +0x67

goroutine 1 [chan receive, 99 minutes]:
main.main()
    /go/src/github.com/siddontang/go-mysql-elasticsearch/cmd/go-mysql-elasticsearch/main.go:82 +0x666

goroutine 5 [select, 99 minutes]:
github.com/siddontang/go/log.(*Logger).run(0xc20800c080)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go/log/log.go:100 +0x267
created by github.com/siddontang/go/log.New
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go/log/log.go:80 +0x1de

goroutine 6 [syscall, 99 minutes]:
os/signal.loop()
    /usr/src/go/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
    /usr/src/go/src/os/signal/signal_unix.go:27 +0x35

goroutine 17 [IO wait, 99 minutes]:
net.(*pollDesc).Wait(0xc208010ed0, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208010ed0, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).accept(0xc208010e70, 0x0, 0x7f37d510abb0, 0xc20802abb8)
    /usr/src/go/src/net/fd_unix.go:419 +0x40b
net.(*TCPListener).AcceptTCP(0xc20802c028, 0xc208020700, 0x0, 0x0)
    /usr/src/go/src/net/tcpsock_posix.go:234 +0x4e
net.(*TCPListener).Accept(0xc20802c028, 0x0, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/tcpsock_posix.go:244 +0x4c
net/http.(*Server).Serve(0xc208064480, 0x7f37d510f4e0, 0xc20802c028, 0x0, 0x0)
    /usr/src/go/src/net/http/server.go:1728 +0x92
github.com/siddontang/go-mysql-elasticsearch/river.(*stat).Run(0xc20803cde0, 0xc20802b2d0, 0xd)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/river/status.go:65 +0x4a5
created by github.com/siddontang/go-mysql-elasticsearch/river.NewRiver
    /go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:62 +0x39a

goroutine 49 [IO wait]:
net.(*pollDesc).Wait(0xc2080101b0, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc2080101b0, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).Read(0xc208010150, 0xc2080a8000, 0x1000, 0x1000, 0x0, 0x7f37d510abb0, 0xc2080e05d8)
    /usr/src/go/src/net/fd_unix.go:242 +0x40f
net.(*conn).Read(0xc2080a4000, 0xc2080a8000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/net.go:121 +0xdc
bufio.(*Reader).fill(0xc2080a6000)
    /usr/src/go/src/bufio/bufio.go:97 +0x1ce
bufio.(*Reader).Read(0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0x5ee, 0x0, 0x0)
    /usr/src/go/src/bufio/bufio.go:174 +0x26c
io.ReadAtLeast(0x7f37d510add0, 0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0x4, 0x0, 0x0, 0x0)
    /usr/src/go/src/io/io.go:298 +0xf1
io.ReadFull(0x7f37d510add0, 0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0xc2081242a0, 0x0, 0x0)
    /usr/src/go/src/io/io.go:316 +0x6d
github.com/siddontang/go-mysql/packet.(*Conn).ReadPacketTo(0xc20800a440, 0x7f37d510f3a0, 0xc2081242a0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/packet/conn.go:81 +0xe1
github.com/siddontang/go-mysql/packet.(*Conn).ReadPacket(0xc20800a440, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/packet/conn.go:35 +0x9f
github.com/siddontang/go-mysql/replication.(*BinlogSyncer).onStream(0xc2080b2000, 0xc2081ec980)
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/replication/binlogsyncer.go:465 +0x9b
created by github.com/siddontang/go-mysql/replication.(*BinlogSyncer).startDumpStream
    /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/replication/binlogsyncer.go:230 +0x10f

goroutine 8 [IO wait]:
net.(*pollDesc).Wait(0xc2080aa530, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc2080aa530, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).Read(0xc2080aa4d0, 0xc208158000, 0x1000, 0x1000, 0x0, 0x7f37d510abb0, 0xc2080e0410)
    /usr/src/go/src/net/fd_unix.go:242 +0x40f
net.(*conn).Read(0xc2080a4018, 0xc208158000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/net.go:121 +0xdc
net/http.noteEOFReader.Read(0x7f37d510f378, 0xc2080a4018, 0xc208156058, 0xc208158000, 0x1000, 0x1000, 0x6f8280, 0x0, 0x0)
    /usr/src/go/src/net/http/transport.go:1270 +0x6e
net/http.(*noteEOFReader).Read(0xc20800b000, 0xc208158000, 0x1000, 0x1000, 0xc208013200, 0x0, 0x0)
    <autogenerated>:125 +0xd4
bufio.(*Reader).fill(0xc2080a67e0)
    /usr/src/go/src/bufio/bufio.go:97 +0x1ce
bufio.(*Reader).Peek(0xc2080a67e0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/src/go/src/bufio/bufio.go:132 +0xf0
net/http.(*persistConn).readLoop(0xc208156000)
    /usr/src/go/src/net/http/transport.go:842 +0xa4
created by net/http.(*Transport).dialConn
    /usr/src/go/src/net/http/transport.go:660 +0xc9f

goroutine 9 [select]:
net/http.(*persistConn).writeLoop(0xc208156000)
    /usr/src/go/src/net/http/transport.go:945 +0x41d
created by net/http.(*Transport).dialConn
    /usr/src/go/src/net/http/transport.go:661 +0xcbc

What's wrong? How can I fix this?

The database connection dropped for no reason, and the river does not continue syncing data

The log is as follows:

[2016/06/20 11:27:37] status.go:52 [Info] run status http server 0.0.0.0:12800
[2016/06/20 11:27:37] dump.go:108 [Info] try dump MySQL and parse
[2016/06/20 12:19:59] dump.go:114 [Info] dump MySQL and parse OK, use 3141.24 seconds, start binlog replication at (mariadb-bin.000018, 95570104)
[2016/06/20 12:19:59] sync.go:15 [Info] start sync binlog at (mariadb-bin.000018, 95570104)
[2016/06/20 12:19:59] canal.go:146 [Error] canal start sync binlog err: connection was bad

binlog must ROW format, but STATEMENT now

github.com/siddontang/go-mysql/canal/canal.go:242: binlog must ROW format, but STATEMENT now
github.com/siddontang/go-mysql/canal/canal.go:84:
/root/.go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:82:
/root/.go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:44:

The config file is as follows:

[root@iZ22vmzhajwZ go-mysql-elasticsearch]# vi etc/river.toml

# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = "jjjjjj"

# Elasticsearch address
es_addr = "10.25.166.191:9200"

# Path to store data, like master.info, and dump MySQL data
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
mysqldump = "mysqldump"

# MySQL data source
[[source]]
schema = "torrent"

# Only below tables will be synced into Elasticsearch.
# "test_river_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
tables = ["torrent[0-9]{1}"]

# Below is for special rule mapping
[[rule]]
schema = "torrent"
table = "torrent[0-9]{1}"
index = "torrent"

How can I set the analyzer of a field

In elasticsearch-jdbc:
type_mapping: {
    "blog": {
        dynamic: true,
        properties: {
            "title": { "type": "string", "analyzer": "ik_analyzer", "search_analyzer": "ik_analyzer" },
            "content": { "type": "string", "analyzer": "ik_analyzer", "search_analyzer": "ik_analyzer" }
        }
    }
},

Does go-mysql-elasticsearch support any way to set the properties of a field, besides the Elasticsearch type?
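
Since the README recommends creating mappings in Elasticsearch yourself before syncing, one way is to set the analyzer when creating the index (a sketch for ES 2.x; the string type and the ik_analyzer from the question are assumed to be available):

curl -XPUT 'http://127.0.0.1:9200/blog' -d '
{
  "mappings": {
    "blog": {
      "properties": {
        "title":   { "type": "string", "analyzer": "ik_analyzer", "search_analyzer": "ik_analyzer" },
        "content": { "type": "string", "analyzer": "ik_analyzer", "search_analyzer": "ik_analyzer" }
      }
    }
  }
}'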

GME & the future?

Hi,

with the announcement that rivers are being discontinued in 2.0, and are discouraged from 1.5 on, will this affect Go-Mysql-Elasticsearch?

Will you be pursuing other alternatives? We are looking into ways of syncing using the recommended method which best suits us (in our case we don't want to start from zero, so we're looking at Logstash)... Are you, or will you be, considering GME as an input to Logstash perhaps? Or is there too much overlap in code already, in your opinion?

Logstash doesn't have a MySQL input layer, and probably won't ever have something binlog-capable... On the other hand, Logstash already bulks requests and inserts them using the correct APIs on ES, so a bridge between GME and Logstash could be a natural progression?

I'd like to hear the dev's opinions...

MySQL version 5.7.13-log: error after running for about 30 minutes

With the latest version of go-mysql-elasticsearch and MySQL 5.7.13-log, after running for about 30 minutes I get the following log error:
2016/11/02 05:23:31 binlogstreamer.go:47: [error] close sync with err: invalid stream header þ.

I debugged the source code; it seems it reads data from MySQL and the first byte of the data is invalid for the program. In the log, data[0] is printed with %c, so unfortunately I couldn't get the exact value. I will continue testing and hope to get the actual data[0] value.

How to set up river.toml to parse the JSON format

Hi:
I have a MySQL table like this:
table name:
torrent0

field infohash :
0F850D075A59522FD975404CF6D205530F066243

field data :
{"Infohash":"0F850D075A59522FD975404CF6D205530F066243","Name":"John.Wick.2014.DUAL.BDRip-AVC.x264.mkv","Length":2484494781,"Heat":0,"FileCount":1,"Files":[{"Name":"John.Wick.2014.DUAL.BDRip-AVC.x264.mkv","Length":2484494781}],"CreateTime":"2016-08-23T03:17:55.97859726-04:00"}

field create_time :
2016-08-23 03:17:55

Now, how do I write river.toml so that Elasticsearch is set up like this?

curl -XGET 10.25.166.191:9200/torrent/_search?pretty

{
"_index" : "torrent",
"_type" : "0",
"_id" : "0F850D075A59522FD975404CF6D205530F066243",
"_score" : 1.0,
"_source" : {
"Name" : "Beshenye.2015.P.HDRip.1400MB.nnmclub.avi",
"Length" : 1467906048,
"Heat" : 0,
"CreateTime" : "2016-09-02T14:40:53.779099802+08:00"
}
},

I've tried setting it as below:
[rule.field]
id="infohash,string"
source="data,list"
Result:
{
    "_index" : "torrent",
    "_type" : "0",
    "_id" : "800910",
    "_score" : 1.0,
    "_source" : {
        "create_time" : "2016-08-23 03:17:55",
        "data" : "{\"Infohash\":\"02A5B835DF1831C808CDDBDEEBBC6CBBA2AC3478\",\"Name\":\"Tokyo_Marble_Chocolate(2007)[720p,BluRay,x264]-THORA\",\"Length\":1049992650,\"Heat\":0,\"FileCount\":5,\"Files\":[{\"Name\":\"Tokyo_Marble_Chocolate_part1_Mata_Aimashou[720p,BluRay,x264]-THORA.mkv\",\"Length\":525121923},{\"Name\":\"Tokyo_Marble_Chocolate_part2_Zenryoku_Shounen[720p,BluRay,x264]-THORA.mkv\",\"Length\":524839477},{\"Name\":\"Tokyo_Marble_Chocolate(2007)[720p,BluRay,x264]-THORA.nfo\",\"Length\":16634},{\"Name\":\"한글 자막 - 보실 때 압축풀고 압축파일은 보관해주세요.7z\",\"Length\":14395},{\"Name\":\"Tokyo_Marble_Chocolate(2007)[720p,BluRay,x264]-_THORA.md5\",\"Length\":221}],\"CreateTime\":\"2016-08-23T03:17:55.967615474-04:00\"}",
        "id" : 800910,
        "infohash" : "02A5B835DF1831C808CDDBDEEBBC6CBBA2AC3478"
    }
}
......

Lost connection to MySQL server

1 GB of RAM, single core. When the imported data reaches around 100,000 rows, this happens frequently:

mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table duobao_all at row: 45895

canal.go:138 [Error] canal dump mysql err: exit status 3

cannot sync data

I don't know what configuration is wrong.
mysqldump is OK.
But sync does not work.

Could you help me figure out why?
Here are some logs:

dump.go:113 [Info] dump MySQL and parse OK, use 704.79 seconds, start binlog replication at (mariadb-bin.000001, 1478069)
sync.go:15 [Info] start sync binlog at (mariadb-bin.000001, 1478069)
canal.go:146 [Error] canal start sync binlog err: connection was bad

Has anyone run it on Windows 7?

When I run make on Win7, I get the following error:
godep go build -o bin/go-mysql-elasticsearch ./cmd/go-mysql-elasticsearch
process_begin: CreateProcess(NULL, godep go build -o bin/go-mysql-elasticsearch
./cmd/go-mysql-elasticsearch, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [build-elasticsearch] Error 2

What's wrong here?

mysql error 1064 when starting

I encountered the following error when I tried to start:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"binlog_format"' at line 1

My database is MySQL 5.6.22, and the binlog_format is "row".
Could you help me figure this out? Thanks.
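
One plausible cause (an assumption, not confirmed in this thread): if the server's sql_mode includes ANSI_QUOTES, MySQL treats double-quoted strings as identifiers, so a query like SHOW GLOBAL VARIABLES LIKE "binlog_format" fails with exactly this 1064 error. A sketch of how to check and fix:

SELECT @@GLOBAL.sql_mode;
-- if ANSI_QUOTES appears in the list, set sql_mode to a value without it, e.g.:
-- SET GLOBAL sql_mode = 'STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION';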

Is indexing one table into multiple different types not supported?

A single table belongs to multiple parent tables, and ES does not seem to support multiple parents, so I want to index one table as several different types and specify a parent for each. But during sync, the data does not seem to be synced into each of the different types; only the last type ends up with data.

Can I use SQL queries to get a table?

I can only get original tables by using the table name in Source.

Can I use SQL queries to fetch a table? One of the advantages of SQL queries is the join operation: from many tables, new tuples can be formed.

You can find an example here

can not sync data from mysql

[2016/06/22 10:28:26] canal.go:145 [Error] canal start sync binlog err: invalid stream header þ --- [fe]
[2016/06/22 11:11:58] status.go:52 [Info] run status http server 127.0.0.1:12800
[2016/06/22 11:11:58] dump.go:95 [Info] skip dump, use last binlog replication pos (mysql-bin.000007, 247504563)
[2016/06/22 11:11:58] sync.go:15 [Info] start sync binlog at (mysql-bin.000007, 247504563)
[2016/06/22 11:11:58] sync.go:45 [Info] rotate binlog to (mysql-bin.000007, 247504563)
[2016/06/22 11:11:58] handler.go:33 [Error] handle ESRiverRowsEventHandler err: make up

When this error log is printed, it cannot sync data. [fe] is the binlog's header.

Error in parsing the dump's output

An error comes up when SQL expressions exist in the dump's output content.
This error only happens in the dump step, not in the binlog sync.
The log is as follows.

[2015/11/19 18:37:44] dump.go:107 [Info] try dump MySQL and parse
[2015/11/20 05:51:26] dump.go:31 [Error] get procurement.news_info_lst` VALUES (1196669,'3e2fc4f031e005c212d809eb471703b3','AEAICRMV1.5.1升级说明,开源客户关系管理系统','2015-11-17 09:09:01.379400','2015-11-17','http://www.oschina.net/news/68105/aeai-crm-1-5-1',117,'开源**社区','2015-11-17 09:09:01.379397',NULL,' 12月12日北京OSC源创会 -- 开源技术的年终盛典   本次发版的AEAI CRM_v1.5.1版本为AEAI CRM _v1.5.0版本的升级版本,该产品现已开源并上传至开源社区。 1 升级说明 本产品是公司
根据实际项目中客户提出的良好建议,从而对产品进行的一次升级和完善,升级后的产品更加简洁优雅,代码结构也更加合理,建议升级后使用。 2 升级内容 1.统一修改  1)代码重
构,采用最新版本的开发平台支持远程、增量、热部署  2)按钮权限配置化  3)文本域填写信息提示字数限制  4)查看模式只能查看,表单元素不可编辑,且进行反色处理  2.潜在客
户:  1)电话记录生成线索的时候,要能自动填写电话  2)组织类的潜在客户添加分类字段及分类查询机制  3.客户信息:  1)解决客户不能复制、迁移的问题 3 升级步骤 由于本次
代码调整幅度较大建议直接使用新版本。另外,本次升级对数据表做了几处微调:  CRM_ORG_INFO表的ORG_INTRODUCTION字段长度调整为(1024)  CRM_ORG_INFO表添加“分类”字段:ORG_CLASSIFICATION varchar(32)  编码分组表以及编码定表中添加如下分类定义数据:    INSERT INTO `sys_codetype` VALUES (\'ORG_CLASSIFICATION\', \'企业分类\', \'sys_code_define\', \'\', \'Y\', \'Y\', \'Y\', \'20\', \'\', \'N\', \'\', \'\');  INSERT INTO `sys_codelist` VALUES (\'ORG_CLASSIFICATION\', \'ENTERPRISE_ENTITY\', \'企
业实体\', \'\', \'1\', \'1\');  INSERT INTO `sys_codelist` VALUES (\'ORG_CLASSIFICATION\', \'SOFTWARE_AGENTS\', \'软件代理商\', \'\', \'3\', \'1\');  INSERT INTO `sys_codelist` VALUES (\'ORG_CLASSIFICATION\', \'SOFTWARE_DEVELOPERS\', \'软件开发商\', \'\', \'2\', \'1\');  INSERT INTO `sys_codelist` VALUES (\'ORG_CLASSIFICATION\', \'SYSTEM_INTEGRATOR\', \'系统集成商\', \'\', \'4\', \'1\');  INSERT INTO `sys_codelist information err: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'sys_codetype` VALUES (\'ORG_CLASSIFICATION\', \'企业分类\', \'sys_code_defin' at line 1

Does anyone have this Base64 encoding problem?

I have a field of type "TEXT" in my database table. After I insert "hello" into that field, whether through phpMyAdmin, PHP or the command line, the Base64 encoding "aGVsbG8=" is indexed into the ES document.
VARCHAR works fine; the problem is only with the "TEXT" type.
[image]
If I delete the index, stop go-mysql-elasticsearch, and restart it to sync the database, it indexes "hello" this time and everything works perfectly.
[image]

Does anyone have the same problem as me? I really need some direction.
By the way, this project is awesome!!! Thanks a lot.

Hi everybody, in case somebody wants to make use of it, I want to share an /etc/init.d script for running go-mysql-elasticsearch in the background.

#!/bin/bash
#
# go-mysql-elasticsearch server init script - Levent ALKAN.
#
### BEGIN INIT INFO
# Provides:          syncelasticsearch
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: go-mysql-elasticsearch
### END INIT INFO

USER="root" # User we will run Go as
GOPATH="/root/work" # GOPATH
GOROOT="/usr/local/go" # used by revel
WORKDIR="/root/work/src/github.com/siddontang"
NAME="go-mysql-elasticsearch" # app name for gme etc ...
GO_CMD="$WORKDIR/$NAME/bin/go-mysql-elasticsearch -config=$WORKDIR/$NAME/etc/river.toml"

# Start script

recursiveKill() { # Recursively kill a process and all subprocesses
CPIDS=$(pgrep -P $1);
for PID in $CPIDS
do
recursiveKill $PID
done
sleep 3 && kill -9 $1 2>/dev/null & # hard kill after 3 seconds
kill $1 2>/dev/null # try soft kill first
}

case "$1" in
start)
echo "Starting $NAME ..."
if [ -f "$WORKDIR/$NAME.pid" ]
then
echo "Already running according to $WORKDIR/$NAME.pid"
exit 1
fi
cd "$WORKDIR"
export GOROOT="$GOROOT"
export GOPATH="$GOPATH"
export PATH="${PATH}:${GOROOT}/bin:${GOPATH}/bin"
/bin/su -m -l $USER -c "$GO_CMD" > "$WORKDIR/$NAME.log" 2>&1 &
PID=$!
echo $PID > "$WORKDIR/$NAME.pid"
echo "Started with pid $PID - Logging to $WORKDIR/$NAME.log" && exit 0
;;
stop)
echo "Stopping $NAME ..."
if [ ! -f "$WORKDIR/$NAME.pid" ]
then
echo "Already stopped!"
exit 1
fi
PID=$(cat "$WORKDIR/$NAME.pid")
recursiveKill $PID
rm -f "$WORKDIR/$NAME.pid"
echo "stopped $NAME" && exit 0
;;
restart)
$0 stop
sleep 1
$0 start
;;
status)
if [ -f "$WORKDIR/$NAME.pid" ]
then
PID=$(cat "$WORKDIR/$NAME.pid")
if [ "$(/bin/ps --no-headers -p $PID)" ]
then
echo "$NAME is running (pid : $PID)" && exit 0
else
echo "Pid $PID found in $WORKDIR/$NAME.pid, but not running." && exit 1
fi
else
echo "$NAME is NOT running" && exit 1
fi
;;
*)
echo "Usage: /etc/init.d/$NAME {start|stop|restart|status}" && exit 1
;;
esac

exit 0

Date and DateTime fields Problem ?

I installed Elasticsearch version 2.1.1 and added go-mysql-elasticsearch; when I executed it, everything worked fine. go-mysql-elasticsearch started listening to the log file, with no errors or warnings, but when I update a row in MySQL or insert a new row, the date and datetime fields are synchronized to Elasticsearch as empty. What could the problem be?

parse rows event panic runtime error: slice bounds out of range

2016/11/04 03:17:47 row_event.go:267: [fatal] parse rows event panic runtime error: slice bounds out of range, data "o\x00\x00\x00\x00\x00\x01\x00\x02\x00\f\xff\xff\xfa\xff?B\x0f\x00\b
\x00Testdata", parsed rows &replication.RowsEvent{Version:2, tableIDSize:6, tables:map[uint64]*replication.TableMapEvent{0x6f:(*replication.TableMapEvent)(0xc82014a0a0)}, needBitmap2:f
alse, Table:(*replication.TableMapEvent)(0xc82014a0a0), TableID:0x6f, Flags:0x1, ExtraData:[]uint8{}, ColumnCount:0xc, ColumnBitmap1:[]uint8{0xff, 0xff}, ColumnBitmap2:[]uint8(nil), Ro
ws:[][]interface {}(nil)}, table map &replication.TableMapEvent{tableIDSize:6, TableID:0x6f, Flags:0x1, Schema:[]uint8{0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x44, 0x61, 0x74, 0x61}, Tabl
e:[]uint8{0x70, 0x72, 0x6f, 0x64, 0x75, 0x63, 0x74, 0x73}, ColumnCount:0xc, ColumnType:[]uint8{0x3, 0xf5, 0xfc, 0xfc, 0x5, 0xfc, 0x5, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc}, ColumnMeta:[]uint16
{0x0, 0x0, 0x4, 0x2, 0x2, 0x8, 0x2, 0x8, 0x2, 0x2, 0x2, 0x2}, NullBitmap:[]uint8{0xfe, 0xf}}

# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = ""

# Elasticsearch address
es_addr = ":9200"

# Path to store data, like master.info, and dump MySQL data
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
mysqldump = "mysqldump"

# MySQL data source
[[source]]
schema = "ClientData"

# Only below tables will be synced into Elasticsearch.
# "test_river_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
tables = ["product_details"]

Everything is OK, but it cannot sync data

I used another tool, go-mysql, and it works well; the output:

=== QueryEvent ===
Date: 2015-12-16 07:49:52
Log position: 1476
Event size: 109
Slave proxy ID: 38
Execution time: 0
Error code: 0
Schema: esdb
Query: insert into accesslog(name) value('antoher 3')

=== XIDEvent ===
Date: 2015-12-16 07:49:52
Log position: 1503
Event size: 27
XID: 146

But when I start go-mysql-elasticsearch, the output is:

[2015/12/16 07:30:21] status.go:52 [Info] run status http server 127.0.0.1:12800
[2015/12/16 07:30:21] dump.go:95 [Info] skip dump, use last binlog replication pos (mysql-bin.000012, 564)
[2015/12/16 07:30:21] sync.go:15 [Info] start sync binlog at (mysql-bin.000012, 564)
[2015/12/16 07:30:21] sync.go:46 [Info] rotate binlog to (mysql-bin.000012, 564)

Everything looks OK, but it cannot sync data.
If I restart MySQL and go-mysql-elasticsearch, it syncs once at the beginning, and then cannot sync data any more.
Can you help me figure out the reason?

Binlogging on server not active

I've got this error message

MacBook-Pro-de-Vincent% bin/go-mysql-elasticsearch
[2015/07/13 16:59:19] status.go:52 [Info] run status http server 127.0.0.1:12800
[2015/07/13 16:59:19] dump.go:107 [Info] try dump MySQL and parse
Warning: Using a password on the command line interface can be insecure.
mysqldump: Error: Binlogging on server not active
[2015/07/13 16:59:19] canal.go:138 [Error] canal dump mysql err: exit status 2

When I look at the MySQL variables I can see sql_log_bin is ON.

Any ideas?
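
One thing worth noting (an assumption based on the error text): sql_log_bin only toggles binlog writing for the current session, whereas whether the server produces a binary log at all is controlled by log_bin, which must be enabled at server startup. A my.cnf sketch:

[mysqld]
log_bin   = mysql-bin
server-id = 1

SHOW VARIABLES LIKE 'log_bin'; should then report ON.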

Confusion on how to write a config joining tables into a single document_type

How can I write this in the config? I want to join tables, and there should be only one document_type, named application:

SELECT
    ap.associate_person_id,
    ap.person_id,
    ap.is_registrant,
    ms.ms_application_id,
    ai.application_name,
    s.service,
    s.service_id,
    aso.application_status_option,
    ai.trash,
    ai.created AS created_date
FROM associate_person ap
    LEFT JOIN ms_application ms ON ms.ms_application_id = ap.ms_application_id
    LEFT JOIN application_index ai ON ai.ms_application_id = ms.ms_application_id
    LEFT JOIN application_status aps ON aps.application_status_id = ai.application_status_id
    LEFT JOIN service s ON s.service_id = ai.service_id
    LEFT JOIN application_status_option aso ON aso.application_status_option_id = aps.application_status_option_id;

How can I achieve this?

Is a PK required?

I have many tables without a PK. Is one required? Can I use a system-generated id?

Also, I'd like to ask: I want to sync all tables, so what should tables be set to? Writing * doesn't seem to match.
