altinity / clickhouse-backup

Tool for easy ClickHouse backup and restore using object storage for backup files.

Home Page: https://altinity.com

License: Other

Go 73.05% Shell 3.90% Dockerfile 0.71% Makefile 0.69% Python 21.66%
Topics: clickhouse, backup, s3, dump, clickhouse-backup, clickhousedump

clickhouse-backup's People

Contributors

alexakulov, anikin-aa, anuriq, besteffects, combin, dependabot[bot], develar, elpeque, excitoon, farbodsalimi, felixoid, hodgesrm, lqhl, minguyen9988, mskwon, nikk0, nmcclain, o-mdr, rodrigargar, roman-vynar, sanadhis, slach, tadus21, tvion, umang8223, vahid-sohrabloo, wangzhen11aaa, yuzhichang, zekker6, zvonand

clickhouse-backup's Issues

How to restore replicated tables ?

Hello,

I need to backup all my replicated tables locally, before doing some dangerous operations.
Creating the backup is straightforward: clickhouse-backup create my-backup.
After some data has been deleted or modified, how do I restore it?

  • Drop the database and run clickhouse-backup restore my-backup.
    The replicas will also drop all their data and then copy every partition after the restore, which can take a long time when the database is large.
  • Without dropping anything, run clickhouse-backup restore my-backup.
    The previously existing data will be duplicated, and the duplicates will also be copied to the replicas.
  • Delete the modified partition in the database, keep only that partition in the backup folder, then run clickhouse-backup restore my-backup.
    This looks better, since only the needed partition is copied to the replicas, but the manual step is dangerous.
    How can this be done automatically? A sketch of this manual sequence is shown after this list.
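
A minimal sketch of that third, partition-level approach, assuming a Replicated* table and the restore-data/--tables options shown elsewhere in this tracker. The table name (mydb.events), partition (201905), and backup name are placeholders, and this is an illustration rather than a built-in clickhouse-backup feature:

package main

import (
	"database/sql"
	"fmt"
	"log"
	"os/exec"

	_ "github.com/ClickHouse/clickhouse-go" // registers the "clickhouse" driver
)

func main() {
	db, err := sql.Open("clickhouse", "tcp://127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// 1. Drop only the modified partition; on a Replicated* table the drop
	//    is replicated, so every replica discards just that partition.
	if _, err := db.Exec("ALTER TABLE mydb.events DROP PARTITION 201905"); err != nil {
		log.Fatal(err)
	}

	// 2. Restore the table's data from the local backup. Restoring a single
	//    partition may still require trimming the backup folder by hand,
	//    as noted in the bullet above.
	out, err := exec.Command("clickhouse-backup", "restore-data",
		"--tables=mydb.events", "my-backup").CombinedOutput()
	fmt.Print(string(out))
	if err != nil {
		log.Fatal(err)
	}
}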

Unable to backup table.

Hi,
I have been trying to get this utility working for the past three days without success. I can connect to the ClickHouse database, but the create command isn't working.

$ clickhouse-backup tables -c ~/my_config.yml
This gives a list of tables. However:


$ clickhouse-backup create -c ~/my_config.yml bkp2

Output:

2019/05/01 15:50:11 Create backup 'bkp2'
2019/05/01 15:50:11 can't get partitions for "default.tmp_employee" with code: 47, message: Unknown identifier: partition_id

Details:
Ubuntu 16.04 LTS
ClickHouse version: 1.1.54370
clickhouse-backup: latest build (26 Apr 2019)

Any idea what could be wrong?

hard links in /var/lib/clickhouse/shadow are not treated well

During multiple freeze executions, partitions that haven't changed just add another hard link to the already existing file. These can be seen from the command line:

# find shadow -xdev -samefile shadow/10/data/default/ontime/2009_1_1_0/checksums.txt -print
shadow/10/data/default/ontime/2009_1_1_0/checksums.txt
shadow/28/data/default/ontime/2009_1_1_0/checksums.txt

Hard links pointing to the same file should be handled by clickhouse-backup, but the current behaviour is as follows:

  • for the tree strategy they are uploaded to S3 as duplicates, and during download they are created as separate files with the same content
  • for the archive strategy hard links are saved correctly in the tar archive and recreated as hard links during download, so the result is the same structure as in the shadow directory. N.B. after #5 is merged

Both strategies suffer on restore, though: restore does not distinguish these links, so duplicate partitions are copied/moved to the detached folder multiple times and attached to the table. In the end you get duplicate rows in the tables.

I plan to work on this, but wanted to discuss how to fix it:

  • Shall we fix the upload/download side, so we keep a single clean copy of the data in the backup folder?
  • Or fix the restore side, making it understand these links and process them accordingly?

P.S. To avoid this issue right now, the shadow directory should be cleaned after every backup; the clean command can be used. A sketch of detecting such hard links follows.
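
For reference, a minimal sketch of detecting such hard links by grouping paths under shadow by (device, inode); this is an illustration only, not the clickhouse-backup implementation, and it assumes a Linux filesystem and the default data path:

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

type fileID struct {
	dev uint64
	ino uint64
}

func main() {
	shadow := "/var/lib/clickhouse/shadow" // assumed default data path
	seen := map[fileID]string{}            // first path seen for each inode

	filepath.Walk(shadow, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		st, ok := info.Sys().(*syscall.Stat_t)
		if !ok {
			return nil
		}
		id := fileID{dev: uint64(st.Dev), ino: uint64(st.Ino)}
		if first, dup := seen[id]; dup {
			// Same inode as an earlier path: upload/restore it once,
			// or recreate it as a hard link on the destination.
			fmt.Printf("%s is a hard link of %s\n", path, first)
		} else {
			seen[id] = path
		}
		return nil
	})
}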

Crash when trying to backup an empty database

The tool crashes on this version if ClickHouse is empty:

root@clickhouse00:/tmp# clickhouse-backup create
2019/04/22 03:49:40 Create backup '2019-04-22T03-49-40'
2019/04/22 03:49:40 There are no tables in Clickhouse, create something to freeze.
2019/04/22 03:49:40 Copy metadata
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x9ce303]

goroutine 1 [running]:
main.copyPath.func1(0xc000632080, 0x10, 0x0, 0x0, 0xc2a680, 0xc00009a8a0, 0x30, 0xad74c0)
	/home/travis/gopath/src/github.com/AlexAkulov/clickhouse-backup/utils.go:101 +0x413
path/filepath.Walk(0xc000632080, 0x10, 0xc00009a870, 0xc000022390, 0x2b)
	/home/travis/.gimme/versions/go1.11.4.linux.amd64/src/path/filepath/path.go:402 +0x6a
main.copyPath(0xc000632080, 0x10, 0xc000022390, 0x2b, 0x0, 0xa, 0x2328)
	/home/travis/gopath/src/github.com/AlexAkulov/clickhouse-backup/utils.go:85 +0x91
main.createBackup(0xc000024f58, 0x7, 0x0, 0x0, 0xc000024f80, 0xa, 0x2328, 0xc000024fb8, 0x7, 0xc000065fa0, ...)
	/home/travis/gopath/src/github.com/AlexAkulov/clickhouse-backup/main.go:466 +0x3db
main.main.func6(0xc0000aa9a0, 0x0, 0xc0000aa9a0)
	/home/travis/gopath/src/github.com/AlexAkulov/clickhouse-backup/main.go:107 +0x1a4
github.com/urfave/cli.HandleAction(0xa4b6c0, 0xb62768, 0xc0000aa9a0, 0xc000078a00, 0x0)
	/home/travis/gopath/pkg/mod/github.com/urfave/[email protected]/app.go:490 +0xc8
github.com/urfave/cli.Command.Run(0xb3dca0, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0xb54c60, 0x2b, 0xb5baa8, ...)
	/home/travis/gopath/pkg/mod/github.com/urfave/[email protected]/command.go:210 +0x9a2
github.com/urfave/cli.(*App).Run(0xc0001651e0, 0xc00000c060, 0x2, 0x2, 0x0, 0x0)
	/home/travis/gopath/pkg/mod/github.com/urfave/[email protected]/app.go:255 +0x687
main.main()
	/home/travis/gopath/src/github.com/AlexAkulov/clickhouse-backup/main.go:184 +0xa9c

Backup tool is not able to connect to CH server: wrong credentials issue

Hello,

This is a weird issue, but clickhouse-backup can't connect to the ClickHouse server. I tried passing credentials via environment variables and via the config file, but no luck.

I have a ClickHouse server running on port 9001 with the default user and some password, let's say 123. I can connect via clickhouse-client without any problems:

$ clickhouse-client --port 9001 -u default --password 123
ClickHouse client version 19.13.1.11 (official build).
Connecting to localhost:9001 as user default.
Connected to ClickHouse server version 19.13.1 revision 54425.

clickhouse :) exit
Bye.

But the connection fails when I use clickhouse-backup:

$ export CLICKHOUSE_USERNAME=default
$ export CLICKHOUSE_PASSWORD=123
$ export CLICKHOUSE_PORT=9001
$ ./clickhouse-backup tables
2019/12/22 22:09:51 can't connect to clickouse with: code: 193, message: Wrong password for user default

Clickhouse-backup is the latest build:

./clickhouse-backup -v
Version:	 v0.5.1
Git Commit:	 5dc6234a1052c076de409666858bd4c1c6dbb48a
Build Date:	 2019-12-03

The behavior is the same if I pass the credentials and port via the config file.
What could be wrong? How can I debug this? Thanks.

Allow not to use credentials.json when running on GCE

I want to add the option of not using credentials.json with Google Cloud Storage (GCS) when the application runs on Google Compute Engine (GCE), because the default service account can be used instead. What do you think about it? A sketch of the idea follows.
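
A minimal sketch of the idea, using the standard GCS Go client: when no key file is given, the client library falls back to Application Default Credentials, which on GCE is the instance's default service account. The credentialsFile parameter below is illustrative, not the project's actual config key:

package main

import (
	"context"
	"log"

	"cloud.google.com/go/storage"
	"google.golang.org/api/option"
)

func newGCSClient(ctx context.Context, credentialsFile string) (*storage.Client, error) {
	if credentialsFile != "" {
		// Explicit key file, as today.
		return storage.NewClient(ctx, option.WithCredentialsFile(credentialsFile))
	}
	// No key file: the client library picks up Application Default
	// Credentials, i.e. the GCE instance's service account.
	return storage.NewClient(ctx)
}

func main() {
	ctx := context.Background()
	client, err := newGCSClient(ctx, "")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}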

Upload to s3 error

Hello,
Sometimes I get errors when uploading files to S3:

2019/06/19 08:45:01 Upload backup 'clickhouse-2019-06-19-1560933864'
2019/06/19 10:39:52 can't upload with shadow/***/***/***/***.bin: copying contents: Put https://***-backup-clickhouse-us.s3.amazonaws.com/clickhouse_backups_incremental/***-clickhouse-us1/clickhouse-2019-06-19-1560933864.tar.lz4?partNumber=760&uploadId=***: net/http: timeout awaiting response headers
2019/06/18 06:05:39 Upload backup '2019-06-18T06-05-19'
2019/06/18 06:35:35 can't upload with /shadow/***/***/***/***/***/***.bin: copying contents: Put https://***-backup-clickhouse-us.s3.amazonaws.com/clickhouse_backups/***-clickhouse-us1/2019-06-18T06-05-19.tar.lz4?partNumber=209&uploadId=***: dial tcp ***:443: i/o timeout

The upload exits with status 1 after this.
Does the app retry a part a few times, or does it stop after the first failed attempt? (A sketch of configuring retries in the AWS SDK is shown below.)

Regards,
Andrii
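
A minimal sketch of tuning retries, assuming the aws-sdk-go v1 client that the logs suggest: the SDK retries transient errors itself, and the retry count and HTTP timeout can be raised when the session is built. None of these values reflect clickhouse-backup's actual configuration keys:

package main

import (
	"net/http"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func newUploader() (*s3manager.Uploader, error) {
	sess, err := session.NewSession(&aws.Config{
		Region:     aws.String("us-east-1"),
		MaxRetries: aws.Int(5), // retry transient part failures a few times
		HTTPClient: &http.Client{Timeout: 10 * time.Minute},
	})
	if err != nil {
		return nil, err
	}
	// The uploader splits large files into parts and uploads them
	// concurrently; failed requests are retried up to MaxRetries.
	return s3manager.NewUploader(sess), nil
}

func main() {
	if _, err := newUploader(); err != nil {
		panic(err)
	}
}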

memory leak while upload backup to s3

RSS is 4.1 GB (4253984 kB); the repo size is ~150 GB.

c000000000-c0fc000000 rw-p 00000000 00:00 0
Size: 4128768 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 3994276 kB
Pss: 3994276 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 202380 kB
Private_Dirty: 3791896 kB
Referenced: 3862512 kB
Anonymous: 3994276 kB
LazyFree: 202376 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac

Version: v0.5.0
Git Commit: 0655043
Build Date: 2019-10-27

Problem with restore

I am trying to restore:

2019/04/10 10:38:33 Attach partitions for xxxx increment 175:
2019/04/10 10:38:33 ALTER TABLE xxxx ATTACH PARTITION 201901
2019/04/10 10:38:33 can't attach partitions for table xxx.xxx with code: 228, message: /var/lib/clickhouse/data/xxx/xxxx/detached/20190101_20190118_9742_9747_1/date.bin has unexpected size: 62464 instead of 1408782

Do not delete files from s3 if they were not found locally

Currently, if some files are missing from the local folder, they are deleted from S3 when clickhouse-backup upload runs.

Consider the case where only the partitions for the last few days are frozen (with a daily partition key), but the files for the previous days must be kept on S3 and not deleted.

backup does not work

./clickhouse-backup -config config.yml create
2019/06/06 16:11:51 Create backup '2019-06-06T13-11-51'
2019/06/06 16:11:51 Freeze 'pf.blogger_stat'
2019/06/06 16:11:51 partition '201905'
2019/06/06 16:11:51 partition '201906'
2019/06/06 16:11:51 Freeze 'pf.migration'
2019/06/06 16:11:51 partition '197001'
2019/06/06 16:11:52 can't freeze partition '197001' on 'pf.migration' with: code: 368, message: std::bad_typeid

root@pclickhouse:/lib65/clickhouse-backup# clickhouse-server -V
ClickHouse server version 19.5.3.8 (official build).

From the console it works:

localhost :) ALTER TABLE pf.migration FREEZE PARTITION ''

ALTER TABLE pf.migration
FREEZE PARTITION ''

Ok.

0 rows in set. Elapsed: 0.091 sec.

Backup Duration

I'm curious how AWS performs for you.
It takes me half a day for every TB of data.
Am I missing something?

Error backup with table '.inner.test_table'

Hello!
Clickhouse version 19.1.8

When creating a backup, the request failed:

can't freeze partition '0590dc88487514d92e90360839939c5f' on 'default..inner.test_table' with: code: 62, message: Syntax error: failed at position 21: .inner.test_table FREEZE PARTITION ID '0590dc88487514d92e90360839939c5f';. Expected identifier

It may be worth quoting the table name with backticks? A sketch is shown below.
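
A minimal sketch of quoting identifiers with backticks when building the FREEZE query, so names such as .inner.test_table parse correctly; the helper names are illustrative, not the project's actual functions:

package main

import (
	"fmt"
	"strings"
)

// quote wraps an identifier in backticks and escapes embedded backticks.
func quote(ident string) string {
	return "`" + strings.ReplaceAll(ident, "`", "\\`") + "`"
}

func freezeQuery(database, table, partitionID string) string {
	return fmt.Sprintf("ALTER TABLE %s.%s FREEZE PARTITION ID '%s'",
		quote(database), quote(table), partitionID)
}

func main() {
	// Prints: ALTER TABLE `default`.`.inner.test_table` FREEZE PARTITION ID '0590dc88487514d92e90360839939c5f'
	fmt.Println(freezeQuery("default", ".inner.test_table", "0590dc88487514d92e90360839939c5f"))
}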

can't get Clickhouse tables with: code: 47, message: Missing columns: 'data_path'

ClickHouse version 19.15.3.6.

When I try to create a backup, I get this error:

can't get Clickhouse tables with: code: 47, message: Missing columns: 'data_path' while processing query: 'SELECT database, name, is_temporary, data_path, metadata_path FROM system.tables WHERE (data_path != '') AND (is_
temporary = 0) AND (engine LIKE '%MergeTree')', required columns: 'data_path' 'is_temporary' 'engine' 'database' 'name' 'metadata_path', source columns: 'primary_key' 'storage_policy' 'sorting_key' 'data_paths' 'partition_key' 'engine_full'
'is_temporary' 'database' 'sampling_key' 'create_table_query' 'engine' 'dependencies_table' 'name' 'metadata_path' 'metadata_modification_time' 'dependencies_database'
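
A minimal sketch of the apparent incompatibility: the "source columns" list in the error shows that this ClickHouse version exposes data_paths (an array) in system.tables instead of the old data_path column, so the query has to be version-aware. The column choice below is inferred from the error message, not taken from the project's code:

package main

import "fmt"

func tablesQuery(hasDataPaths bool) string {
	if hasDataPaths {
		// Newer servers: take the first element of the data_paths array.
		return "SELECT database, name, is_temporary, data_paths[1] AS data_path, metadata_path " +
			"FROM system.tables WHERE is_temporary = 0 AND engine LIKE '%MergeTree'"
	}
	// Older servers keep the scalar data_path column.
	return "SELECT database, name, is_temporary, data_path, metadata_path " +
		"FROM system.tables WHERE data_path != '' AND is_temporary = 0 AND engine LIKE '%MergeTree'"
}

func main() {
	fmt.Println(tablesQuery(true))
}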

only metadata is getting backed up, no hard links to data in the shadow folder

I tried backing up a Log table:
./clickhouse-backup --config config.yml create --tables=default.test_Log

Output
2019/06/01 00:54:05 Create backup '2019-06-01T04-54-05'
2019/06/01 00:54:05 Freeze 'default.test_Log'
2019/06/01 00:54:05 Copy metadata
2019/06/01 00:54:05 Done.
2019/06/01 00:54:05 Move shadow
2019/06/01 00:54:05 Done.

However, only the metadata was backed up under /var/lib/clickhouse/backup. Where are the contents of shadow being moved to?

Upload to S3 from Yandex Cloud

Hi!
Tell me, should backup upload to S3 on Yandex Cloud work? I get an error:
can't upload with 403: "There were headers present in the request which were not signed"

Error while creating backups of tables of type MergeTree

I've been trying to use this tool to test backups of tables of different types:
./clickhouse-backup --config config.yml create --tables=default.test_mergeTree back4

I get this error:
2019/06/01 00:46:43 Create backup 'back4'
2019/06/01 00:46:43 Freeze 'default.test_mergeTree'
2019/06/01 00:46:43 partition '201906'
2019/06/01 00:46:43 can't freeze partition '201906' on 'default.test_mergeTree' with: code: 368, message: std::bad_typeid

I've been following the readme closely and am not sure what might be causing this!

"restore-data" seems not doing its work completely

I have a table like:

CREATE TABLE "%table_name%" (
   date date,
   datetime Datetime,
   %some_fields%
) ENGINE = MergeTree()
PARTITION BY (date)
ORDER BY (date, ...)
PRIMARY KEY (date, ...)

The table contains data for several dates, in my case from 2019-08-24 to 2019-09-18.

Then I back it up and restore it to check that everything works properly, like so:

~$ sudo -u clickhouse ./clickhouse-backup create --tables=%my_database%.* test

2019/09/19 08:16:09 Create backup 'test'
2019/09/19 08:16:09 Freeze `%my_database%`.`%table_name%`
2019/09/19 08:16:09 Copy metadata
2019/09/19 08:16:09   Done.
2019/09/19 08:16:09 Move shadow
2019/09/19 08:16:09   Done.

~$ clickhouse-client -u %user% --password=%pass% -d %my_database% -q "DROP DATABASE %my_database%"

BANG!

~$ sudo -u clickhouse ./clickhouse-backup restore-schema --tables=%my_database%.* test

2019/09/19 08:16:57 Create table `%my_database%`.`%table_name%`

~$ sudo -u clickhouse ./clickhouse-backup restore-data --tables=%my_database%.* test

2019/09/19 08:17:28 Prepare data for restoring `%my_database%`.`%table_name%`
2019/09/19 08:17:28 ALTER TABLE `%my_database%`.`%table_name%` ATTACH PARTITION 201908
2019/09/19 08:17:28 ALTER TABLE `%my_database%`.`%table_name%` ATTACH PARTITION 201909
2019/09/19 08:17:28 ALTER TABLE `%my_database%`.`%table_name%` ATTACH PARTITION ID '20190918'

Finally, after restore-data I only have data for the last date.
I could execute some additional queries to finish the restore, but I think I am doing something wrong:

ALTER TABLE `%my_database%`.`%table_name%` ATTACH PARTITION ID '20190917'
ALTER TABLE `%my_database%`.`%table_name%` ATTACH PARTITION ID '20190916'
-- And so on...

Can you help me figure out what goes wrong in my example?

Support incremental backups

Hi,
Does clickhouse-backup support incremental backups? I mean: when I run it the first time I expect it to send a full dump, but when I run it a second time a couple of hours later I'd expect the tool to send only the new blocks. I saw mentions of increments in the code, but I wanted to double-check that the tool really works as I described.

Thanks,
Paweł

Expose default database name in config

My ClickHouse setup does not contain a database named 'default',
and I cannot create a backup: I get the error 'can't connect to clickouse with: code: 81, message: Database default doesn't exist'.
The config file does not have a database option.
Am I missing something?

Error while creating backups of tables with DISTRIBUTED engine

I have observed an issue when creating a backup of a table that uses the Distributed engine:

$ sudo docker run --rm -it --network host -v "/mnt/clickhouse:/var/lib/clickhouse" -e S3_BUCKET=test   alexakulov/clickhouse-backup create db_layer_0.sikandar_replicated
2019/09/09 16:35:04 Create backup 'db_layer_0.sikandar_replicated'
2019/09/09 16:35:04 Freeze 'db_layer_0.sikandar_replicated'
2019/09/09 16:35:04 Freeze 'db_layer_1.sikandar_replicated'
2019/09/09 16:35:04 Freeze 'default.sikandar'
2019/09/09 16:35:04 can't freeze 'default.sikandar' with: code: 48, message: Partition operations are not supported by storage Distributed

It looks like the issue is related to the fact that a Distributed table does not hold data itself, and thus doesn't have any partitions.

CREATE TABLE sikandar
(
    sikandarDay Date, 
    sikandarId String, 
    sikandarAge UInt32
)
ENGINE = Distributed(level0, db_layer_0, sikandar_replicated, sipHash64(sikandarId));

The way freeze currently works in the code (https://github.com/AlexAkulov/clickhouse-backup/blob/5d3a0d0196d58eb00cad915738795a120554f1ed/clickhouse.go#L178) fails:

d40c477505fb :) ALTER TABLE default.sikandar FREEZE

ALTER TABLE default.sikandar
    FREEZE


Received exception from server (version 19.4.1):
Code: 48. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Partition operations are not supported by storage Distributed. 

Is backing up Distributed tables supported now, or have I missed something?

BR,
Aleksandr

Can't restore from backup

The metadata was restored. After that, I try to restore the data:

sudo ./clickhouse-backup -c ./config.yml restore-data $BACKUP_NAME

2019/05/03 15:16:55 Prepare data for restoring 'analytics.events'
2019/05/03 15:16:55 ALTER TABLE analytics.events ATTACH PARTITION ID '201905'
2019/05/03 15:16:55 can't attach partitions for table 'analytics.events' with code: 226, message: No columns in part 201905_1_1128_227

Inherit global options

The CLI should inherit global options. It is unintuitive to have to prefix each command with global options rather than being able to use them anywhere.

download failed on fresh clickhouse

Hello,

I am trying to restore a backup on a fresh ClickHouse install with v0.4.1:

clickhouse-backup download backup1-1
mkdir /var/lib/clickhouse/backup/backup1-1: no such file or directory

It looks like the backup folder is only created by the create command.

can't create backup with mkdir /mnt: read-only file system

Our ClickHouse data path is /mnt/data1/clickhouse, but running
./clickhouse_backup --config ./config.yml create -t table table_20191216.bin
gives:

2019/12/16 16:16:43 can't create backup with mkdir /mnt: read-only file system

How do I set the correct backup path in the config file?

Thanks

Files are reuploaded if there are more than 1000 objects

The table has 6,500 files (objects) for the entire period.
The s3.overwrite_strategy option in the config is set to etag.

If there are more than 1000 objects in the S3 bucket, running clickhouse-backup upload re-uploads the same files again, except for the first 1000.

We use Ceph RadosGW as the S3 backend (maybe the problem lies there).
When the upload starts, the HTTP request log shows two GET requests:

[19/Mar/2019:07:17:38 +0000] "GET /?list-type=2&max-keys=1000&prefix=<prefix> HTTP/2.0" 200
GET /?list-type=2&max-keys=1000&prefix=<prefix> HTTP/1.1
Host: <host>
User-Agent: aws-sdk-go/1.15.58 (go1.12; linux; amd64)
Authorization: AWS4-HMAC-SHA256 Credential=.../default/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=<signature>
X-Amz-Content-Sha256: <content>
X-Amz-Date: 20190318T155745Z
Accept-Encoding: gzip

[19/Mar/2019:07:17:38 +0000] "GET /?list-type=2&max-keys=1000&prefix=d<prefix> HTTP/2.0" 200
GET /?list-type=2&max-keys=1000&prefix=<prefix> HTTP/1.1
Host: <host>
User-Agent: aws-sdk-go/1.15.58 (go1.12; linux; amd64)
Authorization: AWS4-HMAC-SHA256 Credential=.../default/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=<signature>
X-Amz-Content-Sha256: <content>
X-Amz-Date: 20190318T155745Z
Accept-Encoding: gzip

The XML response to the first request (only part of the directives shown):

<MaxKeys>1000</MaxKeys><IsTruncated>true</IsTruncated><Marker></Marker><NextMarker>....</NextMarker>
...

The second request returns the same objects in the body, but without NextMarker:

<MaxKeys>1000</MaxKeys><IsTruncated>true</IsTruncated><Marker></Marker>
...

When I change max-keys to 7000 in the code, the files are not re-uploaded. (A pagination sketch is shown below.)
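
A minimal sketch of paginating the S3 listing with aws-sdk-go v1 instead of relying on a single 1000-key page: ListObjectsV2Pages follows the continuation token automatically. Bucket and prefix values are placeholders, and this is not the project's actual listing code:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	svc := s3.New(sess)

	input := &s3.ListObjectsV2Input{
		Bucket: aws.String("my-backup-bucket"),    // placeholder
		Prefix: aws.String("clickhouse_backups/"), // placeholder
	}
	err := svc.ListObjectsV2Pages(input, func(page *s3.ListObjectsV2Output, lastPage bool) bool {
		for _, obj := range page.Contents {
			// Compare ETags for every object, not just the first 1000 keys.
			fmt.Println(*obj.Key, *obj.ETag)
		}
		return true // keep fetching pages until the listing is exhausted
	})
	if err != nil {
		panic(err)
	}
}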

Dockerfile and docker image on dockerhub

Hello,
could you add a Dockerfile and publish a Docker image on Docker Hub? Something like this:

# Build container
FROM golang:1.12.1-stretch

RUN git clone https://github.com/AlexAkulov/clickhouse-backup.git /go/src/clickhouse-backup
WORKDIR /go/src/clickhouse-backup
RUN go get
RUN go build -o /clickhouse-backup .

# Run container
FROM ubuntu:18.04

RUN apt-get update
RUN apt-get install -yqq  \
    ca-certificates \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*
COPY --from=0 /clickhouse-backup /usr/local/bin/
RUN chmod +x /usr/local/bin/clickhouse-backup
ENTRYPOINT [ "clickhouse-backup" ]
CMD [ "--help" ]

Empty data when creating a backup

Starting clickhouse-backup
Attaching to clickhouse-backup
clickhouse-backup | 2019/11/14 13:43:23 Create backup '1223456789'
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.act
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.events
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.notifies
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.redirect
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.url
clickhouse-backup | 2019/11/14 13:43:23 Freeze b.actual
clickhouse-backup | 2019/11/14 13:43:23 Freeze m.events
clickhouse-backup | 2019/11/14 13:43:23 Freeze m.notifies
clickhouse-backup | 2019/11/14 13:43:23 Freeze m.op
clickhouse-backup | 2019/11/14 13:43:23 Freeze m.redirects
clickhouse-backup | 2019/11/14 13:43:23 Freeze m.url
clickhouse-backup | 2019/11/14 13:43:23 Copy metadata
clickhouse-backup | 2019/11/14 13:43:23 Done.
clickhouse-backup | 2019/11/14 13:43:23 Move shadow
clickhouse-backup | 2019/11/14 13:43:23 Done.

clickhouse-backup connects to the database and sees the tables (as the log shows),
but afterwards the backup and shadow folders end up empty, with no data at all,
even though the tables from the log do contain data.
There is also data in the /var/lib/clickhouse/shadow and /var/lib/clickhouse/metadata directories.

I can't figure out what the problem is.

simplify cmd line

It would be nice to simplify the command line to allow backup or restore in a single command,
for example:

  • backup (create + upload)
  • restore (download + restore-schema + restore-data)
    Alternatively, allow multiple commands per invocation.

[QUESTION/BUG] Freeze query fails due to a syntax error; CH version is 19.5.3

Hi!

I have ClickHouse version 19.5.3, and the following query fails:

ALTER TABLE db_test.test_table FREEZE PARTITION ID '201905'

Error output:

Code: 368. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: std::bad_typeid.

So basically, if I remove ID from the query, everything works fine.

Is this issue caused by the clickhouse-server version?

UPDATE:

Also, I've checked the official docs and a few links, and indeed the ID keyword is omitted there.

Download problem

root@stats:~# clickhouse-backup list
Local backups:
2019-04-08T18-23-33
Backups on S3:
2019-04-08T18-12-05.tar
root@stats:~# clickhouse-backup download
Select backup for download:
2019-04-08T18-12-05.tar
root@stats:~# clickhouse-backup download 2019-04-08T18-12-05.tar
2019/04/08 21:36:24 404: "The specified key does not exist."

Why so? :(

Docker problem

Hello.
The tool in the official Docker image does not work. This happens because it was built in another environment.

docker exec -ti clickhouse-backup sh
/ # clickhouse-backup create
2019/07/22 11:50:15 Create backup '2019-07-22T11-50-15'
2019/07/22 11:50:15 can't connect to clickouse with: could not load time location: open /home/travis/.gimme/versions/go1.11.4.linux.amd64/lib/time/zoneinfo.zip: no such file or directory

FREEZE WITH NAME

ClickHouse supports freezing a table with a given name:

ALTER TABLE cdp_tags FREEZE WITH NAME 'abc';

...
2019.11.30 09:47:21.941618 [ 608 ] {a6445f03-179f-4970-bb3c-a57a65af3377} <Debug> default.cdp_tags: Freezing part f6b4ea42fb9c719593aebbee00e5526f_2_835_20_862 snapshot will be placed at /var/lib/clickhouse/shadow/abc/

pkg/chbackup/backup.go Freeze(config Config, tablePattern string) could be improved so it does not require the shadow directory to be empty. A sketch is shown below.
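
A minimal sketch of the proposal: freeze each table under a per-backup name so the snapshot lands in shadow/<name>/ and the shadow directory never needs to be empty. The helper below only builds the query and is not the signature used in pkg/chbackup/backup.go:

package main

import "fmt"

func freezeWithName(database, table, backupName string) string {
	return fmt.Sprintf("ALTER TABLE `%s`.`%s` FREEZE WITH NAME '%s'",
		database, table, backupName)
}

func main() {
	// The snapshot would then be placed under /var/lib/clickhouse/shadow/2019-11-30T09-47-00/
	fmt.Println(freezeWithName("default", "cdp_tags", "2019-11-30T09-47-00"))
}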

Broken "region" variable

Hello,
I got this error in the latest release:
can't upload with 400:"The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-west-1'
Config:
...
s3:
region: "eu-west-1"
...

Can't restore from backup

Hi!

I can't restore a table (ReplicatedReplacingMergeTree) from a backup.

clickhouse-backup restore-data -t marketing.a_events "2019-07-31T16-27-49"

CLI:

Prepare data for restoring 'marketing.a_events'
ALTER TABLE marketing.a_events ATTACH PARTITION 200812
can't attach partitions for table 'marketing.a_events' with code: 33, message: Cannot read all data. Bytes read: 27. Bytes expected: 74.

ClickHouse logs:

  1. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x22) [0x781c272]
  2. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, int)+0x22) [0x3a0a3e2]
  3. /usr/bin/clickhouse-server(DB::ReadBuffer::readStrict(char*, unsigned long)+0x181) [0x3a19f71]
  4. /usr/bin/clickhouse-server(DB::DataTypeString::deserializeBinary(DB::IColumn&, DB::ReadBuffer&) const+0x194) [0x66e9334]
  5. /usr/bin/clickhouse-server(DB::MergeTreeDataPart::loadIndex()+0x22c) [0x6a5ffec]
  6. /usr/bin/clickhouse-server(DB::MergeTreeDataPart::loadColumnsChecksumsIndexes(bool, bool)+0x58) [0x6a612d8]
  7. /usr/bin/clickhouse-server(DB::MergeTreeData::loadPartAndFixMetadata(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&)+0x172) [0x6a3b432]
  8. /usr/bin/clickhouse-server(DB::StorageReplicatedMergeTree::attachPartition(std::shared_ptrDB::IAST const&, bool, DB::Context const&)+0x237) [0x69c6ac7]
  9. /usr/bin/clickhouse-server(DB::StorageReplicatedMergeTree::alterPartition(std::shared_ptrDB::IAST const&, DB::PartitionCommands const&, DB::Context const&)+0x193) [0x69c93a3]
  10. /usr/bin/clickhouse-server(DB::InterpreterAlterQuery::execute()+0x578) [0x6dc1b18]
  11. /usr/bin/clickhouse-server() [0x688ca65]
  12. /usr/bin/clickhouse-server(DB::executeQuery(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool)+0x74) [0x688e6d4]
  13. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x830) [0x3a15c20]
  14. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x3a1627b]
  15. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x725666f]
  16. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0xe9) [0x7256da9]
  17. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x81) [0x7927e41]
  18. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x7924248]
  19. /usr/bin/clickhouse-server() [0xb2ac5bf]
  20. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f2dc4f9a6db]
  21. /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f2dc472188f]

Exclude tables

Hello,
It would be nice to add an --exclude option to back up all tables except the specified ones. Is it possible to add this?

Example
clickhouse-backup create --exclude=<db>.<table1>,<db>.<table2>

Regards,
Andrii

Ignore schema

When I ignore some tables, it seems to work. But then, when I try restore-schema, it still tries to create them.
ch: 19.3.8
CLICKHOUSE_SKIP_TABLES=system.,default.test1,default..inner.

"driver: bad connection " Error during table freeze on table with big amount of partitions

As I understand it, after upgrading ClickHouse to version 19.11.12.69, clickhouse-backup starts using the FreezeTable method instead of FreezeTableOldWay, which issues ALTER TABLE %v.%v FREEZE (instead of FREEZE PARTITION).
So when I try to freeze a table with a large number of partitions, the operation takes more time.
But after 3 minutes clickhouse-backup fails with the error:
"2020/01/24 16:39:46 can't freeze mydb.some_table_local with: driver: bad connection"
This looks like a timeout in the clickhouse-go driver?
Maybe we need an option like a connection timeout, or the ability to choose the freezing method (old way or new)? A sketch of raising the driver timeout is shown below.
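
A minimal sketch of raising the driver timeout, assuming the clickhouse-go (v1) driver: its DSN accepts read_timeout/write_timeout parameters in seconds, so a long-running ALTER TABLE ... FREEZE would not be cut off by the default. The 600-second value is only an example, not a recommendation:

package main

import (
	"database/sql"
	"log"

	_ "github.com/ClickHouse/clickhouse-go" // registers the "clickhouse" driver
)

func main() {
	dsn := "tcp://127.0.0.1:9000?username=default&read_timeout=600&write_timeout=600"
	db, err := sql.Open("clickhouse", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec("ALTER TABLE mydb.some_table_local FREEZE"); err != nil {
		log.Fatal(err)
	}
}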

Backup clarification

Hello,
I have a few questions about backups.
Previously I used v0.2.0 with the archive strategy and full backups.
Today I tried v0.3.4.

  1. Is there a command to delete old backups from clickhouse_folder/backup?
  2. Is there a command to delete old backups from S3?
  3. I used the command "clickhouse-backup upload backup_name" (v0.3.4). This command deleted all my old backups from S3. WTF?
  4. Do I need to keep old backups locally to use incremental backup (--diff-from=<old_backup_name>)?
  5. v0.3.2 DEPRECATIONS: the 'dry-run' flag and 'archive' strategy were marked as deprecated.
    Why was the 'archive' strategy marked as deprecated?

Regards,
Andrii.

Can't backup clickhouse db which in kubernetes

I am trying to back up a database that runs in a Docker container from the official Yandex image; this container runs in a Kubernetes cluster. I made a separate pod with the alexakulov/clickhouse-backup image and tried to make a backup. I got this error when I started the utility:

2019/09/26 11:08:49 envconfig.Process: assigning CLICKHOUSE_CLICKHOUSE_PORT to Port: converting 'tcp://192.168.186.165:9000' to type uint. details: strconv.ParseUint: parsing "tcp://192.168.186.165:9000": invalid syntax

My config:
clickhouse:
  username: default
  password: ""
  host: clickhouse #this is address of host with clickhouse server db
  port: 9000
  data_path: ""
  skip_tables:
  - system.*
s3:
  access_key: ""
  secret_key: ""
  bucket: ""
  endpoint: ""
  region: us-east-1
  acl: private
  force_path_style: false
  path: ""
  disable_ssl: false
  disable_progress_bar: false
  part_size: 104857600
  strategy: ""
  backups_to_keep_local: 0
  backups_to_keep_s3: 0
  compression_level: 1
  compression_format: lz4

I run this command:
clickhouse-backup create my_backup
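
A guess at the cause, sketched under the assumption that this is standard Kubernetes behaviour: for a Service named clickhouse, Kubernetes injects an environment variable CLICKHOUSE_PORT of the form tcp://<ip>:<port> into the pod, which collides with the tool's CLICKHOUSE_PORT setting and fails the uint conversion shown above. Tolerating that URL form would look roughly like this (not the project's actual parsing code):

package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// parsePort accepts either a plain port ("9000") or a Kubernetes-style
// service URL ("tcp://192.168.186.165:9000") and returns the port number.
func parsePort(raw string) (uint, error) {
	if u, err := url.Parse(raw); err == nil && u.Port() != "" {
		raw = u.Port()
	}
	p, err := strconv.ParseUint(raw, 10, 16)
	return uint(p), err
}

func main() {
	p, err := parsePort("tcp://192.168.186.165:9000")
	fmt.Println(p, err) // 9000 <nil>
}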
