
Comments (10)

Patryk27 commented on May 21, 2024

Oh, that's nice - yeah, should be doable 🙂

Would you see retry & retry-interval as a global setting for all containers, or do you have a use case for specifying different retry options for different policies / containers / remotes?

from lxd-snapper.

aivanise commented on May 21, 2024

Whatever is easier to implement. I don't actually have a use case for separate settings per policy, as I don't see how exec(lxc) could fail differently depending on the policy. Maybe if snapshot removal at the ZFS level is slower depending on the snapshots around it, but that is a stretch.

Although... openzfs/zfs#11933, but still a stretch ;)
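
For anyone following along, a global retry with a fixed interval could look roughly like the sketch below. The `RetryOptions` struct and the `retry` helper are hypothetical names chosen for illustration; they are not lxd-snapper's actual code or configuration keys.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical retry options; the fields mirror the "retry" and
/// "retry-interval" idea from this thread, not real lxd-snapper settings.
#[derive(Clone, Copy, Debug)]
pub struct RetryOptions {
    /// Total number of attempts, including the first one.
    pub attempts: u32,
    /// Pause between two consecutive attempts.
    pub interval: Duration,
}

/// Calls `op` until it succeeds or the attempts are used up.
/// At least one attempt is always made; on failure the last error is returned.
pub fn retry<T, E>(opts: RetryOptions, mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: give up and surface the most recent error.
            Err(err) if attempt >= opts.attempts => return Err(err),
            // Otherwise wait for the configured interval and try again.
            Err(_) => {
                attempt += 1;
                sleep(opts.interval);
            }
        }
    }
}
```

Since the options are a plain value, the same helper would work whether they come from one global setting or from a per-policy override.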


aivanise commented on May 21, 2024

One more thing here, somewhat related: lxc can also get stuck and never exit, so it would be nice to have a timeout on exec calls to lxc. It happened to me just now, and since it was running in a systemd unit that was missing TimeoutStartSec, it was happily hanging in there "as a service" for two weeks until I realized there were no more snapshots ;)


Patryk27 commented on May 21, 2024

> it would be nice to have a timeout on exec calls to lxc.

Oh, this I can implement pretty quickly! 😄

Check out current master: I've just added lxc-timeout there (with a default of 10 minutes), which lets you specify the maximum waiting time for each invocation of lxc.
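
To illustrate the general idea of a per-invocation timeout, here is a minimal sketch using only the Rust standard library. The `run_with_timeout` function is a hypothetical helper written for this illustration; it is not how lxd-snapper implements lxc-timeout.

```rust
use std::io;
use std::process::{Command, ExitStatus};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Runs `cmd` and waits at most `timeout` for it to exit.
/// Returns `Ok(Some(status))` if it exited in time and `Ok(None)` if it was killed.
pub fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child = cmd.spawn()?;
    let deadline = Instant::now() + timeout;

    loop {
        // `try_wait` never blocks: it yields Some(status) once the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Ignore the error `kill` may report if the child exited in the meantime.
            let _ = child.kill();
            // Reap the process so it does not linger as a zombie.
            child.wait()?;
            return Ok(None);
        }
        // Poll at a coarse interval to avoid busy-waiting.
        sleep(Duration::from_millis(50));
    }
}
```

A command that finishes quickly returns as soon as the next poll notices it, so the timeout should only matter for invocations that are actually stuck.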


aivanise commented on May 21, 2024

I've tried it out, and it actually makes every call to lxc take the full lxc-timeout to complete instead of timing it out ;)

# stdbuf -i0 -o0 -e0 time /tmp/lxd-snapper -c /tmp/lxd-snapper.conf backup-and-prune | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
[2023-03-20 12:00:52] Backing-up
[2023-03-20 12:00:52] ----------
[2023-03-20 12:00:52]
[2023-03-20 12:00:53] AEE/aee-qc
[2023-03-20 12:00:53]   - creating snapshot: auto-20230320-110053 [ OK ]
[2023-03-20 12:00:53]
[2023-03-20 12:03:53]
[2023-03-20 12:03:53] Pruning
[2023-03-20 12:03:53] -------
[2023-03-20 12:03:53]
[2023-03-20 12:03:54] AEE/aee-qc
[2023-03-20 12:03:54]
^CCommand terminated by signal 2
0.29user 0.88system 4:26.23elapsed 0%CPU (0avgtext+0avgdata 28168maxresident)k
0inputs+0outputs (0major+20946minor)pagefaults 0swaps

# head -3 /tmp/lxd-snapper.conf
# this is yaml
lxc-timeout: 3 min
policies:


Patryk27 commented on May 21, 2024

Huh, that's pretty random - I've just re-checked on my machine and everything seems to be working as intended, i.e. the commands complete without any extra delay:

pwy@ubu:~/Projects/lxd-snapper$ stdbuf -i0 -o0 -e0 time ./target/release/lxd-snapper backup-and-prune | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
[2023-03-20 14:25:38] Backing-up
[2023-03-20 14:25:38] ----------
[2023-03-20 14:25:38] 
[2023-03-20 14:25:38] test
[2023-03-20 14:25:38]   - creating snapshot: auto-20230320-132538 [ OK ]
[2023-03-20 14:25:38] 
[2023-03-20 14:25:38] Backing-up summary
[2023-03-20 14:25:38] ------------------
[2023-03-20 14:25:38]   processed instances: 1
[2023-03-20 14:25:38]   created snapshots: 1
[2023-03-20 14:25:38] 
[2023-03-20 14:25:38] Pruning
[2023-03-20 14:25:38] -------
[2023-03-20 14:25:38] 
[2023-03-20 14:25:38] test
[2023-03-20 14:25:38]   - keeping snapshot: auto-20230320-132538
[2023-03-20 14:25:38]   - deleting snapshot: auto-20230320-132510 [ OK ]
[2023-03-20 14:25:38] 
[2023-03-20 14:25:38] Pruning summary
[2023-03-20 14:25:38] ---------------
[2023-03-20 14:25:38]   processed instances: 1
[2023-03-20 14:25:38]   deleted snapshots: 1
[2023-03-20 14:25:38]   kept snapshots: 1
0.14user 0.20system 0:00.50elapsed 68%CPU (0avgtext+0avgdata 27472maxresident)k
0inputs+0outputs (0major+50077minor)pagefaults 0swaps
pwy@ubu:~/Projects/lxd-snapper$ cat config.yaml 
lxc-timeout: 3 min

policies:
  every-instance:
    keep-last: 1
pwy@ubu:~/Projects/lxd-snapper$ 

Which OS and kernel are you using? 👀


aivanise commented on May 21, 2024

I'm on CentOS Stream 8, kernel 4.18.0-408.el8.x86_64.

Maybe you should add at least one more machine to be able to see it, as in my case the delays are between the machines, i.e. [ OK ] appears immediately, but it then waits for lxc-timeout before moving on to the next one.


Patryk27 commented on May 21, 2024

Yeah, I did check on multiple machines - even with a few different kernel versions (4.14, 4.9 & 5.4) 🤔

Would you mind checking this binary?

(it's lxd-snapper built via Nix, through nix build .#packages.x86_64-linux.default && cp ./result/bin/lxd-snapper . - that's to make sure the compiler or dynamic binaries aren't playing any tricks here 😄)


aivanise commented on May 21, 2024

Same result. I have noticed, however, that according to 'ps -e f' it spawns lxc list and hangs there for the duration of the timeout. An identical lxc list command issued on the command line returns within seconds. So it might be something else, not the timeout per se. The version I use that works is the last release (v1.3.0), so it might be something added to master after that.

1512271 pts/0    S+     0:00  |           \_ time /tmp/lxd-snapper -c /tmp/lxd-snapper.conf backup-and-prune
1512282 pts/0    S+     0:00  |           |   \_ /tmp/lxd-snapper -c /tmp/lxd-snapper.conf backup-and-prune
1512582 pts/0    Sl+    0:00  |           |       \_ lxc list local: --project=default --format=json
1512272 pts/0    S+     0:00  |           \_ awk { print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }
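
The thread does not say what caused this hang. One general pitfall that produces the same symptom (a child that returns quickly in a terminal but sits there until the timeout when spawned with a piped stdout) is waiting for the process to exit before reading its output: once the pipe buffer fills up, the child blocks on write. The sketch below shows one way to avoid that by draining stdout on a separate thread. It is an assumption about a common failure mode, not a diagnosis of lxd-snapper, and `output_with_timeout` is a hypothetical helper.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Captures the stdout of `cmd`, giving up after `timeout`.
/// Returns `Ok(Some(bytes))` on completion and `Ok(None)` if the child was killed.
pub fn output_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<Vec<u8>>> {
    let mut child = cmd.stdout(Stdio::piped()).spawn()?;
    let mut stdout = child.stdout.take().expect("stdout was configured as piped");

    // Drain stdout on its own thread so the child never blocks on a full pipe.
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut buf = Vec::new();
        let result = stdout.read_to_end(&mut buf).map(|_| buf);
        // The receiver may already be gone after a timeout; that is fine.
        let _ = tx.send(result);
    });

    match rx.recv_timeout(timeout) {
        // The reader saw end-of-file, so the child has closed its stdout.
        Ok(result) => {
            // Reap the child; this sketch assumes it exits right after closing stdout.
            child.wait()?;
            result.map(Some)
        }
        // No complete output within the timeout: kill and reap the child.
        Err(_) => {
            let _ = child.kill();
            child.wait()?;
            Ok(None)
        }
    }
}
```

With a handful of instances the JSON printed by lxc list may well fit into the pipe buffer, which would make such a problem show up only on hosts with many instances.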


Patryk27 commented on May 21, 2024

Okie, I've just prepared a different implementation - feel free to check out the current master branch if you find a minute 🙂

