Comments (11)
@davidvossel - I think Jon Brassow wrote that, but I don't recall the specifics of the implementation.
from resource-agents.
Doesn't "vgreduce --removemissing" seem quite a brutal way to work with a volume group? I would normally expect to see it only as an emergency, "get me out of trouble" recovery from a broken disk. I would never expect to see it automated as part of routine cluster management! Am I missing something?
Yeah, I completely agree with you. This is not something I'd expect the agent to be doing behind the scenes on me at all.
When I first saw the Cleanup messages in my cluster logs I was "intrigued", especially since it took 7 minutes - 7 minutes during which my lvm/archives directory got multiple writes... so I found a variation (RHEL5) of the code above and my first thought was "how EVIL!". I was clinging to the hope that it wasn't doing what I thought it was doing, even though I had evidence that it did exactly that, and that there had to be a really good explanation for why it is there.
Thanks. So what are the steps to getting this reviewed and replaced with code that would be less disruptive?
Well, I'm not sure there is code that is less disruptive that handles attempting to "clean up" that sort of failure. We need to have that discussion with the person who introduced the logic (Jon Brassow) to make sure though. Maybe you have some suggestions?
Perhaps the vgchange --partial option could be of use here.
I'm not sure I understand enough of the purpose of this code to give meaningful suggestions yet...
Does anyone know if Jon Brassow is still involved and likely to be reading these issue logs? If not, I'll try to give him a prod.
Thanks
I'll point him to this discussion.
The following commit removes the 'vgreduce --removemissing --force' command from 'vg_start_single' and replaces it with an LV-by-LV approach to activating the logical volumes. RAID LVs are handled differently from 'mirror' LVs, and non-redundant LVs that have failed devices cause the service not to start. I think this is exactly the behavior you are looking for.
518b65f
The 'CLVM' method (i.e. 'vg_start_clustered') should change to do one of the following:
- attempt repair one LV at a time, as above
- use 'vgreduce --removemissing --force --mirrorsonly'. This repairs only redundant LVs and leaves non-redundant LVs alone, so the service fails to start if LVs cannot be activated (rather than having PVs removed from under non-redundant LVs).
Both of these options will give you the behavior you are looking for and are more correct.
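For readers who want to picture the LV-by-LV idea, here is a rough sketch of the per-LV decision only. It is not the code from the commit: the function names (`choose_lv_action`, `plan_vg_start`) and the action labels are made up for illustration, and it assumes the `lv_attr` layout documented in the `lvs` man page (first character is the volume type, ninth character is `p` when the LV is partial, i.e. has a missing device). It only prints a plan and never touches a volume group.

```shell
#!/bin/sh
# Hypothetical sketch: decide what to do with one LV from its lv_attr string.
# Assumes the lvs(8) attr layout: char 1 = volume type, char 9 = health
# ('p' means partial, i.e. one or more PVs are missing).
choose_lv_action() {
    cla_attr=$1
    case "$cla_attr" in
        ????????p*) ;;                    # partial: fall through to type check
        *) echo "activate"; return 0 ;;   # healthy: plain activation
    esac
    case "$cla_attr" in
        [rR]*) echo "raid-degraded" ;;    # RAID LV with a failed device
        [mM]*) echo "mirror-repair" ;;    # 'mirror' LV with a failed device
        *)     echo "refuse"; return 1 ;; # non-redundant LV: do not start
    esac
}

# Hypothetical sketch: read "name attr" pairs on stdin, print one plan line
# per LV, and return non-zero if any LV must be refused.
plan_vg_start() {
    pvs_rc=0
    while read -r pvs_name pvs_attr; do
        [ -n "$pvs_name" ] || continue
        pvs_action=$(choose_lv_action "$pvs_attr") || pvs_rc=1
        echo "$pvs_name $pvs_action"
    done
    return "$pvs_rc"
}
```

On a real node the input could come from something like `lvs --noheadings -o lv_name,lv_attr <vg>` piped into `plan_vg_start`. The point of the design is that a missing device under a non-redundant LV stops the service from starting instead of silently dropping the PV from the volume group.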
Thanks Jon - I'll test that commit as soon as I get the chance. Thanks for your time on it!
@phedders all good I presume?