Comments (5)
On Sun, Nov 18, 2012 at 11:04:07PM -0800, Keisuke MORI wrote:
> IPaddr2 may fail to assign an IPv6 address, leaving it in 'dadfailed' status, in an LVS configuration where the same IPv6 address is also assigned to lo.
> I consider this a blocker for the release.
> The steps to reproduce are below. Steps 4a and 4b are failing scenarios; step 4c is a successful scenario. (Thanks to @nozawatm for testing it.)
> Diagnosis:
> It fails when send_ua is called while the VIP is still in 'tentative' status (that is, while the kernel is still completing the assignment).
> The cause seems to be that send_ua sends pings while waiting for the tentative flag to disappear (i.e. for the assignment to finish), but in an LVS configuration the ping packet makes IPv6 DAD (duplicate address detection) fail.
The kernel should know where the ICMP packets are coming from.
Did you watch the network with tcpdump?
> (L317-L322 in IPaddr2.c, 17d9f6a#L0R317)
> One strange thing is that this problem was not observed on RHEL5.7 (send_ua succeeds even while the address is in tentative status). We are unsure whether the kernel behavior matters, but we have to fix it somehow anyway.
Neither does it happen with SLE11SP2, kernel 3.0.42. I guess it
does depend on the kernel.
> Solution:
> I think we should find another way to wait for the IPv6 address assignment to finish. We will probably remove the ping loop from send_ua and instead use the ip command in the RA to check the tentative flag.
I guess that that's also a possibility.
from resource-agents.
> The cause seems to be that send_ua sends pings while waiting for the tentative flag to disappear (i.e. for the assignment to finish), but in an LVS configuration the ping packet makes IPv6 DAD (duplicate address detection) fail.
> The kernel should know where the ICMP packets are coming from. Did you watch the network with tcpdump?
Yes (attached below). The ping packets themselves work as usual. The problem is that, in the failed scenario, the neighbor solicitation packet for DAD never appears on the wire. I suspect that the successful ping to lo makes the kernel think the address has already been assigned, and hence DAD fails.
> One strange thing is that this problem was not observed on RHEL5.7 (send_ua succeeds even while the address is in tentative status).
Let me correct this: RHEL5.7 did not work properly either. 'dadfailed' was not displayed on RHEL5.7, but the kernel log reports that it detected the address duplication, and the 'tentative' status remains. Connections with other nodes may work, but this is apparently not the right status.
> I think we should find another way to wait for the IPv6 address assignment to finish. We will probably remove the ping loop from send_ua and instead use the ip command in the RA to check the tentative flag.
> I guess that that's also a possibility.
@nozawatm and I are now testing a patch along these lines.
Please allow us one more day.
Regards,
Failed scenario. (4b.)
# tcpdump -i lo -n icmp6
10:16:54.490512 IP6 2001:db8:100::101 > 2001:db8:100::101: ICMP6, echo request, seq 0, length 64
10:16:54.490526 IP6 2001:db8:100::101 > 2001:db8:100::101: ICMP6, echo reply, seq 0, length 64
# tcpdump -i eth0 -n icmp6
10:16:54.491331 IP6 2001:db8:100::101 > ff02::1: ICMP6, neighbor advertisement, tgt is 2001:db8:100::101, length 32
(5 times)
Succeeded scenario. (4c.)
# tcpdump -i lo -n icmp6
10:18:06.904199 IP6 2001:db8:100::101 > 2001:db8:100::101: ICMP6, echo request, seq 0, length 64
10:18:06.904208 IP6 2001:db8:100::101 > 2001:db8:100::101: ICMP6, echo reply, seq 0, length 64
# tcpdump -i eth0 -n icmp6
10:18:04.534835 IP6 :: > ff02::1:ff00:101: ICMP6, neighbor solicitation, who has 2001:db8:100::101, length 24
10:18:06.904327 IP6 2001:db8:100::101 > ff02::1: ICMP6, neighbor advertisement, tgt is 2001:db8:100::101, length 32
(5 times)
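The difference between the two eth0 captures is whether the DAD probe, a neighbor solicitation sent from the unspecified address `::`, shows up at all. A minimal sketch of checking a saved capture for it follows; `has_dad_probe` and `report` are hypothetical helper names, not part of resource-agents, and the sample lines are copied from the captures above.

```shell
#!/bin/sh
# Hypothetical helper: reads `tcpdump -n icmp6` text on stdin and succeeds
# if a DAD probe (neighbor solicitation with source "::") is present.
has_dad_probe() {
    grep -q 'IP6 :: > .*ICMP6, neighbor solicitation'
}

# Hypothetical helper: prints a one-line verdict for a named capture.
report() {
    if printf '%s\n' "$2" | has_dad_probe; then
        echo "$1: DAD probe seen"
    else
        echo "$1: no DAD probe"
    fi
}

# eth0 lines from the failed (4b.) and succeeded (4c.) scenarios.
failed='10:16:54.491331 IP6 2001:db8:100::101 > ff02::1: ICMP6, neighbor advertisement, tgt is 2001:db8:100::101, length 32'
succeeded='10:18:04.534835 IP6 :: > ff02::1:ff00:101: ICMP6, neighbor solicitation, who has 2001:db8:100::101, length 24
10:18:06.904327 IP6 2001:db8:100::101 > ff02::1: ICMP6, neighbor advertisement, tgt is 2001:db8:100::101, length 32'

report failed "$failed"
report succeeded "$succeeded"
```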
On Tue, Nov 20, 2012 at 02:47:48AM -0800, Keisuke MORI wrote:
> @nozawatm and I are now testing a patch along these lines.
> Please allow us one more day.
Of course.
Closed as the patch is submitted in pull request #181.
An additional note:
According to our kernel experts, this issue is fixed by the commit below, so recent kernels should not be affected:
Looking at tentative status should be more reliable than pinging anyway.
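For illustration only, here is a rough sketch of what waiting on the tentative flag could look like in shell. It is not the patch from the pull request; `dad_state` and `wait_for_dad` are hypothetical names, and the sketch assumes the iproute2 `ip -6 -o addr show dev NIC to ADDR` form and its `tentative` / `dadfailed` flag names.

```shell
#!/bin/sh
# Hypothetical helper: classify the DAD state of an address from the text
# that `ip -6 -o addr show dev <nic> to <addr>` prints, read on stdin.
dad_state() {
    out=$(cat)
    case "$out" in
        "")          echo missing ;;    # address not present on the device
        *dadfailed*) echo dadfailed ;;  # kernel detected a duplicate
        *tentative*) echo tentative ;;  # DAD still in progress
        *)           echo ready ;;      # address is usable
    esac
}

# Hypothetical polling loop: check once per second, up to $3 times
# (default 5). Returns 0 once the address is ready, 1 on dadfailed,
# a missing address, or timeout.
wait_for_dad() {
    addr=$1
    nic=$2
    tries=${3:-5}
    while [ "$tries" -gt 0 ]; do
        state=$(ip -6 -o addr show dev "$nic" to "$addr" 2>/dev/null | dad_state)
        case "$state" in
            ready)             return 0 ;;
            dadfailed|missing) return 1 ;;
        esac
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

A caller might run `wait_for_dad 2001:db8:100::101 eth0 || echo "DAD did not complete"`. Unlike the ping loop, this only reads kernel state and sends no packets of its own.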