dtaht / babeld-hacking
This project forked from jech/babeld
The Babel routing daemon
Home Page: http://www.pps.univ-paris-diderot.fr/~jch/software/babel/
License: MIT License
The new method (which works in olsr) is via netlink. The old method (an ioctl) stopped working years ago.
Now I remember where I went wrong on rabeld. I tried to implement sse intrinsics throughout,
doing stuff like aligning
And I got nowhere for a variety of reasons, notably I couldn't keep stuff in xmm registers without
creating types for it everywhere, and you can't do a real memcmp. So I got xor to work (I should
rename these to memeqX), and found out that gcc 7 on x86_64 does the right thing when inlining memcmp(a,b,c) != 0,
but the real bottleneck is the memcmp for less than or greater than in the core find_route_slot routine.
What this does is return the first byte that differs, but you have to load it again to find out. And even this
is wrong because it checks for a null byte, apparently.
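#ifdef HAVE_SSE
#include <stdbool.h>
#include <smmintrin.h>
/*
inline size_t xor16 (const unsigned char *p1,
const unsigned char *p2)
{
return _mm_cmpistrc((const __m128i *)p1,(const __m128i *)p2,0);
}
*/
/* Despite the name, this is a compare: returns true when the two 16-byte values differ. */
static inline bool xor16(const unsigned char *a, const unsigned char *b) {
    __m128i xmm0, xmm1;
    unsigned int eax;
    /* Unaligned loads, so callers need not align their buffers. */
    xmm0 = _mm_loadu_si128((const __m128i *)a);
    xmm1 = _mm_loadu_si128((const __m128i *)b);
    xmm0 = _mm_cmpeq_epi8(xmm0, xmm1);             /* 0xff in every byte lane that matches */
    eax = (unsigned int)_mm_movemask_epi8(xmm0);   /* one bit per byte lane */
    return eax != 0xffff;                          /* 0xffff means all 16 bytes were equal */
}
#endif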
I have a backup default route with a high metric. Keeping that route around would probably be useful.
Going through each speaker and determining the number of routes that I'm installing from it as a measure
of goodness, and keeping its less good routes available in case the default route goes down, might be a way to get there.
We have 3 different problems. odhcpd inserts a metric, it uses static routes rather than its own table,
and it's just a mess. So I ended up always routing through another router to get here, because
that one injected static routes on itself.
root@edgerouterx:~# ip -6 route
default from 2001:558:6045:105:f039:c605:3f8b:8605 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto babel metric 1024 pref medium
default from 2601:646:8500:7100::/60 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto babel metric 1024 pref medium
default from 2603:3024:1536:8600:f6f2:6dff:feb6:a01d via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto babel metric 1024 pref medium
default from 2603:3024:1536:86f0::b25 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
default from 2603:3024:1536:86f0::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
default from 2603:3024:1536:86f4::/62 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
default from 2603:3024:1536:86f8::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
default from 2603:3024:1536:86f0::/60 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto babel metric 1024 pref medium
default from fd2d:2e5e:fe7e::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
default from fde5:dfb9:df90:fff0::/60 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto babel metric 1024 pref medium
2601:646:8500:7100::/60 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto babel metric 1024 pref medium
2603:3024:1536:86f0::/60 from 2603:3024:1536:86f0::b25 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/60 from 2603:3024:1536:86f0::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/60 from 2603:3024:1536:86f4::/62 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/60 from 2603:3024:1536:86f8::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/60 from fd2d:2e5e:fe7e::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/64 dev eth0.2 proto static metric 256 pref medium
2603:3024:1536:86f0::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f0::/64 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto babel metric 1024 pref medium
2603:3024:1536:86f4::/64 dev eth0.1 proto static metric 1024 pref medium
2603:3024:1536:86f5::/64 dev eth0.3 proto static metric 1024 pref medium
2603:3024:1536:86f7::/64 dev eth0.5 proto static metric 1024 pref medium
unreachable 2603:3024:1536:86f4::/62 dev lo proto 48 metric 1 error -148 pref medium
unreachable 2603:3024:1536:86f4::/62 dev lo proto static metric 2147483647 error -148 pref medium
2603:3024:1536:86f8::/62 from 2603:3024:1536:86f0::b25 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f8::/62 from 2603:3024:1536:86f0::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f8::/62 from 2603:3024:1536:86f4::/62 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f8::/62 from 2603:3024:1536:86f8::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f8::/62 from fd2d:2e5e:fe7e::/64 via fe80::e091:f5ff:febe:a353 dev eth0.2 proto static metric 512 pref medium
2603:3024:1536:86f8::/64 dev eth0.2 proto static metric 256 pref medium
2603:3024:1536:86f8::/62 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto babel metric 1024 pref medium
2603:3024:1536:86f0::/60 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto babel metric 1024 pref medium
fd2d:2e5e:fe7e::/48 from 2603:3024:1536:86f0::b25 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto static metric 512 pref medium
fd2d:2e5e:fe7e::/48 from 2603:3024:1536:86f0::/64 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto static metric 512 pref medium
fd2d:2e5e:fe7e::/48 from 2603:3024:1536:86f4::/62 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto static metric 512 pref medium
fd2d:2e5e:fe7e::/48 from 2603:3024:1536:86f8::/64 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto static metric 512 pref medium
fd2d:2e5e:fe7e::/48 from fd2d:2e5e:fe7e::/64 via fe80::f6f2:6dff:feb6:a01c dev eth0.2 proto static metric 512 pref medium
fd2d:2e5e:fe7e::/64 dev eth0.2 proto static metric 256 pref medium
fd89:cb20:8854::/64 dev eth0.1 proto static metric 1024 pref medium
fd89:cb20:8854:1::/64 dev eth0.3 proto static metric 1024 pref medium
fd89:cb20:8854:3::/64 dev eth0.5 proto static metric 1024 pref medium
unreachable fd89:cb20:8854::/48 dev lo proto 48 metric 1024 error -148 pref medium
unreachable fd89:cb20:8854::/48 dev lo proto static metric 2147483647 error -148 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth0.1 proto kernel metric 256 pref medium
fe80::/64 dev eth0.3 proto kernel metric 256 pref medium
fe80::/64 dev eth0.2 proto kernel metric 256 pref medium
fe80::/64 dev eth0.5 proto kernel metric 256 pref medium
fe80::/64 dev eth0.4 proto kernel metric 256 pref medium
fe80::/64 dev ifb4eth0.2 proto kernel metric 256 pref medium
Let's say you have 4 known speakers. You can choose to announce and redistribute your routes at 1/4th the normal speed, distributed modulo the number of speakers, and also essentially apply a rate limit to your
announcements up to (I think) about 2 minutes, although the protocol supports a little over an hour and a half. It would be good to announce default routes on a more regular basis.
There's a lot of work done "out there" in BGP to get rid of routes you don't need or can't use.
An announcement along these lines would help: don't send me these routes anymore because I can't use them,
or I'm out of cpu and I simply can't carry this load,
or I've got nice routes here, I'm going to keep 'em a while.
Given fd::1/128 and fd::2/128, an AE (address encoding) could be created to fold these together into one announcement. If all speakers can generate that AE, announcements of the unfolded routes can cease.
When slammed with new data or retractions, babel has a tendency to send late hellos, and slow boxes drop off the net. I had a logging function for this at one point. And I'd also added compute and other bounds
to the main loop to see where it was going wrong. Then I thought threading the whole thing would be better, or using a better timer loop to make sure the hellos got out. Then I lost 2 years.
First up, at least log the late hellos.
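As a starting point for that logging, here is a minimal sketch in C. The names `lateness_msec` and `log_late_hello`, the threshold parameter, and the idea of calling it just before a hello goes out are all assumptions for illustration, not babeld's existing code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Milliseconds by which `now` is past `due`; zero or negative means on time. */
static long lateness_msec(const struct timeval *due, const struct timeval *now)
{
    assert(due != NULL && now != NULL);
    return (long)(now->tv_sec - due->tv_sec) * 1000 +
           (long)(now->tv_usec - due->tv_usec) / 1000;
}

/* Hypothetical hook: call just before a hello is sent.
   Returns 1 if the hello was late enough to be logged, 0 otherwise. */
static int log_late_hello(const char *ifname, const struct timeval *due,
                          const struct timeval *now, long threshold_msec)
{
    long late = lateness_msec(due, now);

    if (late <= threshold_msec)
        return 0;
    fprintf(stderr, "late hello on %s: %ld ms behind schedule\n", ifname, late);
    return 1;
}
```

Logging only past a threshold keeps the log quiet in normal operation while still showing which interfaces fall behind under load.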
Prior to me putting out the latest release, this config used to just export my dynamic
IPs and a covering route.
local-port 33123
ipv6-subtrees true
default enable-timestamps true
interface eth0.2
interface eth0.1
interface eth0.3
interface eth0.4
interface eth0.5
redistribute local eq 62
redistribute local src-eq 62
redistribute src-eq 62
redistribute proto 48
redistribute local deny
redistribute deny
But, nope, they are escaping now.
Have to revert
default via 172.22.0.2 dev eno1
50.197.142.144/29 via 172.22.0.1 dev eno1 proto babel onlink
169.254.0.0/16 dev eno1 scope link metric 1000
172.20.0.0/14 via 172.22.0.1 dev eno1 proto babel onlink
172.22.0.0/24 dev eno1 scope link
172.22.0.2 via 172.22.0.2 dev eno1 proto babel onlink
172.22.0.91 via 172.22.0.91 dev eno1 proto babel onlink
172.22.0.172 via 172.22.0.172 dev eno1 proto babel onlink
172.22.0.193 via 172.22.0.193 dev eno1 proto babel onlink
172.22.0.215 via 172.22.0.215 dev eno1 proto babel onlink
172.22.192.0/22 via 172.22.0.193 dev eno1 proto babel onlink
172.22.193.1 via 172.22.0.193 dev eno1 proto babel onlink
172.22.220.0/22 via 172.22.0.91 dev eno1 proto babel onlink
172.22.221.1 via 172.22.0.91 dev eno1 proto babel onlink
172.22.222.1 via 172.22.0.91 dev eno1 proto babel onlink
172.22.223.1 via 172.22.0.91 dev eno1 proto babel onlink
172.23.252.2 via 172.22.0.2 dev eno1 proto babel onlink
192.168.0.0/24 dev eno1 proto kernel scope link src 192.168.0.2
192.168.122.1 via 172.22.0.215 dev eno1 proto babel onlink
This is the core of the xroute import problem. It's also similar to the resend problem. In both cases,
we could use a way better structure to deal with it, and probably the same kind. rb_tree?
I think avl would be better.
https://en.wikipedia.org/wiki/Red%E2%80%93black_tree#Set_operations_and_bulk_operations
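Whichever tree wins, the bulk operation that matters for xroute import is a set difference between what we already hold and what the kernel just reported. A minimal sketch of that idea, using two sorted arrays and a single linear walk instead of a tree (a deliberate simplification), might look like this. `struct xkey`, `xkey_cmp` and `xroute_diff` are hypothetical names, not babeld's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key: prefix bytes plus prefix length. */
struct xkey {
    unsigned char prefix[16];
    unsigned char plen;
};

static int xkey_cmp(const struct xkey *a, const struct xkey *b)
{
    int rc = memcmp(a->prefix, b->prefix, 16);

    if (rc != 0)
        return rc;
    return (int)a->plen - (int)b->plen;
}

/* Walk two sorted arrays once. Counts entries found only in `have`
   (to flush) and only in `want` (to add); a real version would act on
   each entry instead of counting it. */
static void xroute_diff(const struct xkey *have, size_t nhave,
                        const struct xkey *want, size_t nwant,
                        size_t *to_flush, size_t *to_add)
{
    size_t i = 0, j = 0;

    assert(to_flush != NULL && to_add != NULL);
    *to_flush = *to_add = 0;
    while (i < nhave && j < nwant) {
        int rc = xkey_cmp(&have[i], &want[j]);

        if (rc < 0) { (*to_flush)++; i++; }
        else if (rc > 0) { (*to_add)++; j++; }
        else { i++; j++; }
    }
    *to_flush += nhave - i;   /* leftovers exist only in `have` */
    *to_add += nwant - j;     /* leftovers exist only in `want` */
}
```

The walk costs one comparison per entry, where looking up every imported route individually costs a search per entry.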
Don't put any routes into the main tables as we import from the kernel.
Popcount is cheap on x86_64. If we take a popcount on entry, we can often avoid comparing both prefixes in full for equality or inequality in the resend routines.
Generating a zero is just an xor ax, ax, and comparing against 0 or ffff is faster if done inline.
v4prefix, at least, might benefit.
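A rough sketch of the popcount idea, assuming a hypothetical `struct keyed_prefix` that caches the bit count when the entry is created. Differing counts prove the prefixes differ; equal counts prove nothing, so the full compare is still needed in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical entry type: a 16-byte prefix plus its cached bit count. */
struct keyed_prefix {
    unsigned char prefix[16];
    unsigned char bits;      /* number of set bits in prefix, 0..128 */
};

/* Portable bit count; GCC and Clang also offer __builtin_popcountll,
   which becomes a single instruction where the CPU has one. */
static unsigned char popcount16(const unsigned char *p)
{
    unsigned n = 0;

    for (int i = 0; i < 16; i++)
        for (unsigned char b = p[i]; b; b &= (unsigned char)(b - 1))
            n++;
    return (unsigned char)n;
}

static void keyed_prefix_set(struct keyed_prefix *k, const unsigned char *p)
{
    assert(k != NULL && p != NULL);
    memcpy(k->prefix, p, 16);
    k->bits = popcount16(p);   /* computed once, on entry */
}

/* Cheap rejection first, full compare only when the counts agree. */
static bool keyed_prefix_eq(const struct keyed_prefix *a,
                            const struct keyed_prefix *b)
{
    if (a->bits != b->bits)
        return false;
    return memcmp(a->prefix, b->prefix, 16) == 0;
}
```

How much this saves depends on how often the compared prefixes happen to share a bit count, which would need measuring on real routing tables.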
Received truncated sub-TLV on Update.
Couldn't parse packet (8, 61) from fe80::eea8:6bff:fefe:9a2 on enp7s0.
xroutes don't get exported
nc ::1 33123 with a dump command requires 3 returns to exit
I also sometimes get a "Couldn't parse packet" from an entirely bogus address.
So I get this kind of stuff from ip -6 mon when I'm blowing things up with rtod.
dnsmasq and odhcpd light up also because they are getting all this stuff.
A route replace, instead of a delete followed by an add, would halve these. And I thought I'd managed to do that at some point.
Deleted fc12:a8a5:1b9d:1da1::/64 via fe80::46d9:e7ff:fe93:822e dev br-lan proto babel metric 1024 pref medium
unreachable fc12:a8a5:1b9d:1da1::/64 dev lo proto babel metric 4294967295 error -113 pref medium
Deleted fc12:a8a5:1b9d:1da2::/64 via fe80::46d9:e7ff:fe93:822e dev br-lan proto babel metric 1024 pref medium
unreachable fc12:a8a5:1b9d:1da2::/64 dev lo proto babel metric 4294967295 error -113 pref medium
default gw -------------------------- couch gw ------- rest of network
When the default gw goes down, the couch gw retracts all the ipv6 routes for that network,
even though that network is still alive. It would be cool to be able to configure something like
keep router_id X eq 60 on the couch gw so I don't lose internal connectivity, and just the default route
for those ips goes down.
Each sample counts as 0.01 seconds.
  %   cumulative   self                self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 29.50      1.18     1.18   7135724     0.00     0.00  find_route_slot
  7.00      1.46     0.28  53075108     0.00     0.00  timeval_minus_msec
  7.00      1.74     0.28  40367651     0.00     0.00  find_resend
  5.50      1.96     0.22      5119     0.04     0.29  flushupdates
  5.25      2.17     0.21   2870371     0.00     0.00  really_buffer_update
  3.75      2.32     0.15     50928     0.00     0.00  netlink_read
  3.50      2.46     0.14       247     0.57     0.57  route_stream
  2.25      2.55     0.09     17985     0.01     0.03  recompute_resend_time
  2.25      2.64     0.09      1443     0.06     0.14  update_neighbour_metric
  2.00      2.72     0.08   4630057     0.00     0.00  route_stream_next
  2.00      2.80     0.08   2635143     0.00     0.00  change_route_metric
  1.50      2.86     0.06   6229558     0.00     0.00  chg_route_metric
  1.50      2.92     0.06   6078628     0.00     0.00  neighbour_rxcost
  1.50      2.98     0.06   2008693     0.00     0.00  network_prefix
  1.50      3.04     0.06   1967065     0.00     0.00  find_route
  1.50      3.10     0.06   1967065     0.00     0.00  update_route
  1.25      3.15     0.05   6400368     0.00     0.00  neighbour_cost
  1.25      3.20     0.05   6123598     0.00     0.00  resize_route_table
Shouldn't this utterly filter out fc? I've tried all sorts of variants of this... (against my qsort thing which I'll back off, but I've also tried variants with other versions)
in ip fc::/8 le 8 deny # used by rtod for test routes
in src-ip fc::/8 le 8 deny # used by rtod for test routes
in ip fc::/8 ge 8 deny # used by rtod for test routes
in src-ip fc::/8 ge 8 deny # used by rtod for test routes
#in ip a::/8 ge 8 deny # core uses "a" as a default
redistribute ip fc::/8 ge 8 deny
redistribute src-ip fc::/8 ge 8 deny
redistribute ip fc::/8 le 8 deny
redistribute src-ip fc::/8 le 8 deny
route->interval exists in the spec but is not used. It's not even preserved from the message. It's kind of a grey area.
I'd want to use it to spread out updates and keep persistent routes around long term without having to reannounce them as often.
leverage the ebpf filter
create syntax to enable it
I've been using ECN forever. That helps on fq_codel when we hit overload on wifi, and as the
packets get marked rather than dropped, we could also start recognizing CE and increasing the metric in the hope that some better speaker isn't as overloaded. Doing it that way, we also shouldn't ever set a route to infinity.
Perhaps we should send more hellos and IHUs in this case also.
All we need is to start capturing the socket info (it's a setsockopt)....
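A minimal sketch of what capturing that socket info could look like on Linux for IPv6, assuming the daemon reads packets with recvmsg and a control buffer. `IPV6_RECVTCLASS` and `IPV6_TCLASS` are the standard socket options; the helper names and the fixed-penalty policy in `penalised_metric` are assumptions for illustration only.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Ask the kernel to attach the IPv6 traffic class to each received packet. */
static int enable_recv_tclass(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &one, sizeof(one));
}

/* Returns 1 if the packet described by msg carried the CE codepoint,
   0 if not, -1 if no traffic-class control message was present. */
static int msg_has_ce(struct msghdr *msg)
{
    struct cmsghdr *c;

    assert(msg != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_TCLASS) {
            int tclass;

            memcpy(&tclass, CMSG_DATA(c), sizeof(tclass));
            return (tclass & 0x3) == 0x3;   /* low two bits are ECN; 11 = CE */
        }
    }
    return -1;
}

/* Hypothetical policy: add a fixed penalty, but stop short of 0xFFFF,
   which Babel treats as infinity. */
static unsigned penalised_metric(unsigned metric, unsigned penalty)
{
    unsigned m = metric + penalty;

    return m < 0xFFFF ? m : 0xFFFE;
}
```

The caller would run `enable_recv_tclass` once on the protocol socket and then check `msg_has_ce` on each received message before deciding whether to penalise the neighbour.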
injecting 16000 routes ate a core 3 cpu for all of the 5 minutes I had it going. We've discussed how to go about improving xroute support in the past, and I've got a better patch for filtering stuff out sooner already, but the core is a better qsort. To this day I don't get the whole "route_stream" idea....
For some reason or another 1.8.3 calculates a wrong ref_metric, at least on default routes,
but I haven't checked further. For that matter, so does 1.7.1,
but the two behaviors are different and 1.7.1 "works" where 1.8.3 gets it wrong. I kind of suspect
a parsing error or memory corruption somewhere.
For a default route...
The ref_metric I get for 1.7.1 with an advertised 0, is 256.
The ref_metric I get for 1.8.3 with one default route on the link is 0. Add another link to it though, it goes up to a much larger number.
In part I'm trying to abuse the new make-wifi-fast aqm code. But getting babel up to a city scale
has always been a goal of mine. The new unicast stuff (I hope) will do route transfers over unicast,
which means that if there are no listeners on a wifi bridge, we won't see the packets go. That should
be an enormous improvement in itself.
There are multiple other places that can be sped up. Routing updates can be made atomic (halving
their cost). We can get better about doing route retractions on a denser mesh and install a better route at the outset (#5). We can start announcing routes on a longer interval (supported by the protocol, probably,
but not by the current babeld code, #8).
As one example, I can never remember the correct things to do for filtering.
There are also martians and bogons and covering routes and importing whole protos
from openwrt's other daemons and so on. And I'm pretty sure some more ebpf will help.
in ip fc::/8 ge 8 deny # used by rtod for test routes
redistribute ip fc::/8 ge 8 deny
redistribute ip fc::/8 le 8 deny
dnsrbld has been using an inline qsort for a long time. Given the large number of 128 bit values,
being able to use registers more efficiently was on my mind, which is what the xnor branch here does so far.
http://www.corpit.ru/mjt/qsort.html
It wasn't until I got rid of a lot of memcmp that I found that bug #7 existed for xroute import. As usual, I always have 2 or more bugs. Despite that, trying to improve normal route behavior with a better qsort might help. In particular, all it cares about is less than, so that could simplify the comparison code somewhat. With better registerization (and sigh, that might mean sse and neon instructions a la the vector packet processing effort), and keeping the core compare inside of those registers inside the qsort, perhaps a big speedup could be obtained.
memcmp for less than or greater than might actually be unneeded. I know neon can completely flip
endianness in a single instruction.
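To make the less-than idea concrete, here is a portable sketch that orders 16-byte values with two 64-bit compares after a big-endian load, which is the endianness flip mentioned above done in plain C. `load_be64`, `less16` and `cmp16` are hypothetical names; an SSE or NEON version would replace the byte loop with a byte-swap instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Load 8 bytes as a big-endian integer, so that integer order matches
   the byte order memcmp would see. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: nonzero when a sorts strictly before b. */
static int less16(const unsigned char *a, const unsigned char *b)
{
    uint64_t ah, bh;

    assert(a != NULL && b != NULL);
    ah = load_be64(a);
    bh = load_be64(b);
    if (ah != bh)
        return ah < bh;
    return load_be64(a + 8) < load_be64(b + 8);
}

/* The C library qsort wants a three-way result; build it from
   less-than alone. */
static int cmp16(const void *a, const void *b)
{
    return less16(b, a) - less16(a, b);
}
```

An inline qsort that takes the less-than predicate directly would skip the second `less16` call that `cmp16` needs here.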
So I configured flent to send 100 non-ecn'd flows, while sqm was fq_codeling at 40Mbit, with 4096 routes from rtod. Basically all babel packets get marked on every burst.
qdisc fq_codel 110: parent 1:10 limit 1001p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
maxpacket 4542 drop_overlimit 0 new_flow_count 4872 ecn_mark 48 memory_used 316448
root@dancer:~/git/rtod# tc -s qdisc show dev eno1 | grep ecn
qdisc fq_codel 110: parent 1:10 limit 1001p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
maxpacket 4542 drop_overlimit 0 new_flow_count 7906 ecn_mark 98 memory_used 302976
tcp_nup-2018-10-03T221206.114668.flent.gz
Normally what would happen to babel on a test like this is it would lose connectivity at some point. Let's try that!
ebpf
I take my default gw down... I have another default gw... but it takes 4 route changes in rapid succession to get it back up.
ip route del 0.0.0.0/0 from 0.0.0.0/0 table 254 metric 0 dev eno1 via 172.22.0.2 proto 42
ip route add unreachable 0.0.0.0/0 from 0.0.0.0/0 table 254 metric 0 proto 42
ip route del 0.0.0.0/0 from 0.0.0.0/0 table 254 metric 0 proto 42
ip route add 0.0.0.0/0 from 0.0.0.0/0 table 254 metric 0 dev eno1 via 172.22.0.1 proto 42
Ideally this should be:
ip route replace 0.0.0.0/0 from 0.0.0.0/0 table 254 metric 0 dev eno1 via 172.22.0.1 proto 42
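At the netlink level, a replace is an RTM_NEWROUTE request carrying both NLM_F_CREATE and NLM_F_REPLACE, which is what iproute2 sends for `ip route replace`. A minimal sketch of the message header follows. `struct route_req` and `init_replace_request` are hypothetical names, and the route attributes (RTA_DST, RTA_GATEWAY, RTA_OIF and so on) that would follow the header are left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header followed by the route header. */
struct route_req {
    struct nlmsghdr nh;
    struct rtmsg rt;
};

/* Fill in the fixed part of a "replace this route" request. */
static void init_replace_request(struct route_req *req, unsigned char family,
                                 unsigned char proto, unsigned char table)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req->nh.nlmsg_type = RTM_NEWROUTE;
    /* CREATE plus REPLACE: add the route if absent, overwrite it if present. */
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_REPLACE;
    req->rt.rtm_family = family;
    req->rt.rtm_table = table;
    req->rt.rtm_protocol = proto;
    req->rt.rtm_scope = RT_SCOPE_UNIVERSE;
    req->rt.rtm_type = RTN_UNICAST;
}
```

With that in place, the four-step sequence above would collapse to one kernel transaction.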
I have no idea why this is.
So I have observed netlink flakiness in babeld for years and years. I finally got tired of it. There are at least 3 different bugs here. Bird still does add/delete. iproute2, on the other hand, seems to always do the right thing now...
I added an attempt to do two phase commit updates to see stuff better...
So,
Deleted unreachable 2601:646:8500:7100::/60 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable 2601:646:8500:7100::/60 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable 2601:646:8500:7100::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable 2601:646:8500:7100::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable 2601:646:8500:7101::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable 2601:646:8500:7101::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable 2601:646:8500:7102::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable 2601:646:8500:7102::/64 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable fd63:6ec7:2f84::1 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable fd63:6ec7:2f84::1 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable fdb8:92ec:da7d::1 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable fdb8:92ec:da7d::1 dev lo proto babel metric 1024 error 4294967183 pref medium
Deleted unreachable fde5:dfb9:df90:fff4::1 dev lo proto babel metric 1024 error 4294967183 pref medium
unreachable fde5:dfb9:df90:fff4::1 dev lo proto babel metric 1024 error 4294967183 pref medium
2: eno1 inet6 2603:3024:1536:86f0:eea8:6bff:fefe:9a2/64 scope global dynamic mngtmpaddr
recvmsg doesn't check for this and probably should.
Neither does babel_recvmsg.
In most cases babeld is checking for equal or not equal. memcmp (especially for 16 byte values) is
really inefficient for these cases.
In the babeld-xnor branch I've gotten rid of most of the calls to memcmp, and especially on 64 bit arches,
instead of a big call to memcmp, things get replaced with two xors and an or, which take, oh, 3 cycles? to execute. This makes it a lot easier to profile for the calling sites that are inefficient.
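For reference, the two-xors-and-an-or pattern can be sketched in portable C as below. The name `memeq16` is an assumption (the actual helpers in the branch may be named and written differently); the memcpy calls are only there to keep unaligned loads legal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when two 16-byte values (for example IPv6
   addresses) are identical. */
static inline bool memeq16(const unsigned char *a, const unsigned char *b)
{
    uint64_t a0, a1, b0, b1;

    assert(a != NULL && b != NULL);
    /* memcpy keeps the loads legal for unaligned input; compilers
       typically turn each one into a single 64-bit load. */
    memcpy(&a0, a, 8);
    memcpy(&a1, a + 8, 8);
    memcpy(&b0, b, 8);
    memcpy(&b1, b + 8, 8);
    /* Two xors and an or: the result is zero only if every bit matches. */
    return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```

Byte order does not matter here, since the test is only for equality.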