Comments (13)

bluejekyll commented on May 29, 2024

Thanks for digging into that. I'm open to solutions here, which I suppose could be to not enforce the limit, but that feels wrong to me.

from hickory-dns.

qwerttvv commented on May 29, 2024

I'm sorry my English isn't good. I can read but I'm not good at writing.

Here is my short addition:

Please investigate the performance of all kinds of records, not only A and AAAA records.

The example here is the domain s3.amazonaws.com, which has no AAAA record.

Of course, there are all kinds of other records.

zonyitoo commented on May 29, 2024

Just realized that ResolverOpts::edns0 defaults to false.

t0m commented on May 29, 2024

Hi, would this be considered a breaking change between 0.22.0 and 0.23.0? We have some queries that return slightly over 512 bytes using eDNS that work on 0.22.0, but are rejected on 0.23.0. It looks to be related to the refactor in #1975.

If it helps, we're not using hickory-dns directly, but software we depend on (apollo router) is using it.

djc commented on May 29, 2024

I think we usually use "breaking change" to mean an intentional change -- are you asking whether we would consider this a regression?

I'm not quite familiar with the workings of eDNS, but it certainly looks like #1975 helps compliance with the relevant standards rather than reducing it.

t0m commented on May 29, 2024

Certainly unintended, so a regression then.

0.22.0 works with eDNS out of the box, but apparently had truncation issues fixed in #1975
0.23.0 fixes the truncation issues, but apparently changes the default behavior that we were relying on for DNS responses above 512 bytes.

t0m commented on May 29, 2024

Here's a quick example:

Setup:

$ docker run -it rockylinux/rockylinux:8
[root@bcbad58998ae /]# yum install -y gcc && curl https://sh.rustup.rs -sSf | sh
[root@bcbad58998ae /]# source "$HOME/.cargo/env"

0.22.0 working

[root@bcbad58998ae /]# cargo install trust-dns-util --version 0.22.0
[root@bcbad58998ae /]# resolve --warn --udp --system 512.size.dns.netmeister.org
Querying for 512.size.dns.netmeister.org A from udp:192.168.65.7:53, tcp:192.168.65.7:53
Success for query 512.size.dns.netmeister.org IN A
	512.size.dns.netmeister.org. 377 IN A 127.0.0.20
        ...

0.23.0 timing out

[root@bcbad58998ae /]# cargo install trust-dns-util --version 0.23.0
[root@bcbad58998ae /]# resolve --warn --udp --system 512.size.dns.netmeister.org
Querying for 512.size.dns.netmeister.org A from udp:192.168.65.7:53, tcp:192.168.65.7:53
2024-02-17T19:12:29.767352Z  WARN trust_dns_proto::udp::udp_client_stream: dropped malformed message waiting for id: 21579 err: unexpected end of input reached
2024-02-17T19:12:34.772239Z  WARN trust_dns_proto::udp::udp_client_stream: dropped malformed message waiting for id: 50989 err: unexpected end of input reached
2024-02-17T19:12:39.776387Z  WARN trust_dns_proto::udp::udp_client_stream: dropped malformed message waiting for id: 1358 err: unexpected end of input reached
ResolveError { kind: Timeout }

bluejekyll commented on May 29, 2024

Is there a reason you need to use UDP in this context? I think google and a few other DNS resolvers out there restrict packet sizes to 512 bytes over UDP to prevent DDOSes. Can you use TCP instead?

bluejekyll commented on May 29, 2024

On another note, it's not obvious to me why that change (which is mostly oriented around the server), would impact the client stream. Do you think it's clear what is causing the issue there?

bluejekyll commented on May 29, 2024

Ok, looking at this a little more, if the EDNS settings are 512, but the packet size returned is something larger, then I could see this as cutting off the response, and that's probably the issue. But isn't that what we want? Is there a reason we'd want to accept an upstream response that is greater than the size we've specified in EDNS?

t0m commented on May 29, 2024

> Is there a reason you need to use UDP in this context? I think google and a few other DNS resolvers out there restrict packet sizes to 512 bytes over UDP to prevent DDOSes. Can you use TCP instead?

I was actually expecting it to fall back to TCP after the UDP path fails, but that does not seem to happen. I can dig into the apollo-router settings we're using to see if it's forcing UDP somewhere.

> On another note, it's not obvious to me why that change (which is mostly oriented around the server), would impact the client stream. Do you think it's clear what is causing the issue there?

I worked backwards from the warning message "udp_client_stream: dropped malformed message waiting" to udp_client_stream.rs, and noticed that the changes around hoisting recv_buf out of the loop in #1975 looked suspicious. I'm not a Rust programmer though, so my debugging is not worth much.

> Ok, looking at this a little more, if the EDNS settings are 512, but the packet size returned is something larger, then I could see this as cutting off the response, and that's probably the issue. But isn't that what we want? Is there a reason we'd want to accept an upstream response that is greater than the size we've specified in EDNS?

One of the odd things I noticed when making the test reduction is 512.size.dns.netmeister.org actually returns a response size slightly smaller than 512, and it still triggers the issue.

$ dig 512.size.dns.netmeister.org | grep "EDNS\|MSG SIZE"
; EDNS: version: 0, flags:; udp: 1232
;; MSG SIZE  rcvd: 504

t0m commented on May 29, 2024

I had a chance to dig in a bit more, and I believe what we're seeing is this bug from coredns (we're running within k8s): coredns/coredns#5366.

The bug causes coredns to respond with payloads >512 bytes even if the client does not send an OPT RR. The client then fails to parse the response because it only has a buffer of 512 bytes. This explains why we didn't see the client fall back to TCP.
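
To make the size mismatch concrete, here is a minimal Rust sketch of the limit a client implies, using only the standard library. The names `udp_payload_limit` and `fits_in_udp` are hypothetical and are not part of hickory-dns; the 512-byte floor follows RFC 6891, and the sizes in the note below come from the dig output in this thread.

```rust
/// Classic DNS-over-UDP message size limit when no OPT RR is sent (RFC 1035).
const CLASSIC_UDP_LIMIT: usize = 512;

/// Hypothetical helper: largest UDP response a client has said it can accept.
/// `advertised` is the payload size from the client's OPT RR, or `None` when
/// the query carried no OPT RR at all.
fn udp_payload_limit(advertised: Option<u16>) -> usize {
    match advertised {
        // RFC 6891: advertised sizes below 512 are treated as 512.
        Some(size) => usize::from(size).max(CLASSIC_UDP_LIMIT),
        None => CLASSIC_UDP_LIMIT,
    }
}

/// Hypothetical helper: does a response of `response_len` bytes respect the limit?
fn fits_in_udp(response_len: usize, advertised: Option<u16>) -> bool {
    response_len <= udp_payload_limit(advertised)
}
```

With the numbers seen here, `fits_in_udp(1249, None)` is false (the response to a query without an OPT RR), while `fits_in_udp(504, Some(1232))` is true. A server following the spec would be expected to truncate the first response and set the TC bit instead of sending all 1249 bytes.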

I think the Docker embedded DNS has a similar problem, which is why I was able to replicate the error in the rockylinux/rockylinux:8 container:

$ dig +noedns 512.size.dns.netmeister.org
...
;; MSG SIZE  rcvd: 1249

As expected given that bug, tcpdump shows no truncation flag and no fallback to TCP:

00:47:39.705836 IP (tos 0x0, ttl 64, id 45631, offset 0, flags [none], proto UDP (17), length 73)
    cd93a13fe924.58638 > 192.168.65.7.domain: 20601+ A? 512.size.dns.netmeister.org. (45)
00:47:39.712791 IP (tos 0x0, ttl 63, id 22724, offset 0, flags [DF], proto UDP (17), length 1277)
    192.168.65.7.domain > cd93a13fe924.58638: 20601 28/0/0 512.size.dns.netmeister.org. A ... (snip) ... (1249)
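
For reference, the truncation flag is the TC bit in the fixed 12-byte DNS header (RFC 1035, section 4.1.1). Here is a rough Rust sketch of checking it on a raw response, using a hypothetical helper `tc_bit_set` that is not a hickory-dns API:

```rust
/// Length of the fixed DNS message header (RFC 1035, section 4.1.1).
const DNS_HEADER_LEN: usize = 12;

/// Hypothetical helper: returns `Some(true)` if the TC (truncated) bit is set,
/// `Some(false)` if it is clear, and `None` if `msg` is too short to hold a header.
fn tc_bit_set(msg: &[u8]) -> Option<bool> {
    if msg.len() < DNS_HEADER_LEN {
        return None;
    }
    // Byte 2 holds QR (0x80), Opcode (0x78), AA (0x04), TC (0x02), RD (0x01).
    Some(msg[2] & 0x02 != 0)
}
```

A client that sees `Some(true)` on a UDP response would normally retry the query over TCP. When the bit is clear on an oversized response, nothing signals the client to retry.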

t0m commented on May 29, 2024

Thanks for your help. For what it's worth, I think you're right to continue enforcing the limit and stick to the spec.
