Comments (34)

kkkgo commented on June 12, 2024

I'm a bit confused about the cachedb-check-when-serve-expired option. From my understanding, this option is intended to check whether there are unexpired caches in the cachedb. If I set serve-expired-ttl: 0, which disables the limit, does this option become meaningless, merely adding an extra checking step that causes delays in the processing flow? Should I turn it off?

from unbound.

jcbvm commented on June 12, 2024

I’m encountering the same behavior with my Docker setup for Unbound in combination with Redis. When cachedb-check-when-serve-expired is set to yes (the default), queries take around 20 ms to resolve once some time has passed since the last query (I have serve-expired enabled with 24h as the max TTL). When cachedb-check-when-serve-expired is set to no, the expected behavior happens: queries are served instantly from cache. Would this indicate that searching the cachedb/Redis takes up to 20 ms?

wcawijngaards commented on June 12, 2024

What could be happening is that the setup includes options that now act differently for cachedb, since the release. The release fixes problems that cachedb has with them, and it works, and this may be the cause of the change. Or perhaps it is going wrong because of the change.

There was a new option introduced, cachedb-check-when-serve-expired: yes in 1.20.0, it defaults to on. If that option is disabled, then the old behaviour should appear perhaps. But I think it is likely better enabled, because then serve expired options should work properly, also with cachedb.

In particular, if cachedb is configured together with serve-expired and serve-expired-client-timeout, then the serve-expired-client-timeout is now respected. That means an expired response is not sent immediately from the cachedb; instead Unbound attempts to fully resolve the query within the time interval. If this works, as in the case where queries take 17 - 25 ms, then that answer is used. In this case, the time increase of a couple of msec makes it look like that could be what is going on. If resolution were to fail or take longer, the expired response from cachedb is used instead.

So it could be that for the server the expired options have started working because of the new expired and cachedb fixes in 1.20.0, and for this reason it spends time looking up stuff, instead of immediately responding with expired answers from cachedb. If EDNS client subnet is also involved, there are other fixes for cachedb and serve-expired in combination with that as well.
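
A minimal sketch of the kind of configuration described above, for illustration only. The 1800 ms value is an assumption: it is the value the unbound.conf manual recommends per RFC 8767, not a value taken from this thread. The rest of the configuration (module-config, Redis connection settings) is omitted.

```
server:
    serve-expired: yes
    # Milliseconds to attempt a fresh resolve before answering with expired data.
    # 0 disables the timeout. 1800 is an assumed example value.
    serve-expired-client-timeout: 1800

cachedb:
    backend: redis
    # New in 1.20.0, default yes: check cachedb before answering with expired data.
    cachedb-check-when-serve-expired: yes
```

With settings like these, a lookup that completes within the timeout would be answered with the fresh result rather than the expired one.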

whyisthisbroken commented on June 12, 2024

So here is my Unbound configuration; maybe I am missing something that should be changed with the new build.
The setting "serve-expired-client-timeout" is not set, so it is at its default, which equals 0.

server:
    verbosity: 1
    statistics-interval: 0
    statistics-cumulative: no
    extended-statistics: yes

    #Modul Configuration
    module-config: "validator cachedb iterator"

    # |Root|
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    root-hints: "/var/lib/unbound/root.hints"

    # Minimize logs
    # Do not print one line per query to the log
    log-queries: no
    # Do not print one line per reply to the log
    log-replies: no
    # Do not print log lines that say why queries return SERVFAIL to clients
    log-servfail: no
    # Do not print log lines to inform about local zone actions
    log-local-actions: no
    # Do not print log lines that say why queries return SERVFAIL to clients
    #logfile: /dev/null
    #LogFile
    logfile: "/var/log/unbound.log"

    interface: 0.0.0.0
    interface: ::0
    do-ip4: yes
    do-udp: yes
    do-tcp: yes
    do-ip6: yes
    prefer-ip6: yes

    # Only give access to recursion clients from LAN IPs
    #access-control: 127.0.0.1/32 allow
    #access-control: 192.168.0.0/16 allow
    #access-control: fc00::/7 allow
    #access-control: fd80::/7 allow
    #access-control: fe80::/7 allow
    #access-control: ::1/128 allow

    # Ensure privacy of local IP ranges
    private-address: 192.168.0.0/16
    private-address: 169.254.0.0/16
    private-address: 172.16.0.0/12
    private-address: 10.0.0.0/8
    private-address: fd00::/8
    private-address: fe80::/10

    # Unbound local queries needs to be off if using stubby or dnscrypt
    do-not-query-localhost: no

    use-caps-for-id: yes
    harden-glue: yes
    harden-large-queries: yes
    harden-dnssec-stripped: yes
    harden-below-nxdomain: yes
    harden-algo-downgrade: yes
    harden-short-bufsize: yes
    harden-referral-path: no
    aggressive-nsec: yes

    target-fetch-policy: "-1 -1 -1 -1 -1"

    edns-buffer-size: 1232

    rrset-roundrobin: yes
    val-clean-additional: yes

    cache-min-ttl: 0
    cache-max-ttl: 86400

    # Prefetch
    prefetch: yes
    prefetch-key: yes
    serve-expired: yes
    serve-expired-reply-ttl: 0
    so-reuseport: yes

    hide-identity: yes
    hide-version: yes
    http-user-agent: "DNS"

    do-daemonize: no
    qname-minimisation: yes
    deny-any: no
    minimal-responses: yes

    # Performance
    num-threads: 4

    # Cache/Slabs Settings
    neg-cache-size: 4m
    msg-cache-size: 128m
    msg-cache-slabs: 8
    rrset-cache-size: 256m
    rrset-cache-slabs: 8
    key-cache-size: 8m
    key-cache-slabs: 8
    infra-cache-slabs: 8
    num-queries-per-thread: 4096
    outgoing-range: 8192

    incoming-num-tcp: 1000
    outgoing-num-tcp: 1000

    so-rcvbuf: 8m
    so-sndbuf: 8m

    unwanted-reply-threshold: 100000

server:
forward-zone:
    name: "."

    # DNScrypt proxy
    forward-addr: 127.0.0.1@6053
    forward-addr: ::1@6053

cachedb:
    backend: redis
    redis-server-path: "/var/run/redis/redis.sock"
    redis-timeout: 100
    redis-expire-records: no

If I understand that right, changing the setting "cachedb-check-when-serve-expired: yes" to "no" may not be the solution, because the explanation in the documentation is:
If enabled, the cachedb is checked before an expired response is returned. When serve-expired is enabled, without serve-expired-client-timeout, it then does not immediately respond with an expired response from cache, but instead first checks the cachedb for valid contents, and if so returns it.

So - this is the behavior I want... or have I misunderstood something?

I think the following fixes do something that needs to be explained a bit more by the devs. I don't know why the entire cachedb function is completely different with this build. What has been fixed, if the previous build worked better?

- Fix cachedb for serve-expired with serve-expired-reply-ttl.
- Fix to not reply serve expired unless enabled for cachedb.
- Fix cachedb for serve-expired with serve-expired-client-timeout.
- Fixup unit test for cachedb server expired client timeout with a check if response if from upstream or from cachedb.
- Fixup cachedb to not refetch when serve-expired-client-timeout is used.

wcawijngaards commented on June 12, 2024

Yes those are the changes that make things different, and the new config option can control them; if that option is turned to a different value it should behave in a different manner. And perhaps in the old manner, but I can also not be sure because I do not know what it did before and now. Perhaps it is possible to find out what it did before and now, for one query, with verbosity: 5 and then look at the output to see what it did before and what it does now.
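
For reference, raising the log level as suggested is a one-line change in the server clause. This is a fragment, not a full configuration; the logfile path is copied from the configuration posted earlier in the thread, and the comment paraphrases the verbosity levels as the unbound.conf manual describes them.

```
server:
    # Level 4 gives algorithm-level information;
    # level 5 additionally logs client identification for cache misses.
    verbosity: 5
    logfile: "/var/log/unbound.log"
```

Running the same single query on both builds with this setting should make it possible to compare which module handles it.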

The cachedb function was changed to make this possible. Previously, if serve expired was enabled, it would not actually pick up expired data from cachedb correctly, but would instead pick it up as briefly valid data. The change makes it pick up the data and set it as expired.

Another change is perhaps that it can now ignore the expired contents of the cache in order to check the cachedb. Perhaps Redis is the slow component: now that it incorporates the expired data as expired, it ignores it on the next lookup, because it is expired, and checks cachedb again for possibly fresh data.

whyisthisbroken commented on June 12, 2024

Hmm, maybe I'll wait for another build and see where this is going. Until then, Unbound stays on version 1.19.3. That build at least runs smoothly, or at least I think so 😅

whyisthisbroken commented on June 12, 2024

I don't know, but this does not make sense to me. The cachedb via socket is really fast, and in build 1.19.3 you can see in the logs that the queries are answered from Redis, within 3-5 ms. The cache from Unbound is no longer active when you work with Redis. Or did I overlook something?
The fact is that something is wrong in build 1.20, and the cachedb is definitely broken. The new settings are redundant and very confusing...

jcbvm commented on June 12, 2024

@whyisthisbroken I always thought Redis was only used as a second-level cache. So Unbound still uses the in-memory cache, with a fallback to Redis if no record is found in the in-memory cache. But I do indeed see lookups to Redis even when the in-memory cache should have the record. I’m unsure what the new option really does…

whyisthisbroken commented on June 12, 2024

I hope that some developer gives some clarification and can explain in an understandable way what has now been changed and why the behavior is completely different...
Until then I'll stick with 1.19.3. It just works

jcbvm commented on June 12, 2024

Experimented a little today. When I set cachedb-check-when-serve-expired to no, queries are served immediately, even when expired. This is the expected behavior. However, when I restart Unbound, the same query is not immediately returned from Redis. From the logs it looks like a normal resolve takes place, which is strange in my opinion; from my understanding it should just serve the query from Redis as a stale record. This way, using Redis is only useful for non-expired queries.

whyisthisbroken commented on June 12, 2024

Yeah, with build 1.20 the entire cachedb is completely useless.

This is the result with 1.20, which I've now run for 12h...
That's not how it should be. Look at all the cache misses.

Screenshot_20240512-203012~3.png

I'll go back to 1.19.3. I don't know what's wrong, but as long as no dev reports back or a new version comes out, I will not test it any further.

I'll post a screenshot tomorrow with build 1.19.3 with the same settings - of course without the new ones added for 1.20.

wcawijngaards commented on June 12, 2024

So, I asked for detailed logs and you did not provide them, so it is unclear what is actually happening, apart from speculation. Just looking at the timing does not actually answer the questions. Running with a verbosity level of 4 or 5, you can test what actually happens that makes queries take longer.

Since the queries are actually resolved, I do not see this as a problem. Because the resolution works.

The option can change between the old behaviour and the new behaviour.

The correct setting is the new one, because expired information needs to be categorized as expired. Despite what is commented earlier, the data shown seems to not show actual bugs apart from curiosity at the timeframe.

The setting to change serve-expired-ttl is not the same as serve-expired-client-ttl. The setting of 0 changed the answer to the clients, but did not change how unbound looks up records. Regardless of the change in the new option.

Of course, there could be bugs, and it would be nice to fix them. I actually do not see any bugs in the 1.20.0 implementation here.

Of course, if you get fresh lookups with a serve-expired-client-timeout, then the responses are classified as cache misses.

With the new option enabled, it can in fact pick up expired responses from the cachedb, and use them. Previously it would briefly mark them as present, giving trouble, and making expired messages really expired makes sure that domains do not swap between older and newer versions of themselves, so that is not really something that is wrong here.

If there is a reload, of course the cache is lost, and no expired messages are present in memory either. The cachedb cache can then give an expired response. With the new option, perhaps this is now counted as a cache miss, as it should have been, whereas it may have been counted as a cache hit before. Regardless, the user gets the expired answer, if so, and the resolution continues. With the new option, the user may then, after the brief lookup, receive the new lookup answer if it can be fetched within the serve-expired-client-timeout timeframe.
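
The serve-expired settings mentioned in this comment control different things. The fragment below sketches the distinction as the unbound.conf manual describes it. The values are illustrative assumptions (86400 matches the 24h mentioned earlier in the thread, 30 is the documented default), not recommendations made in this thread.

```
server:
    serve-expired: yes
    # How long after expiry, in seconds, a record may still be served.
    # 0 disables the limit.
    serve-expired-ttl: 86400
    # TTL value placed in the expired answer sent to the client.
    serve-expired-reply-ttl: 30
    # Milliseconds to try resolving before replying with expired data.
    # 0 disables this behavior.
    serve-expired-client-timeout: 0
```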

jcbvm commented on June 12, 2024

Hi, thanks for the response.

The last part of your explanation is where I'm having trouble. When you set serve-expired-client-timeout to 0 (disabling it), serve-expired to yes and cachedb-check-when-serve-expired to no, I would expect the existing expired record from the cachedb to be returned immediately after a reload (when the memory cache is gone). It should not wait for a new resolve, because serve-expired-client-timeout is set to 0.

But instead this response takes a couple of tens of milliseconds, indicating that it is waiting for a resolve. I'm not sure if this is a bug or if I'm misunderstanding something.
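
The combination being tested here, written out as a fragment. The three option values are the ones named in this comment; backend: redis is assumed from earlier in the thread, and the rest of the configuration is omitted.

```
server:
    serve-expired: yes
    # 0 disables waiting for a fresh resolve before serving expired data.
    serve-expired-client-timeout: 0

cachedb:
    backend: redis
    # Do not check cachedb first when an expired answer is about to be served.
    cachedb-check-when-serve-expired: no
```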

wcawijngaards commented on June 12, 2024

Well, with the new option set to no, it does the previous behaviour. So that would be the same as what 1.19 does.

Perhaps Redis is that slow; the logs could show which part is slow. It would be useful to have an option that logs milliseconds for that, but log-time-ascii does not do it. Or it is resolving before sending the expired response, but I do not know why it would. See below, though: if that is broken, I intend to remove the option altogether rather than fix it.

Regardless of these issues, the fix, with the new option set to yes, is really a mandatory fix. Without it the serve-expired-client-timeout does not work. Also, without it the information about a domain can fluctuate between different versions of the zone, once it is updated and also expired. The fix improves consistency and makes the serve expired options actually work, so the option turned on is a nicety. Realistically I think the option should be removed and the current default made the only choice. The 1.19 behaviour, regardless of whether it was nice for you, has to go, because the bug has to be fixed. That said, it should also work in this case, once we figure out what the problem, if any, is.

jcbvm commented on June 12, 2024

I was not specifically referring to 1.19. I myself do not have the option to test that version at the moment. I was just wondering how the current setting works. I will try to look in the logging if I can find something about slowness.

The problem I had with setting cachedb-check-when-serve-expired to yes, is that now and then some expired entries did take additional time to resolve which could indeed indicate a problem with redis. I would have to investigate this further.

wcawijngaards commented on June 12, 2024

Well if the cache has an expired message, then with the new option enabled, it will actually check cachedb for a fresh new message. This is perhaps the change in behaviour that is the trouble.

whyisthisbroken commented on June 12, 2024

So if I understood that correctly: if I set cachedb-check-when-serve-expired to no, does Unbound no longer look in the Redis database, even after Unbound is restarted and the cache is gone? Then the whole Redis setup would be unnecessary, since it is no longer checked whether data is in the DB. As a result, Redis is completely unnecessary, because a Redis lookup is sometimes slower than letting Unbound resolve again...
Well... I think I'll go without Redis and use Unbound without a persistent cache... maybe the best and fastest solution.

jcbvm commented on June 12, 2024

I’ve checked the logs again. From what I can see, the time is coming from cached entries which need validation before being served:

[1715628891] unbound[1:1] debug: Serve expired: Trying to reply with expired data
[1715628891] unbound[1:1] debug: Serve expired: unchecked entry needs validation

This sometimes causes a resolve to happen, which explains the time elapsed. But I wonder what the purpose of having a cachedb is if a resolve is eventually needed anyway.

kkkgo commented on June 12, 2024

Based on the conversation, I wonder if the cachedb-check-when-serve-expired option is unnecessary for a single-instance setup of Unbound. Here is my understanding:

  1. No Cached Result in Unbound: If there's no cached result in Unbound, it should check Redis for a cache. If Redis has a valid cache, it would provide the result unquestionably. If the Redis cache is expired, whether to use this expired result depends on the serve-expired-ttl setting. This behavior seems unrelated to the cachedb-check-when-serve-expired option.

  2. Valid Cached Result in Unbound: If Unbound has a valid cached result, it will return this result directly, which again, has no relation to the cachedb-check-when-serve-expired option.

  3. Expired Cached Result in Unbound: Here's where it gets interesting. In my understanding, the cachedb-check-when-serve-expired option would matter in this case. If set to no, it should behave like the first scenario, where the use of the expired result depends on the serve-expired-ttl setting, without checking Redis. If set to yes, it might try to check Redis for a non-expired result. This seems ideal; however, theoretically, if I'm running a single-instance of Unbound with Redis, and Redis data is always lagging behind Unbound, then checking seems unnecessary. Because if there is fresher data, it would definitely exist in Unbound’s cache rather than in Redis’s cache. Therefore, any data expired in Unbound’s cache would likely also be expired in Redis’s cache, if it exists at all.

Based on this analysis, it seems to me that this option would only add unnecessary Redis checks for a single-instance of Unbound. I'm not sure if my understanding is correct. It's just a summary based on the conversation and developers' explanations. Please feel free to correct any errors.

jcbvm commented on June 12, 2024

@kkkgo You are mostly right indeed. That’s why the cachedb-check-when-serve-expired option needs to exist in this case, and in your situation it would probably be better to set it to no.

In step 3 it is theoretically possible to have fresh records in Redis if you restart Unbound, which deletes the in-memory cache. But today’s TTLs are so low that this chance is also low.

So basically the option cachedb-check-when-serve-expired is useful in the end and I don’t think it should be left out. The only thing I wonder is why there happens to be a resolve after getting the item out of the cachedb, in my opinion it should be served immediately if serve-expired is enabled and the time is within serve-expired-ttl.

kkkgo commented on June 12, 2024

@jcbvm When you restart Unbound, it's essentially equivalent to scenario 1, as described above. In this case, I believe it is unrelated to the cachedb-check-when-serve-expired option.
