
Comments (8)

Haleygo commented on August 24, 2024

Hello!
Which way do you use to auto-discover vmstorage nodes here, using service, DNS or just enumerating nodes?
From the v1.97.1 log, the query succeeded, so vmselect is not the problem; only VMUI got stuck. Do you experience this autocomplete issue all the time? It has most likely been fixed in the latest version; please check our playground, which runs the latest VMUI.

As for v1.101.0, the query failed because vmselect couldn't connect to any of the vmstorage nodes (20 out of 20 were unavailable). Can you test the connection between vmselect and vmstorage pods using a command like curl?

2024-06-11T08:15:15.005Z warn VictoriaMetrics/app/vmselect/netstorage/netstorage.go:1968 20 out of 20 vmstorage nodes at group "" were unavailable during the query; a sample error: cannot get label values from vmstorage storage8:8401: cannot obtain connection from a pool: cannot dial storage8:8401: dial tcp4 10.40.193.165:8401: connect: connection refused
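If curl is not available on the pod, the same check can be sketched in a few lines of Python. This is only a rough illustration: it assumes that the addresses passed to `-storageNode` have the `host:port` form seen in the log line above (`storage8:8401`), and the helper names `check_tcp` and `check_nodes` are made up for this example.

```python
import socket
from typing import Dict, List, Tuple


def check_tcp(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, str]:
    """Try to open a TCP connection and report whether it succeeded."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "ok"
    except OSError as err:
        # OSError covers "connection refused", timeouts and DNS failures.
        return False, str(err)


def check_nodes(nodes: List[str], timeout: float = 3.0) -> Dict[str, Tuple[bool, str]]:
    """Check every "host:port" entry and return the result per node."""
    results = {}
    for node in nodes:
        host, _, port = node.rpartition(":")
        results[node] = check_tcp(host, int(port), timeout)
    return results
```

Calling `check_nodes(["storage8:8401"])` from the vmselect host should show whether the "connection refused" from the log can be reproduced outside of vmselect; the node list here is a placeholder for the real `-storageNode` values.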

from victoriametrics.

evkuzin commented on August 24, 2024

Which way do you use to auto-discover vmstorage nodes here, using service, DNS or just enumerating nodes?

enumerating nodes

Can you test the connection between vmselect and vmstorage pods using a command like curl?

This is production storage. All I did was spin up one test node and run these binaries one at a time:

v1.101.0/vmselect-prod
v1.97.1/vmselect-prod
v1.97.1/vmselect-prod

Besides, this does look like an odd error; I sometimes see it if I run autodiscovery too soon after start. Anyway, I tested 1.101 again for you. As you can see below, it fails after the 30s timeout, whereas in the examples above for the previous version the same query finished in 7s. Looks like a degradation to me.

v1.101.0/vmselect-prod -dedup.minScrapeInterval 10s -httpListenAddr ':8080' -http.connTimeout 3m0s -replicationFactor 2 -search.maxConcurrentRequests 96 -search.maxQueryDuration 180s -search.maxQueueDuration 60s -search.maxSeries 125000000 -search.maxSamplesPerQuery 50000000000 -search.maxUniqueTimeseries 3000000 -search.logSlowQueryDuration 30s -clusternativeListenAddr ':8401' -clusternative.maxConcurrentRequests 64 -clusternative.maxQueueDuration 60s -storageNode=... -search.maxLabelsAPIDuration 30s
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:12	build version: vmselect-20240425-150402-tags-v1.101.0-cluster-0-g310d100ed
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:13	command-line flags
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternative.maxConcurrentRequests="64"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternative.maxQueueDuration="1m0s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternativeListenAddr=":8401"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -dedup.minScrapeInterval="10s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -http.connTimeout="3m0s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -httpListenAddr=":8080"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -replicationFactor="2"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.logSlowQueryDuration="30s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxConcurrentRequests="96"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxLabelsAPIDuration="30s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxQueryDuration="3m0s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxQueueDuration="1m0s"
2024-06-14T15:17:50.486Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxSamplesPerQuery="50000000000"
2024-06-14T15:17:50.487Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxSeries="125000000"
2024-06-14T15:17:50.487Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxUniqueTimeseries="3000000"
2024-06-14T15:17:50.487Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -storageNode="..."
2024-06-14T15:17:50.487Z	info	VictoriaMetrics/app/vmselect/main.go:96	starting netstorage at storageNodes [...]
2024-06-14T15:17:50.504Z	info	VictoriaMetrics/app/vmselect/main.go:110	started netstorage in 0.017 seconds
2024-06-14T15:17:50.508Z	info	VictoriaMetrics/lib/memory/memory.go:42	limiting caches to 59008376832 bytes, leaving 39338917888 bytes to the OS according to -memory.allowedPercent=60
2024-06-14T15:17:50.511Z	info	VictoriaMetrics/app/vmselect/main.go:125	starting vmselectapi server at ":8401"
2024-06-14T15:17:50.512Z	info	VictoriaMetrics/app/vmselect/main.go:131	started vmselectapi server at ":8401"
2024-06-14T15:17:50.512Z	info	VictoriaMetrics/lib/vmselectapi/server.go:158	accepting vmselect conns at 0.0.0.0:8401
2024-06-14T15:17:50.512Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:119	starting server at http://127.0.0.1:8080/
2024-06-14T15:17:50.512Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:120	pprof handlers are exposed at http://127.0.0.1:8080/debug/pprof/
2024-06-14T15:19:02.911Z	warn	VictoriaMetrics/app/vmselect/main.go:309	error in "/select/0/prometheus/api/v1/label/__name__/values?match%5B%5D=%7B__name__%3D%7E%22.*node.*%22%7D&limit=1000&start=1718319600&end=1718405999.999": cannot obtain values for label "__name__": cannot fetch label values from vmstorage nodes: cannot get label values from vmstorage storage:8401: cannot execute funcName="labelValues_v5" on vmstorage "10.40.193.9:8401" with timeout 30.000 seconds (elapsed 31.911 seconds); the timeout can be adjusted with `-search.maxLabelsAPIDuration` command-line flag: the number of matching timeseries exceeds 1000000; either narrow down the search or increase -search.max* command-line flag values at vmselect; see https://docs.victoriametrics.com/#resource-usage-limits
2024-06-14T15:19:02.911Z	warn	VictoriaMetrics/app/vmselect/main.go:245	slow query according to -search.logSlowQueryDuration=30s: remoteAddr="10.14.27.100:50926", duration=31.038 seconds; requestURI: "/select/0/prometheus/api/v1/label/__name__/values?match%5B%5D=%7B__name__%3D%7E%22.*node.*%22%7D&limit=1000&start=1718319600&end=1718405999.999"

And, in case you want me to test rc1 as well:

v1.102.0-rc1/vmselect-prod -dedup.minScrapeInterval 10s -httpListenAddr ':8080' -http.connTimeout 3m0s -replicationFactor 2 -search.maxConcurrentRequests 96 -search.maxQueryDuration 180s -search.maxQueueDuration 60s -search.maxSeries 125000000 -search.maxSamplesPerQuery 50000000000 -search.maxUniqueTimeseries 3000000 -search.logSlowQueryDuration 30s -clusternativeListenAddr ':8401' -clusternative.maxConcurrentRequests 64 -clusternative.maxQueueDuration 60s -storageNode=... -search.maxLabelsAPIDuration 30s
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:12	build version: vmselect-20240607-151249-tags-v1.102.0-rc1-cluster-0-g3f883559e2
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:13	command-line flags
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternative.maxConcurrentRequests="64"
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternative.maxQueueDuration="1m0s"
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -clusternativeListenAddr=":8401"
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -dedup.minScrapeInterval="10s"
2024-06-14T15:27:32.023Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -http.connTimeout="3m0s"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -httpListenAddr=":8080"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -replicationFactor="2"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.logSlowQueryDuration="30s"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxConcurrentRequests="96"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxLabelsAPIDuration="30s"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxQueryDuration="3m0s"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxQueueDuration="1m0s"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxSamplesPerQuery="50000000000"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxSeries="125000000"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -search.maxUniqueTimeseries="3000000"
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -storageNode="..."
2024-06-14T15:27:32.024Z	info	VictoriaMetrics/app/vmselect/main.go:96	starting netstorage at storageNodes [...]
2024-06-14T15:27:32.037Z	info	VictoriaMetrics/app/vmselect/main.go:110	started netstorage in 0.013 seconds
2024-06-14T15:27:32.040Z	info	VictoriaMetrics/lib/memory/memory.go:42	limiting caches to 59008376832 bytes, leaving 39338917888 bytes to the OS according to -memory.allowedPercent=60
2024-06-14T15:27:32.043Z	info	VictoriaMetrics/app/vmselect/main.go:125	starting vmselectapi server at ":8401"
2024-06-14T15:27:32.044Z	info	VictoriaMetrics/app/vmselect/main.go:131	started vmselectapi server at ":8401"
2024-06-14T15:27:32.044Z	info	VictoriaMetrics/lib/vmselectapi/server.go:158	accepting vmselect conns at 0.0.0.0:8401
2024-06-14T15:27:32.044Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:119	starting server at http://127.0.0.1:8080/
2024-06-14T15:27:32.044Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:120	pprof handlers are exposed at http://127.0.0.1:8080/debug/pprof/
2024-06-14T15:28:10.908Z	warn	VictoriaMetrics/app/vmselect/main.go:309	error in "/select/0/prometheus/api/v1/label/__name__/values?match%5B%5D=%7B__name__%3D%7E%22.*node.*%22%7D&limit=1000&start=1718319600&end=1718405999.999": cannot obtain values for label "__name__": cannot fetch label values from vmstorage nodes: cannot get label values from vmstorage storage8:8401: cannot execute funcName="labelValues_v5" on vmstorage "10.40.193.25:8401" with timeout 30.000 seconds (elapsed 31.908 seconds); the timeout can be adjusted with `-search.maxLabelsAPIDuration` command-line flag: the number of matching timeseries exceeds 1000000; either narrow down the search or increase -search.max* command-line flag values at vmselect; see https://docs.victoriametrics.com/#resource-usage-limits
2024-06-14T15:28:10.908Z	warn	VictoriaMetrics/app/vmselect/main.go:245	slow query according to -search.logSlowQueryDuration=30s: remoteAddr="10.14.27.100:54618", duration=31.137 seconds; requestURI: "/select/0/prometheus/api/v1/label/__name__/values?match%5B%5D=%7B__name__%3D%7E%22.*node.*%22%7D&limit=1000&start=1718319600&end=1718405999.999"
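For readers comparing the runs, the request URI in the warnings above is percent-encoded, which makes the actual selector and time range hard to see. A small sketch that decodes it with the Python standard library; the URI is copied from the log, while the function name `decode_request` and the output field names are just illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# Request URI copied verbatim from the vmselect warnings above.
REQUEST_URI = (
    "/select/0/prometheus/api/v1/label/__name__/values"
    "?match%5B%5D=%7B__name__%3D%7E%22.*node.*%22%7D"
    "&limit=1000&start=1718319600&end=1718405999.999"
)


def decode_request(uri: str) -> dict:
    """Split the URI and percent-decode its query parameters."""
    parts = urlsplit(uri)
    params = parse_qs(parts.query)  # decodes both keys and values
    start = float(params["start"][0])
    end = float(params["end"][0])
    return {
        "path": parts.path,
        "match": params["match[]"][0],
        "limit": int(params["limit"][0]),
        "start_utc": datetime.fromtimestamp(start, tz=timezone.utc).isoformat(),
        "end_utc": datetime.fromtimestamp(end, tz=timezone.utc).isoformat(),
        "range_hours": round((end - start) / 3600, 2),
    }


if __name__ == "__main__":
    for key, value in decode_request(REQUEST_URI).items():
        print(f"{key}: {value}")
```

The selector decodes to `{__name__=~".*node.*"}`, so the request asks for up to 1000 metric names matching that regex over a range of about 24 hours.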


evkuzin commented on August 24, 2024

TL;DR

@Haleygo - as you can see, the old version runs the query and returns the result in 7s, while the new version fails after the 30s limit on the same dataset.
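To make the 7s vs 30s comparison repeatable, the same label-values request could be timed against each vmselect binary in turn. The sketch below assumes vmselect listens on `http://127.0.0.1:8080` as in the logs above and reuses the path and parameters from the logged request; `build_url` and `time_request` are hypothetical helper names, not part of any VictoriaMetrics tooling.

```python
import time
import urllib.error
import urllib.parse
import urllib.request
from typing import Tuple

# API path taken from the requestURI in the vmselect logs.
LABEL_VALUES_PATH = "/select/0/prometheus/api/v1/label/__name__/values"


def build_url(base_url: str, match: str, start: str, end: str, limit: int = 1000) -> str:
    """Build the label-values URL with percent-encoded query parameters."""
    query = urllib.parse.urlencode(
        {"match[]": match, "limit": limit, "start": start, "end": end}
    )
    return f"{base_url}{LABEL_VALUES_PATH}?{query}"


def time_request(url: str, timeout: float = 60.0) -> Tuple[float, int]:
    """Return (elapsed seconds, HTTP status) for one GET request.

    Connection-level failures (urllib.error.URLError) are not caught here
    and propagate to the caller.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include the time needed to receive the body
            status = resp.status
    except urllib.error.HTTPError as err:
        # Error responses still carry a status code worth recording.
        status = err.code
    return time.monotonic() - started, status


# Example (not executed here):
#   url = build_url("http://127.0.0.1:8080", '{__name__=~".*node.*"}',
#                   "1718319600", "1718405999.999")
#   elapsed, status = time_request(url)
```

Running this once per vmselect version against the same storage nodes gives one elapsed time and status code per version, which is easier to compare than scanning the logs.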


evkuzin commented on August 24, 2024

And, just to be on the same page - I was running the same storage version in all tests.
