
Crane-scheduler installed via Helm as a second scheduler: a test pod from the docs example is never scheduled and stays stuck in "Pending" about crane-scheduler HOT 13 OPEN

xucq07 avatar xucq07 commented on August 26, 2024
Crane-scheduler installed via Helm as a second scheduler: a test pod from the docs example is never scheduled and stays stuck in "Pending"

from crane-scheduler.
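
For context, the "docs example" referred to in the title is a test workload that targets the second scheduler by name via schedulerName. A minimal sketch of such a test Deployment (the name, image, and resource values here are illustrative, not quoted from the reporter's manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: crane-scheduler-test          # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crane-scheduler-test
  template:
    metadata:
      labels:
        app: crane-scheduler-test
    spec:
      schedulerName: crane-scheduler          # hand the pod to the second scheduler instead of default-scheduler
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest    # any small stress/pause image works for the test
        command: ["stress", "-c", "1"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"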

Comments (13)

qmhu avatar qmhu commented on August 26, 2024

Please check the status of the crane-scheduler pods and confirm that they are Running.

from crane-scheduler.

xucq07 avatar xucq07 commented on August 26, 2024

kubectl get pods -n crane-system
NAME READY STATUS RESTARTS AGE
crane-scheduler-b84489958-6jdj6 1/1 Running 0 4d1h
crane-scheduler-controller-6987688d8d-6wr7c 1/1 Running 0 4d1h
Confirmed again that the pods are Running.

from crane-scheduler.

qmhu avatar qmhu commented on August 26, 2024


Nothing abnormal in the logs.
You could try clearing the pod's schedulerName so it falls back to the default scheduler, and check whether the default scheduler can schedule it.
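
A minimal sketch of the two test variants (pod names and image are illustrative): one pod handed to crane-scheduler, and one with schedulerName left out so the default scheduler picks it up.

---
# Variant A: handed to the second scheduler (the case that stays Pending)
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-crane          # illustrative
spec:
  schedulerName: crane-scheduler
  containers:
  - name: app
    image: nginx
---
# Variant B: schedulerName omitted, so kube-scheduler (default-scheduler) schedules it
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-default        # illustrative
spec:
  containers:
  - name: app
    image: nginx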

from crane-scheduler.

xucq07 avatar xucq07 commented on August 26, 2024

Tested that; the default scheduler has no problem and schedules the pod normally.

from crane-scheduler.

qmhu avatar qmhu commented on August 26, 2024

Tested that; the default scheduler has no problem and schedules the pod normally.

Could you post the complete logs, including both crane-scheduler-controller-6987688d8d-6wr7c and crane-scheduler-b84489958-6jdj6?

from crane-scheduler.

xucq07 avatar xucq07 commented on August 26, 2024

crane-scheduler.log
crane-scheduler-controller.log
The logs are in the files attached above.

from crane-scheduler.

mobeixiaoxin avatar mobeixiaoxin commented on August 26, 2024

I'm hitting the same problem; my Kubernetes version is 1.27.
The scheduler reports the following errors:
0905 05:42:20.346742 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:43:01.852683 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:43:01.852729 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:43:34.262887 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:43:34.262932 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:44:33.675140 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:44:33.675182 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:45:20.214073 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:45:20.214163 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:45:56.034526 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:45:56.034592 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:46:48.730711 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:46:48.730757 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:47:24.823783 1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:47:24.823828 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource

Could someone please take a look and point me in the right direction? Thanks.

from crane-scheduler.

qmhu avatar qmhu commented on August 26, 2024

This should be a compatibility issue with newer Kubernetes versions: the errors above come from the vendored client-go v0.22.x still listing the v1beta1 CSIStorageCapacity API, which newer API servers no longer serve (that beta API was removed in Kubernetes 1.27). At the moment clusters below 1.25 have no problems; higher-version clusters will probably need additional support.

from crane-scheduler.

mobeixiaoxin avatar mobeixiaoxin commented on August 26, 2024

OK, thanks.

from crane-scheduler.

redtee123 avatar redtee123 commented on August 26, 2024

My Kubernetes version is 1.20.7 and I'm using crane-scheduler image version 0.0.20 as a second scheduler. The node annotations already contain the aggregated metrics. When I create a new pod to test scheduling, it stays in Pending.

crane-scheduler logs:
I1018 14:19:17.775925 1 serving.go:331] Generated self-signed cert in-memory
W1018 14:19:18.105223 1 options.go:330] Neither --kubeconfig nor --master was specified. Using default API client. This might not work.
W1018 14:19:18.116946 1 authorization.go:47] Authorization is disabled
W1018 14:19:18.116959 1 authentication.go:40] Authentication is disabled
I1018 14:19:18.116979 1 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
I1018 14:19:18.119411 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1018 14:19:18.119430 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1018 14:19:18.119461 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.119469 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.119489 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.119498 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.119562 1 secure_serving.go:197] Serving securely on [::]:10259
I1018 14:19:18.119635 1 tlsconfig.go:240] Starting DynamicServingCertificateController
I1018 14:19:18.219523 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.219544 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.219982 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1018 14:19:18.320414 1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-scheduler...

crane-scheduler-controller logs:
I1018 22:16:26.114291 1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (277.013756ms)
I1018 22:19:25.740401 1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (34.500361ms)
I1018 22:19:25.764618 1 node.go:75] Finished syncing node event "kube-master-01/mem_usage_avg_5m" (24.178999ms)
I1018 22:19:25.798566 1 node.go:75] Finished syncing node event "kube-node-01/mem_usage_avg_5m" (33.90647ms)
I1018 22:19:25.826773 1 node.go:75] Finished syncing node event "kube-node-02/cpu_usage_avg_5m" (28.169613ms)
I1018 22:19:25.848814 1 node.go:75] Finished syncing node event "kube-master-01/cpu_usage_avg_5m" (22.005738ms)
I1018 22:19:26.117118 1 node.go:75] Finished syncing node event "kube-node-01/cpu_usage_avg_5m" (268.264709ms)
I1018 22:22:25.737763 1 node.go:75] Finished syncing node event "kube-node-01/mem_usage_avg_5m" (32.338992ms)
I1018 22:22:25.765262 1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (27.45828ms)
I1018 22:22:25.794327 1 node.go:75] Finished syncing node event "kube-master-01/mem_usage_avg_5m" (29.029129ms)
I1018 22:22:25.818029 1 node.go:75] Finished syncing node event "kube-node-02/cpu_usage_avg_5m" (23.666818ms)
I1018 22:22:25.841672 1 node.go:75] Finished syncing node event "kube-master-01/cpu_usage_avg_5m" (23.603915ms)
I1018 22:22:26.125154 1 node.go:75] Finished syncing node event "kube-node-01/cpu_usage_avg_5m" (283.438566ms)

from crane-scheduler.

qmhu avatar qmhu commented on August 26, 2024

It may be that leader election was not disabled for the second scheduler.
The scheduler installed by the Helm chart disables leader election; you can refer to:
https://github.com/gocrane/helm-charts/blob/main/charts/scheduler/templates/scheduler-deployment.yaml#L23
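
Roughly, that line in the chart turns leader election off for the second scheduler. A hedged sketch of the relevant Deployment fragment follows (the image tag, binary path, and config path are illustrative, not the verbatim chart; crane-scheduler is built on the kube-scheduler framework, so it accepts the standard --leader-elect flag):

# fragment of the crane-scheduler Deployment's pod template
spec:
  containers:
  - name: crane-scheduler
    image: docker.io/gocrane/crane-scheduler:0.0.23     # tag is illustrative
    command:
    - /scheduler                                        # binary path is illustrative
    - --leader-elect=false                              # do not compete for the kube-scheduler leader lease
    - --config=/etc/kubernetes/scheduler-config.yaml    # config path is illustrative

The same switch also exists inside the KubeSchedulerConfiguration itself (leaderElection.leaderElect: false), which is what the follow-up comment below ends up changing.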

from crane-scheduler.

redtee123 avatar redtee123 commented on August 26, 2024

It was indeed caused by leader election not being disabled for the second scheduler. But it was not the leaderelection in scheduler-deployment.yaml; it was the leaderElection in scheduler-configmap.yaml that had not been turned off.

from crane-scheduler.

lesserror avatar lesserror commented on August 26, 2024

My Kubernetes version is 1.22.12 and I'm using the crane-scheduler image scheduler-0.2.2 as a second scheduler. The node annotations already contain the aggregated metrics, and I have also set leaderElect to false, but when I create a new pod to test scheduling it still stays in Pending.
Pod info:

Events:
  Type     Reason            Age   From             Message
  ----     ------            ----  ----             -------
  Warning  FailedScheduling  15s   crane-scheduler  0/1 nodes are available: 1 Insufficient cpu.

leaderElection setting in the scheduler ConfigMap:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false
    profiles:
    - schedulerName: crane-scheduler
      plugins:
        filter:
          enabled:
          - name: Dynamic
        score:
          enabled:
          - name: Dynamic
            weight: 3

crane-scheduler logs:

I1226 09:47:56.595597       1 serving.go:348] Generated self-signed cert in-memory
W1226 09:47:57.035592       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1226 09:47:57.041561       1 server.go:139] "Starting Kubernetes Scheduler" version="v0.0.0-master+$Format:%H$"
I1226 09:47:57.044642       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1226 09:47:57.044658       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1226 09:47:57.044666       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1226 09:47:57.044679       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1226 09:47:57.044699       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1226 09:47:57.044715       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1226 09:47:57.045160       1 secure_serving.go:200] Serving securely on [::]:10259
I1226 09:47:57.045218       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1226 09:47:57.145093       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I1226 09:47:57.145152       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I1226 09:47:57.145100       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

crane-scheduler-controller logs:

root@master:/home/ubuntu/kube-prometheus/manifests# kubectl logs -n crane-system crane-scheduler-controller-6f6b94c8f7-79vff 
I1226 17:47:56.187263       1 server.go:61] Starting Controller version v0.0.0-master+$Format:%H$
I1226 17:47:56.188316       1 leaderelection.go:248] attempting to acquire leader lease crane-system/crane-scheduler-controller...
I1226 17:48:12.646241       1 leaderelection.go:258] successfully acquired lease crane-system/crane-scheduler-controller
I1226 17:48:12.747072       1 controller.go:72] Caches are synced for controller
I1226 17:48:12.747174       1 node.go:46] Start to reconcile node events
I1226 17:48:12.747208       1 event.go:30] Start to reconcile EVENT events
I1226 17:48:12.773420       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (26.154965ms)
I1226 17:48:12.794854       1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (21.278461ms)
I1226 17:48:12.818035       1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1d" (23.146517ms)
I1226 17:48:12.837222       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (19.151134ms)
I1226 17:48:13.055018       1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (217.762678ms)
I1226 17:48:13.455442       1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1d" (400.366453ms)
I1226 17:51:12.788539       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (41.092765ms)
I1226 17:51:12.810824       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (22.248821ms)
I1226 17:54:12.771140       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (22.840662ms)
I1226 17:54:12.789918       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.740179ms)
I1226 17:57:12.773735       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (26.395777ms)
I1226 17:57:12.792897       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (19.124323ms)
I1226 18:00:12.772243       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (24.369461ms)
I1226 18:00:12.804297       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (32.008004ms)
I1226 18:03:12.774690       1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (27.291591ms)
I1226 18:03:12.795145       1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (20.350165ms)
I1226 18:03:12.813508       1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.32638ms)
I1226 18:03:12.833109       1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (19.549029ms)

from crane-scheduler.
