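Fixing the helm-operator error when upgrading rancher-webhook on Rancher v2.6.3
Problem description
In the cattle-system namespace, helm-operation-xxxxx pods are created repeatedly, and each one ends up in the Error status.
Troubleshooting and fix
List the helm-operator pods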
kubectl get pod -n cattle-system
NAME READY STATUS RESTARTS AGE
helm-operation-2mchw 1/2 Error 0 32m
helm-operation-54bq4 1/2 Error 0 22m
helm-operation-5zv78 1/2 Error 0 32m
helm-operation-6tbkg 1/2 Error 0 32m
helm-operation-7z6jn 1/2 Error 0 12m
helm-operation-9tps9 1/2 Error 0 37m
Check the logs
kubectl logs helm-operation-9tps9 -n cattle-system -c helm
# Output:
helm upgrade --history-max=5 --install=true --namespace=cattle-system --reset-values=true --timeout=5m0s --values=/home/shell/helm/values-rancher-webhook-1.0.2-up0.2.2.yaml --version=1.0.2+up0.2.2 --wait=true rancher-webhook /home/shell/helm/rancher-webhook-1.0.2-up0.2.2.tgz
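Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
The log shows that when helm tried to upgrade rancher-webhook, another install, upgrade, or rollback operation was already recorded as in progress for that release.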
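Listing the helm releases shows no deployed app for rancher-webhook.
By default, helm list does not show releases in the pending-install status.
After adding the -a flag to helm list, the release appears.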
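The two listings described above can be sketched as follows. The function names are hypothetical helpers, and the default namespace cattle-system is taken from this write-up; the flags used (--all, --pending, --short) are standard helm list flags in Helm 3.

```shell
# Hypothetical helper: list every release in a namespace, whatever its status.
# Plain "helm list" only shows deployed and failed releases by default,
# so a release stuck in pending-install stays hidden without --all (-a).
show_all_releases() {
  ns="${1:-cattle-system}"
  helm list --namespace "$ns" --all
}

# Hypothetical helper: print only the names of releases in a pending state.
show_pending_releases() {
  ns="${1:-cattle-system}"
  helm list --namespace "$ns" --pending --short
}
```

Calling show_all_releases with no argument inspects cattle-system; the STATUS column should then show the stuck rancher-webhook release.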
Uninstall the broken helm app, rancher-webhook.
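A minimal sketch of the uninstall step, assuming Helm 3 and the release name and namespace used in this write-up. The function name is a hypothetical helper, not part of Rancher or helm.

```shell
# Hypothetical helper: remove a release that is stuck in a pending state.
# Defaults match this write-up: release rancher-webhook in cattle-system.
remove_stuck_release() {
  release="${1:-rancher-webhook}"
  ns="${2:-cattle-system}"
  # Uninstalling clears the stuck release record so the next
  # helm-operation run can install the chart from scratch.
  helm uninstall "$release" --namespace "$ns"
}
```

Running remove_stuck_release without arguments targets rancher-webhook in cattle-system.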
Wait for a helm-operation-xxxxx pod to be triggered again automatically, then check the result:
The logs look normal.
The pod status is normal.
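One way to run these checks is sketched below. It assumes the newest pod whose name contains helm-operation is the retried operation, and reuses the helm container name from the logs command earlier; the function name is hypothetical.

```shell
# Hypothetical helper: print the helm container logs of the newest
# helm-operation pod in a namespace (default: cattle-system).
latest_helm_operation_logs() {
  ns="${1:-cattle-system}"
  # Pods are sorted oldest to newest, so the last match is the latest run.
  pod="$(kubectl get pod --namespace "$ns" \
    --sort-by=.metadata.creationTimestamp -o name \
    | grep helm-operation | tail -n 1)"
  if [ -z "$pod" ]; then
    echo "no helm-operation pod found in $ns" >&2
    return 1
  fi
  kubectl logs "$pod" --namespace "$ns" -c helm
}
```

The pod status itself can be confirmed with the same kubectl get pod -n cattle-system command used at the start.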
References
https://github.com/fluxcd/helm-controller/issues/149
https://forums.rancher.com/t/rancher-in-docker-helm-operation-error/37787