如何修改 Kubernetes 节点 IP 地址?

当项目上有些场景无法验证 Kubernetes VIP 或者做了什么其他操作，需要重新调整节点 IP。

首先停止 kubelet 并备份要操作的目录：

➜ systemctl stop kubelet
➜ mv /etc/kubernetes /etc/kubernetes-bak
➜ mv /var/lib/kubelet/ /var/lib/kubelet-bak

将 pki 证书目录保留下来：

➜ mkdir -p /etc/kubernetes
➜ cp -r /etc/kubernetes-bak/pki /etc/kubernetes
➜ rm /etc/kubernetes/pki/{apiserver.*,etcd/peer.*}
rm: remove regular file ‘/etc/kubernetes/pki/apiserver.crt’? y
rm: remove regular file ‘/etc/kubernetes/pki/apiserver.key’? y
rm: remove regular file ‘/etc/kubernetes/pki/etcd/peer.crt’? y
rm: remove regular file ‘/etc/kubernetes/pki/etcd/peer.key’? y

➜ kubeadm init --config kubeadm.yaml --ignore-preflight-errors=DirAvailable--var-lib-etcd
[init] Using Kubernetes version: v1.22.8
[preflight] Running pre-flight checks
        [WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [api.k8s.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.0.106]
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master1] and IPs [192.168.0.106 127.0.0.1 ::1]
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 12.003599 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.106:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:27993cae9c76d18a1b82b800182c4c7ebc7a704ba1093400ed886f65e709ec04

上面的操作和我们平时去初始化集群的时候几乎是一样版权声明：本文遵循 CC 4.0 BY-SA 版权协议，若要转载请务必附上原文出处链接及本声明，谢谢合作！的，唯一不同的地方是加了一个 --ignore-preflight-errors=DirAvailable--var-lib-etcd 参数，意思就是使用之前 etcd 的数据。然后我们可以验证下 APIServer 的 IP 地址是否变成了新的地址：

➜ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite ‘/root/.kube/config’? y
➜ kubectl cluster-info
Kubernetes control plane is running at https://192.168.0.106:6443
CoreDNS is running at https://192.168.0.106:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

对于 node 节点我们可以 reset 后重新加入到集群即可：

# 在node节点操作
➜ kubeadm reset

重置后重新 join 集群即可：

# 在node节点操作
➜ kubeadm join 192.168.0.106:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:27993cae9c76d18a1b82b800182c4c7ebc7a704ba1093400ed886f65e709ec04

这种方式比上面的方式要简单很多。正常操作后集群也正常了。

➜ kubectl get nodes
NAME      STATUS   ROLES                  AGE     VERSION
master1   Ready    control-plane,master   48d     v1.22.8
node1     Ready    <none>                 48d     v1.22.8
node2     Ready    <none>                 4m50s   v1.22.8

总结

对于 Kubernetes 集群节点的 IP 地址最好使用静态 IP，避免 IP 变动对业务产生影响，如果不是静态 IP，也强烈建议增加一个自定义域名进行签名，这样当 IP 变化后还可以直接重新映射下这个域名即版权声明：本文遵循 CC 4.0 BY-SA 版权协议，若要转载请务必附上原文出处链接及本声明，谢谢合作！可，只需要在 kubeadm 配置文件中通过 ClusterConfiguration 配置 apiServer.certSANs 即可，如下所示：

apiVersion: kubeadm.k8s.io/v1beta3
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - api.k8s.local
  - master1
  - 192.168.0.106
kind: ClusterConfiguration
......

如何修改 Kubernetes 节点 IP 地址?

总结

虚机网格(istio)管理实战篇

Kubeflow Volcano 实现典型 AI 训练任务

Cilium 网络模型之关键配置

Cilium Pod-to-Service 实地探索转发路径及 BPF 处理逻辑