Installing Kubernetes v1.28.4 on CentOS 7.9

1. Installing k8s

Remove any old k8s install first (I initially followed some random tutorial, and that install didn't succeed):

sudo yum remove -y kubeadm kubectl kubelet kubernetes-cni kube*   
sudo yum autoremove -y
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/kube*
sudo rm -rf ~/.kube
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kube*

1.1 Environment preparation

# Disable SELinux
setenforce 0  # disable SELinux immediately, for the current boot
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config  # keep it disabled after reboot

# Disable swap
swapoff -a  # disable swap immediately, for the current boot
sed -i '/ swap / s/^/#/' /etc/fstab  # keep it disabled after reboot

# Kernel network settings for k8s
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF

modprobe br_netfilter  # load this module first; otherwise applying k8s.conf fails with a load error
sysctl -p /etc/sysctl.d/k8s.conf  # apply the settings
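
modprobe only lasts until the next reboot. A minimal way to make the module load persist (my addition here, not part of the original steps) is a modules-load.d entry:

cat <<EOF > /etc/modules-load.d/k8s.conf
br_netfilter
EOF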


1.2 Configure repositories


# Switch yum to a domestic mirror (Aliyun)
cd /etc/yum.repos.d  && \
sudo mv CentOS-Base.repo CentOS-Base.repo.bak && \
sudo wget -O CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo && \
yum clean all && \
yum makecache

# Configure the k8s package repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
        http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
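
To confirm the repo works and see which versions it currently offers (an optional check, not in my original steps):

yum list kubeadm --showduplicates | tail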

1.3 Install

yum install -y docker kubelet kubeadm kubectl
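
This installs whatever versions the repo currently resolves to. To pin versions explicitly instead (an optional variant, not what I did), yum accepts name-version syntax:

yum install -y kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2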

1.4 Switch Docker to a registry mirror

mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://registry.docker-cn.com"]
}
EOF
 
service docker restart
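
To verify the mirror took effect (docker info lists the configured mirrors):

docker info | grep -A1 'Registry Mirrors'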

1.5 Enable services at boot

systemctl disable firewalld.service  && systemctl stop firewalld.service 
systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet

1.6 Pull the images

# List the images needed; seven in this case, as below
kubeadm config images list

registry.k8s.io/kube-apiserver:v1.28.4
registry.k8s.io/kube-controller-manager:v1.28.4
registry.k8s.io/kube-scheduler:v1.28.4
registry.k8s.io/kube-proxy:v1.28.4
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1

These images are effectively unreachable from mainland China, so pull them from Aliyun instead and retag them.

Note: this step may not be necessary. Even with the images present locally, kubeadm init still went off to pull from the remote registry (solved by passing --image-repository to init), which seemed odd at the time. The likely reason is that kubeadm pulls through containerd's CRI, so images sitting in Docker's local store are invisible to it. Still, doing this certainly does no harm…

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1

Retag them:

docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.4 registry.k8s.io/kube-apiserver:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.4 registry.k8s.io/kube-controller-manager:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.4 registry.k8s.io/kube-scheduler:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.4 registry.k8s.io/kube-proxy:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 registry.k8s.io/pause:3.9
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0 registry.k8s.io/etcd:3.5.9-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1 registry.k8s.io/coredns/coredns:v1.10.1
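
If you do want Docker-pulled images to be visible under containerd (a sketch of an alternative; I solved it with --image-repository instead), they can be exported from Docker and imported into containerd's k8s.io namespace:

# export from Docker, import into the namespace kubelet uses
docker save registry.k8s.io/pause:3.9 | ctr -n k8s.io images import -
# verify the CRI sees it (assumes crictl is pointed at the containerd socket)
crictl images | grep pause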

1.7 Initialize

The CIDRs here must be set, otherwise installing the network plugin later runs into problems.
-v=5 prints verbose logs, which makes troubleshooting easier.

kubeadm init -v=5 --kubernetes-version=v1.28.4 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --image-repository registry.aliyuncs.com/google_containers

init gave me the most trouble:

Problem 1

[root@iZbp1h9xe5dfcvrw0m6bzyZ etcd]# kubeadm init --kubernetes-version=1.28.4
[init] Using Kubernetes version: v1.28.4
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: E1207 10:44:58.216394   16932 remote_runtime.go:616] "Status from runtime service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
time="2023-12-07T10:44:58+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Solution

yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install containerd.io
containerd config default > /etc/containerd/config.toml

Configure the systemd cgroup driver. To use the systemd cgroup driver with runc, set the following in /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
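
The freshly generated default config already contains this block with SystemdCgroup = false, so a one-line edit works too (an equivalent shortcut under that assumption):

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml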

Restart containerd:

sudo systemctl restart containerd

Also enable containerd at boot. (After running k8s for a while I rebooted the server and found k8s hadn't come back up; the cause was exactly this missing step.)
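
sudo systemctl enable containerd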


Problem 2

I passed an image repository to init and had pulled all the images in advance, yet it kept failing to pull registry.k8s.io/pause:3.9.

Solution

This is because /etc/containerd/config.toml specifies a sandbox_image; change its value to the Aliyun mirror.
I set sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9".
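
A quick way to apply that edit (a sketch, assuming the stock generated config; the restart is what makes containerd reload the file):

sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml
systemctl restart containerd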

1.8 Check status

After a successful init, check the node and pod status:
kubectl get node
kubectl get pod -A
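
If kubectl instead complains that it cannot reach the server, the kubeconfig setup that kubeadm init prints at the end was probably skipped; it is the standard sequence:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config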

Normally you will see something like this…

[root@iZbp1h9xe5dfcvrw0m6bzyZ data]# kubectl get node
NAME                      STATUS   ROLES           AGE   VERSION
izbp1h9xe5dfcvrw0m6bzyz   Ready    control-plane   20h   v1.28.2
[root@iZbp1h9xe5dfcvrw0m6bzyZ data]# kubectl get pod -A
NAMESPACE      NAME                                              READY   STATUS    RESTARTS       AGE
kube-system    coredns-66f779496c-qqbfl                          1/1     Running   0              20h
kube-system    coredns-66f779496c-wt6m9                          1/1     Running   0              20h
kube-system    etcd-izbp1h9xe5dfcvrw0m6bzyz                      1/1     Running   1              20h
kube-system    kube-apiserver-izbp1h9xe5dfcvrw0m6bzyz            1/1     Running   1              20h
kube-system    kube-controller-manager-izbp1h9xe5dfcvrw0m6bzyz   1/1     Running   0              20h
kube-system    kube-proxy-5cswr                                  1/1     Running   0              20h
kube-system    kube-scheduler-izbp1h9xe5dfcvrw0m6bzyz            1/1     Running   1              20h

If some pods fail to come up, check the logs:

journalctl -xeu kubelet
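
For one specific failing pod, the usual kubectl checks narrow things down further (generic commands; substitute the real pod name):

kubectl -n kube-system describe pod <pod-name>
kubectl -n kube-system logs <pod-name>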

1.9 Install the network plugin

On the server, in any directory, run vim kube-flannel.yml and paste in the content below:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-ds
  namespace: kube-flannel
spec:
  selector:
    matchLabels:
      app: flannel
      k8s-app: flannel
  template:
    metadata:
      labels:
        app: flannel
        k8s-app: flannel
        tier: node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - args:
        - --ip-masq
        - --kube-subnet-mgr
        command:
        - /opt/bin/flanneld
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        image: docker.io/flannel/flannel:v0.22.3
        name: kube-flannel
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
        volumeMounts:
        - mountPath: /run/flannel
          name: run
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
        - mountPath: /run/xtables.lock
          name: xtables-lock
      hostNetwork: true
      initContainers:
      - args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        command:
        - cp
        image: docker.io/flannel/flannel-cni-plugin:v1.2.0
        name: install-cni-plugin
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-plugin
      - args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        command:
        - cp
        image: docker.io/flannel/flannel:v0.22.3
        name: install-cni
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
      priorityClassName: system-node-critical
      serviceAccountName: flannel
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /run/flannel
        name: run
      - hostPath:
          path: /opt/cni/bin
        name: cni-plugin
      - hostPath:
          path: /etc/cni/net.d
        name: cni
      - configMap:
          name: kube-flannel-cfg
        name: flannel-cfg
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock

Then run kubectl apply -f kube-flannel.yml
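
Once applied, the flannel pods should come up and the node should turn Ready (standard checks):

kubectl get pods -n kube-flannel
kubectl get node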

Problem 1

k8s flannel dial tcp 10.96.0.1:443: i/o timeout (see the Stack Overflow question of the same name)

This happened because my earlier kubeadm init did not set the network CIDRs.
Solution:
kubeadm reset
Then init again; re-running the full init command from section 1.7 works fine.

Problem 2

open /run/flannel/subnet.env: no such file or directory

Solution:
Create the file directly: vim /run/flannel/subnet.env with exactly the content below; the IPs need no changes (they match the pod CIDR passed to init). Note that /run is a tmpfs, so this file disappears on reboot; once flannel is running properly it maintains the file itself.

FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

2. Install a graphical UI

For a graphical UI I'm used to Kuboard.
The Kuboard docs recommend installing it with Docker.

sudo docker run --privileged -d \
  --restart=unless-stopped \
  --name=kuboard \
  -p <your-kuboard-port>:80/tcp \
  -p 10081:10081/tcp \
  -e KUBOARD_ENDPOINT="http://<your-server-IP>:<your-kuboard-port>" \
  -e KUBOARD_AGENT_SERVER_TCP_PORT="10081" \
  -v /root/kuboard-data:/data \
  eipwork/kuboard:v3

(For <your-server-IP> I used the machine's public IP.)

The --privileged flag here is essential. The official docs leave it out, and without it my container would not start.

As long as the container starts normally, the UI is reachable.
Username: admin
Password: Kuboard123
Remember to change the password. For Kuboard usage details, see the official docs.

Kuboard docs:
https://kuboard.cn/install/v3/install-built-in.html#%E9%83%A8%E7%BD%B2%E8%AE%A1%E5%88%92

3. Important: certificate expiry

By default, the TLS certificates that the k8s components use to talk to each other expire after one year. Once they expire, inter-component communication breaks, so renewing them in time is important.

# Check certificate expiry
kubeadm certs check-expiration
# Renew the certificates
kubeadm certs renew all

To renew the certificates, running kubeadm certs renew all is all it takes.
I set up a cron job to run it every 10 months.
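
Cron cannot express "every 10 months" directly, and renewing early is harmless, so a fixed schedule well inside the one-year window does the job (an illustrative /etc/crontab line, not my exact setup). Note that after renewal the control-plane static pods need to be restarted to pick up the new certificates:

# renew all kubeadm certs at 03:00 on the 1st of January and July
0 3 1 */6 * root /usr/bin/kubeadm certs renew all >> /var/log/kubeadm-certs-renew.log 2>&1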

4. References

This post was written up the day after the install; I referred to a lot of articles during the process and can no longer find some of them.
https://blog.csdn.net/qq_27384769/article/details/103051749
https://blog.alovn.cn/2020/11/15/k8s-network-error-flannel-subnet-env-no-such-file/
https://www.orchome.com/16614
https://blog.csdn.net/weixin_52156647/article/details/129765134