Installing Kubernetes v1.28.4 on CentOS 7.9
1. Install Kubernetes
Remove any old Kubernetes installation first (I initially followed a random tutorial that didn't work, so clean up the leftovers):
sudo yum remove -y kubeadm kubectl kubelet kubernetes-cni kube*
sudo yum autoremove -y
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/kube*
sudo rm -rf ~/.kube
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kube*
1.1 Environment preparation
# Disable SELinux
setenforce 0 # turn SELinux off immediately
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config # keep it disabled after reboot
# Disable swap
swapoff -a # turn swap off immediately
sed -i '/ swap / s/^/#/' /etc/fstab # keep it disabled after reboot
# Kernel network settings required by Kubernetes
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
modprobe br_netfilter # load the module first, otherwise applying k8s.conf fails with a load error
sysctl -p /etc/sysctl.d/k8s.conf # apply the settings
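One gap in the above: br_netfilter is only loaded for the current boot. To have it load automatically after a reboot, the standard systemd mechanism is a modules-load.d entry (the filename here is my own choice):
cat <<EOF > /etc/modules-load.d/k8s.conf
br_netfilter
EOF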
1.2 Configure repositories
# Switch yum to a domestic mirror (Aliyun)
cd /etc/yum.repos.d && \
sudo mv CentOS-Base.repo CentOS-Base.repo.bak && \
sudo wget -O CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo && \
yum clean all && \
yum makecache
# Add the Kubernetes yum repository (Aliyun mirror)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
1.3 Install packages
yum install -y docker kubelet kubeadm kubectl
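Note that this installs whatever versions the configured repo resolves to (the node output in section 1.8 actually reports v1.28.2 rather than v1.28.4). If you want to check or pin a specific version, something along these lines should work; the pinned version must be one the repo actually carries:
# See which kubeadm versions the repo offers
yum list --showduplicates kubeadm
# Then pin explicitly, e.g.:
# yum install -y kubelet-<version> kubeadm-<version> kubectl-<version>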
1.4 Configure a Docker registry mirror
mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://registry.docker-cn.com"]
}
EOF
service docker restart
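To confirm the mirror is actually in effect, docker info lists it under "Registry Mirrors":
docker info | grep -A 1 "Registry Mirrors"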
1.5 Enable services on boot
systemctl disable firewalld.service && systemctl stop firewalld.service
systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet
1.6 Pull the images
# List the required images; there are 7 of them for this version
kubeadm config images list
registry.k8s.io/kube-apiserver:v1.28.4
registry.k8s.io/kube-controller-manager:v1.28.4
registry.k8s.io/kube-scheduler:v1.28.4
registry.k8s.io/kube-proxy:v1.28.4
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1
These registries are hard to reach from inside China, so pull the images from Aliyun and re-tag them.
Note: this step may not actually be necessary. Even with the images present locally, kubeadm init still tried to pull from the remote registry (solved by passing --image-repository to init), which is a bit odd.
Still, doing it certainly doesn't hurt…
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.4
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1
Re-tag the images:
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.4 registry.k8s.io/kube-apiserver:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.4 registry.k8s.io/kube-controller-manager:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.4 registry.k8s.io/kube-scheduler:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.4 registry.k8s.io/kube-proxy:v1.28.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 registry.k8s.io/pause:3.9
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0 registry.k8s.io/etcd:3.5.9-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1 registry.k8s.io/coredns/coredns:v1.10.1
1.7 Initialize the cluster
The pod and service CIDRs must be set here, otherwise installing the network plugin later will fail.
-v=5 prints verbose logs, which makes troubleshooting easier.
kubeadm init -v=5 --kubernetes-version=v1.28.4 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --image-repository registry.aliyuncs.com/google_containers
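When init succeeds, kubeadm prints a few follow-up commands at the end; the usual kubeconfig setup (needed before the kubectl commands in section 1.8 can reach the cluster) is roughly:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# or, when working as root, simply:
export KUBECONFIG=/etc/kubernetes/admin.conf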
Most of the problems happen during init:
Problem 1
[root@iZbp1h9xe5dfcvrw0m6bzyZ etcd]# kubeadm init --kubernetes-version=1.28.4
[init] Using Kubernetes version: v1.28.4
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: E1207 10:44:58.216394 16932 remote_runtime.go:616] "Status from runtime service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
time="2023-12-07T10:44:58+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install containerd.io
containerd config default > /etc/containerd/config.toml
Configure the systemd cgroup driver
To use the systemd cgroup driver with runc, set the following in /etc/containerd/config.toml:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
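If you'd rather not edit the file by hand: the default config generated above sets SystemdCgroup = false, so a simple sed flip works:
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml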
Restart containerd
sudo systemctl restart containerd
Also enable containerd to start on boot. (After running k8s for a while, a server reboot left the cluster down; the cause was that containerd had not been enabled to start automatically.)
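A minimal way to do that (plain systemd, nothing Kubernetes-specific):
sudo systemctl enable containerd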
Problem 2
I passed the image repository to init and had pre-pulled all the images, yet it kept failing to pull registry.k8s.io/pause:3.9.
Solution
This is because sandbox_image is set in /etc/containerd/config.toml; change its value to the Aliyun mirror.
I set sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
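One way to make that change non-interactively, assuming the generated config has a single sandbox_image line (verify before running), and then restarting containerd so it takes effect:
sudo sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml
sudo systemctl restart containerd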
1.8 Check the status
After a successful init, check the node and pod status:
kubectl get node
kubectl get pod -A
Under normal circumstances you should see something like this…
[root@iZbp1h9xe5dfcvrw0m6bzyZ data]# kubectl get node
NAME STATUS ROLES AGE VERSION
izbp1h9xe5dfcvrw0m6bzyz Ready control-plane 20h v1.28.2
[root@iZbp1h9xe5dfcvrw0m6bzyZ data]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66f779496c-qqbfl 1/1 Running 0 20h
kube-system coredns-66f779496c-wt6m9 1/1 Running 0 20h
kube-system etcd-izbp1h9xe5dfcvrw0m6bzyz 1/1 Running 1 20h
kube-system kube-apiserver-izbp1h9xe5dfcvrw0m6bzyz 1/1 Running 1 20h
kube-system kube-controller-manager-izbp1h9xe5dfcvrw0m6bzyz 1/1 Running 0 20h
kube-system kube-proxy-5cswr 1/1 Running 0 20h
kube-system kube-scheduler-izbp1h9xe5dfcvrw0m6bzyz 1/1 Running 1 20h
If some pods fail to start, check the kubelet logs:
journalctl -xeu kubelet
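If the kubelet journal isn't enough, describing the failing pod and reading its logs usually shows the concrete reason (generic kubectl usage; <pod-name> is a placeholder):
kubectl -n kube-system describe pod <pod-name>
kubectl -n kube-system logs <pod-name>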
1.9 Install the network plugin (flannel)
On the server, in any directory, create kube-flannel.yml with vim and paste in the following content:
apiVersion: v1
kind: Namespace
metadata:
labels:
k8s-app: flannel
pod-security.kubernetes.io/enforce: privileged
name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: flannel
name: flannel
namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: flannel
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
- apiGroups:
- networking.k8s.io
resources:
- clustercidrs
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: flannel
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
kind: ConfigMap
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-cfg
namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-ds
namespace: kube-flannel
spec:
selector:
matchLabels:
app: flannel
k8s-app: flannel
template:
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
containers:
- args:
- --ip-masq
- --kube-subnet-mgr
command:
- /opt/bin/flanneld
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
image: docker.io/flannel/flannel:v0.22.3
name: kube-flannel
resources:
requests:
cpu: 100m
memory: 50Mi
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
privileged: false
volumeMounts:
- mountPath: /run/flannel
name: run
- mountPath: /etc/kube-flannel/
name: flannel-cfg
- mountPath: /run/xtables.lock
name: xtables-lock
hostNetwork: true
initContainers:
- args:
- -f
- /flannel
- /opt/cni/bin/flannel
command:
- cp
image: docker.io/flannel/flannel-cni-plugin:v1.2.0
name: install-cni-plugin
volumeMounts:
- mountPath: /opt/cni/bin
name: cni-plugin
- args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
command:
- cp
image: docker.io/flannel/flannel:v0.22.3
name: install-cni
volumeMounts:
- mountPath: /etc/cni/net.d
name: cni
- mountPath: /etc/kube-flannel/
name: flannel-cfg
priorityClassName: system-node-critical
serviceAccountName: flannel
tolerations:
- effect: NoSchedule
operator: Exists
volumes:
- hostPath:
path: /run/flannel
name: run
- hostPath:
path: /opt/cni/bin
name: cni-plugin
- hostPath:
path: /etc/cni/net.d
name: cni
- configMap:
name: kube-flannel-cfg
name: flannel-cfg
- hostPath:
path: /run/xtables.lock
type: FileOrCreate
name: xtables-lock
Then apply it: kubectl apply -f kube-flannel.yml
Problem 1
"k8s flannel dial tcp 10.96.0.1:443: i/o timeout" (see the Stack Overflow thread of the same name)
This happened because my earlier kubeadm init did not set the network CIDRs.
Solution:
kubeadm reset
Then re-run init; using the init command from section 1.7 works without this problem.
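Note that kubeadm reset does not clean up CNI configuration or iptables rules on its own (its output says as much), so before re-running init I would also clear those manually, roughly:
rm -rf /etc/cni/net.d
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X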
Problem 2
open /run/flannel/subnet.env: no such file or directory
Solution:
Create the file directly with vim /run/flannel/subnet.env and use exactly this content (no need to change the IPs):
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
2. Install a web UI
For the web UI I prefer Kuboard.
The Kuboard docs recommend installing it with Docker:
sudo docker run --privileged -d --restart=unless-stopped --name=kuboard -p <your-kuboard-port>:80/tcp -p 10081:10081/tcp -e KUBOARD_ENDPOINT="http://<your-server-IP>:<your-kuboard-port>" -e KUBOARD_AGENT_SERVER_TCP_PORT="10081" -v /root/kuboard-data:/data eipwork/kuboard:v3
Make sure to add --privileged; the official docs omit it, and my container would not start without it. For KUBOARD_ENDPOINT I used the server's public IP.
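A quick sanity check that the container is actually up (placeholders as above):
docker ps | grep kuboard
curl -I http://<your-server-IP>:<your-kuboard-port>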
Once the container starts normally, you can open the UI.
Username: admin
Password: Kuboard123
Remember to change the password. See the official docs for everything else about operating Kuboard.
Kuboard documentation:
https://kuboard.cn/install/v3/install-built-in.html#%E9%83%A8%E7%BD%B2%E8%AE%A1%E5%88%92
3. Important: certificate expiry
By default, the TLS certificates that Kubernetes components use to talk to each other expire after one year. Once they expire, inter-component communication breaks, so renewing them in time is important.
# Check certificate expiry
kubeadm certs check-expiration
# Renew all certificates
kubeadm certs renew all
To renew the certificates, just run kubeadm certs renew all.
I set up a scheduled job that runs it every 10 months.
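The post doesn't show the exact schedule, so here is only a rough sketch of what such a cron entry could look like (cron can't express "every 10 months" directly, so this simply runs monthly, which is harmless since renewing early is fine). Keep in mind that after a renewal the control-plane static pods (apiserver, controller-manager, scheduler, etcd) need to be restarted to pick up the new certificates.
# /etc/cron.d/k8s-cert-renew  (hypothetical path and schedule)
0 3 1 * * root /usr/bin/kubeadm certs renew all >> /var/log/k8s-cert-renew.log 2>&1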
4. References
This post was written up the day after the installation. I referred to a lot of articles along the way, and I can no longer find some of them.
https://blog.csdn.net/qq_27384769/article/details/103051749
https://blog.alovn.cn/2020/11/15/k8s-network-error-flannel-subnet-env-no-such-file/
https://www.orchome.com/16614
https://blog.csdn.net/weixin_52156647/article/details/129765134