
Installation

A good practice is to start the bootstrap VM first and then, step by step, all other machines. Because of the --pxe flag, the VMs send DHCP broadcasts requesting a PXE boot. The DHCP server on the services VM picks up these broadcasts and replies with an IP address to use, pointing the nodes back at the services VM as the boot server. The proper FCOS image and Ignition file are then selected and the installation begins.

[okd@okd ~]$ declare -A nodes \
nodes["bootstrap"]="f8:75:a4:ac:01:00" \
nodes["compute-0"]="f8:75:a4:ac:02:00" \
nodes["compute-1"]="f8:75:a4:ac:02:01" \
nodes["compute-2"]="f8:75:a4:ac:02:02" \
nodes["master-0"]="f8:75:a4:ac:03:00" \
nodes["master-1"]="f8:75:a4:ac:03:01" \
nodes["master-2"]="f8:75:a4:ac:03:02" \
nodes["infra-0"]="f8:75:a4:ac:04:00" \
nodes["infra-1"]="f8:75:a4:ac:04:01" \
nodes["infra-2"]="f8:75:a4:ac:04:02" ; \
for key in "${!nodes[@]}" ; \
do \
virt-install \
    -n ${key}.$HOSTNAME \
    --description "${key}.$HOSTNAME" \
    --os-type=Linux \
    --os-variant=fedora36 \
    --ram=16384 \
    --vcpus=4 \
    --disk ~/images/${key}.$HOSTNAME.0.qcow2,bus=virtio,size=128 \
    --nographics \
    --pxe \
    --network network=okd,mac=${nodes[${key}]} \
    --boot menu=on,useserial=on --noreboot --noautoconsole ; \
done
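
If a node does not start its PXE boot as expected, you can attach to its serial console and watch it directly, since the VMs were created with --nographics and useserial=on. Press Ctrl+] to leave the console again:

[okd@okd ~]$ virsh console bootstrap.$HOSTNAME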
[okd@okd ~]$ declare -A storage \
storage["storage-0"]="f8:75:a4:ac:05:00" \
storage["storage-1"]="f8:75:a4:ac:05:01" \
storage["storage-2"]="f8:75:a4:ac:05:02" ; \
for key in "${!storage[@]}" ; \
do \
    virt-install \
        -n ${key}.$HOSTNAME \
        --description "${key}.$HOSTNAME" \
        --os-type=Linux \
        --os-variant=fedora36 \
        --ram=32768 \
        --vcpus=8 \
        --disk ~/images/${key}.$HOSTNAME.0.qcow2,bus=virtio,size=128 \
        --disk ~/images/${key}.$HOSTNAME.1.qcow2,bus=virtio,size=256 \
        --nographics \
        --pxe \
        --network network=okd,mac=${storage[${key}]} \
        --boot menu=on,useserial=on --noreboot --noautoconsole ; \
done
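
To verify that a storage VM was defined with both of its disks, list the block devices of the domain, for example:

[okd@okd ~]$ virsh domblklist storage-0.$HOSTNAME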

You can check the current state of the installation with:

[okd@okd ~]$ watch virsh list --all

Once the services VM is the only one running, power on all virtual machines again:

[okd@okd ~]$ for node in \
    bootstrap \
    master-0 master-1 master-2 \
    compute-0 compute-1 compute-2 \
    infra-0 infra-1 infra-2 \
    storage-0 storage-1 storage-2 ; \
do \
    virsh autostart $node.$HOSTNAME ; \
    virsh start $node.$HOSTNAME ; \
done
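
To confirm that autostart is enabled for a node, inspect it with virsh dominfo, which prints an Autostart field, for example:

[okd@okd ~]$ virsh dominfo master-0.$HOSTNAME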

Wait until the cluster-bootstrapping process is complete. To check if the cluster is up, run the following commands:

[okd@services ~]$ \cp ~/installer/auth/kubeconfig ~/
[okd@services ~]$ echo "export KUBECONFIG=~/kubeconfig" >> ~/.bash_profile
[okd@services ~]$ source ~/.bash_profile
[okd@services ~]$ watch oc whoami

system:admin
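
Alternatively, assuming ~/installer is the asset directory that openshift-install was run from, you can let the installer itself wait for the bootstrap process to finish:

[okd@services ~]$ openshift-install --dir ~/installer wait-for bootstrap-complete --log-level=info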

The cluster is now bootstrapped, but a few more steps are required before the installation can be considered complete.

If you experience any trouble, take a look at the official OKD documentation first. If you are sure that you have found a bug related to OKD, create a new issue here.

Approving the CSRs for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must verify that these CSRs are approved or, if necessary, approve them yourself. Because we PXE-booted all nodes with the proper Ignition files in place, some CSRs should show up after a few minutes.

Review the pending CSRs and ensure that you see a client and server request with Pending or Approved status for each machine that you added to the cluster:

[okd@services ~]$ oc get csr

Because the initial CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster.

Manually approve CSRs if they are pending:

[okd@services ~]$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

This command might need to be executed multiple times as more and more CSRs are created.
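
If you prefer to approve a single CSR instead, pass its name from the oc get csr output directly to the approve command (csr-xxxxx below is a placeholder):

[okd@services ~]$ oc adm certificate approve csr-xxxxx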

After that, the status of each CSR should become Approved,Issued and all nodes should be in the Ready state.

[okd@services ~]$ oc get nodes

NAME                        STATUS   ROLES           AGE    VERSION
compute-0.okd.example.com   Ready    worker          159m   v1.24.6+5157800
compute-1.okd.example.com   Ready    worker          159m   v1.24.6+5157800
compute-2.okd.example.com   Ready    worker          159m   v1.24.6+5157800
infra-0.okd.example.com     Ready    worker          159m   v1.24.6+5157800
infra-1.okd.example.com     Ready    worker          159m   v1.24.6+5157800
infra-2.okd.example.com     Ready    worker          159m   v1.24.6+5157800
master-0.okd.example.com    Ready    master,worker   167m   v1.24.6+5157800
master-1.okd.example.com    Ready    master,worker   167m   v1.24.6+5157800
master-2.okd.example.com    Ready    master,worker   167m   v1.24.6+5157800
storage-0.okd.example.com   Ready    worker          159m   v1.24.6+5157800
storage-1.okd.example.com   Ready    worker          159m   v1.24.6+5157800
storage-2.okd.example.com   Ready    worker          159m   v1.24.6+5157800

Wait until all cluster operators become available

The cluster is fully up and running once all cluster operators become available.

[okd@services ~]$ oc get clusteroperator

NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-0.okd-2022-10-28-153352   True        False         False      6m26s
baremetal                                  4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
cloud-controller-manager                   4.11.0-0.okd-2022-10-28-153352   True        False         False      24m
cloud-credential                           4.11.0-0.okd-2022-10-28-153352   True        False         False      25m
cluster-autoscaler                         4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
config-operator                            4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
console                                    4.11.0-0.okd-2022-10-28-153352   True        False         False      8m57s
csi-snapshot-controller                    4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
dns                                        4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
etcd                                       4.11.0-0.okd-2022-10-28-153352   True        False         False      20m
image-registry                             4.11.0-0.okd-2022-10-28-153352   True        False         False      13m
ingress                                    4.11.0-0.okd-2022-10-28-153352   True        False         False      13m
insights                                   4.11.0-0.okd-2022-10-28-153352   True        False         False      4s
kube-apiserver                             4.11.0-0.okd-2022-10-28-153352   True        False         False      16m
kube-controller-manager                    4.11.0-0.okd-2022-10-28-153352   True        False         False      19m
kube-scheduler                             4.11.0-0.okd-2022-10-28-153352   True        False         False      18m
kube-storage-version-migrator              4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
machine-api                                4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
machine-approver                           4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
machine-config                             4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
marketplace                                4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
monitoring                                 4.11.0-0.okd-2022-10-28-153352   True        False         False      12m
network                                    4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
node-tuning                                4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
openshift-apiserver                        4.11.0-0.okd-2022-10-28-153352   True        False         False      13m
openshift-controller-manager               4.11.0-0.okd-2022-10-28-153352   True        False         False      17m
openshift-samples                          4.11.0-0.okd-2022-10-28-153352   True        False         False      8m42s
operator-lifecycle-manager                 4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
operator-lifecycle-manager-catalog         4.11.0-0.okd-2022-10-28-153352   True        False         False      21m
operator-lifecycle-manager-packageserver   4.11.0-0.okd-2022-10-28-153352   True        False         False      14m
service-ca                                 4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
storage                                    4.11.0-0.okd-2022-10-28-153352   True        False         False      22m
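
Instead of polling oc get clusteroperator manually, you can also block until every cluster operator reports Available; the 30 minute timeout here is an arbitrary choice:

[okd@services ~]$ oc wait clusteroperator --all --for=condition=Available=True --timeout=30m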

Remove the bootstrap resources

Once the cluster is up and running, it is safe to remove the temporary bootstrap node.

[okd@okd ~]$ virsh shutdown bootstrap.$HOSTNAME
[okd@okd ~]$ virsh undefine bootstrap.$HOSTNAME
[okd@okd ~]$ rm -rf ~/images/bootstrap.$HOSTNAME.0.qcow2
[okd@services ~]$ sudo sed -i '/bootstrap/d' /etc/haproxy/haproxy.cfg
[okd@services ~]$ sudo systemctl restart haproxy
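
If you want to double-check the edited configuration before or after the restart, haproxy can validate it without starting the service:

[okd@services ~]$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg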

Next: Authentication