new file: docs/node-restore.md
# Proxmox Node Restore

Restores a Proxmox node exactly as it previously existed, allowing it to rejoin the cluster without using `pvecm delnode`.
It works by reconstructing the node’s identity vectors (hostname, node ID, ring IPs, certificates) so that Corosync and pmxcfs accept it as the same entity.

This method is **not guaranteed** and depends on perfect identity matching.
If any identity vector differs, the node may:

- fail to mount pmxcfs
- fail to join Corosync
- appear as a ghost node
- destabilize the cluster

Use this approach only when you fully understand the identity requirements and the cluster is otherwise healthy.

- [1. Hostname Identity](#1-hostname-identity)
- [2. Local Name Resolution](#2-local-name-resolution)
- [3. System Users and Groups](#3-system-users-and-groups)
- [4. Network Topology](#4-network-topology)
- [5. VFIO Bindings (If Using Passthrough)](#5-vfio-bindings-if-using-passthrough)
- [6. Kernel Flags and Module Loading (If Restoring GRUB)](#6-kernel-flags-and-module-loading-if-restoring-grub)
- [7. Disk Configuration](#7-disk-configuration)
- [8. SSH Identity](#8-ssh-identity)
- [9. Corosync Identity](#9-corosync-identity)
  - [Single node cluster](#single-node-cluster)
  - [Multi node cluster](#multi-node-cluster)
- [10. Node‑Specific Artifacts](#10-nodespecific-artifacts)
- [11. Finalize](#11-finalize)

## 1. Hostname Identity

Preserve the node’s cluster identity. The hostname must match the node name the cluster already knows.

- Restore `/etc/hostname`

---

## 2. Local Name Resolution

Ensure the node can resolve itself and its peers.

- Restore `/etc/hosts`
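One way to sanity-check the restored file is to confirm that the node's own name maps to its cluster address. The sketch below is a minimal illustration in Python, not part of the procedure itself. The host names and the `192.0.2.x` addresses in `SAMPLE_HOSTS` are hypothetical placeholders, and the helper names are made up for this example. In practice, read the text from `/etc/hosts` and pass the node's real name and address.

```python
def hosts_entries(hosts_text: str) -> dict[str, str]:
    """Map every name in /etc/hosts-style text to its first listed IP address."""
    mapping: dict[str, str] = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for each name
    return mapping


def resolves_to(hosts_text: str, hostname: str, expected_ip: str) -> bool:
    """True if `hostname` maps to `expected_ip` in the given hosts text."""
    return hosts_entries(hosts_text).get(hostname) == expected_ip


# Hypothetical sample data; substitute the node's real name and address.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
192.0.2.11 pve1.example.lan pve1  # this node
192.0.2.12 pve2.example.lan pve2
"""

if __name__ == "__main__":
    print(resolves_to(SAMPLE_HOSTS, "pve1", "192.0.2.11"))
```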

## 3. System Users and Groups

Restore OS‑level identity and UID/GID mapping.

- Restore `/etc/passwd`
- Restore `/etc/group`
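If the node was reinstalled before the restore, it can help to see which accounts would change their numeric ID, since file ownership on disk follows the number rather than the name. The following Python sketch compares two passwd- or group-style texts. The function names and the `backup-op` account are hypothetical and exist only for illustration.

```python
from __future__ import annotations


def name_to_id(passwd_or_group_text: str) -> dict[str, int]:
    """Parse passwd(5)/group(5)-style text into {name: numeric id}.

    In both formats the name is field 0 and the UID or GID is field 2.
    """
    ids: dict[str, int] = {}
    for line in passwd_or_group_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 3 and fields[2].isdigit():
            ids[fields[0]] = int(fields[2])
    return ids


def id_drift(backup_text: str, current_text: str) -> dict[str, tuple[int, int | None]]:
    """Return {name: (backup id, current id)} for names that changed or are missing."""
    backup, current = name_to_id(backup_text), name_to_id(current_text)
    return {
        name: (old_id, current.get(name))
        for name, old_id in backup.items()
        if current.get(name) != old_id
    }


# Hypothetical sample data: the same account received a different UID on reinstall.
BACKUP_PASSWD = "root:x:0:0:root:/root:/bin/bash\nbackup-op:x:1001:1001::/home/backup-op:/bin/bash\n"
CURRENT_PASSWD = "root:x:0:0:root:/root:/bin/bash\nbackup-op:x:1002:1002::/home/backup-op:/bin/bash\n"

if __name__ == "__main__":
    print(id_drift(BACKUP_PASSWD, CURRENT_PASSWD))
```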

---

## 4. Network Topology

Restore bridges, VLANs, MTU, and IP assignments.

- Restore `/etc/network/interfaces`
- Restore `/etc/network/interfaces.d/`
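To confirm that the restored topology matches the backup, the per-interface options can be compared directly. The Python sketch below parses ifupdown-style text in a simplified way: it only understands `iface` stanzas and their indented options, and it merges multiple stanzas for the same interface. The interface names, addresses, and the default list of compared options are assumptions chosen for the example.

```python
def iface_options(interfaces_text: str) -> dict[str, dict[str, str]]:
    """Parse ifupdown-style text into {interface: {option: value}}."""
    stanzas: dict[str, dict[str, str]] = {}
    current = None
    for raw in interfaces_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "iface" and len(words) >= 2:
            current = stanzas.setdefault(words[1], {})
        elif words[0] in ("auto", "source", "source-directory", "mapping") or words[0].startswith("allow-"):
            current = None  # these keywords end the current iface stanza
        elif current is not None and len(words) >= 2:
            current[words[0]] = " ".join(words[1:])
    return stanzas


def option_drift(backup_text, current_text,
                 options=("address", "mtu", "bridge-ports", "bridge-vids")):
    """Return {(interface, option): (backup value, current value)} for differences."""
    backup, current = iface_options(backup_text), iface_options(current_text)
    drift = {}
    for iface, opts in backup.items():
        for opt in options:
            old, new = opts.get(opt), current.get(iface, {}).get(opt)
            if old != new:
                drift[(iface, opt)] = (old, new)
    return drift


# Hypothetical sample: a single bridge with jumbo frames.
SAMPLE_INTERFACES = """\
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.0.2.11/24
    gateway 192.0.2.1
    bridge-ports eno1
    mtu 9000
"""

if __name__ == "__main__":
    without_mtu = SAMPLE_INTERFACES.replace("    mtu 9000\n", "")
    print(option_drift(SAMPLE_INTERFACES, without_mtu))
```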

---

## 5. VFIO Bindings (If Using Passthrough)

Restore PCI passthrough behavior.

- Restore `/etc/modprobe.d/`
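Passthrough setups commonly bind devices through an `options vfio-pci ids=...` line in one of the restored files. As a rough check, the Python sketch below extracts those vendor:device IDs so they can be compared with the hardware actually present. The function name and the sample IDs are hypothetical, and configurations that bind devices by other means are not covered.

```python
def vfio_ids(modprobe_text: str) -> list[str]:
    """Collect vendor:device IDs from `options vfio-pci ids=...` lines."""
    ids: list[str] = []
    for line in modprobe_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 3 and fields[0] == "options" and fields[1] in ("vfio-pci", "vfio_pci"):
            for opt in fields[2:]:
                if opt.startswith("ids="):
                    ids.extend(i.lower() for i in opt[len("ids="):].split(",") if i)
    return ids


# Hypothetical sample content of a file under /etc/modprobe.d/.
SAMPLE_MODPROBE = """\
# bind the GPU and its audio function to vfio-pci
options vfio-pci ids=10DE:1B81,10de:10f0 disable_vga=1
blacklist nouveau
"""

if __name__ == "__main__":
    print(vfio_ids(SAMPLE_MODPROBE))
```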

---

## 6. Kernel Flags and Module Loading (If Restoring GRUB)

Restore passthrough‑related kernel parameters and the list of modules loaded at boot.

- Restore `/etc/default/grub`
- Restore `/etc/modules`
- Restore `/etc/modules-load.d/`
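Restored kernel flags only take effect once the boot configuration is regenerated and the node is rebooted, so it can be useful to compare the configured flags with the running kernel's command line. The Python sketch below assumes the flags live in `GRUB_CMDLINE_LINUX_DEFAULT`; the sample flags and the helper names are illustrative only. On a real node the second argument would be the contents of `/proc/cmdline`.

```python
import shlex


def grub_cmdline(grub_text: str, key: str = "GRUB_CMDLINE_LINUX_DEFAULT") -> list[str]:
    """Return the kernel parameters assigned to `key` in /etc/default/grub text."""
    params: list[str] = []
    for line in grub_text.splitlines():
        line = line.strip()
        if line.startswith(key + "="):
            # shlex removes the shell quoting; a later assignment overrides an earlier one
            value = " ".join(shlex.split(line[len(key) + 1:], comments=True))
            params = value.split()
    return params


def missing_from_running(grub_text: str, proc_cmdline: str) -> list[str]:
    """List configured parameters that the running kernel was not booted with."""
    running = set(proc_cmdline.split())
    return [p for p in grub_cmdline(grub_text) if p not in running]


# Hypothetical sample data.
SAMPLE_GRUB = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"\n'
SAMPLE_PROC_CMDLINE = "BOOT_IMAGE=/boot/vmlinuz root=/dev/mapper/pve-root ro quiet"

if __name__ == "__main__":
    print(missing_from_running(SAMPLE_GRUB, SAMPLE_PROC_CMDLINE))
```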

---

## 7. Disk Configuration

Restore storage layout and ZFS pools.

- Restore `/etc/fstab`
- Import ZFS pools (example: `zpool import fastcore`)
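A restored `/etc/fstab` that references filesystems by UUID will fail to mount them if the disks were reformatted in the meantime. The Python sketch below lists UUID-based entries and reports those with no matching link under `/dev/disk/by-uuid`. The sample entries and the UUID are hypothetical, and entries addressed by device path or label are deliberately ignored.

```python
import os


def fstab_uuid_mounts(fstab_text: str) -> list[tuple[str, str]]:
    """Return (uuid, mount point) for every fstab entry addressed by UUID=."""
    mounts = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("UUID="):
            uuid = fields[0][len("UUID="):].strip('"')
            mounts.append((uuid, fields[1]))
    return mounts


def missing_uuids(fstab_text: str, by_uuid_dir: str = "/dev/disk/by-uuid") -> list[tuple[str, str]]:
    """Return the UUID entries that have no matching device link."""
    return [
        (uuid, mount)
        for uuid, mount in fstab_uuid_mounts(fstab_text)
        if not os.path.exists(os.path.join(by_uuid_dir, uuid))
    ]


# Hypothetical sample data.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext4 errors=remount-ro 0 1
UUID=1234-ABCD /boot/efi vfat defaults 0 1
/dev/pve/swap none swap sw 0 0
"""

if __name__ == "__main__":
    print(fstab_uuid_mounts(SAMPLE_FSTAB))
```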

---

## 8. SSH Identity

Restore admin access and host identity.

- Restore `/root/.ssh/*`
- Restore `/etc/ssh/*`
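Copying key files out of a backup can loosen their permissions, and OpenSSH refuses private keys that other users can read. The Python sketch below flags private keys in a directory whose mode grants any access to group or others. The glob patterns are assumptions that cover the usual host-key and user-key file names, so adjust them if the backup uses different names.

```python
from pathlib import Path


def loose_private_keys(directory: str, patterns=("ssh_host_*_key", "id_*")) -> list[str]:
    """List private key files that group or others can access."""
    loose = []
    for pattern in patterns:
        for path in Path(directory).glob(pattern):
            if path.suffix == ".pub" or not path.is_file():
                continue  # public keys may be world-readable
            if path.stat().st_mode & 0o077:
                loose.append(str(path))
    return sorted(loose)


if __name__ == "__main__":
    for directory in ("/etc/ssh", "/root/.ssh"):
        if Path(directory).is_dir():
            print(directory, loose_private_keys(directory))
```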

---

## 9. Corosync Identity

### Single node cluster

In some cases, a single Proxmox node can be resurrected simply by restoring the entire `/etc/pve` directory from backup. This works because `/etc/pve` (pmxcfs) contains:

- Corosync identity (node ID, ring IPs, cluster membership)
- Cluster certificates
- Node‑specific configuration
- Storage definitions
- VM/CT configuration metadata

If the restored node has **the same hostname, the same IPs, and the same Corosync ring addresses, and the cluster still contains the old node entry**, then copying the full `/etc/pve` state can allow the node to rejoin the cluster as if nothing happened.

### Multi node cluster

Reconstruct the node’s cluster identity (rejoin without `pvecm delnode`).

- Restore `/etc/pve/corosync.conf`
- Restore `/etc/pve/corosync.pub`
- Restore `/etc/corosync/authkey`

Together, these files define:

- cluster topology
- node ID
- ring IPs
- shared authentication
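Because a single mismatched value is enough for the rejoin to fail, it is worth checking the node's entry in the restored `corosync.conf` against the identity it is expected to have. The Python sketch below is a simplified parser for the `nodelist` section only; it is not a full Corosync configuration parser. The node names, IDs, and addresses in `SAMPLE_COROSYNC` are hypothetical, as are the function names.

```python
def corosync_nodes(conf_text: str) -> list[dict[str, str]]:
    """Return one {key: value} dict per `node { ... }` block inside `nodelist`."""
    nodes: list[dict[str, str]] = []
    stack: list[str] = []  # names of the sections currently open
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            stack.append(line[:-1].strip())
            if stack == ["nodelist", "node"]:
                nodes.append({})
        elif line == "}":
            if stack:
                stack.pop()
        elif ":" in line and stack == ["nodelist", "node"]:
            key, value = line.split(":", 1)  # split once so IPv6 addresses survive
            nodes[-1][key.strip()] = value.strip()
    return nodes


def identity_mismatches(conf_text: str, hostname: str, expected: dict[str, str]):
    """Return {key: (expected, found)} for this node's entry; empty means a match."""
    entry = next((n for n in corosync_nodes(conf_text) if n.get("name") == hostname), None)
    if entry is None:
        return {"name": (hostname, None)}
    return {k: (v, entry.get(k)) for k, v in expected.items() if entry.get(k) != v}


# Hypothetical two-node sample.
SAMPLE_COROSYNC = """\
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.0.2.11
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.0.2.12
  }
}
totem {
  cluster_name: lab
}
"""

if __name__ == "__main__":
    print(identity_mismatches(SAMPLE_COROSYNC, "pve1", {"nodeid": "1", "ring0_addr": "192.0.2.11"}))
```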

---

## 10. Node‑Specific Artifacts

Restore node-local data that lives outside the system configuration, such as home directories and local scripts.

- Restore `/home/`
- Restore `/root/`
- Restore `/opt/`

---

## 11. Finalize

If GRUB was restored:

- Run `update-grub`

Then:

- Reboot

---