new file: docs/node-restore.md
# Proxmox Node Restore

Restores a Proxmox node exactly as it previously existed, allowing it to rejoin the cluster without using `pvecm delnode`.
It works by reconstructing the node’s identity vectors so that Corosync and pmxcfs accept it as the same entity.

This method is **not guaranteed** to work and depends on the restored identity matching the original exactly.
If any identity vector differs (hostname, node ID, ring IP, certificates), the node may:

- fail to mount pmxcfs
- fail to join Corosync
- appear as a ghost node
- destabilize the cluster

Use this approach only when you fully understand the identity requirements and the cluster
is otherwise healthy.

- [1. Hostname Identity](#1-hostname-identity)
- [2. Local Name Resolution](#2-local-name-resolution)
- [3. System Users and Groups](#3-system-users-and-groups)
- [4. Network Topology](#4-network-topology)
- [5. VFIO Bindings (If Using Passthrough)](#5-vfio-bindings-if-using-passthrough)
- [6. Kernel Flags and Module Loading (If Restoring GRUB)](#6-kernel-flags-and-module-loading-if-restoring-grub)
- [7. Disk Configuration](#7-disk-configuration)
- [8. SSH Identity](#8-ssh-identity)
- [9. Corosync Identity](#9-corosync-identity)
  - [Single-node cluster](#single-node-cluster)
  - [Multi-node cluster](#multi-node-cluster)
- [10. Node‑Specific Artifacts](#10-nodespecific-artifacts)
- [11. Finalize](#11-finalize)

## 1. Hostname Identity

Preserve the node’s cluster identity.

- Restore `/etc/hostname`

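Before copying the file into place, it can help to confirm that the backed-up hostname is the one the cluster expects. The following is a minimal sketch of such a check; the function name `hostname_matches` and the backup path in the comment are illustrative assumptions, not part of any Proxmox tooling.

```bash
# Compare the name stored in a backup copy of /etc/hostname with an expected
# name. Prints "match" and returns 0, or prints a mismatch note and returns 1.
hostname_matches() {
    backup_file=$1
    expected=$2
    # /etc/hostname holds a single name; strip the newline and stray whitespace.
    saved=$(tr -d '[:space:]' < "$backup_file")
    if [ "$saved" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch: backup has '$saved', expected '$expected'"
        return 1
    fi
}

# Example call (the backup location is an assumption):
#   hostname_matches /mnt/backup/etc/hostname pve1
```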
---

## 2. Local Name Resolution

Ensure the node can resolve itself and its peers.

- Restore `/etc/hosts`

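One way to sanity-check a restored hosts file is to look up which address it assigns to the node name. The helper below is a hypothetical sketch (`hosts_ip_for` is not an existing tool). It assumes the result should be the node's cluster-facing address rather than a loopback address, which is the usual expectation on Proxmox but should be verified against your own setup.

```bash
# Print the address that a hosts file assigns to a given name.
# Prints nothing if the name is not listed.
hosts_ip_for() {
    hosts_file=$1
    name=$2
    awk -v n="$name" '
        { sub(/#.*/, "") }                      # drop comments
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; exit }
        }
    ' "$hosts_file"
}

# Example call (node name is an assumption):
#   hosts_ip_for /etc/hosts pve1
```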
---

## 3. System Users and Groups

Restore OS‑level identity and UID/GID mapping.

- Restore `/etc/passwd`
- Restore `/etc/group`

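If the node was reinstalled before the restore, some accounts may already exist with different numeric IDs. The sketch below lists such conflicts by comparing a backup copy of `passwd` with the live one. The function name `uid_mismatches` and both paths are assumptions made for illustration.

```bash
# List accounts that exist in both files but with different UIDs.
# Usage: uid_mismatches BACKUP_PASSWD LIVE_PASSWD
uid_mismatches() {
    backup_passwd=$1
    live_passwd=$2
    awk -F: '
        NR == FNR { uid[$1] = $3; next }     # first file: remember backup UIDs
        ($1 in uid) && uid[$1] != $3 {       # second file: report differences
            printf "%s: backup uid %s, live uid %s\n", $1, uid[$1], $3
        }
    ' "$backup_passwd" "$live_passwd"
}

# Example call (backup location is an assumption):
#   uid_mismatches /mnt/backup/etc/passwd /etc/passwd
```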
---

## 4. Network Topology

Restore bridges, VLANs, MTU, and IP assignments.

- Restore `/etc/network/interfaces`
- Restore `/etc/network/interfaces.d/`

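Because the Corosync ring IP has to match, it may be worth listing the static addresses in the restored interfaces file and comparing them with the ring address you expect. The following is a rough sketch under that assumption; `interface_addresses` is a hypothetical helper and only understands simple `iface` / `address` stanzas.

```bash
# Print "<interface> <address>" for every static address in an
# ifupdown-style interfaces file.
interface_addresses() {
    awk '
        $1 == "iface"   { iface = $2 }        # remember the current stanza
        $1 == "address" { print iface, $2 }   # address belongs to that stanza
    ' "$1"
}

# Example call:
#   interface_addresses /etc/network/interfaces
```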
---

## 5. VFIO Bindings (If Using Passthrough)

Restore PCI passthrough behavior.

- Restore `/etc/modprobe.d/`

---

## 6. Kernel Flags and Module Loading (If Restoring GRUB)

Restore passthrough‑related kernel parameters.

- Restore `/etc/default/grub`
- Restore `/etc/modules`
- Restore `/etc/modules-load.d/`

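To see which kernel parameters a restored GRUB configuration will apply, the default command line can be read straight out of the file. This is a small sketch; `grub_cmdline` is a hypothetical helper, and it assumes the usual double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` form. After the final reboot, the output can be compared by eye with `/proc/cmdline`.

```bash
# Print the value of GRUB_CMDLINE_LINUX_DEFAULT from a grub defaults file.
grub_cmdline() {
    # -n suppresses normal output; the p flag prints only the matching line,
    # reduced to the text between the double quotes.
    sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$1"
}

# Example call:
#   grub_cmdline /etc/default/grub
```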
---

## 7. Disk Configuration

Restore storage layout and ZFS pools.

- Restore `/etc/fstab`
- Import ZFS pools (example: `zpool import fastcore`)

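The import step can be made safe to repeat by checking whether the pool is already present first. The sketch below assumes the pool name `fastcore` from the example above; `ensure_pool_imported` is a hypothetical wrapper around the standard `zpool list` and `zpool import` commands.

```bash
# Import a ZFS pool only if it is not already imported.
ensure_pool_imported() {
    pool=$1
    # "zpool list <pool>" fails when the pool is not currently imported.
    if zpool list -H -o name "$pool" >/dev/null 2>&1; then
        echo "$pool already imported"
    else
        zpool import "$pool" && echo "$pool imported"
    fi
}

# Example calls:
#   zpool import                  # with no arguments: list pools available for import
#   ensure_pool_imported fastcore
```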
---

## 8. SSH Identity

Restore admin access and host identity.

- Restore `/root/.ssh/*`
- Restore `/etc/ssh/*`

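Restoring key files from an archive can leave them with permissions that are too open, and `sshd` refuses to load private host keys in that state. The following sketch resets the modes after a restore; `fix_hostkey_perms` is a hypothetical helper, and it assumes the default `ssh_host_*_key` file naming.

```bash
# Reset permissions on restored SSH host keys in the given directory:
# private keys become 600, public keys 644.
fix_hostkey_perms() {
    ssh_dir=$1
    for key in "$ssh_dir"/ssh_host_*_key; do
        [ -e "$key" ] || continue          # glob matched nothing
        chmod 600 "$key"
        if [ -e "$key.pub" ]; then
            chmod 644 "$key.pub"
        fi
    done
    return 0
}

# Example call:
#   fix_hostkey_perms /etc/ssh
```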
---

## 9. Corosync Identity

### Single-node cluster

In some cases, a single Proxmox node can be resurrected simply by restoring the entire
`/etc/pve` directory from backup. This works because `/etc/pve` (pmxcfs) contains:

- Corosync identity (node ID, ring IPs, cluster membership)
- Cluster certificates
- Node‑specific configuration
- Storage definitions
- VM/CT configuration metadata

If the restored node has **the same hostname, IPs, and Corosync ring addresses**, and
**the cluster still contains the old node entry**, then copying the full `/etc/pve` state
can allow the node to rejoin the cluster as if nothing happened.

### Multi-node cluster

Reconstruct the node’s cluster identity so that it can rejoin without `pvecm delnode`.

- Restore `/etc/pve/corosync.conf`
- Restore `/etc/pve/corosync.pub`
- Restore `/etc/corosync/authkey`

These files define:

- cluster topology
- node ID
- ring IPs
- shared authentication

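Since the node ID and ring IP must match exactly, it can be useful to read them out of the backed-up `corosync.conf` and compare them with what the restored node will actually use. The sketch below is a simple parser under the assumption that the file uses the common `node { name: ... nodeid: ... ring0_addr: ... }` layout; `corosync_node_identity` is a hypothetical helper, not a Corosync or Proxmox command.

```bash
# Print the node ID and ring0 address recorded for one node in a
# corosync.conf file. Prints nothing if the node is not listed.
corosync_node_identity() {
    conf=$1
    node=$2
    awk -v want="$node" '
        $1 == "node" && $2 == "{" { in_node = 1; name = id = ring = ""; next }
        in_node && $1 == "name:"       { name = $2 }
        in_node && $1 == "nodeid:"     { id = $2 }
        in_node && $1 == "ring0_addr:" { ring = $2 }
        in_node && $1 == "}" {                  # end of this node block
            if (name == want) print "nodeid=" id " ring0_addr=" ring
            in_node = 0
        }
    ' "$conf"
}

# Example call (backup location and node name are assumptions):
#   corosync_node_identity /mnt/backup/etc/pve/corosync.conf pve1
```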
---

## 10. Node‑Specific Artifacts

Restore node-local customizations and scripts.

- Restore `/home/`
- Restore `/root/`
- Restore `/opt/`

---

## 11. Finalize

If GRUB was restored:

- Run `update-grub`

If `/etc/modules` or `/etc/modprobe.d/` was restored:

- Run `update-initramfs -u -k all`

Then:

- Reboot

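After the reboot, a short check can confirm whether the node actually came back into the cluster. The following sketch assumes the standard `pve-cluster` and `corosync` systemd units and the `pvecm status` command; the wrapper name `post_restore_check` is hypothetical.

```bash
# Report whether the cluster services are running, then show quorum status.
# Returns 1 as soon as one of the services is found inactive.
post_restore_check() {
    for unit in pve-cluster corosync; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            return 1
        fi
    done
    # Membership and quorum as seen from this node.
    pvecm status
}

# Example call:
#   post_restore_check
```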
---