Initial plan

main
John Kenyon 2025-09-21 22:08:02 -07:00
commit c0f20a18fc
1 changed files with 539 additions and 0 deletions

PLAN.md Normal file

# LibVirt + OpenTofu VM Provisioning Plan
## Overview
This plan outlines how to use **LibVirt** (virtualization management toolkit) and **OpenTofu** (Infrastructure as Code tool) to provision and manage virtual machines declaratively.
**LibVirt** provides a unified API for managing different hypervisors (KVM, QEMU, Xen, etc.), while **OpenTofu** allows you to define infrastructure using declarative configuration files.
## Prerequisites
### System Requirements
- Linux host system (Ubuntu/Debian/RHEL/CentOS)
- Hardware virtualization support for KVM (Intel VT-x or AMD-V; check with `lscpu | grep Virtualization`)
- Sufficient RAM and storage for VMs
- Root/sudo access
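
The virtualization requirement above can also be checked from a script. The following is a minimal sketch, assuming an x86 host where the CPU flags `vmx` (Intel VT-x) or `svm` (AMD-V) are listed in `/proc/cpuinfo`; the function name `has_virt_flags` is hypothetical and not part of any tool.

```bash
# Hypothetical helper: succeeds if the given cpuinfo file lists the
# vmx (Intel VT-x) or svm (AMD-V) CPU flag. Defaults to /proc/cpuinfo.
has_virt_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -Eqw 'vmx|svm' "$cpuinfo"
}
```

A typical call would be `has_virt_flags && echo "virtualization extensions found"`. A missing flag can also mean that virtualization is disabled in the firmware settings.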
### Software Installation
#### 1. Install LibVirt and KVM
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager
# RHEL/CentOS/Fedora (package names differ from the Debian ones)
sudo dnf install qemu-kvm libvirt libvirt-client virt-install virt-manager
# Start and enable libvirt service
sudo systemctl start libvirtd
sudo systemctl enable libvirtd
# Add user to libvirt group (log out and back in for this to take effect)
sudo usermod -aG libvirt "$USER"
```
#### 2. Install OpenTofu
```bash
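# Download the installer script, then run it with an explicit install method
# (the script expects --install-method, e.g. deb, rpm, or standalone)
curl --proto '=https' --tlsv1.2 -fsSL https://get.opentofu.org/install-opentofu.sh -o install-opentofu.sh
chmod +x install-opentofu.sh
./install-opentofu.sh --install-method deb
rm -f install-opentofu.sh
# Or install via Snap
sudo snap install opentofu --classic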
# Verify installation
tofu version
```
#### 3. Verify LibVirt Connection
```bash
# Test libvirt connection
virsh list --all
# Check available storage pools
virsh pool-list --all
# Check available networks
virsh net-list --all
```
## Project Structure
```
vm-infrastructure/
├── main.tf              # Main OpenTofu configuration
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── terraform.tfvars     # Variable values
├── cloud-init/          # Cloud-init configuration files
│   ├── user-data.yaml
│   └── meta-data.yaml
├── images/              # VM base images
└── README.md            # Project documentation
```
## Configuration Files
### 1. Main Configuration (`main.tf`)
```hcl
terraform {
  required_providers {
    libvirt = {
      source  = "dmacvicar/libvirt"
      version = "~> 0.7.0"
    }
  }
}

# Configure the LibVirt Provider
provider "libvirt" {
  uri = "qemu:///system"
}

# Create a storage pool
resource "libvirt_pool" "vm_pool" {
  name = var.pool_name
  type = "dir"
  path = var.pool_path
}

# Download Ubuntu cloud image
resource "libvirt_volume" "base_image" {
  name   = "ubuntu-22.04-base.qcow2"
  pool   = libvirt_pool.vm_pool.name
  source = "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
  format = "qcow2"
}

# Create cloud-init disk
resource "libvirt_cloudinit_disk" "cloudinit" {
  name      = "cloudinit.iso"
  pool      = libvirt_pool.vm_pool.name
  user_data = file("${path.module}/cloud-init/user-data.yaml")
  meta_data = file("${path.module}/cloud-init/meta-data.yaml")
}

# Create VM volumes from base image
resource "libvirt_volume" "vm_disk" {
  count          = var.vm_count
  name           = "vm-${count.index + 1}-disk.qcow2"
  pool           = libvirt_pool.vm_pool.name
  base_volume_id = libvirt_volume.base_image.id
  size           = var.disk_size
}

# Create VMs
resource "libvirt_domain" "vm" {
  count     = var.vm_count
  name      = "vm-${count.index + 1}"
  memory    = var.memory
  vcpu      = var.vcpu
  cloudinit = libvirt_cloudinit_disk.cloudinit.id

  network_interface {
    network_name   = "default"
    wait_for_lease = true
  }

  disk {
    volume_id = libvirt_volume.vm_disk[count.index].id
  }

  console {
    type        = "pty"
    target_port = "0"
    target_type = "serial"
  }

  graphics {
    type        = "spice"
    listen_type = "address"
    autoport    = true
  }
}
```
### 2. Variables (`variables.tf`)
```hcl
variable "pool_name" {
  description = "Name of the storage pool"
  type        = string
  default     = "vm-pool"
}

variable "pool_path" {
  description = "Path to the storage pool"
  type        = string
  default     = "/var/lib/libvirt/images/vm-pool"
}

variable "vm_count" {
  description = "Number of VMs to create"
  type        = number
  default     = 3
}

variable "memory" {
  description = "Memory allocation for each VM (MB)"
  type        = number
  default     = 2048
}

variable "vcpu" {
  description = "Number of vCPUs for each VM"
  type        = number
  default     = 2
}

variable "disk_size" {
  description = "Disk size for each VM (bytes)"
  type        = number
  default     = 21474836480 # 20 GiB
}
```
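
Because `disk_size` is given in bytes, the values are easy to mistype. The sketch below shows the arithmetic behind the default, assuming sizes are meant as binary gigabytes (1 GiB = 1024³ bytes); `gib_to_bytes` is a hypothetical helper name.

```bash
# Hypothetical helper: convert whole GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 20   # prints 21474836480, the default disk_size
```

The same calculation gives 32212254720 for 30 GiB and 53687091200 for 50 GiB, which are the other sizes used in this plan.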
### 3. Outputs (`outputs.tf`)
```hcl
output "vm_ips" {
  description = "IP addresses of the created VMs"
  value = {
    for i, vm in libvirt_domain.vm : vm.name => vm.network_interface[0].addresses[0]
  }
}

output "vm_names" {
  description = "Names of the created VMs"
  value       = libvirt_domain.vm[*].name
}
```
### 4. Variable Values (`terraform.tfvars`)
```hcl
pool_name = "my-vm-pool"
pool_path = "/var/lib/libvirt/images/my-vm-pool"
vm_count  = 3
memory    = 4096        # 4 GiB (value is in MB)
vcpu      = 2
disk_size = 32212254720 # 30 GiB
```
### 5. Cloud-Init Configuration
#### User Data (`cloud-init/user-data.yaml`)
```yaml
#cloud-config
users:
  - name: ubuntu
    groups: sudo
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh_authorized_keys:
      - ssh-rsa AAAAB3NzaC1yc2EAAAA... # Your SSH public key
packages:
  - curl
  - wget
  - git
  - htop
package_update: true
package_upgrade: true
# Set timezone
timezone: UTC
# Disable SSH password authentication (key-based login only)
ssh_pwauth: false
# Run commands on first boot
runcmd:
  - systemctl enable ssh
  - systemctl start ssh
  - echo "VM provisioned successfully" > /tmp/provision-complete
```
```
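
cloud-init only treats user data as cloud-config when the file begins with the `#cloud-config` line, so a stray blank line or comment above it makes the settings silently ineffective. A minimal pre-flight check might look like this; `has_cloud_config_header` is a hypothetical helper name.

```bash
# Hypothetical helper: succeeds only if the first line of the file
# is exactly "#cloud-config".
has_cloud_config_header() {
    [ "$(head -n 1 "$1")" = "#cloud-config" ]
}
```

It could be run as `has_cloud_config_header cloud-init/user-data.yaml || echo "missing #cloud-config header"` before `tofu apply`.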
#### Meta Data (`cloud-init/meta-data.yaml`)
```yaml
instance-id: vm-instance
local-hostname: vm-host
```
## Step-by-Step Workflow
### 1. Initial Setup
```bash
# Create project directory
mkdir vm-infrastructure
cd vm-infrastructure
# Create subdirectories
mkdir cloud-init images
# Generate SSH key if needed
ssh-keygen -t rsa -b 4096 -C "your-email@example.com"
```
### 2. Create Configuration Files
Create all the configuration files listed above in their respective locations.
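
Before moving on, it can help to confirm that nothing from the project structure is missing. The sketch below assumes the file layout described in this plan; `missing_files` is a hypothetical helper name.

```bash
# Hypothetical helper: print each expected project file that is missing
# under the given directory (default: current directory).
missing_files() {
    dir="${1:-.}"
    for f in main.tf variables.tf outputs.tf terraform.tfvars \
             cloud-init/user-data.yaml cloud-init/meta-data.yaml; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

Running `missing_files` inside `vm-infrastructure/` prints nothing when every file is in place.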
### 3. Initialize OpenTofu
```bash
# Initialize the project
tofu init
# Validate configuration
tofu validate
# Plan the deployment
tofu plan
```
### 4. Deploy VMs
```bash
# Apply the configuration
tofu apply
# Confirm with 'yes' when prompted
```
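
For repeatable runs, the init, validate, plan, and apply steps can be chained so that a failure stops the sequence and only a saved plan is applied. This is a sketch rather than a required workflow; the function name `deploy` and the plan file name `tfplan` are assumptions.

```bash
# Hypothetical wrapper: stop at the first failing step, and apply only
# the plan that was saved to (and can be reviewed from) the file "tfplan".
deploy() {
    tofu init -input=false &&
    tofu validate &&
    tofu plan -input=false -out=tfplan &&
    tofu apply -input=false tfplan
}
```

Applying a saved plan file does not prompt for confirmation, so review the plan output (for example with `tofu show tfplan`) before using a wrapper like this unattended.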
### 5. Verify Deployment
```bash
# Check VM status
virsh list --all
# Get VM IP addresses
tofu output vm_ips
# SSH into a VM
ssh ubuntu@<vm-ip-address>
```
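
If the `vm_ips` output is empty, the addresses can also be read from libvirt with `virsh domifaddr`. The sketch below extracts the IPv4 addresses from that command's output, assuming its usual four-column layout (Name, MAC address, Protocol, Address); `ipv4_from_domifaddr` is a hypothetical helper name.

```bash
# Hypothetical helper: read `virsh domifaddr` output on stdin and print
# each IPv4 address without its /prefix suffix.
ipv4_from_domifaddr() {
    awk '$3 == "ipv4" { sub(/\/.*/, "", $4); print $4 }'
}
```

With a running VM it would be used as `virsh domifaddr vm-1 | ipv4_from_domifaddr`.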
### 6. Management Commands
```bash
# Show current state
tofu show
# Update configuration: edit the .tf files, then plan and apply
tofu plan
tofu apply
# Destroy infrastructure
tofu destroy
```
## Advanced Features
### Custom Networks
```hcl
# Create custom network
resource "libvirt_network" "vm_network" {
  name      = "vm-network"
  mode      = "nat"
  domain    = "vm.local"
  addresses = ["192.168.100.0/24"]

  dhcp {
    enabled = true
  }

  dns {
    enabled = true
  }
}

# Use custom network in VM
resource "libvirt_domain" "vm" {
  # ... other configuration ...
  network_interface {
    network_id     = libvirt_network.vm_network.id
    wait_for_lease = true
  }
}
```
### Multiple VM Types
```hcl
# Web servers
resource "libvirt_domain" "web_servers" {
  count  = 2
  name   = "web-${count.index + 1}"
  memory = 2048
  vcpu   = 2
  # ... other configuration ...
}

# Database server
resource "libvirt_domain" "db_server" {
  name   = "database"
  memory = 4096
  vcpu   = 4
  # ... other configuration ...
}
```
### Storage Volumes
```hcl
# Additional data volume
resource "libvirt_volume" "data_volume" {
  count = var.vm_count
  name  = "vm-${count.index + 1}-data.qcow2"
  pool  = libvirt_pool.vm_pool.name
  size  = 53687091200 # 50 GiB
}

# Attach to VM (as an additional disk block, alongside the root disk)
resource "libvirt_domain" "vm" {
  # ... other configuration ...
  disk {
    volume_id = libvirt_volume.data_volume[count.index].id
  }
}
```
## Best Practices
### 1. Resource Organization
- Use consistent naming conventions
- Group related resources in modules
- Use variables for configurable values
- Document your infrastructure
### 2. State Management
- Store state files securely (consider remote backends)
- Use state locking to prevent concurrent modifications
- Backup state files regularly
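
When the default local backend is used, the state lives in `terraform.tfstate` in the project directory. A minimal sketch of a timestamped backup follows; the function name `backup_state` and the default directory `state-backups` are assumptions.

```bash
# Hypothetical helper: copy the local state file into a backup directory
# under a timestamped name, and print the path of the copy.
backup_state() {
    state="${1:-terraform.tfstate}"
    dest_dir="${2:-state-backups}"
    [ -f "$state" ] || { echo "no state file: $state" >&2; return 1; }
    mkdir -p "$dest_dir" &&
    copy="$dest_dir/$(basename "$state").$(date +%Y%m%d-%H%M%S)" &&
    cp -p "$state" "$copy" &&
    echo "$copy"
}
```

State files can contain sensitive values, so the backup directory should be kept out of version control and given the same protection as the state itself.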
### 3. Security
- Use SSH keys instead of passwords
- Configure proper firewall rules
- Keep base images updated
- Use cloud-init for secure initial configuration
### 4. Performance
- Size VMs appropriately for workload
- Use thin provisioning for storage
- Monitor resource usage
- Consider CPU topology for multi-vCPU VMs
## Troubleshooting
### Common Issues
#### 1. Permission Denied
```bash
# Ensure user is in libvirt group
sudo usermod -aG libvirt $USER
# Log out and back in so the new group membership takes effect
# Check libvirt service status
sudo systemctl status libvirtd
```
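
A common cause of lingering permission errors is a shell session that started before the group change. Called without a user name, `id -nG` reports the groups of the current process, so it can show whether the session has picked up the change. A minimal sketch, with the hypothetical helper name `session_in_group`:

```bash
# Hypothetical helper: succeeds if the current shell session already
# carries the named group (id without a user name reports the groups
# of the running process, not the group database).
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}
```

For example, `session_in_group libvirt || echo "log out and back in first"`.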
#### 2. Network Issues
```bash
# Check default network
virsh net-list --all
virsh net-info default
# Start default network if stopped
virsh net-start default
virsh net-autostart default
```
#### 3. Storage Pool Issues
```bash
# Check storage pools
virsh pool-list --all
# Create default pool if missing
virsh pool-define-as default dir - - - - /var/lib/libvirt/images
virsh pool-build default
virsh pool-start default
virsh pool-autostart default
```
#### 4. VM Won't Start
```bash
# Check VM configuration
virsh dumpxml vm-name
# Check libvirt logs
sudo journalctl -u libvirtd -f
# Validate domain XML against the libvirt schema
virsh dumpxml vm-name > vm-name.xml
virt-xml-validate vm-name.xml
```
### Debug Commands
```bash
# OpenTofu debugging
export TF_LOG=DEBUG
tofu plan
# LibVirt debugging
export LIBVIRT_DEBUG=1
virsh list
# Check VM console
virsh console vm-name
```
## Monitoring and Maintenance
### VM Monitoring
```bash
# Check VM stats
virsh domstats vm-name
# Monitor VM performance
virsh dominfo vm-name
virsh vcpuinfo vm-name
```
### Backup Strategy
```bash
# Backup VM configuration
virsh dumpxml vm-name > vm-name.xml
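# Backup VM disk (shut the VM down first for a consistent copy; a disk cloned
# from the base image also needs that base image in order to be restored)
virsh shutdown vm-name
cp /var/lib/libvirt/images/vm-disk.qcow2 /backup/location/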
```
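
With several VMs, the configuration backup can be looped over every defined domain. The sketch below relies on `virsh list --all --name` to enumerate them; the function name `backup_domain_xml` and the default directory `vm-xml-backup` are assumptions.

```bash
# Hypothetical helper: save the XML definition of every defined domain
# into the given directory, one file per VM.
backup_domain_xml() {
    dest_dir="${1:-vm-xml-backup}"
    mkdir -p "$dest_dir" || return 1
    virsh list --all --name | while read -r vm; do
        [ -n "$vm" ] || continue          # skip the trailing blank line
        virsh dumpxml "$vm" > "$dest_dir/$vm.xml" < /dev/null
    done
}
```

This covers the domain definitions only; disk images still need to be copied separately.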
### Updates
```bash
# Update base images regularly
# Caution: VM disks created from this image depend on it, so expect to recreate them too
tofu apply -replace="libvirt_volume.base_image"
# Update OpenTofu provider
tofu init -upgrade
```
## Conclusion
This plan provides a comprehensive approach to using LibVirt and OpenTofu for VM provisioning. The Infrastructure as Code approach ensures reproducible, version-controlled virtual infrastructure that can be easily managed and scaled.
Key benefits:
- **Reproducible**: Infrastructure defined in code
- **Scalable**: Easy to add/remove VMs
- **Version Controlled**: Track infrastructure changes
- **Automated**: Minimal manual intervention required
- **Consistent**: Standardized VM configurations
Start with the basic configuration and gradually add advanced features as needed. Always test changes in a development environment before applying to production.