sophos-iaas-review/README.md
2025-12-22 14:54:16 +01:00

# Sophos HA on STACKIT
This directory contains the **Terraform configuration** for deploying a **High Availability (HA) cluster** of two **Sophos Firewalls** on the **STACKIT** cloud platform. The configuration includes all necessary resources such as networks, virtual machines (appliances), volumes, interfaces, and security groups.
## Table of Contents
* [Directory Structure](#directory-structure)
* [Prerequisites](#prerequisites)
* [Infrastructure Guide](#infrastructure-guide)
* [Application Guide](#application-guide)
---
## Directory Structure
#### [`01-provider.tf`](./01-provider.tf)
This file initializes the **STACKIT Terraform Provider**. It declares which cloud platform is managed and which provider version is used. The configuration selects the **latest available version**; this deployment used **version 0.69.0**.
#### [`02-config.tf`](./02-config.tf)
This file defines the main **configuration variables**, which determine the target project and hardware specifications. They include `project_id`, `service_account_key_path` (the path to the service account key), and the defaults `default_region`, `default_az`, and the VM `flavor` (e.g., `m2i.2`).
#### [`03-sophos-image.tf`](./03-sophos-image.tf)
This file configures the creation of the necessary STACKIT images. The local **QCOW2 files** (`PRIMARY-DISK.qcow2` and `AUXILIARY-DISK.qcow2`), which contain the Sophos installation, are converted into usable cloud images.
#### [`04-sophos-network.tf`](./04-sophos-network.tf)
Here, the four **Sophos-specific networks** (Management, LAN, Synchronization, WAN) and the associated components are configured:
* `sophos_mgmt_net`
* `sophos_lan_net`
* `sophos_sync_net`
* `sophos_wan_net`
* A **VIP (Virtual IP) Interface** for HA operation. The VIP should not be attached to any VM.
* A **Public IP address** for external access (e.g., WAN).
* The necessary **Security Groups and Rules** to control traffic.
#### [`05-sophos-appliance1.tf`](./05-sophos-appliance1.tf) and [`06-sophos-appliance2.tf`](./06-sophos-appliance2.tf)
These files each define one of the two Sophos VMs that form the HA cluster. The configuration includes:
* Creation of the two **Volumes** (boot and auxiliary disk) for the VM.
* Creation of the **VM** itself.
* Creation of the **four network interfaces** and their attachment in the following order:
1. Management (`mgmt`)
2. LAN (`lan`)
3. Synchronization (`sync`)
4. WAN (`wan`)
---
## Prerequisites
Before deploying the Sophos HA cluster, ensure the following tools and assets are available:
### 1. Software Tools
* **Terraform CLI:** You need the **Terraform Command Line Interface** installed and configured to execute the Infrastructure-as-Code files (`*.tf`).
* **STACKIT CLI (Optional):** The **STACKIT Command Line Interface** is not strictly required for the Terraform deployment itself, but it is useful for managing your STACKIT project and is *recommended* for mapping the MAC addresses of the network interfaces to the Sophos ports.
### 2. Sophos Assets
* **Sophos Images (QCOW2 Format):** The Sophos appliance images must be available as **two QCOW2 files**.
* Place these files directly in the root directory of the repository and ensure they are named exactly:
* `PRIMARY-DISK.qcow2`
* `AUXILIARY-DISK.qcow2`
* **Sophos Serial Number/License:** A valid **Sophos Serial Number** or **License** is required to activate the appliances and enable HA functionality after they have been successfully deployed.
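The image requirement above can be checked before running Terraform. The following is a minimal sketch, assuming a POSIX shell with `od` and `tr` available. The function names `is_qcow2` and `check_images` are hypothetical helpers and not part of this repository. The check relies on the fact that QCOW2 files start with the four magic bytes `QFI\xfb`.

```bash
# Returns 0 if the file exists and starts with the QCOW2 magic bytes "QFI\xfb".
is_qcow2() {
  [ -f "$1" ] || return 1
  # Read the first 4 bytes as hex and strip spaces/newlines, e.g. "514649fb".
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  [ "$magic" = "514649fb" ]
}

# Checks both expected image files in the given directory (default: current one).
check_images() {
  dir="${1:-.}"
  rc=0
  for name in PRIMARY-DISK.qcow2 AUXILIARY-DISK.qcow2; do
    if is_qcow2 "$dir/$name"; then
      echo "ok: $name"
    else
      echo "missing or not QCOW2: $name"
      rc=1
    fi
  done
  return "$rc"
}
```

Calling `check_images .` from the repository root prints one status line per image and returns a non-zero exit code if either file is missing or is not a QCOW2 image.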
---
## Infrastructure Guide
### Step 1: Prepare the Environment
1. **Clone the Repository:**
```bash
git clone <repository-url>
cd Sophos-HA
```
2. **Provide Sophos Images:**
* Place the two Sophos QCOW2 image files directly into the root directory of the cloned repository.
* Ensure they are named `PRIMARY-DISK.qcow2` and `AUXILIARY-DISK.qcow2`, or adjust the image paths via the corresponding variables.
### Step 2: Configure Terraform Variables
1. **Create Variables File:**
* The required variables are defined in `02-config.tf`. To provide the values, create a new file named **`00-variables.auto.tfvars`** in the root directory.
* Fill in the values based on your project and environment.
2. **Configure STACKIT Service Account:**
* **Generate** a STACKIT Service Account Key (JSON format) and set the `service_account_key_path` variable to the path of this key file.
3. **Configure Terraform Backend:**
* **Recommendation:** For production environments, configure a remote backend (e.g., a **STACKIT Object Storage Bucket**) in `01-provider.tf` so that the `tfstate` is stored securely. For testing purposes, the default local backend is sufficient.
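The variables file from item 1 could be created as sketched below. The variable names are the ones listed for `02-config.tf`; check that file for the exact names. Every value is a placeholder assumption (the project ID, the key file name `sa-key.json`, the region `eu01`, and the availability zone `eu01-1`) and must be replaced with the values of your own project.

```bash
# Write a starter variables file into the current directory.
# All values are placeholders and must be replaced before running Terraform.
cat > 00-variables.auto.tfvars <<'EOF'
project_id               = "00000000-0000-0000-0000-000000000000"
service_account_key_path = "./sa-key.json"
default_region           = "eu01"
default_az               = "eu01-1"
flavor                   = "m2i.2"
EOF
```

Terraform loads files ending in `.auto.tfvars` automatically, so no `-var-file` flag is needed.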
### Step 3: Execute Terraform
1. **Initialize Terraform:**
* Ensure that you are in the project folder (`cd Sophos-HA`).
* Initialize the working directory and download the necessary providers (including the STACKIT provider).
```bash
terraform init
```
2. **Review the Plan:**
* Check the infrastructure changes Terraform intends to make.
```bash
terraform plan
```
3. **Apply the Configuration:**
* Execute the plan to deploy all resources (Networks, Images, Volumes, VMs/Appliances).
```bash
terraform apply
```
* Wait until the process is complete and the VMs are fully provisioned and booted.
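When `terraform plan` and `terraform apply` run separately, the configuration or the infrastructure can change in between, so the applied changes may differ from the reviewed ones. One possible variant, sketched under assumptions, saves the plan to a file and applies exactly that file. The file name `tfplan`, the helper names, and the `TERRAFORM` override variable are hypothetical and not part of this repository.

```bash
# Run Terraform; TERRAFORM may point to another binary (hypothetical override).
tf() {
  "${TERRAFORM:-terraform}" "$@"
}

# Initialize, save the plan to the file "tfplan", and print it for review.
save_plan() {
  tf init &&
    tf plan -out=tfplan &&
    tf show tfplan
}

# Apply exactly the plan that was saved and reviewed.
apply_saved_plan() {
  tf apply tfplan
}
```

Run `save_plan`, review the output, then run `apply_saved_plan`. Applying a saved plan file does not ask for confirmation, so the review has to happen before that call.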
### Step 4: Initial Sophos Appliance Configuration
After successful deployment, the two Sophos Appliances need an initial configuration via the console to enable management access.
1. **Access the Console:**
* In the STACKIT Portal, access the **Console** of **Appliance 1** (`sophos-appliance1`).
2. **Initial Login:**
* The Sophos appliance should boot into the main menu.
* The default password for Sophos appliances is **`admin`** (case-sensitive).
3. **Configure Management Interface:**
* From the main menu, select **`1. Network Configuration`**.
* Select **`1. Interface Configuration`**.
* Identify the interface corresponding to the **Management Network** (e.g., Port 1) and configure it with the predefined IPv4 address and CIDR prefix from the Terraform configuration (network **`sophos_mgmt_net`**).
4. **Repeat for Appliance 2:**
* Perform the same initial console configuration for **Appliance 2** (`sophos-appliance2`), assigning it its own predefined IP address and CIDR prefix within the same `sophos_mgmt_net` subnet.
5. **Access Web Interface:**
* Use a jump host VM within the `sophos_mgmt_net` subnet, or your local device with appropriate routing, to access the Sophos WebAdmin console at `https://<mgmt-ip>:4444`.
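Before opening a browser, reachability of the WebAdmin port can be checked from the jump host. This is a minimal sketch, assuming `curl` is installed and that the appliance still presents a self-signed certificate (hence `-k`). The helper names `webadmin_url` and `check_webadmin` are hypothetical.

```bash
# Build the WebAdmin URL for a management IP (port 4444, as used in this guide).
webadmin_url() {
  printf 'https://%s:4444\n' "$1"
}

# Try to connect to WebAdmin; any HTTP response counts as reachable.
check_webadmin() {
  url=$(webadmin_url "$1")
  # -k skips certificate verification (assumption: self-signed certificate).
  if curl -k -s -o /dev/null --connect-timeout 5 "$url"; then
    echo "reachable: $url"
  else
    echo "not reachable: $url"
    return 1
  fi
}
```

Call `check_webadmin <mgmt-ip>` once for each appliance, using the management IP configured on the console.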
---
## Application Guide
This section outlines the configuration steps required within the Sophos WebAdmin interface to establish the High Availability (HA) cluster and configure public connectivity.
### Step 1: Initial Wizard and Registration
1. **Run the Setup Wizard:**
* Log in to the WebAdmin of both appliances (`https://<mgmt-ip>:4444`).
* Complete the basic setup wizard on **both** appliances.
* **Firmware Check:** Ensure both appliances are running the **exact same firmware version**.
* *Tip:* If firmware versions match immediately, you can configure the second appliance as an "HA Spare" directly during the first wizard step. Otherwise, configure both as standalone first and update firmware.
2. **Registration:**
* After the reboot, log in with the admin credentials.
* Register both firewalls. A valid license (minimum **Home License**) is required to enable HA functionality.
### Step 2: Configure Sync Interface
Perform these steps on **both** appliances:
1. **Network Configuration:**
* Navigate to **Network > Interfaces**.
* Edit the Port identified as the **Sync Interface**.
* Set the **Network Zone** to **DMZ**.
* Assign the IP address allocated by Terraform for the Sync network.
2. **Device Access:**
* Navigate to **Administration > Device Access**.
* Enable **SSH** on the **DMZ** zone (required for HA communication).
### Step 3: Configure High Availability (HA)
Configure the **Auxiliary (Sophos 2)** appliance first, then the **Primary (Sophos 1)**.
#### 1. Configure Sophos 2 (Auxiliary)
* Navigate to **System Services > High Availability**.
* **Initial Device Role:** Select **Auxiliary (Interactive Mode)**.
* **Node Name:** Assign a name (e.g., `Sophos-Node2`).
* **Passphrase:** Set a secure passphrase (note it down; it must match on the Primary).
* **Dedicated HA Link:** Select the **DMZ Port** configured in Step 2.
* Click **Save** or **Enable HA**.
#### 2. Configure Sophos 1 (Primary)
* Navigate to **System Services > High Availability**.
* **Initial Device Role:** Select **Primary (Interactive Mode)**.
* **Cluster ID:** Select an ID (0-63).
* **Node Name:** Assign a name (e.g., `Sophos-Node1`).
* **Passphrase:** Enter the same passphrase used on the Auxiliary.
* **Dedicated HA Link:** Select the **DMZ Port**.
* **Peer HA Link IPv4:** Enter the IP address of the **Sophos 2 Sync Interface**.
* **Monitored Ports:** Select the interfaces to monitor (usually LAN and WAN).
* **Peer Administration Settings:** Enter the **Management IP** of Sophos 2.
* **Hypervisor MAC:** Set **Use host or hypervisor-assigned MAC address** to **True**.
* *Important:* This setting is critical for STACKIT/OpenStack environments to ensure traffic routing works correctly during failover.
* Click **Enable HA**.
### Step 4: Post-HA Network Configuration (VIP)
Once the HA cluster is built, the primary appliance handles traffic. You must now configure the public-facing interface to use the Virtual IP (VIP).
1. **Assign VIP Address:**
* On the active (Primary) appliance, navigate to **Network > Interfaces**.
* Edit the **WAN Port**.
* Manually change the IP address to the **local (private) VIP address**, i.e. the address of the VIP interface created by Terraform.
2. **Enable WAN Services:**
* Navigate to **Administration > Device Access**.
* Enable necessary services on the **WAN** zone (e.g., User Portal, VPN).
* **Testing:** Temporarily enable **Ping** on WAN to verify connectivity.
### Step 5: Verification and Failover Test
1. **Ping Test:**
* Ping the **Public IP** shown in the Terraform output. It should be reachable.
2. **Failover Test:**
* Shut down the **Primary VM** via the STACKIT Portal or Sophos WebUI.
* The **Auxiliary VM** should take over the role of Primary after a short interruption.
* Verify that the Public IP remains pingable.
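To see how long the interruption during failover lasts, the Public IP can be pinged continuously with timestamps while the Primary VM is shut down. The sketch below assumes the Linux `iputils` version of `ping`, where `-W 1` means a one-second timeout. The function name `monitor_ip` and the `PING` override variable are hypothetical.

```bash
# Usage: monitor_ip <ip> [count] [interval-seconds]
# Prints one timestamped status line per probe.
monitor_ip() {
  ip="$1"
  count="${2:-60}"
  interval="${3:-1}"
  i=0
  while [ "$i" -lt "$count" ]; do
    # PING is a hypothetical override for the ping binary.
    if ${PING:-ping} -c 1 -W 1 "$ip" >/dev/null 2>&1; then
      echo "$(date +%T) $ip reachable"
    else
      echo "$(date +%T) $ip UNREACHABLE"
    fi
    i=$((i + 1))
    sleep "$interval"
  done
}
```

Start `monitor_ip <public-ip> 120` in a terminal and then trigger the failover. The run of `UNREACHABLE` lines approximates the length of the interruption.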