Multi-Site Hybrid Core Architecture

This module defines the multi-site core for the Enterprise Hybrid HCI Platform. It describes how a primary data center, a disaster recovery data center, and an out-of-region site interconnect with each other and with public cloud environments.

The goal is to provide a repeatable pattern that supports:

  • High availability for production workloads
  • Disaster recovery with defined RPO/RTO tiers
  • Out-of-region survivability for regional events
  • Consistent Zero Trust enforcement at each perimeter
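The RPO/RTO tiers referenced above can be modeled as a simple lookup that maps each workload tier to its recovery objectives and recovery site. This is a minimal sketch: the tier names, minute values, and site identifiers are illustrative assumptions, not part of the reference design, and real values come from business requirements.

```python
# Illustrative recovery tiers; the names, RPO/RTO values, and target sites
# below are assumptions for the sketch, not prescribed by the architecture.
RECOVERY_TIERS = {
    "tier-0": {"rpo_minutes": 0,   "rto_minutes": 15,   "target": "dr-dc"},   # synchronous
    "tier-1": {"rpo_minutes": 15,  "rto_minutes": 60,   "target": "dr-dc"},   # near-synchronous
    "tier-2": {"rpo_minutes": 240, "rto_minutes": 1440, "target": "oor-dc"},  # asynchronous
}

def failover_target(tier: str) -> str:
    """Return the recovery site for a workload's tier."""
    return RECOVERY_TIERS[tier]["target"]
```

A catalog like this lets automation and DR runbooks resolve "where does this workload recover?" from a single source of truth.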
Diagram: Enterprise Multi-Site Hybrid Architecture

Multi-Site Architecture Overview

The multi-site diagram illustrates the complete hybrid architecture, including:

  • Three Geographic Sites: US East 1 (Primary), US East 2 (DR), US West (Out-of-Region DR)
  • Cloud Integration: Multi-AZ deployment with transit VPC and cloud gateways
  • Security Perimeter: Next-generation firewalls and zero trust enforcement
  • HCI Infrastructure: Infrastructure, application, and database tiers at each site
  • Inter-Site Connectivity: Direct connections and replication between all sites

This architecture provides comprehensive geographic redundancy while maintaining consistent security policies and operational procedures across all sites.

Site Roles

Primary Data Center

The Primary DC hosts the majority of production workloads and platform services. It is the default landing zone for:

  • Line of business applications
  • Core identity and access management
  • Security operations and monitoring
  • Management and orchestration platforms

The Primary DC typically has:

  • The largest HCI footprint
  • The highest connectivity density
  • Direct paths to cloud provider regions

Disaster Recovery Data Center

The DR DC provides synchronous or near-synchronous recovery for critical workloads and asynchronous protection for lower tiers.

Responsibilities:

  • Receive replicated data from the Primary DC
  • Host warm or hot standby instances of critical services
  • Provide a controlled failover target during planned or unplanned events
  • Support regular DR testing without impacting production

Out-of-Region DR Data Center

The Out-of-Region DR DC provides geographic redundancy for regional disasters.

Characteristics:

  • Located in a different geographic and risk domain
  • Receives asynchronous data replication
  • Hosts minimal always-on footprint plus scalable capacity for rapid expansion
  • Often integrated with public cloud for burst capacity and backup storage

Core Network Topology

Each data center follows a consistent pattern:

  • Dual edge routers or provider handoffs
  • Redundant next-generation firewalls
  • A spine-leaf switching fabric that connects HCI blocks and traditional server racks
  • Separate infrastructure, DMZ, application, and database segments

Inter-site connectivity typically includes:

  • High-capacity private links between Primary and DR DC
  • Lower-cost but diverse links to the Out-of-Region DC
  • Logical separation of:
    • User traffic
    • Replication and backup traffic
    • Management and monitoring traffic

Routing is handled by a combination of:

  • Interior gateway protocols for intra-site routing
  • BGP between sites and toward cloud providers and internet edges

Security Zones and Edge Pattern

Each site implements the same security zone model:

  • Internet and cloud connectivity terminate on redundant edge firewalls.
  • Firewalls expose:
    • An untrusted external zone toward the internet and provider networks
    • A transit or core zone used for inter-site and cloud routing
    • Internal zones mapped to:
      • DMZ
      • Application tiers
      • Database tiers
      • Infrastructure and management networks
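The zone model above amounts to a default-deny flow matrix: traffic between two zones passes only if the pair is explicitly allowed. A minimal sketch, where the zone names mirror the text but the specific allowed pairs are illustrative assumptions:

```python
# Default-deny allow-list of inter-zone flows. Zone names follow the site
# security model; the particular pairs permitted here are illustrative only.
ALLOWED_FLOWS = {
    ("untrusted", "dmz"),                              # inbound internet lands in the DMZ
    ("dmz", "app"),                                    # DMZ services call application tiers
    ("app", "db"),                                     # application tiers reach databases
    ("mgmt", "dmz"), ("mgmt", "app"), ("mgmt", "db"),  # management reaches internal zones
}

def is_allowed(src_zone: str, dst_zone: str) -> bool:
    """A flow passes only if explicitly listed; everything else is denied."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS
```

Note that a direct untrusted-to-database flow is denied because it never appears in the allow-list, which is exactly the property the zone model is meant to guarantee.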

Zero Trust is applied at the edge through:

  • User and device identity awareness for remote access
  • Application-aware policies for outbound and inbound flows
  • Consistent logging of all allowed and denied traffic into centralized SIEM

Cloud Integration Pattern

Public cloud environments are treated as additional sites attached to the same core pattern.

Typical components:

  • Site-to-site VPNs or private connectivity (for example, Direct Connect, ExpressRoute, private interconnect) from each data center to cloud transit hubs
  • A cloud transit VPC/VNet that:
    • Aggregates connections from on-premises sites
    • Provides shared services such as DNS forwarders and logging gateways
    • Hosts cloud-native firewalls or security appliances when required
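The transit hub's aggregation role can be sketched as a route table built from per-site attachments: each on-premises site contributes its prefixes, and spokes receive one consolidated table. The site names and RFC 1918 prefixes below are illustrative assumptions.

```python
# Sketch of transit VPC/VNet route aggregation. Site names and prefixes
# are illustrative examples, not addresses from the reference design.
SITE_ATTACHMENTS = {
    "primary-dc": ["10.10.0.0/16"],
    "dr-dc":      ["10.20.0.0/16"],
    "oor-dc":     ["10.30.0.0/16"],
}

def transit_route_table() -> list[str]:
    """Consolidated routes the transit hub offers to every cloud spoke."""
    return sorted(p for prefixes in SITE_ATTACHMENTS.values() for p in prefixes)
```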

Routing:

  • BGP between on-premises edge routers and cloud gateways
  • Consistent prefix advertisement rules so that:
    • The Primary DC is the preferred path for normal operations
    • The DR DC can take over during failover
    • The Out-of-Region DC can serve as a last resort or regional isolation zone
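The preference ordering above maps naturally onto a BGP-style local-preference scheme: higher values win, so the Primary DC is chosen whenever it is advertising, the DR DC takes over when it is not, and the Out-of-Region DC is the last resort. A minimal sketch, with illustrative preference values:

```python
# BGP-style path selection by local preference (higher wins).
# The numeric values are illustrative; only the ordering matters.
LOCAL_PREF = {"primary-dc": 300, "dr-dc": 200, "oor-dc": 100}

def best_path(advertising_sites: list[str]) -> str:
    """Pick the preferred egress site among those currently advertising the prefix."""
    return max(advertising_sites, key=lambda site: LOCAL_PREF[site])
```

During normal operations all three sites advertise and the Primary DC wins; withdrawing the Primary's advertisement during failover shifts traffic to the DR DC with no policy change.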

Security:

  • Policy parity between on-premises and cloud firewalls where possible
  • Centralized identity used for administrative access to both on-premises and cloud workloads
  • Shared logging and monitoring that correlate events across all sites

Resiliency and Failover Patterns

The three-site model supports multiple levels of resiliency.

Intra-Site Resiliency

Within each data center:

  • Dual firewalls operate in high-availability configuration
  • Spine-leaf switches provide redundant paths between all racks
  • HCI clusters tolerate node and component failures while maintaining service

Inter-Site Failover

Primary to DR:

  • Storage replication between Primary and DR HCI clusters
  • Application failover via:
    • DNS changes
    • Global load balancing
    • Automation runbooks
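The automation-runbook approach to failover can be sketched as an ordered list of steps that produce an audit trail. The step functions below are hypothetical placeholders for site-specific automation (storage promotion, DNS/GSLB repointing, health checks); none of them is a real API.

```python
# Hypothetical Primary-to-DR failover runbook skeleton. Each step stands in
# for real automation; the functions and their names are placeholders.
def promote_dr_storage(app: str) -> str:
    return f"{app}: promote DR replica to read-write"

def repoint_dns(app: str) -> str:
    return f"{app}: repoint DNS/GSLB to DR virtual IPs"

def verify_service(app: str) -> str:
    return f"{app}: run post-failover health checks"

FAILOVER_STEPS = [promote_dr_storage, repoint_dns, verify_service]

def run_failover(app: str) -> list[str]:
    """Execute the runbook in order and return an audit trail of actions."""
    return [step(app) for step in FAILOVER_STEPS]
```

Keeping the steps as data (an ordered list) makes the same runbook usable for planned DR tests and unplanned events, and the returned trail supports post-exercise review.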

Out-of-Region:

  • Asynchronous data replication from either Primary or DR DC
  • Minimal always-on services, such as identity replicas and management access
  • Defined playbooks for scaling up compute resources when invoked

Testing and Validation

The design expects:

  • Regular planned failover tests between Primary and DR
  • Periodic tabletop and technical exercises that simulate loss of a region
  • Verification that identity, management, and monitoring remain available from at least one surviving site

Zero Trust Foundations in the Core

While later modules go deeper into Zero Trust, the core multi-site design already assumes:

  1. Identity-aware flows

    • Remote access traffic is authenticated and authorized by identity platforms, not by IP alone.
    • Administrative access to edge and core devices requires strong authentication and is tightly audited.
  2. Microsegmentation

    • Inter-site links carry traffic between well-defined zones, not broad flat networks.
    • Access controls are enforced at firewalls and, where possible, at host and workload levels.
  3. Continuous verification

    • Telemetry from firewalls, routers, and identity systems feeds into SIEM and analytics platforms.
    • Policies can adapt based on risk signals, such as impossible travel, anomalous access patterns, or device posture.
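As one concrete example of a risk signal, "impossible travel" compares consecutive logins: if the great-circle distance between the two locations implies a speed no traveler could achieve, the session is flagged. A minimal sketch, assuming login events carry latitude, longitude, and a Unix timestamp; the 900 km/h threshold is an illustrative airliner-speed cutoff.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def impossible_travel(prev_login, curr_login, max_speed_kmh: float = 900.0) -> bool:
    """Flag login pairs whose implied speed exceeds a plausible maximum.

    Each login is (latitude, longitude, unix_timestamp_seconds).
    """
    (lat1, lon1, t1), (lat2, lon2, t2) = prev_login, curr_login
    hours = abs(t2 - t1) / 3600
    if hours == 0:
        return True  # two locations at the same instant is always suspicious
    return haversine_km(lat1, lon1, lat2, lon2) / hours > max_speed_kmh
```

For instance, logins from New York and London one hour apart imply a speed well above 900 km/h and would be flagged, while two logins within the same metro area would not.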

Implementation Options

The following products are examples that can implement this multi-site core. They are not required, but align well with the architecture.

Network Fabric and Core Routing

  • Cisco: Nexus or Catalyst switching, IOS XE or NX-OS routing
  • Arista: 7000 series data center switches
  • Juniper: QFX and EX series

Next-Generation Firewall and Remote Access

  • Palo Alto Networks: PA-series and VM-series firewalls, GlobalProtect VPN

HCI and Compute

  • VMware vSphere with vSAN-based HCI, or VxRail appliances
  • Nutanix AHV with integrated storage
  • Dell or HPE server platforms as compute and storage foundations

Cloud Connectivity

  • AWS Direct Connect, Azure ExpressRoute, or equivalent cloud private connectivity
  • Cloud-native transit hubs and routing constructs configured to match the on-premises pattern

Related Entries

Later modules describe how identity, security operations, application tiers, and Kubernetes platforms sit on top of this core.