IBM i High Availability and Disaster Recovery in 2026: Geographic Mirroring, Switchable ASPs, and HA Tools Compared

The previous post covered Python on IBM i — running Python in PASE, accessing DB2 for i with ibm_db and pyodbc, calling RPG programs with itoolkit, and building cloud and AI integration pipelines. This post covers High Availability and Disaster Recovery on IBM i: what the platform offers natively, how third-party tools extend those capabilities, and how to design an architecture that meets your Recovery Time Objective and Recovery Point Objective in 2026.

HA vs DR — Understanding the Distinction

High Availability (HA) and Disaster Recovery (DR) are frequently used interchangeably, but they address different failure scenarios and require different architectural responses.

High Availability is concerned with eliminating or minimising planned and unplanned outages within a site or between closely coupled sites. The goal is continuous operation — measured by uptime percentage — and the primary metric is Recovery Time Objective (RTO): how quickly can the system be back in service after a failure?
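
To make "uptime percentage" concrete, the short sketch below converts an availability target into a yearly downtime budget. It is plain arithmetic rather than anything IBM i specific, and the function name is an illustrative choice.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime budget implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime -> {minutes:.1f} minutes of downtime per year")
```

A 99.99% target leaves under an hour per year for planned and unplanned outages combined, which is why role swaps for maintenance matter as much as failover.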

Disaster Recovery is concerned with surviving a catastrophic event that takes an entire site offline — a data centre fire, flood, power grid failure, or ransomware event. The primary metric here is Recovery Point Objective (RPO): how much data are you willing to lose? An RPO of zero means synchronous replication; an RPO of 15 minutes means asynchronous replication with a tolerable lag.

  • Planned outage: OS upgrade, hardware maintenance, storage expansion — the system is taken offline intentionally. HA tools allow a role swap to a standby system so users remain productive.
  • Unplanned outage: Hardware failure, network partition, OS hang — the primary system disappears without warning. HA tools detect the failure and perform a failover, bringing the secondary online automatically or with manual intervention.
  • Disaster: An entire site becomes unavailable. DR requires a geographically separate system with replicated data and a tested recovery runbook.

On IBM i, both HA and DR rely on one or more of three underlying mechanisms: Geographic Mirroring (storage-layer), Switchable Independent ASPs (disk-pool mobility), and journal-based logical replication (object-level). Understanding which layer each mechanism operates on is essential before selecting a product or designing an architecture.

IBM i HA Options Overview

IBM i offers several native HA mechanisms, each with different trade-offs in terms of granularity, performance overhead, and licensing cost.

  • Geographic Mirroring: A PowerHA function (requires IBM i 7.3 or later) that mirrors writes from a production IASP to a mirror copy on a remote system over TCP/IP. Changes are replicated synchronously or asynchronously by IBM i storage management, beneath the database and object layers, so no application awareness is needed.
  • Switchable Independent ASPs (IASPs): An Independent Auxiliary Storage Pool can be detached from one IBM i partition and attached to another — on the same system or a remote system via SAN extension. The IASP carries all objects in its library and directory structure with it. No data copying is required during the switch; the target system simply varies on the ASP.
  • Journal-based logical replication: IBM i journals capture before and after images of every change to a journalled database file or object. Logical replication tools read the journal receivers on the source system and replay those transactions on a target system, keeping a logical copy in sync.
  • HA management tools: MIMIX (Precisely) and iTera (Rocket Software) provide a management layer above journalling, automating replication monitoring, switchover orchestration, and object-level tracking beyond what native CL commands provide alone. IBM PowerHA SystemMirror for i provides the equivalent management layer for geographic mirroring and switchable IASPs.

Independent Auxiliary Storage Pools (IASPs)

An Auxiliary Storage Pool (ASP) is a logical grouping of physical disk units. The system ASP (ASP 1) holds the operating system, licensed programs, and by default all user data. Basic user ASPs (ASPs 2–32) are additional pools on the same system that are always online together with the system ASP. Independent ASPs (IASPs, ASPs 33–255) are different: they are self-contained disk pools that can be varied on and off independently of the system ASP, and they can be moved between partitions.

An IASP contains its own set of libraries and directories. When it is varied on, those libraries become available to any job that adds the IASP's ASP group to its namespace (for example with SETASPGRP), and its directories appear in the IFS under a root directory named after the IASP. When it is varied off, they are inaccessible until a system varies the IASP on again.
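
The ASP numbering ranges above can be captured in a few lines. This is only a lookup sketch of the ranges stated in the text; the function name and return strings are illustrative.

```python
def classify_asp(asp_number: int) -> str:
    """Classify an ASP number using the ranges described above."""
    if asp_number == 1:
        return "system ASP"
    if 2 <= asp_number <= 32:
        return "basic user ASP"
    if 33 <= asp_number <= 255:
        return "independent ASP"
    raise ValueError(f"ASP number out of range: {asp_number}")


for number in (1, 2, 33, 144):
    print(number, classify_asp(number))
```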

Use WRKCFGSTS to check the current state of an IASP:

WRKCFGSTS CFGTYPE(*DEV) CFGD(PRODASP)

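The status shows VARIED OFF when the IASP is offline and AVAILABLE once a vary-on has completed; an intermediate status such as ACTIVE indicates that the vary-on is still in progress. To vary an IASP off on the primary system before switching it to a secondary: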

/* Vary off the IASP on the production system */
VRYCFG CFGOBJ(PRODASP) CFGTYPE(*DEV) STATUS(*OFF)

On the target (secondary) system, vary it on:

/* Vary on the IASP on the secondary system */
VRYCFG CFGOBJ(PRODASP) CFGTYPE(*DEV) STATUS(*ON)

Before varying off, ensure all jobs using objects in the IASP have ended. Use WRKASPJOB ASPGRP(PRODASP) to list the jobs attached to the ASP group, or WRKOBJLCK to check for locks on a specific library:

WRKOBJLCK OBJ(MYLIB) OBJTYPE(*LIB) ASPDEV(PRODASP)

For a remote switchable IASP (the most common HA architecture), the IASP is connected to both the primary and secondary systems via a SAN fabric. Geographic Mirroring or a third-party replication tool keeps the secondary copy current. During a switchover:

  1. End all jobs on the primary that have objects open in the IASP.
  2. Vary off the IASP on the primary (VRYCFG STATUS(*OFF)).
  3. If using geographic mirroring, reverse the mirror roles so that the secondary copy becomes the production copy (in a PowerHA cluster this is part of the CRG switchover).
  4. Vary on the IASP on the secondary system.
  5. Redirect TCP/IP or change DNS to point users at the secondary.
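
The ordering of these steps matters: a failed step should stop the procedure rather than let later steps run against a half-switched IASP. A minimal sketch of that control flow follows. The lambdas are hypothetical stand-ins for whatever actually issues each command, and none of the names come from an IBM or vendor API.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_switchover(steps: List[Step]) -> List[str]:
    """Run steps in order, stop at the first failure, return completed step names."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            print(f"STOPPED: step failed: {name}")
            break
        completed.append(name)
        print(f"ok: {name}")
    return completed


# Hypothetical stand-ins: each callable would wrap the real command for that step.
steps: List[Step] = [
    ("end jobs using the IASP on the primary", lambda: True),
    ("vary off the IASP on the primary", lambda: True),
    ("make the secondary copy the production copy", lambda: True),
    ("vary on the IASP on the secondary", lambda: True),
    ("redirect clients to the secondary", lambda: True),
]
done = run_switchover(steps)
```

Returning the list of completed steps gives the operator an unambiguous record of where a failed switchover stopped.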

The DSPASPSTS command shows the state of an IASP and the progress of its vary-on or vary-off processing:

DSPASPSTS ASPDEV(PRODASP)

To list all IASPs configured on the system with their current configuration, use:

WRKDEVD DEVD(*ASP)

Geographic Mirroring

Geographic Mirroring is a storage-layer replication technology provided by PowerHA on IBM Power Systems. IBM i storage management mirrors writes to an IASP across a TCP/IP connection, producing an exact copy of the disk pool on a remote system. Because it operates beneath the database and object layers, it replicates all data in the IASP (database files, IFS objects, user spaces, data queues, and everything else) without any application-level awareness.

Architecture: The source system writes to its local disks and simultaneously (synchronous mode) or shortly after (asynchronous mode) sends those writes to the remote disk pool. The remote pool is an exact mirror; the target system cannot use it while mirroring is active.

Geographic Mirroring requires:

  • IBM i 7.3 or later on both systems
  • The data to be mirrored must reside in an Independent ASP; the system ASP cannot be geographically mirrored
  • A high-bandwidth, low-latency network connection between sites (fibre or DWDM for synchronous; WAN tolerable for asynchronous)
  • IBM PowerHA SystemMirror for i (5770-HAS) or equivalent entitlement

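PowerHA manages geographic mirroring as an ASP session that pairs two ASP copy descriptions, one for the production copy and one for the mirror copy. The session and copy description names used here (GEOSSN, PRODCPY, DRCPY) are placeholders. To start geographic mirroring on a configured IASP:

STRASPSSN SSN(GEOSSN) TYPE(*GEOMIR) ASPCPY((PRODCPY DRCPY))

To display the current mirroring status:

DSPASPSSN SSN(GEOSSN)

The output shows the role and state of each copy, whether mirroring is active or suspended, and the synchronisation progress if the mirror copy is catching up. To detach the mirror copy so that it can be varied on separately, for example for a backup or a DR test:

CHGASPSSN SSN(GEOSSN) OPTION(*DETACH)

A detached mirror copy is a standalone IASP that the target system can vary on. While it is detached, the production copy is unprotected, and the mirror must be reattached with OPTION(*REATTACH) and resynchronised afterwards. For a planned switchover, do not detach: switch the cluster resource group instead, so that the two copies reverse roles and mirroring continues in the opposite direction.

To check and manage the ASP copy descriptions:

WRKASPCPYD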

In synchronous mode, every write on the source must be acknowledged by the target before completing — this guarantees zero data loss (RPO=0) but adds latency. For sites more than roughly 50–100 km apart, the round-trip latency of synchronous mirroring becomes a performance concern; asynchronous mode is used instead, accepting a small RPO window (typically seconds to low minutes under normal load).
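
The distance limit follows from simple arithmetic: light travels roughly 200 km per millisecond in optical fibre, so every synchronous write pays at least that propagation delay twice. The sketch below estimates the effect. The 0.5 ms local write time is an assumed figure for illustration, and real links add switching and protocol overhead on top.

```python
FIBRE_KM_PER_MS = 200.0  # approximate speed of light in optical fibre


def round_trip_ms(distance_km: float) -> float:
    """Minimum propagation round trip for one synchronous write."""
    return 2 * distance_km / FIBRE_KM_PER_MS


def max_serial_writes_per_second(distance_km: float, local_write_ms: float = 0.5) -> float:
    """Upper bound on writes per second for one stream that waits on each write.

    local_write_ms is an assumed local write time, not a measured value.
    """
    return 1000.0 / (local_write_ms + round_trip_ms(distance_km))


for km in (10, 50, 100, 500):
    rtt = round_trip_ms(km)
    rate = max_serial_writes_per_second(km)
    print(f"{km:>4} km: round trip {rtt:.2f} ms, at most {rate:.0f} serial writes/s")
```

Under these assumptions a 100 km link adds a full millisecond to every write, tripling the assumed local write time, which is why asynchronous mode takes over at longer distances.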

Journal-Based Logical Replication

Journals are IBM i’s native mechanism for capturing change data. Every database physical file that participates in replication must be journalled. When a record is inserted, updated, or deleted, the change is written to a journal receiver as an after-image and, if the file is journalled with IMAGES(*BOTH), as a before-image too. Third-party HA tools read those journal receivers and replay the changes on the target system.

To start journalling a physical file:

/* Create a journal receiver */
CRTJRNRCV JRNRCV(MYLIB/MYJRNRCV) THRESHOLD(100000)

/* Create the journal */
CRTJRN JRN(MYLIB/MYJRN) JRNRCV(MYLIB/MYJRNRCV) MNGRCV(*SYSTEM)

/* Start journalling a physical file */
STRJRNPF FILE(MYLIB/MYFILE) JRN(MYLIB/MYJRN) IMAGES(*BOTH) OMTJRNE(*OPNCLO)

The IMAGES(*BOTH) parameter instructs IBM i to capture both before-images and after-images — required for logical replication tools that need to detect and replicate updates and deletes accurately. The OMTJRNE(*OPNCLO) parameter omits open and close journal entries, which reduces journal volume without losing data change events.

To change the journalling attributes of an existing file, for example to add before-images to a file that was previously journalled with after-images only:

/* Optional: attach a new receiver so the change starts on a clean receiver boundary */
CHGJRN JRN(MYLIB/MYJRN) JRNRCV(*GEN)

/* Change the file to capture both images */
CHGJRNOBJ OBJ((MYLIB/MYFILE *FILE)) ATR(*IMAGES) IMAGES(*BOTH)

Journal receivers accumulate over time. The MNGRCV(*SYSTEM) parameter on CRTJRN tells IBM i to manage receiver chaining and detachment automatically. Detached receivers that have been delivered to the target system can be deleted:

DLTJRNRCV JRNRCV(MYLIB/MYJRNRCV0001) DLTOPT(*IGNINQMSG)
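
Deciding which receivers are safe to delete comes down to comparing each detached receiver against the position the replication tool has reached. The sketch below models that check with invented sample data. The Receiver class and the applied_through value are illustrative assumptions, not fields exposed by any particular tool, and it takes the conservative view that a receiver is safe only once its entries have been applied on the target.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Receiver:
    name: str
    attached: bool      # True for the receiver currently attached to the journal
    last_sequence: int  # highest journal sequence number held in this receiver


def deletable_receivers(chain: List[Receiver], applied_through: int) -> List[str]:
    """Names of detached receivers whose entries have all been applied on the target."""
    return [
        r.name
        for r in chain
        if not r.attached and r.last_sequence <= applied_through
    ]


# Invented sample chain for illustration only.
chain = [
    Receiver("MYJRNRCV0001", attached=False, last_sequence=1000),
    Receiver("MYJRNRCV0002", attached=False, last_sequence=2500),
    Receiver("MYJRNRCV0003", attached=True, last_sequence=2700),
]
print(deletable_receivers(chain, applied_through=1800))
```

The attached receiver is never a deletion candidate, however far the target has caught up.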

To display the current journal receiver chain and identify the active receiver:

WRKJRNA JRN(MYLIB/MYJRN)

For IFS objects and data queues, the commands differ slightly:

/* Journal an IFS stream file */
STRJRN OBJ('/PRODASP/MYDIR/MYFILE.DAT') JRN('/QSYS.LIB/MYLIB.LIB/MYJRN.JRN')

/* Journal a data queue */
STRJRNOBJ OBJ(MYLIB/MYDTAQ) OBJTYPE(*DTAQ) JRN(MYLIB/MYJRN)

Logical replication tools consume the journal stream via APIs or direct receiver access, apply transactions on the target, and maintain their own position pointer within the receiver chain. This gives them fine-grained control — they can replicate selected files, apply filters, handle conflicts, and provide detailed replication lag reporting.
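
The position pointer is what makes replay restartable: entries at or below the saved position are skipped, so applying the same batch twice is harmless. A toy model of that idea follows, using a dictionary as the target file. The entry layout and operation names are simplifications for illustration and do not match real journal entry formats.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, operation, record key, after-image); a simplified layout
Entry = Tuple[int, str, str, Optional[str]]


def apply_entries(target: Dict[str, Optional[str]], entries: List[Entry], position: int) -> int:
    """Apply entries with a sequence number above position; return the new position."""
    for seq, op, key, after in sorted(entries, key=lambda e: e[0]):
        if seq <= position:
            continue  # already applied, so replaying a batch is safe
        if op in ("insert", "update"):
            target[key] = after
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        position = seq
    return position


entries: List[Entry] = [
    (1, "insert", "A", "v1"),
    (2, "insert", "B", "v1"),
    (3, "update", "A", "v2"),
    (4, "delete", "B", None),
]
target: Dict[str, Optional[str]] = {}
position = apply_entries(target, entries, position=0)
print(target, position)
```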

Third-Party HA Tools — MIMIX, iTera, and PowerHA SystemMirror

Three products dominate the IBM i HA market in 2026. MIMIX and iTera use journalling as their replication engine, while PowerHA works at the IASP level; all three wrap replication with management, monitoring, and automation capabilities.

MIMIX (Precisely) is the most widely deployed IBM i HA product. It provides object-level replication using journals, a graphical management console, automated switchover with pre/post-processing scripts, and detailed activity and lag reporting. MIMIX supports both database-only replication and full system replication (including IFS, data areas, user profiles, and non-journalled objects). It can replicate between IBM i versions, making it useful for upgrade projects where the standby is one release ahead.

iTera (Rocket Software) takes a similar journal-based approach but is known for its tight integration with the IBM i command line and its replication auditing features. iTera’s Replication Audit functionality performs periodic object comparisons between source and target to detect any drift caused by non-journalled object changes or replication gaps. It also supports selective replication at the library, file, or member level.
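
The idea behind an audit of this kind can be illustrated generically: fingerprint each object on both sides and report anything missing, extra, or different. The sketch below is not how iTera or any other product implements its audits; the object names and contents are invented sample data.

```python
import hashlib
from typing import Dict, List


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest used to compare object contents."""
    return hashlib.sha256(content).hexdigest()


def find_drift(source: Dict[str, bytes], target: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Compare two object inventories and report where the target has drifted."""
    common = set(source) & set(target)
    return {
        "missing_on_target": sorted(set(source) - set(target)),
        "extra_on_target": sorted(set(target) - set(source)),
        "different": sorted(
            name for name in common
            if fingerprint(source[name]) != fingerprint(target[name])
        ),
    }


# Invented sample inventories for illustration only.
source = {"MYLIB/CUSTOMER": b"v2", "MYLIB/ORDERS": b"v1", "MYLIB/PRICES": b"v1"}
target = {"MYLIB/CUSTOMER": b"v1", "MYLIB/ORDERS": b"v1", "MYLIB/OLDFILE": b"v1"}
report = find_drift(source, target)
print(report)
```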

IBM PowerHA SystemMirror for i (5770-HAS) is IBM’s own offering. It provides cluster resource group (CRG) management, geographic mirroring integration, and switchable IASP orchestration. PowerHA is the choice when you want IBM-supported HA tightly integrated with the platform — particularly for environments that use geographic mirroring as the underlying replication mechanism. It does not require third-party journal readers; instead, it orchestrates the IASP switch at the CRG level.

Feature                   | MIMIX (Precisely)     | iTera (Rocket)        | PowerHA SystemMirror
Replication method        | Journal-based logical | Journal-based logical | Geographic mirroring / switchable IASP
GUI management            | Yes (web console)     | Yes (5250 and web)    | 5250 + IBM Navigator
Cross-version replication | Yes                   | Yes                   | Limited
Geographic mirroring      | No (logical only)     | No (logical only)     | Yes (native integration)
Audit / drift detection   | Yes                   | Yes (strong feature)  | Limited
IBM support alignment     | ISV                   | ISV                   | IBM

Designing for RTO and RPO

Choosing the right replication architecture begins with agreeing on RTO and RPO targets with the business, then working backwards to the technology.

Synchronous replication (RPO = 0): Every write on the source is confirmed on the target before the application receives an acknowledgement. No acknowledged write can be lost, but every write carries the round-trip latency to the target site. This is viable for sites within roughly 50–100 km connected by dark fibre or DWDM. Beyond that distance, the latency impact on OLTP throughput usually becomes unacceptable.

Asynchronous replication (RPO > 0): Writes are acknowledged immediately on the source, then transmitted to the target asynchronously. Replication lag — typically measured in seconds under normal conditions — defines the RPO. This model is appropriate for inter-city or inter-country DR where network latency rules out synchronous operation.
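
Because lag defines the effective RPO, it is worth checking continuously rather than assuming. A minimal sketch of that check follows, assuming the timestamp of the latest source change and of the latest change applied on the target are both available; how they are obtained is tool-specific, and the timestamps below are invented.

```python
from datetime import datetime, timedelta


def replication_lag(last_source_change: datetime, last_applied_change: datetime) -> timedelta:
    """Return how far the target is behind the source, never negative."""
    return max(last_source_change - last_applied_change, timedelta(0))


def within_rpo(lag: timedelta, rpo: timedelta) -> bool:
    """True when the data at risk is within the agreed RPO."""
    return lag <= rpo


# Invented timestamps for illustration only.
source_ts = datetime(2026, 1, 15, 10, 0, 42)
applied_ts = datetime(2026, 1, 15, 10, 0, 30)
lag = replication_lag(source_ts, applied_ts)
print(lag.total_seconds(), within_rpo(lag, timedelta(minutes=15)))
```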

RTO is driven by the switchover automation. A fully automated failover triggered by a heartbeat timeout can achieve RTOs of two to five minutes. A manually executed switchover with multiple steps — ending jobs, varying off the IASP, promoting the mirror, varying it on at the DR site, redirecting DNS — typically takes 15–60 minutes depending on the runbook.
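
Heartbeat-based detection is a trade-off between speed and false alarms: the peer is declared failed only after several consecutive heartbeats are missed. The sketch below shows that logic in isolation. The class name, the 5-second interval, and the limit of three missed beats are illustrative assumptions, not PowerHA or vendor defaults.

```python
from typing import Optional


class HeartbeatMonitor:
    """Declare a peer failed after a number of consecutive missed heartbeats."""

    def __init__(self, interval_s: float, missed_limit: int) -> None:
        self.timeout_s = interval_s * missed_limit
        self.last_seen: Optional[float] = None

    def beat(self, now: float) -> None:
        """Record a heartbeat received at time now (seconds)."""
        self.last_seen = now

    def is_failed(self, now: float) -> bool:
        """True once the silence since the last heartbeat exceeds the timeout."""
        if self.last_seen is None:
            return False  # never heard from the peer: unknown, not failed
        return now - self.last_seen > self.timeout_s


monitor = HeartbeatMonitor(interval_s=5, missed_limit=3)
monitor.beat(now=100.0)
print(monitor.is_failed(now=110.0))
print(monitor.is_failed(now=116.0))
```

In real use the timestamps would come from a monotonic clock such as time.monotonic(), so that wall-clock adjustments cannot trigger a false failover.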

Switchover testing is non-negotiable. An untested HA architecture is not an HA architecture — it is a hope. Best practice is to test a full role swap quarterly in a non-production window:

  • Practice ending applications cleanly on the primary.
  • Vary off the IASP and confirm it reaches VARIED OFF status.
  • On the secondary, vary on the IASP and confirm all libraries are accessible.
  • Run smoke tests — verify that batch jobs start, interactive sessions connect, and key business transactions execute correctly.
  • Switch back and measure elapsed time against your RTO target.
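
Recording the duration of each step during a test makes the comparison against the RTO target objective and shows where the time goes. A small sketch follows; the step names and minute values are invented sample data, not measurements.

```python
from typing import Dict


def evaluate_drill(step_minutes: Dict[str, float], rto_minutes: float) -> Dict[str, object]:
    """Summarise a switchover test against an RTO target."""
    if not step_minutes:
        raise ValueError("no steps recorded")
    total = sum(step_minutes.values())
    slowest = max(step_minutes, key=lambda name: step_minutes[name])
    return {
        "total_minutes": total,
        "met_rto": total <= rto_minutes,
        "slowest_step": slowest,
    }


# Invented timings for illustration only.
drill = {
    "end applications": 6.0,
    "vary off IASP": 4.0,
    "vary on IASP at secondary": 9.0,
    "smoke tests": 12.0,
    "redirect users": 3.0,
}
result = evaluate_drill(drill, rto_minutes=30)
print(result)
```

With these sample numbers the drill misses a 30-minute RTO, and the slowest step points at where automation would help most.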

Role swap automation can be scripted in CL. A minimal role swap CL programme skeleton:

PGM
  DCL VAR(&ASPNAME) TYPE(*CHAR) LEN(10) VALUE('PRODASP')
  DCL VAR(&ERRMSG)  TYPE(*CHAR) LEN(100)

  /* Step 1: end subsystems that use the IASP.                   */
  /* Run this from a job outside these subsystems. CPF1054 means */
  /* the subsystem was not active, which is acceptable here.     */
  ENDSBS SBS(QBATCH)   OPTION(*CNTRLD) DELAY(120)
  MONMSG MSGID(CPF1054)
  ENDSBS SBS(QINTER)   OPTION(*CNTRLD) DELAY(120)
  MONMSG MSGID(CPF1054)
  ENDSBS SBS(APPSBSYS) OPTION(*CNTRLD) DELAY(60)
  MONMSG MSGID(CPF1054)

  /* ENDSBS returns before the subsystems have ended, so wait    */
  /* for the longest controlled-end delay before varying off.    */
  DLYJOB DLY(130)

  /* Step 2: vary off the IASP */
  VRYCFG CFGOBJ(&ASPNAME) CFGTYPE(*DEV) STATUS(*OFF)
  MONMSG MSGID(CPF0000) EXEC(DO)
    RCVMSG MSGTYPE(*LAST) MSG(&ERRMSG)
    SNDUSRMSG MSG('VRYCFG FAILED: ' *CAT &ERRMSG) MSGTYPE(*INFO) TOUSR(*SYSOPR)
    GOTO CMDLBL(DONE)
  ENDDO

  /* Step 3: signal secondary system to vary on */
  /* (implementation depends on inter-system comm mechanism) */
  CHGDTAARA DTAARA(DRLIB/SWITCHCTL (1 1)) VALUE('1')

  SNDUSRMSG MSG('Role swap initiated. IASP varied off. Secondary notified.') MSGTYPE(*INFO) TOUSR(*SYSOPR)

DONE:
ENDPGM

Practical CL — Working with PowerHA SystemMirror for i

When using IBM PowerHA SystemMirror for i, HA resources are managed through Cluster Resource Groups (CRGs). A CRG defines the resources to protect (the IASP, IP addresses, application start/end scripts) and the node roles (primary, backup).
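
As a mental model, a CRG can be pictured as a small record holding the protected resources plus an ordered list of nodes, where a switchover promotes the first backup. The sketch below is conceptual only. The class and field names are invented, PRODSYS and BACKUPSYS are placeholder node names, and real CRGs carry far more state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterResourceGroup:
    name: str
    iasp: str
    takeover_ip: str
    recovery_domain: List[str]  # ordered: current primary first, then backups

    @property
    def primary(self) -> str:
        return self.recovery_domain[0]

    def switch(self) -> str:
        """Promote the first backup; the old primary becomes the last backup."""
        if len(self.recovery_domain) < 2:
            raise RuntimeError("no backup node available")
        old_primary = self.recovery_domain.pop(0)
        self.recovery_domain.append(old_primary)
        return self.primary


crg = ClusterResourceGroup(
    name="PRODCRG",
    iasp="PRODASP",
    takeover_ip="10.0.1.100",
    recovery_domain=["PRODSYS", "BACKUPSYS"],
)
new_primary = crg.switch()
print(new_primary, crg.recovery_domain)
```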

To work with the cluster, its nodes, and its resource groups from a single menu:

WRKCLU

To list the cluster resource groups with their current state (PRODCLU is a placeholder cluster name):

DSPCRGINF CLUSTER(PRODCLU) CRG(*LIST)

To start a cluster resource group (bring it into managed state):

STRCRG CLUSTER(PRODCLU) CRG(PRODCRG)

To initiate a controlled switchover (role swap) of a CRG from the primary to the first backup node in its recovery domain:

CHGCRGPRI CLUSTER(PRODCLU) CRG(PRODCRG)

To display the full details of a specific cluster resource group, including its recovery domain and exit programme:

DSPCRGINF CLUSTER(PRODCLU) CRG(PRODCRG)

To check cluster node status across all nodes in the cluster:

DSPCLUINF CLUSTER(PRODCLU)

When a node shows FAILED status, investigate the cluster messages in the history log:

DSPLOG LOG(QHST) PERIOD((*AVAIL *BEGIN) (*AVAIL *END)) MSGID(CPIBB00 CPFBB00)

For manual failover outside of PowerHA (for example, when using a third-party HA tool’s console), the steps are:

/* On secondary system: check IASP readiness */
DSPASPSTS ASPDEV(PRODASP)

/* Confirm replication lag is within acceptable RPO  */
/* (use the status display of your HA tool)          */

/* Vary on the IASP */
VRYCFG CFGOBJ(PRODASP) CFGTYPE(*DEV) STATUS(*ON)

/* Start application subsystems */
STRSBS SBSD(APPLIB/APPSBSYS)

/* Redirect client IP if not using IP takeover: add the interface, then start it */
ADDTCPIFC INTNETADR('10.0.1.100') LIND(ETHLINE) SUBNETMASK('255.255.255.0')
STRTCPIFC INTNETADR('10.0.1.100')

For DR scenarios where the primary site is completely unavailable, most organisations rely on a pre-written runbook that maps each step to the responsible team member, with verified contact information and tested timings from the last DR test. The technology can execute a switchover in minutes; the organisational readiness to invoke it is usually what determines the actual RTO.

Summary

IBM i provides a mature, multi-layered HA and DR capability in 2026. Geographic Mirroring delivers storage-layer replication with RPO=0 for synchronous configurations. Switchable IASPs provide the mechanism for moving a disk pool between systems with minimal RTO once replication has kept the secondary current. Journal-based logical replication, used by MIMIX and iTera, provides fine-grained, application-aware replication that can cross OS versions and selectively replicate subsets of data, while PowerHA SystemMirror for i orchestrates geographic mirroring and IASP switching through cluster resource groups. Designing the right architecture requires matching your RTO and RPO commitments to the appropriate replication mode, synchronous or asynchronous, and then investing in switchover automation and regular testing to ensure the architecture performs as designed when it matters most.

Next post: IBM i PTF Management and OS Upgrades — understanding PTF types, ordering and applying cumulative and group PTFs, managing SF99xxx groups, and planning an IBM i OS upgrade with IMGCLG virtual optical media.
