Operations and admin on IBM i: job queues, output queues, and keeping the system running

Every IBM i developer eventually ends up doing some administration. A job gets stuck. The output queue fills up. A batch process that normally runs in ten minutes is still going after an hour. Someone asks why the system feels slow. These are not purely admin problems — they are problems that affect your programs, and understanding the system layer makes you far more effective at diagnosing and fixing them.

This post covers the operational concepts every IBM i developer should understand: how jobs work, how queues are organised, how to find out what is happening on a running system, and how to keep things moving when they are not.

How jobs work on IBM i

Everything that runs on IBM i runs inside a job. Your interactive session is a job. Every batch program you submit is a job. The system itself runs in jobs. Understanding jobs is the foundation of understanding IBM i operations.

Every job has a three-part name: number/user/jobname. For example, 123456/MYUSER/ORDRPG. The six-digit number is assigned by the system when the job enters the system, and it is what makes the full name unique. You use this full name to identify a specific job when managing it.
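
If you script around job names (for example, when pulling them out of logs), it can help to split and validate the qualified name up front. The following is a minimal Python sketch, not an IBM API: parse_job_name and QualifiedJob are names invented for illustration, and the checks assume a six-digit job number plus user and job names of one to ten characters.

```python
from typing import NamedTuple


class QualifiedJob(NamedTuple):
    """The three parts of a qualified job name (hypothetical helper type)."""
    number: str
    user: str
    name: str


def parse_job_name(qualified: str) -> QualifiedJob:
    """Split 'number/user/jobname' into its parts, raising ValueError if malformed."""
    parts = qualified.strip().upper().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected number/user/jobname, got {qualified!r}")
    number, user, name = parts
    # Assumption: job numbers are always exactly six digits.
    if len(number) != 6 or not number.isdigit():
        raise ValueError(f"job number must be six digits, got {number!r}")
    # Assumption: user and job names are 1 to 10 characters long.
    for label, value in (("user", user), ("job name", name)):
        if not 1 <= len(value) <= 10:
            raise ValueError(f"{label} must be 1 to 10 characters, got {value!r}")
    return QualifiedJob(number, user, name)
```

Keeping the number as a string preserves leading zeros, which matter when the name is passed back to a command.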

Jobs have a lifecycle:

On a job queue — submitted but not yet running, waiting for a subsystem to pick it up
Active — running in a subsystem
On an output queue — finished, spooled output waiting to be printed or viewed
Completed — ended, job log available for a period before it expires

Subsystems

A subsystem is a controlled environment that manages how many jobs run simultaneously and what resources they get. Think of it as a job pool with rules.

The standard subsystems on most IBM i systems:

QINTER — interactive jobs (green-screen sessions, ACS sessions)
QBATCH — batch jobs submitted with SBMJOB
QSYSWRK — system work jobs (communications, spooling)
QSERVER — server jobs (file server, database server for client connections)

Your shop may have custom subsystems — many sites create separate subsystems for different applications or priority levels, so high-priority batch does not compete with low-priority reporting.

WRKSBSD (Work with Subsystem Descriptions) lists subsystem descriptions; it needs a name, so use SBSD(*ALL) to see every description in your library list. WRKSBS shows which subsystems are currently active.

Job queues

When you submit a batch job with SBMJOB, it goes to a job queue. The job queue holds it until a subsystem picks it up and starts running it. How many jobs a subsystem runs simultaneously from a given queue is controlled by the maximum active jobs setting on the job queue entry.
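
To see why the maximum active jobs setting matters, here is a small Python simulation. It is a conceptual model only and does not talk to the system. It assumes jobs start strictly in submission order and ignores job priority; the function name simulate_job_queue and the run times are made up for illustration.

```python
import heapq


def simulate_job_queue(jobs, max_active):
    """Return {job_name: (start_minute, end_minute)} for jobs drained in order.

    jobs is a list of (job_name, run_minutes) tuples in submission order.
    max_active models the maximum active jobs value on the job queue entry.
    """
    if max_active < 1:
        raise ValueError("max_active must be at least 1")
    running = []   # min-heap of (end_minute, job_name) for jobs that are active
    schedule = {}
    clock = 0
    for name, minutes in jobs:
        if len(running) == max_active:
            # Every slot is busy: the next job waits for the earliest finisher.
            clock, _ = heapq.heappop(running)
        schedule[name] = (clock, clock + minutes)
        heapq.heappush(running, (clock + minutes, name))
    return schedule
```

For three jobs of 10, 5 and 3 minutes, a limit of 1 runs them back to back (the last one starts at minute 15), while a limit of 2 lets the third job start at minute 5, as soon as the 5-minute job finishes.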

Key commands:

/* See all jobs on a job queue */
WRKJOBQ JOBQ(MYLIB/BATCHQ)

/* See every job queue and how many jobs each one holds */
WRKJOBQ JOBQ(*ALL)

/* Hold a job queue — stops new jobs from starting */
HLDJOBQ JOBQ(QGPL/QBATCH)

/* Release a held job queue */
RLSJOBQ JOBQ(QGPL/QBATCH)

Holding a job queue is useful during maintenance windows — jobs accumulate on the queue but do not start until you release it.

Working with active jobs

WRKACTJOB is the command you reach for when something seems wrong with the system. It shows every active job, its status, CPU usage, and what it is currently doing.

WRKACTJOB

Key columns to read:

Sts — job status. RUN is running, TIMW is waiting on a timer, LCKW is waiting on a lock (this is the one to watch), MSGW is waiting for a reply to a message
CPU % — if one job is consuming 80%+ of CPU and should not be, something is wrong
Function — what the job is currently executing

A job stuck in LCKW is waiting for a lock that another job holds. Find the job holding the lock with:

WRKOBJLCK OBJ(MYLIB/CUSTMAST) OBJTYPE(*FILE)

This shows every job that has a lock on that object and what type of lock it holds.
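
Whether a request has to wait depends on how the lock state it asks for combines with the states already held. The Python sketch below encodes the compatibility rules as I understand them from the ALCOBJ command documentation; treat the table as an assumption and check it against IBM's documentation for your release. The function name blocking_jobs and the input shape are invented for illustration.

```python
# For each requested lock state, the held states it can coexist with.
# Assumption: this mirrors the documented ALCOBJ lock-state combinations.
COMPATIBLE = {
    "*SHRRD": {"*SHRRD", "*SHRUPD", "*SHRNUP", "*EXCLRD"},
    "*SHRUPD": {"*SHRRD", "*SHRUPD"},
    "*SHRNUP": {"*SHRRD", "*SHRNUP"},
    "*EXCLRD": {"*SHRRD"},
    "*EXCL": set(),
}


def blocking_jobs(requested, held):
    """Return the jobs whose held lock state conflicts with the requested state.

    held is a list of (job_name, lock_state) pairs, such as you might copy
    by hand from a WRKOBJLCK display.
    """
    allowed = COMPATIBLE[requested]
    return [job for job, state in held if state not in allowed]
```

For example, a job asking for *EXCL is blocked by any holder at all, while a job asking for *SHRUPD is blocked by a *SHRNUP holder but not by a *SHRRD holder.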

A job in MSGW is waiting for someone to reply to an inquiry message. Find it, check the message, and reply:

/* Work with a specific job */
WRKJOB JOB(123456/MYUSER/MYJOB)

/* Option 7 (Display message) from WRKACTJOB shows the message the job is waiting on. Do not use option 4 here: it ends the job */

Ending a job

Sometimes you need to end a job that is stuck, runaway, or no longer needed.

/* Controlled end — lets the job clean up */
ENDJOB JOB(123456/MYUSER/MYJOB) OPTION(*CNTRLD) DELAY(30)

/* Immediate end — use only when controlled end does not work */
ENDJOB JOB(123456/MYUSER/MYJOB) OPTION(*IMMED)

Always try *CNTRLD first. It signals the job to end and gives it time to close files, release locks, and roll back uncommitted transactions. If the job is still running when the DELAY time (in seconds) expires, the system ends it immediately anyway. *IMMED is a hard stop: it works, but the program gets no chance to run its own cleanup, and uncommitted changes are rolled back by the system rather than by the program.

Output queues and spooled files

When a program writes to a printer file, that output goes to an output queue as a spooled file. It sits there until it is printed, deleted, or moved.

/* Work with an output queue */
WRKOUTQ OUTQ(MYLIB/BATCHOUTQ)

/* Work with all spooled files for your user */
WRKSPLF

From WRKOUTQ you can view spooled files (option 5), delete them (option 4), hold them, release them, or move them to a different output queue.

Output queues filling up is a common operational issue. If a queue has thousands of spooled files that nobody is printing, they accumulate and consume storage. Regular cleanup is part of keeping a healthy system:

/* Delete all spooled files on an output queue */
CLROUTQ OUTQ(MYLIB/OLDREPORTS)

Use with care — CLROUTQ deletes everything on the queue with no confirmation.

The job log

Every job produces a job log — a record of messages generated during the job’s execution. The job log is your primary diagnostic tool when something goes wrong in a batch job.

/* View the job log for a specific job */
DSPJOBLOG JOB(123456/MYUSER/MYJOB)

After a job ends, its job log is written as a spooled file (normally named QPJOBLOG) on the output queue associated with that job, provided the job’s logging settings call for one. If a batch job fails and you want to know why, find its spooled job log and read it.

From SQL, you can query the job log directly — as shown in the DB2 post:

SELECT MESSAGE_TIMESTAMP, MESSAGE_ID, MESSAGE_TYPE, MESSAGE_TEXT
  FROM TABLE(QSYS2.JOBLOG_INFO('123456/MYUSER/MYJOB')) AS X
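  WHERE MESSAGE_TYPE = 'ESCAPE'
  ORDER BY MESSAGE_TIMESTAMP DESC

Filtering for escape messages shows only the errors, cutting through the noise. Note that the MESSAGE_TYPE column holds the type without the leading asterisk used in CL ('ESCAPE', not '*ESCAPE').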

Scheduled jobs

IBM i has a built-in job scheduler for running jobs automatically at set times. The command interface is WRKJOBSCDE (Work with Job Schedule Entries).

/* Add a scheduled job that runs every day at 11pm: FRQ(*WEEKLY) with SCDDAY(*ALL) means every day of every week */
ADDJOBSCDE JOB(NIGHTLY) +
            CMD(CALL PGM(MYLIB/NIGHTLY)) +
            FRQ(*WEEKLY) +
            SCDDAY(*ALL) +
            SCDTIME(230000) +
            JOBD(MYLIB/BATCHJD)

WRKJOBSCDE gives you a list of all scheduled entries where you can add, change, hold, release, and delete them.
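
The way a weekly frequency combines with a list of days can be modelled in a few lines of Python. This is a sketch of the semantics only, not how the scheduler is implemented: next_submission is an invented name, the day codes follow the SCDDAY values, and details such as daylight saving changes or a job due at exactly the current second are simplified.

```python
from datetime import datetime, time, timedelta

# SCDDAY-style codes indexed by Python's weekday() (Monday is 0).
DAY_CODES = ["*MON", "*TUE", "*WED", "*THU", "*FRI", "*SAT", "*SUN"]


def next_submission(now, scd_days, scd_time):
    """Return the next run time strictly after now for a weekly entry.

    scd_days is a list such as ["*ALL"] or ["*MON", "*THU"].
    scd_time is an hhmmss string such as "230000".
    """
    days = set(DAY_CODES) if "*ALL" in scd_days else set(scd_days)
    at = time(int(scd_time[0:2]), int(scd_time[2:4]), int(scd_time[4:6]))
    # Eight days covers "same weekday next week" when today's time has passed.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, at)
        if DAY_CODES[day.weekday()] in days and candidate > now:
            return candidate
    raise ValueError("no valid scheduled day given")
```

With ["*ALL"] the result is always today or tomorrow at the scheduled time, which is why a weekly entry on every day behaves as a daily job.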

For more complex scheduling (dependencies between jobs, calendars, holiday skipping), most shops use a dedicated scheduler: either a third-party product such as Robot/SCHEDULE or IBM’s own Advanced Job Scheduler. The built-in scheduler works for simple daily/weekly jobs but has limited dependency management.

System performance: WRKDSKSTS and WRKSYSSTS

Two commands for a quick read on system health:

WRKSYSSTS — system status. Shows CPU utilisation, main storage (memory) usage, database faults (pages being read from disk because they are not in memory), and active jobs. If database faults are consistently high, the system needs more memory or the workload needs tuning.

WRKDSKSTS — disk status. Shows utilisation percentage for each disk unit and ASP (Auxiliary Storage Pool). IBM recommends keeping disk utilisation below 80%. Above that, performance degrades and you risk running out of space for journals and temp files.

WRKSYSSTS
WRKDSKSTS

Neither display refreshes on its own. Press F5 to refresh the figures and F10 to restart the statistics. These are the first two commands to run when someone says the system is slow.
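
The 80% guideline is easy to check in a script once you have total and available capacity for each ASP. A minimal Python sketch follows; pct_used and over_threshold are invented names, the sample figures in the note are made up, and the capacities can be in any unit as long as both values use the same one. Note the floating-point division: with integer division, available // total is 0 whenever any space is in use, so every ASP would appear 100% full.

```python
def pct_used(total, available):
    """Percentage of capacity in use, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return round(100.0 * (total - available) / total, 1)


def over_threshold(asps, limit=80.0):
    """Return the ASP numbers whose utilisation exceeds limit percent.

    asps is a list of (asp_number, total_capacity, available_capacity).
    """
    return [
        number
        for number, total, available in asps
        if pct_used(total, available) > limit
    ]
```

Called with an ASP 1 of 1000 units with 150 free and an ASP 2 of 500 units with 400 free, over_threshold reports only ASP 1, which is 85.0% used.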

ASPs — managing storage

IBM i organises disk storage into ASPs (Auxiliary Storage Pools). The system ASP (ASP 1) contains the operating system and, by default, everything else. Additional ASPs (user ASPs, iASPs) allow separation of workloads and independent disk management.

WRKDSKSTS shows utilisation per ASP. If the system ASP is running high:

Clean up old spooled files and job logs
Clear work files and temporary objects in QTEMP and work libraries
Check for journal receivers consuming space with WRKJRN (or WRKJRNRCV to go straight to the receivers)
Move user data to a user ASP if one is available

Useful SQL views for operational monitoring

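The QSYS2 schema includes views and table functions that make operational monitoring scriptable and automatable. Active job status comes from the ACTIVE_JOB_INFO table function; JOB_INFO is also a table function, and its JOB_STATUS column only distinguishes active, job queue, and output queue states.

-- Jobs currently waiting on locks
SELECT JOB_NAME, JOB_STATUS, SUBSYSTEM, FUNCTION
  FROM TABLE(QSYS2.ACTIVE_JOB_INFO()) AS X
  WHERE JOB_STATUS = 'LCKW'
  ORDER BY JOB_NAME

-- Jobs in message wait
SELECT JOB_NAME, JOB_STATUS, SUBSYSTEM
  FROM TABLE(QSYS2.ACTIVE_JOB_INFO()) AS X
  WHERE JOB_STATUS = 'MSGW'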

-- Disk utilisation across all ASPs
-- (multiply by 100.0 before dividing so integer columns do not truncate to 0)
SELECT ASP_NUMBER, TOTAL_CAPACITY, TOTAL_CAPACITY_AVAILABLE,
       DEC(100 - (TOTAL_CAPACITY_AVAILABLE * 100.0 / TOTAL_CAPACITY), 5, 1)
         AS PCT_USED
  FROM QSYS2.ASP_INFO
  ORDER BY ASP_NUMBER

-- Spooled files older than 30 days
SELECT JOB_NAME, SPOOLED_FILE_NAME, CREATE_TIMESTAMP, TOTAL_PAGES
  FROM QSYS2.OUTPUT_QUEUE_ENTRIES
  WHERE CREATE_TIMESTAMP < CURRENT_TIMESTAMP - 30 DAYS
  ORDER BY CREATE_TIMESTAMP

You can run these interactively from ACS Run SQL Scripts, schedule them in a batch job with RUNSQL or RUNSQLSTM, build them into monitoring programs, or use them as the basis for automated cleanup routines.
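
As one shape a cleanup routine could take, the Python sketch below turns rows from the spooled file query into DLTSPLF command strings for review. It only builds text and deletes nothing. The helper names are invented, the rows are assumed to arrive as dictionaries keyed by column name, and it assumes the query also selects the view's FILE_NUMBER column, which the query above does not.

```python
from datetime import datetime, timedelta


def stale_spooled_files(entries, now, max_age_days=30):
    """Return the entries created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [e for e in entries if e["CREATE_TIMESTAMP"] < cutoff]


def dltsplf_commands(entries):
    """Build one DLTSPLF command string per entry, for a human to review."""
    return [
        "DLTSPLF FILE({file}) JOB({job}) SPLNBR({nbr})".format(
            file=e["SPOOLED_FILE_NAME"],
            job=e["JOB_NAME"],        # already in number/user/jobname form
            nbr=e["FILE_NUMBER"],     # assumption: selected alongside the others
        )
        for e in entries
    ]
```

Writing the commands to a file and reading them before running anything keeps a person in the loop, which is a safer starting point than clearing a whole output queue.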

The commands worth bookmarking

If you are new to IBM i operations, these are the commands to have ready:

WRKACTJOB — what is running right now
WRKSYSSTS — CPU and memory at a glance
WRKDSKSTS — disk space
WRKJOBQ — what is waiting to run
WRKOUTQ — what output is waiting
WRKOBJLCK — who holds a lock on an object
DSPJOBLOG — why did that job fail
WRKJOBSCDE — what is scheduled to run

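Running any of these changes nothing by itself: each one opens a view of system state. The Work-with screens do offer options that act on what they list (end, hold, delete), so check an option number before you press Enter. Get comfortable running them freely; the system will not break from looking at it.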

Next post: Performance tuning on IBM i — finding bottlenecks, reading query plans, and making slow programs fast.