Back to Blog

Codex Record & Replay: Automate GUI Workflows by Demo

Industry Insights2627
Codex Record & Replay: Automate GUI Workflows by Demo

Abstract

On June 18, 2026, OpenAI introduced Record & Replay in Codex app version 26.616. The feature allows users to demonstrate a workflow on macOS and convert it into a reusable Codex skill. It is designed for repetitive tasks, preference-dependent processes, and workflows that are easier to demonstrate than explain through a long prompt.

During recording, Codex observes the user’s actions and the relevant window content. After recording stops, it drafts a skill that describes when the workflow should be used, what inputs it requires, which steps it should follow, and how the result should be verified. The skill can later be reused through Computer Use, browser actions, installed plugins, or a combination of these tools.

Record & Replay does not represent model training in the traditional machine-learning sense. It is better understood as demonstration-driven workflow authoring. The user performs a task once, Codex extracts a reusable procedure, and the resulting skill provides structured context for future executions.

This article explains the feature’s operating process, execution surfaces, security boundaries, practical use cases, deployment restrictions, and broader implications for human-computer interaction.

1. What Record & Replay Actually Does

Traditional desktop automation usually starts with one of two approaches. The first relies on scripts or APIs. The second uses macros that replay fixed mouse positions and keyboard inputs.

Both approaches have limitations.

API-based automation is reliable, but it requires development work and access to supported interfaces. Macro tools are easier to configure, but they are often fragile. A small layout change, an unexpected pop-up, or a longer loading time may break the sequence.

Record & Replay introduces a third approach. Instead of describing every step or programming each action, the user demonstrates the workflow directly. Codex then converts that demonstration into a reusable skill.

OpenAI lists several representative use cases:

  1. Filing an expense report;
  2. Booking a parking space;
  3. Creating an issue with the correct configuration;
  4. Publishing a video;
  5. Downloading a recurring report.

These examples share several characteristics. They are repetitive, have clear completion criteria, and often contain preferences that are difficult to express in one prompt.

The generated skill is more than a raw recording of cursor coordinates. It contains reusable instructions, expected inputs, operational steps, and validation criteria. During replay, Codex uses the tools available in the current environment to complete the task.

This design provides more flexibility than a basic macro. However, it should not be interpreted as a guarantee that every recording will survive major interface changes. If a website or desktop application substantially redesigns its workflow, the skill may still require revision.

2. How Record & Replay Works

The complete workflow can be organized into seven practical stages.

2.1 Confirm the Prerequisites

Record & Replay is currently a Codex app feature for macOS. Computer Use must also be installed, available, and enabled.

Before starting, select a workflow that you already understand. The process should have stable steps and a clear definition of success. Workflows with frequent exceptions or unclear decision points are less suitable.

2.2 Start a New Skill Recording

Open the Plugins section in the Codex app. Select the + menu and choose Record a skill.

Codex will prepare an initial prompt for the recording session.

2.3 Explain the Goal and Variable Inputs

Review the suggested prompt before recording. Add context that may help Codex interpret the workflow.

For example, explain:

This context helps Codex distinguish fixed procedures from variable inputs.

2.4 Approve Recording Permissions

When Codex requests permission to observe the workflow, approve it only when you are ready to begin.

On macOS, Computer Use may require Screen Recording and Accessibility permissions. Screen Recording allows Codex to see the target application. Accessibility permission allows it to click, type, and navigate through the interface.

2.5 Demonstrate the Complete Workflow

Perform the task from beginning to end.

Keep the demonstration focused. Avoid switching to unrelated applications, checking messages, or carrying out cleanup work after the task has already finished.

The demonstration should be short enough to interpret clearly, but complete enough to include the full success path.

2.6 Stop Recording and Review the Draft Skill

Stop the recording through the menu bar, the recording overlay, or a direct instruction to Codex.

Codex then reviews the recorded actions and drafts a skill. The draft describes:

Users can ask Codex to refine the skill. This is the right stage to remove unnecessary actions, document hidden preferences, and clarify decision points.

2.7 Replay the Skill with New Inputs

Start a new Codex thread and ask it to use the generated skill.

Provide the values that have changed. These may include:

Codex uses the skill as reusable task context. It then selects from Computer Use, browser actions, and installed plugins according to the current environment.

3. Recording Practices That Improve Reliability

The quality of the replay depends heavily on the quality of the demonstration. A noisy or incomplete recording often produces an ambiguous skill.

OpenAI recommends keeping demonstrations short and complete. Users should explain the intended goal and identify variable inputs before recording. They should also stop the recording as soon as the workflow is finished.

Several additional practices improve results.

3.1 Use a Clean Starting State

Open the required application before recording. Close unrelated windows and remove unnecessary pop-ups.

A stable starting state makes the workflow easier to interpret.

3.2 Demonstrate the Preferred Path

Do not intentionally make mistakes during the first recording unless error recovery is part of the workflow.

The initial demonstration should show the normal success path. Exception handling can be added later through skill refinement or separate recordings.

3.3 State Hidden Rules Explicitly

Some preferences are visible in the user’s actions but difficult to infer accurately.

Examples include:

These rules should be written into the skill after recording.

3.4 Avoid Sensitive Information

Do not enter passwords, identity numbers, access tokens, confidential customer information, or other secrets during the demonstration.

OpenAI specifically recommends using realistic inputs without including secrets or sensitive data.

3.5 Add Verification Steps

A reliable automation should confirm that the task succeeded.

For example, the skill may verify that:

Without verification, a workflow may appear complete even when the final action failed.

4. Three Execution Surfaces in the Codex Ecosystem

Record & Replay does not depend on a single control method. A generated skill can use different execution surfaces according to the task.

The three main options are Computer Use, the Codex Chrome extension, and the in-app browser.

4.1 Computer Use for Desktop and Cross-Application Workflows

Computer Use allows Codex to see and operate graphical interfaces. It can click, type, navigate between windows, and interact with applications when command-line tools or structured integrations are insufficient.

Typical use cases include:

Computer Use is available on macOS and Windows in supported regions. Record & Replay itself, however, launched as a macOS-only feature.

Computer Use has a broader operating scope than the browser-specific tools, but that scope also creates additional risk. It can affect application and system state outside a project directory. Tasks should therefore remain narrow, and users should review approval requests before allowing actions to continue.

4.2 Codex Chrome Extension for Signed-In Websites

The Chrome extension is designed for tasks that depend on an existing browser session.

It is suitable for websites such as:

The extension can use the active Chrome profile and work with authenticated pages. Codex asks for approval before interacting with a new website unless that domain has already been allowed. Users can manage website allowlists and blocklists in the relevant settings.

This execution surface is useful for content publishing, cloud administration, customer relationship management, and web-based office workflows.

Browser content should still be treated as untrusted input. A malicious or misleading page could influence an agent’s actions. Users should review the target site and the requested task before approving access.

4.3 In-App Browser for Web Development and Debugging

The Codex in-app browser is intended mainly for development work.

It supports:

It does not use the user’s regular browser profile. It also does not support existing cookies, browser extensions, signed-in pages, or normal Chrome tabs. Authenticated workflows should use the Chrome extension instead.

For deeper debugging, Developer mode provides controlled access to the Chrome DevTools Protocol. Codex can inspect console output, network traffic, DOM state, applied styles, and performance traces. Full CDP access requires explicit approval because it can expose sensitive browser internals.

The in-app browser is therefore best suited to development verification, not general office automation.

5. The Role of Appshots

Appshots are related to visual context, but they should not be confused with the Record & Replay recording engine.

An Appshot captures the frontmost application window and sends it to a Codex thread. It can include:

On macOS, users can create an Appshot by pressing both Command keys or by using a custom shortcut. The Appshot then behaves like an attachment in the Codex thread.

Appshots are useful when the user needs to show Codex a current screen state without writing a long explanation. For example, a developer may share an error dialog, settings panel, design preview, or API documentation page.

However, Appshots are discrete context snapshots. They should not be described as a continuous frame-by-frame recording stream for Record & Replay. The two features solve different problems:

This distinction makes the technical description more accurate.

6. Availability, Permissions, and Regional Restrictions

Record & Replay launched on June 18, 2026, as part of Codex app version 26.616.

Its initial availability has three main constraints:

  1. It is available on macOS;
  2. Computer Use must be enabled;
  3. It is initially unavailable in the European Economic Area, the United Kingdom, and Switzerland.

There is an important distinction between Record & Replay and Computer Use. On June 16, 2026, OpenAI announced broader Computer Use availability on macOS and Windows in the EEA, UK, and Switzerland. Record & Replay nevertheless launched two days later with its own regional exclusion.

OpenAI’s documentation does not provide a detailed public explanation for the difference. It is therefore better to state the restriction directly rather than attribute it to a specific regulatory or technical cause.

Organizations can also control availability through managed Codex configuration. If an administrator disables Computer Use through requirements.toml, Record & Replay becomes unavailable as well.

7. Security and Governance Considerations

GUI automation can affect real applications, accounts, and business data. This makes security review essential.

7.1 Apply Least Privilege

Codex should receive only the permissions required for the current workflow.

Avoid broad access to unrelated applications, folders, websites, or browser history.

7.2 Keep High-Risk Actions Human-Controlled

Human approval should remain mandatory for actions such as:

Record & Replay is most appropriate for low-risk and reversible operations.

7.3 Review Generated Skills

A recorded skill should be treated as executable operational documentation.

Before reuse, review:

A workflow that was safe during recording may become unsafe when different inputs are supplied.

7.4 Separate Credentials from the Workflow

Passwords, tokens, and other secrets should not be embedded in the recorded procedure.

Use existing authenticated sessions, approved integrations, managed secret stores, or human approval prompts where appropriate.

7.5 Treat External Content as Untrusted

Browser pages, emails, documents, and web forms may contain instructions designed to manipulate an AI agent.

The workflow should clearly distinguish trusted user instructions from untrusted page content. Sensitive browser permissions should be approved only for the specific task.

8. Skills, Plugins, and Team Deployment

Record & Replay is primarily a fast way to create a skill from a personal demonstration.

This is useful for an individual workflow. It may also help a developer prototype a repeatable process before formalizing it for wider use.

For team distribution, OpenAI recommends packaging the workflow as a separate plugin when the organization needs to:

A plugin provides a more structured distribution unit than a single locally generated skill.

This distinction creates two deployment levels.

Personal Workflow Level

Record & Replay is appropriate when:

Team or Enterprise Level

A plugin is more suitable when:

GUI automation and model API management should also be treated as separate infrastructure layers. Record & Replay controls user-facing workflows. A multi-model backend may still require a unified API access layer to avoid maintaining separate provider integrations. Services such as 4sapi can be considered for that supporting layer, rather than being presented as part of the Record & Replay feature itself.

9. Practical Business Scenarios

9.1 Administrative Operations

Record & Replay can automate stable internal processes such as:

The strongest candidates are tasks with predictable steps and clear completion messages.

9.2 Content Publishing

A skill may capture a publishing workflow that includes:

Variable values can be supplied during each replay. The overall procedure remains consistent.

Publishing workflows still require caution. A final review step is advisable before content becomes publicly visible.

9.3 Software Engineering

Development teams may use recorded skills for:

For local front-end development, the in-app browser is usually preferable. For native applications or workflows that span several programs, Computer Use is the better fit.

9.4 Internal Business Platforms

Many enterprises still depend on internal systems with limited API coverage.

Record & Replay may provide a practical bridge for these systems. It can automate the interface without requiring an immediate backend integration project.

However, GUI automation should not automatically replace an available API. For high-volume, deterministic workloads, an API or dedicated integration is usually easier to monitor, test, and maintain.

10. Where Record & Replay Works Best

Record & Replay is a strong fit when the workflow is:

It is a weaker fit when the workflow involves:

The feature should therefore be viewed as a new automation option, not a universal replacement for RPA, APIs, scripts, or human review.

11. Broader Impact on Human-Computer Interaction

The most important change introduced by Record & Replay is not the recording interface itself. It is the shift from instruction-based automation to demonstration-based automation.

Under the traditional model, users must translate their knowledge into one of three formats:

Record & Replay allows operational knowledge to be communicated through action.

The user demonstrates the task. Codex then converts the demonstration into a reusable skill. This lowers the initial barrier for people who understand a workflow but cannot easily formalize it as code.

The user’s role also changes. Instead of manually repeating every operation, the user becomes a workflow designer and reviewer. Their main responsibilities are to:

Codex becomes the executor, but the user remains responsible for defining boundaries and evaluating outcomes.

This model may become increasingly important as AI agents move beyond text and code into general desktop work. It offers a practical path between manual operation and fully engineered automation.

12. Conclusion

Codex Record & Replay introduces a practical method for turning demonstrated macOS workflows into reusable skills. It is designed for tasks that are repetitive, preference-dependent, or easier to show than describe.

The feature follows a straightforward process. The user starts a recording, explains the goal, demonstrates the workflow, reviews the generated skill, and replays it with new inputs. Codex can then execute the procedure through Computer Use, browser actions, installed plugins, or a combination of these tools.

Its value lies in reducing the effort required to formalize GUI workflows. Users do not need to begin with a script or a highly detailed prompt. They can begin with the process they already know.

At the same time, the feature has clear boundaries. Record & Replay launched only on macOS and initially excludes the EEA, UK, and Switzerland. It also requires Computer Use. Sensitive data, high-risk actions, unstable interfaces, and irreversible operations still require careful governance.

Record & Replay is therefore best understood as a bridge between manual work and engineered automation. It does not eliminate the need for APIs, plugins, security controls, or human review. Instead, it provides a faster way to capture operational knowledge and turn it into a repeatable AI-assisted workflow.

Tags:CodexGUI AutomationComputer UseAI AgentsDesktop Automation

Recommended reading

Explore more frontier insights and industry know-how.