---
name: llm-spatial-ipd-runner
description: Run and inspect LLM-vs-LLM spatial iterated prisoner's dilemma simulations in web/scratch using the Bun runner and historical replay UI.
compatibility: Created for Zo Computer
metadata:
  author: rob.zo.computer
  version: "0.1"
---

# llm-spatial-ipd-runner

Run LLM-vs-LLM spatial iterated prisoner's dilemma simulations from soft natural-language parameters, persist the results to SQLite, and inspect completed runs in the scratch site viewer.

## When to use

Use this skill when Rob asks for things like:
- "run ipd grid 20x20 100 steps all registered models"
- "run a small multi-model IPD test"
- "simulate GPT-5.4 vs Sonnet on a 10x10 grid"
- "run another LLM spatial prisoner's dilemma and give me the run id"

## What this is

This skill operates the LLM Spatial Iterated Prisoner's Dilemma setup in `file 'web/scratch'`.

Core pieces:
- Runner: `file 'web/scratch/scripts/llm_spatial_ipd.ts'`
- Default config: `file 'web/scratch/data/llm-spatial-ipd/default-config.json'`
- DB: `file 'web/scratch/data/llm-spatial-ipd/runs.sqlite'`
- Schema: `file 'web/scratch/data/llm-spatial-ipd/schema.sql'`
- Historical viewer page: `file 'web/scratch/src/pages/LlmSpatialIpd.tsx'`
- Historical viewer route: `/llm-spatial-ipd?runId=<RUN_ID>`

The runner uses the AI SDK with Vercel AI Gateway string-model routing. Each cell on a toroidal grid is assigned a model. In every generation, neighboring cells play pairwise iterated prisoner's dilemma matches, after which each cell updates its model assignment via imitation or replicator dynamics.
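The toroidal wrap-around can be sketched as follows. This is a minimal illustration of Moore-neighborhood lookup on a torus, not the runner's actual code; the real implementation lives in `scripts/llm_spatial_ipd.ts`.

```typescript
// Moore neighborhood (8 neighbors) on a toroidal grid: coordinates wrap at
// the edges via modular arithmetic. Illustrative sketch only.
function mooreNeighbors(x: number, y: number, gridSize: number): Array<[number, number]> {
  const neighbors: Array<[number, number]> = [];
  for (let dx = -1; dx <= 1; dx++) {
    for (let dy = -1; dy <= 1; dy++) {
      if (dx === 0 && dy === 0) continue; // skip the cell itself
      neighbors.push([(x + dx + gridSize) % gridSize, (y + dy + gridSize) % gridSize]);
    }
  }
  return neighbors;
}
```

A corner cell like (0, 0) still has exactly 8 neighbors, since the grid wraps to the opposite edge.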

## Registered models

Current registered model keys:
- `GPT5_3_CODEX_MC`
- `GPT5_4_MC`
- `OPUS_4_6_MC`
- `KIMI_K2_5_MC`
- `GLM_5_MC`
- `MINIMAX_2_5_MC`
- `GEMINI_3_MC`
- `SONNET_4_5_MC`

"all registered models" means all of the above with equal weight unless specified otherwise.

## Default interpretation of soft params

Map common language to config fields like this:
- "grid 20x20" → `gridSize: 20`
- "100 steps" / "100 generations" → `generations: 100`
- "8 rounds" / "8 rounds per step" → `roundsPerGeneration: 8`
- "all registered models" → all model weights set to `1`
- "just GPT-5.4 and Sonnet" → those model weights set to `1`, all others `0`
- "moore" / "8-neighbor" → `neighborhood: "moore"`
- "von neumann" / "4-neighbor" → `neighborhood: "vonneumann"`
- "imitate best" → `updateRule: "imitate_best"`
- "replicator" → `updateRule: "replicator"`
- "noise 1%" → `noiseRate: 0.01`
- "mutation 0.2%" → `mutationRate: 0.002`
- "seed 42" → `seed: 42`
- "temptation 5 reward 3 punishment 1 sucker 0" → payoff matrix fields

Reasonable defaults if not specified:
- `gridSize: 12`
- `generations: 24`
- `roundsPerGeneration: 8`
- `neighborhood: "moore"`
- `updateRule: "imitate_best"`
- `temptation: 5`
- `reward: 3`
- `punishment: 1`
- `sucker: 0`
- `mutationRate: 0.002`
- `noiseRate: 0.01`
- all model weights = 1
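Conceptually, translation is a shallow merge of the defaults above with only the fields the request mentions. The sketch below encodes the skill's own defaults as a plain object; the canonical `default-config.json` may contain additional fields.

```typescript
// Defaults from this skill, expressed as a plain object. The canonical
// default-config.json is the real baseline and may have more fields.
const defaults = {
  gridSize: 12,
  generations: 24,
  roundsPerGeneration: 8,
  neighborhood: "moore",
  updateRule: "imitate_best",
  temptation: 5,
  reward: 3,
  punishment: 1,
  sucker: 0,
  mutationRate: 0.002,
  noiseRate: 0.01,
};

// "run ipd grid 20x20 100 steps" — only the mentioned fields are overridden.
const overrides = { gridSize: 20, generations: 100 };
const config = { ...defaults, ...overrides };
```

Everything not named in the request keeps its default, so the temp config stays minimal and predictable.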

## How to run it

### 1. Translate the request into a config

In a new chat, the agent should read this skill and interpret Rob's request directly. A separate parser is not required.

Write a temporary JSON config in the conversation workspace; the canonical DB stays at:
- `file 'web/scratch/data/llm-spatial-ipd/runs.sqlite'`

Use the current default config at `file 'web/scratch/data/llm-spatial-ipd/default-config.json'` as a baseline, then override only the requested parameters.


You may also initialize the canonical files if needed:
```bash
bun scripts/llm_spatial_ipd.ts init
```

### 2. Execute from the scratch site root

Always use cwd:
```bash
/home/workspace/web/scratch
```

Run command:
```bash
bun scripts/llm_spatial_ipd.ts run --config /absolute/path/to/temp-config.json
```

The runner prints JSON including the `runId` and DB path.
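Assuming the JSON summary is the last non-empty line of stdout and has a `runId` field (both assumptions about the runner's output shape — verify against a real run), the id can be captured like this:

```typescript
// Extract the runId from the runner's stdout. Assumes the JSON summary is
// the final line and contains a `runId` field — check the actual output.
function extractRunId(stdout: string): string {
  const lines = stdout.trim().split("\n");
  const summary = JSON.parse(lines[lines.length - 1]) as { runId: string };
  return summary.runId;
}
```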

### 3. Verify persistence

Optionally confirm the run landed in SQLite:
```bash
sqlite3 /home/workspace/web/scratch/data/llm-spatial-ipd/runs.sqlite \
  "select id,status,name,grid_size,generations from runs order by created_at desc limit 5;"
```

## How to view results

Primary viewer:
- `file 'web/scratch'`
- route: `/llm-spatial-ipd?runId=<RUN_ID>`

The page is designed to show:
- run metadata
- model roster
- population evolution chart
- final population mix
- replayable grid with play/pause/scrubber
- sample interactions for a selected generation

API endpoints behind the viewer:
- `GET /api/llm-spatial-ipd/runs`
- `GET /api/llm-spatial-ipd/runs/:runId`
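For step 7 of the checklist, the single-run endpoint can be queried directly. The URL builder below matches the routes listed above; the dev-server origin and the response field names are assumptions.

```typescript
// Build the lookup URL for a single run, matching the route above.
function runLookupUrl(origin: string, runId: string): string {
  return `${origin}/api/llm-spatial-ipd/runs/${encodeURIComponent(runId)}`;
}

// Hypothetical usage (origin and response shape are assumptions):
// const run = await (await fetch(runLookupUrl("http://localhost:3000", runId))).json();
// console.log(run.status);
```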

## Output schema summary

Each run is persisted across these SQLite tables:
- `runs` — one row per simulation run
- `run_models` — participating models and gateway IDs
- `run_population` — per-generation counts/shares/scores by model
- `run_cells` — full grid state by generation for replay
- `run_matches` — match-by-match actions and payoffs
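To illustrate the `run_population` semantics: per-generation shares are counts normalized by the number of cells, so they should sum to 1 within a generation. The column names implied here are assumptions based on the summary above.

```typescript
// Compute per-model shares from per-model counts for one generation.
// Assumes counts in run_population sum to gridSize^2 per generation.
function populationShares(counts: Record<string, number>): Record<string, number> {
  const total = Object.values(counts).reduce((a, b) => a + b, 0);
  const shares: Record<string, number> = {};
  for (const [model, count] of Object.entries(counts)) {
    shares[model] = count / total;
  }
  return shares;
}
```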

## Execution checklist

When running this skill:
1. Read the user's request.
2. Read `file 'web/scratch/data/llm-spatial-ipd/default-config.json'` if useful for inheritance.
3. Translate the request into an explicit JSON config yourself.
4. Save the config into the conversation workspace.
5. Run the Bun script from `file 'web/scratch'`.
6. Capture the `runId`.
7. Verify the run status in SQLite or via `/api/llm-spatial-ipd/runs/<runId>`.
8. Return the run id and the viewer route.

## Example translations

### Example A
User: "run ipd grid 20x20 100 steps all registered models"

Interpret as roughly:
```json
{
  "gridSize": 20,
  "generations": 100,
  "modelWeights": {
    "GPT5_3_CODEX_MC": 1,
    "GPT5_4_MC": 1,
    "OPUS_4_6_MC": 1,
    "KIMI_K2_5_MC": 1,
    "GLM_5_MC": 1,
    "MINIMAX_2_5_MC": 1,
    "GEMINI_3_MC": 1,
    "SONNET_4_5_MC": 1
  }
}
```

### Example B
User: "run a small test with gpt5.4 and sonnet on a 6x6 grid for 10 generations"

Interpret as roughly:
```json
{
  "gridSize": 6,
  "generations": 10,
  "modelWeights": {
    "GPT5_3_CODEX_MC": 0,
    "GPT5_4_MC": 1,
    "OPUS_4_6_MC": 0,
    "KIMI_K2_5_MC": 0,
    "GLM_5_MC": 0,
    "MINIMAX_2_5_MC": 0,
    "GEMINI_3_MC": 0,
    "SONNET_4_5_MC": 1
  }
}
```

## Constraints and cautions

- Do not restart the scratch dev server manually.
- Use the existing site at `file 'web/scratch'`.
- Large runs can be expensive and slow because every edge interaction can trigger model calls.
- For very large requested runs, warn Rob about likely cost/runtime and, if needed, suggest a smaller smoke test first.
- Prefer temporary config files in the conversation workspace rather than editing the canonical default config unless Rob explicitly asks for a default change.
- A helper script may exist for testing, but the skill should not rely on it; the chat agent should be able to execute this workflow from the skill instructions alone.
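When warning about large runs, a rough call-count estimate helps. The sketch below assumes each cell plays each Moore neighbor once per generation and each player makes one model call per round; both are assumptions about the runner's internals, so treat the result as an order-of-magnitude bound, not a quote.

```typescript
// Rough upper bound on model calls. On a torus with a Moore neighborhood
// there are 8 links per cell, i.e. 4 * N^2 unique pairs. Assumes one model
// call per player per round — verify against the runner before quoting.
function estimateModelCalls(gridSize: number, generations: number, rounds: number): number {
  const pairs = 4 * gridSize * gridSize; // unique Moore edges on a torus
  return pairs * rounds * 2 * generations;
}
```

Under these assumptions, the 20x20, 100-step example lands in the millions of calls, which is exactly when a smaller smoke test is worth suggesting first.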

## Suggested response format after a run

Keep it short:
- run status
- `runId`
- key config summary
- where to view it: `/llm-spatial-ipd?runId=<RUN_ID>`
