Runway Zero

RL-trained LLMs recover airport crises that rule-based systems cannot solve end to end.

Note: the Space was redeployed on the free tier to conserve GPU resources.

Link to Vercel Site - https://project-2pdc2.vercel.app/

Airlines still depend on highly trained operations-control teams when disruption cascades across aircraft, crews, passengers, runways, airline economics, and fairness. Runway Zero turns that unsolved end-to-end recovery problem into an OpenEnv environment where LLM agents learn from verifiable rewards.


393 RL / 1,827 base: simulated cancellations across all models/levels

90%: delay reduction after RL

4 × 4: models by crisis levels

7B RL > 120B w/o RL: trained Qwen2.5 beats every base model
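As a quick sanity check on the headline cancellation totals above, the relative reduction can be worked out directly. This is a minimal sketch: the function name `relative_reduction` is illustrative and not part of the project, and the 90% delay figure is a separate metric that is not derived here.

```python
def relative_reduction(base: float, treated: float) -> float:
    """Return the fraction by which `treated` is lower than `base`."""
    if base <= 0:
        raise ValueError("base must be positive")
    return (base - treated) / base


# Totals quoted on this page: simulated cancellations across all models/levels.
base_cancellations = 1827
rl_cancellations = 393

reduction = relative_reduction(base_cancellations, rl_cancellations)
print(f"Cancellation reduction after RL: {reduction:.1%}")  # prints 78.5%
```

On these totals, RL-trained agents cancel roughly 78.5% fewer flights than the base models.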

Hugging Face Space

The Space runs the OpenEnv environment; the large-model training evidence is linked here.

The public Space keeps the reset, step, state, and close API live for evaluation. The 7B, 14B, 31B, and 120B LLM training runs were executed as GPU GRPO jobs, then exported as adapters, trainer states, replay traces, and plots. Keeping those models loaded inside a public demo Space would make the demo slow and unreliable for judges, so this page opens the exact replay UI and links the notebooks and artifacts used to verify training.
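The reset, step, state, and close lifecycle mentioned above can be illustrated with a toy in-process environment. This is a hypothetical stand-in written for illustration only: `ToyRecoveryEnv`, its fields, the `"dispatch"` action, and the reward rule are all assumptions, and none of it is the Runway Zero environment or the OpenEnv client API.

```python
from dataclasses import dataclass


@dataclass
class ToyRecoveryEnv:
    """Hypothetical toy environment; NOT the Runway Zero or OpenEnv code."""
    delayed_flights: int = 3
    steps_taken: int = 0
    closed: bool = False

    def reset(self) -> dict:
        # Start a fresh episode with three delayed flights.
        self.delayed_flights = 3
        self.steps_taken = 0
        self.closed = False
        return self.state()

    def step(self, action: str) -> tuple[dict, float, bool]:
        if self.closed:
            raise RuntimeError("environment is closed")
        self.steps_taken += 1
        reward = 0.0
        # Assumed rule: each valid dispatch clears one delayed flight.
        if action == "dispatch" and self.delayed_flights > 0:
            self.delayed_flights -= 1
            reward = 1.0
        done = self.delayed_flights == 0
        return self.state(), reward, done

    def state(self) -> dict:
        return {
            "delayed_flights": self.delayed_flights,
            "steps_taken": self.steps_taken,
        }

    def close(self) -> None:
        self.closed = True


env = ToyRecoveryEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    obs, reward, done = env.step("dispatch")
    total_reward += reward
final_state = env.state()
env.close()
print(final_state, total_reward)
```

An evaluator would drive the hosted environment in the same order: reset once, step until the episode ends, read the state, then close.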


Hackathon Story

Most benchmarks reward planning. Runway Zero rewards recovery.

Static schedules are easy. The real skill is what happens when fog hits Delhi, Mumbai loses a runway, Bengaluru slots become political, crew legality collapses, passengers are stranded, and every airline asks Tower Central to favor them. Existing tools support pieces of this work; the complete end-to-end recovery problem remains human-led.

Level 1

Operations Recovery

Fog, runway debris, and aircraft faults test whether the model can make safe dispatch decisions.

Recovery score comparison: 84 RL / 38 base

Level 2

Passenger-Aware Recovery

Connections, stranded passengers, emergency arrivals, and gate failures turn delay into human cost.

Recovery score comparison: 88 RL / 29 base

Level 3

Economic Multi-Agent Control

IndiGo, Air India, Akasa Air, and SpiceJet negotiate slots while Tower Central preserves fairness.

Recovery score comparison: 90 RL / 21 base

Level 4

IndiGo Crisis Replay

A December 2025-style crew availability crisis shows how RL recovery could reduce mass cancellations.

Recovery score comparison: 82 RL / 12 base

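The four per-level recovery scores above can be summarized in a few lines. This is a minimal sketch under stated assumptions: the page does not specify the score scale or which model each score belongs to, and the unweighted mean across levels is a choice made here for illustration, not a metric reported by the project.

```python
# (RL score, base score) per level, as quoted on this page.
scores = {
    "Level 1": (84, 38),
    "Level 2": (88, 29),
    "Level 3": (90, 21),
    "Level 4": (82, 12),
}

# Absolute gain of the RL-trained agent over the base model at each level.
gains = {level: rl - base for level, (rl, base) in scores.items()}

# Unweighted means across the four levels (an illustrative assumption).
mean_rl = sum(rl for rl, _ in scores.values()) / len(scores)
mean_base = sum(base for _, base in scores.values()) / len(scores)

for level, gain in gains.items():
    print(f"{level}: +{gain} points")
print(f"Mean RL: {mean_rl:.1f}, mean base: {mean_base:.1f}")
```

The gap widens with difficulty: the gain grows from 46 points at Level 1 to 70 points at Level 4, while the unweighted means are 86.0 for RL and 25.0 for base.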