Quix · Progressive results
Progressive view

From V0 to three levels deep.

Watch the cohort fall toward the origin as each level adds context. V0 first — then Level 1, then Level 2, then Level 3. Level 4 is the polish; the shape was already set by then.

V0 yaw ≡ 0.0163 rad/s · V0 CTE ≡ 254 m · x = yaw RMSE · y = CTE RMSE
Stage 0

The "nonexistent agent."

V0 is the do-nothing baseline. It marks where every agent starts — top-right, far from where we want to be.

[Scatter plot: Stage 0 · V0 only. x-axis: yaw RMSE (rad/s), 0.000 to 0.018; y-axis: CTE RMSE (m), 0 to 275. Legend: V0 baseline.]

V0 yaw RMSE: 0.0163 rad/s
V0 CTE RMSE: 254.3 metres
Stage 1

Level 1 agents enter the field.

Ten first-pass models. The pack falls a long way in one step: the best agent cuts yaw RMSE by 51.0% and CTE RMSE by 57.2% against V0. Most of the headroom is in just having any physically grounded prediction.

[Scatter plot: Stage 1 · V0 + Level 1. x-axis: yaw RMSE (rad/s), 0.000 to 0.018; y-axis: CTE RMSE (m), 0 to 275. Legend: V0 baseline, Level 1 agents. V0 baseline at yaw = 0.0163, CTE = 254.3.]

Level 1 agents plotted:

Agent   Yaw RMSE (rad/s)   CTE RMSE (m)
A01     0.0080             110.2
A02     0.0084             108.8
A03     0.0084             110.2
A04     0.0085             111.7
A05     0.0084             113.8
A06     0.0086             121.3
A07     0.0082             118.4
A08     0.0097             110.3
A09     0.0081             113.3
A10     0.0085             128.5

Cohort size so far: 10 agents
Best yaw RMSE: 0.0080 (↓ 51.0% vs V0)
Best CTE RMSE: 108.8 m (↓ 57.2% vs V0)
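The "↓ % vs V0" figures on the cards can be reproduced with a short calculation. This is a minimal sketch, assuming the percentage is the relative reduction in RMSE against the V0 baseline; the function name `improvement_vs_v0` is illustrative and not taken from the page.

```python
# Values as displayed on this page (rounded to the shown precision).
V0_YAW_RMSE = 0.0163   # rad/s
V0_CTE_RMSE = 254.3    # metres


def improvement_vs_v0(rmse: float, v0_rmse: float) -> float:
    """Percentage reduction in RMSE relative to V0 (assumed formula)."""
    return 100.0 * (1.0 - rmse / v0_rmse)


# Stage 1 best values from the cards above.
yaw_gain = improvement_vs_v0(0.0080, V0_YAW_RMSE)
cte_gain = improvement_vs_v0(108.8, V0_CTE_RMSE)

print(f"best yaw: {yaw_gain:.1f}% below V0")
print(f"best CTE: {cte_gain:.1f}% below V0")
```

With the rounded inputs, the CTE figure matches the displayed 57.2%, while yaw comes out about a tenth of a point under the displayed 51.0%. That gap is what one would expect if the page computes from unrounded values, though the page does not say so.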
Stage 2

Level 2 — same brief, sharper skill.

Level 2 refits the same model shape with better fitting choices. The pack tightens (CTE RMSE now spans 103.3 to 115.2 m, against 108.8 to 128.5 m at Level 1); the floor doesn’t move much.

[Scatter plot: Stage 2 · V0 + Level 1 + Level 2. x-axis: yaw RMSE (rad/s), 0.000 to 0.018; y-axis: CTE RMSE (m), 0 to 275. Legend: V0 baseline, Level 1 agents, Level 2 agents. V0 and Level 1 points are unchanged from Stage 1.]

Level 2 agents added:

Agent   Yaw RMSE (rad/s)   CTE RMSE (m)
A01     0.0084             109.6
A02     0.0082             103.3
A03     0.0083             108.9
A04     0.0082             104.0
A05     0.0083             104.0
A06     0.0080             111.0
A07     0.0080             109.6
A08     0.0088             109.4
A09     0.0080             111.0
A10     0.0080             115.2

Cohort size so far: 20 agents
Best yaw RMSE: 0.0080 (↓ 51.0% vs V0)
Best CTE RMSE: 103.3 m (↓ 59.4% vs V0)
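The claim that the pack tightens while the floor barely moves can be checked against the plotted CTE values. This is a rough sketch using the rounded per-agent numbers from the Stage 1 and Stage 2 charts; `floor_and_spread` is a hypothetical helper, and "spread" here simply means max minus min.

```python
# CTE RMSE (m) per agent, A01..A10, as plotted (rounded values).
level1_cte = [110.2, 108.8, 110.2, 111.7, 113.8, 121.3, 118.4, 110.3, 113.3, 128.5]
level2_cte = [109.6, 103.3, 108.9, 104.0, 104.0, 111.0, 109.6, 109.4, 111.0, 115.2]


def floor_and_spread(values):
    """Return (best value, max - min) for one cohort of RMSE values."""
    return min(values), max(values) - min(values)


floor1, spread1 = floor_and_spread(level1_cte)
floor2, spread2 = floor_and_spread(level2_cte)

print(f"Level 1: floor {floor1:.1f} m, spread {spread1:.1f} m")
print(f"Level 2: floor {floor2:.1f} m, spread {spread2:.1f} m")
print(f"floor moved by {floor1 - floor2:.1f} m")
```

On these numbers the spread shrinks from roughly 19.7 m to 11.9 m, while the floor drops by about 5.5 m.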
Stage 3

Level 3 — domain knowledge in the prompt.

Vehicle dynamics are handed to the agent in the prompt. The whole cohort shifts down on CTE, to between 70.4 and 72.8 m: the kind of move you don’t get from compute.

[Scatter plot: Stage 3 · V0 + Level 1 + Level 2 + Level 3. x-axis: yaw RMSE (rad/s), 0.000 to 0.018; y-axis: CTE RMSE (m), 0 to 275. Legend: V0 baseline, Level 1 agents, Level 2 agents, Level 3 agents. V0, Level 1 and Level 2 points are unchanged from Stage 2.]

Level 3 agents added:

Agent   Yaw RMSE (rad/s)   CTE RMSE (m)
A01     0.0074             72.8
A02     0.0070             70.6
A03     0.0070             70.6
A04     0.0070             70.4
A05     0.0071             70.6
A06     0.0071             70.6
A07     0.0071             70.6
A08     0.0071             70.6
A09     0.0070             70.6
A10     0.0071             70.6

Cohort size so far: 30 agents
Best yaw RMSE: 0.0070 (↓ 56.8% vs V0)
Best CTE RMSE: 70.4 m (↓ 72.3% vs V0)
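One way to make "the whole cohort shifts down" concrete is to test whether the worst Level 3 agent still beats the best agent from the earlier levels on CTE. This is a sketch over the rounded plotted values; the variable names are illustrative only.

```python
# CTE RMSE (m) per agent as plotted (rounded values).
level1_cte = [110.2, 108.8, 110.2, 111.7, 113.8, 121.3, 118.4, 110.3, 113.3, 128.5]
level2_cte = [109.6, 103.3, 108.9, 104.0, 104.0, 111.0, 109.6, 109.4, 111.0, 115.2]
level3_cte = [72.8, 70.6, 70.6, 70.4, 70.6, 70.6, 70.6, 70.6, 70.6, 70.6]

worst_level3 = max(level3_cte)                 # weakest Level 3 agent
best_earlier = min(level1_cte + level2_cte)    # strongest Level 1 or 2 agent

# True only if every Level 3 agent beats every earlier agent on CTE.
cohort_shifted = worst_level3 < best_earlier
gap = best_earlier - worst_level3

print(f"worst Level 3: {worst_level3:.1f} m, best earlier: {best_earlier:.1f} m")
print(f"cohorts separated: {cohort_shifted}, gap {gap:.1f} m")
```

On the plotted values the two groups do not overlap: the weakest Level 3 agent sits about 30.5 m below the strongest Level 1 or Level 2 agent.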
Trajectory

What each level moved.

Improvement vs V0, level by level. Each figure is the percentage reduction in RMSE relative to V0, so higher is better. Median is the cohort centre; best is the strongest single agent. Level 4 is not shown: it is the same numbers, with finishing.

Level     Yaw ↑ median   Yaw ↑ best   CTE ↑ median   CTE ↑ best
Level 1   48.6%          51.0%        56.1%          57.2%
Level 2   49.8%          51.0%        57.0%          59.4%
Level 3   56.6%          56.8%        72.2%          72.3%
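The CTE columns of the table can be approximated from the per-agent points shown in the charts. This is a hedged sketch: it assumes "median" is the ordinary median of each level's ten agents, taken on RMSE before converting to a percentage, and it works from the rounded plotted values. The page states neither convention, so small differences from the published figures are expected.

```python
from statistics import median

V0_CTE_RMSE = 254.3  # metres, as displayed

# CTE RMSE (m) per agent, A01..A10, as plotted (rounded values).
cte_by_level = {
    "Level 1": [110.2, 108.8, 110.2, 111.7, 113.8, 121.3, 118.4, 110.3, 113.3, 128.5],
    "Level 2": [109.6, 103.3, 108.9, 104.0, 104.0, 111.0, 109.6, 109.4, 111.0, 115.2],
    "Level 3": [72.8, 70.6, 70.6, 70.4, 70.6, 70.6, 70.6, 70.6, 70.6, 70.6],
}


def pct_below_v0(rmse: float) -> float:
    """Percentage reduction in CTE RMSE relative to V0 (assumed formula)."""
    return 100.0 * (1.0 - rmse / V0_CTE_RMSE)


# level -> (median improvement, best improvement), both in percent
summary = {
    level: (pct_below_v0(median(values)), pct_below_v0(min(values)))
    for level, values in cte_by_level.items()
}

for level, (med, best) in summary.items():
    print(f"{level}: median {med:.1f}%, best {best:.1f}%")
```

Under these assumptions the "best" column is reproduced to the displayed precision for all three levels. The recomputed medians land within a few tenths of a point of the published ones, with Level 1 the furthest off at about 55.8% against 56.1%.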
V0 yaw ≡ 0.0163 rad/s · V0 CTE ≡ 254 m · 3 levels, 30 agents