Week 3 Outcomes: Direction Wrong

Forecast: $118.88
Actual (April 2 close): $109.05
Error: $9.83 (9.01%)
Naive baseline error: $5.83
Skill score: -0.69
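
For anyone checking the arithmetic, here is a minimal sketch of how the numbers above fit together. It assumes the skill score is 1 minus the ratio of model error to naive error, and that the percentage error is taken relative to the actual close. Those definitions are inferred from the reported figures (they reproduce them), not quoted from the methodology, and the variable names are illustrative.

```python
# Week 3 inputs, taken from the figures quoted in this post.
previous_close = 114.88  # starting price, which is also the naive forecast
forecast = 118.88        # model point forecast
actual = 109.05          # April 2 close


def skill_score(model_error: float, naive_error: float) -> float:
    """Assumed definition: 1 - model error / naive error.

    Positive means the model beat the naive baseline, negative means it lost.
    """
    return 1.0 - model_error / naive_error


model_error = abs(forecast - actual)
naive_error = abs(previous_close - actual)
percent_error = 100.0 * model_error / actual  # assumed: relative to the actual close
skill = skill_score(model_error, naive_error)

print(f"Model error: ${model_error:.2f} ({percent_error:.2f}%)")
print(f"Naive error: ${naive_error:.2f}")
print(f"Skill score: {skill:.2f}")
```

Under this definition any score below zero means "no change from last week" would have been the better forecast.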

The naive baseline won. Again. Three for three.

But this week is different. Not because the miss was bigger (Week 1 was $13.75). Because the model got the direction wrong for the first time. It predicted Brent would rise from $114.88 to $118.88. Instead it fell to $109.05. The model didn't just overshoot. It pointed the wrong way.
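
The directional test is simple enough to write down. This is a sketch of one way to score it, assuming direction is judged against the $114.88 starting price. The function name is hypothetical, not something from the experiment's code.

```python
def direction(start: float, end: float) -> str:
    """Label a price move relative to the starting price."""
    if end > start:
        return "up"
    if end < start:
        return "down"
    return "flat"


start_price = 114.88
predicted = direction(start_price, 118.88)  # the model's forecast
realised = direction(start_price, 109.05)   # the April 2 close
direction_correct = predicted == realised

print(predicted, realised, direction_correct)
```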

What Actually Happened

The forecast window ran from Sunday 30 March to Thursday 2 April. Good Friday on the 3rd meant markets closed early, giving us a four-day window instead of five.

Monday opened with Brent still elevated after the weekend gap-up to $115 that broke the $100-110 band. The IRGC announced American ICT and AI companies were legitimate military targets from April 1. France confirmed 30-40% of Gulf refining capacity destroyed with a three-year repair timeline. Trump's April 6 energy strike pause was still holding but nobody expected it to last.

Then prices started falling. Not because the war de-escalated. It didn't. The Pentagon confirmed ground operation preparations. Houthis kept firing. Iran launched its biggest missile salvo at Israel in three weeks. But Brent dropped from the $115 range to $103 on Monday before recovering. By Thursday, it closed at $109.05.

The NOVA briefings tagged every single day of the week as ESCALATORY. The market disagreed.

The Physical Price Divergence

This is the story of Week 3, and possibly the story of the entire experiment.

Dated Brent physical hit $141.37 on April 2. That's the spot market for real barrels being loaded onto real tankers. Futures closed at $109.05.

That's a $32 spread.

In Week 2, the gap was $11 ($115 vs $126). Since then it has nearly tripled. The model predicted $118.88. Physical said $141. Futures said $109. The model is sitting between two markets that are diverging from each other.
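
A rough sketch of the spread arithmetic, using only the prices quoted above. The Week 2 gap is the rounded $11 figure from this post, so the growth ratio is approximate.

```python
physical = 141.37   # Dated Brent physical, April 2
futures = 109.05    # futures close, April 2
forecast = 118.88   # model point forecast
week2_gap = 11.0    # rounded Week 2 gap quoted in this post ($126 vs $115)

spread = physical - futures
growth = spread / week2_gap  # approximate, because week2_gap is rounded

# Where the forecast sits between the two markets.
above_futures = forecast - futures
below_physical = physical - forecast

print(f"Spread: ${spread:.2f} ({growth:.1f}x the Week 2 gap)")
print(f"Forecast is ${above_futures:.2f} above futures, ${below_physical:.2f} below physical")
```

The forecast sits between the two prices, but it is still closer to futures than to physical.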

The experiment measures futures. Those are the rules. But the rules are measuring an increasingly managed instrument while the physical commodity trades at record wartime premiums. Every week the model "overcalls" futures, it's actually getting closer to what physical oil costs.

This doesn't change the score. But it should change how you read the score.

What the Model Got Right

Confidence intervals held. $109.05 sits within both the 68% CI ($105.52 to $132.86) and the 90% CI ($93.57 to $147.72). That's 3/3 weeks for both intervals. Calibration is still perfect.
Escalation dominance was correct. The week WAS dominated by escalation events. F-15 shot down. Iran hit Kuwait's largest refinery. Nuclear plant auxiliary buildings struck. The model's read of the geopolitical environment was right. Its read of how futures would price that environment was wrong.
The $115+ call was structurally sound. Physical oil validated it. The futures market just didn't follow.
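
The interval check is a simple containment test. A minimal sketch using the Week 3 bounds quoted above (the helper name is illustrative):

```python
actual = 109.05

# Week 3 confidence intervals as (lower, upper) bounds in dollars.
intervals = {
    "68%": (105.52, 132.86),
    "90%": (93.57, 147.72),
}


def covers(bounds: tuple[float, float], value: float) -> bool:
    """True if value lies inside the closed interval."""
    lower, upper = bounds
    return lower <= value <= upper


coverage = {label: covers(bounds, actual) for label, bounds in intervals.items()}
margin_68 = actual - intervals["68%"][0]  # distance above the 68% lower bound

print(coverage)
print(f"Margin above the 68% lower bound: ${margin_68:.2f}")
```

The close landed inside the 68% interval, but only a few dollars above its lower edge.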

What the Model Got Wrong

Direction. First directional miss. Predicted up from $114.88. Actual went down to $109.05. When your model points the wrong way, the magnitude doesn't matter.
Escalation at 55% was too high for futures pricing. The market has been discounting escalation rhetoric since Week 1. Three rounds of "final" deadlines. Nuclear strikes on both sides. Ground invasion leaks. None of it sustained futures above $115. The model still hadn't learned that futures are actively managed.
Didn't account for the Easter liquidity drain. A four-day week ending before Good Friday means thinner markets, less conviction, and a bias toward de-risking. Holiday-shortened weeks favour the naive baseline because traders close positions rather than adding to them.

The Running Score

Week	Forecast	Actual	Model Error	Naive Error	Skill Score	Direction
1	$120.16	$106.41	$13.75	$7.50	-0.83	Correct
2	$108.16	$106.84	$1.32	$0.40	-2.30	Correct
3	$118.88	$109.05	$9.83	$5.83	-0.69	Wrong

Three weeks. Zero wins against baseline. One directional miss. Perfect CI calibration. The model adds complexity without adding accuracy to the point estimate, but it consistently captures the actual price within its uncertainty bounds.
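
The tallies above can be recomputed from the table. This sketch assumes the same skill definition as the weekly score (1 minus model error over naive error). The pooled figure at the end is an extra derived from the table, not a number the experiment reports.

```python
# (week, model_error, naive_error, direction_correct) copied from the table.
weeks = [
    (1, 13.75, 7.50, True),
    (2, 1.32, 0.40, True),
    (3, 9.83, 5.83, False),
]

# Assumed definition: skill = 1 - model error / naive error.
skills = [round(1.0 - model / naive, 2) for _, model, naive, _ in weeks]
wins = sum(1 for _, model, naive, _ in weeks if model < naive)
direction_misses = sum(1 for *_, correct in weeks if not correct)

# Pooled skill over all weeks (derived here for illustration only).
total_model = sum(model for _, model, _, _ in weeks)
total_naive = sum(naive for _, _, naive, _ in weeks)
pooled_skill = 1.0 - total_model / total_naive

print(skills, wins, direction_misses, round(pooled_skill, 2))
```

Week 2 is a useful reminder of how the ratio behaves: a $1.32 miss still scores -2.30 because the naive error that week was only $0.40.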

If you're reading this as "the model is useless," that's a reasonable interpretation. If you're reading it as "the model is well-calibrated but the point estimate is structurally biased upward by the physical/futures divergence," that's also reasonable. Both things are true.

Methodology Notes for Week 4

Escalation drops from 55% to 47%. The 8-point cut is the biggest single-week scenario shift in the experiment. The model overcalled at 55%. The market keeps absorbing escalation. Time to listen.
Stalemate rises from 18% to 23%. Five weeks of grinding war with no decisive breakthrough. This is becoming the structural baseline.
Demand destruction rises from 12% to 15%. Five weeks above $109 is a cost structure, not a spike. JPMorgan is warning of $150. Goldman's adverse case is $140. Behaviour changes are compounding.
Demand destruction scenario range needs updating. The current floor ($60) was set when oil was in the $90s. At sustained $109+, demand destruction probably doesn't push below $85-90. That's a Week 5 job.
Physical/futures tracking. Starting to track the physical price alongside futures in the lab notes. The divergence is the most interesting finding of the experiment so far, even if it's not what the experiment set out to measure.
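
A quick consistency check on the scenario shifts listed above. Only the three scenarios named in these notes are included. Any others in the model are not listed here, so the sketch only confirms that the named changes net to zero, which would leave the combined weight of the unlisted scenarios unchanged if the full set sums to 100%.

```python
# Scenario probabilities in percentage points, from the notes above.
week3 = {"escalation": 55, "stalemate": 18, "demand_destruction": 12}
week4 = {"escalation": 47, "stalemate": 23, "demand_destruction": 15}

shifts = {name: week4[name] - week3[name] for name in week3}
net_shift = sum(shifts.values())
largest_move = max(shifts, key=lambda name: abs(shifts[name]))

print(shifts)
print(f"Net shift: {net_shift} points, largest move: {largest_move}")
```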

Scored April 5, 2026. You can find the live dashboard here.
