Reconnaissance Mission Planning

POMDP · Partial Observability

Plan scout missions with incomplete intel on alien positions. The true state (where aliens are hiding) is never directly observable — only noisy sensor readings are available. This is a Partially Observable Markov Decision Process (POMDP), PSPACE-hard in general.

Fog of War

A 4×4 grid represents the reconnaissance zone. Two aliens are hiding in unknown cells. Your scout drone starts at position (0,0) and can move (up/down/left/right) or scan adjacent cells. Scanning is imperfect: it detects an alien that is present with 80% probability (true positive) and falsely reports one in an empty cell with 10% probability (false positive). After each action, your belief — a probability distribution over all possible alien positions — is updated via Bayes’ rule.
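The scan update above can be sketched per cell. This is a minimal, hedged sketch assuming the aliens are static (no state transitions) and that each cell's belief is updated independently using the stated 80%/10% sensor model; the actual demo may maintain a joint belief instead.

```python
# Per-cell Bayes update for one scan, assuming a static alien and
# the sensor model stated above (80% true positive, 10% false positive).
P_DETECT_TRUE = 0.80   # P(sensor says "detected" | alien present)
P_DETECT_FALSE = 0.10  # P(sensor says "detected" | cell empty)

def update_belief(prior: float, detected: bool) -> float:
    """Posterior P(alien in cell) after scanning that cell once."""
    if detected:
        likelihood_alien = P_DETECT_TRUE
        likelihood_empty = P_DETECT_FALSE
    else:
        likelihood_alien = 1 - P_DETECT_TRUE
        likelihood_empty = 1 - P_DETECT_FALSE
    numerator = likelihood_alien * prior
    # Normalizing constant eta = P(observation) under the prior
    evidence = numerator + likelihood_empty * (1 - prior)
    return numerator / evidence

# Uniform prior: 2 aliens over 16 cells gives P = 2/16 = 0.125 per cell
print(update_belief(0.125, True))   # a detection raises the belief
print(update_belief(0.125, False))  # no detection lowers it
```

Starting from the uniform prior of 0.125, a single detection lifts the cell's belief above 0.5, while a non-detection drops it to roughly 0.03 — which is why repeated scans of the same cell converge quickly despite the noisy sensor.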
| Defense Domain  | OR Element        | Symbol     | Example              |
|-----------------|-------------------|------------|----------------------|
| Alien position  | Hidden state      | s          | Alien in cell (2,3)  |
| Scout movement  | Action            | a ∈ A      | Move north           |
| Sensor reading  | Observation       | o          | “Detected”           |
| Intel estimate  | Belief            | b(s)       | 0.35 probability     |
| Sensor accuracy | Observation model | O(o\|s′,a) | 80% detection        |
| Mission value   | Reward            | R(s,a)     | +50 per alien found  |
MAXIMIZE  E[ Σ_{t=0} γ^t · R(s_t, a_t) ]  over policy π: b → a   // maps beliefs to actions

Belief update (Bayes' rule):
  b′(s′) = η · O(o|s′,a) · Σ_s T(s′|s,a) · b(s)
  where η is a normalizing constant.

// PSPACE-hard in general (Papadimitriou & Tsitsiklis, 1987)
// Even for finite S, the belief space is continuous (the (|S|−1)-simplex)
// This demo uses heuristic policies, NOT optimal planning

★☆☆ Educational Demo

This is a simplified grid-world illustration of belief updates under partial observability. It does NOT solve the full POMDP — that would require representing and optimizing over a continuous belief space, which is computationally intractable even for this small grid. The two heuristic policies (most-likely-state and information-gathering) are reasonable practical approaches but are not guaranteed to be optimal. See Kaelbling et al. (1998) for the full theory.
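The two heuristic policies mentioned above can be sketched as follows. This is an illustrative sketch, not the demo's actual implementation: it assumes the belief is a dict mapping each cell to an independent probability of containing an alien, and uses the 80%/10% sensor model from the grid description. The information-gathering heuristic here is one common choice (greedy expected-entropy reduction); the demo's version may differ.

```python
import math

def most_likely_state_action(belief):
    """Heuristic 1 (most-likely-state): scan the cell with the highest belief."""
    return max(belief, key=belief.get)

def entropy(p):
    """Binary entropy (nats) of a single cell's alien/no-alien belief."""
    return -sum(x * math.log(x) for x in (p, 1 - p) if 0 < x < 1)

def information_gathering_action(belief, p_tp=0.8, p_fp=0.1):
    """Heuristic 2 (information-gathering): scan the cell whose scan
    yields the largest expected reduction in entropy (greedy info gain)."""
    best_cell, best_gain = None, -1.0
    for cell, prior in belief.items():
        # Probability the sensor reports "detected" for this cell
        p_det = p_tp * prior + p_fp * (1 - prior)
        post_det = p_tp * prior / p_det if p_det > 0 else prior
        post_no = (1 - p_tp) * prior / (1 - p_det) if p_det < 1 else prior
        expected_h = p_det * entropy(post_det) + (1 - p_det) * entropy(post_no)
        gain = entropy(prior) - expected_h
        if gain > best_gain:
            best_cell, best_gain = cell, gain
    return best_cell
```

Note the trade-off the demo illustrates: the most-likely-state policy exploits current intel (it scans where the alien probably is), while the information-gathering policy favors uncertain cells with belief near 0.5, where a scan is most informative.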

Grid Reconnaissance

4×4 Grid · 2 Hidden Aliens · 80% Sensor Accuracy

Colour intensity = belief probability. Click a cell to scan it. Use arrow buttons to move. Belief updates after each action via Bayes’ rule.

References
Published Kaelbling, L.P., Littman, M.L., & Cassandra, A.R. (1998). “Planning and acting in partially observable stochastic domains.” Artificial Intelligence, 101(1–2), 99–134. — Foundational POMDP paper; belief-space MDP formulation.
Published Kurniawati, H., Hsu, D., & Lee, W.S. (2008). “SARSOP: Efficient Point-Based POMDP Planning.” RSS. — Scalable point-based value iteration for POMDPs.

Preparing for First Contact

We do recommend the Hungarian algorithm. It works on any planet.

👽🛸⚠️

Educational Fiction Disclaimer

This is a fictional educational scenario.

  • All data is entirely fictional
  • No military applications intended
  • The author advocates for peace