Lecture 07 Adversarial Search

#notes #cs471

Recap

Recap: Local Search

Useful when the path to the goal does not matter, i.e. when solving a pure optimization problem
**Basic Idea:**
- Keep only the current state
- Improve it iteratively
- Don’t keep the paths followed

Recap: Hill-Climbing Search

Repeatedly move to the best neighboring state; stop when no neighbor is better, which leaves the search at a local maximum
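
The loop described above can be sketched as follows. This is a minimal illustration, not the lecture's own code: `hill_climb`, `neighbors`, `value`, and the 1-D `landscape` are all hypothetical names and data chosen for the example.

```python
def hill_climb(start, neighbors, value):
    """Greedy ascent: move to the best neighbor until none improves on the current state."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current  # a local maximum, not necessarily the global one
        current = best

# Hypothetical 1-D landscape: the index is the state, the entry is its value.
landscape = [1, 3, 5, 4, 2, 6, 9, 7]

def neighbors(i):
    # Left and right neighbors that stay inside the landscape.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]

def value(i):
    return landscape[i]

print(hill_climb(0, neighbors, value))  # 2: stuck at the local maximum (value 5)
print(hill_climb(4, neighbors, value))  # 6: reaches the global maximum (value 9)
```

Starting from index 0 the search stops at value 5 even though value 9 exists further right, which is the local-maximum problem in miniature.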

Challenges for Hill-Climbing

Local Maxima
- Once a local maximum is reached, there is no way to backtrack

Recap: Simulated Annealing

Key Idea: Usually move uphill, but sometimes accept a downhill move, which makes it possible to escape local maxima.
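
One common way to decide when to go downward is a Metropolis-style acceptance rule. The sketch below assumes that rule and a maximization problem; `acceptance_probability`, `accept`, `delta`, and `temperature` are illustrative names, not taken from the lecture.

```python
import math
import random

def acceptance_probability(delta, temperature):
    """delta = value(candidate) - value(current), for a maximization problem."""
    if delta > 0:
        return 1.0  # uphill moves are always accepted
    return math.exp(delta / temperature)  # downhill moves: less likely as T shrinks

def accept(delta, temperature, rng=random):
    """Randomly accept or reject a candidate move according to the rule above."""
    return rng.random() < acceptance_probability(delta, temperature)

# The same downhill step is accepted often when hot and almost never when cold.
print(acceptance_probability(-1.0, temperature=10.0))
print(acceptance_probability(-1.0, temperature=0.1))
```

Lowering the temperature over time makes the search behave more and more like plain hill climbing.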

Recap: Gradient Descent

Move in the direction of the negative gradient of the function (against the gradient) to decrease its value
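
A minimal sketch of the update rule, assuming a fixed step size and a one-dimensional function; `gradient_descent`, `step_size`, and the example function $f(x) = (x - 3)^2$ are assumptions made for illustration.

```python
def gradient_descent(grad, x0, step_size=0.1, iterations=100):
    """Repeatedly step against the gradient to decrease the function value."""
    x = x0
    for _ in range(iterations):
        x = x - step_size * grad(x)
    return x

# Example: f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # very close to 3
```

For maximization (gradient ascent) the sign of the step flips.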

Games

Axes:
- Deterministic vs. stochastic
- One, two, or more players
- Zero sum vs. general sum
- Perfect information vs. partial information
Algorithms need to calculate a "strategy" (policy) that recommends a move (action) from each position (state)

Deterministic Games

Problem Formulation
- States: $S$ (start at $S_{0}$)
- Players: $P = \{1, \dots, N\}$ (take turns)
- $\text{ToMove}(s)$: The player whose turn it is to move in state $s$
- $\text{Actions}(s)$: Set of legal moves in state $s$
- $\text{Result}(s, a)$: Transition function, the state resulting from taking action $a$ in state $s$
- $\text{IsTerminal}(s)$: A terminal test, true when the game is over
- $\text{Utility}(s, p) : S \times P \to \mathbb{R}$: Final numerical value to player $p$ when the game ends in state $s$
The solution for a player is a policy
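
To make the formulation concrete, here is a sketch of a toy two-player game written against those five functions. The game itself (a pile of stones, each player removes 1 or 2, whoever takes the last stone wins) and the `State` type are hypothetical choices for illustration.

```python
from typing import NamedTuple

class State(NamedTuple):
    stones: int  # stones left in the pile
    player: int  # whose turn it is: 1 or 2

def to_move(s):
    return s.player

def actions(s):
    # A player may remove 1 or 2 stones, but never more than remain.
    return [n for n in (1, 2) if n <= s.stones]

def result(s, a):
    # Remove the stones and hand the turn to the other player.
    return State(s.stones - a, 3 - s.player)

def is_terminal(s):
    return s.stones == 0

def utility(s, p):
    # In a terminal state, the winner is the player who just moved,
    # i.e. the one who is NOT to move. Zero-sum: +1 for the winner, -1 for the loser.
    winner = 3 - s.player
    return 1 if p == winner else -1

s0 = State(stones=3, player=1)
print(actions(s0))     # [1, 2]
print(result(s0, 2))   # State(stones=1, player=2)
```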

Zero-Sum vs. General Games

Zero-Sum Games
- Agents have opposite utilities
- Can think of the outcome as a single value that one player maximizes and the other minimizes
- Adversarial, pure competition
General Games
- Agents have independent utilities
- Cooperation, indifference, competition, and more are all possible

Adversarial Search

Single-Agent Trees

No adversaries
Value of State
- The best achievable outcome from that state

Adversarial Game Trees

Minimax Values

States Under Opponent’s Control
- $V(s') = \min_{s \in \text{successors}(s')} V(s)$
States Under Agent’s Control
- $V(s) = \max_{s' \in \text{successors}(s)} V(s')$
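
The two rules can be checked by hand on a small two-level tree. The leaf utilities below are made-up numbers for illustration: the agent moves at the root, and each inner list holds the terminal values reachable from one opponent-controlled state.

```python
# Each inner list: terminal utilities below one opponent-controlled state.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

# Opponent's states take the minimum over their successors.
min_values = [min(children) for children in tree]

# The agent's state (the root) takes the maximum over its successors.
root_value = max(min_values)

print(min_values)  # [3, 2, 2]
print(root_value)  # 3
```

The agent therefore prefers the first move: it guarantees 3, while the other two moves let the opponent force 2.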

Adversarial Search (Minimax)

Deterministic, zero-sum games:
- Tic-tac-toe, chess, checkers
- One player maximizes result
- Other minimizes
Minimax Search:
- State-space search tree
- Players alternate turns
- Compute each node's minimax value
  - Best utility against a rational (optimal) adversary

Minimax Implementation
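
A minimal recursive sketch, assuming the game is exposed through the functions from the problem formulation (`to_move`, `actions`, `result`, `is_terminal`, `utility`). The names `minimax_value`, `minimax_decision`, and the toy `TakeStones` game are hypothetical and exist only to make the example runnable.

```python
def minimax_value(state, game, player):
    """Minimax value of `state` from the point of view of `player`."""
    if game.is_terminal(state):
        return game.utility(state, player)
    values = [minimax_value(game.result(state, a), game, player)
              for a in game.actions(state)]
    # Our own turn: maximize. Opponent's turn: assume they minimize our utility.
    return max(values) if game.to_move(state) == player else min(values)

def minimax_decision(state, game):
    """Action with the highest minimax value for the player to move."""
    player = game.to_move(state)
    return max(game.actions(state),
               key=lambda a: minimax_value(game.result(state, a), game, player))

class TakeStones:
    """Toy game: state = (stones, player); remove 1 or 2; last stone wins."""
    def to_move(self, s): return s[1]
    def actions(self, s): return [n for n in (1, 2) if n <= s[0]]
    def result(self, s, a): return (s[0] - a, 3 - s[1])
    def is_terminal(self, s): return s[0] == 0
    def utility(self, s, p): return 1 if p == 3 - s[1] else -1

game = TakeStones()
print(minimax_value((4, 1), game, 1))  # 1: player 1 can force a win from 4 stones
print(minimax_decision((4, 1), game))  # 1: take one stone, leaving the opponent 3
print(minimax_value((3, 1), game, 1))  # -1: from 3 stones the mover loses
```

This version explores the whole tree down to terminal states, so it is only practical for very small games.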

Minimax Properties

Optimal against a rational (optimal) opponent. Against an opponent who does not play optimally, the minimax value is still guaranteed, but a different strategy may achieve a better outcome
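
A small sketch of why this matters, using made-up utilities. Here the agent picks `left` or `right`, and the opponent then picks one of the listed outcomes; the "random opponent" model (uniform choice) is an assumption for illustration.

```python
# Hypothetical utilities for the agent after each of its two moves.
tree = {"left": [3, 3], "right": [1, 100]}

# Against an optimal opponent: judge each move by its worst case.
minimax_choice = max(tree, key=lambda move: min(tree[move]))

# Against a uniformly random opponent: judge each move by its average.
average_choice = max(tree, key=lambda move: sum(tree[move]) / len(tree[move]))

print(minimax_choice)  # left  (guarantees 3)
print(average_choice)  # right (averages 50.5, but risks 1)
```

Minimax still secures at least 3 here; it just gives up the chance of 100 that a non-rational opponent might allow.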

Minimax Efficiency

Like exhaustive DFS
Time: $O(b^{m})$
Space: $O(bm)$
$b$ = number of legal moves per state (branching factor), $m$ = maximum tree depth
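
To see where the $O(b^{m})$ time bound comes from, the sketch below counts the nodes of a full game tree in which every state has exactly $b$ moves and every line of play lasts $m$ moves. The uniform tree and the name `count_nodes` are simplifying assumptions.

```python
def count_nodes(b, m):
    """Nodes in a uniform tree with branching factor b and depth m."""
    if m == 0:
        return 1  # a single terminal node
    return 1 + b * count_nodes(b, m - 1)  # this node plus b subtrees

print(count_nodes(2, 3))   # 15 = 1 + 2 + 4 + 8
print(count_nodes(3, 2))   # 13 = 1 + 3 + 9
print(count_nodes(10, 6))  # 1111111, dominated by the 10**6 leaves
```

The leaf level alone contributes $b^{m}$ nodes, which dominates the total. Depth-first traversal only needs to keep the current path and the siblings along it, hence the $O(bm)$ space.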

Generative Adversarial Network

An adversarial game of image generation

Generator vs. Discriminator
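
The connection to minimax can be made explicit with the standard GAN training objective, stated here as a reference point rather than something derived in these notes. The symbols are assumptions not defined above: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real images, and $p_{z}$ the noise distribution fed to the generator.

$$\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_{z}}[\log(1 - D(G(z)))]$$

The discriminator plays the maximizing role (telling real from generated images) and the generator the minimizing role (producing images the discriminator accepts), mirroring the max and min players in a zero-sum game.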

Meet's Notes
