#notes#cs471

Recap: Hill-Climbing

  • Useful when the path to the goal does not matter, e.g. when solving a pure optimization problem
  • **Basic Idea:**
    • Only keep current state
    • Improve iteratively
    • Don’t keep paths followed
  • Repeatedly move to the best neighboring state until no neighbor is better, which ends at a local maximum
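
A minimal sketch of the loop described above, assuming a hypothetical toy objective over the integers 0..20 with a local peak at 4 and a global peak at 15. The names `hill_climb`, `neighbors`, and `value` are illustrative and not from the lecture.

```python
def hill_climb(start, neighbors, value):
    """Greedy local search: keep only the current state, move to the best neighbor."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current  # no neighbor improves on the current state
        current = best


# Hypothetical objective: local maximum at x = 4, global maximum at x = 15.
def value(x):
    return -(x - 4) ** 2 + 10 if x < 10 else -(x - 15) ** 2 + 30


def neighbors(x):
    # Neighbors are the adjacent integers that stay inside 0..20.
    return [n for n in (x - 1, x + 1) if 0 <= n <= 20]


print(hill_climb(0, neighbors, value))   # 4: stuck on the local maximum
print(hill_climb(12, neighbors, value))  # 15: reaches the global maximum
```

Starting from 0 the search stops at the lower peak, which is the local-maximum problem in miniature.
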
Challenges for Hill-Climbing
  • Local Maxima
    • Once a local maximum is reached, there is no way to backtrack
Recap: Simulated Annealing
  • Key Idea: Usually moves uphill, but sometimes accepts a downhill move, which lets it escape local maxima.
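
A rough sketch of that idea, under assumptions the notes do not specify: a downhill move is accepted with probability `exp(delta / t)` (the usual Metropolis-style rule), and the temperature `t` is cooled geometrically. The toy objective and all parameter values are hypothetical.

```python
import math
import random


def simulated_annealing(start, neighbors, value, t0=10.0, cooling=0.95, steps=500, rng=None):
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    current = best = start
    t = t0
    for _ in range(steps):
        candidate = rng.choice(neighbors(current))
        delta = value(candidate) - value(current)
        # Uphill moves are always accepted; downhill moves with probability exp(delta / t).
        if delta > 0 or rng.random() < math.exp(delta / t):
            current = candidate
        # Tracking the best state seen so far is a convenience for this sketch.
        if value(current) > value(best):
            best = current
        t = max(t * cooling, 1e-9)  # assumed geometric cooling schedule
    return best


# Hypothetical objective: local maximum at x = 4, global maximum at x = 15.
def value(x):
    return -(x - 4) ** 2 + 10 if x < 10 else -(x - 15) ** 2 + 30


def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 20]


result = simulated_annealing(0, neighbors, value)
print(result, value(result))
```

While the temperature is high, downhill moves are accepted often. As it cools, the search behaves more and more like plain hill-climbing.
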

Recap: Gradient Descent

  • Move in the direction of the negative gradient of the function (the direction of steepest descent)
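
A minimal one-dimensional sketch, assuming a fixed learning rate `lr` and a hypothetical objective f(x) = (x - 3)^2. Each step moves against the gradient, so the function value decreases.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # subtracting the gradient moves downhill
    return x


# Hypothetical example: f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```
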

Games

  • Axes:
    • Deterministic vs. stochastic
    • One, two, or more players
    • Zero sum vs. general sum
    • Perfect information vs. partial information
  • Algorithms need to calculate a “strategy” (policy) which recommends a move (action) from each position (state)

Deterministic Games

  • Problem Formulation
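    • States: $S$ (start at $s_0$)
    • Players: $P = \{1, \dots, N\}$ (take turns)
    • $\text{Player}(s)$: The player whose turn it is to move in state $s$
    • $\text{Actions}(s)$: Set of legal moves in state $s$
    • $\text{Result}(s, a)$: Transition function, the state resulting from taking action $a$ in state $s$
    • $\text{Terminal-Test}(s)$: A terminal test, true when the game is over
    • $\text{Utility}(s, p)$: Final numerical value to player $p$ when the game ends in state $s$
  • Solution for a player is a policy $\pi: S \to A$, mapping each state to an action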
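
To make the formulation concrete, here is a sketch of a small, hypothetical two-player game: a pile of 5 stones, each player removes 1 or 2, and whoever takes the last stone wins. The function names mirror the components listed above. The game and the `policy` function are illustrative assumptions, not from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NimState:
    stones: int   # stones left in the pile
    to_move: int  # index of the player whose turn it is (0 or 1)


START = NimState(stones=5, to_move=0)


def player(s):
    return s.to_move


def actions(s):
    # Legal moves: take 1 or 2 stones, but never more than remain.
    return [k for k in (1, 2) if k <= s.stones]


def result(s, a):
    # Transition function: remove a stones and pass the turn.
    return NimState(s.stones - a, 1 - s.to_move)


def terminal_test(s):
    return s.stones == 0


def utility(s, p):
    # The player who took the last stone (the one NOT to move) wins.
    winner = 1 - s.to_move
    return 1 if p == winner else -1


def policy(s):
    """A hypothetical fixed policy: always take one stone."""
    return 1


# Play the game out with both players following the same policy.
final = START
while not terminal_test(final):
    final = result(final, policy(final))
print(final, utility(final, 0), utility(final, 1))
```

The two utilities always sum to zero here, so this toy game is also a zero-sum game.
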

Zero-Sum vs. General Games

  • Zero-Sum Games
    • Agents have opposite utilities
    • Can think of the outcome as a single value that one player maximizes and the other minimizes
    • Adversarial, pure competition
  • General Games
    • Agents have independent utilities
    • Cooperation, indifference, competition, and more are all possible

Single-Agent Trees

  • No adversaries
  • Value of State
    • The best achievable outcome from that state

Adversarial Game Trees

Minimax Values
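  • States Under Opponent’s Control: $V(s) = \min_{s' \in \text{successors}(s)} V(s')$
  • States Under Agent’s Control: $V(s) = \max_{s' \in \text{successors}(s)} V(s')$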

Adversarial Search (Minimax)

  • Deterministic, zero-sum games:
    • Tic-tac-toe, chess, checkers
    • One player maximizes result
    • Other minimizes
  • Minimax Search:
    • State-space search tree
    • Players alternate turns
    • Compute each node’s minimax value
      • Best utility against a rational (optimal) adversary
Minimax Implementation

Minimax Properties
  • Optimal against a rational (optimal) player. Against a suboptimal opponent, the minimax move is not guaranteed to be the best choice; another strategy may achieve a higher utility

Minimax Efficiency

  • Like exhaustive DFS
  • Time: $O(b^m)$
  • Space: $O(bm)$
  • $b$ = number of legal moves per state (branching factor), $m$ = maximum tree depth
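
A small sketch of why exhaustive search is expensive, assuming a complete tree in which every state has `b` legal moves and every line of play ends after `m` moves (the symbols `b` and `m` are the conventional ones). The helper `count_nodes` is hypothetical and simply counts the nodes a depth-first traversal would visit.

```python
def count_nodes(b, m):
    """Count the nodes visited by an exhaustive depth-first search of a complete tree."""
    if m == 0:
        return 1  # a single terminal node
    return 1 + b * count_nodes(b, m - 1)  # this node plus b subtrees of depth m - 1


for b, m in [(2, 10), (3, 2), (10, 6)]:
    # The visited-node count is dominated by the b ** m leaves.
    print(b, m, count_nodes(b, m), b ** m)
```

The count grows exponentially in `m`. The depth-first traversal only needs to remember the current path and the siblings along it, which is why the space requirement stays linear in the depth.
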

Generative Adversarial Network

An adversarial game of image generation

  • Generator vs. Discriminator