DIT410/TIN174, Artificial Intelligence
Peter Ljunglöf
28 April, 2017
A graph consists of a set \(N\) of nodes and a set \(A\) of ordered pairs of nodes,
called arcs.
Node \(n_2\) is a neighbor of \(n_1\)
if there is an arc from \(n_1\) to \(n_2\).
That is, if \( (n_1, n_2) \in A \).
A path is a sequence of nodes \( (n_0, n_1, \ldots, n_k) \) such that \( (n_{i-1}, n_i) \in A \).
The length of path \( (n_0, n_1, \ldots, n_k) \) is \(k\).
A solution is a path from a start node to a goal node,
given a set of start nodes and goal nodes.
(Russell & Norvig sometimes call the graph nodes states.)
A generic search algorithm:
Given a graph, start nodes, and a goal description, incrementally
explore paths from the start nodes.
Maintain a frontier of nodes that are to be explored.
As search proceeds, the frontier expands into the unexplored nodes
until a goal node is encountered.
The way in which the frontier is expanded defines the search strategy.
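As a minimal sketch, the generic algorithm might look like this in Python (the `neighbors`/`is_goal` interfaces and the toy graph are illustrative assumptions, not from the lecture):

```python
from collections import deque

def generic_search(neighbors, start_nodes, is_goal, strategy="bfs"):
    """Generic graph search: maintain a frontier of paths and expand it
    until a path ending in a goal node is found.  The search strategy is
    defined purely by how the next path is selected from the frontier:
    FIFO gives breadth-first, LIFO gives depth-first.
    (No cycle checking, so DFS is only safe on acyclic graphs here.)"""
    frontier = deque([(s,) for s in start_nodes])
    while frontier:
        # The only difference between BFS and DFS is this selection:
        path = frontier.popleft() if strategy == "bfs" else frontier.pop()
        node = path[-1]
        if is_goal(node):
            return path          # a solution: a path from a start node to a goal
        for n2 in neighbors(node):
            frontier.append(path + (n2,))
    return None                  # frontier exhausted: no solution

# Tiny example graph given as an arc set A of ordered pairs:
A = {(1, 2), (1, 3), (2, 4), (3, 4)}
nbrs = lambda n: [b for (a, b) in A if a == n]
print(generic_search(nbrs, [1], lambda n: n == 4))
```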
BFS is guaranteed to halt but uses exponential space.
DFS uses linear space, but is not guaranteed to halt.
Idea: take the best from BFS and DFS — recompute elements of the frontier rather than saving them.
Iterative deepening search calls depth-bounded DFS with increasing bounds:
Complexity with solution at depth \(k\) and branching factor \(b\):
| level | breadth-first | iterative deepening | # nodes |
|---|---|---|---|
| \(1\) | \(1\) | \(k\) | \(b\) |
| \(2\) | \(1\) | \(k-1\) | \(b^{2}\) |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
| \(k-1\) | \(1\) | \(2\) | \(b^{k-1}\) |
| \(k\) | \(1\) | \(1\) | \(b^{k}\) |
| total | \({}\geq b^{k}\) | \({}\leq b^{k}\left(\frac{b}{b-1}\right)^{2}\) | |
Numerical comparison for \(k=5\) and \(b=10\):
BFS expands \(10+100+1{,}000+10{,}000+100{,}000 = 111{,}110\) nodes,
while IDS expands \(5{\cdot}10+4{\cdot}100+3{\cdot}1{,}000+2{\cdot}10{,}000+1{\cdot}100{,}000 = 123{,}450\) nodes.
Note: IDS recalculates shallow nodes several times,
but this doesn’t have a big effect compared to BFS!
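A minimal sketch of iterative deepening (the infinite binary tree used as the example is an illustrative assumption):

```python
def depth_bounded_dfs(neighbors, node, is_goal, bound):
    """DFS that gives up below the depth bound; uses linear space."""
    if is_goal(node):
        return (node,)
    if bound == 0:
        return None
    for n2 in neighbors(node):
        sub = depth_bounded_dfs(neighbors, n2, is_goal, bound - 1)
        if sub is not None:
            return (node,) + sub
    return None

def iterative_deepening(neighbors, start, is_goal, max_bound=50):
    """Call depth-bounded DFS with increasing bounds 0, 1, 2, ...
    Shallow nodes are re-expanded on every iteration, but the total
    work is still O(b^k)."""
    for bound in range(max_bound + 1):
        path = depth_bounded_dfs(neighbors, start, is_goal, bound)
        if path is not None:
            return path
    return None

# Example: infinite binary tree where node n has children 2n and 2n+1.
# Plain DFS would never halt here; IDS finds the goal at depth 3.
nbrs = lambda n: [2 * n, 2 * n + 1]
print(iterative_deepening(nbrs, 1, lambda n: n == 11))   # → (1, 2, 5, 11)
```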
Idea: search backward from the goal and forward from the start simultaneously.
This can result in an exponential saving, because \(2b^{k/2}\ll b^{k}\).
The main problem is making sure the frontiers meet.
One possible implementation:
Use BFS to gradually search backwards from the goal,
building a set of locations that will lead to the goal.
Interleave this with forward heuristic search (e.g., A*)
that tries to find a path to these interesting locations.
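A simplified sketch of the frontier-meeting idea, using plain BFS in both directions (it assumes undirected arcs so the same neighbor function works backwards; in the implementation described above, the forward half would be heuristic search instead):

```python
def bidirectional_bfs(neighbors, start, goal):
    """Expand a forward frontier from the start and a backward frontier
    from the goal in alternation; stop as soon as the frontiers meet.
    Each side only needs to reach depth ~k/2, hence ~2*b**(k/2) nodes."""
    if start == goal:
        return 0
    fwd, bwd = {start}, {goal}
    seen_f, seen_b = {start}, {goal}
    depth = 0
    while fwd and bwd:
        # Expand the smaller frontier first (a common optimisation).
        if len(fwd) <= len(bwd):
            fwd = {n2 for n in fwd for n2 in neighbors(n)} - seen_f
            seen_f |= fwd
        else:
            bwd = {n2 for n in bwd for n2 in neighbors(n)} - seen_b
            seen_b |= bwd
        depth += 1
        if seen_f & seen_b:          # the frontiers have met
            return depth             # = number of arcs on a shortest path
    return None

# Example: the integer line, where each node n has neighbors n-1 and n+1.
print(bidirectional_bfs(lambda n: [n - 1, n + 1], 0, 6))   # → 6
```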
A* always finds an optimal solution first, provided that:
the branching factor is finite,
arc costs are bounded above zero
(i.e., there is some \(\epsilon>0\) such that all arc costs are greater than \(\epsilon\)), and
\(h(n)\) is admissible.
Graph search keeps track of visited nodes, so we don’t visit the same node twice.
Suppose the first time we visit a node is not via an optimal path
\(\Rightarrow\) then graph search will return a suboptimal path.
Under which circumstances can we guarantee that A* graph search is optimal?
If \(h\) is consistent, then A* graph search is optimal:
Consistency is defined as: \(h(n') \leq cost(n', n) + h(n)\) for all arcs \((n', n)\)
The \(f\) values in A* are nondecreasing, therefore:
first, A* expands all nodes with \( f(n) < C \);
then, all nodes with \( f(n) = C \);
finally, all nodes with \( f(n) > C \).
A* will not expand any nodes with \( f(n) > C^* \),
where \(C^*\) is the cost of an optimal solution.
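A sketch of A* graph search (the toy graph and heuristic table are illustrative assumptions; this particular heuristic is consistent, so graph search finds the optimal path):

```python
import heapq

def astar_graph_search(neighbors, start, is_goal, h):
    """A* graph search: always expand the frontier node with minimal
    f(n) = g(n) + h(n), and never expand a node twice.
    Optimal when h is consistent: h(n') <= cost(n', n) + h(n)."""
    frontier = [(h(start), 0, (start,))]      # entries are (f, g, path)
    explored = set()
    while frontier:
        f, g, path = heapq.heappop(frontier)
        node = path[-1]
        if node in explored:                  # already visited: skip
            continue
        explored.add(node)
        if is_goal(node):
            return g, path
        for n2, cost in neighbors(node):
            if n2 not in explored:
                g2 = g + cost
                heapq.heappush(frontier, (g2 + h(n2), g2, path + (n2,)))
    return None

# Toy graph: S->A->G costs 1+5=6, S->B->G costs 4+1=5 (optimal).
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 4, "A": 4, "B": 1, "G": 0}          # consistent heuristic
print(astar_graph_search(lambda n: graph[n], "S", lambda n: n == "G", h.get))
# → (5, ('S', 'B', 'G'))
```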
A* tree search is optimal if \(h(n)\) is admissible.
A* graph search is optimal if \(h(n)\) is consistent.
| Search strategy | Frontier selection | Halts if solution? | Halts if no solution? | Space usage |
|---|---|---|---|---|
| Depth first | Last node added | No | No | Linear |
| Breadth first | First node added | Yes | No | Exp |
| Greedy best first | Minimal \(h(n)\) | No | No | Exp |
| Uniform cost | Minimal \(g(n)\) | Optimal | No | Exp |
| A* | \(f(n)=g(n)+h(n)\) | Optimal* | No | Exp |

*Provided that \(h(n)\) is admissible.
If (admissible) \(h_{2}(n)\geq h_{1}(n)\) for all \(n\),
then \(h_{2}\) dominates \(h_{1}\) and is better for search.
Typical search costs (for the 8-puzzle):

| depth | DFS | A*(\(h_1\)) | A*(\(h_2\)) |
|---|---|---|---|
| 14 | ≈ 3,000,000 nodes | 539 nodes | 113 nodes |
| 24 | ≈ 54,000,000,000 nodes | 39,135 nodes | 1,641 nodes |
Given any two admissible heuristics \(h_{a}\) and \(h_{b}\),
their maximum \(h(n)\)
is also admissible and dominates both:
\[ h(n) = \max(h_{a}(n),h_{b}(n)) \]
Admissible heuristics can be derived from the exact solution cost of
a relaxed problem:
If the rules of the 8-puzzle are relaxed so that a tile can move anywhere,
then \(h_{1}(n)\) gives the shortest solution
If the rules are relaxed so that a tile can move to any adjacent square,
then \(h_{2}(n)\) gives the shortest solution
Key point: the optimal solution cost of a relaxed problem is
never greater than
the optimal solution cost of the real problem
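For the 8-puzzle, the two relaxations above give the heuristics \(h_1\) (misplaced tiles) and \(h_2\) (total Manhattan distance), sketched here together with their max (the state encoding is an illustrative assumption):

```python
# State: tuple of 9 tiles in row-major order, 0 = blank.
# Goal chosen here: tiles 0..8 in order, so tile t belongs at index t.
GOAL = tuple(range(9))

def h1(state):
    """Misplaced tiles: exact cost if a tile could move anywhere."""
    return sum(1 for i, t in enumerate(state) if t != 0 and t != GOAL[i])

def h2(state):
    """Sum of Manhattan distances: exact cost if a tile could move
    to any adjacent square (ignoring the other tiles)."""
    dist = 0
    for i, t in enumerate(state):
        if t != 0:
            goal_i = GOAL.index(t)
            dist += abs(i // 3 - goal_i // 3) + abs(i % 3 - goal_i % 3)
    return dist

def h(state):
    """The max of admissible heuristics: admissible, dominates both."""
    return max(h1(state), h2(state))

# Tile 8 sits in the top-left corner: one misplaced tile, but 4 moves away.
s = (8, 1, 2, 3, 4, 5, 6, 7, 0)
print(h1(s), h2(s), h(s))   # → 1 4 4
```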
A* search with an admissible (consistent) heuristic is optimal.
But what happens if the heuristic is non-admissible?
Why would we want to use a non-admissible heuristic?
The solution subtree is shown in bold, and corresponds to the plan:
[Suck, if State=5 then [Right, Suck] else []]
(a) Predicting the next belief state for the sensorless vacuum world
with a deterministic action, Right.
(b) Prediction for the same belief state and action in the nondeterministic
slippery version of the sensorless vacuum world.
The main difference to chapters 3–4:
now we have more than one agent, and the agents have different goals.
All possible game sequences are represented in a game tree.
The nodes are states of the game, e.g. board positions in chess.
Initial state (root) and terminal nodes (leaves).
States are connected if there is a legal move/ply.
(a ply is a move by one player, i.e., one layer in the game tree)
Utility function (payoff function). Terminal nodes have utility values
\({+}x\) (player 1 wins), \({-}x\) (player 2 wins) and \(0\) (draw).
Perfect information games are solvable in a manner similar to
fully observable single-agent systems, e.g., using forward search.
If two agents are competing so that a positive reward for one is a negative reward
for the other agent, we have a two-agent zero-sum game.
The value of a zero-sum game can be characterized by a single number that one agent is trying to maximize and the other agent is trying to minimize.
This leads to a minimax strategy:
The Minimax algorithm gives perfect play for deterministic, perfect-information games.
\[
\begin{aligned}
\text{Minimax(root)} &= \max(\min(3,12,8),\ \min(2,x,y),\ \min(14,5,2)) \\
&= \max(3,\ \min(2,x,y),\ 2) \\
&= \max(3,\ z,\ 2) \quad\text{where } z\leq 2 \\
&= 3
\end{aligned}
\]
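The derivation can be checked with a small minimax implementation over nested lists (choosing \(x = y = 10\) arbitrarily; any values give \(\min(2,x,y) \leq 2\), so the result is unchanged):

```python
def minimax(node, maximizing=True):
    """Minimax value of a game tree given as nested lists;
    leaves are utility values for the MAX player."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The tree from the derivation above, with x = y = 10 at the unknown leaves:
tree = [[3, 12, 8], [2, 10, 10], [14, 5, 2]]
print(minimax(tree))   # → 3
```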
The amount of pruning provided by the α-β algorithm depends on the ordering of the children of each node.
It works best if a highest-valued child of a MAX node is selected first and
if a lowest-valued child of a MIN node is selected first.
In real games, much of the effort is made to optimise the search order.
With a “perfect ordering”, the time complexity becomes \(O(b^{m/2})\)
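To illustrate the effect of ordering, here is a sketch of α-β pruning with a leaf counter, run on the same leaf values in a good and a bad order (both example trees are constructed for illustration):

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, leaves=None):
    """Minimax with alpha-beta pruning over nested lists.  If `leaves`
    is given, it collects the leaf values actually examined, so we can
    count how much the child ordering prunes."""
    if not isinstance(node, list):
        if leaves is not None:
            leaves.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, leaves))
            alpha = max(alpha, value)
            if alpha >= beta:
                break        # beta cutoff: MIN above will avoid this branch
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True, leaves))
            beta = min(beta, value)
            if alpha >= beta:
                break        # alpha cutoff: MAX above will avoid this branch
        return value

# Same leaf values, well ordered (best MAX child first, lowest MIN
# children first) versus badly ordered:
good = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
bad = [[2, 5, 14], [6, 4, 2], [8, 12, 3]]
for tree in (good, bad):
    leaves = []
    print(alphabeta(tree, leaves=leaves), "leaves examined:", len(leaves))
# good order examines 5 of 9 leaves; bad order examines all 9.
```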
Most real games are too big to carry out minimax search, even with α-β pruning.
For these games, instead of stopping at leaf nodes,
we have to use a cutoff test to decide when to stop.
The value returned at the node where the algorithm stops
is an estimate of the value for this node.
The function used to estimate the value is an evaluation function.
Much work goes into finding good evaluation functions.
There is a trade-off between the amount of computation required
to compute the evaluation function and the size of the search space
that can be explored in any given time.
A naive evaluation function will not see the difference between these two states.
where \(P(a)\) is the probability that action \(a\) occurs.
| Variables | WA, NT, Q, NSW, V, SA, T |
|---|---|
| Domains | \(D_i\) = {red, green, blue} |
| Constraints | SA≠WA, SA≠NT, SA≠Q, SA≠NSW, SA≠V, WA≠NT, NT≠Q, Q≠NSW, NSW≠V |
| Constraint graph | Every variable is a node, every binary constraint is an arc. |
| Variables | F, T, U, W, R, O, \(X_1, X_2, X_3\) |
|---|---|
| Domains | \(D_i\) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} |
| Constraints | Alldiff(F,T,U,W,R,O), O+O=R+10·\(X_1\), etc. |
| Constraint graph | This is not a binary CSP! The graph is a constraint hypergraph. |
The general-purpose algorithm gives rise to several questions:
Heuristics for selecting the next unassigned variable:
Minimum remaining values (MRV):
\(\Longrightarrow\) choose the variable with the fewest legal values
Degree heuristic (if there are several MRV variables):
\(\Longrightarrow\) choose the variable with most constraints on remaining variables
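A sketch of backtracking search with the MRV heuristic, applied to the Australia map-colouring CSP above (the code structure and helper names are illustrative):

```python
# Binary constraints of the map-colouring example: neighbors must differ.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}
COLORS = ["red", "green", "blue"]

def legal_values(var, assignment):
    """Colors not already used by an assigned neighbor of var."""
    return [c for c in COLORS
            if all(assignment.get(n) != c for n in NEIGHBORS[var])]

def backtrack(assignment):
    unassigned = [v for v in NEIGHBORS if v not in assignment]
    if not unassigned:
        return assignment
    # MRV: choose the unassigned variable with the fewest legal values.
    var = min(unassigned, key=lambda v: len(legal_values(v, assignment)))
    for value in legal_values(var, assignment):
        assignment[var] = value
        result = backtrack(assignment)
        if result is not None:
            return result
        del assignment[var]      # undo and try the next value
    return None                  # no legal value left: backtrack

solution = backtrack({})
print(solution)
```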
Heuristics for ordering the values of a selected variable:
What if some domains have more than one element after AC?
We can resort to backtracking search:
Do we need to restart AC from scratch?
There are several kinds of consistency properties and algorithms:
Node consistency: single variable, unary constraints (straightforward)
Arc consistency: pairs of variables, binary constraints (AC-3 algorithm)
Path consistency: triples of variables, binary constraints (PC-2 algorithm)
\(k\)-consistency: \(k\) variables, \(k\)-ary constraints (algorithms exponential in \(k\))
Consistency for global constraints:
Suppose that each subproblem has \(c\) variables out of \(n\) total.
The cost of the worst-case solution
is \(n/c\cdot d^{c}\), which is linear in \(n\).
Start with any complete tour, and perform pairwise exchanges
Variants of this approach get within 1% of optimal
very quickly with thousands of cities
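A sketch of this pairwise-exchange improvement (2-opt: reverse a tour segment whenever that shortens the tour; the random instance and helper names are illustrative):

```python
import math
import random

def tour_length(tour, pos):
    """Total length of the closed tour; tour[-1] connects back to tour[0]."""
    return sum(math.dist(pos[tour[i - 1]], pos[tour[i]])
               for i in range(len(tour)))

def two_opt(tour, pos):
    """Hill climbing on tours: repeatedly reverse a segment (a pairwise
    edge exchange) whenever it shortens the tour; stop at a local optimum."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                new = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(new, pos) < tour_length(tour, pos) - 1e-12:
                    tour, improved = new, True
    return tour

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
start = list(range(30))           # any complete tour works as a start
better = two_opt(start, cities)
print(tour_length(start, cities), "->", tour_length(better, cities))
```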
Hill climbing search is also called gradient/steepest ascent/descent,
or greedy local search.
Local maxima — Ridges — Plateaux
As well as upward steps we can allow for:
Random steps: (sometimes) move to a random neighbor.
Random restart: (sometimes) reassign random values to all variables.
Both variants can be combined!
Two 1-dimensional search spaces; you can step right or left:
Idea: maintain a population of \(k\) states in parallel, instead of one.
The value of \(k\) lets us limit space and parallelism.
Note: this is not the same as \(k\) searches run in parallel!
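A sketch of local beam search on a toy 1-dimensional landscape (the landscape and neighbor function are invented for illustration):

```python
import heapq

def local_beam_search(initial_states, neighbors, value, steps=100):
    """Keep the k best states as one shared pool: all successors of all
    current states compete for the k slots.  Unlike k independent
    searches, information flows between the parallel threads, so
    promising regions attract the whole beam."""
    beam = list(initial_states)
    k = len(beam)
    for _ in range(steps):
        candidates = set(beam)
        for s in beam:
            candidates.update(neighbors(s))
        new_beam = heapq.nlargest(k, candidates, key=value)
        if new_beam == beam:     # no candidate improved the pool
            break
        beam = new_beam
    return max(beam, key=value)

# Toy landscape: a local maximum at x=10 (value 5), global at x=60 (value 20).
value = lambda x: 20 - abs(x - 60) if x > 30 else 5 - abs(x - 10)
nbrs = lambda x: [x - 1, x + 1]
best = local_beam_search(range(0, 100, 10), nbrs, value)
print(best)   # → 60
```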