Here are my thumb-typed answers to the exercise questions at the end of Chapter 2 of Artificial Intelligence -- A Modern Approach. ai-class.com hasn't officially kicked off the Stanford AI class yet, but I'm enjoying getting a head start... there's a 1st for everything.
Q1: Show that a rational agent's action should depend on the time step index if its performance measure is limited to a number of initial time steps.
A1: As the agent approaches the end of its performance window it may be rational to attempt more risky actions if the performance measure would increase within the remaining window but the negative consequences of failure might be spread into the future beyond the performance window. Likewise, a rational agent should attempt all learning actions early in the time window if those actions and learned behaviors have reward profiles that would take a while to affect the performance metric, as long as the cost doesn't outweigh the benefit over the course of the agent "lifetime."
Q2.2a: show that the agent in fig 2.3 is rational under the assumptions on p. 37
A2.2a: There are 4 possible values for the binary 2D percept in the agent function table. There are 3 possible actions, so 2 alternative actions besides the one the agent implements for each percept vector. The alternatives always reduce performance or keep it the same. Percept -> alternative actions (A/B = square, C/D = clean/dirty, S/L/R = Suck/Left/Right): A-C -> S or L delays cleaning of B (1 point worse performance), and if B is already clean performance is the same. A-D -> R or L delays cleaning of A and always reduces performance. B-C -> R or S is the mirror of A-C -> S/L, and likewise for the last "alternative" action.
Q2.2b: If cost of motion is 1, describe a rational agent. Does it need internal state?
A2.2b: Same agent; see above. The cost of motion is outweighed by the potential lost reward (opportunity cost) of 999 points for the unvisited square if the move is skipped -- unless the likelihood of dirt is less than 0.1%, but that task environment factor is unspecified. No internal state needed.
Q2.2c: if clean squares can get dirty and geography unknown, does it need state then, what should it learn? If not why?
A2.2c: Yes. It must learn the statistics of dirtiness (and any dynamics of those statistics) and the geography, even if the dirtiness rate is 0% and constant. Geography uncertainty penalizes poor moves from random L/R movement too heavily. Only if dirtiness were 100% would the always-suck agent be optimal and rational without state or learning -- and only if it knew it to be 100%.
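A minimal sketch of the internal state such an agent would carry: a learned adjacency map plus a running dirt-frequency estimate per square (class and method names are mine, not from the book):

```python
from collections import defaultdict

class LearningVacuumState:
    """Internal state for a vacuum agent in an unknown, re-dirtying world."""
    def __init__(self):
        self.adjacency = {}                # learned geography: (square, move) -> square
        self.dirty_obs = defaultdict(int)  # times each square was seen dirty
        self.total_obs = defaultdict(int)  # times each square was observed

    def record(self, square, is_dirty):
        self.total_obs[square] += 1
        self.dirty_obs[square] += int(is_dirty)

    def dirt_rate(self, square):
        """Estimated probability a visit to `square` finds it dirty."""
        n = self.total_obs[square]
        return self.dirty_obs[square] / n if n else 0.5  # unknown -> 0.5 prior

state = LearningVacuumState()
for obs in [True, False, False, True]:
    state.record("A", obs)
print(state.dirt_rate("A"))  # 0.5
```

With estimates like these the agent can decide whether a move's expected cleaning reward beats its cost, which a stateless reflex agent cannot do.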
Q2.3a: an agent that senses only partial state information is always irrational
A2.3a: false. Rationality depends on what the agent can observe, not what the task environment designer could have made observable. The vacuum agent that doesn't know its geography or position can still act rationally with what it does sense.
Q2.3b: in some task environments it's impossible for a pure reflex agent to behave rationally
A2.3b: true. E.g. a vacuum cleaner with unknown geography. Though this sounds identical to question 2.3a, which I answered oppositely, it isn't: sensors are defined as part of the task environment, but state memory and reflex vs. non-reflex design are part of the agent and fair game for rationality examination.
Q2.3c: for some task environments all agents are rational
A2.3c: true. E.g. the always-clean, no-movement-penalty vacuuming problem (presuming the environment is fully known a priori).
Q2.3d: inputs to agent program and agent function are the same
A2.3d: false. An agent program doesn't get the percept history; it must record it internally as state if it needs to use/act on it.
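The distinction fits in a few lines: the agent function maps a whole percept sequence to an action, while the agent program sees only the latest percept and must keep the history itself. A toy example of mine, not the book's code:

```python
# Agent FUNCTION: maps the entire percept sequence to an action.
def agent_function(percept_sequence):
    return "Suck" if percept_sequence[-1] == "Dirty" else "NoOp"

# Agent PROGRAM: receives one percept per call and records history internally.
class AgentProgram:
    def __init__(self):
        self.history = []  # internal state standing in for the sequence
    def __call__(self, percept):
        self.history.append(percept)
        return agent_function(self.history)

program = AgentProgram()
print(program("Dirty"))  # Suck
print(program("Clean"))  # NoOp
```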
Q2.3e: every agent function is implementable by some machine+program combination
A2.3e: false -- only purely in theory, not in practice. You can imagine an infinite table of percepts and actions, and a hyperdimensional quantum computer and program that implements that infinity, but an implementation might not fit within the known universe.
Q2.3f: in a deterministic task environment there exists a purely random agent that is rational
A2.3f: true. E.g. an agent that must buy and sell a finite dollar value of stock shares once a day with no trading-cost performance penalty, but must maximize returns and minimize return volatility.
Q2.3g: the same agent can be rational in 2 different task environments
A2.3g: true. E.g. the suck-or-move agent is the best for most variations of the task environment discussed, and an optimal general agent would be rational in a nearly infinite variety of task environments.
Q2.3h: every agent is rational in an unobservable environment.
A2.3h: false. In the randomly re-dirtying but unobservable vacuuming environment, agents that don't suck or move would be irrational.
Q2.3i: a perfectly rational poker agent never loses.
A2.3i: false -- though if losing is determined by a full game sequence rather than an individual hand, and if the game is infinitely long and the bankroll is infinitely deep, then true. But then even a nonrational agent never loses. Only if "never" becomes finite does a finite agent in a finite environment have a chance of "never" losing. "Never" is an infinity, so infinite environments and resources can also be assumed unless the question is qualified/clarified.
Q2.4 characterize in PEAS terms and the characteristics of sect. 2.3.2:
Q2.4a playing soccer
P--score greater than opponent's
E--players, ball, goals, field, crowd (noise interference)
Random--holes/rocks in field
MostlyKnown--crowd psychology unknown, some spin aerodynamics not fully known
A--7-DOF legs, 7-DOF arms for fending/balance, voice for communication, interference, distraction, and crowd inciting, 6-DOF head
S--ears, eyes, smell (detect players behind you), proprioceptive sense, accelerometer/gyro (inner ear)
Q2.4b Titan moon sub
P--new knowledge/data returned to earth
E--immersed in liquid, life forms?, weather, radiation, geothermal temp swings, currents, ice, great distance
SingleAgent, PartiallyObservable, Random, Dynamic, Sequential, Continuous, PartiallyKnown
A--propeller, rf emitter, claws?, tracks?, Arms?, tazer?, nuclear radiation?, pumps with filters/samplers
S--cameras, thermometer, spectrometer, microscope, microphone
Q2.4c shopping for used ai books online
Q2.4d playing tennis
Q2.4e practicing tennis against wall
Q2.4f high jump
Q2.4g bidding at live auction
agent -- AI element that interacts with the world using sensors + actuators
agent function -- ideal infinite table of percepts paired with actions
agent program -- algorithm implemented in a machine to accomplish AI
rationality -- degree to which agent does the best it could know to do
autonomy -- degree of independence from human input
reflex agent -- agent that makes one decision for each isolated percept
model-based agent -- agent designed using a model of the env
goal-based agent -- agent that targets a single percept state or a threshold percept
utility-based agent -- agent that uses a nonbinary measure of performance
learning agent -- agent that adjusts parameters within its program over time
-- AIMA p. 61