![Hands-On Intelligent Agents with OpenAI Gym](https://wfqqreader-1252317822.image.myqcloud.com/cover/567/36699567/b_36699567.jpg)
State-value function
A state-value function represents the agent's estimate of how good it is to be in a given state at time step t. It is denoted by $V_\pi(s)$ and is usually just called the value function. It represents the agent's prediction of the future reward it would receive if it were to end up in state $s$ at time step t. Mathematically, it can be represented as follows:
$$V_\pi(s) = \mathbb{E}_\pi\left[\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \,\middle|\, S_t = s\right]$$
This expression means that the value of state $s$ under policy $\pi$ is the expected sum of the discounted future rewards, where $\gamma$ is the discount factor, a real number in the range [0, 1]. In practice, the discount factor is typically set in the range [0.95, 0.99]. The other new term is $\pi$, which is the policy of the agent.
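To make the "expected sum of discounted future rewards" concrete, here is a minimal Python sketch that estimates a state's value by averaging discounted returns over many sampled episodes (a Monte Carlo estimate). The toy task is not from the book; it is a hypothetical stand-in for following a fixed policy from one state. In it, every step yields a reward of 1 and the episode continues with probability 0.9. The function names `discounted_return`, `sample_episode_rewards`, and `estimate_value` are chosen here for illustration only.

```python
import random


def discounted_return(rewards, gamma):
    """Compute r_1 + gamma * r_2 + gamma^2 * r_3 + ... for one episode."""
    g = 0.0
    # Work backwards so that each step applies G = r + gamma * G_next.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def sample_episode_rewards(rng, continue_prob=0.9):
    """Hypothetical toy task (an assumption, not from the book):
    the agent gets reward 1 on every step, and after each step the
    episode continues with probability continue_prob."""
    rewards = [1.0]
    while rng.random() < continue_prob:
        rewards.append(1.0)
    return rewards


def estimate_value(gamma, num_episodes=10000, seed=0):
    """Monte Carlo estimate of the start state's value: the average
    discounted return over num_episodes sampled episodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_episodes):
        total += discounted_return(sample_episode_rewards(rng), gamma)
    return total / num_episodes


if __name__ == "__main__":
    gamma = 0.99
    # For this toy task the exact value satisfies V = 1 + 0.9 * gamma * V.
    exact = 1.0 / (1.0 - 0.9 * gamma)
    print("Monte Carlo estimate:", estimate_value(gamma))
    print("Exact value:         ", exact)
```

With more episodes the estimate should move closer to the exact value. Lowering `gamma` shrinks the contribution of rewards that arrive many steps in the future, which is the role the discount factor plays in the definition above.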