Quick Derivation of Hamilton-Jacobi-Bellman Equation

1. Settings

Objective:

$$J(T) = \int_{t_0}^{T} L(x(t), u(t))\,dt + \phi(x_T)$$

Dynamic programming:

$$V(x, t) = \min_{u(t)} \left\{ \int_{t}^{t+h} L(x(\tau), u(\tau))\,d\tau + V(x(t+h), t+h) \right\}$$

System:

$$\dot{x} = f(x, u)$$
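To make the setting concrete, the following is a minimal sketch that evaluates the objective for a given feedback policy by forward Euler integration. The scalar example (f = u, L = x^2 + u^2, phi = x^2), the two policies, and the helper name `simulate_cost` are assumptions chosen for illustration; they are not part of the derivation.

```python
def simulate_cost(f, L, phi, policy, x0, t0, T, n_steps=10000):
    """Approximate J = int_{t0}^{T} L(x, u) dt + phi(x(T)) with forward Euler."""
    h = (T - t0) / n_steps
    x, t, cost = x0, t0, 0.0
    for _ in range(n_steps):
        u = policy(x, t)
        cost += h * L(x, u)   # rectangle rule for the running cost
        x += h * f(x, u)      # Euler step of x' = f(x, u)
        t += h
    return cost + phi(x)      # add the terminal cost


# Hypothetical scalar problem, used only as an illustration.
f = lambda x, u: u
L = lambda x, u: x**2 + u**2
phi = lambda x: x**2

J_feedback = simulate_cost(f, L, phi, lambda x, t: -x, x0=1.0, t0=0.0, T=1.0)
J_zero = simulate_cost(f, L, phi, lambda x, t: 0.0, x0=1.0, t0=0.0, T=1.0)
print(J_feedback, J_zero)
```

For this particular example the feedback u = -x gives a cost close to 1, while applying no control gives a cost of 2, so the choice of u(t) clearly matters.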

2. Approximation

Zero-order approximation:

$$\int_{t}^{t+h} L(x(\tau), u(\tau))\,d\tau \approx h\,L(x(t), u(t))$$
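A quick numerical check suggests why this crude approximation is enough: its error shrinks like h^2, so it still vanishes after dividing by h. The sketch below assumes a hypothetical trajectory x(tau) = exp(-tau) with u = -x and running cost L = x^2 + u^2, for which the integral has a closed form.

```python
import math


def running_cost(tau):
    # L(x(tau), u(tau)) along the assumed trajectory x = exp(-tau), u = -x
    x = math.exp(-tau)
    u = -x
    return x**2 + u**2


def exact_integral(t, h):
    # Closed form of int_t^{t+h} 2 exp(-2 tau) dtau
    return math.exp(-2 * t) - math.exp(-2 * (t + h))


t = 0.5
errors = []
for h in (0.1, 0.05, 0.025):
    approx = h * running_cost(t)  # zero-order approximation
    errors.append(abs(exact_integral(t, h) - approx))

# Halving h should cut the error by roughly a factor of 4 (error ~ h^2).
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors, ratios)
```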

First-order approximation (Taylor expansion):

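$$
\begin{aligned}
V(x(t+h), t+h) &\approx V(x(t), t) + h\,\frac{dV}{dt} \\
&= V(x(t), t) + h\,\frac{\partial V}{\partial x}\frac{dx}{dt} + h\,\frac{\partial V}{\partial t} \\
&= V(x(t), t) + h\,\frac{\partial V}{\partial x} f(x, u) + h\,\frac{\partial V}{\partial t}
\end{aligned}
$$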
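The first-order expansion can be checked the same way. The sketch below assumes a hypothetical smooth function V(x, t) = (1 + t) x^2 and dynamics x' = u with u held constant over the step; neither is claimed to be an actual value function, they only exercise both partial derivatives.

```python
def V(x, t):
    # Hypothetical smooth function, chosen only for illustration
    return (1.0 + t) * x**2


def V_x(x, t):
    # Partial derivative of V with respect to x
    return 2.0 * (1.0 + t) * x


def V_t(x, t):
    # Partial derivative of V with respect to t
    return x**2


def f(x, u):
    # Assumed dynamics x' = u
    return u


x, t, u = 1.0, 0.5, -0.7
taylor_errors = []
for h in (0.1, 0.05, 0.025):
    x_next = x + h * f(x, u)  # exact here, since u is constant and f ignores x
    exact = V(x_next, t + h)
    first_order = V(x, t) + h * V_x(x, t) * f(x, u) + h * V_t(x, t)
    taylor_errors.append(abs(exact - first_order))

# The neglected terms are second order in h, so the ratios should be near 4.
taylor_ratios = [taylor_errors[i] / taylor_errors[i + 1] for i in range(2)]
print(taylor_errors, taylor_ratios)
```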

3. HJB Equation

Substitute the approximations for the exact expressions in the dynamic programming equation. Since V(x, t) does not depend on u, it can be taken outside the minimization.

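$$
\begin{aligned}
V(x, t) &= \min_{u(t)} \left\{ h\,L(x(t), u(t)) + V(x, t) + h\,\frac{\partial V}{\partial x} f(x, u) + h\,\frac{\partial V}{\partial t} \right\} \\
&= V(x, t) + \min_{u(t)} \left\{ h\,L(x(t), u(t)) + h\,\frac{\partial V}{\partial x} f(x, u) + h\,\frac{\partial V}{\partial t} \right\}
\end{aligned}
$$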

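$$0 = \min_{u(t)} \left\{ h\,L(x(t), u(t)) + h\,\frac{\partial V}{\partial x} f(x, u) + h\,\frac{\partial V}{\partial t} \right\}$$

Divide by h > 0 and let h go to 0. The term $\partial V / \partial t$ does not depend on u, so it can also be moved outside the minimization:

$$0 = \frac{\partial V}{\partial t} + \min_{u} \left\{ L(x, u) + \frac{\partial V}{\partial x} f(x, u) \right\}$$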
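As a sanity check, the final equation can be tested on a problem with a known solution. The sketch below assumes the hypothetical scalar problem f = u, L = x^2 + u^2, phi(x) = x^2, for which the candidate V(x, t) = x^2 satisfies the terminal condition V(x, T) = phi(x) and the minimizing control is u = -x. The minimization is done by brute force over a grid of controls, so the residual is only zero up to the grid resolution.

```python
def L(x, u):
    # Assumed running cost
    return x**2 + u**2


def f(x, u):
    # Assumed dynamics x' = u
    return u


def V_x(x, t):
    # Partial derivative in x of the candidate V(x, t) = x^2
    return 2.0 * x


def V_t(x, t):
    # The candidate V does not depend on t
    return 0.0


def hjb_residual(x, t, u_grid):
    """Return (V_t + min_u {L + V_x f}, minimizing u) over a finite control grid."""
    values = [L(x, u) + V_x(x, t) * f(x, u) for u in u_grid]
    best = min(range(len(u_grid)), key=values.__getitem__)
    return V_t(x, t) + values[best], u_grid[best]


u_grid = [-3.0 + 0.001 * k for k in range(6001)]  # controls in [-3, 3]
results = {x: hjb_residual(x, 0.3, u_grid) for x in (-2.0, -0.5, 0.0, 1.0, 2.5)}
for x, (residual, u_star) in results.items():
    print(x, residual, u_star)
```

In this example the bracket equals (x + u)^2, so the residual is zero exactly at u = -x, which matches what the grid search finds.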