This paper presents a finite-horizon optimal control design for nonlinear discrete-time systems in affine form. Traditional approximate dynamic programming requires at least partial knowledge of the system dynamics. Here, the need for knowledge of the system dynamics is relaxed by using a neural network (NN)-based identifier to learn the control coefficient matrix. The identifier is then used together with an actor-critic scheme to learn the time-varying solution of the Hamilton-Jacobi-Bellman (HJB) equation, referred to as the value function, in an online and forward-in-time manner. Since the solution of the HJB equation is time-varying, NNs with constant weights and time-varying activation functions are considered. To satisfy the terminal constraint, an additional error term is incorporated in the novel update law so that the terminal constraint error is also minimized over time. Policy and/or value iterations are not needed, and the NN weights are updated once per sampling instant. The uniform ultimate boundedness of the closed-loop system is verified by standard Lyapunov stability theory under nonautonomous analysis. Numerical examples are provided to illustrate the effectiveness of the proposed method.
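As a reading aid, the problem setting described in the abstract can be sketched in the standard textbook form for finite-horizon optimal control of affine discrete-time systems. The notation is an assumption made for illustration and is not taken from the paper: $f$ and $g$ denote the internal dynamics and the control coefficient matrix, $Q$ and $R$ the state and control penalties (with $R$ symmetric positive definite), $\phi$ the terminal cost, $N$ the final time step, and $W$, $\sigma$ the constant critic weights and time-varying activation functions.

```latex
\begin{align}
  x_{k+1} &= f(x_k) + g(x_k)\,u_k, \\
  V(x_k, k) &= \phi(x_N) + \sum_{i=k}^{N-1} \bigl( Q(x_i) + u_i^{\top} R\, u_i \bigr), \\
  V^{*}(x_k, k) &= \min_{u_k} \bigl\{ Q(x_k) + u_k^{\top} R\, u_k + V^{*}(x_{k+1}, k+1) \bigr\},
  \qquad V^{*}(x_N, N) = \phi(x_N), \\
  u_k^{*} &= -\tfrac{1}{2}\, R^{-1} g(x_k)^{\top}\,
  \frac{\partial V^{*}(x_{k+1}, k+1)}{\partial x_{k+1}}, \\
  V^{*}(x_k, k) &\approx W^{\top} \sigma(x_k, N-k),
  \qquad W^{\top} \sigma(x_N, 0) \approx \phi(x_N).
\end{align}
```

In this form the optimal control depends on the dynamics only through $g$, which is consistent with the abstract's use of an identifier for the control coefficient matrix. The last line illustrates one way that constant weights with a time-dependent activation can represent a time-varying value function while also being asked to match the terminal cost at $k = N$.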
- Closed loop systems,
- Digital control systems,
- Dynamic programming,
- Numerical methods,
- System theory,
- Approximate dynamic programming,
- Finite horizon optimal control,
- Finite horizons,
- Hamilton-Jacobi-Bellman (HJB) equations,
- Neural network (NN),
- Nonlinear discrete-time systems,
- Optimal controls,
- Uniform ultimate boundedness,
- Discrete time control systems,
- Artificial neural network,
- Nonlinear system,
- Time,
- Uncertainty,
- Neural Networks (Computer),
- Nonlinear Dynamics,
- Time Factors,
- Finite-horizon
Available at: http://works.bepress.com/jagannathan-sarangapani/222/