This study presents a lifelong integral reinforcement learning (LIRL)-based optimal tracking scheme for uncertain nonlinear continuous-time (CT) systems using multilayer neural networks (MNNs). In this LIRL framework, the optimal control policies are generated using both the critic neural network (NN) weights and a single-layer NN identifier. The critic MNN weight tuning is accomplished using an improved singular value decomposition (SVD) of its activation-function gradient. The NN identifier, in turn, provides the control coefficient matrix needed to compute the control policies. An online weight velocity attenuation (WVA)-based consolidation scheme is proposed, wherein the significance of each weight is derived from the Hamilton-Jacobi-Bellman (HJB) error. This WVA term is incorporated into the critic MNN update law to overcome catastrophic forgetting. Lyapunov stability analysis is employed to demonstrate the uniform ultimate boundedness of the overall closed-loop system. Finally, a numerical example of a two-link robotic manipulator supports the theoretical claims.
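The WVA-based consolidation idea can be illustrated with a minimal, hypothetical sketch (not the authors' implementation): per-weight significance, here assumed proportional to the HJB-error sensitivity on the previous task, scales a quadratic penalty that attenuates changes to important critic weights during the gradient update. All names, the simplified update law, and the toy numbers below are illustrative assumptions.

```python
import numpy as np

def wva_critic_update(W, grad_hjb, W_consol, importance, lr=0.1, lam=0.5):
    """One gradient step on critic weights with a weight velocity
    attenuation (WVA) consolidation penalty (illustrative only).

    W          : current critic weight vector
    grad_hjb   : gradient of the HJB (Bellman) error w.r.t. W
    W_consol   : weights consolidated after the previous task
    importance : per-weight significance (assumed derived from the
                 HJB error on the previous task)
    lam        : consolidation strength
    """
    # Standard HJB-error descent plus a penalty that slows the
    # "velocity" of weights deemed significant for earlier tasks.
    grad = grad_hjb + lam * importance * (W - W_consol)
    return W - lr * grad

# Toy usage: the weight marked important stays closer to its
# consolidated value than the unimportant one under the same raw gradient.
W = np.array([1.0, 1.0])
W_old = np.array([0.0, 0.0])
imp = np.array([10.0, 0.0])   # first weight significant to the old task
g = np.array([1.0, 1.0])      # identical raw HJB gradient on both weights
W_new = wva_critic_update(W, g, W_old, imp)
```

The penalty mirrors regularization-based continual-learning updates: important weights are anchored near their consolidated values while unimportant ones remain free to adapt to the new task.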
- Catastrophic forgetting
- Continual learning
- Lifelong learning
- Multilayer neural networks
- Reinforcement learning
Available at: http://works.bepress.com/jagannathan-sarangapani/278/
Office of Naval Research, Grant N00014-21-1-2232