In this paper, we study the optimal control of a discrete-time stochastic differential equation (SDE) of mean-field type, where the coefficients can depend on both the state of the process and a function of its law. The cost functional is also of mean-field type. We establish a new version of the maximum principle for discrete-time mean-field stochastic optimal control problems. This maximum principle differs from the classical one in that we introduce new discrete-time mean-field backward (matrix) stochastic equations. Using these equations, where the adjoint equations turn out to be discrete-time backward SDEs of mean-field type, we obtain first-order necessary and sufficient optimality conditions for the discrete-time mean-field stochastic optimal control problem. To illustrate the result, we apply it to a production and consumption choice optimization problem.
It is well known that conservative mechanical systems exhibit local oscillatory behaviours due to their elastic and gravitational potentials, which, together with the inertial properties of the system, completely characterise these periodic motions. The classification of these periodic behaviours and their geometric characterisation have been debated for centuries, which recently led to the so-called eigenmanifold theory. The eigenmanifold characterises nonlinear oscillations as a generalisation of linear eigenspaces. Motivated by the goal of performing periodic tasks efficiently, we use tools from this theory to construct an optimization problem aimed at inducing desired closed-loop oscillations through a state feedback law. We solve this optimization problem via gradient-descent methods involving neural networks. Extensive simulations show the validity of the approach.
The purpose of this paper is to use adaptive dynamic programming to solve an optimal consensus problem for double-integrator multi-agent systems with completely unknown dynamics. In double-integrator multi-agent systems, flocking algorithms that neglect the agents' inertial effect can cause unstable group behavior. Although an inertia-independent protocol exists, the design of its control law still depends on the agents' dynamics and inertia. In practice, however, inertia is difficult to measure accurately. We therefore compute the control gain in the consensus protocol by adaptive dynamic programming, so that the double-integrator systems reach consensus even when the dynamics are entirely unknown. First, we demonstrate with a typical example how flocking algorithms that ignore the agents' inertial effect can lead to unstable group behavior, and that even when the protocol is independent of inertia, the control gain depends strongly on the inertia and dynamics of the agents. Then, to address these shortcomings, an online policy-iteration-based adaptive dynamic programming method is designed for double-integrator multi-agent systems with unknown dynamics. Finally, simulation results demonstrate the effectiveness of the proposed approach.
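As a point of reference for the consensus setting described above, the following is a minimal sketch of a standard second-order consensus protocol for double-integrator agents with known, hand-picked gains. It is not the adaptive dynamic programming method of the abstract. The protocol form, the path-graph topology, the gain `gamma`, the step size, and the initial conditions are all assumptions chosen for illustration.

```python
def simulate_consensus(positions, velocities, adjacency, gamma=1.0, dt=0.01, steps=5000):
    """Forward-Euler simulation of double-integrator agents x_i' = v_i, v_i' = u_i.

    Assumed protocol (a textbook second-order consensus law, not taken from the paper):
        u_i = -sum_j a_ij * ((x_i - x_j) + gamma * (v_i - v_j))
    """
    x = list(positions)
    v = list(velocities)
    n = len(x)
    for _ in range(steps):
        # Control input of each agent from relative positions and velocities.
        u = [
            -sum(
                adjacency[i][j] * ((x[i] - x[j]) + gamma * (v[i] - v[j]))
                for j in range(n)
            )
            for i in range(n)
        ]
        # Explicit Euler step; both updates use the state from the previous step.
        x_next = [x[i] + dt * v[i] for i in range(n)]
        v_next = [v[i] + dt * u[i] for i in range(n)]
        x, v = x_next, v_next
    return x, v


# Hypothetical example: four agents on an undirected path graph 0-1-2-3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x_final, v_final = simulate_consensus(
    positions=[0.0, 1.0, 3.0, -2.0],
    velocities=[0.5, -0.5, 0.0, 0.0],
    adjacency=adjacency,
)
position_spread = max(x_final) - min(x_final)
velocity_spread = max(v_final) - min(v_final)
```

With an undirected connected graph the inputs sum to zero, so the average velocity is preserved; here it is zero, and the positions settle near the average initial position. The gain `gamma` is fixed by hand in this sketch, which is precisely the model-dependent design step that the abstract proposes to replace with a learned gain.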
In this paper, an uncertain disturbance rejection control problem for affine systems with asymmetric input constraints is addressed using an event-triggered control method. The disturbance rejection control problem is converted into an H∞ optimal control problem, and a zero-sum game-based method is proposed to solve it. To deal with the input constraints, a new cost function is proposed. The event-triggered controller is updated only when the triggering condition is satisfied, which can reduce the computational complexity. To obtain a controller that minimizes the performance index function under the worst-case disturbance, we use a critic-only network to solve the Hamilton-Jacobi-Isaacs (HJI) equation, and the critic network weights are tuned by a gradient descent method using historical state data. The stability of the closed-loop system and the uniform ultimate boundedness of the critic network parameters are proved by the Lyapunov method. Two numerical examples are provided to verify the effectiveness of the proposed method.