OPTIMAL CONTROL OF PROBABILITY DENSITY FUNCTIONS OF STOCHASTIC PROCESSES

Abstract. A Fokker-Planck framework for the formulation of an optimal control strategy of stochastic processes is presented. Within this strategy, the control objectives are defined based on the probability density functions of the stochastic processes. The optimal control is obtained as the minimizer of the objective under the constraint given by the Fokker-Planck model. Representative stochastic processes are considered with different control laws and with the purpose of attaining a final target configuration or tracking a desired trajectory. In the latter case, a receding-horizon algorithm over a sequence of time windows is implemented.

initial time. We consider that the state variable X_t evolves according to the SDE (1.1) under a controller u which is explicitly modeled in the equation. Then, we consider the Fokker-Planck (FP) equation associated to the process as the governing model. In this way, we have the advantage of being able to characterize the whole shape of the randomness present in the system and of avoiding the technical difficulties of stochastic Itô calculus, since the problem is now formulated in a weak deterministic sense.

We remark that the use of the Fokker-Planck equation for control purposes is a less explored topic. In [14, 15], the possibility to control a dynamical system is discussed based on the assumption that the system can be described by a FP equation. In [19], a method that uses a discretized FP equation to solve the HJB equation for the optimal control problem of nonlinear stochastic dynamical systems is presented.

In our framework, we consider a stochastic process in a time interval, with given initial PDF, and the objective of approximating a desired final PDF target with the actual PDF of the process. Further, we apply this control strategy to a sequence of small time sub-intervals to construct a fast closed-loop control scheme of the stochastic process based on the receding-horizon (RH) model predictive control (MPC) approach [22, 23]. Notice that, in contrast to LQR-type feedback schemes employed in, e.g., [5, 32], closed-loop schemes based on the RH-MPC approach do not optimize a true performance index; see [8] for a method to quantify the performance degradation. Nevertheless, these schemes provide robust controllers that apply equally well to linear and nonlinear models and allow us to accommodate different control and state constraints [8, 10]. For this reason, RH-MPC schemes are among the most widely used control techniques in process control.

Based on the RH-MPC formulation, we are able to implement a control strategy that determines a piecewise constant control function driving the process to follow a desired PDF trajectory, which may include the case of a desired non-equilibrium configuration. In this strategy, measurements of the state PDF are required at the end of each time sub-interval.

In the next section, we define a class of stochastic processes and introduce our framework with optimal control problems based on the FP equation. In Section 3, we discuss the discretization of the FP equation that is required to extend our framework to the case where the FP equation must be solved numerically. Section 4 is devoted to illustrating the receding-horizon model predictive control scheme. In Sections 5 and 6, we discuss two representative stochastic processes.

2 Fokker-Planck optimal control formulation of stochastic processes

We consider a class of one-dimensional stochastic processes X_t ∈ Ω ⊂ R governed by the following stochastic Itô differential equation

dX_t = b(X_t, t; u) dt + σ(X_t, t) dW_t,   (2.1)

where W_t is a Wiener process with zero mean and unit variance, σ(X_t, t) > 0 is the diffusion coefficient of the stochastic process, and b(X_t, t; u) is a drift term that includes a control function.
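For illustration, sample paths of an SDE of this form can be generated with the Euler-Maruyama method. The sketch below uses an OU-type drift with a control u as a stand-in example; all parameter values are illustrative and not taken from the paper.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, u, T, n_steps, n_paths, rng):
    """Simulate sample paths of dX_t = b(X_t, t; u) dt + sigma(X_t, t) dW_t."""
    dt = T / n_steps
    X = np.full(n_paths, float(x0))
    for k in range(n_steps):
        t = k * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        X = X + b(X, t, u) * dt + sigma(X, t) * dW
    return X

# Illustrative parameters (not from the paper): an OU-type drift with control u.
rng = np.random.default_rng(0)
gamma, sig, u = 1.0, 0.1, 0.0
X_T = euler_maruyama(lambda x, t, u: -gamma * x + u,
                     lambda x, t: sig, x0=1.0, u=u,
                     T=1.0, n_steps=200, n_paths=20000, rng=rng)
```

A histogram of `X_T` approximates the PDF that the FP equation evolves deterministically, which is the object controlled in the following sections.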
The first step of our strategy is to control the stochastic model evolving in a given time interval so as to attain a desired final configuration. We assume that the initial probability density distribution is given and that the control is a constant function in this time interval. This is a terminal observation control problem.

The second step of our strategy is to consider a sequence of such time intervals (t_k, t_{k+1}) and to apply this control strategy on each sub-interval, resulting in a closed-loop scheme.

If ρ(y, s) is the given initial probability density of the process at time s, then the probability density of the process at time t > s is given by the following integral

f(x, t) = ∫_Ω f̃(x, t; y, s) ρ(y, s) dy,   (2.3)

where f̃(x, t; y, s) denotes the transition probability density of the process. Notice that ρ must be nonnegative and normalized, ∫_Ω ρ(y, s) dy = 1.
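Numerically, the integration in (2.3) amounts to a quadrature of the transition kernel against ρ on a grid. A minimal sketch, with a stand-in Gaussian kernel that is purely illustrative (it is not one of the processes studied below):

```python
import numpy as np

def marginal_pdf(kernel, rho, dy):
    """f(x_i) ≈ Σ_j kernel[i, j] * rho[j] * dy, a Riemann-sum approximation
    of f(x, t) = ∫_Ω f~(x, t; y, s) ρ(y, s) dy as in (2.3)."""
    return kernel @ rho * dy

# Stand-in transition kernel: a Gaussian of variance v (illustration only).
x = np.linspace(-6.0, 6.0, 401)
dy = x[1] - x[0]
v = 0.5
K = np.exp(-(x[:, None] - x[None, :])**2 / (2 * v)) / np.sqrt(2 * np.pi * v)
rho = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # normalized initial PDF
f = marginal_pdf(K, rho, dy)
```

Since the kernel is itself a probability density in x for each y, the resulting f is again (approximately) normalized.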

Now, we assume that the initial value of the process at time t_k is known, in the sense that we are given the probability density ρ(x, s) at time s = t_k. Our problem is to determine a control u such that, starting with initial distribution ρ, the process evolves so that a desired target probability density f_d(x, t) at time t = t_{k+1} is matched as closely as possible. With this setting, we consider the following control problem in Q = Ω × (t_k, t_{k+1}). We have

min_u J(f̃, u) := ½ ∫_Ω ( f̃(x, t_{k+1}; y, t_k; u) − f_d(x, t_{k+1}) )² dx + ν/2 |u|²,   (2.4)

∂_t f̃ − ½ ∂²_x ( σ²(x, t) f̃ ) + ∂_x ( b(x, t; u) f̃ ) = 0 in Q,   (2.5)

f̃(x, t_k; y, t_k) = δ(x − y),   (2.6)

where ν > 0 weighs the cost of the control, |u| is the absolute value of the (scalar) control, and (2.5) is the Fokker-Planck equation of the process. The transition density f̃ is as difficult to compute as the Green function of a partial differential equation. However, in our case we assume that the initial distribution ρ is given and hence we can reformulate

(2.4)-(2.6) as follows

min_u J(f, u) := ½ ∫_Ω ( f(x, t_{k+1}; u) − f_d(x, t_{k+1}) )² dx + ν/2 |u|²,   (2.7)

∂_t f − ½ ∂²_x ( σ²(x, t) f ) + ∂_x ( b(x, t; u) f ) = 0 in Q,   (2.8)

f(x, t_k) = ρ(x),   (2.9)

where we dropped s in the initial distribution. We consider solutions of the FP equation that are sufficiently regular [6], i.e., f(x, t) has a continuous first derivative in time and a continuous second derivative in space, together with the conservation condition. However, weaker conditions are possible; see [26].

In the following, we use the optimal control formulation (2.7)-(2.9) to define our control strategy. To illustrate our framework, we consider different stochastic processes and solve the corresponding optimal control problems.
For completeness, we remark that the optimal control problem (2.7)-(2.9) is an infinite-dimensional constrained minimization problem whose solution is characterized as the solution of the following first-order optimality system. We have

∂_t f − ½ ∂²_x ( σ² f ) + ∂_x ( b(x, t; u) f ) = 0,   f(x, t_k) = ρ(x),

−∂_t p − ½ σ² ∂²_x p − b(x, t; u) ∂_x p = 0,   p(x, t_{k+1}) = −( f(x, t_{k+1}) − f_d(x, t_{k+1}) ),

ν u − ( f ∂_u b, ∂_x p ) = 0.

Notice that the state variable evolves forward in time and the adjoint variable evolves backwards in time. In the optimality equation, we have used the following inner product

(φ, ψ) = ∫_{t_k}^{t_{k+1}} ∫_Ω φ(x, t) ψ(x, t) dx dt.

Since (2.8)-(2.9) define a map u ↦ f(u), we can introduce the so-called reduced cost functional Ĵ given by

Ĵ(u) = J(f(u), u).

The gradient of Ĵ with respect to u is given by

∇Ĵ(u) = ν u − ( f(u) ∂_u b, ∂_x p(u) ),

where p(u) is the solution of the adjoint equation for the given f(u).

Summarizing, the optimal control problem (2.7)-(2.9) can be reformulated as: solve min_u Ĵ(u).

3 Discretization of the Fokker-Planck equation

Define a sequence of grids {Ω_h}_{h>0} given by

Ω̄_h = { x ∈ R : x = x_j = j h, j ∈ Z } ∩ Ω̄.

We assume that Ω is a segment whose length is a multiple of the spatial mesh size h, i.e., h is chosen such that the boundaries of Ω coincide with grid points. We call Ω̄_h the mesh, Ω_h the set of interior mesh points, and Γ_h the set of the two boundary end-points.

Let ∂⁻_x (resp. ∂⁺_x) denote the backward (resp. forward) difference quotient in the x direction. The positive reconstructed flux φ̂⁺_{j+1/2} is based on the three-point stencil {j − 1, j, j + 1}; we refer to the cited literature for all details.
Similarly, for the negative reconstructed flux φ̂⁻_{j+1/2} we use the stencil {j, j + 1, j + 2} with coefficients c⁻ = [1/3, 5/6, −1/6]. Therefore, the total flux at the right boundary of cell j is given by

φ̂_{j+1/2} = φ̂⁺_{j+1/2} + φ̂⁻_{j+1/2}.

The same reconstruction is performed for the cell j − 1, and ∂^C_x φ_j = (φ̂_{j+1/2} − φ̂_{j−1/2})/h is calculated.
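As a sketch, the face reconstruction just described can be coded as follows. The coefficients c⁻ are those quoted above; c⁺ is assumed here to be the mirror-image stencil on {j − 1, j, j + 1}, consistent with the "similarly" in the text but not stated explicitly in it.

```python
import numpy as np

# Coefficients from the text: c_minus on the stencil {j, j+1, j+2}.
# c_plus on {j-1, j, j+1} is assumed by symmetry (mirror image).
c_plus = np.array([-1/6, 5/6, 1/3])
c_minus = np.array([1/3, 5/6, -1/6])

def face_flux(phi_plus, phi_minus, j):
    """Total reconstructed flux at the face j+1/2 of cell j, combining the
    positive and negative parts of a previously split flux."""
    return c_plus @ phi_plus[j-1:j+2] + c_minus @ phi_minus[j:j+3]

# The third-order reconstruction is exact for smooth (here linear) data:
phi = np.arange(10.0)                   # phi_j = j, face value should be j + 1/2
flux = face_flux(phi, np.zeros(10), 4)  # → 4.5
```

Feeding only a positive (or only a negative) split part reproduces the exact face value j + 1/2 for linear data, which is a quick sanity check on the coefficients.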

In order to ensure convergence of the numerical scheme, the following Courant-Friedrichs-Lewy condition must be satisfied: δt = c min( h²/(2 ‖σ²‖_∞), h/B ), where B denotes a bound on the drift. We find that c = 0.8 guarantees stable and accurate numerical results.

4 Receding-horizon model predictive control

The receding-horizon model predictive control approach provides robust closed-loop algorithms. One important aspect of this approach is that it can be applied to infinite-dimensional evolution systems [10], which is the case of the FP model.

The RH-MPC procedure is summarized in the following algorithm.

1. In the current time window (t_k, t_{k+1}), solve the optimal control problem (2.7)-(2.9) with the current PDF as initial condition.
2. Apply the resulting control and obtain, by measurement, the state PDF at t = t_{k+1}.
3. Assign this PDF as the initial condition for the FP problem in the next time window.
4. If t_{k+1} < T, set k := k + 1, go to 1, and repeat.
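The algorithm above can be sketched as a generic loop. Here `solve_ocp` and `measure_pdf` are placeholders for the window-wise optimal control solve and the end-of-window PDF measurement; they are user-supplied surrogates, not the paper's implementation.

```python
import numpy as np

def receding_horizon(rho0, windows, solve_ocp, measure_pdf):
    """Hedged sketch of the RH-MPC loop.

    solve_ocp(rho, t0, t1)        -> optimal (constant) control on the window;
    measure_pdf(rho, u, t0, t1)   -> state PDF at t1 (a measurement in practice)."""
    rho = rho0
    controls = []
    for (t0, t1) in windows:
        u = solve_ocp(rho, t0, t1)          # step 1: solve (2.7)-(2.9) on the window
        rho = measure_pdf(rho, u, t0, t1)   # step 2: state PDF at the window's end
        controls.append(u)                  # step 3: rho is the next initial condition
    return controls, rho                    # step 4 is the loop itself

# Toy usage with dummy callbacks
windows = [(0.0, 0.5), (0.5, 1.0)]
ctrls, _ = receding_horizon(np.ones(4) / 4, windows,
                            solve_ocp=lambda rho, t0, t1: t0,
                            measure_pdf=lambda rho, u, t0, t1: rho)
```

The measured PDF feeding back into the next window is what makes the scheme closed-loop, in contrast with solving one long open-loop problem over (0, T).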

In the following, we report results of numerical experiments where we compute the controls for representative stochastic processes.

5 Control of the Ornstein-Uhlenbeck process

The classical problem of a massive particle immersed in a viscous fluid and subject to random Brownian fluctuations due to interaction with other particles is modeled by the Ornstein-Uhlenbeck (OU) process. We focus on the control of the PDF of the OU process. For this purpose, we set b(X_t, t; u) = −γ X_t + u and σ(X_t, t) = σ, where X_t represents the velocity of the particle and u is the momentum induced by an external force field defining the control. Under the action of the control u, the Fokker-Planck equation becomes

∂_t f̃ − σ²/2 ∂²_x f̃ + ∂_x ( (−γ x + u) f̃ ) = 0.

In this case, the solution to the Fokker-Planck equation is well known to be a Gaussian distribution with mean

μ(t; y, t_k; u) = u/γ + ( y − u/γ ) e^{−γ(t−t_k)}

and variance

ς²(t) = σ²/(2γ) ( 1 − e^{−2γ(t−t_k)} ).

Therefore the solution is as follows

f̃(x, t; y, t_k; u) = 1/√(2π ς²(t)) exp( −( x − μ(t; y, t_k; u) )² / (2 ς²(t)) ).

This solution defines a mapping f̃ = f̃(u). Now, assuming an initial distribution ρ at t = t_k and having f̃(x, t_{k+1}; y, t_k; u), the final distribution f(x, t_{k+1}; u) is given by integration as defined in (2.3). This procedure provides a mapping f = f(u) and thus the reduced cost functional Ĵ(u) = J(f(u), u) is obtained. As a result, we obtain the one-parameter optimization problem min_u Ĵ(u). This problem is solved efficiently by a bisection minimization procedure [24].
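Because the OU transition density is Gaussian with the mean and variance given above, the reduced functional can be evaluated on a grid and minimized directly in the single scalar u. The sketch below uses SciPy's bounded scalar minimizer in place of the bisection procedure of [24]; all numerical values (γ, σ, ν, the window length, the initial and target PDFs) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

gamma, sig, nu, dt = 1.0, 0.5, 1e-4, 1.0
x = np.linspace(-5.0, 5.0, 401)
dx = x[1] - x[0]

def gauss(z, m, v):
    return np.exp(-(z - m)**2 / (2 * v)) / np.sqrt(2 * np.pi * v)

rho = gauss(x, -1.0, 0.2)                   # initial PDF at t_k (assumed)
f_d = gauss(x, 1.0, sig**2 / (2 * gamma))   # target: equilibrium shape at mean 1

def final_pdf(u):
    # OU transition: mean u/g + (y - u/g) e^{-g dt}, variance s^2(1-e^{-2 g dt})/(2g)
    m = u / gamma + (x[None, :] - u / gamma) * np.exp(-gamma * dt)  # columns: y
    v = sig**2 * (1 - np.exp(-2 * gamma * dt)) / (2 * gamma)
    K = gauss(x[:, None], m, v)     # K[i, j] = f~(x_i, t_{k+1}; y_j, t_k; u)
    return K @ rho * dx             # integration against rho as in (2.3)

def J(u):
    return 0.5 * np.sum((final_pdf(u) - f_d)**2) * dx + 0.5 * nu * u**2

res = minimize_scalar(J, bounds=(-5, 5), method="bounded")
```

For this target, matching the mean after one window requires u/γ + (−1 − u/γ)e^{−γ dt} = 1, i.e. u ≈ γ(1 + e^{−γ dt})/(1 − e^{−γ dt}) ≈ 2.16 for the values above; the small ν only perturbs this slightly.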

310
In Figure 1, we show the optimal solution for the problem (5.1) corresponding to a target PDF that is also Gaussian.

6 Control of the geometric Brownian motion

In the GB control problem, we have b(X_t, t; u) = (μ + u) X_t and σ(X_t, t) = σ X_t. As an example of application, we assume that X_t is the wealth and μ is the average market rate of return; the process is defined for x, y ≥ 0.

In this case, the analytical solution for the transition probability function is known to be the following log-normal distribution

f̃(x, t; y, t_0; u) = 1/( x √(2π σ² (t−t_0)) ) exp( −( ln(x/y) − (μ + u − σ²/2)(t−t_0) )² / ( 2 σ² (t−t_0) ) ).

Proceeding as in the previous OU case, we obtain a one-parameter optimization problem that can be solved efficiently by bisection.
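The log-normal transition density above can be checked numerically: it integrates to one, and its first moment reproduces the known mean E[X_t] = y e^{(μ+u)(t−t_0)} of the geometric Brownian motion. Parameter values here are illustrative.

```python
import numpy as np

def gb_transition(x, y, t, mu_u, sigma):
    """Log-normal transition density of dX = (mu+u) X dt + sigma X dW,
    started at X_{t0} = y, after elapsed time t (mu_u stands for mu + u)."""
    s2t = sigma**2 * t
    return np.exp(-(np.log(x / y) - (mu_u - 0.5 * sigma**2) * t)**2 / (2 * s2t)) \
           / (x * np.sqrt(2 * np.pi * s2t))

x = np.linspace(1e-4, 40.0, 20000)
dx = x[1] - x[0]
f = gb_transition(x, y=1.0, t=1.0, mu_u=0.1, sigma=0.3)
mass = np.sum(f) * dx            # should be ≈ 1
mean = np.sum(x * f) * dx        # should be ≈ exp(0.1)
```

Such moment checks are a convenient way to validate a coded transition density before using it inside the one-parameter optimization.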

In Figure 2, we show the results obtained with the geometric Brownian process.

An interesting setting in applications is b(X_t, t; u) = u + μ X_t, σ(X_t, t) = σ X_t. In this case, the control does not multiply X_t in the drift term. This model is known in the literature as the Shiryaev process [25] and can be found in applications in sequential analysis and financial mathematics [25, 31, 28, 29]. This setting can be interpreted as the sum of a fixed absolute income from a risk-free market and the outcome of consumption.

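Since the Shiryaev process has no closed-form transition density, its Fokker-Planck equation, ∂_t f − ½σ²∂²_x(x²f) + ∂_x((u + μx)f) = 0, has to be advanced numerically. The following is a minimal explicit conservative sketch (centered fluxes, zero-flux boundaries), not the higher-order scheme of Section 3; parameter values are illustrative.

```python
import numpy as np

def shiryaev_fp_step(f, x, dt, u, mu, sigma):
    """One explicit conservative step of
        d_t f = 0.5 s^2 d_xx (x^2 f) - d_x ((u + mu x) f)
    with zero-flux boundaries (a sketch, not the paper's scheme)."""
    h = x[1] - x[0]
    g = x**2 * f                        # argument of the diffusion term
    xf = 0.5 * (x[:-1] + x[1:])         # face positions
    # flux at interior faces: advective (centered) minus diffusive part
    F = (u + mu * xf) * 0.5 * (f[:-1] + f[1:]) \
        - 0.5 * sigma**2 * (g[1:] - g[:-1]) / h
    F = np.concatenate(([0.0], F, [0.0]))   # zero-flux boundary faces
    return f - dt * (F[1:] - F[:-1]) / h

x = np.linspace(0.0, 4.0, 201)
h = x[1] - x[0]
f = np.exp(-(x - 1.0)**2 / (2 * 0.01))
f /= f.sum() * h                        # narrow, normalized initial PDF
dt = 1e-4                               # within the explicit stability bound here
for _ in range(100):
    f = shiryaev_fp_step(f, x, dt, u=0.1, mu=0.05, sigma=0.3)
```

Because the update is written in flux form with zero boundary fluxes, the total probability mass is conserved to round-off, which mirrors the conservation condition required of FP solutions.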
The Fokker-Planck equation corresponding to the Shiryaev process is as follows

∂_t f̃(x, t; y, t_k) − ½ σ² ∂²_x ( x² f̃(x, t; y, t_k) ) + ∂_x ( (u + μ x) f̃(x, t; y, t_k) ) = 0,

f̃(x, t_k; y, t_k) = δ(x − y).

Although close to the GB case (6.1), the analytical solution to this FP equation is not known in closed form; however, some series expansions can be obtained [25]. Therefore, we solve the following FP equation numerically. We have

∂_t f(x, t) − ½ σ² ∂²_x ( x² f(x, t) ) + ∂_x ( (u + μ x) f(x, t) ) = 0.

7 Conclusions

In this paper, a Fokker-Planck framework for determining controls of stochastic processes was presented. The control objective was defined based on the probability density function.

The control strategy was based on a receding-horizon model predictive control framework, where optimal controls were obtained by minimizing the objective under the constraint given by the Fokker-Planck equation that models the evolution of the probability density function.