On the Action of a Semi-Markov Process on a System of Differential Equations

Abstract We deal with a model equation for stochastic processes that results from the action of a semi-Markov process on a system of ordinary differential equations. The resulting stochastic process is piecewise deterministic, with random changes of the motion at random time epochs. Using classical methods of probability calculus, we first build and discuss the fundamental equation for the statistical analysis, i.e. a Liouville Master Equation for the distribution functions, which is a system of hyperbolic PDEs with non-local boundary conditions. Then, as the main contribution of this paper, by the method of characteristics we recast it into a system of Volterra integral equations with space fluxes, and prove existence and uniqueness of the solution. A numerical experiment for a case of practical application is performed.


Introduction
Non-deterministic, or stochastic, modelling is nowadays one of the most active fields of investigation in applied science, engineering and finance. Randomness, or noise, can be included in modelling equations, roughly speaking, in two different ways. First, for continuous-time equations, a very popular way is to add a Wiener or Gaussian noise to deterministic differential equations, so that the randomness affects the state variables at every instant of their evolution. The number of examples is enormous; a classical one from physics is the Langevin equation. The main analysis tools for this class of stochastic equations are provided by the well-established Itô calculus, as well as the Fokker-Planck equation.
The second category of noise modelling is by "point processes", where the randomness affects the deterministic motion only at some random "epochs".
Historically, they found application in queues and renewal processes, for which there are specific techniques of calculus and analysis. More recently, the modelling framework of point processes was extended to motions that assume a certain number of deterministic states switching randomly at random times [14]. Nowadays this framework has potentially a huge number of practical applications. This very wide class of stochastic processes is named Piecewise Deterministic Processes (PDP) [14,15]. Within this category, here we focus on processes that switch randomly between deterministic states driven by a semi-Markov process S(t), that is, a discrete stochastic jump process where the influence of the past is erased at the jump epochs. Early work was made in [23] for the action of a Markov process.
The model equation is a first-order system of differential equations, where the known driving vector function is affected by a semi-Markov process. The state function X(t), X : [t_0, ∞[ → Ω, Ω ⊂ R^d, is described by:
a) It satisfies the equation
    dX(t)/dt = Ā_{S(t)}(X(t)),  t > t_0,   (1.1)
where S(t) : [t_0, ∞[ → S is a semi-Markov process (defined by c) and d)) with set of discrete states S = {1, . . . , S} and, correspondingly, given s ∈ S we say that the dynamics is in the (deterministic) state s, driven by the function Ā_s : Ω → R^d, one of a set {Ā_1, . . . , Ā_S} of known functions (or controllers). We require that all Ā_s(·), s ∈ S, be Lipschitz continuous, so that, for fixed s, the solution X(t) exists, is unique and is non-explosive.
b) It satisfies the initial condition set by the Cauchy problem for Eq. (1.1), i.e. X(t_0) = X_0 ∈ Ω, together with the initial state s_0 = S(t_0).
c) S(t) is characterized by a probability density function (p.d.f.) of transition events,
    ψ_s : R^+ → R^+,  s ∈ S;   (1.2)
in other words, ψ_s is the p.d.f. of the holding (sojourn) time in the state s.
d) S(t) is also characterized by the stochastic transition probability matrix (or transition measure) q̄ := {q_ij}: when a transition event occurs, the dynamics switches instantaneously from a state s ∈ S, i.e. Ā_s, randomly to a new state s′ ∈ S, i.e. Ā_{s′}. We note that the stochastic matrix q̄ has spectral radius ρ(q̄) = ‖q̄‖_1 = 1.
Virtual transitions from a state s to itself, i.e. no true transition, are allowed in this model, meaning that q_ss > 0 is possible. The position X(t) of the process is not affected when the state switches, because the switch happens instantaneously, so that the process is continuous.
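The model a)-d) can be sampled path-wise in a straightforward way. The following is a minimal Monte Carlo sketch, not part of the paper's method: it assumes explicit Euler integration between jump epochs, and all names (`simulate_pdp`, `sample_holding`, etc.) are ours.

```python
import random

def simulate_pdp(x0, s0, t0, t_end, A, sample_holding, q, dt=1e-3, rng=None):
    """Sample one path of the piecewise-deterministic process of a)-d).

    A[s]              -- driving function of state s (the paper's A_s)
    sample_holding[s] -- draws a sojourn time distributed as psi_s
    q[s]              -- row of switching probabilities out of state s
    Between jump epochs X follows dX/dt = A[s](X) (explicit Euler here);
    at each epoch only the discrete state switches, X stays continuous.
    """
    rng = rng or random.Random()
    x, s, t = x0, s0, t0
    path = [(t, x, s)]
    while t < t_end:
        t_next = min(t + sample_holding[s](rng), t_end)
        while t < t_next:                     # deterministic piece
            h = min(dt, t_next - t)
            x += h * A[s](x)                  # Euler step of dX/dt = A_s(X)
            t += h
        if t < t_end:                         # a true switching epoch
            s = rng.choices(range(len(q)), weights=q[s])[0]
        path.append((t, x, s))
    return path
```

With exponential sojourn times the sampler reproduces the Markov (memoryless) case; any other sampler gives a genuinely semi-Markov driving process.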
vativity was proved under a Courant-Friedrichs-Lewy-like condition. Finite volume schemes for the approximation of probability measures for a system of hyperbolic PDEs arising in the dynamic reliability of a system have been studied in [11,16]. In this paper we provide an analytical and modelling discussion of the LME. As mentioned above, the model equation studied here does not cover all the cases of the general theory of PDPs [14], although it is enough to model many cases of practical application. As we will see, even for this reduced formulation, when all the model parameters and functions are given explicitly, the equation for the statistical description, i.e. the LME, can result in a rather intricate model equation. Hence, the choice of a discrete semi-Markov process as the source of randomness for deterministic motion seems to be a fair compromise between the generality of the model and the practicality of the explicit model equation.
In the introductory Section 2 we first show the derivation of the LME for the one-dimensional case, with simple classical arguments, and then extend it to processes in multi-dimensional state spaces. We show various cases, distinguished by the presence or absence of memory, and formulated for distributions or for densities. Section 3 is devoted to recasting the LME into a Volterra Integral Equation (VIE) with space maps, and to proving an existence and uniqueness theorem. Finally, in Section 4 we show a numerical result from a model of practical application, using the numerical scheme proposed in [6]. A concluding section completes the paper.

Building the Basic Equation
In this section we show how to build the LME, starting from the Chapman-Kolmogorov-Smoluchowski equation for a process in one dimension, d = 1; details of the basic theory can be found in any textbook on stochastic processes (e.g. [13]).
In terms of probability theory, the configurational parameters can be described by the marginal density distribution functions ρ_s(x, y, t), s ∈ S, giving the probability density that at time t the configurational variables are close to the value x, that the process is in the state s, and that it has spent there the time y = t − t_k since the last event t_k generated by the semi-Markov process S(t). Y(t) is a random variable representing the "memory", i.e. the time spent since the last switching event t_k. The normalizing condition is
    Σ_{s=1}^S ∫_Ω dx ∫_0^∞ dy ρ_s(x, y, t) = 1.
We are assuming that the probability measure is absolutely continuous. Indeed, for some stochastic processes a discrete measure should also be included, so that for completeness of the description the Lebesgue-Stieltjes measure should be used. The typical problem is to calculate ρ_s(x, y, t) at every time, when ρ_s^(0)(x, y) is known at the time t_0.
The evolution is described in terms of the transport equation
    ρ_s(x, y, t) = Σ_{j=1}^S ∫_Ω dx_0 ∫_0^∞ dy_0 p_sj(x, y, t; x_0, y_0, t_0) ρ_j^(0)(x_0, y_0),   (2.1)
where p_sj(x, y, t; x_0, y_0, t_0) (y_0 > 0) is the transition probability density distribution from the state (x_0, y_0, t_0, j) to (x, y, t, s): it is the conditional probability density of finding the process X(t) close to x, in the state s, having spent the time y since the last switching event, given the process X(t_0) at x_0, in the state j, having spent the time y_0 in the state j. The transition p.d.f. has the property
    p_sj(x, y, t_0; x_0, y_0, t_0) = δ(x − x_0) δ(y − y_0) δ_sj,
where δ(x) is the Dirac δ and δ_sj the Kronecker symbol, and it carries the complete information on the stochastic process. The basic method to obtain it is to solve a Chapman-Kolmogorov-Smoluchowski equation connecting the transition p.d.f.'s at different times of the process. However, solving such an equation for the transition probability function is not useful in practice, because its general solution is as difficult to find as the Green function of a partial differential equation. Therefore one usually searches directly for a solution of Eq. (2.1). The main goal is to find the evolution equation for this p.d.f. according to the dynamical law (1.1). We now build an equation for the marginal distributions for y ≥ 0.
Here and in what follows we will sometimes set the lower limit of integrals to −∞ rather than the lower bound of Ω, because we suppose that the probability density vanishes outside Ω. Note that F_s(x, y, t) is a distribution with respect to x, but a density with respect to y. The initial state (x_0, y_0, t_0) is omitted. An important function of the model is the hazard function [13]
    λ_s(y) := ψ_s(y) / ∫_y^∞ ψ_s(y′) dy′,  y ≥ 0,   (2.3)
related to the ψ_s(t) of Eq. (1.2). It represents the probability density per unit time that an event of S(t) occurs, given that the time (memory) y has been spent in the state s. It is easy to see from its definition that the hazard function λ_s(y) is not necessarily L^1, or even continuous. In the most general case it could be L^1_loc; however, we will not investigate its properties here.
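Eq. (2.3) is easy to evaluate numerically. The following sketch (our illustration, not the paper's: the tail integral is truncated at a finite horizon and computed by the trapezoidal rule; the exponential and Weibull densities are our example choices) also shows the two regimes discussed in the text:

```python
import math

def hazard(psi, y, tail=50.0, n=20000):
    """lambda(y) = psi(y) / integral_y^inf psi(y') dy'  (Eq. (2.3));
    the tail integral is truncated at `tail` and computed by the
    trapezoidal rule on n subintervals."""
    h = (tail - y) / n
    vals = [psi(y + i * h) for i in range(n + 1)]
    survival = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return psi(y) / survival

# Exponential holding times: constant hazard (the memoryless, Markov case).
lam = 1.5
psi_exp = lambda t: lam * math.exp(-lam * t)

# Weibull holding times psi(t) = k t^(k-1) exp(-t^k): hazard k t^(k-1),
# non-constant, so the driving process is genuinely semi-Markov.
k = 2.0
psi_weib = lambda t: k * t ** (k - 1) * math.exp(-t ** k)
```

For the exponential density the hazard is the constant λ_s, which is why the memory variable y can be dropped in the Markov case; for the Weibull density it grows without bound.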
To build the LME, let us consider the motion of the process in small time steps of size ∆t. There are two cases. In the absence of events the motion is deterministic, so that for a small time interval ∆t and displacement ∆x = Ā_s(x)∆t, the change of the distribution is
    F_s(x + ∆x, y + ∆t, t + ∆t) = (1 − λ_s(y)∆t) F_s(x, y, t).
The right-hand side is the probability that no transition event occurs, so that the evolution follows the deterministic law. By approximating with differentials, considering the first-order terms and using the right continuity of F_s, we obtain
    ∂_t F_s + ∂_y F_s + Ā_s(x) ∂_x F_s = −λ_s(y) F_s,   (2.4)
for t > t_0, y > 0. The r.h.s. is related to the infinitesimal generator of the process. This equation is valid as long as there is no switching event.
In the other case, when an event occurs within the time interval ∆t, we have
    F_s(x, 0, t) = Σ_{j=1}^S q_sj ∫_0^∞ λ_j(y) F_j(x, y, t) dy,   (2.5)
that is, the distribution function of the state the process has just entered, i.e. y = 0. It results from the sum of the probabilities coming from all previous states. Each contribution takes into account all the memory time y spent in the state j: λ_j(y)∆t is the probability that an event occurs after the sojourn time y in that state, and F_j(x, y, t) is the distribution of the system, so their product is the joint probability, to be integrated over all the time spent in j. In order for Eq. (2.5) to be meaningful, we assume that the integrals on the r.h.s. are finite. Eq. (2.5) acts as a non-local boundary condition for Eq. (2.4). The model is completed by the Cauchy initial conditions
    F_s(x, y, t_0) = F_{0,s}(x, y).   (2.6)
The distribution function of x at time t, regardless of the memory state, is
    F_s(x, t) = ∫_0^∞ F_s(x, y, t) dy.   (2.7)
Finally, there are boundary conditions related to the conservation of probability:
    lim_{y→∞} F_s(x, y, t) = 0,  Σ_{s=1}^S lim_{x→+∞} F_s(x, t) = 1.   (2.8)
We note that the PDE (2.4) has boundary data at (y ≥ 0, t = t_0) from Eq. (2.6), and at (y = 0, t > t_0) from Eq. (2.5). According to the hyperbolic nature of the equation, the boundary values are transported along the characteristic lines y = t − t*, for (t > t*, y > 0). This means that the initial condition (2.6) influences the region y ≥ t − t_0 of the solution domain, while the non-local condition (2.5) influences the region (0 ≤ y < t − t_0, t > t_0). Further, we note that, due to (2.5), in general lim_{t→t_0^+} F_s(x, 0, t) ≠ F_{0,s}(x, 0), because F_{0,s} is an arbitrarily assigned function while F_s(x, 0, t) depends on the setting of the model. This discontinuity propagates along y = t − t_0, so that the two one-sided limits of F_s(x, y, t) across y = t − t_0 differ in general. As a result, with smooth enough conditions for Eq.
(2.4), the solution F_s(x, y, t) can be continuous in two separate regions and, as we will see in Section 3, it is defined in two pieces:
    F_s(x, y, t) = F_s^+(x, y, t) H(t − t_0 − y) + F_s^*(x, y, t) H(y − t + t_0),
where H(y) is the step function, H(y) = 1 for y ≥ 0 and H(y) = 0 for y < 0. If we consider an absolutely continuous measure of F_s(x, y, t) with respect to x, the equation for the probability density p_s(x, y, t) = ∂_x F_s(x, y, t) reads
    ∂_t p_s + ∂_y p_s + ∂_x (Ā_s(x) p_s) = −λ_s(y) p_s,
for t > t_0, y > 0, and the boundary conditions
    p_s(x, 0, t) = Σ_{j=1}^S q_sj ∫_0^∞ λ_j(y) p_j(x, y, t) dy.   (2.9)
We will assume that Ω is known a priori and is a simply connected set; indeed, it results from a non-trivial combination of the state spaces of the individual dynamical states s ∈ S of the system.
In what follows we show the formulation of the LME in several multi-dimensional cases, with and without memory.
Memoryless multi-dimensional case. This is the case of switching events distributed according to Poisson statistics, ψ_s(t) = λ_s e^{−λ_s t}, so that the driving stochastic process S(t) is Markov; the hazard function (2.3) then reduces to the constant λ_s, and the memory variable is unnecessary.
Consider a set D_t ⊂ Ω of states; as it is subject to evolution according to Eq. (1.1), D_{t+∆t} is the set D_t "evolved" after the time ∆t.
In the absence of transition events, probability conservation gives
    ∫_{D_{t+∆t}} p_l(x, t + ∆t) dx = (1 − λ_l ∆t) ∫_{D_t} p_l(x, t) dx.
By using the continuity equation (see textbooks of fluid mechanics or statistical physics), at the first order of approximation the r.h.s. transforms under the change of variables, where I is the identity matrix and |I + ∆t ∇Ā_l| is the approximation of the Jacobian matrix. It follows
    ∂_t p_l + ∇ · (Ā_l(x) p_l) = −λ_l p_l.
If we consider the contribution from the other states and introduce the probability that, in case of an event, the transition occurs according to the stochastic matrix {q_ls}, we obtain
    ∂_t p_l + ∇ · (Ā_l(x) p_l) = −λ_l p_l + Σ_{j=1}^S q_lj λ_j p_j,   (2.10)
and, integrating for the distribution functions, Eq. (2.11), where we used the vanishing conditions lim_{x_j→−∞} A_l^{(j)} p_l = 0 (j = 1, . . . , d). The symbol x̂_{≠j} ∈ R^{d−1} means "exclude the component x_j", x̂_{≠j} := (x_1, . . . , x_{j−1}, x_{j+1}, . . . , x_d), and the multiple integral is calculated over a (d − 1)-dimensional space.
If we use the Lebesgue-Stieltjes integral in the definition of the marginal distribution function, Eq. (2.11) turns into its measure-theoretic form; the case d = 2 can be written explicitly as an example.
Multi-dimensional case with memory. In this case the statistics of the transition events is described by general densities ψ_l(y). In the absence of transitions, probability conservation at the first order of approximation (the term before the differential being the approximation of the Jacobian matrix for the change of variables), keeping first-order terms only, gives
    ∂_t p_l + ∂_y p_l + ∇ · (Ā_l(x) p_l) = −λ_l(y) p_l.   (2.12)
As for Eq. (2.9), the equation is completed by the non-local boundary condition
    p_l(x, 0, t) = Σ_{j=1}^S q_lj ∫_0^∞ λ_j(y) p_j(x, y, t) dy,   (2.13)
and by the Cauchy initial condition
    p_l(x, y, t_0) = p_{0,l}(x, y).   (2.14)
Formulation with distribution functions. The formulation with distribution functions is similar to the memoryless case, Eq. (2.15). The non-local boundary condition is
    F_l(x, 0, t) = Σ_{j=1}^S q_lj ∫_0^∞ λ_j(y) F_j(x, y, t) dy,
and the Cauchy condition is F_l(x, y, t_0) = F_{0,l}(x, y).
Hamiltonian systems. In this paragraph we write the LME for a Hamiltonian system subject to a Markov process:
    q̇ = ∂_p H_{S(t)}(q, p, t),  ṗ = −∂_q H_{S(t)}(q, p, t),   (2.16)
with l = 1, . . . , S Hamiltonian states H_l(q, p, t), for a vector of position variables q and momentum variables p. Let ρ_l(q, p, t) be the phase-space density distribution; from Eq. (2.10) we get
    ∂_t ρ_l + ∇_q · (ρ_l ∂_p H_l) − ∇_p · (ρ_l ∂_q H_l) = −λ_l ρ_l + Σ_{j=1}^S q_lj λ_j ρ_j,
where, expanding the divergences, we use ∇_p ∂_q H_l = ∇_q ∂_p H_l just as in Liouville's theorem. By defining the Poisson brackets
    {H_l, ρ_l} := ∂_p H_l · ∇_q ρ_l − ∂_q H_l · ∇_p ρ_l,
it reads
    ∂_t ρ_l + {H_l, ρ_l} = −λ_l ρ_l + Σ_{j=1}^S q_lj λ_j ρ_j,
which represents the LME for the stochastic Hamiltonian system of Eq. (2.16). The initial distribution is ρ_l(q, p, t_0) = ρ_{0,l}(q, p). By summing over the Markov states, the transition terms cancel and we find Σ_{l=1}^S (∂_t ρ_l + {H_l, ρ_l}) = 0, i.e. just the classical Liouville equation.
As an example, let us consider the motion of a massive point particle in a potential U_{S(t)}(q), subject to a 2-state Markov process S(t) with stochastic matrix q_11 = q_22 = 0, q_12 = q_21 = 1, and Poisson transition rates λ_1, λ_2. The Hamiltonian is H_{S(t)}(q, p) = p²/(2m) + U_{S(t)}(q), and the LME reads
    ∂_t ρ_1 + {H_1, ρ_1} = −λ_1 ρ_1 + λ_2 ρ_2,
    ∂_t ρ_2 + {H_2, ρ_2} = −λ_2 ρ_2 + λ_1 ρ_1.
Hamiltonian case with memory. When the statistics of the switching events is not Poissonian, then S(t) is a semi-Markov process and the phase-space density associated to (2.16) depends also on the memory y: ρ_l(q, p, y, t), p, q ∈ R^d. According to Eqs. (2.12) and (2.13), the Liouville Master Equation with memory reads
    ∂_t ρ_l + ∂_y ρ_l + {H_l, ρ_l} = −λ_l(y) ρ_l,
    ρ_l(q, p, 0, t) = Σ_{j=1}^S q_lj ∫_0^∞ λ_j(y) ρ_j(q, p, y, t) dy.
The initial condition ρ_{0,l}(q, p, y) has the special meaning that, for the initial state of the system, the initial memory state should be assigned or prepared; an initial state prepared without past memory is proportional to δ(y). The former example, with hazard functions λ_{1,2}(y), becomes the corresponding system with initial conditions ρ_1(q, p, y, t_0) = ρ_{0,1}(q, p, y), ρ_2(q, p, y, t_0) = ρ_{0,2}(q, p, y).


Recasting the LME into a Volterra Integral Equation

In this section we recast the LME, Eqs. (2.4)-(2.6), into the form of a system of Volterra integral equations. An advantage of this formulation is that the proof of the existence and uniqueness theorem for the solution can be obtained by the standard technique of the contraction theorem. Moreover, it becomes possible to use numerical schemes for integral equations rather than for PDEs. The recasting procedure is based on the method of characteristics for hyperbolic PDEs, and is shown in the following paragraphs, step by step.

Change of function. The change of function
    F_s(x, y, t) = G_s(x, y, t) ∫_y^∞ ψ_s(y′) dy′,   (3.1)
together with the identity ∫_y^∞ ψ_s(y′) dy′ = exp(−∫_0^y λ_s(y′) dy′), which can be directly verified from the definition of the hazard function, removes the r.h.s. of Eq. (2.4) and yields the pure transport equation
    ∂_t G_s + ∂_y G_s + Ā_s(x) ∂_x G_s = 0.   (3.2)
The Cauchy condition (2.6) becomes G_s(x, y, t_0) = G_{0,s}(x, y) := F_{0,s}(x, y) / ∫_y^∞ ψ_s(y′) dy′.

Characteristics' method analysis
In order to simplify Eq. (3.2), we use the method of characteristics. The boundary and Cauchy conditions are given on the curves γ_1 : (t = t_0, y ≥ 0), where G_s = G_{0,s}(x, y), and γ_2 : (y = 0, t > t_0), where G_s(x, 0, t) =: φ_s(x, t) is a new unknown, with the properties φ_s(x, t) = φ_s(b, t) if x ≥ b, and φ_s(x, t) = 0 if x ≤ a. As we discussed in Section 2, there is no continuous limit of the solution to the y = 0 memory value of the initial condition, i.e. in general lim_{t→t_0^+} φ_s(x, t) ≠ G_{0,s}(x, 0). From now on we drop the indexes of the discrete states.
According to the method of characteristics, we search for a change of coordinates (x, y, t) ↔ (v, u_1, u_2), established by the characteristic equations (3.10), with the Cauchy initial conditions β_s(0, u_1, u_2) = β_{0,s}(u_1, u_2) given for v = 0. With this choice Eq. (3.2) becomes
    g_v(v, u_1, u_2) = 0,  g(0, u_1, u_2) = G|_{γ_1,γ_2},
whose solution is trivial, g(v, u_1, u_2) = g(0, u_1, u_2), and the solution in terms of the coordinates (x, y, t) is given by inversion of the change of coordinates (3.9):
    G(x, y, t) = g(0, u_1(x, y, t), u_2(x, y, t)),   (3.11)
calculated both on γ_1 and γ_2. We have to solve Eq. (3.10). Let Φ_A(v, u_1, u_2) be the solution of the first equation of (3.10) when the initial conditions on γ_1 and γ_2 are given. We then have (3.12), where the α_i have to be determined. There are two cases, according to the curves γ_1, γ_2.
Boundary condition, 0 < y < t − t_0, t > t_0. For the γ_2 condition, from (3.6) and (3.12): the first equation is treated as in the previous case; for the second, α_2(u_1, u_2) = 0, due to the second equation of the Cauchy condition in (3.6); and for the third, α_3(u_1, u_2) = u_2. The solution then follows from Eqs. (3.7) and (3.11).
The dynamical mapping. Now we have to find the mapping Φ_A. It is related to the dynamical equation (1.1), whose solution can be written as a map x = Φ*_A(t; x_0, t_0) that transforms the initial point (x_0, t_0) into the point x at time t, with the obvious initial condition Φ*_A(t_0; x_0, t_0) = x_0. In this case the initial condition is β_{0,1}(u_1, u_2) = u_1, given for v = 0 for both γ_1 and γ_2 (see (3.13) and (3.15)).
This means that, if y ≥ t − t_0, we see from Eq. (3.13) that the map connects the initial point (u_1, t_0) to the final point (x, t) (with v = t − t_0) as x = Φ*_A(t; u_1, t_0), and its inverse gives u_1. If y ≤ t − t_0, from Eq. (3.15) the map connects the initial point (u_1, u_2) = (u_1, t − y) to the final point (x, t) (with v = y). Since the codomain must equal the domain, the maps are subject to the restriction (3.18).
The integral equation. Now we insert the information obtained by the method of characteristics into Eq. (3.4). The integral over y is split into two parts, according to the value of t − t_0, and by using Eqs. (3.5) and (3.3) we arrive at
    φ_s(x, t) = Σ_{j=1}^S q_sj [ ∫_0^{t−t_0} ψ_j(y) φ_j(Φ*^{−1}_{A_j}(y; x, t), t − y) dy + f_j^{(0)}(x, t) ],   (3.19)
where the known term
    f_j^{(0)}(x, t) := ∫_{t−t_0}^∞ ψ_j(y) G_{0,j}(Φ*^{−1}_{A_j}(t − t_0; x, t), y − t + t_0) dy   (3.20)
collects the contribution of the initial data; both are valid for t > t_0. We note that Eq. (3.19) is a system of Volterra Integral Equations of the second kind, with space fluxes (or maps) inside the unknowns φ_j, and that it is not a weak formulation of the PDEs (2.4). Once φ_s(x, t) is found, the unknown F_s(x, y, t) can be determined: from Eqs. (3.16) and (3.1) we get (3.21). The solution for memory y ≥ t − t_0 can be obtained from Eqs. (3.14), (3.5) and (3.18), so that by using Eq. (3.1) we get (3.22).

Summary
Here we summarize the procedure to obtain the solution of the LME from the Volterra integral equation: 1. Determine the state space Ω. It results from a non-trivial combination of the state spaces of the individual deterministic dynamics. Ω can be found heuristically by evaluating the dynamics of the resulting stochastic process X(t).
5. From the p.d.f. ψ_s(t) of the switching events, defined in Eq. (1.2), and by using Eq. (2.3), write the hazard functions λ_j(y).
6. By using the initial condition of Eq. (2.6) and Eq. (3.20), write the system of integral equations (3.19) and solve it for φ_s(x, t), s ∈ S.
7. By using Eq. (3.21), write the marginal distribution functions with memory, F_s^+(x, y, t), for y ∈ ]0, t − t_0[. 8. By using Eq. (3.22), write the marginal distribution functions with memory, F_s^*(x, y, t), for y ≥ t − t_0. 9. At this point the solution is written as the sum of the two pieces, F_s(x, y, t) = F_s^+(x, y, t) H(t − t_0 − y) + F_s^*(x, y, t) H(y − t + t_0), where H is the unit step function.
10. If we are not interested in the memory states, we can integrate over them with Eq. (2.7) and find F_s(x, t).
11. The total memoryless distribution function, including all the states of the semi-Markov process, is the sum of the marginal distribution functions, F_T(x, t) = Σ_{s=1}^S F_s(x, t), and the density is its derivative P(x, t) = ∂_x F_T(x, t).

Existence and uniqueness theorem
We now prove the existence and uniqueness of the solution of the system of integral equations (3.19), by the standard technique of the contraction theorem. We note that the presence of the continuous maps Φ_s^{−1} does not substantially affect the standard proof. We define the norm ‖φ‖_{1,∞} := Σ_{j=1}^S max_{x,t} |φ_j(x, t)| and the distance d_{1,∞}(φ, φ̃) := ‖φ − φ̃‖_{1,∞} on the space of continuous functions C^0(Ω × [t_0, t_1]). In this norm we prove that (a power of) the integral operator defined by the right-hand side of Eq. (3.19) is a contraction, so that there exists a unique continuous vector solution φ_j(x, t).
Proof. Set y = t − η in Eqs. (3.19) and (3.20), and define the integral operators T_j acting on the unknowns φ_j; the remaining terms form a known set of functions. We stress that the maps Φ^{−1}_{A_j} are supposed known and are part of the definition of the integral operators that transform the functions φ_j(x, t). By defining also the matrix operator Q that symbolically includes the action of the stochastic matrix, φ_s = Σ_{j=1}^S q_sj (T_j φ_j), Eq. (3.23) reads φ = Qφ. If Q, or some power of it, acts as a contraction in the norm ‖·‖_{1,∞}, then this equation has a unique solution.
We begin by evaluating the norm of the difference of two transformed function samples φ̃, φ̂, where we use q_sj ≥ 0. We see that the continuous map Φ^{−1}_{A_j}(x, t) does not affect the maximum of the difference between φ̃ and φ̂, since ξ ∈ Ω. When the operator Q is applied twice the estimate gains a further factor, and when it is applied n times the following estimate of the norm is obtained, in matrix notation, with M = diag(M_1, . . . , M_S). This shows that when n is large enough the operator Q^n is a contraction, which proves the thesis. Moreover, since ‖q̄‖_1 = 1, the semi-Markov process does not affect the contraction at all.

Normalization check
In this paragraph we verify, in a particular case, that the marginal distributions obtained by solving the integral equation (3.19), with the transformations (3.21) and (3.22), satisfy the total conservation of Eq. (2.8) between the initial time t_0 and t_0 + ε. As an example, we set the initial condition that the particle is known to be at the position x_0 with probability 1, with a memory state concentrated at y = 0, that is:
    F_{0,s}(x, y) = f_s H(x − x_0) δ(y),  Σ_{s=1}^S f_s = 1.   (3.24)
Let t − t_0 = ε ≈ 0; from Eqs. (3.19) and (3.20) we obtain the short-time expansion of φ_s and, by substituting it in the previous expressions and summing the last two equations over all discrete states, we see that if the solutions φ_j are regular enough for x → ∞ and ε → 0, the conservation of the total probability is verified, because Σ_{j=1}^S f_j = 1.

A Practical Application Example
As an example of application, we set the deterministic initial condition (3.24) with f_1 = f_2, so that the integral equation (3.19) takes an explicit Volterra form.
Numerical scheme. This equation can be discretized as in [6,4] and then solved numerically. Briefly, let Π_M := {x_1, . . . , x_M} be a uniform grid of size h on the domain Ω, and let P_j(ξ, η) = Σ_{l=l_1}^{l_2} L_l(ξ) φ_j(x_l, η) be the Lagrange interpolating polynomial approximation of the solutions φ_j. With quadrature weights w_{n,k}, the semi-discrete version of the equation reads
    φ_{s,m,n} = f_{s,m,n} + Σ_{j=1}^S q_sj Σ_k w_{n,k} ψ_j(t_n − t_k) Σ_l L_l(g_j(t_k; x_m, t_n)) φ_{j,l,k},  for n ≥ r,
where f_{s,m,n} is the discretized known term. By ordering all the indexes of φ_{s,m,n}, this equation can be written as a linear system for the unknown vector ϕ_n, which also includes the states of the process. The normalized kernel scheme [4] guarantees the asymptotic convergence of the numerical method, after which the discrete distribution functions can be evaluated by the discrete version of points 7-11 of the Summary.
In the following example we use the trapezoidal method for quadrature and linear interpolation.
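On a scalar model problem, the trapezoidal marching idea reduces to the following sketch (our illustration: the space interpolation and the maps g_j of the actual scheme in [6,4] are omitted, and the test kernel K(t, s) = t − s, with exact solution cosh t, is our choice):

```python
import math

def volterra_trap(g, K, T, n):
    """March phi(t) = g(t) + int_0^t K(t,s) phi(s) ds on [0, T] with the
    trapezoidal rule, solving the implicit diagonal term at each step."""
    h = T / n
    ts = [i * h for i in range(n + 1)]
    phi = [g(ts[0])]
    for i in range(1, n + 1):
        t = ts[i]
        acc = 0.5 * K(t, ts[0]) * phi[0]                       # left endpoint
        acc += sum(K(t, ts[j]) * phi[j] for j in range(1, i))  # interior nodes
        # trapezoidal rule: phi_i = g_i + h*(acc + 0.5*K(t,t)*phi_i)
        phi.append((g(t) + h * acc) / (1.0 - 0.5 * h * K(t, t)))
    return ts, phi
```

The implicit diagonal term is scalar here, so each step costs one division; in the full scheme it becomes the linear system for ϕ_n mentioned above.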
Filtered telegraph signal. We study the relaxation process
    Ẋ(t) = −γ X(t) + W_{S(t)},  W_{S(t)} = ±W,
driven by a two-state semi-Markov process S(t), so that Ω = [−W/γ, W/γ]. From the forward map we see that X(t) ∈ Ω if the initial datum is in the same domain, X_0 ∈ Ω. Since the codomain must equal the domain, for the backward maps, according to Eq. (3.18), we define: g_{1,2}(η; x, t) = min(W/γ, max(−W/γ, Φ*^{−1}_{1,2}(η; x, t))).
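A direct Monte Carlo sketch of this example (our illustration: it assumes exponential sojourns, i.e. the Markov telegraph case with q_12 = q_21 = 1, integrates each deterministic piece exactly, and all parameter names are ours) confirms that the paths remain in Ω = [−W/γ, W/γ]:

```python
import math, random

def filtered_telegraph(x0, s0, t_end, gamma, W, rates, rng):
    """One path of dX/dt = -gamma*X + W_s with W_0 = +W, W_1 = -W and
    exponential sojourn times (Markov switching).  Each deterministic
    piece is integrated exactly:
    X -> W_s/gamma + (X - W_s/gamma) * exp(-gamma*tau)."""
    x, s, t = x0, s0, 0.0
    while t < t_end:
        tau = min(rng.expovariate(rates[s]), t_end - t)
        target = (W if s == 0 else -W) / gamma   # fixed point of the piece
        x = target + (x - target) * math.exp(-gamma * tau)
        t += tau
        s = 1 - s                                # switch state
    return x

rng = random.Random(7)
xs = [filtered_telegraph(0.0, rng.randrange(2), 10.0, 1.0, 1.0, (2.0, 2.0), rng)
      for _ in range(2000)]
```

Since each piece is a convex interpolation toward the fixed point ±W/γ, the samples never leave Ω, in agreement with the restriction used to define g_{1,2}.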

Conclusions
In this paper we studied the modelling of a stochastic process that results from the action of a semi-Markov process on a system of ordinary differential equations. ODEs are the basic mathematical tool for the description of deterministic physical systems based on the material-point approximation. The insertion of a semi-Markov process is a modern and not yet fully explored way to model randomness, also as an alternative to Gaussian-noise-based modelling. Potentially this model has a huge number of practical applications in many fields of applied and theoretical research. We stress that the concept of memory is used in advanced modelling, which finds its natural description in this kind of stochastic processes and related equations. We provided a discussion of this modelling and wrote the basic equation for the statistical analysis, the LME, in the form of a Volterra renewal integral equation. We showed that this form is a valid alternative to the PDE differential formulation, and an existence and uniqueness theorem for the solution was proved. A numerical test on a practical application problem confirmed the validity of the new formulation.