A practical approach for optimal control problems under partial differential equations

Abstract

This paper introduces a novel pseudo-spectral (PS) method tailored for the numerical solution of optimal control problems (OCPs) governed by partial differential equations (PDEs). The proposed technique leverages two-dimensional interpolating polynomials constructed on shifted Legendre-Gauss-Lobatto nodes to approximate the state and control variables. This spectral discretization transforms the original infinite-dimensional control problem into a finite-dimensional nonlinear programming (NLP) formulation, enabling efficient numerical treatment. To solve the resulting NLP, we derive the Karush-Kuhn-Tucker (KKT) optimality conditions in full detail, leading to a system of algebraic equations that encapsulate the necessary conditions for optimality. The Levenberg-Marquardt algorithm, known for its robustness in solving nonlinear algebraic systems, is then employed to solve this system iteratively, yielding an accurate approximation of the optimal solution. The effectiveness and reliability of the proposed method are demonstrated through a series of benchmark numerical examples involving PDE-constrained OCPs. Comparative analyses reveal that our approach not only achieves superior accuracy and computational efficiency compared to existing methods but also offers notable advantages in terms of implementation simplicity and scalability. These features make it a compelling alternative for tackling complex OCPs in various scientific and engineering domains.

Keywords

optimal control problem; Legendre-Gauss-Lobatto points; Karush-Kuhn-Tucker optimality conditions; interpolating polynomials; Levenberg-Marquardt algorithm; MSC: 49K20, 35G31, 90C30

Introduction

Optimal control problems (OCPs) governed by partial differential equations (PDEs) arise in diverse applications such as mechanics, economics, robotics, and aeronautics. Broadly, there are two main classes of methods for solving OCPs: direct methods, which follow a discretize-then-optimize approach, and indirect methods, which follow an optimize-then-discretize philosophy. Direct methods rely on discretization and parameterization, ultimately transforming the problem into a nonlinear programming (NLP) formulation. This class encompasses a wide range of techniques, including quasi-linearization, steepest descent, quasi-Newton approximations, spectral and pseudo-spectral (PS) methods, algorithmic differentiation, finite difference methods (FDMs), finite element methods (FEMs), measure-theoretic approaches, linearization techniques, control parameterization, time-scaling transformations (Huang et al., 2025; Noori Skandari and Tohidi, 2011), and reproducing kernel algorithms (Abu, 2018; Abu and Shawagfeh, 2021). Among these, direct collocation methods are arguably the most powerful for solving general OCPs. In direct collocation, both state and control variables are approximated using specified functional forms (Noori Skandari et al., 2016). Some direct numerical methods, such as FDMs and FEMs, require the construction of computational meshes and typically operate locally. In contrast, spectral methods, being globally defined and continuous, do not require mesh construction (Samadi et al., 2024). Indirect methods, on the other hand, are grounded in the Pontryagin Minimum (or Maximum) Principle (PMP) and the Hamilton–Jacobi–Bellman (HJB) equations. These approaches yield necessary optimality conditions that often lead to boundary or initial value problems. Such problems can be effectively addressed using collocation techniques, as well as spectral and PS methods (Kang et al., 2008; Noori Skandari et al., 2016). 
The divergence in philosophy between direct and indirect methods has led to a methodological dichotomy within the optimal control community. Researchers favoring indirect methods tend to focus on differential equation theory, while those employing direct methods are more concerned with optimization algorithms (Biegler et al., 2003). Direct methods are particularly valued for their high convergence rates and the availability of efficient NLP solvers. Interestingly, the covector mapping principle (CMP) has demonstrated that when PS collocation is employed, the distinction between direct and indirect methods essentially disappears (Bertolazzi and Biral, 2023).

A comprehensive summary of gradient-based methods for solving OCPs was provided in Polak (1973, 553–584). The historical development of OC from 1950 to 1985, including an elegant overview of the calculus of variations dating back to the 1600s, is documented in Bryson (1996, 26–33). In Von Stryk and Bulirsch (1992, 357–373), a concise list of commonly used methods developed prior to the 1990s is presented, emphasizing the effectiveness of combining indirect and direct approaches, referred to as hybrid methods. A brief description of techniques for converting continuous-time OCPs into parameter optimization problems is given in Hull (1997, 57–60). Numerical solutions of OCPs governed by PDEs have been explored using various advanced techniques, including: Legendre polynomials and the Ritz method (Mamehrashi and Yousefi, 2017), Generalized Lagrangian Jacobi-Gauss-Radau (GLJGR) collocation method (Latifi et al., 2020), Shifted Gegenbauer polynomial (ShGP) method (Soufivand et al., 2023), Fractional-order Bernstein polynomials (BPs) method (Ketabdari et al., 2021), and Discrete Krawtchouk polynomials (DKPs) method (Dehestani et al., 2025). Further, Nemati and Yousefi (2017, 1079–1097) studied OCPs using a hybrid Ritz method combined with a fractional operational matrix based on Legendre polynomials. Their approach transforms the OCP into a system of algebraic equations. Additionally, Nemati (2018, 2632–2645) proposed a spectral method using Bernstein polynomials and operational matrices to solve OCPs numerically. Finite element methods (FEMs) were employed in Fuica and Jork (2025), while finite difference methods (FDMs) were used in Zoccolan et al. (2025, 237–260) to solve OCPs under PDE constraints. In Mohammadizadeh et al. (2019, 77–102), an indirect method based on the Chebyshev PS (CPS) technique was proposed for solving OCPs governed by Burgers’ equation. 
The method first reduces the original problem to a system of PDEs with boundary conditions using optimality principles, then approximates control and state functions via interpolating polynomials. Akkouche et al. (2014, 622–631) applied the variational iteration (VI) method, an indirect approach, to solve a quadratic OCP governed by linear PDEs. The method derives necessary conditions using PMP, leading to the Hamilton–Pontryagin equations, which form a multi-point boundary value problem. In Sabeh et al. (2016, 3350–3360) a computational technique based on the Legendre PS (LPS) method was used to solve a distributed OCP for Burgers’ equation. The LPS method transforms the PDE-constrained OCP into a classical OCP governed by ordinary differential equations, which can be solved using either direct or indirect methods. The resulting OCP is then tackled via an indirect method by deriving and numerically solving the first-order optimality conditions.

An alternative classification of numerical methods for solving OCPs governed by PDEs distinguishes between continuous methods, such as collocation techniques (Huang and Peng, 2024; Jiang and Gao, 2024), and discrete methods, including finite difference, Runge-Kutta, and multi-step schemes (Doehring et al., 2024; Erfanifar and Hajarian, 2024). Although continuous methods can produce discrete approximations at selected points, many discrete methods lack the ability to generate globally continuous approximations. This limitation is particularly evident in extrapolation and Runge-Kutta methods, which are less effective for problems requiring smooth, globally differentiable solutions. For such problems, spectral methods, a specialized subclass of collocation techniques, offer a compelling solution. These methods are designed to produce globally smooth approximations using algebraic polynomials. In particular, PS methods (Pirouzeh et al., 2024; Samadi et al., 2024), a refined form of spectral collocation, have proven highly effective for PDE-constrained OCPs. PS methods enforce the differential equations at strategically chosen collocation points within a finite domain, while ensuring that the polynomial approximations match the exact solution at boundary and initial nodes. This approach guarantees that the resulting approximations are not only accurate but also globally continuously differentiable, a critical property for precise control of dynamic systems and physical phenomena. It should be noted that the significance of spectral and PS methods has recently been acknowledged by some researchers in solving other optimal control problems, including delay and fractional problems (Marzban, 2021; Marzban and Hoseini, 2016; Marzban and Manochehri Naeini, 2026; Marzban and Nezami, 2022; Tabrizidooz et al., 2017).

OCPs governed by PDEs are becoming increasingly prevalent in applied sciences, introducing substantial analytical and computational challenges. Addressing these challenges requires the development of novel mathematical models and advanced numerical techniques, particularly PS methods, which have shown great promise in handling complex, high-dimensional systems. This need is especially pressing for problems involving nonlinear PDE systems, which are directly relevant to practical applications in engineering, physics, and biology. A considerable body of research has focused on OCPs involving semi-linear parabolic equations, a fundamental class in PDE theory. These problems often rely on first-order necessary optimality conditions, typically derived using the PMP (Raymond and Zidani, 1998). In addition, several studies have investigated Karush–Kuhn–Tucker (KKT) conditions and even second-order sufficient conditions to enhance solution robustness and theoretical guarantees (Casas et al., 2008). Despite these advances, existing numerical methods for solving such problems frequently encounter limitations in terms of computational cost and convergence reliability, particularly when applied to highly nonlinear PDEs. The complexity of these systems also complicates error analysis, as thoroughly reviewed in Leugering et al. (2012), Tröltzsch (2010), further emphasizing the need for accurate and efficient algorithms. Moreover, OCPs involving wave-type PDE solutions present additional challenges. Examples include the control of spiral waves (Borzi and Griesse, 2006), and wave propagation in cardiac models (investigated by Kunisch and Wagner, 2013). These problems are computationally demanding and often suffer from convergence issues when tackled with iterative optimization techniques. 
Such limitations underscore the urgency of developing robust, scalable, and high-fidelity numerical methods tailored to these complex control scenarios (Casas and Yong, 2023; Mowlavi and Nabi, 2023).

To solve OCPs governed by PDEs using indirect methods, one must first derive the necessary optimality conditions. However, these conditions are typically available only for problems involving specific PDEs, such as the Burgers’ equation addressed in Mohammadizadeh et al. (2019, 77–102). In general, deriving such conditions is analytically intractable and impractical for broader classes of nonlinear PDEs. Consequently, many researchers have turned to direct methods for solving OCPs involving nonlinear PDEs. These approaches begin by discretizing the original problem into a nonlinear programming (NLP) formulation using techniques such as Runge-Kutta discretization. While effective in principle, these methods often require a large number of discretization points, leading to increased computational complexity. Moreover, the resulting NLP problems typically lack closed-form solutions. These challenges stem from two primary sources: (I) inefficient discretization techniques, which fail to accurately capture the dynamics of the original OCP, and (II) weak optimization algorithms, which struggle to solve the resulting NLP problems effectively. In contrast, direct methods that utilize the KKT conditions offer a more tractable framework for solving NLP problems. To overcome the aforementioned difficulties, we propose a three-step approach for solving OCPs governed by nonlinear PDEs. Step 1: Discretize the problem using a high-accuracy method such as the Legendre–Gauss–Lobatto (LGL)-PS method. Step 2: Formulate the KKT necessary optimality conditions, including detailed gradient and derivative matrices. Step 3: Solve the resulting algebraic system using a robust numerical solver, such as the Levenberg-Marquardt algorithm, to approximate the optimal solution. This process yields highly accurate approximations and can lead to near-explicit solutions.
Among various PS techniques, the LGL-PS method stands out for its superior accuracy in solving continuous-time problems involving both ODEs and PDEs (Hairer and Wanner, 1996; Ogundare, 2009; Wright, 1964). Numerical experiments demonstrate that the LGL-PS method outperforms other approaches, including Legendre PS (LPS), Chebyshev PS (CPS), and variational iteration (VI) methods, in terms of both computational efficiency and solution accuracy (Akkouche et al., 2014; Mohammadizadeh et al., 2019; Sabeh et al., 2016). In the present work, we apply the LGL-PS method to an OCP governed by a general class of second-order nonlinear smooth PDEs. The full formulation of the KKT conditions, including gradient and Jacobian matrices, is provided to support reproducibility and further analysis.

We focus on the following class of OCPs governed by PDEs

\begin{align} \text{Minimize} \quad & J (y, u) = \int_{0}^{L} \int_{0}^{T} \left( {(y (t, x) - Φ_{1} (t, x))}^{2} + {(u (t, x) - Φ_{2} (t, x))}^{2} \right) d t \, d x \\ \text{subject to} \end{align}

(1)

Here, $y : [0, T] \times [0, L] \to R$ and $u : [0, T] \times [0, L] \to R$ represent the state and control variables, respectively. The functions $Φ_{1}, Φ_{2} : [0, T] \times [0, L] \to R$ are two given continuously differentiable functions, and $Ψ_{1}, Ψ_{2} : [0, T] \to R$ and $g : [0, L] \to R$ are three given continuous functions. Any piecewise continuous function u(⋅) defined on $[0, T] \times [0, L]$ is deemed an admissible control for this problem. The aim is to find the state y(t, x) and an admissible control u(t, x) on $[0, T] \times [0, L]$ such that the objective functional (1) is minimized. In problem (1)-(5), $J (y, u)$ is the performance index, the PDE is given by equation (2), and equations (3)–(5) specify the initial and boundary conditions.

This article is structured as follows. In the section on the LGL-PS method, we apply the LGL-PS method to problem (1)-(5) and obtain an NLP problem. In the section on optimality conditions, we write the KKT optimality conditions for the NLP problem, arrive at a system of algebraic equations, and solve this system using the Levenberg–Marquardt method. In the numerical examples section, we present several test problems to show the superiority and effectiveness of the method for solving OCPs governed by PDEs.

LGL-PS method to approximate the solutions

In this paper, we present an approach for numerically solving problem (1)-(5). Suppose $z_{0} < z_{1} < \dots < z_{N}$ are the LGL points on [−1, 1], which are the roots of $(z^{2} - 1) {\dot{P}}_{N} (z)$ , where ${\dot{P}}_{N} (\cdot)$ is the derivative of the Legendre polynomial $P_{N} (\cdot)$ of degree $N$ . The Legendre polynomials can be defined by the following recurrence relation

P_{N + 1} (z) = \frac{2 N + 1}{N + 1} z P_{N} (z) - \frac{N}{N + 1} P_{N - 1} (z), N \geq 1,

where $P_{0} (z) = 1$ and $P_{1} (z) = z$ for z ∈ [−1, 1]. The shifted LGL points on [0, T] and [0, L] are defined by the following transformation

t_{j} = \frac{T}{2} (z_{j} + 1), x_{j} = \frac{L}{2} (z_{j} + 1), j = 0,1, \dots, N,

(6)

where $\{z_{j}\}_{j = 0}^{N}$ are the LGL points on [−1, 1]. In the suggested approach, we first approximate the state and control variables, along with their derivatives, on $[0, T] \times [0, L]$ as follows

where $L_{i} (\cdot)$ and ${\bar{L}}_{j} (\cdot)$ represent the Lagrange polynomials of degree $N$ , defined as

L_{i} (t) = \prod_{\substack{l = 0 \\ l \neq i}}^{N} \frac{t - t_{l}}{t_{i} - t_{l}}, \quad {\bar{L}}_{j} (x) = \prod_{\substack{l = 0 \\ l \neq j}}^{N} \frac{x - x_{l}}{x_{j} - x_{l}}, \quad i, j = 0,1, \dots, N, \quad (t, x) \in [0, T] \times [0, L] .
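The node and basis definitions above can be sketched numerically. The following is a minimal illustration, assuming NumPy's Legendre utilities are used to find the roots of the derivative of $P_{N}$; the function names (`lgl_nodes`, `shifted_nodes`, `lagrange_basis`) are hypothetical and not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def lgl_nodes(N):
    """LGL points on [-1, 1]: the endpoints plus the roots of P_N'(z)."""
    c = np.zeros(N + 1)
    c[N] = 1.0                                  # Legendre-series coefficients of P_N
    interior = leg.legroots(leg.legder(c))      # roots of the derivative of P_N
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))


def shifted_nodes(N, length):
    """Map the LGL points to [0, length] as in transformation (6)."""
    return 0.5 * length * (lgl_nodes(N) + 1.0)


def lagrange_basis(nodes, i, t):
    """Evaluate the i-th Lagrange polynomial built on `nodes` at the point t."""
    value = 1.0
    for l, t_l in enumerate(nodes):
        if l != i:
            value *= (t - t_l) / (nodes[i] - t_l)
    return value


N, T = 4, 2.0
t = shifted_nodes(N, T)
# Kronecker delta property: L_i(t_k) equals 1 if i == k and 0 otherwise.
delta = np.array([[lagrange_basis(t, i, t_k) for t_k in t] for i in range(N + 1)])
```

Here `delta` should be numerically close to the identity matrix, which is the property exploited when the approximations are evaluated at the collocation points.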

Now, we can discretize the approximations (7)–(9) at the collocation points ${(t_{k}, x_{p})}_{k, p = 0}^{N}$ and use the Kronecker delta property of the Lagrange polynomials. We get the following relations for $k, p = 0,1, \dots, N$

where $D = (D_{i j})$ , $\bar{D} = ({\bar{D}}_{i j})$ , $D^{(2)} = (D_{i j}^{(2)}) = D \times D$ and ${\bar{D}}^{(2)} = ({\bar{D}}_{i j}^{(2)}) = \bar{D} \times \bar{D}$ represent the derivative matrices and

D_{i j} = \begin{cases} \frac{2}{T} \cdot \frac{- N (N + 1)}{4}, & i = j = 0, \\ \frac{2}{T} \cdot \frac{N (N + 1)}{4}, & i = j = N, \\ \frac{{\hat{P}}_{N} (t_{i})}{{\hat{P}}_{N} (t_{j})} \cdot \frac{1}{t_{i} - t_{j}}, & i \neq j, \\ 0, & \text{otherwise}, \end{cases}

{\bar{D}}_{i j} = \begin{cases} \frac{2}{L} \cdot \frac{- N (N + 1)}{4}, & i = j = 0, \\ \frac{2}{L} \cdot \frac{N (N + 1)}{4}, & i = j = N, \\ \frac{{\hat{P}}_{N} (x_{i})}{{\hat{P}}_{N} (x_{j})} \cdot \frac{1}{x_{i} - x_{j}}, & i \neq j, \\ 0, & \text{otherwise}, \end{cases}

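where ${\hat{P}}_{N}$ denotes the Legendre polynomial shifted to the corresponding interval, so that ${\hat{P}}_{N} (t_{k}) = P_{N} (\frac{2 t_{k}}{T} - 1) = P_{N} (z_{k})$ for $k = 0,1, \dots, N$ , and ${\hat{P}}_{N} (x_{p}) = P_{N} (\frac{2 x_{p}}{L} - 1) = P_{N} (z_{p})$ for $p = 0,1, \dots, N$ .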
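As a sanity check of the derivative matrices, the sketch below builds the first-order matrix on [0, length] and applies it to a polynomial. It assumes the standard LGL differentiation matrix on [−1, 1] scaled by 2/length, with $P_{N}$ evaluated at the reference nodes $z_{k}$; the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def lgl_nodes(N):
    """LGL points on [-1, 1]: the endpoints plus the roots of P_N'(z)."""
    c = np.zeros(N + 1)
    c[N] = 1.0
    interior = leg.legroots(leg.legder(c))
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))


def lgl_diff_matrix(N, length):
    """First-order differentiation matrix on the shifted LGL points of [0, length]."""
    z = lgl_nodes(N)
    c = np.zeros(N + 1)
    c[N] = 1.0
    PN = leg.legval(z, c)                       # P_N(z_k) at the reference nodes
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = PN[i] / PN[j] / (z[i] - z[j])
    D[0, 0] = -N * (N + 1) / 4.0                # corner entries; other diagonal entries stay 0
    D[N, N] = N * (N + 1) / 4.0
    return (2.0 / length) * D                   # chain rule for the map [-1, 1] -> [0, length]


N, T = 5, 2.0
t = 0.5 * T * (lgl_nodes(N) + 1.0)
D = lgl_diff_matrix(N, T)
first = D @ t**3          # approximates 3 t^2 at the nodes
second = D @ D @ t**3     # D^(2) = D x D approximates 6 t at the nodes
```

Because the interpolant of a polynomial of degree at most N is the polynomial itself, both `first` and `second` should match the exact derivatives up to rounding error.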

Discretizing the OCP (1)-(5) according to the above procedure yields the following NLP problem

\begin{align} \text{Minimize} \quad & J^{N} (\bar{y}, \bar{u}) = \sum_{k = 0}^{N} \sum_{p = 0}^{N} w_{k} \cdot w_{p} \left[ {({\bar{y}}_{k p} - Φ_{1} (t_{k}, x_{p}))}^{2} + {({\bar{u}}_{k p} - Φ_{2} (t_{k}, x_{p}))}^{2} \right] \\ \text{subject to} \end{align}

(15)
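The discrete objective can be evaluated as a weighted double sum. The sketch below assumes that $w_{k}$ and $w_{p}$ are the standard LGL quadrature weights $2 / (N (N + 1) P_{N} (z_{k})^{2})$ scaled by half the interval length; the text does not state the weights explicitly, so this choice and the function names are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def lgl_nodes(N):
    """LGL points on [-1, 1]: the endpoints plus the roots of P_N'(z)."""
    c = np.zeros(N + 1)
    c[N] = 1.0
    interior = leg.legroots(leg.legder(c))
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))


def lgl_weights(N, length):
    """Assumed LGL quadrature weights on [0, length]."""
    z = lgl_nodes(N)
    c = np.zeros(N + 1)
    c[N] = 1.0
    PN = leg.legval(z, c)
    return 0.5 * length * 2.0 / (N * (N + 1) * PN**2)


def discrete_objective(Y, U, Phi1, Phi2, wt, wx):
    """J^N for arrays indexed as [k, p], i.e. (time node, space node)."""
    return np.sum(np.outer(wt, wx) * ((Y - Phi1) ** 2 + (U - Phi2) ** 2))


N, T, L = 3, 1.0, 2.0
t = 0.5 * T * (lgl_nodes(N) + 1.0)
x = 0.5 * L * (lgl_nodes(N) + 1.0)
wt, wx = lgl_weights(N, T), lgl_weights(N, L)
Y = t[:, None] + x[None, :]                     # y(t, x) = t + x
zero = np.zeros_like(Y)
J = discrete_objective(Y, zero, zero, zero, wt, wx)
```

For this polynomial integrand the quadrature is exact, so `J` should agree with the double integral of $(t + x)^{2}$ over $[0, 1] \times [0, 2]$, which equals 16/3.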

Optimality conditions for gained NLP problem

We point out that most direct methods for solving NLP problems do not lead to an explicit and accurate solution. It is therefore essential to apply the optimality conditions, that is, the KKT conditions. To derive the KKT conditions, we first denote the constraints (16)-(19) by $G_{1}^{k p}, G_{2}^{k}, G_{3}^{k}$ , and $G_{4}^{p}$ for $k, p = 0,1,2, \dots, N$ , respectively, and define the Lagrange function as

L = J^{N} + λ_{1} G_{1} + λ_{2} G_{2} + λ_{3} G_{3} + λ_{4} G_{4}

(20)

The necessary condition for optimality is

\nabla L = \nabla J^{N} + λ_{1} \nabla G_{1} + λ_{2} \nabla G_{2} + λ_{3} \nabla G_{3} + λ_{4} \nabla G_{4} = 0,

(21)

which can be written as

Since the constraints G₂, G₃, and G₄ do not involve the variables u_kp, the above system of equations becomes:

We denote the constraints at the collocation points (t_i, x_j) by $G_{1}^{i j}, G_{2}^{i}, G_{3}^{i}$ , and $G_{4}^{j}$ . Thus, constraint G₁ yields ${(N + 1)}^{2}$ equations, each of which is denoted by $G_{1}^{i j}$ . As a result, we have:

\nabla_{y} G_{1}^{i j} = {(\frac{\partial G_{1}^{i j}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}}, \nabla_{u} G_{1}^{i j} = {(\frac{\partial G_{1}^{i j}}{\partial u_{k p}})}_{1 \times {(N + 1)}^{2}},

\nabla_{y} G_{1} = {[\frac{\partial G_{1}^{i j}}{\partial y_{k p}}]}_{{(N + 1)}^{2} \times {(N + 1)}^{2}}, \nabla_{u} G_{1} = {[\frac{\partial G_{1}^{i j}}{\partial u_{k p}}]}_{{(N + 1)}^{2} \times {(N + 1)}^{2}},

More explicitly, we can write

\nabla_{y} G_{1}^{i j} = {(\frac{\partial G_{1}^{i j}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}} = (\frac{\partial G_{1}^{i j}}{\partial y_{00}}, \frac{\partial G_{1}^{i j}}{\partial y_{01}}, \dots, \frac{\partial G_{1}^{i j}}{\partial y_{0 N}}, \frac{\partial G_{1}^{i j}}{\partial y_{10}}, \frac{\partial G_{1}^{i j}}{\partial y_{11}}, \dots, \frac{\partial G_{1}^{i j}}{\partial y_{1 N}}, \dots, \frac{\partial G_{1}^{i j}}{\partial y_{N 0}}, \dots, \frac{\partial G_{1}^{i j}}{\partial y_{N N}}),

\nabla_{y} G_{1} = {[\frac{\partial G_{1}^{i j}}{\partial y_{k p}}]}_{{(N + 1)}^{2} \times {(N + 1)}^{2}} = [\begin{matrix} \frac{\partial G_{1}^{00}}{\partial y_{00}} & \frac{\partial G_{1}^{00}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{00}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{00}}{\partial y_{N 0}} & \frac{\partial G_{1}^{00}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{00}}{\partial y_{N N}} \\ \frac{\partial G_{1}^{01}}{\partial y_{00}} & \frac{\partial G_{1}^{01}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{01}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{01}}{\partial y_{N 0}} & \frac{\partial G_{1}^{01}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{01}}{\partial y_{N N}} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial G_{1}^{0 N}}{\partial y_{00}} & \frac{\partial G_{1}^{0 N}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{0 N}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{0 N}}{\partial y_{N 0}} & \frac{\partial G_{1}^{0 N}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{0 N}}{\partial y_{N N}} \\ \frac{\partial G_{1}^{10}}{\partial y_{00}} & \frac{\partial G_{1}^{10}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{10}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{10}}{\partial y_{N 0}} & \frac{\partial G_{1}^{10}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{10}}{\partial y_{N N}} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial G_{1}^{1 N}}{\partial y_{00}} & \frac{\partial G_{1}^{1 N}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{1 N}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{1 N}}{\partial y_{N 0}} & \frac{\partial G_{1}^{1 N}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{1 N}}{\partial y_{N N}} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial G_{1}^{N 0}}{\partial y_{00}} & \frac{\partial G_{1}^{N 0}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{N 0}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{N 0}}{\partial y_{N 0}} & \frac{\partial 
G_{1}^{N 0}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{N 0}}{\partial y_{N N}} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial G_{1}^{N N}}{\partial y_{00}} & \frac{\partial G_{1}^{N N}}{\partial y_{01}} & \dots & \frac{\partial G_{1}^{N N}}{\partial y_{0 N}} & \dots & \frac{\partial G_{1}^{N N}}{\partial y_{N 0}} & \frac{\partial G_{1}^{N N}}{\partial y_{N 1}} & \dots & \frac{\partial G_{1}^{N N}}{\partial y_{N N}} \end{matrix}] .

The remaining cases are handled analogously. Moreover, each of the constraints G₂, G₃, and G₄ comprises $(N + 1)$ equations, so we have:

\begin{array}{l} \nabla_{y} G_{2}^{i} = {(\frac{\partial G_{2}^{i}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}}, \nabla_{y} G_{3}^{i} = {(\frac{\partial G_{3}^{i}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}}, \nabla_{y} G_{4}^{i} = {(\frac{\partial G_{4}^{i}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}}, i = 0,1, \dots, N, \\ \nabla_{y} G_{2} = {[\frac{\partial G_{2}^{i}}{\partial y_{k p}}]}_{(N + 1) \times {(N + 1)}^{2}}, \nabla_{y} G_{3} = {[\frac{\partial G_{3}^{i}}{\partial y_{k p}}]}_{(N + 1) \times {(N + 1)}^{2}}, \nabla_{y} G_{4} = {[\frac{\partial G_{4}^{i}}{\partial y_{k p}}]}_{(N + 1) \times {(N + 1)}^{2}}, \end{array}

More explicitly, we can write

\nabla_{y} G_{2}^{i} = {(\frac{\partial G_{2}^{i}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}} = (\frac{\partial G_{2}^{i}}{\partial y_{00}}, \frac{\partial G_{2}^{i}}{\partial y_{01}}, \dots, \frac{\partial G_{2}^{i}}{\partial y_{0 N}}, \frac{\partial G_{2}^{i}}{\partial y_{10}}, \frac{\partial G_{2}^{i}}{\partial y_{11}}, \dots, \frac{\partial G_{2}^{i}}{\partial y_{1 N}}, \dots, \frac{\partial G_{2}^{i}}{\partial y_{N 0}}, \dots, \frac{\partial G_{2}^{i}}{\partial y_{N N}})

\nabla_{y} G_{2} = {[\frac{\partial G_{2}^{i}}{\partial y_{k p}}]}_{(N + 1) \times {(N + 1)}^{2}} = [\begin{matrix} \frac{\partial G_{2}^{0}}{\partial y_{00}} & \frac{\partial G_{2}^{0}}{\partial y_{01}} & \dots & \frac{\partial G_{2}^{0}}{\partial y_{0 N}} & \dots & \frac{\partial G_{2}^{0}}{\partial y_{N 0}} & \frac{\partial G_{2}^{0}}{\partial y_{N 1}} & \dots & \frac{\partial G_{2}^{0}}{\partial y_{N N}} \\ \frac{\partial G_{2}^{1}}{\partial y_{00}} & \frac{\partial G_{2}^{1}}{\partial y_{01}} & \dots & \frac{\partial G_{2}^{1}}{\partial y_{0 N}} & \dots & \frac{\partial G_{2}^{1}}{\partial y_{N 0}} & \frac{\partial G_{2}^{1}}{\partial y_{N 1}} & \dots & \frac{\partial G_{2}^{1}}{\partial y_{N N}} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ \frac{\partial G_{2}^{N - 1}}{\partial y_{00}} & \frac{\partial G_{2}^{N - 1}}{\partial y_{01}} & \dots & \frac{\partial G_{2}^{N - 1}}{\partial y_{0 N}} & \dots & \frac{\partial G_{2}^{N - 1}}{\partial y_{N 0}} & \frac{\partial G_{2}^{N - 1}}{\partial y_{N 1}} & \dots & \frac{\partial G_{2}^{N - 1}}{\partial y_{N N}} \\ \frac{\partial G_{2}^{N}}{\partial y_{00}} & \frac{\partial G_{2}^{N}}{\partial y_{01}} & \dots & \frac{\partial G_{2}^{N}}{\partial y_{0 N}} & \dots & \frac{\partial G_{2}^{N}}{\partial y_{N 0}} & \frac{\partial G_{2}^{N}}{\partial y_{N 1}} & \dots & \frac{\partial G_{2}^{N}}{\partial y_{N N}} \end{matrix}]

The remaining cases are handled analogously. Now, regarding the objective function, we have $J^{N} = J^{N} (\bar{y}, \bar{u})$ where $J^{N} : R^{{(N + 1)}^{2}} \times R^{{(N + 1)}^{2}} \to R,$ and

\frac{\partial J^{N}}{\partial y_{k p}} = \frac{\partial J^{N} (\bar{y}, \bar{u})}{\partial y} ∣_{(t_{k}, x_{p})} = 2 w_{k} \cdot w_{p} ({\bar{y}}_{k p} - Φ_{1} (t_{k}, x_{p})), \frac{\partial J^{N}}{\partial u_{k p}} = \frac{\partial J^{N} (\bar{y}, \bar{u})}{\partial u} ∣_{(t_{k}, x_{p})} = 2 w_{k} \cdot w_{p} ({\bar{u}}_{k p} - Φ_{2} (t_{k}, x_{p}))

(26)

\nabla_{\bar{y}} J^{N} = \frac{\partial J^{N}}{\partial \bar{y}} = {(\frac{\partial J^{N}}{\partial y_{k p}})}_{1 \times {(N + 1)}^{2}}, \nabla_{\bar{u}} J^{N} = \frac{\partial J^{N}}{\partial \bar{u}} = {(\frac{\partial J^{N}}{\partial u_{k p}})}_{1 \times {(N + 1)}^{2}}

(27)

Consequently, the system of equations (15)–(19) is rewritten as follows

For this system of equations, the following results hold:

(A) According to the relationship of (16), we have

\frac{\partial G_{2}^{i}}{\partial y_{k p}} = \begin{cases} 1, & k = i, \ p = 0, \\ 0, & \text{otherwise} . \end{cases}

(30)

That is, as (k, p) varies, the result is a vector whose only non-zero element is the one at (k = i, p = 0), namely the component with index $i (N + 1)$ when the components are counted from zero in the order $y_{00}, y_{01}, \dots, y_{N N}$ .

(B) According to (17), we have

\frac{\partial G_{3}^{i}}{\partial y_{k p}} = \begin{cases} 1, & k = i, \ p = N, \\ 0, & \text{otherwise} . \end{cases}

(31)

As (k, p) varies, the result is a vector whose only non-zero element is the one at (k = i, p = N), namely the component with index $i (N + 1) + N$ when the components are counted from zero in the order $y_{00}, y_{01}, \dots, y_{N N}$ .

(C) Similarly, for the constraint $G_{4}$ , we have

\frac{\partial G_{4}^{i}}{\partial y_{k p}} = \begin{cases} 1, & k = 0, \ p = i, \\ 0, & \text{otherwise} . \end{cases}

(32)

As (k, p) varies, the result is a vector whose only non-zero element is the one at (k = 0, p = i), namely the component with index $i$ when the components are counted from zero in the order $y_{00}, y_{01}, \dots, y_{N N}$ .
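The unit-vector structure of these gradients can be made concrete with a small sketch. It assumes the unknowns are flattened row by row in the order $y_{00}, y_{01}, \dots, y_{N N}$ and indexed from zero, so that $y_{k p}$ sits at position $k (N + 1) + p$; the function names are hypothetical.

```python
import numpy as np


def flat_index(k, p, N):
    """Zero-based position of y_{kp} in the vector (y_00, y_01, ..., y_NN)."""
    return k * (N + 1) + p


def boundary_gradients(N):
    """Gradients of G2, G3, G4 with respect to y, one row per constraint."""
    n = (N + 1) ** 2
    grad_G2 = np.zeros((N + 1, n))
    grad_G3 = np.zeros((N + 1, n))
    grad_G4 = np.zeros((N + 1, n))
    for i in range(N + 1):
        grad_G2[i, flat_index(i, 0, N)] = 1.0   # y_{i0} - Psi_1(t_i)
        grad_G3[i, flat_index(i, N, N)] = 1.0   # y_{iN} - Psi_2(t_i)
        grad_G4[i, flat_index(0, i, N)] = 1.0   # y_{0i} - g(x_i)
    return grad_G2, grad_G3, grad_G4


N = 2
Y = np.arange((N + 1) ** 2, dtype=float).reshape(N + 1, N + 1)
g2, g3, g4 = boundary_gradients(N)
left = g2 @ Y.ravel()      # picks out the column p = 0
right = g3 @ Y.ravel()     # picks out the column p = N
initial = g4 @ Y.ravel()   # picks out the row k = 0
```

Multiplying each gradient matrix by the flattened state recovers exactly the boundary or initial values that the corresponding constraint acts on.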

(D) To solve the system of gradient equations (28) and (29), we still need $\frac{\partial G_{1}^{i j}}{\partial u_{k p}}$ and $\frac{\partial G_{1}^{i j}}{\partial y_{k p}}$ ; all other components are explicitly defined above. Assume that the function $F$ in equation (15) takes the form:

F (t, x, y (t, x), u (t, x), \frac{\partial y}{\partial x} (t, x), \frac{\partial^{2} y}{\partial x^{2}} (t, x), \frac{\partial^{2} y}{\partial t^{2}} (t, x)) = F^{*} (t, x, y (t, x), \frac{\partial y}{\partial x} (t, x), \frac{\partial^{2} y}{\partial x^{2}} (t, x), \frac{\partial^{2} y}{\partial t^{2}} (t, x)) + u (t, x) .

(33)

In (15), we have

\frac{\partial G_{1}^{i j}}{\partial u_{k p}} = \begin{cases} - 1, & k = i, \ p = j, \\ 0, & \text{otherwise} . \end{cases}

(34)

As (k, p) varies, the result is a vector whose only non-zero element is the one at (k = i, p = j), namely the component with index $i (N + 1) + j$ when the components are counted from zero in the order $u_{00}, u_{01}, \dots, u_{N N}$ .

Below, we present equation (15) for the following special OCPs:

(I) OCPs under linear equation: we consider

F (t, x, y (t, x), u (t, x), \frac{\partial y}{\partial x} (t, x), \frac{\partial^{2} y}{\partial x^{2}} (t, x), \frac{\partial^{2} y}{\partial t^{2}} (t, x)) = \frac{\partial^{2} y}{\partial x^{2}} (t, x) + \frac{\partial y}{\partial x} (t, x) + u (t, x) + f (t, x) .

(35)

Now, for convenience, we rewrite the vector ${(\partial G_{1}^{i j} / \partial y_{k p})}_{1 \times {(N + 1)}^{2}}$ as the matrix ${[\partial G_{1}^{i j} / \partial y_{k p}]}_{(N + 1) \times (N + 1)}$ , in which i and j are fixed, k indexes the rows, and p indexes the columns. All the components of this matrix are zero, except for those in the i-th row and the j-th column. Hence, this matrix has one non-zero row and one non-zero column and can be calculated as

[\begin{matrix} 0 & 0 & \dots & D_{i 0} & \dots & 0 \\ 0 & 0 & \dots & D_{i 1} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ - {\bar{D}}_{j 0}^{(2)} - {\bar{D}}_{j 0} & - {\bar{D}}_{j 1}^{(2)} - {\bar{D}}_{j 1} & \dots & D_{i i} - {\bar{D}}_{j j}^{(2)} - {\bar{D}}_{j j} & \dots & - {\bar{D}}_{j N}^{(2)} - {\bar{D}}_{j N} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & D_{i N} & \dots & 0 \end{matrix}]

(36)

(II) OCPs under diffusion equation: We consider

F (t, x, y (t, x), u (t, x), \frac{\partial y}{\partial x} (t, x), \frac{\partial^{2} y}{\partial x^{2}} (t, x), \frac{\partial^{2} y}{\partial t^{2}} (t, x)) = \frac{\partial^{2} y}{\partial x^{2}} (t, x) + u (t, x) + f (t, x) .

(37)

Thus, we have formulated an OCP involving the diffusion equation. We will examine the matrix ${[\partial G_{1}^{i j} / \partial y_{k p}]}_{(N + 1) \times (N + 1)}$ in which i and j are constant and k and p change. All the components of this matrix are zero, except for the components of the i-th row and the j-th column. This matrix is as follows

[\begin{matrix} 0 & 0 & \dots & D_{i 0} & \dots & 0 \\ 0 & 0 & \dots & D_{i 1} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ - {\bar{D}}_{j 0}^{(2)} & - {\bar{D}}_{j 1}^{(2)} & \dots & D_{i i} - {\bar{D}}_{j j}^{(2)} & \dots & - {\bar{D}}_{j N}^{(2)} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & D_{i N} & \dots & 0 \end{matrix}]

(38)

(III) OCPs under Burgers’ equation: we consider

F (t, x, y (t, x), u (t, x), \frac{\partial y}{\partial x} (t, x), \frac{\partial^{2} y}{\partial x^{2}} (t, x), \frac{\partial^{2} y}{\partial t^{2}} (t, x)) = - \frac{\partial y}{\partial x} (t, x) \cdot y (t, x) + ν \frac{\partial^{2} y}{\partial x^{2}} (t, x) + u (t, x),

(39)

where ν is a given constant. With this, we establish an OCP within the framework of Burgers’ equation. We will examine the matrix ${[\partial G_{1}^{i j} / \partial y_{k p}]}_{(N + 1) \times (N + 1)}$ , where i and j are fixed while k and p vary. All components of this matrix are zero, except for those in the i-th row and the j-th column. This matrix is given by

[\begin{matrix} 0 & 0 & \dots & D_{i 0} & \dots & 0 \\ 0 & 0 & \dots & D_{i 1} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ - ν {\bar{D}}_{j 0}^{(2)} + {\bar{y}}_{i j} {\bar{D}}_{j 0} & - ν {\bar{D}}_{j 1}^{(2)} + {\bar{y}}_{i j} {\bar{D}}_{j 1} & \dots & D_{i i} - ν {\bar{D}}_{j j}^{(2)} + {\bar{y}}_{i j} {\bar{D}}_{j j} + \sum_{q = 0}^{N} {\bar{y}}_{i q} {\bar{D}}_{j q} & \dots & - ν {\bar{D}}_{j N}^{(2)} + {\bar{y}}_{i j} {\bar{D}}_{j N} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & D_{i N} & \dots & 0 \end{matrix}]

(40)
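The sparsity pattern of matrix (40) can be checked against finite differences. The sketch below assumes the discretized Burgers constraint has the form $G_{1}^{i j} = \sum_{k} D_{i k} {\bar{y}}_{k j} + {\bar{y}}_{i j} \sum_{q} {\bar{D}}_{j q} {\bar{y}}_{i q} - ν \sum_{q} {\bar{D}}_{j q}^{(2)} {\bar{y}}_{i q} - {\bar{u}}_{i j}$; random matrices stand in for $D$ and $\bar{D}$ because the identity does not depend on their values, and all function names are hypothetical.

```python
import numpy as np


def burgers_residual(Y, U, D, Dbar, nu):
    """G1 at all collocation points; arrays are indexed as [i, j]."""
    Dbar2 = Dbar @ Dbar
    return D @ Y + Y * (Y @ Dbar.T) - nu * (Y @ Dbar2.T) - U


def burgers_block(Y, D, Dbar, nu, i, j):
    """Analytic matrix [dG1^{ij}/dy_{kp}], rows indexed by k and columns by p."""
    n = Y.shape[0]
    Dbar2 = Dbar @ Dbar
    B = np.zeros((n, n))
    B[:, j] += D[i, :]                                   # column j holds D_{ik}
    B[i, :] += Y[i, j] * Dbar[j, :] - nu * Dbar2[j, :]   # row i holds the spatial terms
    B[i, j] += Y[i, :] @ Dbar[j, :]                      # extra product-rule term at (i, j)
    return B


def fd_block(Y, U, D, Dbar, nu, i, j, h=1e-6):
    """Central finite-difference approximation of the same matrix."""
    n = Y.shape[0]
    B = np.zeros((n, n))
    for k in range(n):
        for p in range(n):
            E = np.zeros_like(Y)
            E[k, p] = h
            plus = burgers_residual(Y + E, U, D, Dbar, nu)[i, j]
            minus = burgers_residual(Y - E, U, D, Dbar, nu)[i, j]
            B[k, p] = (plus - minus) / (2.0 * h)
    return B


rng = np.random.default_rng(0)
n, nu = 4, 0.1
Y, U, D, Dbar = (rng.standard_normal((n, n)) for _ in range(4))
analytic = burgers_block(Y, D, Dbar, nu, 1, 2)
numeric = fd_block(Y, U, D, Dbar, nu, 1, 2)
```

Since the residual is quadratic in the state values, the central difference is exact up to rounding, so `analytic` and `numeric` should agree closely, and all entries outside row i and column j should vanish.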

Using the matrices and vectors calculated above, we obtain the following system of algebraic equations

\begin{cases} \frac{\partial J^{N}}{\partial y_{k p}} + \sum_{i = 0}^{N} \sum_{j = 0}^{N} λ_{1}^{i j} \frac{\partial G_{1}^{i j}}{\partial y_{k p}} + \sum_{i = 0}^{N} λ_{2}^{i} \frac{\partial G_{2}^{i}}{\partial y_{k p}} + \sum_{i = 0}^{N} λ_{3}^{i} \frac{\partial G_{3}^{i}}{\partial y_{k p}} + \sum_{i = 0}^{N} λ_{4}^{i} \frac{\partial G_{4}^{i}}{\partial y_{k p}} = 0, & k, p = 0,1,2, \dots, N, \\ \frac{\partial J^{N}}{\partial u_{k p}} + \sum_{i = 0}^{N} \sum_{j = 0}^{N} λ_{1}^{i j} \frac{\partial G_{1}^{i j}}{\partial u_{k p}} = 0, & k, p = 0,1,2, \dots, N, \\ \sum_{i = 0}^{N} {\bar{y}}_{i p} D_{k i} - F (t_{k}, x_{p}, {\bar{y}}_{k p}, {\bar{u}}_{k p}, \sum_{j = 0}^{N} {\bar{y}}_{k j} {\bar{D}}_{p j}, \sum_{j = 0}^{N} {\bar{y}}_{k j} {\bar{D}}_{p j}^{(2)}, \sum_{i = 0}^{N} {\bar{y}}_{i p} D_{k i}^{(2)}) = 0, & k, p = 0,1,2, \dots, N, \\ {\bar{y}}_{k 0} - Ψ_{1} (t_{k}) = 0, & k = 0,1,2, \dots, N, \\ {\bar{y}}_{k N} - Ψ_{2} (t_{k}) = 0, & k = 0,1,2, \dots, N, \\ {\bar{y}}_{0 p} - g (x_{p}) = 0, & p = 0,1,2, \dots, N . \end{cases}

(41)

To count the equations, we first consider the gradient equations (28) and (29): there are ${(N + 1)}^{2}$ collocation points (t_k, x_p), which gives $2 {(N + 1)}^{2}$ equations. Next, we consider the feasibility conditions: constraint (15) comprises ${(N + 1)}^{2}$ equations, while constraints (16)–(18) each contain $(N + 1)$ equations. Therefore, a total of $3 {(N + 1)}^{2} + 3 (N + 1)$ equations are available. To count the unknowns, note that each of y_kp, u_kp, and $λ_{1}^{i j}$ contributes ${(N + 1)}^{2}$ variables, and each of $λ_{2}^{i}$ , $λ_{3}^{i}$ , and $λ_{4}^{i}$ contributes $(N + 1)$ variables. Thus, there are $3 {(N + 1)}^{2} + 3 (N + 1)$ variables in total. Consequently, the number of equations equals the number of unknowns.
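The count can be reproduced for any N with a short Python sketch. The function name `kkt_system_size` is hypothetical; the arithmetic follows the paragraph above.

```python
def kkt_system_size(N):
    """Return (number of equations, number of unknowns) of the KKT system."""
    n = N + 1
    gradient_eqs = 2 * n**2          # gradient equations (28) and (29)
    pde_eqs = n**2                   # collocated PDE constraint (15)
    side_eqs = 3 * n                 # boundary and initial conditions (16)-(18)
    equations = gradient_eqs + pde_eqs + side_eqs
    unknowns = 3 * n**2 + 3 * n      # y, u, lambda_1 plus lambda_2, lambda_3, lambda_4
    return equations, unknowns

equations, unknowns = kkt_system_size(10)
print(equations, unknowns)  # prints: 396 396
```

For N = 10 the system is therefore square with 396 equations and 396 unknowns.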

The Levenberg–Marquardt algorithm effectively tackles systems of nonlinear algebraic equations. While often associated with curve fitting, it extends well to solving such nonlinear systems. By modifying the Gauss–Newton method, it addresses challenges posed by ill-conditioned problems or poor initial guesses, using a damping parameter to balance between the Gauss–Newton method (for rapid convergence near the solution) and gradient descent (for stability far from the solution). For clarification, consider the system H(X) = 0, defined as:

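H (X) = (H_{1} (X), H_{2} (X), \dots, H_{\mathcal{N}} (X)), X = (x_{1}, x_{2}, \dots, x_{\mathcal{N}}) = (({\bar{y}}_{i j}, {\bar{u}}_{i j}, λ_{1}^{i j}, λ_{2}^{i}, λ_{3}^{i}, λ_{4}^{i}) : i, j = 0,1,2, \dots, N)

where $\mathcal{N} = 3 {(N + 1)}^{2} + 3 (N + 1)$ is the number of equations and unknowns.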

The algorithm begins with an initial guess X⁽⁰⁾ for the vector X. The Jacobian matrix J(X), containing partial derivatives $J_{i j} = \frac{\partial H_{i}}{\partial x_{j}}$ , is then computed. The solution vector X is updated iteratively using:

X^{(r + 1)} = X^{(r)} - {[J {(X^{(r)})}^{T} J (X^{(r)}) + μ I]}^{- 1} J {(X^{(r)})}^{T} H (X^{(r)}), r = 0,1,2, \dots

Here, μ is the damping parameter, dynamically adjusted at each iteration and I is the identity matrix. This approach combines the fast convergence of the Gauss–Newton method with the stability of gradient descent. Convergence is assessed by the norm of the residual vector H(X^(r+1)), and the algorithm stops when this norm falls below a predefined threshold, indicating a solution.
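The iteration above can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the implementation used for the reported results: the rule for adjusting μ (divide by 10 after a step that reduces the residual, multiply by 10 otherwise), the tolerance, and the two-equation test system are assumptions chosen for the example.

```python
import numpy as np

def levenberg_marquardt(H, jac, x0, mu=1e-2, tol=1e-10, max_iter=100):
    """Solve H(x) = 0 with a damped Gauss-Newton (Levenberg-Marquardt) iteration."""
    x = np.asarray(x0, dtype=float)
    r = H(x)
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:          # stop when the residual norm is small
            break
        J = jac(x)
        # Step from (J^T J + mu I) step = J^T H
        step = np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ r)
        x_new = x - step
        r_new = H(x_new)
        if np.linalg.norm(r_new) < np.linalg.norm(r):
            x, r, mu = x_new, r_new, mu / 10.0   # accept: closer to Gauss-Newton
        else:
            mu *= 10.0                           # reject: closer to gradient descent
    return x

# Toy system (assumption): circle of radius 2 intersected with the line x0 = x1.
def H(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])

def jac(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

sol = levenberg_marquardt(H, jac, [1.0, 2.0])
print(sol)
```

For the KKT system (41), `H` would return the stacked left-hand sides and `jac` the corresponding Jacobian.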

The Levenberg–Marquardt method is a popular alternative to the Gauss–Newton method for minimizing a function that is a sum of squares of nonlinear functions. It is a fundamental regularization technique for the Newton method applied to nonlinear equations, possibly constrained, and possibly with singular or even nonisolated solutions. It is one of the most widely used algorithms for solving nonlinear algebraic equations: at each iteration, it solves a subproblem given by the squared residual of the Newton equation with L₂ regularization to determine the search direction. Although the method is generally used for non-square systems, extensions to square systems have recently been developed (Fischer et al., 2024; Han and Rui, 2024).

In the next section, we solve three test problems to demonstrate the efficiency of the suggested approach compared with other methods.

Numerical examples

In this section, we solve three OCPs involving PDEs using the proposed method and demonstrate its effectiveness. Based on the discrete 2-norm ‖ ⋅‖₂, we use the following quantities to assess the convergence of the approximate solutions

E_{2}^{N} (y^{N}) = \sqrt{\sum_{k = 0}^{N} \sum_{p = 0}^{N} {(y^{*} (t_{k}, x_{p}) - y^{N} (t_{k}, x_{p}))}^{2}}, E_{2}^{N} (u^{N}) = \sqrt{\sum_{k = 0}^{N} \sum_{p = 0}^{N} {(u^{*} (t_{k}, x_{p}) - u^{N} (t_{k}, x_{p}))}^{2}},

where $(y^{*}, u^{*})$ and $(y^{N}, u^{N})$ are the exact and approximate optimal solutions, respectively. Moreover, the absolute errors of the obtained solutions, the residual error of the gradient equations, and the residual error of the PDE together with its boundary and initial conditions are given by the following relations

E_{y}^{N} (t, x) = ∣ y^{*} (t, x) - y^{N} (t, x) ∣, E_{u}^{N} (t, x) = ∣ u^{*} (t, x) - u^{N} (t, x) ∣, (t, x) \in [0, T] \times [0, L],

(42)

E_{\nabla, R}^{N} (t_{k}, x_{p}) = ∣ \nabla J^{N} (t_{k}, x_{p}) + λ \nabla G^{N} (t_{k}, x_{p}) ∣, k, p = 0,1,2, \dots, N,

(43)

E_{R}^{N} (t_{k}, x_{p}) = ∣ \frac{\partial y^{N}}{\partial t} (t_{k}, x_{p}) - F (t_{k}, x_{p}, y^{N} (t_{k}, x_{p}), u^{N} (t_{k}, x_{p}), \frac{\partial y^{N}}{\partial x} (t_{k}, x_{p}), \frac{\partial^{2} y^{N}}{\partial x^{2}} (t_{k}, x_{p}), \frac{\partial^{2} y^{N}}{\partial t^{2}} (t_{k}, x_{p})) ∣ + ∣ y^{N} (t_{k}, 0) - Ψ_{1} (t_{k}) ∣ + ∣ y^{N} (t_{k}, L) - Ψ_{2} (t_{k}) ∣ + ∣ y^{N} (0, x_{p}) - g (x_{p}) ∣, k, p = 0,1,2, \dots, N .

(44)
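The error measures $E_{2}^{N}$ and (42) are straightforward to evaluate once nodal values are available. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, uniformly spaced points stand in for the collocation nodes, the sample function $e^{t} \sin x$ plays the role of the exact solution, and the "approximate" solution is an artificial perturbation of it.

```python
import numpy as np

def discrete_l2_error(exact, approx):
    """E_2^N: square root of the sum of squared nodal differences."""
    return float(np.sqrt(np.sum((exact - approx) ** 2)))

def pointwise_abs_error(exact, approx):
    """Absolute error as in (42), evaluated at the nodes."""
    return np.abs(exact - approx)

N = 4
t = np.linspace(0.0, 1.0, N + 1)   # placeholder nodes (assumption)
x = np.linspace(0.0, 1.0, N + 1)
T, X = np.meshgrid(t, x, indexing="ij")

y_exact = np.exp(T) * np.sin(X)    # sample function on the grid
y_approx = y_exact + 1e-3          # artificial perturbation, for illustration only

E2 = discrete_l2_error(y_exact, y_approx)
Ey = pointwise_abs_error(y_exact, y_approx)
print(E2, Ey.max())
```

With a constant perturbation of size 10⁻³ on a 5 × 5 grid, `E2` equals 5 × 10⁻³ up to rounding, which shows that $E_{2}^{N}$ is an unnormalized sum over all ${(N + 1)}^{2}$ nodes.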

Finally, we summarize the solution process in the following algorithm:

(1) Select $N$ and determine the collocation points,

(2) Discretize and obtain the corresponding NLP system according to (15)–(19),

(3) Calculate the gradients of the functions in question according to (28) and (29),

(4) Solve the algebraic system (41) obtained from the previous steps using the Levenberg–Marquardt method,

(5) Obtain approximate solutions $\bar{y}, \bar{u}, \bar{λ}$ ,

(6) Calculate the errors according to (42)–(44).
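Step (1) can be illustrated with the following Python sketch, which computes shifted LGL nodes on [0, length] and the corresponding first-order differentiation matrix from the standard closed-form expressions for LGL differentiation matrices. The function name `shifted_lgl` and its arguments are hypothetical, and the sketch covers a single interval; in the method, one such matrix is needed for the time variable and one for the space variable.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def shifted_lgl(N, length=1.0):
    """Shifted Legendre-Gauss-Lobatto nodes on [0, length] and differentiation matrix (N >= 1)."""
    PN = Legendre.basis(N)
    # Interior LGL nodes on [-1, 1] are the roots of P_N'.
    interior = np.sort(PN.deriv().roots().real)
    tau = np.concatenate(([-1.0], interior, [1.0]))
    P = PN(tau)                                   # P_N evaluated at the nodes

    D = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for j in range(N + 1):
            if k != j:
                D[k, j] = P[k] / P[j] / (tau[k] - tau[j])
    D[0, 0] = -N * (N + 1) / 4.0
    D[N, N] = N * (N + 1) / 4.0

    nodes = 0.5 * length * (tau + 1.0)            # affine map [-1, 1] -> [0, length]
    return nodes, (2.0 / length) * D              # chain rule scales the derivative

nodes, D = shifted_lgl(6, 1.0)
# Differentiation is exact for polynomials of degree at most N.
deriv_error = np.abs(D @ nodes**2 - 2.0 * nodes).max()
print(nodes)
print(deriv_error)
```

The second-order matrices appearing in (41) can be obtained in the same setting by squaring the first-order matrix.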

Furthermore, it is worth mentioning that the method proposed in this article was implemented and executed using MATLAB software (version 2013) on a personal computer equipped with 6 GB of RAM and an Intel Core i3 processor. The approximate solutions were obtained in less than 10 minutes, which is considered reasonable given the complexity and computational demands of the problems addressed.

Example 1

Consider the following OCP under PDEs

\begin{align} \text{Minimize} \quad & J (y, u) = \int_{0}^{1} \int_{0}^{1} {(y (t, x) - e^{t} \sin x)}^{2} d t d x + \int_{0}^{1} \int_{0}^{1} {(u (t, x) - t^{2} x^{2})}^{2} d t d x \\ \text{subject to} \end{align}

(45)

The exact optimal solutions of this problem are $J (y^{*}, u^{*}) = 0$ and

y^{*} (t, x) = e^{t} \sin x \quad \text{and} \quad u^{*} (t, x) = t^{2} x^{2} .

(50)

Since the analytical solution is available, we can measure the error of the proposed method and validate its effectiveness. Figures 1 and 2 display the approximate solutions and the absolute errors for $N = 10$ . Figures 3(a) and 3(b) depict the logarithm of the 2-norm of the difference between the exact and approximate solutions for various values of $N$ . Figure 4 illustrates the logarithm of $E_{R}^{N} (t_{k}, x_{p})$ and $E_{\nabla, R}^{N} (t_{k}, x_{p})$ for $N = 10$ . Table 1 shows that the approximate values of J tend to the optimal value J* = 0 as $N$ increases, dropping from about 1.8 × 10⁻¹ at N = 2 to about 4 × 10⁻¹⁵ at N = 10. The results show that the errors of the presented method are small and tend to zero as $N$ increases.

Figure 1.

The approximate solution and the logarithm of absolute error between the approximate and exact state variables for $N = 10$ , in Example 1.

Figure 2.

The approximate solution and the logarithm of absolute error between the approximate and exact control variables for $N = 10$ , in Example 1.

Figure 3.

The logarithm of difference between exact and approximate solutions, in Example 1.

Figure 4.

The logarithm of $E_{R}^{N} (t_{k}, x_{p})$ and $E_{\nabla, R}^{N} (t_{k}, x_{p})$ with $N = 10$ , for Example 1.

Table 1.

The approximate value of $J^{N} (\bar{y}, \bar{u})$ for Example 1.

$N$	2	4	6	8	10
$J^{N} (\cdot)$	0.184196381404508	0.000024178091020	0.000000043329805	0.000000000000210	0.000000000000004

We note that $J^{N} (\bar{y}, \bar{u}) - J (y^{*}, u^{*}) = J^{N} (\bar{y}, \bar{u})$ , which is shown in Table 1. Given that the residual tends towards zero with an increase of $N$ , the numerical convergence of the method can be observed.
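The decay in Table 1 can be quantified by fitting a straight line to $\log_{10} J^{N}$ against $N$. The Python sketch below uses only the values listed in Table 1; the least-squares fit is an illustrative diagnostic and not part of the proposed method, and the last entries are close to double-precision round-off, so the fitted slope should be read as indicative only.

```python
import numpy as np

# Values copied from Table 1.
N_values = np.array([2, 4, 6, 8, 10])
J_values = np.array([
    0.184196381404508,
    0.000024178091020,
    0.000000043329805,
    0.000000000000210,
    0.000000000000004,
])

log_J = np.log10(J_values)
# Slope of the least-squares line: orders of magnitude gained per unit increase in N.
slope, intercept = np.polyfit(N_values, log_J, 1)

for n, lj in zip(N_values, log_J):
    print(f"N = {n:2d}   log10(J^N) = {lj:8.3f}")
print("fitted slope:", slope)
```

A roughly constant negative slope on this semi-logarithmic scale is the behaviour expected from a spectrally accurate discretization of a smooth solution.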

Example 2

We consider the following OCP

\begin{align} \text{Minimize} \quad & J (y, u) = \frac{1}{2} \int_{0}^{1} \int_{0}^{1} ({(y (t, x) - t^{2} x (1 - x))}^{2} + {(u (t, x))}^{2}) d t d x \\ \text{subject to} \end{align}

(51)

This problem has an optimal solution, but a closed-form analytical expression for it is not available. We solve this problem using the presented approach. The approximate solutions for N = 20 are depicted in Figure 5. Moreover, Figure 6 illustrates the logarithm of the PDE residual error and the residual error of the gradient equations. Tables 2 and 3 show that, as N increases, the difference between successive values of the objective functional decreases and the computed value of J stabilizes. The results indicate the superiority and efficiency of the LGL-PS method over the VI method (Akkouche et al., 2014).

Figure 5.

The approximate solutions $y^{N} (t, x)$ and $u^{N} (t, x)$ with $N = 20$ , for Example 2.

Figure 6.

The logarithm of the $E_{R}^{N} (t_{k}, x_{p})$ and $E_{\nabla, R}^{N} (t_{k}, x_{p})$ with $N = 20$ , for Example 2.

Table 2.

The values of $∣ J^{N} - J^{N - 1} ∣$ for Example 2.

$N$	LGL-PS method	VI method
2	0.181161045e-5	0.183715461e-5
3	0.181105706e-6	0.524223719e-4
4	0.266800030e-8	0.984740568e-6
5	0.993678081e-9	0.933596337e-6
6	0.198635971e-9	0.130975053e-7
7	0.153371286e-10	0.105275454e-7
8	0.353473001e-11	0.110904750e-9

Table 3.

The approximate value of $J^{N} (\bar{y}, \bar{u})$ for Example 2.

$N$	2	4	6	8	10
$J^{N} (\cdot)$	0.181128993589688	0.181155678028960	0.181155706154274	0.181155706423584	0.181155706457046

We note that Table 2 shows the differences between successive terms of the computed sequence of objective values. Since these differences tend to zero as $N$ increases, the numerical convergence of the method can be observed.

Table 3 shows that the computed objective values approach a fixed number: as $N$ increases, more of its leading digits remain unchanged, which again indicates the numerical convergence of the method.
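The stabilization of digits can be made explicit by differencing consecutive entries of Table 3. The Python sketch below uses only the tabulated values; the quantity `stable_digits` (the integer part of $- \log_{10}$ of each difference) is a rough, hypothetical indicator of how many decimal places have settled and is not a rigorous error bound.

```python
import math

# Values copied from Table 3.
J_table3 = {
    2: 0.181128993589688,
    4: 0.181155678028960,
    6: 0.181155706154274,
    8: 0.181155706423584,
    10: 0.181155706457046,
}

Ns = sorted(J_table3)
# Differences between consecutive tabulated values of J^N.
diffs = [abs(J_table3[b] - J_table3[a]) for a, b in zip(Ns, Ns[1:])]
# Order of magnitude of each difference, read as settled decimal places.
stable_digits = [math.floor(-math.log10(d)) for d in diffs]

for (a, b), d, s in zip(zip(Ns, Ns[1:]), diffs, stable_digits):
    print(f"N = {a:2d} -> {b:2d}   |difference| = {d:.3e}   settled decimals ~ {s}")
```

For the values in Table 3 the differences fall from about 2.7 × 10⁻⁵ to about 3.3 × 10⁻¹¹, so `stable_digits` evaluates to `[4, 7, 9, 10]`.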

Example 3

We consider the following OCP

\begin{align} \text{Minimize} \quad & J (y, u) = \frac{1}{2} \int_{0}^{1} \int_{0}^{1} (y^{2} (t, x) + u^{2} (t, x)) d t d x \\ \text{subject to} \end{align}

(56)

This problem has an optimal solution, but an analytical form for it is not available. Hence, we solve it numerically with our approach. For $N = 10$ and 20, the approximate solutions are shown in Figures 7 and 8. Additionally, Figures 9 and 10 display the logarithm of the PDE residual error and the residual error of the gradient equations for $N = 10$ and 20, respectively. The values of the objective functional J obtained using our method are compared with those from the CPS method (Mohammadizadeh et al., 2019) and the LPS method (Sabeh et al., 2016) in Table 4. For both values of $N$ , the LGL-PS method yields a smaller objective value than the CPS and LPS methods, which indicates its superiority and efficiency for this problem.

Figure 7.

The approximate solutions $y^{N} (t, x)$ and $u^{N} (t, x)$ with $N = 10$ , for Example 3.

Figure 8.

The approximate solutions $y^{N} (t, x)$ and $u^{N} (t, x)$ with $N = 20$ , for Example 3.

Figure 9.

The logarithm of the $E_{R}^{N} (t_{k}, x_{p})$ and $E_{\nabla, R}^{N} (t_{k}, x_{p})$ with $N = 10$ for Example 3.

Figure 10.

The logarithm of the $E_{R}^{N} (t_{k}, x_{p})$ and $E_{\nabla, R}^{N} (t_{k}, x_{p})$ with $N = 20$ for Example 3.

Table 4.

Comparison of objective functional values for Example 3.

$N$	LGL-PS method	CPS method	LPS method
10	0.0308082421125	0.033014881174862	0.0828638100277
20	0.0290844555011	0.040440356851832	0.0620867108909

Conclusions and suggestions

In this paper, a novel and practical scheme was proposed for solving optimal control problems (OCPs) governed by partial differential equations (PDEs) using the Legendre–Gauss–Lobatto (LGL) pseudo-spectral (PS) method. The LGL-PS method has demonstrated significant advancements in the numerical solution of PDE-constrained OCPs, offering a robust combination of theoretical soundness, high-order accuracy, and computational efficiency. These features make it a valuable and versatile tool for addressing complex control problems in various scientific and engineering domains. Compared to existing numerical approaches, such as Legendre PS (LPS), Chebyshev PS (CPS), and variational iteration (VI) methods, the LGL-PS method yielded superior performance in terms of solution accuracy and error minimization, particularly for benchmark problems like the Burgers’ equation and the diffusion equation. The method’s ability to produce globally smooth and continuously differentiable approximations is crucial for ensuring reliable control strategies in dynamic systems. Furthermore, the proposed framework incorporates the Karush–Kuhn–Tucker (KKT) optimality conditions and employs efficient solvers such as the Levenberg–Marquardt algorithm, enabling the accurate resolution of the resulting nonlinear programming (NLP) problems. This integration enhances both the stability and convergence of the numerical scheme. The flexibility of the LGL-PS method suggests promising avenues for future research. In particular, extending this approach to OCPs governed by fractional-order PDEs, time-delay systems, and multi-dimensional nonlinear PDEs could significantly broaden its applicability. These problem classes are increasingly relevant in modeling complex phenomena such as anomalous diffusion, memory-dependent processes, and spatiotemporal dynamics in biological and physical systems. 
In summary, the LGL-PS method not only provides a powerful tool for solving classical PDE-constrained OCPs but also lays the groundwork for tackling emerging challenges in optimal control theory. Its continued development and adaptation to more generalized systems will be instrumental in advancing both theoretical insights and practical solutions in the field.

Footnotes

ORCID iD

Mohammad Hadi Noori Skandari

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

No data beyond those presented in the article are available.

References

Abu (2018) Numerical solutions for the Robin time-fractional partial differential equations of heat and fluid flows based on the reproducing Kernel algorithm. International Journal of Numerical Methods for Heat and Fluid Flow 28(4): 828–856.

Abu and Shawagfeh (2021) Solving optimal control problems of Fredholm constraint optimality via the reproducing Kernel Hilbert space method with error estimates and convergence analysis. Mathematical Methods in the Applied Sciences 44(10): 7915–7932.

Akkouche, Maidi and Aidene (2014) Optimal control of partial differential equations based on the variational iteration method. Computers & Mathematics with Applications 68(5): 622–631.

Bertolazzi and Biral (2023) A direct/indirect approach to optimal control problems. In: International conference on numerical computations: theory and algorithms, Calabro, Italy, 14–20 June 2023. Springer Nature Switzerland, 47–62.

Biegler, Ghattas, Heinkenschloss, et al. (2003) Large-scale PDE-constrained optimization: an introduction. In: Large-Scale PDE-Constrained Optimization. Springer, 3–13.

Borzì and Griesse (2006) Distributed optimal control of lambda–omega systems. Journal of Numerical Mathematics 14(1): 17–40.

Bryson (1996) Optimal control–1950 to 1985. IEEE Control Systems 16(3): 26–33.

Casas and Yong (2023) Optimal control of a parabolic equation with memory. ESAIM: Control, Optimisation and Calculus of Variations 29: 23.

Casas, De Los Reyes and Tröltzsch (2008) Sufficient second-order optimality conditions for semilinear control problems with pointwise state constraints. SIAM Journal on Optimization 19(2): 616–643.

Dehestani, Ordokhani and Razzaghi (2025) A robust optimisation approach for the 2D-VO fractional optimal control problems. International Journal of Systems Science 56(2): 347–362.

Doehring, Gassner and Torrilhon (2024) Many-stage optimal stabilized Runge–Kutta methods for hyperbolic partial differential equations. Journal of Scientific Computing 99(1): 28.

Erfanifar and Hajarian (2024) A new multi-step method for solving nonlinear systems with high efficiency indices. Numerical Algorithms 97(2): 959–984.

Fischer, Izmailov and Solodov (2024) The Levenberg–Marquardt method: an overview of modern convergence theories and more. Computational Optimization and Applications 89(1): 33–67.

Fuica and Jork (2025) Adaptive finite element method for an unregularized semilinear optimal control problem. arXiv preprint arXiv:2505.04439.

Hairer and Wanner (1996) Solving Ordinary Differential Equations. II: Stiff and Differential-Algebraic Problems. Springer.

Han and Rui (2024) A new adaptive Levenberg–Marquardt method for nonlinear equations and its convergence rate under the Hölderian local error bound condition. Symmetry 16(6): 674.

Huang and Peng (2024) A deep difference collocation method and its application in elasticity problems. International Journal of Solids and Structures 291: 112692.

Huang, Rad, Noori Skandari, et al. (2025) A spectral collocation scheme for solving nonlinear delay distributed-order fractional equations. Journal of Computational and Applied Mathematics 456: 116227.

Hull (1997) Conversion of optimal control problems into parameter optimization problems. Journal of Guidance, Control, and Dynamics 20(1): 57–60.

Jiang and Gao (2024) Review of collocation methods and applications in solving science and engineering problems. Computer Modeling in Engineering and Sciences 140(1): 41–76.

Kang, Ross and Gong (2008) Pseudospectral optimal control and its convergence theorems. In: Astolfi and Marconi (eds) Analysis and Design of Nonlinear Control Systems. Springer, 109–124.

Ketabdari, Farahi and Effati (2021) An efficient approximate method for solving two-dimensional fractional optimal control problems using generalized fractional order of Bernstein functions. IMA Journal of Mathematical Control and Information 38(1): 378–395.

Kunisch and Wagner (2013) Optimal control of the bidomain system (II): uniqueness and regularity theorems for weak solutions. Annali di Matematica Pura ed Applicata 192(6): 951–986.

Latifi, Parand and Delkhosh (2020) Generalized Lagrange–Jacobi–Gauss–Radau collocation method for solving a nonlinear optimal control problem with the classical diffusion equation. The European Physical Journal Plus 135(10): 834.

Leugering, Engell, Griewank, et al. (2012) Constrained Optimization and Optimal Control for Partial Differential Equations. Springer.

Mamehrashi and Yousefi (2017) A numerical method for solving a nonlinear 2-D optimal control problem with the classical diffusion equation. International Journal of Control 90(2): 298–306.

Marzban (2021) A new fractional orthogonal basis and its application in nonlinear delay fractional optimal control problems. ISA Transactions 114: 106–119.

Marzban and Hoseini (2016) An efficient discretization scheme for solving nonlinear optimal control problems with multiple time delays. Optimal Control Applications and Methods 37(4): 682–707.

Marzban and Manochehri Naeini (2026) An innovative class of orthogonal functions based on the fractional Vieta-Fibonacci functions and its utilization in optimal control of piecewise constant order systems consisting of delay. Mathematics and Computers in Simulation 239: 223–244.

Marzban and Nezami (2022) Analysis of nonlinear fractional optimal control systems described by delay Volterra-Fredholm integral equations via a new spectral collocation method. Chaos, Solitons & Fractals 162: 112499.

Mohammadizadeh, Tehrani and Noori Skandari (2019) Chebyshev pseudo-spectral method for optimal control problem of Burgers’ equation. Iranian Journal of Numerical Analysis and Optimization 9(2): 77–102.

Mowlavi and Nabi (2023) Optimal control of PDEs using physics-informed neural networks. Journal of Computational Physics 473: 111731.

Nemati (2018) Numerical solution of 2D fractional optimal control problems by the spectral method along with Bernstein operational matrix. International Journal of Control 91(12): 2632–2645.

Nemati and Yousefi (2017) A numerical scheme for solving two-dimensional fractional optimal control problems by the Ritz method combined with fractional operational matrix. IMA Journal of Mathematical Control and Information 34(4): 1079–1097.

Noori Skandari and Tohidi (2011) Numerical solution of a class of nonlinear optimal control problems using linearization and discretization. Applied Mathematics 2(5): 646–652.

Noori Skandari, Kamyad and Effati (2016) Smoothing approach for a class of nonsmooth optimal control problems. Applied Mathematical Modelling 40(2): 886–903.

Ogundare (2009) On the pseudo-spectral method of solving linear ordinary differential equations. Journal of Mathematics and Statistics 5(2): 136–140.

Pirouzeh, Noori Skandari, Pirbazari, et al. (2024) A pseudo-spectral approach for optimal control problems of variable-order fractional integro-differential equations. AIMS Mathematics 9(9): 23692–23710.

Polak (1973) An historical survey of computational methods in optimal control. SIAM Review 15(2): 553–584.

Raymond and Zidani (1998) Pontryagin’s principle for state-constrained control problems governed by parabolic equations with unbounded controls. SIAM Journal on Control and Optimization 36(6): 1853–1879.

Sabeh, Shamsi and Dehghan (2016) Distributed optimal control of the viscous Burgers equation via a Legendre pseudo-spectral approach. Mathematical Methods in the Applied Sciences 39(12): 3350–3360.

Samadi, Heydari and Effati (2024) Numerical solutions of two-dimensional PDE-Constrained optimal control problems via bilinear pseudo-spectral method. Mathematical Sciences 18(1): 107–123.

Soufivand, Soltanian and Mamehrashi (2023) A numerical approach for solving a class of two-dimensional variable-order fractional optimal control problems using Gegenbauer operational matrix. IMA Journal of Mathematical Control and Information 40(1): 1–19.

Tabrizidooz, Marzban, Pourbabaee, et al. (2017) A composite pseudospectral method for optimal control problems with piecewise smooth solutions. Journal of the Franklin Institute 354(5): 2393–2414.

Tröltzsch (2010) Optimal Control of Partial Differential Equations: Theory, Methods, and Applications. American Mathematical Society.

Von Stryk and Bulirsch (1992) Direct and indirect methods for trajectory optimization. Annals of Operations Research 37(1): 357–373.

Wright (1964) Chebyshev collocation methods for ordinary differential equations. The Computer Journal 6(4): 358–365.

Zoccolan, Strazzullo and Rozza (2025) A streamline upwind Petrov-Galerkin reduced order method for advection-dominated partial differential equations under optimal control. Computational Methods in Applied Mathematics 25(1): 237–260.