A TOPSIS based Self-Organizing Double Loop Recurrent Broad Learning System for Uncertain Nonlinear Systems*

This study proposes an efficient intelligent control structure for uncertain nonlinear systems. The controller is implemented within a sliding mode control framework that includes a modified broad learning system (BLS) with a double-loop recurrent structure. In addition, the proposed BLS involves a self-organizing mechanism that increases or decreases the size of the network. The technique for order of preference by similarity to ideal solution (TOPSIS) is used to build the self-organizing mechanism, and two dynamic TOPSIS thresholds are automatically determined according to the stability of the controller. One threshold decides whether to retain or remove existing neurons in the BLS; the other decides when to generate new neurons, so as to meet the requirements of different control states and save computing resources. To improve the network's dynamic characteristics, a double-loop recurrent structure is further introduced into the self-organizing BLS. A Lyapunov function is used to guarantee the stability of the control system. The proposed controller is applied to the simulated control of a nonlinear chaotic system and a three-link robot manipulator. The experimental results show that the proposed controller achieves better control performance than other network-based controllers. The source code of this work is available at https://github.com/wzhuang-xmu/SODLRBLS


I. INTRODUCTION
Nowadays, many studies have used learning methods to control uncertain nonlinear systems [1]-[3]; however, designing an efficient intelligent controller with excellent control performance is still a challenge. In particular, counteracting nonlinearity and uncertainty is the key to solving this challenge. Many studies adopt robust control algorithms as the main approach to controlling uncertain systems [4]-[7]. However, the sliding mode control (SMC) method can achieve better performance for nonlinear systems [8]-[10]. SMC can convert high-order systems into low-order systems, is insensitive to parameter changes, is capable of fast dynamic response, and can suppress external disturbances [11].

*This work was supported by the Natural Science Foundation of Fujian Province of China (No. 2021J01002).
Therefore, SMC-based intelligent control systems are widely used in the control of nonlinear systems to obtain better learning ability and faster convergence speed. In addition, many SMC controllers incorporate various artificial neural networks to further improve their nonlinear characteristics [12], [13]. For example, a fuzzy brain emotional learning network was embedded into an SMC control system for controlling different types of nonlinear systems, including robots [14]-[16]. In particular, the cerebellar model articulation controller (CMAC) network with a recurrent feedback loop has attracted considerable attention and led to promising results due to the greater design freedom of its structure [17], [18].
However, current SMC controllers often suffer from two challenges: (1) controllers consume too many computational resources, and (2) the accuracy of controllers is not high enough and their dynamic response characteristics are insufficient. For the first challenge, a self-organizing structure must be introduced to invoke computational resources efficiently and flexibly. The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), first proposed in [19], has been applied in many dynamic networks [20]-[22], and the results revealed that TOPSIS can use the information contained in input data to rank and select candidate neurons to build self-organizing structures [23].
For the second challenge, the Broad Learning System (BLS) [24], [25] can expand in width, providing a faster response with higher accuracy compared with other popular neural networks. BLS is designed based on the random vector functional-link neural network (RVFLNN) [26] and further overcomes the shortcomings of RVFLNN in processing large-scale and high-dimensional data. Also, the computational speed of a BLS network is faster than that of an extreme learning machine (ELM) [27]. In addition, Fei et al. proposed a double-loop recurrent neural network (DLRNN) [28], which combines the advantages of internal and external feedback: a DLRNN captures both output state information and internal state information simultaneously, and thus achieves better approximation performance than regular recurrent neural networks. Therefore, combining BLS and DLRNN is a potential solution to improve a BLS network's dynamic response capability.
Based on the above considerations, we propose a self-organizing BLS-network-based SMC controller, in which the TOPSIS method is used to build a self-organizing structure that efficiently exploits computational resources, and a double-loop mechanism inspired by DLRNN is created to improve the BLS network's dynamic characteristics. Two dynamic thresholds are automatically determined to construct the self-organizing structure: one determines whether to retain or remove existing feature neurons, and the other generates new feature neurons. The double-loop structure is introduced into the self-organizing BLS so as to process both internal and output state information simultaneously. The Lyapunov stability theorem is used to guarantee the stability of the proposed controller and to derive the update rules of the parameters in the proposed BLS network. The control performance of the control system is demonstrated through control simulations of a nonlinear chaotic system and a three-link robot manipulator.
The main contributions of the proposed controller include:
1) The TOPSIS method is used to evaluate the outputs of feature neurons, and two dynamic thresholds are automatically determined to construct a self-organizing BLS, which meets the requirements of different control states and saves computational resources.
2) A double-loop structure is introduced into the self-organizing BLS: an internal feedback loop is added to the enhancement neurons to capture internal state information, and an external feedback loop is added to the output neurons to capture output state information, improving the dynamic characteristics of the self-organizing BLS.

II. PROBLEM FORMULATION
A class of nth-order multi-input multi-output (MIMO) uncertain nonlinear systems is described as:

x⁽ⁿ⁾(t) = f(x(t)) + g(x(t)) u(t) + d(t)   (1)

where u(t) ∈ Rᵐ is the control input vector and d(t) denotes external disturbances. Thus, Eq. (1) is rewritten as:

x⁽ⁿ⁾(t) = f₀(x(t)) + g₀ u(t) + ε(x(t), t)   (2)

where ∆f(x(t)) and ∆g(x(t)) are the respective uncertain terms in f(x(t)) and g(x(t)), and ε(x(t), t) = ∆f(x(t)) + ∆g(x(t)) u(t) + d(t) is the lumped uncertainty and external disturbance. If the lumped uncertainty and external disturbance are ignored, Eq. (2) can be rewritten as:

x⁽ⁿ⁾(t) = f₀(x(t)) + g₀ u(t)   (3)

where f₀(x(t)) ∈ Rᵐ and g₀ = diag(g₀₁, g₀₂, ..., g₀ₘ) ∈ Rᵐˣᵐ are the respective nominal portions of f(x(t)) and g(x(t)). Let x_d(t) ∈ Rᵐ be the desired signal; then the tracking error is e(t) = x_d(t) − x(t) and the tracking error vector is defined as:

e(t) = [e⁽ⁿ⁻¹⁾(t), e⁽ⁿ⁻²⁾(t), ..., ė(t), e(t)]ᵀ   (4)

Thus, an ideal sliding surface is defined as:

s(t) = e⁽ⁿ⁻¹⁾(t) + k₁ e⁽ⁿ⁻²⁾(t) + ... + k_{n−1} e(t) + k_n ∫₀ᵗ e(τ) dτ   (5)

where k₁, ..., k_n are positive sliding-surface coefficients. Taking the derivative of Eq. (5), we have:

ṡ(t) = e⁽ⁿ⁾(t) + k₁ e⁽ⁿ⁻¹⁾(t) + ... + k_{n−1} ė(t) + k_n e(t)   (6)

If the following inequality (7) is satisfied, the control system will be stable; meanwhile, an ideal control output u_IDEAL can be obtained:

s_i(t) ṡ_i(t) ≤ −σ_i |s_i(t)|,  i = 1, 2, ..., m   (7)

where σ_i > 0, i = 1, 2, ..., m. Applying Eq. (6) to Eq. (7) yields the ideal control law:

u_IDEAL(t) = g₀⁻¹ [ x_d⁽ⁿ⁾(t) − f₀(x(t)) − ε(x(t), t) + Σⱼ₌₁ⁿ k_j e⁽ⁿ⁻ʲ⁾(t) + σ sgn(s(t)) ]   (8)

where sgn(·) is the sign function and σ = diag(σ₁, ..., σₘ). However, the ideal control output u_IDEAL cannot be obtained directly because ε(x(t), t) is unknown, and in general the exact system dynamic parameters must be known to obtain the functional forms of f₀ and g₀.
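As an illustration, the sliding surface and the ideal SMC law above can be sketched for a second-order, single-input case; the scalar form, helper names, and gain values here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def sliding_surface(e, e_dot, k1=5.0):
    """Sliding variable for a 2nd-order system: s = e_dot + k1*e."""
    return e_dot + k1 * e

def smc_control(x, x_dot, xd, xd_dot, xd_ddot, f0, g0, sigma=0.1, k1=5.0):
    """Ideal SMC law u = g0^{-1}(xd_ddot - f0 + k1*e_dot + sigma*sgn(s)),
    with the lumped uncertainty epsilon neglected (it is unknown in practice,
    which is why the paper approximates this law with a network)."""
    e = xd - x
    e_dot = xd_dot - x_dot
    s = sliding_surface(e, e_dot, k1)
    return (xd_ddot - f0 + k1 * e_dot + sigma * np.sign(s)) / g0
```

Setting ṡ = −σ sgn(s) in Eq. (6) and solving for u reproduces this law, which is what guarantees inequality (7).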
In this paper, a TOPSIS-based self-organizing double-loop recurrent broad learning system (SODLRBLS) is proposed to approximate this ideal control law. The input of the SODLRBLS is the combined error e(t), and its output is u_SODLRBLS.

III. SELF-ORGANIZING DOUBLE LOOP RECURRENT BROAD LEARNING SYSTEM
In this paper, the TOPSIS method and the DLRNN structure are introduced into BLS to improve the performance of the control system. The TOPSIS method is applied to evaluate the output of the feature node layer of the BLS and automatically determine two dynamic thresholds for constructing a self-organizing BLS. The internal feedback loop of the DLRNN is added to the enhancement node layer, and the external feedback loop of the DLRNN is added to the output layer. The configuration of the proposed SODLRBLS network is illustrated in Fig. 1. The layers of SODLRBLS are specified below.
1. Input layer: both input signals and external feedback signals from the output layer are fed to every node in this layer. The output θ_i of the ith node in this layer is given by Eq. (10), where x_i is the input signal to the ith node; exu_i is the ith output signal of the SODLRBLS calculated in the previous step and eexu_i is the output signal one step before exu_i; exu_i and eexu_i work as the feedback signals in the external recurrent loop; and w_roi are the neural weights connecting the ith node of the output layer to the ith node of the input layer.
2. Feature node layer: there are m feature node groups, each of which has n_fj nodes. w_fijk is the weight connecting the ith input node and the kth feature node in the jth feature node group, and b_fjk is the bias term of the kth feature node in the jth feature node group. The output of the kth feature node in the jth feature node group is defined by Eq. (11), where φ(·) is the activation function, φ(x) = tanh(x) = (eˣ − e⁻ˣ)/(eˣ + e⁻ˣ).
We denote n_f = max_j n_fj; that is, n_f equals the maximum number of nodes over all feature node groups. For the convenience of calculation, if a feature node group has fewer than n_f nodes, the missing output values f are regarded as 0.
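The input and feature layers above can be sketched as follows; the exact way Eq. (10) combines the input with the external feedback signals is not fully recoverable here, so the additive form below is an assumption:

```python
import numpy as np

def input_layer(x, exu, eexu, w_ro):
    """External recurrent loop: combine the current input with the last
    two network outputs (the additive form of Eq. (10) is assumed)."""
    return x + w_ro * (exu - eexu)

def feature_layer(theta, W_f, b_f):
    """Feature nodes: f_jk = tanh(sum_i w_fijk * theta_i + b_fjk).
    W_f has shape (groups, nodes_per_group, n_inputs); groups with fewer
    nodes are assumed zero-padded up to n_f, as the text specifies."""
    return np.tanh(np.einsum('jki,i->jk', W_f, theta) + b_f)
```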
The TOPSIS method is used to determine the dynamic deleting threshold and the dynamic generating threshold, which are used for automatically retaining or deleting feature nodes, or generating a new feature node. The Shannon entropy method is used to determine the weights of the evaluation criteria in TOPSIS. The specific process, based on [29], [30], is as follows. Let the feature nodes be the alternatives and let the outputs f_j1, f_j2, ..., f_jn_fj be the judgment conditions; then:
1) Generate an evaluation vector: each feature node group is an evaluation vector, and the outputs of the nodes in the group constitute the dimensions of the evaluation vector. There are a total of n evaluation vectors, each with n_f dimensions.
2) Normalize each evaluation vector to limit the value of each element to [0, 1].
3) Use the entropy weight method [31], [32] to obtain the weights: first calculate the entropy value, then determine the diversification degree of the measurement quality, and finally determine the weight of each evaluation criterion.
4) Determine the weighted normalized decision matrix.
5) Determine the best weighted vector v* and the worst weighted vector v′.
6) Calculate the separation distances from the best weighted vector v* and from the worst weighted vector v′.
7) Calculate the similarity to the worst condition and to the best condition.
The resulting vectors wc and bc are used to determine the dynamic thresholds.
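Steps 1)-7) above can be sketched as follows; the specific normalization choices and the small epsilon guards against division by zero are implementation assumptions:

```python
import numpy as np

def topsis_entropy(F):
    """TOPSIS ranking of feature-node groups with Shannon-entropy weights.
    F: (n_groups, n_f) matrix of feature-node outputs.
    Returns (wc, bc): similarity to the worst and best conditions."""
    A = np.abs(F)
    # 2) vector normalization, bounding each element to [0, 1]
    R = A / (np.linalg.norm(A, axis=0) + 1e-12)
    # 3) entropy weights over the n_f criteria
    P = R / (R.sum(axis=0) + 1e-12)
    n = F.shape[0]
    E = -np.sum(P * np.log(P + 1e-12), axis=0) / np.log(n)
    d = 1.0 - E                      # diversification degree
    w = d / (d.sum() + 1e-12)        # criteria weights
    # 4) weighted normalized decision matrix
    V = R * w
    # 5) best and worst weighted vectors
    v_best, v_worst = V.max(axis=0), V.min(axis=0)
    # 6) separation distances
    s_best = np.linalg.norm(V - v_best, axis=1)
    s_worst = np.linalg.norm(V - v_worst, axis=1)
    # 7) similarity to the worst and the best conditions
    wc = s_best / (s_best + s_worst + 1e-12)
    bc = s_worst / (s_best + s_worst + 1e-12)
    return wc, bc
```

A group far from the best weighted vector has a large wc (close to the worst condition), and a group far from the worst vector has a large bc.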
ρ_wc = P_wc min(WC) = P_wc min(max(wc))   (24)
ρ_bc = P_bc max(BC) = P_bc max(min(bc))   (25)

where ρ_wc is the dynamic deleting threshold and ρ_bc is the dynamic generating threshold. WC is a storage vector that contains the maximum value of wc in each training round, and BC is a storage vector that contains the minimum value of bc in each training round. 0 ≤ P_wc ≤ 1 is the deleting threshold coefficient and 0 ≤ P_bc ≤ 1 is the generating threshold coefficient. With the dynamic deleting and generating thresholds, self-organization in the feature node layer is achieved by rule (26): if the output of a feature node is greater than or equal to ρ_wc, the node is retained; otherwise it is deleted. Also, if max(F) < ρ_bc, a new feature node is generated; otherwise, no node is added.
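Eqs. (24)-(25) and rule (26) can be sketched as:

```python
import numpy as np

def dynamic_thresholds(WC, BC, P_wc=0.1, P_bc=1.0):
    """Eqs. (24)-(25): WC stores max(wc) of each training round and
    BC stores min(bc) of each round."""
    return P_wc * min(WC), P_bc * max(BC)

def self_organize(F, rho_wc, rho_bc):
    """Rule (26): retain nodes whose output magnitude reaches rho_wc;
    request a new node when no output reaches rho_bc."""
    keep = np.abs(F) >= rho_wc          # True -> node is retained
    grow = np.max(np.abs(F)) < rho_bc   # True -> generate a new node
    return keep, grow
```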
3. Enhancement node layer: assume there is one node group with n_h nodes; both signals from the feature node layer and internal feedback signals are input to each node in this layer. The output of the qth node in this layer is defined by Eq. (27), where w_hjkq denotes the weight connecting the kth feature node in the jth feature node group and the qth enhancement node; the Gaussian function is the activation function; w_r is the weight of the internal feedback; exh is the output of the enhancement layer calculated in the previous epoch, which works as the feedback signal in the internal recurrent loop; and c_jkq and v_jkq are the center and width of the Gaussian function calculated in the qth enhancement node for the kth feature node in the jth feature node group.
4. Output layer: the outputs of both the feature node layer and the enhancement node layer are passed to this layer. k_ijk is the weight connecting the kth feature node in the jth feature node group and the ith output node, and w_iq is the weight connecting the qth enhancement node and the ith output node. The output of the ith output node is calculated by Eq. (29), with K and W defined as the corresponding weight matrices. The overall computational process of the proposed SODLRBLS network is summarized in pseudocode in Algorithm 1.
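The enhancement and output layers can be sketched as follows; how exactly the internal feedback term w_r·exh enters the Gaussian of Eq. (27) is an assumption here:

```python
import numpy as np

def enhancement_layer(F, exh, w_r, C, V):
    """Enhancement nodes with internal feedback: each node applies a
    Gaussian to the feature outputs shifted by the recurrent term w_r*exh.
    F: (m, n_f); exh: (n_h,); C, V: (m, n_f, n_h) centers and widths."""
    z = F[:, :, None] + w_r * exh                  # broadcast feedback
    return np.exp(-np.sum(((z - C) / V) ** 2, axis=(0, 1)))

def output_layer(F, H, K, W):
    """Eq. (29): u_i = sum_jk k_ijk * f_jk + sum_q w_iq * h_q.
    K: (n_out, m, n_f); W: (n_out, n_h)."""
    return np.einsum('ijk,jk->i', K, F) + W @ H
```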

IV. ADAPTIVE LEARNING ALGORITHM AND CONVERGENCE ANALYSIS
The framework of the proposed SODLRBLS-based control system is illustrated in Fig. 2. The SODLRBLS serves as the main controller in the control system and works together with a robust controller.
Algorithm 1 The pseudocode of SODLRBLS network.
1: calculate θ_i using Eq. (10);
2: calculate f_jk using Eq. (11);
3: calculate the dynamic deleting threshold ρ_wc and generating threshold ρ_bc using Eqs. (24) and (25);
4: self-organize the feature node layer according to self-organization rule (26);
5: calculate h_q using Eq. (27);
6: calculate the network output u using Eq. (29).

A set of update laws for the SODLRBLS is derived to support the proposed control system, and it can be proven, using Lyapunov stability theory, that the global control system achieves an H∞ tracking control effect. The detailed proof process is as follows. Subtract Eq. (9) from Eq. (6), and assume that there exists an optimal SODLRBLS, u*_SODLRBLS, that approaches the ideal controller u_IDEAL, where K* and W* are optimal weight matrices, and F* and H* are the optimal output matrices of the feature node layer and the enhancement node layer in the optimal SODLRBLS. Then the optimal control output is given by Eq. (34), where ε is a minimum approximation error vector. K̂ and Ŵ are the estimated weight matrices, and F̂ and Ĥ are the estimated output matrices of the feature node layer and the enhancement node layer in the actual SODLRBLS. Then the actual control output is given by Eq. (35), where u_RC is the output of the robust controller. Substituting Eqs. (34) and (35) into Eq. (33) yields Eq. (36). A partially linear form of the receptive-field basis function vector F in Taylor series can be described as Eq. (37), where F_wf, F_bf and F_wro are the partial derivatives of F with respect to w_f, b_f and w_ro, and β_1 is a higher-order vector.
F* can also be rewritten as Eq. (39). Similarly, a partially linear form of the receptive-field basis function vector H in Taylor series can be described, where H_wh, H_c, H_v and H_wr are the partial derivatives of H with respect to w_h, c, v and w_r, and β_2 is a higher-order vector.
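Under the definitions above, a plausible explicit form of the two Taylor linearizations is:

```latex
F^{*} = \hat{F} + F_{w_f}\big(w_f^{*}-\hat{w}_f\big)
      + F_{b_f}\big(b_f^{*}-\hat{b}_f\big)
      + F_{w_{ro}}\big(w_{ro}^{*}-\hat{w}_{ro}\big) + \beta_1
\qquad
H^{*} = \hat{H} + H_{w_h}\big(w_h^{*}-\hat{w}_h\big)
      + H_{c}\big(c^{*}-\hat{c}\big)
      + H_{v}\big(v^{*}-\hat{v}\big)
      + H_{w_r}\big(w_r^{*}-\hat{w}_r\big) + \beta_2
```

Here the hatted quantities are the current estimates, the starred quantities are the optimal parameters, and β_1, β_2 collect the higher-order terms.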
H* can also be rewritten as Eq. (42). Then, substituting Eqs. (39) and (42) into Eq. (36), we obtain Eq. (43). Because of the existence of τ, we can use an attenuation constant, λ, to guarantee an H∞ tracking performance [33], as in Eq. (44), where η_K, η_W, η_wf, η_bf, η_wro, η_wh, η_c, η_v and η_wr are diagonal positive-constant learning-rate matrices. The initial conditions of the system are set to e(0) = 0 with all estimated parameters K̂(0), Ŵ(0), ŵ_f(0), b̂_f(0), ŵ_ro(0), ŵ_h(0), ĉ(0), v̂(0) and ŵ_r(0) equal to 0, so Eq. (44) can be rewritten as Eq. (45). The update laws of all parameters in the proposed SODLRBLS-based controller are given by Eqs. (46)-(55), where Eq. (55) is the adaptive law of the robust controller and R = diag[λ_1, λ_2, ..., λ_m] ∈ Rᵐˣᵐ is a diagonal matrix of the robust controller whose elements are used as attenuation coefficients.
The Lyapunov function of the control system is designed empirically as Eq. (56). Taking the derivative of the Lyapunov function and using Eq. (36), we obtain Eq. (57). Substituting Eqs. (46) to (55) into Eq. (57) yields Eq. (58). Integrating Eq. (58) from t = 0 to t = T, we obtain Eq. (59). Since L(T) > 0, we have Eq. (60), which is exactly Eq. (44). This proves that the H∞ tracking control effect can be achieved.

V. SIMULATION RESULTS
In this section, we first conducted an ablation experiment on the Duffing-Holmes chaotic system to show the importance of each component of SODLRBLS. Then, a comparison with other neural-network-based controllers was conducted on a three-link robot manipulator to exhibit the advantages of the proposed work.

A. Ablation Experiment: Duffing-Holmes Chaotic System
In the ablation experiment, the backbone of SODLRBLS was denoted as BLS; the variant with the self-organizing behavior of SODLRBLS removed was denoted as DLRBLS; and the variant with the DLRNN structure removed was denoted as SOBLS. The performances of BLS, DLRBLS, SOBLS, and SODLRBLS were compared to study the influence of the self-organizing behavior and the DLRNN structure on the performance of the SODLRBLS network. The control object is a Duffing-Holmes chaotic system with uncertainty f_d(x) = 0.1x, disturbance d(t) = sin(2t) + cos(2t), initial system states x(0) = ẋ(0) = 0.2, and reference trajectory x_d(t) = sin(1.1t).
The sliding hyperplane was designed as s(t) = 0.5ė(t) + 5e(t), and the diagonal matrix R of the robust controller was set to 0.07I, where I is an identity matrix. The weights, biases, and Gaussian-function parameters of the four networks were initialized with random numbers from [−1, 1]. The learning rate η was set to 0.01. The number of feature nodes in each group was set to 3 for BLS and DLRBLS, and initialized with random integers from [1, 10] for SOBLS and SODLRBLS. During self-organization, no more than 10 feature nodes can exist in each group. The number of feature node groups was set to 2, and the number of enhancement nodes was set to 1 for all four networks. The deleting threshold coefficient P_wc was set to 0.1, and the generating threshold coefficient P_bc was set to 1. We conducted ten simulation runs for each of the four networks and averaged all results.
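The per-run RMSE and the ten-run averages reported below can be computed with a small helper like the following sketch:

```python
import numpy as np

def rmse(tracking_error):
    """Root-mean-square error of one simulation run."""
    e = np.asarray(tracking_error, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))

def average_rmse(runs):
    """Average RMSE over repeated simulation runs (ten in the paper)."""
    return float(np.mean([rmse(r) for r in runs]))
```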
The simulation results are shown in Fig. 3. Fig. 3(a) shows the trajectory phase portraits; Fig. 3(b) shows the tracking responses of x and ẋ; Fig. 3(c) shows the tracking errors and their derivatives. From Fig. 3(a) to (c), all four controllers (BLS, DLRBLS, SOBLS, and SODLRBLS) can follow the reference trajectories well. Moreover, the numbers of remaining feature nodes of SOBLS and SODLRBLS are shown in Fig. 3(d): both the SOBLS-based and SODLRBLS-based controllers can be suitably self-organized. However, the feature nodes of SOBLS changed more drastically, and the number of feature nodes after convergence is larger than that of SODLRBLS.
In addition, Table I shows the average RMSE values of the ten simulations. The average RMSE of the DLRBLS-based controller is significantly smaller than those of the other BLS-based controllers, showing that the double-loop recurrent structure can improve the accuracy of the controller well. Also, the performance of DLRBLS and SODLRBLS was very close, while the result of SOBLS was not as good as that of SODLRBLS; this shows that, with the assistance of the TOPSIS structure, the network size can be reduced with little loss of accuracy.

B. Comparison Experiment: Three-Link Robot Manipulator
The model of a three-link robot manipulator is illustrated in Fig. 4. The dynamic equation of the three-link robot manipulator is expressed as M(q)q̈ + C(q, q̇)q̇ + g(q) + τ_d = u, where M(q) is the inertia matrix, C(q, q̇) is the Coriolis/centripetal matrix, g(q) is the gravity vector, u is the output torque, τ_d = 2 × [0.2 sin(2t), 0.1 cos(2t), 0.1 sin(t)]ᵀ is the external disturbance, q is the joint angle state vector, q̇ is the velocity vector, and q̈ is the acceleration vector. The original state is q = 0 and q̇ = 0. The reference trajectories were set as q_d1 = 0.5 × [0.5 sin(2t + 2.5) + 0.75 cos(2t + 1.5), sin(2t) + sin(t), 0.2 cos(2t) − 0.2 sin(t)]ᵀ, with q̇_d1 = 0 and q̈_d1 = 0.
The reference trajectories were changed to a second set, q_d2, with q̇_d2 = 0 and q̈_d2 = 0, at 15 s to evaluate the robustness of the proposed control system. The sliding hyperplane was designed as s(t) = 0.55ė(t) + 10e(t), and the diagonal matrix R of the robust controller was set to 0.066I, where I is an identity matrix.
The weights, biases, and Gaussian-function parameters of SODLRBLS were initialized with random numbers from [−0.01, 0.01]. The learning rate η was 0.0001. The number of feature nodes in each group of SODLRBLS was initialized with random integers from [1, 10]. The number of feature node groups was set to 3, and the number of enhancement nodes was set to 3. The deleting threshold coefficient P_wc was set to 0.1, and the generating threshold coefficient P_bc was set to 0.1. We again conducted ten simulation runs for each network and averaged all results.
The simulation results are shown in Fig. 5. Figs. 5(a)-(c) show the trajectory responses and tracking errors of Joints 1, 2, and 3. We compared the tracking performance of the SODLRBLS-based controller with that of the DLRNN-based and FBEL-based controllers. All the controllers can handle the robotic system well; however, in terms of the average RMSE values over the ten simulations of the three-link robot manipulator (given in Table II), the tracking performance of the SODLRBLS-based controller was significantly better than those of the DLRNN-based and FBEL-based controllers. In Fig. 5(d), the feature nodes of SODLRBLS are well self-organized both in the initial stage and when the trajectory changes abruptly, so as to meet the requirements of different control states and save computational resources. Note that the number of feature nodes exhibited a chattering phenomenon after 15 seconds; more effort is required to eliminate this chattering. We also calculated the average computational time over the ten simulations in both Tables I and II. In these two tables, our computational time is larger than those of the compared methods, because the self-organization mechanism increases the computational cost.

VI. CONCLUSION
In this work, we proposed a self-organizing neural network used to build a controller for uncertain nonlinear systems. The network combines the key structures of the TOPSIS method and the Broad Learning System. In addition, a double-loop recurrent structure was introduced to improve the network's dynamic characteristics. A Lyapunov function was used to prove the stability of the control system and derive the update rules of the parameters in the proposed network. The network-based control system was used to simulate the control of a Duffing-Holmes chaotic system and a three-link robot manipulator. The simulation results showed that the proposed control system achieves better tracking performance than other network-based controllers. Future study will focus on building a more stable organizing method for TOPSIS to reduce the instability of the node number. In addition, it is also crucial to apply our system to control real-time models and to deal with the chattering problems in real dynamic systems.

Fig. 3. The simulation results for a nonlinear chaotic system. (a) The trajectory phase portraits, (b) the tracking responses of x and ẋ, (c) tracking errors and their derivatives, (d) the number of remaining feature nodes of SOBLS and SODLRBLS.

Fig. 5. The simulation results for a three-link robot manipulator. (a) Trajectory response and tracking error of joint 1, (b) trajectory response and tracking error of joint 2, (c) trajectory response and tracking error of joint 3, (d) the number of remaining feature nodes of SODLRBLS.

TABLE I. THE AVERAGE RMSE VALUES AND COMPUTATIONAL TIME OF 10 SIMULATIONS OF THE DUFFING-HOLMES CHAOTIC SYSTEM.