Valley, a minimum path between mountains. Photo by http://dreamicus.com

Using an MPC to control a system: The error function

7 min read · Nov 14, 2020

In this series, we have been learning how to create an MPC (model predictive controller). This is a control technology that is more powerful than the well-known PID but is also more complex and difficult to implement.

To do it, we have seen that we need a system that we can model accurately enough. For this project, we chose a system that heats up a solution using steam, and we deduced two equations that model it. You can read about that here:

Using an MPC to control a system: defining the system

Have you ever wondered how the cruise control in a car keeps its speed constant? How the pressure of a certain gas in…

tiagomiguelrs.medium.com

Later we deduced the full State-Space Equation System (SSES), in which the first equation tells how the system behaves and the second isolates the variable we want to control. We discretized the system and then deduced the calculations to obtain predictions for a horizon of time samples, going into detail on how the control inputs are used not only to predict but also to decrease the error in the next iteration. All this is written here:

Using an MPC to control a system: discretizing the state-space equation system

Last time we discussed what controllers are and some examples where they are used. We also defined a system to control…

tiagomiguelrs.medium.com

After talking about all this there are still some steps missing. We still need to know how we calculate the vector of the control inputs, from which we are going to use the first value to control our system for the next iteration. Let’s recall the system of equations:

This system calculates the values of the states over the prediction horizon. It depends on the current states and on the control inputs one time sample before the predicted states. To bring it closer to reality, we are going to slightly change how we calculate the predictions.
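As a reminder of what the prediction step does, here is a minimal sketch of rolling a discretized state equation forward over a horizon. The matrices and values are placeholders made up for illustration, not the steam heater model from the previous publications, and the function name `predict_states` is hypothetical.

```python
import numpy as np

def predict_states(A_d, B_d, x_now, inputs):
    """Roll x[k+1] = A_d x[k] + B_d u[k] forward, one step per control input."""
    x = np.asarray(x_now, dtype=float)
    predictions = []
    for u in inputs:
        # Each predicted state uses the state and the input one sample earlier.
        x = A_d @ x + B_d @ np.atleast_1d(u)
        predictions.append(x)
    return np.array(predictions)

# Placeholder 2-state, 1-input system (NOT the model deduced in this series).
A_d = np.array([[0.9, 0.1],
                [0.0, 0.8]])
B_d = np.array([[0.0],
                [0.5]])
x_now = np.array([1.0, 0.0])
inputs = [1.0, 1.0, 1.0]  # made-up control inputs for a horizon of 3 samples

X_pred = predict_states(A_d, B_d, x_now, inputs)
print(X_pred)  # one row of predicted states per time sample
```

Each row of `X_pred` is built from the row before it, which is the substitution chain described in the previous publication.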

Changing the control input

Currently we are using values of absolute pressure, but we should use its variation instead. This makes sense when you consider that the controller should not ask for a jump from 0.8 to 1.2 atm from one time sample to the next if the real mechanism cannot deliver it.

The pressure in the next time sample is bound by the pressure now, and the variation between them depends on how the pressure-changing mechanism works. Is it a fast or a slow mechanism? Could it go from the minimum to the maximum pressure in 0.05 seconds, for example? This needs to be accounted for because whatever the controller sends must be executable by the real system, or else the system may not be controllable.

Even if it is possible for the mechanism to change so quickly, for the sake of having a more stable pressure variation and reducing wear and tear of the mechanism, we are going to control for pressure variation instead of absolute pressure.

A new pressure change (Δθ) vector is calculated at every iteration and added to the absolute pressure of the previous iteration. Note that this update is not used to calculate the predictions. We write these quantities as vectors so that they would cover all control input variables if our system had more than one. Let’s include this modification in the SSES.
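The update itself is a single addition. The sketch below shows it for one control input, together with a limit on the change per time sample. The limit, the pressure bounds and the function name `apply_pressure_change` are assumptions added for illustration; they are not part of the model in this series.

```python
def apply_pressure_change(theta_prev, delta_theta, max_step, theta_min, theta_max):
    """Return the new absolute pressure after adding a limited pressure change."""
    # Assumed actuator limit: the pressure can move at most max_step per sample.
    step = max(-max_step, min(max_step, delta_theta))
    # Add the change to the previous absolute pressure and keep it in range.
    return max(theta_min, min(theta_max, theta_prev + step))

theta = 0.8  # atm, absolute pressure from the previous iteration
for requested_change in [0.4, 0.4, -0.05]:  # made-up controller requests
    theta = apply_pressure_change(theta, requested_change,
                                  max_step=0.1, theta_min=0.5, theta_max=1.2)
    print(f"requested {requested_change:+.2f} atm -> pressure is now {theta:.2f} atm")
```

With these made-up numbers, a request to jump from 0.8 to 1.2 atm in one sample is cut down to a 0.1 atm step, which is the kind of physical constraint discussed above.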

The matrices of parameters and the vector of states have changed. We can consider that our A_d and B_d matrices and X vectors have been expanded, which is represented by the tilde. Will this substantially change what we have done so far? No, not at all! Predictions can still be calculated by substituting the variables from the Kth iteration to find those in the (K+1)th iteration, then substituting the variables from the (K+1)th iteration to predict those in the (K+2)th iteration, and so on.

We’ll change the second equation accordingly.
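One common way to build such an expanded system is to append the previous control input to the state vector. The sketch below assumes that construction, which may differ in detail from the equations in the original figures, and uses placeholder matrices rather than the real model. It checks that one step of the expanded system gives the same result as one step of the original system with the updated absolute pressure.

```python
import numpy as np

def augment(A_d, B_d):
    """Build expanded matrices for the state [x; previous control input]."""
    n, m = B_d.shape
    A_tilde = np.block([[A_d, B_d],
                        [np.zeros((m, n)), np.eye(m)]])
    B_tilde = np.vstack([B_d, np.eye(m)])
    return A_tilde, B_tilde

# Placeholder 2-state, 1-input system (NOT the model deduced in this series).
A_d = np.array([[0.9, 0.1],
                [0.0, 0.8]])
B_d = np.array([[0.0],
                [0.5]])
A_tilde, B_tilde = augment(A_d, B_d)

x = np.array([1.0, 0.0])       # current states
theta_prev = np.array([0.8])   # absolute pressure of the previous iteration
delta_theta = np.array([0.1])  # pressure change chosen for this iteration

# Expanded form: the control input is the pressure change.
x_tilde = np.concatenate([x, theta_prev])
x_tilde_next = A_tilde @ x_tilde + B_tilde @ delta_theta

# Original form: the control input is the absolute pressure.
theta_now = theta_prev + delta_theta
x_next = A_d @ x + B_d @ theta_now

print(np.allclose(x_tilde_next[:2], x_next))     # states agree
print(np.allclose(x_tilde_next[2:], theta_now))  # stored pressure agrees
```

The last entry of the expanded state simply carries the absolute pressure along, so the prediction chain works exactly as before.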
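As a sketch of what that change can look like, assume the second equation has the usual form with output y, and that the expanded state stacks the states on top of the previous control input. Both are assumptions here, since the original figure is not reproduced. Substituting the pressure update into the output equation gives:

```latex
\begin{aligned}
y_K &= C_d\, x_K + D_d\, \theta_K \\
    &= C_d\, x_K + D_d\, (\theta_{K-1} + \Delta\theta_K) \\
    &= \underbrace{\begin{bmatrix} C_d & D_d \end{bmatrix}}_{\tilde{C}_d}
       \underbrace{\begin{bmatrix} x_K \\ \theta_{K-1} \end{bmatrix}}_{\tilde{x}_K}
       + \underbrace{D_d}_{\tilde{D}_d}\, \Delta\theta_K
\end{aligned}
```

Under these assumptions, if D_d is zero then the output depends only on the expanded state.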

One question mentioned in the previous publication still persists: how do we calculate the control inputs?

The cost function

The cost function is a function whose value, the cost, we either minimise or maximise by changing its independent variables. In this situation, the independent variables are the temperature and the control input. Instead of temperature, we could say that it is the error between the temperature and its reference value, since the error depends on the temperature.

The problem at hand requires that we minimise the cost. For that, we need a function that has at least one minimum that can be found through differentiation. Take a look at the following examples:

Error dependent cost of a linear (J1), quadratic (J2) and cubic (J3) functions. Figure by author.

Imagine all three functions calculate costs J1, J2 and J3 that depend on a single variable, the error e. We can see that the quadratic function has a minimum value we can find, whereas the linear and cubic functions keep decreasing as the error decreases, so they never reach a minimum. Let’s define our quadratic cost function that depends on the temperature error and control input.

To make things more general, we are going to use a reference vector r. This is a column vector containing the references for all controlled variables. In our case, it holds just the reference value for temperature, T_RK, in which the subscript R stands for reference. The error vector is the difference between the reference and the controlled variable. The calculation of the controlled variables is simplified because the tilde D_d matrix is zero. Following is the error function in algebraic form:

The superscript T means transposition. Can you see the quadratic shape? I’ll explain it for a system with two target variables (1 and 2) instead of only one, as it is easier to understand.
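Before moving on, the difference between the three candidates can be checked numerically. The sketch below assumes the simplest versions of each function (e, e² and e³), which may not be the exact curves in the figure, and looks for the lowest cost on a grid of errors.

```python
def linear(e):
    return e

def quadratic(e):
    return e ** 2

def cubic(e):
    return e ** 3

# Grid of errors from -3.0 to 3.0 in steps of 0.1.
errors = [i / 10 for i in range(-30, 31)]

candidates = {"linear": linear, "quadratic": quadratic, "cubic": cubic}
# For each cost function, find the error on the grid with the lowest cost.
best = {name: min(errors, key=cost) for name, cost in candidates.items()}

for name, e in best.items():
    print(f"{name:9s}: lowest cost on the grid at e = {e}")
```

Only the quadratic function has its lowest cost inside the grid, at e = 0. For the other two, the lowest cost sits at the edge of the grid and would keep moving if we widened it.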

You can see that the way the matrices and vectors unfold reproduces a shape we are more used to seeing as quadratic, with the variables raised to the power of two. The diagonal matrices S, Q and R are weight matrices. They modulate how easily the errors and control inputs should change from one prediction to the next. But how, really?
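Before answering that, here is a small sketch of the unfolding for two target variables, followed by a cost built from the three weight matrices. The assignment of roles is an assumption made for illustration (S weighs the last predicted error, Q the earlier predicted errors and R the control input changes), and so are the function name `quadratic_cost` and all the numbers.

```python
import numpy as np

def quadratic_cost(e_final, e_steps, d_theta_steps, S, Q, R):
    """Sum of weighted squared errors and weighted squared input changes."""
    cost = e_final @ S @ e_final                   # last predicted error
    cost += sum(e @ Q @ e for e in e_steps)        # earlier predicted errors
    cost += sum(d @ R @ d for d in d_theta_steps)  # control input changes
    return float(cost)

# Two target variables: e' Q e unfolds into q1*e1^2 + q2*e2^2.
Q = np.diag([2.0, 5.0])
e = np.array([3.0, -1.0])
quadratic_form = e @ Q @ e
expanded = 2.0 * 3.0 ** 2 + 5.0 * (-1.0) ** 2
print(quadratic_form, expanded)  # both expressions give the same number

# Made-up weights and values for a single evaluation of the cost.
S = np.diag([10.0, 10.0])
R = np.diag([0.5])
J = quadratic_cost(e_final=np.array([1.0, 0.0]),
                   e_steps=[e],
                   d_theta_steps=[np.array([2.0])],
                   S=S, Q=Q, R=R)
print(J)
```

Because the weight matrices are diagonal, every term is just a weight multiplied by a squared variable, with no cross terms between variables.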

Stiffness of the cost

Take a look at the general equation of a quadratic function for one independent variable:

To create a parallel to our system, y can represent the cost and x the error. We’ll see what b does later, but for now let’s see how a affects the shape of our curve.

Effect of a on the shape of a quadratic function. Figure by author.

We can see that as a increases, the slope on both sides of the minimum becomes much steeper. This creates some stiffness in the ability of x to move around: choosing a value of x other than x1 translates into much higher costs for bigger values of a.
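To put numbers on this, the sketch below assumes the quadratic has the form y = a(x − x1)² + b, which matches the description of x1 as the location of the minimum but may not be written exactly like the equation in the figure. It measures how much the cost rises when x moves the same distance away from x1 for different values of a.

```python
def quadratic(x, a, x1=2.0, b=1.0):
    """Assumed form of the quadratic: minimum at x1, offset b."""
    return a * (x - x1) ** 2 + b

x1 = 2.0
offset = 0.5  # how far we move x away from the minimum

extra_cost = {}
for a in (0.5, 1.0, 4.0):
    # Cost increase caused by choosing x1 + offset instead of x1.
    extra_cost[a] = quadratic(x1 + offset, a) - quadratic(x1, a)
    print(f"a = {a}: moving {offset} away from x1 adds {extra_cost[a]:.3f} to the cost")
```

The same step away from x1 costs eight times more with a = 4 than with a = 0.5, which is the stiffness we see in the figure.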

The role of the a parameter in our actual cost function is played by the diagonal S, Q and R matrices. The bigger their values, the stiffer the changes in temperature and control input will be. This is important because it affects the dynamics of our MPC. For instance, if R is small, bigger values of pressure change are more acceptable than if R is big, since they don’t impact the cost as much.

This may help bring the system to the desired values of temperature, but at the possible cost of greater wear and tear on the pressure actuator or even overshoot of the control. So, a good compromise is usually the best policy.
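A toy example makes the trade-off visible. The sketch below assumes a single prediction step with scalar weights q and r, and a hypothetical `gain` that says how much one unit of pressure change reduces the temperature error in the next sample. None of these numbers come from the real system. Setting the derivative of this one-step cost to zero gives the best pressure change in closed form.

```python
def cost(delta_theta, error_now, q, r, gain):
    """One-step cost: weighted squared remaining error plus weighted squared input change."""
    remaining_error = error_now - gain * delta_theta
    return q * remaining_error ** 2 + r * delta_theta ** 2

def best_pressure_change(error_now, q, r, gain):
    """Minimiser of cost(), obtained by setting its derivative to zero."""
    return q * gain * error_now / (q * gain ** 2 + r)

error_now = 1.0  # made-up temperature error (reference minus temperature)
gain = 5.0       # hypothetical: error removed per atm of pressure change
q = 1.0          # weight on the temperature error

changes = {r: best_pressure_change(error_now, q, r, gain) for r in (0.1, 10.0, 100.0)}
for r, d in changes.items():
    print(f"R = {r:6.1f} -> best pressure change = {d:.3f} atm")
```

As R grows, the best pressure change shrinks: the controller moves the actuator more gently and accepts a larger remaining error in return.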

Adding and subtracting a constant

Let’s backtrack a little. I said we were going to talk about b. This parameter is just a constant that is added to or subtracted from the cost. However, it doesn’t change the position of the minimum, so getting rid of it just reduces the calculations our machine needs to perform. For big systems that demand a lot of control, this can be worthwhile.

Effect of b on the shape of a quadratic function. Figure by author.

As you can see, x1 is the value of the independent variable for which y is minimum in both situations, regardless of b.
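The same conclusion can be checked with a few lines of code. The sketch assumes the quadratic has the form y = a(x − x1)² + b, which may be written differently in the figure, and searches a grid of x values for the minimum with three made-up values of b.

```python
def quadratic(x, a, x1, b):
    """Assumed form of the quadratic: minimum at x1, shifted up or down by b."""
    return a * (x - x1) ** 2 + b

# Grid of x values from -5.00 to 5.00 in steps of 0.01.
xs = [i / 100 for i in range(-500, 501)]

minimisers = {}
for b in (-3.0, 0.0, 7.0):
    # Location of the lowest value of y on the grid for this b.
    minimisers[b] = min(xs, key=lambda x: quadratic(x, a=1.5, x1=2.0, b=b))
    print(f"b = {b:+.1f}: minimum at x = {minimisers[b]}")
```

The curve moves up and down with b, but the x that minimises it stays at x1, so dropping the constant does not change which control inputs come out as best.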

Closing thoughts

In this publication, we continued building our MPC. We saw how to make some changes to the control input based on some real-world considerations, namely changing the control of absolute pressure to pressure variation.

We also set a cost function, talked about whether it should be linear, quadratic or cubic and decided based on the ability to find a minimum.

Finally, we discussed the role of two of the parameters in a quadratic function and how they can influence the results of our cost function.

Next time, we are going to simplify the cost function and deduce its derivative so that we can calculate the minimum. Hopefully I’ll also share some animations and close this chapter.

Again, thanks for reading!