Application of Pareto Optimization in an Economic Model Predictive Controlled Microgrid

This paper presents an economic model predictive control approach for a linear microgrid model. The microgrid in grid-connected mode represents a medium-sized company building including storage systems, renewable energies and couplings between the electrical and heat energy system. Economic model predictive control together with Pareto optimization is applied to find suitable compromises between two competing objectives, i.e. monetary costs and thermal comfort. Using realworld data from 2018 and 2019, the model is simulated with auto-detection of the Pareto solution which is closest to the Utopia point. The results show that the Pareto optimization can either be used in real-time control of the microgrid, or to obtain suitable weights from long term simulations. Both approaches result in significant cost reductions.


I. INTRODUCTION
The decentralization of the electrical grid is a promising approach to meet national goals on CO 2 emissions reduction. Single communities or companies, so-called microgrids, may use on-site energy resources to reduce their dependency on the grid's stability and/or support it at the same time. Furthermore, with energy prices increasing, they may reduce their monetary costs. For example, electricity peak costs in German industry pricing for the medium-sized company building considered in this study raised from 76.00 e/kW peak in 2017 to 100.01 e/kW in 2019. Thus, reducing peak loads by an intelligent control of on-site energy resources and storages can reduce costs significantly.
However, to reduce energy costs in a microgrid, e. g. by peak shaving, usually compromises have to be made, e. g. limiting the heating, ventilation and air conditioning (HVAC) system. This decision would be taken by a human decision maker (HDM) responsible for the microgrid's energy management. The HDM has to choose compromises between multiple contradicting objectives in general -depending on the current situation and possibly including (uncertain) forecasts for renewable energy resources (RER), weather conditions and energy costs (in case of participation in the energy intraday market). Thus, two questions arise. First, how can an HDM find an 'optimal' control for the energy resources of a building in general? Second, how is the compromise between the contradicting objectives to be chosen?
To address the first question, solutions vary from simplified modeling of a microgrid by linear ordinary differential/difference equations (ODEs), to complex simulation tools resulting in a 'black-box' control approach. To address the second question, the concept of multi-objective optimization (MOO) or Pareto optimization has to be introduced. With at least two objectives, there usually is no single best solution. Instead, there is an infinite set of solutions which are not dominated by any other solution. Namely, considering two objectives, there is no other solution which is better in objective 1 without being worse in objective 2. The complete set of these non-dominated solutions is called Pareto front. However, so far literature of applications covering both problems is sparse.
In [1], a microgrid is modeled with energy storage systems, a controllable diesel generator and RER by linear ODEs. Flexibility in the control decisions is imposed by distinguishing critical and flexible loads while an optimization problem in model predictive control (MPC) fashion is formulated. However, a curtailed fulfillment of the flexible loads is only penalized proportionally. Namely, in the cost function, the unfulfilled power demand multiplied with a constant weight factor is added. Consequences of the (un-)fulfillment of flexible loads are not respected, neither on any physical quantity nor on future (flexible) loads. Furthermore, no possibly better trade-off by varying the weight factor is considered. In [2], a hierarchical structure is used to model multiple subsystems in more detail on the low-level. On the high-level, economic model predictive control (EMPC) is applied to optimize the total cooling and heating loads for every subsystem. They consider a microgrid with multiple buildings, where each building's zone temperatures and thermal storages are used as states and described by ODEs. While they present a feasible approach by applying complex control algorithms to decomposed, simplified models of the microgrid, they do not consider any trade-offs in finding the best compromise between monetary costs and user comfort. One of the studies which respect the optimization of multiple objectives is [3]. A building performance simulation tool is used to model the thermal characteristics of a three-room building. Then, they use genetic algorithms to optimize their decision variables, i. e. hourly temperature setpoints, for two objectives, i. e. monetary costs and thermal comfort. However, since the genetic algorithms and model are used in a black-box fashion, only sub-optimal solutions are obtained. Furthermore, the optimization is run once a day -with a planning horizon of one day. Thus, the main idea of MPC, optimizing the decision variables for N p steps, applying only the first step and then repeating the optimization for N p steps, is violated. Since the optimization time required is in the order of an hour, the approach is not applicable in real-time control, especially with a more complex microgrid. The authors of [4] consider a typical microgrid consisting of a storage system, RERs, a diesel generator and critical and controllable loads. Five (partly) contradicting objectives are formulated, e. g. energy costs and reliability of service. Then, these five objective functions are summed with (equal) constant weights w i and the resulting cost function is optimized over a time horizon of one day. This procedure is described as MOO and multiple Pareto fronts for two objectives each are shown, e. g. for J j and J k . No weighted sum method is used to obtain the Pareto fronts. Instead, for one point in time, all feasible solutions to the optimization problem found are presented. Then, all solutions for which J j and J k do not dominate each other are considered to be part of the Pareto front. Furthermore, no approach on how to use and derive a benefit from the obtained Pareto fronts is presented.
For this study, we examine a medium-sized company building in Offenbach, Germany, as our microgrid. Furthermore, we used expert knowledge for this specific building to determine its thermal capacity and heat transfer coefficient. Measurement data for energy demand, photovoltaic (PV) energy production and outside air temperature is available for 2018 and the first half of 2019. We model the building's energy management as a linear time-invariant system subject to disturbances. Then, we formulate two contradicting objectives, i. e. monetary costs J mon and thermal comfort J comf , and determine control sequences for the HVAC system and controllable energy resoruces in a MPC fashion. Furthermore, in every time step, we calculate all possible compromises between J mon and J comf , i. e. the Pareto front. Then, the most likely desired compromise is selected by detecting the frontier's knee point and either chosen automatically or presented to the HDM as a suggestion.
The assessment of the simulation results is, however, not trivial. Since this work's focus is on the application of the Pareto optimization, we need to compare 'optimal' results with other 'optimal' results. Thus, a combination of a P controller and if-then-else rules is used as a realistic baseline solution to which the different applications of EMPC are compared to.
The remainder of this paper is structured as follows. In Section II, the problem formulation is given, i. e. the microgrid model and the EMPC scheme. In Section III, the concept of Pareto optimization is explained together with our selection of a solution. In Section IV, the simulation results are presented and assessed. Finally, a conclusion and an outlook are given in Section V.

A. Microgrid Modeling
We model the energy flow and temperature of a mediumsized company building as a linear time-discrete model with two states and disturbances, i. e.
x(k + 1) = Ax(k) + Bu(k) + Sd(k). (1) Thereby, x(k) ∈ X ∈ R n is the state space vector, u ∈ U ⊆ R m the input vector, d ∈ D ⊆ R q the disturbance vector and A, B and S the system, input and disturbance matrices with appropriate sizes each. Furthermore, constraints on the change of x apply, i. e. (x(k + 1) − x(k)) ∈ ∆X ⊆ R n . Particularities are a combined heat and power plant (CHP), a stationary battery, and a PV plant as RER. The microgrid operates in grid-connected mode.
For the electrical system, we consider a stationary battery and its stored energy E as the first state. For the thermal system, we consider the thermal capacity of the building as a storage and use the building temperature ϑ b as its state. Then, the building's energy system can be described by with the abbreviation µ = 1−e − H air C th Ts

Hair
. The parameters used in A, B and S are explained in Table I. The inputs are the electrical power P grid bought from or sold to the grid, the electrical power P chp from the CHP, the thermal heating powerQ rad from the gas heating and the thermal cooling powerQ cool from an air conditioning system. Note that the CHP can only produce electrical power P chp and thermal powerQ chp at the same time (see Table I). As disturbances, we consider all impacts on our system which cannot be controlled. Thus, the renewable energy P ren from the PV system is treated as a disturbance, too. Other disturbances are the power demand P dem which must be met (e. g. consumption from offices), the outside air temperature ϑ air , and any other thermal powersQ other acting on the building (e. g. heat losses to the ground).

B. Economic Model Predictive Control
Due to its lower restrictions on the cost function, EMPC turned out to be a suitable extension of regular MPC for its application on microgrid control. Namely, for a model as in (1), in every time step the optimization problem is solved. Thereby, N p is the length of the prediction horizon, is the final cost term and ℓ(x, u) is the stage cost.
In contrast to regular MPC, ℓ(x, u) does not need to be in quadratic form. This complicates conclusions on stability and requires (strict) dissipativity for most approaches. However, proving stability for systems with disturbances such as (1) using EMPC is still an open problem [5]. Thus, we limit stability considerations here to the remark that the autonomous system (2) is Lyapunov stable. Thus, together with the assumption that all disturbances are known, the EMPC controller is expected not to destabilize (2), which is encouraged by simulation results.
In general, MPC is suited as a control method for microgrids for two reasons: 1) it respects future predictions of the model, inputs and especially disturbances, which is important considering the energy management system's dependence on exogenous influences such as power demand, air temperature and renewable energy productions; 2) various constraints on both control and state variables can be handled [1].

C. Cost Function Formulation
To use EMPC to control the microgrid model of the building, the formulation of the optimization problem in a MPC manner is necessary. For this, the discrete linear model from (2) is used with a sampling time T s = 0.5 h and a prediction horizon of 24 h, i. e. N p = 48 steps. Furthermore, the stage costs consist of two competing objectives, i. e. monetary and comfort costs. The monetary costs consist of four parts, ℓ mon = ℓ mon,grid + ℓ mon,peak + ℓ mon,chp + ℓ mon,heat , which describe the actually arising monetary costs. The first part are the costs (or profits) for electricity bought from (or sold to) the grid, ℓ mon,grid (k) = c grid,buy (k) · P + grid (k) . . .
+ c grid,sell (k) · P − grid (k) · T s . Remark that in this study, German industry pricing is applied, i. e. c grid,buy = 0.13 e kWh and c grid,sell = 0.07 e kWh are constant. Since buying and selling has to be distinguished in the costs, P grid is split into P + grid and P − grid , depending on whether it is positive or negative, respectively, i. e. P grid (k) = P + grid (k) + P − grid (k) for any k.
The second part in (5) describes the peak costs. Namely, at the end of the year, the maximum power peak drawn from the grid is punished with a high peak cost factor c grid,peak (87.38 e kW for 2018 and 100.01 e kW for 2019). These peak costs are given by where P grid,peak (k) is the maximum peak until time step k.
The third part in (5) describes the costs from using the CHP, The last part in (5) describes the costs from the gas heating, which are expensive in comparison to the CHP, i. e.
Note that both Eqs. (6) and (7) are discontinuous in P grid , which can lead to problems in solving the optimization problem. However, it is possible to reformulate the cost function into a continuous linear programming problem by an epigraph reformulation [6], which requires additional decision variables and constraints.
Our second objective in (4) comprises the comfort costs, which describe the quadratic deviation of the building temperature from a desired setpoint of ϑ set = 21°C, Since the monetary costs (5) can have a significantly higher order of magnitude than the comfort costs (10), they need to be weighted in (4). Thus, the actual stage costs used are Omitting the final cost term in (3) and using the stage costs from (11), for t = 0 the optimization problem can then be formulated as However, it is not clear how w mon and w comf should be chosen. Thus, Pareto optimization is used.

III. PARETO OPTIMIZATION
As the problem formulation (12) shows, (E)MPC is a singleobjective optimization technique. To consider multiple objectives, the sum of all objectives with different weights can be used as a single cost function, as is done in (11). However, two problems arise. First, the weights w i need to be chosen, thereby respecting both possibly different importance and orders of magnitude of the competing objectives. Second, fixed weights may not be optimal in terms of adaptability to different circumstances of the underlying problem, e. g. due to time-varying energy costs. Thus, we consider the microgrid control as a MOO problem; both monetary costs and temperature deviation should be minimized. Unfortunately, solving MOO problems is complex, since not a single optimal solution exists. Instead, for n o objectives, a hyperplane in R no describes the infinite 'optimal' solutions.

A. Pareto Front Construction
This hyperplane is called the Pareto front. All points on the Pareto front are Pareto optimal; i. e. every point refers to a solution which is not dominated by any other solution. Formally, a solution y dominates another solution z, i. e. y ≻ z, if y i ≤ z i ∀ components i and y j < z j for at least one component j. In other words, for a solution y on the Pareto front, there cannot be any other solution z which would be better in one objective without being worse in at least one other objective.
Knowing the Pareto front for an MOO problem is valuable. In our use case, an HDM gets the overview of all possible solutions and can choose from them. However, calculating the Pareto front is in general a hard problem. While different approaches exist, two of them are the most popular. First, evolutionary algorithms such as NSGA-II are used frequently [7]. However, the representation of MPC in form of evolutionary algorithms is difficult. More important, they are generally slow. To apply (E)MPC in real time, solving the underlying optimization problem in every time step must be fast enough. Thus, we use the second popular approach instead; the weighted-sum (WS) method. WS is directly transferable to the formulation of EMPC. Namely, we vary the weights in (11) s. t.
w mon , w comf ∈ [0, 1] and w mon + w comf = 1. (13) Then, every combination of w mon and w comf leads to a different point on the Pareto front. Since n o = 2, our Pareto front is a curve in 2D space as shown in Figure 1.
Concluding, the biggest advantages of WS are its simpler implementation (there is no representation of the problem in a genetic form needed), its good convergence and thus its lower computational costs. In contrast, there are two major obstacles.
First, capturing non-convex parts of the Pareto front is not possible with WS. However, since (12) is a convex problem, this does not affect us. Second, finding evenly distributed solutions on the front using WS is difficult. Namely, using equidistant steps respecting (13) in general does not lead to equidistant points on the Pareto front, but can lead to substantial parts being skipped. However, to choose a solution from the Pareto front, i. e. a compromise, an HDM would want to consider its complete course. Since the Pareto front cannot be obtained in an algebraic form, it shall be at least sampled in a desired resolution. Thereby, distances between solutions on the Pareto front are calculated in a normalized space, i. e. the distance from the Utopia point to the extreme points of every dimension is considered to be 1. The Utopia point is defined by (l mon,min , l comf,min ) and thus no possible solution, but lies below the Pareto front (see Figure 1).
To calculate samples on the Pareto front efficiently, we adapt a recently published approach, the adaptive weight determination scheme (AWDS) [8]. AWDS works by geometric interpretation of the weighted-sum method. Having found n o solutions on the Pareto front, they can be used to determine new weights w i for which the optimization problem should be solved again. The solutions of the optimization problem with w i leads to a point of the Pareto front which is approximately in the middle between the first two points. Then, for every new combination between the solutions new weights can be determined which lead again to a solution approximately in the middle between the parent solutions. For details on the AWDS, the reader is referred to [8].

B. Choosing Solutions from the Pareto Front
Once the Pareto front is constructed, a solution has to be chosen. Since every point of it is an 'optimal' solution, there is no clear way to determine a best one. Thus, the aim is usually to choose a point from which a small improvement in one objection would lead to a bigger deterioration in the other objective, i. e. a so-called knee point. However, there are various definitions of knee points [9]. Most common, for n o = 2, it is considered to be the point with the highest curvature. However, for simplicity and lower computational costs, we choose a different approach. We define the closest to Utopia point (CUP) to be our knee point. Thereby, we calculate the Euclidean distance in the normalized space as explained before.
In a real-world application, the Pareto front and the autodetected knee point would be presented to an HDM. Then, he could decide whether to choose the CUP or a different solution. However, to assess the application of Pareto optimization in the long term, we used the auto-detection of the CUP as described above for the simulations in Section IV.

IV. CASE STUDIES
The derived EMPC approach is applied with and without Pareto front calculations to the medium-sized company building's model from Section II-A. However, since in both cases the results are 'optimal' regarding their problem formulation, we developed a deterministic controller combining if-elsethen rules and a discrete P controller as a baseline solution to assess whether the application of EMPC is beneficial or not.
First, the baseline solution's controller is explained. Second, simulations for 2018 are carried out. Thereby, we compare the baseline solution to the EMPC algorithm with both auto-detection of CUP and fixed weights. The weights are determined a posteriori from the CUPs. Third, the first half of 2019 is simulated. Again, we compare the baseline solution to the EMPC algorithm with and without fixed weights. However, this time the mean weights obtained from the 2018 simulation are used to assess how well the determined weights perform in a new environment setting.
For all simulations, real world measurement data from the company building is used for the disturbances. That is, the electricity consumption P dem has been measured directly, the power generation P ren from PV has been proportionally scaled from solar radiation measurements and the outside air temperature ϑ air is used from a weather station. As heat disturbancesQ other , only heat losses to the ground are considered, which can be assumed to be approximately constant,Q other ≈ −12.9 kW. In any case, we assume to have no prediction errors, i. e. the real disturbances are used as predictions within the prediction horizon.

A. Baseline Solution
The baseline solution is supposed to be a realistic approach which might be used in a company building nowadays. However, we note that the actual control of a building is most likely even less sophisticated. We use (2) to derive the ODE for the building's temperature in dependence of the overall heating powerQ tot =Q chp +Q rad +Q cool , We choose a simple proportional controller to determine the inputQ tot byQ where k P is chosen such that the only eigenvalue of (14) is at the origin, i. e. we use dead-beat control.
Then,Q tot is split intoQ chp ,Q rad andQ cool as follows. IfQ tot >= 0, then the CHP is used as much as possible, since it cheaply produces both heating and electrical power. Only ifQ chp is running at its maximum, the gas radiatoṙ Q rad is used. IfQ tot < 0, the building must be cooled, thuṡ Q cool =Q tot .
Furthermore, it must be determined whether the stationary battery should be charged, discharged or neither. Since it shall be used for peak-shaving, the following strategy is applied.
The battery is only discharged if a new peak is arising. If so, it is discharged only as much as necessary to avoid a new peak. Otherwise, it is charged as much as possible without culminating in a new peak. Of course, both charging and discharging constraints are respected, too.
WithQ chp ,Q rad ,Q cool and x(k + 1) determined, the corresponding P grid is given by

B. 2018 Results
As the first case study, we simulate the building's model with data from 2018. With a time step of 0.5 h, this results in 17,520 steps. For all simulations, we used the MATLAB toolbox YALMIP [10] together with the CPLEX demo version, which limits the optimization problem to a maximum of each 1000 decision variables and constraints. With the reformulations mentioned earlier, a prediction horizon of 24 h ∧ = 48 steps results in 291 decision variables and 915 constraints, i. e. the demo version is sufficient. On a regular desktop PC with an Intel i5, the simulation for one year with calculation of the Pareto fronts takes about 15 h.
The knee point detection for 2018 is presented in Figure 2. It shows the chosen (normalized) weight w mon . The means of the chosen weights for the entire year are w mon = 26.47 %, w comf = 73.53 %. The very low weights with w mon < 1 % appear in situations where new peak costs would be possible. Then, ℓ mon has a much higher order of magnitude and thus the CUP in the normalized space belongs to a very small w mon . Possible new peak costs arise in scenarios with high demand P dem and a high air temperature ϑ air , where the electricity consumption of the air conditioningQ cool drives P grid to the peak limit. This setting is also shown in the exemplary Pareto front in Figure 1.
The high weights with w mon > 45 % at k ≈ 2700 appear in the rare situation where the very low ϑ air drives the gas heatingQ rad into its constraint. Then, a higher temperature deviation and higher comfort costs ℓ comf are unavoidable. Thus, the CUP, which is more or less in the 'middle' of the Pareto front, belongs to a higher w mon .
Next, the EMPC has been simulated with fixed weights, i. e. without determining the Pareto fronts. We chose three settings, • equal weights w mon = w comf = 0.5, • the average weights for the entire year, i. e. w mon = 0.2647, w comf = 0.7353, and • the average weights for every month, see Table II.
The results are shown in Figure 3. Thereby, an initial peak of 250 kW has been used. However, the value for the initial peak turned out to be not significant. Note that the costs for the initial peak (21,830 e) are neglected in the following. The baseline solution performs the best in terms of average temperature deviation with 0.11°C, since dead-beat-control is applied as long as the constraints are not active. Thus, the temperature mainly differs from the given setpoint of 21°C due to the uncompensated disturbances. However, it performs worst in terms of monetary costs. Since the air conditioning is used regardless of possible new peak costs, the maximum peak rises to 650.25 kW, resulting in unnecessary high peak costs with monetary costs of ≈ 362,000 e. In comparison, the EMPC with equal weights leads to a significant cost reduction of 36,764 e. As for all EMPC settings, the maximum peak is reduced to 384.46 kW, which turned out to be unavoidable due to the maximum demand P dem = −630.55 kW in January. However, since no weighting is applied, the average temperature deviation is uncomfortable high with 0.51°C.
The auto CUP selection seems to find a better trade-off. The monetary savings compared to the baseline solution are a bit lower with 29,136 e. However, the average temperature deviation is reduced to acceptable 0.2296°C. Using the mean weights from the entire year instead leads to similar results with a slightly higher focus on cost reduction, i. e. saving 32,806 e with an average temperature deviation of 0.2430°C. However, using different weights for every month leads to a higher focus on reducing the temperature deviation in comparison to the CUP solutions, i. e. saving only 28,139 e with an average temperature deviation of 0.2052°C.
Concluding, taking mean values for the weights obtained from the Pareto solutions performed similar well as the use of CUPs in every time step, while reducing the time windows for average means did not yield significant benefits. However, using average weights could be done only a posteriori. Thus, we need to validate whether means from former simulations still work well for new and unseen disturbances.

C. 2019 Results
The same settings as before have been simulated with disturbance data from the first half of 2019. Namely, the fixed weights are the ones obtained from the 2018 Pareto simulation. Figure 4 shows the results. Again, the baseline solution has the lowest average temperature deviation but the highest peak costs. The EMPC with equal weights behaves the same, too. It has the lowest monetary costs with 167,420 e, but a very high temperature deviation of 0.54°C.

V. CONCLUSION
In this paper, we presented an approach to use EMPC together with a linear second order model to control the energy management system of a medium-sized company building. The Pareto front for two competing objectives, monetary costs and thermal comfort, is efficiently determined using the AWDS scheme and the CUP is chosen as a reasonable tradeoff. This way, the building's thermal capacity can be used as an additional storage to avoid high peak costs. Simulations with real world data showed that the Pareto solutions can either be used for live control without the need of a priori weights, or to determine sophisticated weights a posteriori. of the AWDS to determine the Pareto fronts, our approach is scalable to more objectives and computationally not too demanding. Using CPLEX, the calculation of a single Pareto front takes only 3 to 4 seconds on a regular desktop PC and is thus appicable in real-time. Therefore, future work will involve both more objectives such as CO 2 emissions as well as more complex systems, e. g. including electrical vehicle charging stations and multiple temperature zones.