Creating a predictive model for electricity transmission losses

Karas Peter, Vrablecová Petra, Lucká Mária, Lóderer Marek, Grmanová Gabriela, Rozinajová Viera

Abstract

Modern power systems are experiencing an increase in power demand and changes to complex interconnected power networks comprising conventional and renewable energy sources. Maintaining a balance between generation and demand is important for the reliable operation of power networks. Besides forecasts of generation and demand, forecasts of transmission losses play an important role in the decision-making of system operators. 

In this paper we present several machine-learning-based models for day-ahead line loss prediction. In addition, we investigate the impact of derived features that capture the evolution and trend of selected features, and of line loss itself, over the recent time period. We evaluate the selected methods on data enriched with different derived features in an extensive set of experiments. The best-performing model is based on gradient boosting on regression trees (CatBoost); in addition to the original data attributes, it uses curve-based features and a feature expressing the trend of line loss computed by an Exponentially Weighted Moving Average.

Introduction

Modern power systems are experiencing an increase in power demand and changes to complex interconnected power networks comprising conventional and renewable energy sources. Maintaining a balance between generation and demand is important for the reliable operation of power networks. Besides forecasts of generation and demand, accurate forecasts of transmission losses play an important role in the decision-making of system operators. Incorrectly estimated losses cause increased network fees in the electricity trading process. Since the network fees can make up to 40% of the final electricity price for households, precise forecasts affect all electricity market participants. 

Predicting line losses accurately is a difficult task. Losses are influenced by many factors and are subject to uncertainty, since many parameters are unknown. Line losses can be derived from complex physical relationships based on current, voltage and impedance, using complex numerical models. Another approach is to predict line loss from the amount of electric supply, electric exchange flows, demand and the line loss rate. This approach is based on linear regression and least-squares estimates calculated from statistical data representing the line losses [Sahlin, 2016]. However, both methods require forecasts of some special parameters that are themselves difficult to predict.

At present, many studies successfully use machine learning methods to solve the line loss prediction problem. An LSTM-based method was successfully applied to line loss prediction in Finland [Tulensalo, 2020]. The authors integrated the weather forecast into the model and showed that increasing the size of the training data had a positive effect on overall model performance. Their model outperformed models based on linear regression as well as the then state-of-the-art method for calculating grid loss predictions in Finland. Recently, several other works have focused on using neural networks for line loss forecasting.

The Random Bits Forest (RBF) algorithm, a BP (back-propagation) neural network and a deep LSTM were compared in [Liu, 2021]. The authors calculated losses for the distribution network of a city and compared several LSTM configurations. They showed that a deep LSTM network can predict the line loss of the distribution network well. Line loss and its dependence on active power supply, reactive power supply, distribution line capacity and distribution line length were modeled by a BP neural network in [Huang, 2022] using simulated data.

Besides neural networks, researchers have also experimented with other machine learning methods. The CatBoost method [Prokhorenkova, 2019] (gradient boosting on decision trees) was successfully applied to day-ahead prediction of line losses in [Dalal, 2020]. The proposed machine learning system comprises 24 different models and produces forecasts for three smaller network components that together form the entire distribution network in Norway. The accuracy and quality of the proposed models are compared with baseline predictions. The proposed system lowered the MAE by 41% and, as a consequence, also reduced the financial risk. Line loss prediction for high-voltage transmission lines based on an EEMD-LSTM-SVR algorithm is designed and evaluated in [Ding, 2022]. It uses Ensemble Empirical Mode Decomposition (EEMD) to decompose the time series of line losses into high-frequency, low-frequency and random Intrinsic Mode Functions (IMFs). The IMF components are forecast separately based on multidimensional data and finally summed to obtain the predicted value of the line loss. Some components are predicted by LSTM, while SVR is applied to the others.

Besides the design of a suitable forecasting method, data preprocessing and feature engineering significantly influence the performance of the overall solution. Among others, the Exponentially Weighted Moving Average (EWMA) feature helps to find the trend of line loss over time and suppresses noise [Ding, 2022]. Curve-based features such as the average value, the extremum and the average difference of a particular feature can express how the feature has evolved recently. [Ding, 2022] showed that enriching the data with curve-based features can improve the results.

Our solution is based on published works [Ding, 2022] and [Dalal, 2020]. It combines techniques of feature extraction, forecasting methods and applies them in a methodologically correct way. The result is a proper combination of feature extraction steps and application of machine learning method that outperforms the published results.

Methods

As mentioned before, our task is to forecast line loss from data on electric supply and demand and from meteorological forecasts. The first step of the prediction process is the selection of features and their preprocessing and transformation. Based on related work, we enriched the original set of features with derived features in order to extract as much information as possible from the data. The second step is the application of machine learning methods to line loss prediction. To choose the best solution, we predicted line losses with several selected prediction methods on different combinations of features and evaluated the results.

Feature engineering

Proper feature selection is crucial for the performance of machine learning methods. First, we used calendar features such as the hour and the weekday in cyclic form, together with a holiday indicator. We also considered the available information about grid loads and power demand. Moreover, we selected weather forecast features that were highly correlated with line losses.
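The cyclic form of a calendar feature can be illustrated with a minimal sketch. It assumes the common sine/cosine encoding; the function name `cyclic_encode` is hypothetical, and which of the two values corresponds to the `_x` and `_y` columns of the dataset is not stated in the source.

```python
import math

def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour 0..23) to a point on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

hour_0 = cyclic_encode(0, 24)
hour_12 = cyclic_encode(12, 24)
hour_23 = cyclic_encode(23, 24)

# Hour 23 is closer to hour 0 than hour 12 is, unlike in the raw 0..23 scale.
print(distance(hour_23, hour_0) < distance(hour_12, hour_0))
```

With this encoding, the model sees midnight and 23:00 as neighbours, which a raw integer hour would not convey.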

Curve-based feature extraction from time series data

Since information about power grid characteristics, such as demand and loads, is not available for the time at which line loss predictions are made, we calculated local curve characteristics of those features to describe their evolution in the recent data. We used the average, minimum, maximum and average difference values to describe the average trend and the extreme values. The characteristics were calculated over the last known period, i.e., a sliding window of fixed length was used.
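A minimal sketch of such curve-based features is given here. It assumes that "average difference" means the mean of consecutive differences inside the window; the function name `curve_features` and the sample values are illustrative only.

```python
def curve_features(series, window):
    """Summarise the last `window` values of `series`.

    Returns the mean, minimum, maximum and mean consecutive difference.
    """
    recent = series[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two values in the window")
    diffs = [b - a for a, b in zip(recent, recent[1:])]
    return {
        "mean": sum(recent) / len(recent),
        "min": min(recent),
        "max": max(recent),
        "mean_diff": sum(diffs) / len(diffs),
    }

# Hypothetical load values; only the last four fall into the window.
load = [10.0, 12.0, 11.0, 15.0, 14.0]
features = curve_features(load, window=4)
print(features)
```

In the setting described in the text, the window would cover the last known period of hourly values rather than four points.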

EWMA

Similarly to the characterization of demand and grid loads by curve-based features, the trend of line loss within the last known period is valuable information for forecasting line loss. The trend was represented by an Exponentially Weighted Moving Average (EWMA) [Ding, 2022, Agami, 2011]. This statistical method computes a moving average by assigning exponentially decreasing weights to historical data. Because the most recent data receive the highest weights, they have the greatest influence on the estimate for the near future. This allows the EWMA to respond quickly to recent changes in the data while still incorporating the most important historical information.
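The recursive form of the EWMA can be sketched as follows. The recursion s_t = α·x_t + (1 − α)·s_{t−1} with s_0 = x_0 is the standard definition; the smoothing factor `alpha=0.5` and the sample values are assumptions chosen for illustration, not values from the report.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average with s_0 = x_0."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(x)  # initialise with the first observation
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical line loss values with a step change after the first point.
loss = [10.0, 20.0, 20.0, 20.0]
trend = ewma(loss, alpha=0.5)
print(trend)  # [10.0, 15.0, 17.5, 18.75]
```

A larger `alpha` makes the trend follow recent values more closely, while a smaller one smooths more strongly.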

Prediction methods 

Once the set of features is specified, a suitable prediction method needs to be chosen. We considered CatBoost, SVR and Ensemble Empirical Mode Decomposition (EEMD).

CatBoost

CatBoost is a machine learning algorithm based on gradient boosting on decision trees, designed for solving classification and regression problems [Hancock, 2020]. It uses a combination of feature engineering, decision tree optimization, random permutations, and gradient and ordered boosting to obtain accurate results and high performance on large and complex data sets.

The algorithm iteratively calculates the negative gradient of the loss function with respect to the current predictions. In each iteration, a new tree is fitted to this negative gradient, and its output, scaled by the learning rate, is added to the current predictions.
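The boosting update can be illustrated with a deliberately simplified sketch. It is plain gradient boosting with one-split trees (stumps) on a single feature and squared-error loss, for which the negative gradient equals the residual; it does not reproduce CatBoost's ordered boosting or tree structure. All names and the toy data are hypothetical.

```python
def fit_stump(x, residual):
    """Fit a one-split regression tree to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error on a single numeric feature."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    trees = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - F(x).
        residual = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residual)
        trees.append(tree)
        # Add the tree output, scaled by the learning rate.
        pred = [pi + learning_rate * tree(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(t(xi) for t in trees)

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(x, y)
print(round(model(1), 2), round(model(6), 2))
```

Each round shrinks the remaining residual by a factor of roughly one minus the learning rate on this toy data, which is why many small steps are used instead of one large one.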

SVR

Support Vector Regression (SVR) is a machine learning algorithm that tries to map the input data into a higher dimensional feature space, in which the training data may exhibit linearity, and then linear regression can be used in this new feature space [Ding, 2022]. The parameters of the SVR models can be optimized using Particle Swarm Optimization (PSO) to minimize the error [Kennedy, 1995].
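To illustrate how PSO searches a parameter space, here is a minimal sketch of the basic algorithm. The inertia and acceleration weights, the particle count and the quadratic `toy_objective` are assumptions made for the example; in the setting described above, the objective would instead be the validation error of an SVR model trained with the candidate parameters.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0):
    """Basic particle swarm optimisation within box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights (assumed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_objective(params):
    """Stand-in for a validation error; minimum at (1, 2)."""
    return (params[0] - 1.0) ** 2 + (params[1] - 2.0) ** 2

best_params, best_score = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_score)
```

Because every objective evaluation would involve training a model, the number of particles and iterations directly determines the cost of the search.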

EEMD

In this combined approach, the training data were decomposed using the Empirical Mode Decomposition (EMD) method. In the original paper [Ding, 2022], the high-frequency IMF components were predicted by an LSTM; we initially followed this choice but eventually switched to CatBoost, which predicted the given IMF components more accurately. The low-frequency IMF components are predicted by SVR. Following [Dalal, 2020], training uses a sliding window of a fixed number of days.

Post-processing

For methods where a separate model is created for each hour of the day (SVR), we post-processed the results by smoothing the predictions with a Savitzky-Golay filter using a 5th-order polynomial.
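The idea behind the Savitzky-Golay filter, a least-squares polynomial fit in a moving window, can be sketched with NumPy as follows. The polynomial order 5 comes from the text; the window length of 9 points and the edge handling (shifting the window inwards) are assumptions. In practice, `scipy.signal.savgol_filter` provides a ready-made implementation.

```python
import numpy as np

def savgol_smooth(y, window=9, order=5):
    """Smooth `y` by fitting a local polynomial around each point."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if window > n or window <= order:
        raise ValueError("need order < window <= len(y)")
    half = window // 2
    out = np.empty(n)
    for i in range(n):
        # Shift the window inwards at the edges so it always has `window` points.
        lo = min(max(i - half, 0), n - window)
        x = np.arange(lo, lo + window) - i  # centre coordinates on the target point
        coeffs = np.polyfit(x, y[lo:lo + window], order)
        out[i] = coeffs[-1]  # value of the fitted polynomial at x = 0
    return out

# 24 hypothetical hourly predictions: a smooth daily shape plus noise.
hours = np.arange(24)
rng = np.random.default_rng(0)
raw = 10.0 + 3.0 * np.sin(2 * np.pi * hours / 24) + rng.normal(0.0, 0.5, 24)
smoothed = savgol_smooth(raw)
print(smoothed.shape)
```

A polynomial of degree 5 or lower passes through the filter unchanged, so the smoothing removes only the variation that such a local polynomial cannot represent.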

Evaluation metrics

RMSE stands for Root Mean Squared Error and is a metric used to quantify the overall accuracy of a prediction model by measuring the square root of the average of the squared differences between predicted values and actual values in a dataset.

MAPE stands for Mean Absolute Percentage Error, which measures the average magnitude of the error produced by a prediction model relative to the actual values. The advantage of the metric is that it allows prediction errors to be compared across time series with different absolute values. Moreover, the percentage format is intuitive and easily understood.
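Both metrics follow directly from their definitions, as the following minimal sketch shows. The function names and sample values are illustrative; MAPE is expressed in percent and is undefined when an actual value is zero.

```python
import math

def rmse(actual, predicted):
    """Root of the mean squared difference between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

actual = [10.0, 20.0]
predicted = [12.0, 18.0]
print(rmse(actual, predicted))            # 2.0
print(round(mape(actual, predicted), 6))  # 15.0
```

The example shows why the two metrics can rank models differently: both errors have the same absolute size, but the error on the smaller actual value contributes twice as much to MAPE.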

Experiments

The experiments were performed for different prediction methods and different feature sets (original or enriched data). In addition, smoothing of results was used in some cases.

Data

We evaluated the proposed methods on the Norwegian data measured from December 2017 to May 2020 (30 months) with one hour resolution. We split the whole set into a training and a test set. The training set consisted of the first 24 months from December 2017 to November 2019 and the test set was composed of the last 6 months from December 2019 to May 2020.

Data features used for learning and forecasting include, for example, network losses, network load, temperature forecast, calendar features (such as year, season, month, day of the week and hour in cyclic form, and a holiday indicator), and estimated demand in Trondheim. In addition, we obtained historical weather forecasts from OpenWeather and selected the relevant attributes through exploratory analysis of their relationships to the base dataset.

Based on [Dalal, 2020], we used data from the dataset on grid 1 in our experiments.

Feature extraction and preprocessing

The original set of features was selected based on correlation analysis. Categorical data were removed and missing data were imputed by backward filling. Finally, the selected features were scaled. As in the original paper [Dalal, 2020], to simulate a real prediction of line loss for a particular day, we had to use data that were 6 days old, since newer data were not available at the time of making the prediction. The exhaustive list of original features used for the particular experiments is given in Table 1. The original set was enriched with curve-based features, an EWMA feature, or both. Curve-based features (namely average, minimum, maximum and average difference values) were computed for selected weather attributes using a 7-day sliding window.
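Two of these preprocessing steps, backward filling and the 6-day data delay, can be sketched as follows. The sketch assumes hourly data, so that 6 days correspond to 144 rows, and represents missing values as `None`; the helper names are hypothetical.

```python
LAG_HOURS = 6 * 24  # data older than 6 days are assumed available at prediction time

def backward_fill(values):
    """Replace each None by the next known value; trailing gaps stay None."""
    filled = list(values)
    next_known = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_known
        else:
            next_known = filled[i]
    return filled

def lag(values, hours):
    """Shift a series so that position t holds the value from t - hours."""
    n = len(values)
    return [None] * min(hours, n) + list(values[:max(n - hours, 0)])

print(backward_fill([1.0, None, None, 4.0, None]))  # [1.0, 4.0, 4.0, 4.0, None]
print(lag([1, 2, 3, 4], 2))                         # [None, None, 1, 2]

# Hypothetical hourly load series: feature for hour t is the load at t - 144.
load = [float(h) for h in range(200)]
lagged_load = lag(load, LAG_HOURS)
```

The first 144 positions of the lagged series have no value, so the corresponding rows cannot be used for training.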

Features for individual methods were selected experimentally. The final selection is presented in Table 1.

| Method | Selected features (original set of features) | Curve-based features |
|---|---|---|
| CatBoost | grid1-load, grid1-loss, grid1-temp, season_x, season_y, month_x, month_y, week_x, week_y, weekday_x, weekday_y, holiday, hour_x, hour_y | grid1-temp_mean, grid1-temp_min, grid1-temp_max, grid1-temp_mean_diff, grid1-load_mean, grid1-load_min, grid1-load_max, grid1-load_mean_diff |
| SVR | grid1-load, grid1-loss, grid1-temp, season_x, season_y, month_x, week_x, snow_depth, temperature, dew_point, demand | temperature_mean, temperature_min, temperature_max, temperature_mean_diff, dew_point_mean, dew_point_min, dew_point_max, dew_point_mean_diff, grid1-temp_mean, grid1-temp_min, grid1-temp_max, grid1-temp_mean_diff |
| EEMD | grid1-load, grid1-loss, grid1-temp, season_x, season_y, month_x, month_y, week_x, week_y, weekday_x, weekday_y, holiday, hour_x, hour_y, temperature, dew_point, demand, pressure, ground_pressure, humidity, clouds, wind_speed, wind_deg, rain, snow, ice, fr_rain, convective, snow_depth, accumulated, hours, rate, probability | temperature_mean, temperature_min, temperature_max, temperature_mean_diff, dew_point_mean, dew_point_min, dew_point_max, dew_point_mean_diff, grid1-temp_mean, grid1-temp_min, grid1-temp_max, grid1-temp_mean_diff |

Table 1: List of features used for model training

Training prediction methods 

To put the performance of the proposed methods into context, we also considered naive and persistence models. The naive model predicted the value measured on the previous day. The persistence model predicted the value measured a week earlier.
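Both baselines amount to copying values from a fixed lag, as the following sketch shows. It assumes hourly data (24 and 168 hours for one day and one week) and ignores the data availability delay discussed in the preprocessing section; the function name is hypothetical.

```python
def lagged_baseline(history, horizon, lag):
    """Forecast the next `horizon` hours by repeating values from `lag` hours earlier."""
    if lag < horizon or lag > len(history):
        raise ValueError("need horizon <= lag <= len(history)")
    n = len(history)
    return [history[n + h - lag] for h in range(horizon)]

# Hypothetical hourly history where each value equals its own index.
history = list(range(200))

naive = lagged_baseline(history, horizon=24, lag=24)         # same hour, previous day
persistence = lagged_baseline(history, horizon=24, lag=168)  # same hour, previous week

print(naive[0], naive[-1])              # 176 199
print(persistence[0], persistence[-1])  # 32 55
```

Baselines of this kind cost nothing to compute, which makes them a useful lower bar for any trained model.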

The CatBoost and SVR methods were trained on an expanding window starting at the beginning of the measurements and ending on the day before the day being predicted. Predictions were computed for the test set from December 1st, 2019 to May 31st, 2020 (6 months in total).
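The expanding-window scheme can be sketched as a generator of index ranges, one pair per predicted day. The sketch assumes hourly data indexed from 0 and a hypothetical function name; model fitting itself is left out.

```python
def expanding_window_splits(n_hours, first_test_hour, day=24):
    """Yield (train_range, test_range) index pairs, one per predicted day.

    The training range always starts at 0 and ends right before the predicted day.
    """
    start = first_test_hour
    while start + day <= n_hours:
        yield range(0, start), range(start, start + day)
        start += day

# Toy example: 5 days of data, the last 2 days are predicted one at a time.
splits = list(expanding_window_splits(n_hours=24 * 5, first_test_hour=24 * 3))
for train, test in splits:
    print(len(train), test.start, test.stop)
```

For each pair, a model would be re-trained on the training range and used to predict the 24 hours of the test range.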

Specifically for SVR, the line loss was predicted by 24 separate hourly SVR models – each model predicted one hour of the day and was trained only on the values from the hour it predicted. 

Several variants of the feature sets were used in training. We also experimented with a post-processing step in which the curve composed of the outputs of the hourly SVR models was smoothed. A radial basis function (RBF) kernel was used, and the parameters of the SVR models (ε, C, γ) were optimized using PSO to minimize the RMSE on the last 10% of the training set.

In the EEMD-based approach, the first 24 months of the training data were decomposed using the EMD method. In our case, the method produced 12 IMF components. In the early stage of our evaluation, we predicted the first 6 (high-frequency) IMF components using LSTM, but we eventually switched to CatBoost, which predicted these IMF components more accurately. The other 6 (low-frequency) IMF components were predicted by SVR. Training in this combined approach used a sliding window of 180 days, from which we predicted the hourly line loss for the next day. For each prediction, the line loss data from the sliding window were decomposed by the EMD method, and each IMF of the decomposition was used to re-train the corresponding SVR or CatBoost model. Each IMF was then predicted, and the individual predictions were summed to obtain the overall line loss prediction.
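The decompose, forecast and sum scheme can be sketched independently of the concrete decomposition and models. In the sketch below, `toy_decompose` (a moving-average split into two additive parts) is only a stand-in for EMD, and `repeat_last_day` is a stand-in for the trained CatBoost and SVR models; all names and the synthetic history are hypothetical.

```python
WINDOW_HOURS = 180 * 24  # 180-day sliding window of hourly values

def moving_average(values, width):
    """Centred moving average with the window truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def toy_decompose(values, width=5):
    """Stand-in for EMD: split a series into a remainder and a smooth part."""
    smooth = moving_average(values, width)
    remainder = [v - s for v, s in zip(values, smooth)]
    return [remainder, smooth]

def repeat_last_day(component, horizon):
    """Stand-in forecaster: repeat the last `horizon` values of the component."""
    return component[-horizon:]

def forecast_next_day(history, window_hours, decompose, forecasters, horizon=24):
    """Decompose the recent window, forecast each component, sum the forecasts."""
    window = history[-window_hours:]  # only data inside the sliding window
    components = decompose(window)
    total = [0.0] * horizon
    for component, forecast in zip(components, forecasters):
        prediction = forecast(component, horizon)
        total = [t + p for t, p in zip(total, prediction)]
    return total

# Synthetic hourly history covering 200 days.
history = [float((h % 24) + h // 24) for h in range(24 * 200)]
pred = forecast_next_day(history, WINDOW_HOURS, toy_decompose,
                         [repeat_last_day, repeat_last_day])
print(len(pred))
```

Note that the decomposition only ever sees values inside the window that ends before the predicted day, so no information from the predicted day enters the components.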

Results

Table 2 presents the results of selected combinations of feature sets and prediction models for hourly line loss predictions for the next day.

According to the published work [Ding, 2022], the EEMD method achieved one of the best results; our experiments did not confirm this. We hypothesize that the difference between the original implementation and ours lies in the computation of the IMF components. We suspect that in the solution proposed by [Ding, 2022], the line loss decomposition was calculated from the complete set of measured data (as can be seen from Figure 7 in [Ding, 2022]), including the days that were to be predicted. In our solution, we used only the training set for the decomposition and did not obtain results that outperform simpler methods. EEMD is the most complex of the methods we evaluated, but it did not achieve the expected precision.

SVR achieved slightly worse results than CatBoost. Smoothing the SVR predictions brought a subtle improvement and more consistent outcomes. The best-performing method was CatBoost applied to a feature set enriched with curve-based features and EWMA. This approach combines model selection based on [Dalal, 2020] with feature extraction based on [Ding, 2022]. It improved on CatBoost trained on the original feature set by about 0.9 percentage points of MAPE (from 10.407 to 9.498, see Table 2).

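| # | Method | Curve-based features / EWMA feature / Smoothing | RMSE | MAPE (%) |
|---|---|---|---|---|
| 1 | CatBoost | ++ na | 2.364 | 9.498 |
| 2 | CatBoost | + na | 2.423 | 9.861 |
| 3 | CatBoost | na | 2.633 | 10.407 |
| 4 | SVR | ++ | 2.789 | 10.91 |
| 5 | SVR | ++ | 2.685 | 11.02 |
| 6 | SVR | + | 2.827 | 11.498 |
| 7 | CatBoost | + na | 3.104 | 11.816 |
| 8 | SVR | + | 2.862 | 11.816 |
| 9 | Naive | na | 2.862 | 11.959 |
| 10 | Persistence | na | 3.747 | 12.364 |
| 11 | SVR | + | 3.117 | 12.571 |
| 12 | SVR | | 3.085 | 12.633 |
| 13 | EEMD | ++ na | 3.657 | 13.7 |
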
Table 2: Results of evaluated methods including description of used training features sorted according to MAPE. The features are represented by two columns: Curve-based features: calculated from selected weather features and EWMA feature: capturing line loss trend. Application of smoothing is denoted where applicable.

Conclusion

An extensive series of experiments confirms that incorporating curve-based features, which capture the non-linear aspects of the data, or integrating EWMA to capture time-dependent patterns enhances the predictive performance of the models, which become better adapted to capturing the underlying complexities of the distribution network's loss.

However, model complexity should not be increased indiscriminately. The excessively complex EEMD model performed worse than the simpler alternatives, including the persistence and naive models.

Out of all tested models, the best performing model was CatBoost. In combination with the curve-based features and EWMA it outperformed the published solution of [Dalal, 2020].

Data set used: Norwegian data available as the Grid Loss Prediction Dataset on Kaggle

Cite: Karas, P., Vrablecova, P., Lucka, M., Loderer, M., Grmanova, G., Rozinajova, V. Creating a predictive model for electricity transmission losses. Technical report (2023). Link: https://kinit.sk/creating-a-predictive-model-for-electricity-transmission-losses/

Literature

Agami, R. T. (2011). “Analysis of Time Series Data,” in Applied Data Analysis and Modeling for Energy Engineers and Scientists (Boston, MA: Springer). https://doi.org/10.1007/978-1-4419-9613-8_9

Dalal, N., Mølnå, M., Herrem, M., Røen, M., & Gundersen, O. E. (2020). Day-Ahead Forecasting of Losses in the Distribution Network. Proceedings of the AAAI Conference on Artificial Intelligence, 34(08), 13148–13155. https://doi.org/10.1609/aaai.v34i08.7018

Ding, C., Zhou, Y., Ding, Q., & Wang, Z. (2022). Loss Prediction of Ultrahigh Voltage Transmission Lines Based on EEMD–LSTM–SVR Algorithm. Frontiers in Energy Research, 10, 811745. https://doi.org/10.3389/fenrg.2022.811745

Hancock, J. T., & Khoshgoftaar, T. M. (2020). CatBoost for big data: an interdisciplinary review. In Journal of Big Data (Vol. 7, Issue 1). https://doi.org/10.1186/s40537-020-00369-8

Huang, H. (2022). Line Loss Prediction of Distribution Network Based on BP Neural Network. In T. Ni (Ed.), Scientific Programming (Vol. 2022, pp. 1–7). Hindawi Limited. https://doi.org/10.1155/2022/6105316

Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks (Vol. 4, pp. 1942–1948).

Liu, K., Jia, D., Luo, L., Kang, Z., & Li, W. (2021). Line loss prediction method of distribution network based on long short-term memory. In The 16th IET International Conference on AC and DC Power Transmission (ACDC 2020) (pp. 1798–1803). Institution of Engineering and Technology. https://doi.org/10.1049/icp.2020.0207

Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., & Gulin, A. (2019). CatBoost: unbiased boosting with categorical features. https://arxiv.org/abs/1706.09516

Sahlin, J. (2016). Line Loss Prediction Model Design at Svenska kraftnät. School of Electrical Engineering and Computer Science, Stockholm, TRITA-EE 2016:109, 1-76

Tulensalo, J., Seppänen, J., & Ilin, A. (2020). An LSTM model for power grid loss prediction. In Electric Power Systems Research (Vol. 189, p. 106823). Elsevier BV. https://doi.org/10.1016/j.epsr.2020.106823

This project was supported by the Slovenská elektrizačná prenosová sústava Fund at the Pontis Foundation.