In this study, we evaluated multiple machine learning models on an energy consumption dataset. Our results show that XGBoost achieved the best predictive performance, as indicated by the lowest Mean Absolute Error (MAE) and Mean Squared Error (MSE), while Support Vector Machine (SVM) performed the worst. To further understand the differences in these models' predictions, we applied SHAP (SHapley Additive exPlanations). SHAP analysis revealed that the same features influenced XGBoost and SVM differently, contributing to the diverging prediction outcomes.
Paper and code: https://github.com/belsguidetotech/Explainable_AI_for_Reliable_Energy_Consumption_Forecast
To predict power consumption and understand what drives it, we start with data preparation. We then apply five machine learning methods and compare their predictive performance to identify the best model. Finally, to uncover why each algorithm behaves as it does, we use Shapley (SHAP) values to explore how each model arrives at its predictions.
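Feature names that appear later in the SHAP analysis, such as rolling_mean_t168 and rolling_mean_t48, suggest that data preparation includes trailing rolling-mean features over the consumption history. Assuming that interpretation (the function below is an illustrative sketch, not the study's actual code):

```python
def rolling_mean(series, window):
    """Trailing rolling mean: the value at time t averages up to the
    previous `window` observations, using only past data (no leakage).

    Returns None at t=0, where no history exists yet.
    """
    out = []
    for t in range(len(series)):
        if t == 0:
            out.append(None)  # no past observations to average
        else:
            lo = max(0, t - window)
            out.append(sum(series[lo:t]) / (t - lo))
    return out

# Hypothetical hourly consumption series; a name like rolling_mean_t48
# would correspond to window=48 (two days of hourly history).
consumption = [1.0, 2.0, 3.0, 4.0]
print(rolling_mean(consumption, 2))
```

A window of 168 hours (one week) captures weekly seasonality, which is why such a feature plausibly dominates both models' predictions.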
XGBoost achieves the best accuracy, while SVM performs the worst.
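The ranking above is based on MAE and MSE, which can be sketched in plain Python as follows (the helper names and sample numbers are illustrative, not taken from the study):

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average squared error, penalizing large misses."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical predictions from two models against the same targets
y_true = [100.0, 110.0, 120.0]
y_model_a = [98.0, 111.0, 119.0]   # small errors -> low MAE/MSE
y_model_b = [90.0, 118.0, 128.0]   # larger errors -> high MAE/MSE
print(mae(y_true, y_model_a), mae(y_true, y_model_b))
```

MSE squares each error, so a model with occasional large misses (as SVM appears to have here) is penalized more heavily on MSE than on MAE.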
To find out why XGBoost and SVM produce different results on the same dataset with the same features, we use Shapley (SHAP) values and the visualizations below to understand each feature's contribution to the models' predictions. The bar plot ranks features by their mean absolute SHAP value: the taller the bar, the greater the feature's influence. The most important features of XGBoost and SVM differ. For XGBoost, rolling_mean_t168, rolling_mean_t48, and dayofyear are the top three contributors, while for SVM the three most important features are rolling_mean_t168, dayofyear, and rolling_mean_t62. The waterfall plot explains which features matter most for a single prediction, showing how each one pushes the prediction above or below the baseline. These directions also differ between the models: for XGBoost, dayofyear pushes the prediction lower, while for SVM it pushes the prediction higher.
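The SHAP library approximates these attributions efficiently for large models; to make the underlying principle concrete, here is a minimal exact Shapley value computation in pure Python, using a single background point to represent "absent" features (all names here are illustrative, not the study's code):

```python
import itertools
import math

def shapley_values(f, x, background):
    """Exact Shapley values for the prediction f(x).

    A feature 'absent' from coalition S is replaced by its background
    value. phi[i] is feature i's weighted average marginal contribution
    over all coalitions of the other features.
    """
    n = len(x)

    def v(S):
        # Coalition value: features in S keep their actual values,
        # the rest fall back to the background point.
        z = [x[i] if i in S else background[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in itertools.combinations(others, k):
                # Shapley kernel weight: |S|! (n-|S|-1)! / n!
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

# For a linear model, each phi[i] reduces to coef_i * (x_i - background_i),
# which makes the result easy to check by hand.
linear = lambda z: 2 * z[0] + 3 * z[1] - z[2]
print(shapley_values(linear, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

Because the values sum to the gap between the prediction and the baseline, the waterfall plot described above is just this decomposition drawn one feature at a time. Real tree and kernel explainers avoid the exponential subset enumeration shown here.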
In this study, we compared the predictive performance of several machine learning algorithms for power consumption forecasting. As measured by Mean Absolute Error (MAE) and Mean Squared Error (MSE), XGBoost demonstrated the highest accuracy, while Support Vector Machine (SVM) had the lowest. This discrepancy can be further explained through SHAP (SHapley Additive exPlanations), which reveals that the same feature contributes differently to each model's predictions. This variation in feature importance across models shows that each algorithm processes the data in its own way, leading to different prediction outcomes.