Hybrid Machine Learning Approaches for Accurate Solar Energy Forecasting from Real-World Weather Data
Keywords:
solar forecasting, renewable integration, hybrid machine learning, ensemble methods, solar irradiance, weather data, time-series prediction
Abstract
Accurate solar power forecasting is crucial for integrating large solar installations into modern power grids and for meeting global climate targets. Solar output varies strongly with weather conditions (cloud cover, temperature, humidity), which makes prediction difficult. Traditional single-model forecasting methods (physical numerical weather prediction (NWP) models, statistical models such as ARIMA, and standalone ML models) struggle to capture the full range of weather-driven uncertainty. We explore hybrid machine learning approaches that combine multiple models to improve accuracy across timescales from intra-hour to day-ahead. We train ANN, LSTM, SVR, Random Forest, and XGBoost models on two real-world datasets: the NREL NSRDB, which provides irradiance and weather data, and the INESC-TEC Portugal PV dataset, which includes WRF forecasts. Our hybrid framework is a stacking ensemble in which a meta-learner aggregates the predictions of the base models. Experiments compare the hybrid ensembles against single models and simple voting ensembles. We evaluate short-term (30 min to 6 h ahead) and day-ahead (24 h ahead) forecasts using MAE, RMSE, MAPE, and R². Hybrid models consistently reduce forecast error relative to single models, by up to roughly 10%, with the largest gains at longer horizons. Sensitivity tests show that multi-variable inputs (irradiance, temperature, cloud cover, humidity, wind) improve forecasts. Forecast error grows with the horizon, consistent with other grid studies, but hybrid approaches mitigate this growth. Overall, the results show that hybrid ML can significantly improve solar forecast accuracy under real-world weather conditions, with implications for grid stability and energy market operations.
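
For reference, the four evaluation metrics named in the abstract (MAE, RMSE, MAPE, and R²) can be written out as a minimal pure-Python sketch using their standard definitions. The function names, the toy values, and the choice to skip near-zero actual values in MAPE (for example, night-time samples where PV output is zero) are illustrative assumptions, not details taken from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the target (e.g. kW)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted, eps=1e-9):
    """Mean absolute percentage error, in percent.

    Assumption: samples whose actual value is (near) zero are skipped,
    because the percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if abs(a) > eps]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def evaluate(actual, predicted):
    """Collect all four metrics for one forecast horizon."""
    return {
        "MAE": mae(actual, predicted),
        "RMSE": rmse(actual, predicted),
        "MAPE": mape(actual, predicted),
        "R2": r2(actual, predicted),
    }


# Toy values (hypothetical, not from either dataset).
actual = [0.0, 2.0, 4.0, 6.0]
predicted = [0.0, 1.0, 5.0, 6.0]
scores = evaluate(actual, predicted)
# MAE is 0.5, RMSE is about 0.707, MAPE is 25.0 (the zero sample is skipped),
# and R2 is 0.9 for these toy values.
```

Calling `evaluate` once per forecast horizon gives a per-horizon table of scores, which is one simple way to make the growth of error with horizon visible.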
