Retail Products Price Forecasting with Empirical Mode Decomposition and Auto Regressive Integrated Moving Average Model Using Web-Scraped Price Microdata
DOI:
https://doi.org/10.31181/sdmap21202519
Keywords:
Retailing, Price forecasting, Web-scraping, ARIMA, EMD
Abstract
This study presents an approach to price forecasting for an online retail business in Turkey, using a hybrid model that combines Empirical Mode Decomposition (EMD) with Auto Regressive Integrated Moving Average (ARIMA) models. The analysis is based on a 900-day dataset scraped from the retailer's website. Fourteen metrics are used to evaluate forecasting performance, and the Wilcoxon signed-rank test confirms that the hybrid model's superiority over the standalone ARIMA model is statistically significant. The analysis also finds an association between category standard deviations and forecasting accuracy: lower standard deviations correlate with higher forecasting performance. The results are further compared with machine learning algorithms, namely Neural Networks, Support Vector Regression, Regression Tree, Gaussian Process Regression, and Generalized Additive Model. The study is limited by data collection constraints, but its findings are relevant to the entire supply chain, offering strategic insights for retailers and the potential for more detailed analysis with larger datasets. It also lays the groundwork for future studies involving dynamic ARIMA parameter determination, advanced EMD variants, and machine learning integration, which would extend its applicability to other time series contexts.
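The paired comparison described in the abstract can be illustrated with a minimal sketch of a two-sided Wilcoxon signed-rank test. This is not the authors' implementation: the function name `wilcoxon_signed_rank`, the normal approximation without tie or continuity correction, and the error values in `arima_err` and `hybrid_err` are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    x, y: paired samples, e.g. per-category errors of two forecasting models.
    Returns (w, z, p), where w is the smaller of the two signed-rank sums.
    Zero differences are dropped and tied |differences| get average ranks.
    The variance uses no tie or continuity correction (a simplification).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis, w is approximately normal with this mean and sd.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, z, p


# Hypothetical per-category forecast errors (illustrative values only).
arima_err = [10, 12, 9, 15, 11, 14, 13, 16]
hybrid_err = [8, 13, 6, 11, 6, 8, 6, 8]

w, z, p = wilcoxon_signed_rank(arima_err, hybrid_err)
print(f"W = {w}, z = {z:.3f}, p = {p:.4f}")
```

With only a handful of pairs the normal approximation is rough; an exact test, such as the one a statistics library provides, would be preferable for small samples.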
License
Copyright (c) 2025 Mehmet Ozcalci, Elif Kaya (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.