Retail Products Price Forecasting with Empirical Mode Decomposition and Auto Regressive Integrated Moving Average Model Using Web-Scraped Price Microdata

Authors

Ozcalci, M., & Kaya, E.

DOI:

https://doi.org/10.31181/sdmap21202519

Keywords:

Retailing, Price forecasting, Web-scraping, ARIMA, EMD

Abstract

This study presents a price forecasting approach for an online retail business in Turkey, using a hybrid model that combines Empirical Mode Decomposition (EMD) with Auto Regressive Integrated Moving Average (ARIMA) models. The analysis is based on a 900-day dataset scraped from the retailer's website. Forecasting performance is evaluated with fourteen metrics, and the Wilcoxon signed-rank test confirms that the hybrid model outperforms the standalone ARIMA model by a statistically significant margin. The investigation also finds an association between category standard deviations and forecasting accuracy: categories with lower standard deviations are forecast more accurately. The study is limited by data collection constraints, but its findings are relevant to the wider supply chain, offering strategic insights for retailers and scope for more detailed analysis with larger datasets. It also lays the groundwork for future studies involving dynamic ARIMA parameter determination, advanced EMD variants, and machine learning integration, which would extend its applicability to other time series contexts. The results are compared with those of several machine learning algorithms, namely Neural Networks, Support Vector Regression, Regression Tree, Gaussian Process Regression, and Generalized Additive Model.
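
As an illustration of the paired comparison described in the abstract, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test applied to per-series forecast errors from two models. The function name `wilcoxon_signed_rank`, the variable names, and the error values are hypothetical and are not taken from the study; the sketch also assumes a normal approximation without tie or continuity correction, which may differ from the variant the authors used.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using a normal approximation.

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences receive average ranks. No tie or continuity
    correction is applied (an assumption made for this sketch).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of the rank sum.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w_plus, w_minus, p_value


# Hypothetical error scores (arbitrary integer units) for ten price series;
# these numbers are illustrative only and do not come from the study.
arima_err = [12, 30, 25, 18, 40, 22, 35, 28, 15, 33]
hybrid_err = [10, 21, 20, 19, 28, 15, 30, 20, 11, 25]

w_plus, w_minus, p_value = wilcoxon_signed_rank(arima_err, hybrid_err)
print(f"W+ = {w_plus}, W- = {w_minus}, p = {p_value:.4f}")
```

A large `w_plus` relative to `w_minus` here means the first model's errors exceed the second's on most series; for small samples an exact test would usually be preferred over the normal approximation.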

Published

2025-02-23

How to Cite

Ozcalci, M., & Kaya, E. (2025). Retail Products Price Forecasting with Empirical Mode Decomposition and Auto Regressive Integrated Moving Average Model Using Web-Scraped Price Microdata. Spectrum of Decision Making and Applications, 2(1), 315-355. https://doi.org/10.31181/sdmap21202519