Stock Market Volatility and Return Analysis: A Systematic Literature Review

Roni Bhowmik

1 School of Economics and Management, Jiujiang University, Jiujiang 322227, China

2 Department of Business Administration, Daffodil International University, Dhaka 1207, Bangladesh

Shouyang Wang

3 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China; sywang@amss.ac.cn

In the field of business research methods, a literature review is more relevant than ever. Traditional literature reviews, however, have often lacked rigor and flexibility, and questions have been raised about the quality and trustworthiness of such reviews. This research provides a literature review based on a systematic database search and cross-reference snowballing. Previous studies of stock market return and volatility based on generalized autoregressive conditional heteroskedastic (GARCH) family models are reviewed. The stock market plays a pivotal role in today’s world economic activities and is often called a “barometer” and “alarm” for economic and financial activity in a country or region. To manage uncertainty and risk in the stock market, it is particularly important to measure the volatility of stock index returns effectively. The main purpose of this review is therefore to identify effective GARCH models for analyzing market returns and volatilities. The secondary purpose is to conduct a content analysis of the return and volatility literature over a period of 12 years (2008–2019) across 50 papers. The study found that research in this area has changed significantly over the past 10 years and that most researchers have focused on developing stock markets.

1. Introduction

In the context of economic globalization, and especially after the impact of the recent international financial crisis, the stock market has experienced unprecedented fluctuations. This volatility increases the uncertainty and risk of the stock market and is detrimental to its normal operation. To reduce this uncertainty, it is particularly important to measure the volatility of stock index returns accurately. At the same time, given the important position of the stock market in the global economy, its healthy development has become a focal point. Therefore, knowledge of the theoretical and empirical literature on volatility is needed to measure the volatility of stock index returns.

Volatility is a central issue in economic and financial research and one of the most important characteristics of financial markets. It is directly related to market uncertainty and affects the investment behavior of firms and individuals. The volatility of financial asset returns is also one of the core topics of modern financial research, and this volatility is often described and measured by the variance of the rate of return. However, forecasting market volatility accurately is difficult, and despite the availability of various models and techniques, not all of them work equally well for all stock markets. It is for this reason that researchers and financial analysts face such complexity in forecasting market returns and volatilities.

Traditional econometric models often assume that the variance is constant, that is, that the variance remains the same at different times. An accurate measurement of the fluctuation of the rate of return is directly related to the correctness of portfolio selection, the effectiveness of risk management, and the rationality of asset pricing. However, with the development of financial theory and the deepening of empirical research, this assumption was found to be unreasonable. Moreover, the volatility of asset prices is one of the most puzzling phenomena in financial economics, and it remains a great challenge for investors to obtain a clear understanding of volatility.

A literature review is a significant part of all kinds of research work. Literature reviews serve as a foundation for knowledge progress, provide guidelines for policy and practice, supply evidence of an effect, and, if well conducted, have the capacity to create new ideas and directions for a particular field [ 1 ]. Similarly, they serve as the basis for future research and theory development. This paper conducts a literature review of stock return and volatility analysis based on generalized autoregressive conditional heteroskedastic (GARCH) family models. Volatility refers to the degree of dispersion of a random variable.

Financial market volatility is mainly reflected in the deviation of the expected future value of assets. Volatility therefore represents the uncertainty of the future price of an asset, and this uncertainty is usually characterized by variance or standard deviation. There are currently two main explanations in the academic literature for the relationship between volatility and returns: the leverage effect and the volatility feedback hypothesis. Under the leverage effect, the arrival of unfavorable news causes the stock price to fall, which increases the leverage factor and thus the degree of stock volatility; conversely, favorable news weakens volatility. Volatility feedback can be simply described as unpredictable stock volatility that inevitably leads to higher risk in the future.

Many factors affect price movements in the stock market. First, the impact of monetary policy on the stock market is substantial: if a loose monetary policy is implemented in a given year, the probability of a rise in the stock market index increases, whereas a relatively tight monetary policy increases the probability of a decline. Second, there is the impact of interest rate liberalization on risk-free interest rates. Across the major global capital markets, changes in risk-free interest rates are strongly correlated with the current stock market. In general, when interest rates continue to rise, the risk-free rate rises and the cost of capital invested in the stock market rises with it. At the same time, as the reform dividend is released, the economy is expected to gradually pick up, and the stock market is expected to achieve a higher return on investment.

Volatility is the tendency for prices to change unexpectedly [ 2 ]; however, not all volatility is bad. At the same time, financial market volatility has a direct impact on macroeconomic and financial stability, and important economic risk factors are generally watched closely by governments around the world. Therefore, research on the volatility of financial markets has always been a focus of financial economists and practitioners. A large part of the literature has studied characteristics of the stock market such as the leverage effect of volatility, the short-term memory of volatility, and the GARCH effect, but some researchers show that when short-term memory is captured by the GARCH model, a confusing phenomenon usually arises as the sampling interval tends to zero. The characterization of the tail of the return distribution generally assumes an ideal situation, namely that returns obey the normal distribution, but this ideal situation usually does not hold.

Researchers have proposed different distributional models in order to better describe the thick tails of daily rates of return. Engle [ 3 ] first proposed the autoregressive conditional heteroscedasticity (ARCH) model to characterize possible correlations in the conditional variance of the prediction error. Bollerslev [ 4 ] extended it to form the generalized autoregressive conditional heteroskedastic (GARCH) model. Later, the GARCH model expanded rapidly and a family of GARCH models was created.

When employing GARCH family models to analyze and forecast return volatility, the selection of input variables for forecasting is crucial, as appropriate and essential conditions must be met for the method to have a stationary solution and a good fit [ 5 ]. Several studies have shown that the same model can produce significantly different results when supplied with different inputs. Thus, another key purpose of this literature review is to examine studies that use directional prediction accuracy as a yardstick from a practical point of view, with the core objective of forecasting financial time series of stock market returns. Researchers typically report small forecast errors, measured as mean absolute deviation (MAD), root mean squared error (RMSE), mean absolute error (MAE), and mean squared error (MSE), but these do not necessarily translate into capital gains [ 6 , 7 ]. Others note that predictions need not be precise in terms of NMSE (normalized mean squared error) [ 8 ]. In other words, a low root mean squared error does not guarantee high returns; the relationship between the two is not linear.
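For concreteness, the error measures mentioned above can be computed directly from a vector of realized values and a vector of forecasts. The Python sketch below is illustrative only (the array names are hypothetical); note that MAD and MAE are computed identically in much of the forecasting literature, and NMSE normalization conventions vary.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Common point-forecast error measures.

    Note: MAD (mean absolute deviation) and MAE (mean absolute error) are
    often computed with the same formula; NMSE is normalized here by the
    variance of the actuals, one of several conventions in use.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted

    mae = np.mean(np.abs(err))       # mean absolute error
    mad = mae                        # mean absolute deviation (same formula here)
    mse = np.mean(err ** 2)          # mean squared error
    rmse = np.sqrt(mse)              # root mean squared error
    nmse = mse / np.var(actual)      # normalized mean squared error
    return {"MAD": mad, "MAE": mae, "MSE": mse, "RMSE": rmse, "NMSE": nmse}

# Illustrative use with hypothetical daily returns and model forecasts
realized = np.array([0.012, -0.004, 0.007, -0.010, 0.003])
forecast = np.array([0.010, -0.002, 0.004, -0.012, 0.001])
print(forecast_errors(realized, forecast))
```

As the text notes, a low RMSE on returns does not by itself translate into trading profit; direction-of-change accuracy and realized gains must be evaluated separately.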

In this manuscript, it is proposed to categorize the studies not only by their model selection criteria but also by the inputs used for return volatility and how precise these are in terms of return directions. In this investigation, the authors consider studies that use the percentage of successful trades as a benchmark procedure for evaluating the researchers’ proposed models. On this theme, this study compares earlier models in the literature with respect to the input variables used for forecasting volatility and how precise they are in predicting the direction of the related time series. Other review studies on return and volatility analysis and GARCH-family-based financial forecasting methods have been conducted by a number of researchers [ 9 , 10 , 11 , 12 , 13 ]. Consequently, the aim of this manuscript is to put forward the importance of sufficient and necessary conditions for model selection and to contribute to a better understanding among academic researchers and financial practitioners.

Systematic reviews have most notably been developed by medical science as a way to synthesize research findings in a systematic, transparent, and reproducible manner. Despite the promise of this technique, its use has not been widespread in business research, although it is expanding. In this paper, the authors use the systematic review process because the aim of a systematic review is to identify all empirical evidence that fits pre-specified inclusion criteria in response to a particular research question. Researchers have shown that GARCH is the most suitable model to use when analyzing the volatility of stock returns with large numbers of observations [ 3 , 4 , 6 , 9 , 13 ]. All of the selected literature is examined closely to answer the following research question: What are the effective GARCH models to recommend for performing market volatility and return analysis?

The main contribution of this paper lies in the following four aspects: (1) the best GARCH models can be recommended for evaluating stock market returns and volatilities; (2) the manuscript considers recent papers, from 2008 to 2019, which have not been covered in previous studies; (3) both qualitative and quantitative processes are used to examine the literature on stock returns and volatilities; and (4) the manuscript provides a journal-based analysis that will help academics and researchers identify important journals to consult for a literature review, recognize factors motivating the analysis of stock returns and volatilities, and choose outlets in which to publish their work.

2. Methodology

A systematic literature search of databases should identify as complete a list as possible of relevant literature while keeping the number of irrelevant hits small. The study is conducted as a systematic literature review, following suggestions from scholars [ 14 , 15 ]. The manuscript was guided by a systematic database search, followed by cross-reference snowballing, as demonstrated in Figure 1, which was adapted from Geissdoerfer et al. [ 16 ]. Two databases were selected for the literature search: Scopus and Web of Science. These databases were preferred because they are major repositories of research and are commonly used in literature reviews for business research [ 17 ].

Figure 1. Literature review method.

In the first stage, a systematic literature search was conducted. The keywords, chosen to be broad enough to capture the relevant literature without overlapping too heavily with other research areas, are specified below. As shown in Table 1, the search strings “market return” in ‘Title’ and, respectively, “stock market return”, “stock market volatility”, “stock market return volatility”, “GARCH family model* for stock return”, “forecasting stock return”, “GARCH model*”, and “financial market return and volatility” in ‘Topic’ or ‘Article title, Abstract, Keywords’ were used to search for articles in English in the Elsevier Scopus and Thomson Reuters Web of Science databases. The asterisk (*) is a commonly used wildcard symbol that broadens a search by matching words that start with the same letters.

Table 1. Literature search strings for the database search.

In the second stage, suitable cross-references were identified in this primary sample by first examining the titles of publications in the reference sections together with their context and cited content in the text. The abstracts of the additional publications identified in this way were examined to determine whether each paper was appropriate. Appropriate references were then added to the sample and scanned analogously for further cross-references. This process was repeated until no additional appropriate cross-references could be identified.

In the third stage, the final sample was assimilated, synthesized, and compiled into the literature review presented in the subsequent section. The search was refreshed a few days before submission.

Additionally, the list of affiliation criteria in Table 2, which is based on discussions among the authors, was applied: the summaries of all research papers were independently checked in a blind review procedure. Evaluations were based on the content of the abstract, with any additional information hidden, and were inclusive rather than exclusive. In order to check inter-coder reliability, an initial sample of 30 abstracts was assessed for affiliation by the authors. If an abstract was not sufficiently informative, the whole paper was studied. Only 4.61 percent of the abstracts resulted in disagreement between the researchers. The stages described above reduced the final number of full papers for examination and synthesis to 50. In order to identify magnitudes, backgrounds, and moderators, these remaining research papers were reviewed in two rounds of reading.

Table 2. Affiliation criteria.

3. Review of Different Studies

In this paper, a large number of articles were studied, but only a few met the quality criteria developed earlier. For every published article, three groups of attributes were specified: the index and forecast time period with input elements, the econometric models, and the study results. The first group, “index and forecast time period with input elements”, was considered because the market situation, such as emerging, frontier, or developed markets, is an important parameter of the forecast, and the length of the evaluation period is a necessary characteristic for examining the robustness of the model. Furthermore, input elements are comparatively essential parameters for a forecast model because the analytical and diagnostic ability of the model depends mainly on the inputs it uses. The second group, “model”, comprises the forecast models proposed by the authors and the other models used for comparison. The last group, “study results”, is important to our examination for comparing studies in terms of how well return and volatility are captured by the recommended estimation models.

Measuring stock market volatility is an incredibly complex job for researchers. Volatility tends to cluster: if today’s volatility is high, it is likely to be high tomorrow, and volatility models have also had an attractively high hit rate around major market disruptions [ 4 , 7 , 11 , 12 ]. GARCH models have a strong background, with more than 30 years of rapid development of GARCH-type models for investigating the volatility of market data. The eligible papers were clustered into two sub-groups, the first containing the GARCH model and its variants and the second containing bivariate and other multivariate GARCH models, summarized in table format for future studies. Table 3 reviews the GARCH model and its variants. The univariate GARCH model is for a single time series. It is a statistical model used to analyze a number of different kinds of financial data; financial institutions and researchers usually use it to estimate the volatility of returns for stocks, bonds, and market indices. In the GARCH model, current volatility is influenced by past innovations to volatility. GARCH models are used to model and forecast the volatility of one time series. The most widely used form is GARCH (1, 1), which has several extensions.

Table 3. Different literature studies based on the generalized autoregressive conditional heteroskedastic (GARCH) model and its variants.

Notes: APARCH (Asymmetric Power ARCH), AIC (Akaike Information Criterion), OHLC (Open-High-Low-Close Chart), NSE (National Stock Exchange of India), EWMA (Exponentially Weighted Moving Average), CGARCH (Component GARCH), BDS (Brock, Dechert & Scheinkman) Test, ARCH-LM (ARCH-Lagrange Multiplier) test, VAR (Vector Autoregression) model, VEC (Vector Error Correction) model, ARFIMA (Autoregressive Fractional Integral Moving Average), FIGARCH (Fractionally Integrated GARCH), SHCI (Shanghai Stock Exchange Composite Index), SZCI (Shenzhen Stock Exchange Component Index), ADF (Augmented Dickey–Fuller) test, BSE (Bombay Stock Exchange), and PGARCH (Periodic GARCH) are discussed.

In a simple GARCH model, the squared volatility \(\sigma_t^2\) is allowed to depend on previous squared volatilities as well as previous squared values of the process. The conditional variance satisfies \(\sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \dots + \alpha_q \epsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_p \sigma_{t-p}^2\), where \(\alpha_i > 0\) and \(\beta_i > 0\). In the GARCH model, a limited number of lags of the conditional variance substitute for residual lags, which shortens the lag structure and simplifies the estimation of coefficients. The most often used model is GARCH (1, 1). The GARCH (1, 1) process is a covariance-stationary white noise process if and only if \(\alpha_1 + \beta_1 < 1\), and the unconditional variance of the covariance-stationary process is given by \(\alpha_0 / (1 - \alpha_1 - \beta_1)\). The model specifies that \(\sigma_n^2\) is based on the most recent squared observation \(\varphi_{n-1}^2\) and the most recent variance estimate \(\sigma_{n-1}^2\). The GARCH (1, 1) model can be written as \(\sigma_n^2 = \omega + \alpha \varphi_{n-1}^2 + \beta \sigma_{n-1}^2\) and is usually used for the estimation of parameters in the univariate case.
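A minimal simulation makes the GARCH (1, 1) recursion concrete. The parameter values below are purely illustrative and satisfy the stationarity condition \(\alpha_1 + \beta_1 < 1\); in applied work the coefficients would be estimated by maximum likelihood, typically with a dedicated econometrics package.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Illustrative GARCH(1,1) parameters (alpha0 = omega); alpha1 + beta1 < 1
omega, alpha1, beta1 = 1e-5, 0.08, 0.90
n = 1000

eps = np.zeros(n)       # simulated return innovations
sigma2 = np.zeros(n)    # conditional variances
sigma2[0] = omega / (1.0 - alpha1 - beta1)   # start at the unconditional variance

for t in range(1, n):
    # sigma_t^2 = omega + alpha1 * eps_{t-1}^2 + beta1 * sigma_{t-1}^2
    sigma2[t] = omega + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

print("sample variance of simulated returns:", eps.var())
print("unconditional variance omega/(1-alpha1-beta1):", omega / (1 - alpha1 - beta1))
```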

The GARCH model is not a complete model and has therefore been extended; these extensions appear in the form of an alphabet soup of models that use GARCH as a key component. There are various extensions of the standard GARCH family of models. Nonlinear GARCH (NGARCH) was proposed by Engle and Ng [ 18 ]; its conditional variance equation takes the form \(\sigma_t^2 = \gamma + \alpha(\epsilon_{t-1} - \vartheta\sigma_{t-1})^2 + \beta\sigma_{t-1}^2\), where \(\alpha, \beta, \gamma > 0\). The integrated GARCH (IGARCH) model is a restricted version of the GARCH model in which the persistence parameters sum to one; it was introduced by Engle and Bollerslev [ 19 ], and its behavior may be caused by random level shifts in volatility. The simple GARCH model fails to describe the “leverage effects” observed in financial time series data. The exponential GARCH (EGARCH) model introduced by Nelson [ 5 ] models the logarithm of the variance rather than its level and accounts for an asymmetric response to shocks. The GARCH-in-mean (GARCH-M) model adds a heteroskedasticity term to the mean equation and was introduced by Engle et al. [ 20 ]. The quadratic GARCH (QGARCH) model, introduced by Sentana [ 21 ], can handle asymmetric effects of positive and negative shocks. The Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model, introduced by Glosten et al. [ 22 ], captures the opposite effects of negative and positive shocks, taking the leverage effect into account. The threshold GARCH (TGARCH) model, introduced by Zakoian [ 23 ], is also commonly used to handle the differing effects of good and bad news on volatility. The family GARCH (FGARCH) model, introduced by Hentschel [ 24 ], is an omnibus model that nests other symmetric and asymmetric GARCH models. The COGARCH model, introduced by Klüppelberg et al. [ 25 ], is a stochastic volatility model that extends the GARCH time series concept to continuous time. The power-transformed and threshold GARCH (PTTGARCH) model, introduced by Pan et al. [ 26 ], is a very flexible model that, under certain conditions, includes several ARCH/GARCH models as special cases.
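To illustrate how an asymmetric specification differs from the symmetric recursion shown earlier, the sketch below implements a common parameterization of the GJR-GARCH (1, 1) conditional variance, in which an extra term is switched on only after negative shocks. The parameter values and the initialization are illustrative assumptions, not taken from any of the reviewed papers.

```python
import numpy as np

def gjr_garch_variance(eps, omega=1e-5, alpha=0.05, gamma=0.08, beta=0.88):
    """Conditional variance of a GJR-GARCH(1,1) given a series of innovations.

    sigma_t^2 = omega + (alpha + gamma * I[eps_{t-1} < 0]) * eps_{t-1}^2
                + beta * sigma_{t-1}^2
    The indicator term adds extra variance after negative shocks
    (the leverage effect).
    """
    n = len(eps)
    sigma2 = np.empty(n)
    sigma2[0] = np.var(eps)                   # simple initialization
    for t in range(1, n):
        neg = 1.0 if eps[t - 1] < 0 else 0.0  # asymmetry indicator
        sigma2[t] = omega + (alpha + gamma * neg) * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Example with hypothetical demeaned daily returns
returns = np.random.default_rng(0).standard_normal(500) * 0.01
print(gjr_garch_variance(returns)[:5])
```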

Across the reviewed articles, the symmetric GARCH (1, 1) model has been used widely to forecast unconditional volatility in stock market and time series data and has been able to capture the asset yield structure and implied volatility structure. Most researchers find that GARCH (1, 1) with a generalized distribution for the residuals has advantages in volatility assessment over other models. Conversely, the asymmetry in stock market volatility and returns is beyond the descriptive power of the symmetric model, whereas asymmetric GARCH models can capture more of these specifics. The asymmetric GARCH models can at least partially measure the effect of positive or negative shocks on stock market return and volatility, which the GARCH (1, 1) model comparatively fails to accomplish. Regarding asymmetric effects, the GJR-GARCH model performed better and produced a higher predicted conditional variance during periods of high volatility. In addition, among the asymmetric GARCH models, the EGARCH model appeared to be superior.

Table 4 reviews the bivariate and other multivariate GARCH models. Bivariate model analysis is used to determine whether there is a relationship between two different variables; a bivariate model uses one dependent variable and one independent variable. The multivariate GARCH model, by contrast, is a model for two or more time series. Multivariate GARCH models are used to model and forecast the volatility of several time series when there are linkages between them. A multivariate model uses one dependent variable and more than one independent variable. In this case, the current volatility of one time series is influenced not only by its own past innovations, but also by past innovations to the volatilities of the other time series.

Table 4. Different literature studies based on bivariate and other multivariate GARCH models.

The most recognizable use of multivariate GARCH models is the analysis of the relations between the volatilities and co-volatilities of several markets. A multivariate model can produce a more reliable model than separate univariate models. The VEC model is the first MGARCH model and was introduced by Bollerslev et al. [ 66 ]; subsequent formulations are typically related to it. The model can be expressed in the following form: \(\mathrm{vech}(H_t) = C + \sum_{j=1}^{q} X_j\, \mathrm{vech}(\epsilon_{t-j}\epsilon_{t-j}') + \sum_{j=1}^{p} Y_j\, \mathrm{vech}(H_{t-j})\), where \(\mathrm{vech}\) is the operator that stacks the columns of the lower triangular part of its argument square matrix and \(H_t\) is the covariance matrix of the residuals. The restricted version of the VEC model is the DVEC model, also recommended by Bollerslev et al. [ 66 ]; compared to the VEC model, its estimation proceeds far more smoothly. The Baba-Engle-Kraft-Kroner (BEKK) model, introduced by Baba et al. [ 67 ], is an innovative parameterization of the conditional variance matrix \(H_t\); it ensures positive definiteness of the conditional covariance matrix by expressing the model in a way that implies this property by construction. The Constant Conditional Correlation (CCC) model was recommended by Bollerslev [ 68 ] to model the conditional covariance matrix indirectly by estimating the conditional correlation matrix. The Dynamic Conditional Correlation (DCC) model, introduced by Engle [ 69 ], is a nonlinear combination of univariate GARCH models and a generalization of the CCC model. To overcome the inconvenience of a huge number of parameters, the O-GARCH model was recommended by Alexander and Chibumba [ 70 ] and subsequently developed by Alexander [ 71 , 72 ]. Furthermore, the GO-GARCH model, a multivariate GARCH model, was introduced in Bauwens et al. [ 73 ].
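As a simple multivariate illustration, the CCC model combines univariate conditional variances with a constant correlation matrix to form the conditional covariance matrix \(H_t = D_t R D_t\), where \(D_t\) is the diagonal matrix of conditional standard deviations. The sketch below assumes the univariate variances have already been produced (for example, by the GARCH (1, 1) recursion shown earlier) and uses an illustrative correlation matrix.

```python
import numpy as np

def ccc_covariances(sigma2_matrix, corr):
    """Constant Conditional Correlation (CCC) covariance matrices.

    sigma2_matrix: array of shape (T, k) with conditional variances of k series.
    corr:          (k, k) constant correlation matrix R.
    Returns an array of shape (T, k, k) with H_t = D_t R D_t for each t.
    """
    T, k = sigma2_matrix.shape
    H = np.empty((T, k, k))
    for t in range(T):
        D = np.diag(np.sqrt(sigma2_matrix[t]))   # conditional standard deviations
        H[t] = D @ corr @ D
    return H

# Illustrative inputs: two series, three time points
sigma2 = np.array([[1.0e-4, 2.0e-4],
                   [1.2e-4, 1.8e-4],
                   [0.9e-4, 2.2e-4]])
R = np.array([[1.0, 0.3],
              [0.3, 1.0]])
print(ccc_covariances(sigma2, R)[0])
```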

The bivariate models performed better in most cases compared with the univariate models [ 85 ]. MGARCH models can be used for forecasting; multivariate GARCH modeling delivers a realistic but parsimonious specification of the variance matrix while ensuring its positivity. However, comparing the relative forecasting accuracy of the two formulations, BEKK and DCC, shows that the forecasting performance of MGARCH models is not always satisfactory. Compared with the other multivariate GARCH models, the BEKK-GARCH model is relatively better and more flexible, but it requires too many parameters for multiple time series. Conversely, for forecasting, the DCC-GARCH model is more parsimonious. In this regard, it is essential to balance parsimony and flexibility when specifying multivariate GARCH models.

The current systematic review identified 50 research articles on significant aspects of stock market return and volatility, review types, and GARCH model analysis. All of the studies in this review used an empirical research method. A literature review is necessary for scholars, academics, and practitioners; however, assessing various kinds of literature reviews can be challenging. Even an outstanding and demanding literature review article will not be published if it does not provide a sufficient and timely contribution. Too often, literature reviews are fairly descriptive overviews of research carried out between particular years, drawing on the number of articles published, the subject matter covered, the authors represented, and perhaps the methods used, without conducting a deeper investigation. Because conducting a literature review and assessing its standard can be challenging, this article follows a rigorous literature review procedure with the aim, in the long run, of supporting better research.

4. Conclusions

Working on a literature review is a challenge. This paper presents a comprehensive literature review that has mainly focused on studies of the return and volatility of stock markets, using systematic review methods across various financial markets around the world. The review was guided by established recommendations for conducting systematic literature reviews to search, examine, and categorize all existing and accessible literature on market volatility and returns [ 16 ]. Out of the 435 initial research articles located in renowned electronic databases, 50 appropriate research articles were extracted through cross-reference snowballing. These research articles were evaluated for the quality of evidence they produced and were further examined. The raw data were presented by the authors from the literature together with explanations of the data and key underlying concepts. The outcomes of this research provide future directions to research experts for further work on the return and volatility of stock markets.

Stock market return and volatility analysis is a relatively important and emerging field of research. There has been plenty of research on financial market volatility and returns because of the rapidly increasing accessibility and availability of researchable data and computing capacity. GARCH-type models perform well in investigating stock market volatilities and returns, and the popularity of the various GARCH family models has increased in recent times. Every model has its specific strengths and weaknesses, which is one reason such a large number of GARCH models exist. To sum up the reviewed papers, many scholars suggest that GARCH family models provide better results when combined with other statistical techniques. Much of the research showed that, with symmetric information, GARCH (1, 1) could precisely explain the volatilities and returns of the data, whereas under conditions of asymmetric information, the asymmetric GARCH models are more appropriate [ 7 , 32 , 40 , 47 , 48 ]. Additionally, a few researchers have used multivariate GARCH model statistical techniques for analyzing market volatility and returns, showing that more accurate and better results can be obtained with multivariate GARCH family models. Asymmetric GARCH models, such as EGARCH, GJR-GARCH, and TGARCH, have been introduced to capture the effect of bad news on the change in the volatility of stock returns [ 42 , 58 , 62 ]. This study, although short and specific, attempted to give the scholar an overview of the different methods found in this systematic literature review.

With respect to assessing the scholars’ articles, the finding was that the rankings, and specifically the choice of a single GARCH model, were sensitive to the particular stock market volatility and return series analyzed, because stock markets do not share the same characteristics. For this reason, the choice of stock market and model is somewhat difficult and displays some sensitivity to the ranking criterion and estimation methodology; the software applied is a further consideration. The key challenge for researchers is characterizing stock markets using different kinds of data on local stock market returns, volatility detection, and world stock market volatility and returns. Additional challenges are posed by differences of expression between different languages. From an investigative perspective, it was found that different authors and researchers use different datasets for the evaluation of their methods, which may limit comparability between research papers.

When there is assurance that scholars build on work of high accuracy, it will be easier to recognize genuine research gaps instead of merely repeating the same research, to make better progress, to create more appropriate hypotheses and research questions, and, consequently, to raise the standard of research for future generations. This study will be beneficial for researchers, scholars, stock exchanges, regulators, governments, investors, and other concerned parties, and it also contributes to the scope of further research in the area of stock volatility and returns. The content analysis can be extended to the literature of the last few decades. It was determined that many methodologies, such as GARCH models, Johansen models, VECM, impulse response functions, and Granger causality tests, are practiced broadly in examining stock market volatility and return analysis across countries as well as among sectors within a country.

Author Contributions

R.B. and S.W. proposed the research framework together. R.B. collected the data, and wrote the document. S.W. provided important guidance and advice during the process of this research. All authors have read and agreed to the published version of the manuscript.

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.


Deep learning in the stock market—a systematic survey of practice, backtesting, and applications

Open access | Published: 30 June 2022 | Volume 56, pages 2057–2109 (2023)

Kenniy Olorunnimbe and Herna Viktor


The widespread usage of machine learning in different mainstream contexts has made deep learning the technique of choice in various domains, including finance. This systematic survey explores various scenarios employing deep learning in financial markets, especially the stock market. A key requirement for our methodology is its focus on research papers involving backtesting. That is, we consider whether the experimentation mode is sufficient for market practitioners to consider the work in a real-world use case. Works meeting this requirement are distributed across seven distinct specializations. Most studies focus on trade strategy, price prediction, and portfolio management, with a limited number considering market simulation, stock selection, hedging strategy, and risk management. We also recognize that domain-specific metrics such as “returns” and “volatility” appear most important for accurately representing model performance across specializations. Our study demonstrates that, although there have been some improvements in reproducibility, substantial work remains to be done regarding model explainability. Accordingly, we suggest several future directions, such as improving trust by creating reproducible, explainable, and accountable models and emphasizing prediction of longer-term horizons—potentially via the utilization of supplementary data—which continues to represent a significant unresolved challenge.


1 Introduction

Technology has long substantially enabled financial innovation (Seese et al. 2008 ). In Insights ( 2019 ), Deloitte surveyed over 200 US financial services executives to determine their use of Artificial Intelligence (AI) and its impact on their business. A total of 70% of respondents indicated that they use general-purpose Machine Learning (ML), with 52% indicating that they use Deep Learning (DL). For these respondents, the most common uses of DL are reading claims documents for triage, providing data analytics to users through intuitive dashboards, and developing innovative trading and investment strategies.

The Institute for Ethical AI & Machine Learning (EAIML) has developed eight principles for responsible ML development; these include pertinent topics such as explainability, reproducibility, and practical accuracy (The Institute for Ethical AI & Machine Learning 2020 ). Recent research has emphasized the issue of Explainable AI (XAI) and Reproducible AI (Gundersen et al. 2018 ) in numerous application domains. In a survey on XAI, the need for interpretable AI was identified as a major step toward artificial general intelligence (Adadi and Berrada 2018 ). However, more work is needed to ensure domain-specific metrics and considerations are used to assess applicability and usability across diverse ML domains.

Paleyes et al. ( 2020 ) suggest practical considerations for deploying ML in production: “The ability to interpret the output of a model into understandable business domain terms often plays a critical role in model selection, and can even outweigh performance consideration.” For example, Nascita et al. ( 2021 ) fully embrace the XAI paradigms of trustworthiness and interpretability to classify data generated by mobile devices using DL approaches.

In the domain of financial analysis using stock market data, a key tool for achieving explainability and giving research a good chance at real-world adoption is backtesting (de Prado 2018 ; Arnott et al. 2018 ). This refers to using historical data to retrospectively assess a model’s viability and instill the confidence to employ it moving forward. This is based on the intuitive notion that any strategy that worked well in the past is likely to work well in the future, and vice versa (de Prado 2018 ).
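The following sketch illustrates backtesting in its simplest walk-forward form: a model is repeatedly fit on a historical window and its signal is evaluated on the subsequent period, with the realized strategy returns accumulated for later evaluation. The names `fit_model` and `naive_momentum` are placeholders for whatever DL pipeline is being assessed; a realistic backtest would also account for transaction costs, slippage, and data leakage.

```python
import numpy as np

def walk_forward_backtest(prices, train_window=252, fit_model=None):
    """Toy walk-forward backtest over a daily price series.

    prices:       1-D array of daily closing prices.
    train_window: number of past observations used to fit the model each step.
    fit_model:    callable taking the training prices and returning a
                  position signal in {-1, 0, +1} for the next day (placeholder).
    Returns the realized daily strategy returns.
    """
    returns = np.diff(prices) / prices[:-1]
    strategy_returns = []
    for t in range(train_window, len(returns)):
        signal = fit_model(prices[t - train_window:t])   # uses only past prices
        strategy_returns.append(signal * returns[t])     # applied to the next return
    return np.array(strategy_returns)

# Illustrative use with a naive momentum "model" on simulated prices
def naive_momentum(window):
    return 1 if window[-1] > window[0] else -1

prices = np.cumprod(1 + np.random.default_rng(1).normal(0.0003, 0.01, 1000)) * 100
out = walk_forward_backtest(prices, fit_model=naive_momentum)
print("cumulative return:", np.prod(1 + out) - 1)
```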

Numerous surveys have considered applications of DL to financial markets (Jiang 2021 ; Zhang et al. 2021 ; Hu et al. 2021 ; Li and Bastos 2020 ; Ozbayoglu et al. 2020 ). Ozbayoglu et al. ( 2020 ) consider numerous financial applications and demonstrate that applications involving stock market data, such as algorithmic trading and portfolio management, present the most interesting cases for researchers. Elsewhere, Jiang ( 2021 ) focuses on DL research in the stock market, especially research concerning reproducibility; however, despite presenting financial metrics, there is no indication of backtesting or practicality. Meanwhile, Hu et al. ( 2021 ) present an analysis based on evaluation results, such as bins of accuracy results and ranges of returns, but offer no clear explanation of the different kinds of metrics and do not consider XAI.

The authors of Li and Bastos ( 2020 ) emphasize the importance of evaluations using financial metrics but limit their focus to profitability as a financial evaluation. Although they do discuss volatility, it is not considered in their evaluation, even though a model with a high level of accuracy can still produce poor financial returns. This survey explores the strategies that various researchers have employed to understand DL in the stock market, focusing on studies addressing explainability, reproducibility, and practicality. To the best of our knowledge, this work represents the first study to adopt backtesting and domain-specific evaluation metrics as primary criteria. This is represented by the following specific questions:

What current research methods based on deep learning are used in the stock market context?

Are the research methods consistent with real-world applications, i.e., have they been backtested?

Is this research easily reproducible?

To answer question 2, we focus on works that were backtested as part of the research methodology. Proper backtesting provides assurance that the algorithm has been tested over different time horizons, consistent with domain-specific considerations, which improves investor confidence and makes application in a real-world trading scenario more likely (Arnott et al. 2018 ). This serves as the primary criterion for the literature reviewed. For question 3, we consider not only works where the source data and code are provided but also works where the research could otherwise be reproduced. Section  4 further explains the approach employed and the search criteria.

Section  2 explains the characteristics, types, and representations of stock market data. Then, Sect.  3 discusses applications of DL in the stock market; we begin that section by summarizing the different DL techniques currently used in the stock market context and conclude it by itemizing the specific ways these techniques are applied to stock market data. In Sect.  4 , we elaborate on and answer the research questions by summarizing our survey findings. Section  5 presents challenges that remain unresolved and future research directions, and Sect.  6 concludes the survey.

2 Understanding stock market data

As with other ML applications, data represent a crucial component of the stock market learning process (de Prado 2018 ). Understanding the different forms of data employed when applying DL to the stock market helps to properly identify the data requirements for the task in question. This section considers the different characteristics, types, and representations of data that are relevant to mining stock market data using DL. Notably, as will become evident, some of these data forms are quite specific to stock market data.

2.1 Data characteristics

2.1.1 Source

Although trading venues such as stock exchanges are often perceived as the main source of stock market data, in recent years other data sources, including news articles and social media, have been explored as inputs for ML processes (Day and Lee 2016 ; Haibe-Kains et al. 2020 ; Yang et al. 2018 ; Adosoglou et al. 2020 ). There is a direct correlation between data source and data type, as Sect.  2.2 demonstrates. The data source also largely depends on the intended type of analytics. If the goal is a simple regression task using purely historical market data, then the primary or only source could be trading data from the trading venue. For more complicated tasks, such as studying the effect of user sentiment on stock movements, it is common to combine trading data with data obtained from social media services or comments on relevant news articles. Regardless of the complexity of the task at hand, it is rare not to use the trading venue as a source because trading data are almost always integral. Although several of the studies considered do not incorporate trading data, e.g., Bao and Liu ( 2019 ) and Ferguson and Green ( 2018 ), these are generally theoretical studies that utilize simulated data.

2.1.2 Frequency

Data frequency concerns the number of data points within a specific unit of time (de Prado 2018 ). What any particular data point captures can be reported in different ways, from being represented as an aggregate (e.g., min, max, average) to using actual values. Data granularity can range from a daily snapshot (typically the closing value for trading data) to a fraction of a second for high-frequency market data. A more established representation of stock market data as bars (Sect.  2.3.1 ) refers to presenting multiple data points as an understandable aggregate of the highlights within that time interval.

For non-traditional data sources, such as news or social media, it is quite common to combine and summarize multiple individual items within the same time interval. For example, Day and Lee ( 2016 ) use multiple daily news headlines as part of the training data. Elsewhere, a sentence encoder (Conneau et al. 2017 ) is used to generate equal-length vectors from differently sized sets of words representing different sentences. The literature reviewed commonly uses a snapshot or aggregated data to summarize a data point within a time interval. This could be because the volume of the data is directly proportional to its granularity; consequently, more parameters are required in neural networks processing highly granular data.

2.1.3 Volume

Although the volume of the data closely relates to the frequency of the data and the specific unit of data (de Prado 2018 ), we should differentiate volume from frequency: while a high frequency typically translates to a relatively high volume, volume might not directly correlate with data frequency. This becomes more apparent when we consider seasonality or holidays for the same time interval. We can also recognize that, based on the time of day, the volume of data generated for the same subject of interest within the same period can be vastly different, suggesting a differential occurrence rate. This is particularly relevant for non-conventional data types, such as news and social media data, where high volume might not be directly correlated with high data frequency.

Using Apple Inc. as an example (Investing.com 2013 ), a day marking a product announcement produces a substantially larger volume of news articles and relevant social media content than other days. Although this content might not affect the volume of the trading data—which depends more heavily on market data frequency—such instances might produce noticeable differences in the rate of change in market values. An increased rate warrants a different level of attention compared to a typical market day. The relationship between market data frequency and alternative data volume itself represents an interesting area of research that deserves a special level of attention.

Understanding data volume and data frequency is critical to designing infrastructure for processing data. As data volume approaches the realm of big data, precluding efficient computation in memory, it is necessary to consider alternative ways of processing data while utilizing relevant components of that data. Here, we begin considering ways of parallelizing the learning process without losing relationships between parallel batches. Data processing at such a scale requires parallel processing tools, such as those described by Zaharia et al. ( 2010 ).

2.2 Data types

2.2.1 Market data

Market data are trading activity data generated by trading venues such as stock exchanges and investment firms. They are typically provided via streaming data feeds or an Application Programming Interface (API) used within protocols such as the Financial Information eXchange (FIX) and the GPRS Tunnelling Protocol (GTP) (Wikipedia 2020d ) (accessed 19-Aug-2020). A typical trade message concerning stock market data comprises a ticker symbol (representing a particular company), bid price, ask price, time of last quote, and size of the sale (Table 1 ).

For messages with quote data, we expect to see both the bid price and volume and the ask price and volume. These represent the prices at which participants are willing to buy and sell the asset, and in what volume. Market data represent the core data type used by ML research in the stock market context and typically provide a detailed representation of trading activities in market assets such as equities/shares, currencies, derivatives, commodities, and digital assets. Derivatives can be further broken down into futures, forwards, options, and swaps (Derivative 2020 ).
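As a rough illustration of the fields discussed above, a single trade or quote message could be represented as a small record like the one below; the field names and values are hypothetical, and real feeds (e.g., FIX messages) carry many more fields.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class QuoteMessage:
    """Simplified Level I quote/trade snapshot (field names are illustrative)."""
    symbol: str          # ticker symbol, e.g. "AAPL"
    timestamp: datetime  # time of the last quote or trade
    bid_price: float     # highest price a buyer is currently willing to pay
    bid_size: int        # volume available at the bid
    ask_price: float     # lowest price a seller is currently willing to accept
    ask_size: int        # volume available at the ask
    last_price: float    # price of the most recent trade
    last_size: int       # size of the most recent trade

msg = QuoteMessage("AAPL", datetime(2013, 9, 10, 10, 30), 494.50, 300,
                   494.55, 200, 494.52, 100)
print(msg.ask_price - msg.bid_price)   # bid-ask spread
```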

Market data can be either real-time or historical (de Prado 2018 ). Real-time data are used to make real-time trading decisions about buying and selling market instruments. Historical data are used to analyze historical trends and make informed decisions regarding future investments. Typically, historical data can contain intraday or end-of-day data summaries. The granularity of real-time data can be as detailed as a fraction of a second, with some tolerance for short delays. Comparing data for the same period, the frequency of a real-time data feed is expected to be much higher than historical data.

We can further separate market data, based on the details it contains, into Level I and Level II market data. Level II data contains more information and provides detailed information on bids and offers at prices other than the highest price (Zhang et al. 2019 ). Level I data generally contain the basic trading data discussed thus far. Level II data are also referred to as order book or depth of book because they show details of orders that have been placed but not yet filled. These data also show the number of contracts available at different bid and ask prices.
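To make the distinction concrete, Level I data correspond roughly to the top of the order book (best bid and ask), while Level II data include the resting orders at other price levels. A minimal sketch of such a depth-of-book snapshot is shown below; the structure and field names are illustrative, not a specific feed format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BookLevel:
    price: float
    size: int     # number of shares/contracts available at this price

@dataclass
class OrderBookSnapshot:
    """Simplified Level II (depth of book) snapshot; fields are illustrative."""
    symbol: str
    bids: List[BookLevel] = field(default_factory=list)  # sorted from highest bid down
    asks: List[BookLevel] = field(default_factory=list)  # sorted from lowest ask up

    def best_bid(self) -> BookLevel:
        return self.bids[0]

    def best_ask(self) -> BookLevel:
        return self.asks[0]

book = OrderBookSnapshot(
    symbol="AAPL",
    bids=[BookLevel(494.50, 300), BookLevel(494.45, 800)],
    asks=[BookLevel(494.55, 200), BookLevel(494.60, 500)],
)
print(book.best_ask().price - book.best_bid().price)   # top-of-book spread
```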

2.2.2 Fundamental data

Unlike market data, which relate directly to trading activity in the asset of interest, fundamental data are based on information about the company the asset is attached to (Christina Majaski 2020 ). Such data depict the company’s standing using information such as cash flow, assets, liabilities, profit history, and growth projections. This kind of information can be obtained from corporate documentation such as regulatory filings and quarterly reports. Care has to be taken to confirm when fundamental data points became publicly available because they are typically reported with a lag. This means that the analysis of the data must be aligned with the date the data became publicly available, not necessarily the date the report was filed or indexed.

Notably, some fundamental data are reported while some underlying figures are yet to be made available and are backfilled once they become available. When fundamental data are published before the source data become available, placeholder values are used during the interim period. Furthermore, given that companies can issue revisions or corrections to sources multiple times, these need to be corrected in the fundamental data, which suggests the need to incorporate a backfilling technique into the data consumption design. By definition, the frequency of this kind of data is very low compared to market data, which might explain why limited DL literature employs fundamental data. However, this also indicates a gap in research utilizing this kind of data, which would ideally be filled by considering fundamental data alongside other data types to provide a significant learning signal that remains to be fully exploited.

2.2.3 Alternative data

Alternative data represents any other unconventional data type that can add value to already-established sources and types (de Prado 2018 ). This can range from user-generated data (e.g., social media posts, financial news, and comments) to Internet-of-Things data (e.g., data from different sensors and devices). Alternative data typically complement the aforementioned data types, especially market data. Given the nature of alternative data, they are typically much larger, hence requiring a sophisticated processing technique.

Notably, alternative data includes a vast amount of data that is open to interpretation because the signal might not be immediately obvious. For example, a market participant interested in Apple Inc. stocks might choose to observe different news articles related to the company. Although there might be no direct reports about the company releasing a new product line, news reports about key meetings or large component purchases can indicate the plausibility of action. Accordingly, stock market professionals and researchers have become attentive to such indirect signals, and now consider alternative data essential to their data pipeline. Numerous researchers now combine traditional data types with either or both news article and social media content to make market predictions. Social media especially has become a very popular alternative data type, primarily due to its position in the mainstream.

Table  3 presents representative attributes of the different data types. All of the attributes associated with market data and fundamental data are numerical and aggregated based on the available time series. For example, the intraday market data entry in row 1 of Table  2 shows the open and close prices for a one-hour time window that begins at 10 am and ends at 10:59 am. It also includes the maximum and minimum price and the total volume traded within the same window.

A fourth data type, known as analytics data (de Prado 2018 ), describes data derived from any of the other three types. Examples of analytics data attributes are earnings projections or sentiment extracted from news or tweets combined with trade volume. We have chosen not to include this category because it does not clearly represent a direct source, and it is usually unclear what heuristics have been used to obtain the derived data points. Furthermore, given that the objective of academic research is to make the metrics explicit, it is counter-intuitive to consider such data usable input.

Table  4 presents the characteristics of the data employed by the literature reviewed, including the aforementioned data types. It is apparent that market data represents the most common type, with actual trading prices and volumes often paired with fundamental data to compute technical indicators (Soleymani and Paquet 2020 ; Wang et al. 2019b ). Table  5 presents a more complete representation of freely or publicly available data sources that fully itemizes attributes.

Sources including investing.com , finance.yahoo.com and kaggle.com utilize either API or libraries, facilitating interactions with them and unlocking better integration with the ML system. Sources without any programmatic interface usually make data available as manual downloadable files.

The other major factor that affects the preferred data source is the frequency of availability, for example, whether the data is available multiple times a day (intraday data) or once a day (interday data). Given the potential volume and size of historical data, it is common for intraday data to remain available for a shorter timeframe than interday data, especially for freely available data sets. However, in most cases, it is possible to pay for intraday data for a longer timeframe if required for lower latency projects.

2.3 Data representation

Data generated from the stock market are typically represented as Bars and Charts . These representations are worth discussing because they are the most typical ways of representing the data either numerically (bars) or graphically (charts).

2.3.1 Bars

Bars enable the extraction of valuable information from market data in a regularized manner (de Prado 2018 ). Bars can be categorized into standard and more advanced types, with the advanced types derived computationally from the standard types. However, the standard types are more common and also form the basis of chart representation.

Standard bars help to summarize market data into equivalent intervals and can be used with both intraday and historical data (Fig. 1 ). The different types of standard bars all typically contain certain basic information for the specified interval, including the timestamp, Volume-Weighted Average Price (VWAP), open price, close price, high price, low price , and traded volume , all within the specified interval. The VWAP is based on the total traded for the day, irrespective of the time interval, and is computed as \(\sum price \cdot volume/\sum volume\) . The different standard bars are described in the following paragraphs.

Figure 1. Survey structure.

Figure 2. Intraday tick time series showing trade price and volume within trading hours, across 2 days (Investing.com 2013 ).
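Using the definition just given, the VWAP for a set of trades within an interval can be computed in a few lines; the price and volume arrays below are illustrative.

```python
import numpy as np

def vwap(prices, volumes):
    """Volume-Weighted Average Price: sum(price * volume) / sum(volume)."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    return np.sum(prices * volumes) / np.sum(volumes)

# Illustrative trades within a single interval
print(vwap([100.0, 100.5, 99.8], [200, 150, 400]))
```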

Time bars This is the most common bar type; it derives from summarizing data into equivalent time intervals and includes all of the standard bar information mentioned above. Intraday hourly time bars feature standard bar information for every hour of the day; for historical data, it is common to obtain details for each day. Table  2 exemplifies the information that intraday time bars can capture.

The VWAP assists by demonstrating the trend for the price of a traded item during a given day. This single-day indicator is reset at the start of each trading day and should not be used in the context of daily historical data.
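To make the construction concrete, the following sketch aggregates simulated tick data into hourly time bars with pandas and computes the VWAP as \(\sum price \cdot volume/\sum volume\). The simulated ticks and the column names are assumptions for illustration only.

# Sketch: aggregating ticks into hourly time bars. The simulated tick data and
# the column names ('price', 'volume') are illustrative assumptions.
import numpy as np
import pandas as pd

rng = pd.date_range("2019-01-02 09:30", "2019-01-02 16:00", freq="min")
ticks = pd.DataFrame({
    "price": 100 + np.random.randn(len(rng)).cumsum() * 0.01,
    "volume": np.random.randint(1, 500, len(rng)),
}, index=rng)

def time_bars(ticks: pd.DataFrame, freq: str = "60min") -> pd.DataFrame:
    ticks = ticks.assign(pv=ticks["price"] * ticks["volume"])
    grouped = ticks.resample(freq)                       # hourly intervals
    bars = grouped["price"].ohlc()                       # open, high, low, close
    bars["volume"] = grouped["volume"].sum()
    bars["vwap"] = grouped["pv"].sum() / grouped["volume"].sum()
    return bars.dropna()

print(time_bars(ticks))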

Tick bars Unlike time bars that capture information at regular time intervals, tick bars capture the same information at a regular number of transactions or ticks . Ticks are trades in the stock market that can be used to represent the movement of price in trading data (i.e., the uptick and downtick ). Ticks are commonly used for different stages of modeling market data, as in the case of backtesting . However, historical stock market data are not as freely accessible in the form of tick bars, especially for academic research purposes. For this reason, most of the literature reviewed uses time bars, despite their statistical inferiority for predictive purposes.

Volume bars Although tick bars exhibit better statistical properties than time bars (i.e., they are closer to independent distribution), they still feature the shortcoming of uneven distribution and propensity for outliers (de Prado 2018 ). This can be because a large volume of trade is placed together from accumulated bids in the order book, which gets reported as a single tick, or because orders are equally recorded as a unit, irrespective of size. That is, an order for 10 shares of a security and an order for 10,000 shares are both recorded as a single tick. Volume bars help to mitigate this issue by capturing information at every predefined volume of securities. Although volume bars feature better statistical properties than tick bars (Easley et al. 2012 ), they are similarly seldom used in academic research.

Range bars Range bars involve information being captured when a predefined monetary range is traded. They are also referred to as dollar bars (de Prado 2018 ). Range bars are particularly useful because, by nature, securities appreciate or depreciate constantly over a given period. Consider a security that has depreciated by 50% over a certain period; by the end of that period, it is possible to purchase twice as much as at the beginning. For instance, consider a security that has depreciated from $100 to $50 over a given period. A capital investment of $1000 would only have obtained 10 units at the start of the depreciation period; however, at the end of the period, that investment can obtain 20 units. Furthermore, corporate actions (e.g., splits, reverse splits, and buy-backs) do not impact range bars to the extent that they impact ticks and volume bars.
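For contrast, the sketch below forms volume bars and dollar bars from the same kind of tick data by starting a new bar whenever a cumulative threshold is crossed. The thresholds and the layout of the ticks DataFrame are assumptions for illustration, not a reference implementation.

# Sketch: volume bars and dollar bars from tick data. Assumes a DataFrame with
# a DatetimeIndex and 'price'/'volume' columns (as in the previous sketch);
# thresholds are illustrative only.
import pandas as pd

def threshold_bars(ticks: pd.DataFrame, threshold: float, metric: str) -> pd.DataFrame:
    """Group ticks into bars each time the cumulative `metric` crosses `threshold`.
    metric='volume' gives volume bars; metric='dollar' gives dollar bars."""
    if metric == "volume":
        values = ticks["volume"]
    else:
        values = ticks["price"] * ticks["volume"]         # traded dollar value
    bar_id = (values.cumsum() // threshold).astype(int)   # bar index for each tick
    grouped = ticks.groupby(bar_id)
    bars = grouped["price"].ohlc()
    bars["volume"] = grouped["volume"].sum()
    bars["start"] = ticks.index.to_series().groupby(bar_id).first()
    return bars

# Example usage (thresholds are arbitrary):
# volume_bars = threshold_bars(ticks, threshold=50_000, metric="volume")
# dollar_bars = threshold_bars(ticks, threshold=5_000_000, metric="dollar")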

2.3.2 Charts

Charts visually represent the aforementioned bars, especially time bars. It might not be clear how these are relevant to a survey of DL applications in the stock market context, given it is possible to use the actual data that the charts are based on. However, various novel applications have used charts as training data. For example, (Kusuma et al. 2019 ) uses the candlestick plot chart as the input image for a Convolutional Neural Network (CNN) algorithm. The charts most commonly used to visually represent stock market data are line, area, bar, and candlestick charts. Of interest here, however, are the candlestick and bar charts, which visually encode valuable information that can be used as input for DL algorithms.

Fig. 3 Candlestick & bar charts

Candlestick and bar charts can visually represent Open-High-Low-Close (OHLC) data, as Figure  3 shows. These two types of charts are optionally color-coded, with red indicating bearish (closing lower than it opened) and green indicating bullish (closing higher than it opened). By properly encoding this information into these charts, an algorithm such as CNN can interpret numerous signals to generate an intelligent model.
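As a brief illustration of how OHLC bars translate into such charts, the sketch below renders a candlestick chart with a volume panel to an image file, which could then serve as CNN input in the style of the approaches cited above. It assumes the third-party mplfinance package and a DataFrame of daily bars with Open/High/Low/Close/Volume columns (column names may need normalizing depending on the source).

# Sketch: rendering OHLC bars as a candlestick chart image, assuming the
# third-party mplfinance package (pip install mplfinance) and a DataFrame
# `bars` with a DatetimeIndex and Open/High/Low/Close/Volume columns.
import mplfinance as mpf

mpf.plot(
    bars,
    type="candle",          # candlestick representation; type="ohlc" gives a bar chart
    volume=True,            # add a volume sub-panel
    style="yahoo",          # green-up / red-down colour coding
    savefig="candles.png",  # persist as an image, e.g. for use as CNN input
)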

2.4 Lessons learned

The importance of the distinctive structure and differing representations of stock market data cannot be overstated. This section has considered some of these differences, especially those relevant to stock-market implementations of ML algorithms using DL. Understanding data characteristics based on specific use cases can determine a given data set’s suitability for the intended use case. By understanding the different types of data used in the stock market, we can refer to the data types needed, which closely relate to their characteristics. For example, given the nature of alternative data, we can expect it to feature significant volume, especially in comparison to fundamental data.

The frequency of data also varies significantly by type. Understanding the granularity of the intended task enables determination of the frequency of the data to be obtained. For example, intraday market data will be required for modeling tasks requiring minute- or hour-level data. This also affects the volume of data required. Data representation, especially of market data, is also noteworthy: the required frequency guides whether data is represented as summarized time bars or as tick-by-tick data.

Chart representations of market data also provide novel ways of learning from visual representations. Candlestick and bar charts convey information at a rich and detailed level worthy of exploitation as a learning source. Nonetheless, this is accompanied by the complex task of consuming the image rather than the data that it is based upon and, although Kusuma et al. (2019) used a candlestick chart for this purpose, the authors did not compare its performance against that obtained using the raw data. It would be interesting to observe comparisons of results for raw data and visual representations of that same data.

3 Deep learning for stock market applications

3.1 What is deep learning?

Deep learning describes an ML technique based on networks of simple concepts and featuring different arrangements or architecture that allows computers to learn complicated concepts from simple nodes that are graphically connected using multiple layers (Goodfellow et al. 2016 ). The resurgence of DL was led by probabilistic or Bayesian models such as Deep Belief Networks (DBN)  (Hu et al. 2021 ; Goodfellow et al. 2016 ), which comprise nodes representing random variables with probabilistic relationships to each other. More recently, however, Artificial Neural Networks (ANN) that comprise nodes representing neurons that are generated by the training process have witnessed increasing popularity. All of the architectures we encounter in this survey are based on ANN; this section details these architectures.

Generally speaking, ANN are information processing systems with designs based on the human nervous system, specifically the brain, and that emphasize problem-solving (Castro 2006 ). Typically, they comprise many simple processing elements with adaptive capabilities that can process a massive amount of information in tandem. Given neurons are the basic units for information processing in the brain, their simplified abstraction forms the foundation of ANN. The features and performance characteristics that ANN share with the human nervous system are (Castro 2006 ):

The initial information processing unit occurs in elements known as neurons , nodes or units .

Neurons can send and receive information from both each other and the environment.

Neurons can be connected, forming a connection of neurons that can be described as neural networks .

Information is transmitted between neurons via connection links called synapses .

The efficiency of synapses, represented by an associated weight value or strength , corresponds, in aggregate, to the information stored in the neural network.

To acquire knowledge, connective strengths (aggregated weight values) are adapted to the environmental stimuli, a process known as learning .

Patterns are created by the information stored between neurons, which represents their synaptic or connective strength (Goodfellow et al. 2016 ). Knowledge is represented to influence the course of processing, which becomes a part of the process itself. This invariably means that learning becomes a matter of finding the appropriate connective strength to produce satisfactory activation patterns. This generates the possibility that an information processing mechanism can learn by tuning its connective strength during the processing course. This representation also reveals that knowledge is distributed over the connections between numerous nodes, meaning no single unit is reserved for any particular pattern.

Thus, an ANN can be summarized according to these three key features:

A set of artificial neurons , also known as nodes, units, or neurons.

A method for determining weight values, known as training or learning techniques.

A pattern of connectivity, known as the network architecture or structure .

The following sections detail these three features.

3.1.1 Artificial neurons

A biological neuron primarily comprises a nucleus (or soma ) in a cell body and neurites ( axons and dendrites ) (Wikipedia 2020b ). The axons send output signals to other neurons, and the dendrites receive input signals from other neurons. The sending and receiving of signals take place at the synapses , where the sending (or presynaptic ) neuron contacts the receiving (or postsynaptic ) neuron. The synaptic junction can be at either the cell body or the dendrites. This means that the synapses are responsible for signal/information processing in the neuron, a feature that allows them to alter the state of a postsynaptic neuron, triggering an electric pulse (known as action potential ) in that neuron. The spikes cause the release of neurotransmitters at the axon terminals, which form synapses with the dendrites of other neurons. The action potential only occurs when the neuron’s intrinsic electric potential (known as membrane potential ) surpasses a threshold value.

An artificial neuron attempts to emulate these biological processes. In an artificial neuron, the synapse that connects the input to the rest of the neuron is known as a weight , characterized by synaptic strength, synaptic efficiency, connection strength , or weight value . Figure  4 shows a typical artificial neuron.

Fig. 4 Model of a typical neuron (Castro 2006)

As each input connects to the neuron, it is individually multiplied by the synaptic weight at each of the connections, which are aggregated in the summing junction . The summing junction sums the weighted inputs and adds the neuron’s bias value, i.e., \(z = \sum \mathbf {wx}+ b\) , as Fig. 4 depicts. The activation function (also referred to as the squashing function ) is represented as \(g(z)\) and has the primary role of limiting the permissible value of the summation to some finite value. It determines a neuron’s output relative to its net input, representing the summing junction’s output. Thus, the neuron’s consequent output, also known as the activation ( \(a\) ), becomes \(a = g(z) = g\left( \sum \mathbf {wx} + b \right)\) .

During the learning process, it is common to randomly initialize the weights and biases. These parameters are used by the activation to compute the neuron’s output. In this simple representation of one neuron, we can imagine that the output (prediction) of the neuron is compared with the target (true value) using a loss function to generate the error rate. The error rate is then propagated back through the network, a process called backpropagation  (Rumelhart et al. 1986 ), and the parameters are updated using an optimization method such as Stochastic Gradient Descent . This process is repeated over multiple iterations or epochs until a defined number of iterations is achieved or the error rate falls below a satisfactory threshold.
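The following minimal sketch illustrates these steps for a single neuron with a sigmoid activation: a forward pass computing \(a = g(\sum \mathbf{wx} + b)\), a squared-error loss, and one gradient-descent update. All values are illustrative and the example is not tied to any particular dataset.

# Minimal sketch of a single artificial neuron and one stochastic-gradient
# update, using a sigmoid activation. All values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # input features
y = 1.0                           # true target
w = rng.normal(size=3)            # randomly initialised weights
b = 0.0                           # bias
lr = 0.1                          # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass: z = sum(w*x) + b, activation a = g(z)
z = w @ x + b
a = sigmoid(z)

# Squared-error loss and its gradient with respect to z (chain rule)
loss = 0.5 * (a - y) ** 2
dz = (a - y) * a * (1 - a)

# Gradient-descent update of the weights and bias
w -= lr * dz * x
b -= lr * dz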

Multiple types of activation functions (Wikipedia 2020b ) are used across different neural network architectures. The Rectified Linear Unit (ReLU) activation function has been more popular in recent applications of Feed-Forward Neural Networks (FFNN) because it is not susceptible to the vanishing gradient issue (Wikipedia 2020c ), which impacts use of the sigmoid function across multiple layers. It is also more computationally efficient. Other ReLU generalizations, such as Leaky ReLU or Parametric ReLU (PReLU), are also commonly used. However, sigmoid continues to be used as a gating function in recurrent networks to maintain values between 0 and 1, hence controlling what passes through a node (Goodfellow et al. 2016 ). The hyperbolic tangent (tanh) activation function is also commonly used in recurrent networks, keeping the values that pass through a node between − 1 and 1 (Goodfellow et al. 2016 ).

3.1.2 Learning techniques

In the ANN context, learning refers to the way a network’s parameters adapt according to the input data. Typically, the learning technique is based on how weights are adjusted in the network and how data is made available to the network (Figs. 5 , 6 ).

Fig. 5 Supervision-based learning technique

Fig. 6 Learning technique based on data availability

Technique based on weight adjustment: The most common learning technique category, this technique is based solely on how weights are adjusted across an iterative process and is dependent on the type of supervision available to the network during the training process. The different types are supervised, unsupervised (or self-organized), and reinforcement learning.

Technique based on data availability: When categorized according to how data is presented to the network, the learning technique can be considered offline or online. This technique might be chosen because the complete data are not available for training in one batch. This could be because either data are streaming or a concept in the data changes at intervals, requiring the data to be processed in specific time windows. Another reason could be that the data are too large to fit into the memory, demanding processing in multiple smaller batches.

Techniques based on supervision are the most common for ML (and indeed DL), with an increasing number of studies adopting batch learning approaches. Nonetheless, the primary architecture of DL networks is not exclusive to one technique category; instead, it is typical to find a mix of both, i.e., offline supervised learning and online reinforcement learning. Unless otherwise specified, it can be assumed that the technique is offline/batch learning. For example, supervised learning refers to offline supervised learning unless it is specified as online. The key point is that each supervision-based technique can be further categorized according to data availability.

3.1.3 Network architecture

The architecture of an ANN describes how its neurons and layers are organized. Network inputs depend solely on training data, and, for the most part, the output represents a function of the expected output. The layers between the input and output are mostly a design decision that depends largely on the network architecture, which is based on a typical neural network’s system of multiple connections. Numerous ANN architectures exist across various domains, including communication systems and healthcare (Aceto et al. 2019 ; O’Shea and Hoydis 2017 ; Xiao 2021 ), with the stock market applications this survey considers adopting even more derivative architectures with easily identifiable and well-known foundations. Figure  7 presents these architectures and their common categorizations based on how they learn weight parameters. The following section describes their differences.

Fig. 7 Taxonomy of deep learning architecture used in stock market applications

The learning techniques based on these architectures can be either discriminative or generative . A discriminative model discriminates between different data classes by learning the boundaries between them or the conditional probability distribution \(p(y|x)\) ; meanwhile, a generative model learns the distribution of individual classes or the joint probability distribution \(p(x,y)\)  (Hinton 2017 ). Although most traditional ANN architectures are discriminative, autoencoders and Boltzmann machines are considered generative. In a Generative Adversarial Network  (Hinton 2017 ), the two techniques are combined in a novel adversarial manner.

3.1.3.1 Feed-forward neural networks

FFNN comprise multiple neurons connected in layers and are widely used in DL architectures. Figure  8 presents the architecture of an FFNN. It comprises an input layer , representing the input example, one or more hidden layers , and an output layer  (Goodfellow et al. 2016 ).

Fig. 8 n-layer feed-forward neural network (Castro 2006)

Although Goodfellow et al. ( 2016 ) suggest that “a single layer is sufficient to represent a function”, they also recommend deeper layers for better generalization. Ideally, the number of hidden layers should be decided for the specific task via experimentation. The input layer comprises a feature vector representing the input example that is fed to the first hidden layer. The hidden layer(s) and the output layer comprise multiple neurons, each with a vector of weights of the same size as the input, as well as a bias value. Within the layers, each neuron’s output becomes the input for the next layer, until, finally, the output layer uses the final activation to represent the model’s prediction.

Broadly, this process aims to derive a generalization about the weights and biases associated with each neuron in the network, that is, derive generalizable values of \({\mathbf {w}}, b\) to compute \(z = \sum \mathbf {wx}+ b\) for each neuron (with input \({\mathbf {x}}\) ) in the network. Using an iterative training process of forward and backward propagation over multiple examples (training data), each layer’s activations are propagated forward across the network, and the error rate is propagated back to the first hidden layer. Following the learning process, the network (model) can then be used to predict unseen/untested examples.
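To ground this, the sketch below defines a small feed-forward network and runs a few iterations of the forward/backward training loop described above. It assumes PyTorch; the layer sizes, binary up/down target, and random data are purely illustrative.

# Sketch of a small feed-forward network and training loop, assuming PyTorch.
# Layer sizes and the binary-classification target are illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),   # input layer -> first hidden layer (10 input features)
    nn.ReLU(),
    nn.Linear(32, 16),   # second hidden layer
    nn.ReLU(),
    nn.Linear(16, 1),    # output layer (e.g., next-day up/down logit)
)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(64, 10)                   # a batch of 64 feature vectors
y = torch.randint(0, 2, (64, 1)).float()  # dummy binary targets

for _ in range(10):                       # a few training iterations
    optimizer.zero_grad()
    loss = criterion(model(x), y)         # forward pass and loss
    loss.backward()                       # backpropagation
    optimizer.step()                      # gradient descent update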

3.1.3.2 Recurrent neural network

Recurrent Neural Networks (RNN) are a special type of neural network that keeps a representation of the previously seen input data. These networks are ideal for processes where the temporal or sequential order of the input example is relevant (Goodfellow et al. 2016 ).

Fig. 9 RNN (Goodfellow et al. 2016)

The recurrence is represented as a loop in each neuron, as Fig.  9 shows, allowing one or more passes of the same input, with the network maintaining a state representation of each pass. Following the specified number of passes, the final state is transmitted as output parameters. This means that RNN allow the possibility of inputs and outputs of variable length. That is, given the loop’s flexibility, the architecture can be constructed to be one-to-one, one-to-many, many-to-one, or many-to-many.

However, typical RNN make it difficult for the hidden state to retain information over a long period. That is, they have a short memory due to the gradient becoming smaller and smaller as it is propagated backward in time steps across the recurring loop, a phenomenon known as vanishing gradient . This means that for temporal data, in which the relevant relationship between data points occurs over a lengthy period, a typical RNN model is not ideal. Thus, other versions of RNN have been formulated, with the most frequently used approaches being Long Short-term Memory (LSTM) and Gated Recurrent Unit (GRU) (Goodfellow et al. 2016 ). These architectures can largely reduce the vanishing gradient effect by maintaining a cell state via additive updates rather than just the RNN hidden state with product updates (Fig. 10 ).

Fig. 10 LSTM & GRU (Goodfellow et al. 2016)
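As an illustration, the sketch below applies an LSTM to sliding windows of market features and takes the final hidden state as the prediction. It assumes PyTorch; the window length, feature count, and hidden size are arbitrary choices for the example.

# Sketch of an LSTM over sliding windows of market features, assuming PyTorch.
# Window length, feature count, and hidden size are illustrative assumptions.
import torch
import torch.nn as nn

class PriceLSTM(nn.Module):
    def __init__(self, n_features=5, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)     # e.g., a next-day return estimate

    def forward(self, x):                    # x: (batch, window, n_features)
        out, _ = self.lstm(x)                # out: (batch, window, hidden)
        return self.head(out[:, -1, :])      # prediction from the last time step

model = PriceLSTM()
window = torch.randn(32, 30, 5)              # 32 windows of 30 days x 5 features
prediction = model(window)                   # shape: (32, 1)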

3.1.3.3 Convolutional neural networks

Another network architecture type that has gained substantial popularity, especially for analyzing digital images, is CNN (Goodfellow et al. 2016 ). The reason is that CNN can condense a large number of pixels into a much smaller set of learned features, vastly reducing the number of parameters to work with and making the ANN highly efficient. Unlike more conventional ANN, in which the input is represented as a feature vector, CNN represent the input as a matrix, which they use to generate the first convolutional layer .

Fig. 11 Architecture of a convolutional neural network (Goodfellow et al. 2016)

A typical CNN will contain one or more convolutional layers, each connected to its respective pooling layer . Figure  11 provides a simple representation of such a network.
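The sketch below shows such a network with two convolutional layers, each followed by a pooling layer, applied to greyscale chart-like images. It assumes PyTorch; the image size, channel counts, and three-class output are illustrative assumptions.

# Sketch of a small convolutional network over chart images, assuming PyTorch.
# Image size (1 x 64 x 64) and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # first convolutional layer
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling layer halves height and width
    nn.Conv2d(16, 32, kernel_size=3, padding=1), # second convolutional layer
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 3),                  # e.g., up / flat / down classes
)

images = torch.randn(8, 1, 64, 64)               # a batch of 8 greyscale charts
logits = cnn(images)                             # shape: (8, 3)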

3.1.3.4 Autoencoder

Autoencoders are unsupervised ANN that efficiently encode input data, a process known as latent representation or encoding . This process involves using input data as a feature vector and attempting to reconstruct the same data using fewer nodes than the input (Goodfellow et al. 2016 ). As such, autoencoders are frequently used for dimensionality reduction.

Fig. 12 A simple Autoencoder (Goodfellow et al. 2016)

As Fig.  12 shows, an autoencoder’s architecture imposes a bottleneck for encoding the input representation. A decoder layer subsequently reproduces an output to represent the reconstructed input. In so doing, it learns a representation of the input data while ignoring the input noise. The encoder’s representation of the transformed input is referred to as the code , and it is the internal or hidden layer of the autoencoder. The decoder subsequently generates the output from the code.

Autoencoders are commonly used in stock market data for their dimension reduction functionality (Chen et al. 2018a ; Chong et al. 2017 ) to avoid the curse of dimensionality (Soleymani and Paquet 2020 ). This is an important consideration for stock market data, where there is value in network simplicity without losing important features. In Soleymani and Paquet ( 2020 ), a restricted stacked autoencoder network reduces an 11-feature set to a three-feature set before it is fed into a CNN architecture in a deep reinforcement learning framework called DeepBreath . This enables an efficient approach to a portfolio management problem in a setting that combines offline and online learning. Elsewhere, (Hu et al. 2018a ) combines CNN and autoencoder architectures in its Convoluted Autoencoder (CAE) to reduce candlestick charts to numerical representations for measuring stock similarity.
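As a simplified illustration (not the authors’ implementation), the sketch below compresses an 11-feature input to a three-dimensional code with a symmetric decoder and minimizes the reconstruction error. It assumes PyTorch, and the sizes are taken from the description above purely for illustration.

# Sketch of an autoencoder compressing an 11-feature input to a 3-dimensional
# code, loosely mirroring the reduction described above. Assumes PyTorch; this
# is not the authors' implementation.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features=11, code_size=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 8), nn.ReLU(),
                                     nn.Linear(8, code_size))
        self.decoder = nn.Sequential(nn.Linear(code_size, 8), nn.ReLU(),
                                     nn.Linear(8, n_features))

    def forward(self, x):
        code = self.encoder(x)          # bottleneck / latent representation
        return self.decoder(code)       # reconstruction of the input

model = Autoencoder()
x = torch.randn(128, 11)                # a batch of feature vectors
loss = nn.MSELoss()(model(x), x)        # reconstruction error to minimise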

3.1.3.5 Deep reinforcement learning

Unlike supervised and unsupervised learning, in which all learning occurs within the training dataset, a Reinforcement Learning (RL) problem is formulated as a discrete-time stochastic process. The learning process interacts with the environment via an iterative sequence of actions, state transitions, and rewards, in a bid to maximize the cumulative reward (François-Lavet et al. 2018 ). The future state depends only on the current state and action, meaning it learns using a trial-and-error reinforcement process in which an agent incrementally obtains experience from its environment, thereby updating its current state (Fig. 13 ). The action to take (from the action space) by the agent is defined by a policy .

Fig. 13 Reinforcement Learning (François-Lavet et al. 2018)

It is common to see an RL system formulated as a Markov Decision Process (MDP) in which the system is fully observable, i.e., the state of the environment is the same as the observation that the agent perceives (François-Lavet et al. 2018 ). Furthermore, RL can be categorized as model-based or model-free  (Russell and Norvig 2010 ).

Model-based reinforcement learning The agent retains a transition model of the environment to enable it to select actions that maximize the cumulative utility. The agent learns a utility function that is based on the total rewards from a starting state. It can either start with a known model (e.g., chess) or learn by observing the effects of its actions.

Model-free reinforcement learning The agent does not retain a model of the environment, instead focusing on directly learning how to act in different states. This could be via either an action-utility function (Q-learning) that learns the utility of taking an action in a given state or a policy-search in which a reflex agent directly learns to map policy, \(\pi (s)\) , from different states to corresponding actions.

Deep Reinforcement Learning (DRL) is a deep representation of RL that can be model-based, model-free, or a combination of the two (Ivanov and D’yakonov 2019 ). The stock market can be considered to exhibit a Markov characteristic, with past states well encapsulated in current states and events, and with the current state being the only requirement for determining future states. For this reason, DRL is a particularly popular approach for modern quantitative analysis of the stock market. Applications of DRL in these scenarios vary from profitable/value stock selection or portfolio allocation strategy (Wang et al. 2019b ; Li et al. 2019 ) to simulating market trades in a bid to develop optimal liquidation strategy (Bao and Liu 2019 ).
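To make the model-free case concrete, the sketch below runs the tabular Q-learning update on a toy trading MDP with hold/buy/sell actions. The state discretization, placeholder environment, and reward are assumptions for illustration; a DRL system would replace the table with a neural network approximator.

# Sketch of the model-free Q-learning update on a toy trading MDP. The state
# discretisation, action set, and reward are illustrative assumptions, not a
# full DRL trading system.
import numpy as np

n_states, actions = 10, ["hold", "buy", "sell"]
Q = np.zeros((n_states, len(actions)))     # action-utility table
alpha, gamma, epsilon = 0.1, 0.99, 0.1     # learning rate, discount, exploration

def step(state, action):
    """Placeholder environment: returns (next_state, reward)."""
    return np.random.randint(n_states), np.random.randn()

state = 0
for _ in range(1000):
    if np.random.rand() < epsilon:                 # explore
        action = np.random.randint(len(actions))
    else:                                          # exploit the current policy
        action = int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state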

3.2 Using deep learning in the stock market

In Section  3.1 , we considered what DL is and discussed certain specific DL architectures that are commonly used in stock market applications. Although we referred to certain specific uses of these network types that are employed in the stock market, it is important to note that all of the architectures mentioned are also commonly used for other applications. However, some specific considerations must be kept in mind when the stock market is the target. These range from the model’s composition to backtesting and evaluation requirements and criteria. Some of these items do not correspond to a traditional ML toolbox but are crucial to stock market models and cannot be ignored, especially given the monetary risks involved.

This section first discusses the specifics of modeling considerations for stock market applications. It also discusses backtesting as an integral part of the process, and details some backtesting methodology. This is followed by a review of the different evaluation criteria and evaluation types.

3.2.1 Modeling considerations

When training an ML model for most applications, we consider how bias and variance affect the model’s performance, and we focus on establishing the tradeoffs between the two. Bias measures how much average model predictions differ from actual values, and variance measures the model’s generalizability and its sensitivity to changes in the training data. High degrees of bias suggest underfit, and high levels of variance suggest overfit. It is typical to aim to balance bias and variance for an appropriate model fit that can be then applied to any unseen dataset, and most ML applications are tuned and focused accordingly.

However, in financial applications, we must exceed these to avoid some of the following pitfalls, which are specific to financial data.

3.2.1.1 Sampling intervals

Online ML applications typically feature sampling windows in consistent chronological order. While this is practical for most streaming data, it is not suitable for stock market data and can produce substantial irregularities in model performance. As Fig.  2 demonstrates, the volume of trade in the opening and closing period is much higher than the rest of the day for most publicly available time-based market data. This could result from pre-market or after-hours trading and suggests that sampling at a consistent time will inadvertently undersample the market data during high-activity periods and oversample during low-activity periods, especially when modeling for intraday activities.

A possible solution is using data that has been provided in ticks, but these are not always readily available for stock market data without significant fees, potentially hindering academic study. Tick data can also make it possible to generate data in alternative bars, such as tick or volume bars, significantly enhancing the model performance. Notably, (Easley et al. 2012 ) uses the term volume clock to formulate volume bars to align data sampling to volume-wise market activities. This enables high-frequency trading to have an advantage over low-frequency trading.

3.2.1.2 Stationarity

Time-series data are either stationary or non-stationary. Stationary time-series data preserve the statistical properties of the data (i.e., mean, variance, covariance) over time, making them ideal for forecasting purposes (de Prado 2018 ). This implies that spikes are consistent in the time series, and the distribution of data across different windows or sets of data within the same series remains the same. However, because stock market data are non-stationary, statistical properties change over time and within the same time series. Also, trends and spikes in non-stationary time series are not consistent. By definition, such data are difficult to model because of their unpredictability. Before any work on such data, it is necessary to render them as stationary time series (Fig. 14 ).

Fig. 14 Time series for the same value of \(\epsilon _t \sim {\mathcal {N}}(0,1)\)

A common approach to converting non-stationary time series to stationary time series involves differencing. This can involve either computing the difference between consecutive observations or, for seasonal time series, the difference between previous observations of the same season. This approach is known as integral differencing , with de Prado ( 2018 ) discussing fractional differencing as a memory-preserving alternative that produces better results.
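The sketch below applies integral (first) differencing to a simulated random-walk price series and checks stationarity with the augmented Dickey-Fuller test. It assumes statsmodels is available; the simulated series is illustrative only.

# Sketch: first (integral) differencing of a non-stationary price series, with
# an augmented Dickey-Fuller stationarity check. Assumes statsmodels is
# installed; the simulated random-walk "prices" are illustrative only.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

prices = pd.Series(100 + np.random.randn(500).cumsum())   # random-walk prices
differenced = prices.diff().dropna()                       # difference between consecutive observations

for name, series in [("prices", prices), ("differenced", differenced)]:
    p_value = adfuller(series)[1]
    print(f"{name}: ADF p-value = {p_value:.3f}")           # small p-value suggests stationarity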

3.2.1.3 Backtesting

In ML, it is common to split data into training and testing sets during the modeling process. Given the goal of this exercise is to determine the accuracy or evaluate performance in some other way, it follows that adhering to such a conventional approach is appropriate. However, when modeling for the financial market, performance is measured by the model’s profitability or volatility. According to Arnott et al. ( 2018 ), there should be a checklist or protocol mandating that ML research include the goal of presenting proof of positive outcomes through backtesting.

Opacity and bias in AI systems represent two of the overarching debates in AI ethics (Müller 2020 ). Although a significant part of the conversation concerns the civil construct, it is clear that the same reasoning applies to other economic and financial AI applications. For example, (Müller 2020 ) raises concerns about statistical bias and the lack of due process and auditing surrounding using ML for decision-making. This relates to conversations about honesty in backtesting reports and the selection bias that typically affects academic research in the financial domain (Fabozzi and De Prado 2018 ).

In the context of DL in the stock market, backtesting involves building models that simulate trading strategy using historical data. This serves to consider the model’s performance and, by implication, helps to discard unsuitable models or strategies, preventing selection bias. To properly backtest, we must test on unbiased and sufficiently representative data, preferably across different sample periods or over a sufficiently long period. This positions backtesting among the most essential tools for modeling financial data. However, it also means it is among the least understood in research (de Prado 2018 ).

When a backtested result is presented as part of a study, it demonstrates the consistency of the approach across various time instances. Recall that overfitting in ML describes a model performing well on training data but poorly on test or unseen data, indicating a large gap between the training error and the test error (de Prado 2018 ). Thus, when backtesting a model on historical data, one should consider the issue of backtest overfitting , especially during walk-forward backtesting  (de Prado 2018 ).

Fig. 15 Backtesting strategies

Walk-forward is the more common backtesting approach and refers to simulating trading actions using historical market data—with all of the actions and reactions that might have been part of that—in chronological time. Although this does not guarantee future performance on unseen data/events, it does allow us to evaluate the system according to how it would have performed in the past. Figure  15 shows two common ways of formulating data for backtesting purposes. Formulating the testing process in this manner removes the need for cross-validation because training and testing would have been evaluated across different sets. Notably, traditional K-fold cross-validation is not recommended in time series experiments such as this, especially when the data is not Independent and Identically Distributed (IID) (Bergmeir and Benítez 2012 ; Zaharia et al. 2010 ).
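A minimal sketch of generating such chronological train/test windows follows; both sliding and expanding variants are included, and the window sizes are arbitrary. scikit-learn’s TimeSeriesSplit offers a comparable expanding scheme.

# Sketch: chronological walk-forward splits over a time series. Window sizes
# are illustrative; scikit-learn's TimeSeriesSplit implements a similar
# expanding-window scheme.
def walk_forward_splits(n_obs, train_size, test_size, sliding=True):
    """Yield (train_indices, test_indices) pairs in chronological order."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        yield range(start, train_end), range(train_end, train_end + test_size)
        start = start + test_size if sliding else start   # slide the window forward
        if not sliding:
            train_size += test_size                       # or expand the training window

for train_idx, test_idx in walk_forward_splits(1000, train_size=500, test_size=100):
    pass  # fit on train_idx, backtest on test_idx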

Backtesting must be conducted in good faith. For example, given backtest overfitting means that a model is overfitted to specific historical patterns, if favorable results are not observed, researchers might return to the model’s foundations to improve generalizability. That is, researchers are not expected to fine-tune an algorithm in response to specific events that might affect its performance. For example, consider overfitting a model to perform favorably in the context of the 1998 recession, and then consider how such a model might perform in response to the 2020 COVID-19 market crash. By backtesting using various historical data or over a relatively long period, we modify our assumptions to avoid misinterpretations.

3.2.1.4 Assessing feature importance

In discussing backtesting, we explained why we should not selectively “tune” a model to specific historical scenarios to achieve a favorable performance, as doing so challenges the usefulness of the knowledge gained from the model’s performance in such experiments. Feature Importance becomes relevant here. Feature importance enables the measurement of the contribution of input features to a model’s performance. Given neural networks are typically considered “black-box” algorithms, the movement around explainable AI contributes to the interpretation of a network’s output and the understanding of the importance of its constituent features, as observed in the important role of Feature Importance Ranking in Samek et al. ( 2017 ), Wojtas and Chen ( 2020 ). Unlike traditional ML algorithms, this is a difficult feat for ANN models, typically requiring a separate network for the feature ranking.

3.2.2 Model evaluation

Machine learning algorithms use evaluation metrics such as accuracy and precision. This is because we are trying to measure the algorithm’s predictive ability. Although the same remains relevant for ML algorithms for financial market purposes, what is ultimately measured is the algorithm’s performance with respect to returns or volatility. The works reviewed include various performance metrics that are commonly used to evaluate an algorithm’s performance in the financial market context.

Recall that Sect.  3.2.1 emphasized the importance of avoiding overfitting when backtesting. It is crucial to be consistent when backtesting different periods and to be able to demonstrate consistency across different financial evaluations of models and strategies. Returns represents the most common financial evaluation metric for obvious reasons. Namely, it measures the profitability of a model or strategy (Kenton 2020 ). It is commonly measured in terms of rate during a specific window of time, such as day, month, or year. It is also common to see returns annualized over various years, which is known as the Compound Annual Growth Rate (CAGR) . When evaluating different models across different time windows, higher returns indicate a better model performance.

However, it is also important to consider Volatility because returns alone do not relay the full story regarding a model’s performance. Volatility measures the variance or how much the price of an asset can increase or decrease within a given timeframe (Investopedia 2016 ). Similar to returns, it is common to report on daily, monthly, or yearly volatility. However, contrary to returns, lower volatility indicates a better model performance. The CBOE Volatility Index (VIX), a real-time index from the Chicago Board Options Exchange (CBOE), is commonly used to estimate the volatility of the US financial market at any given point in time (Chow et al. 2021 ). The VIX estimates expected US stock market volatility derived from S&P 500 index options, with measures between 0 and 12 considered low, measures between 13 and 19 considered normal, and measures above 20 considered high.

Building on the information derived from returns and volatility, the Sharpe ratio enables investors to identify little-to-no-risk investments by comparing investment returns with risk-free assets such as treasury bonds (Hargrave 2019 ). It measures average returns in excess of the risk-free rate per unit of volatility. The higher the Sharpe ratio, the better the model’s performance. However, the Sharpe ratio features the shortcoming of assuming that returns are normally distributed and of penalizing upward price movements as much as downward ones. The Sortino ratio can mitigate against this, differing by using only the standard deviation of the downward price movement rather than the full swing that the Sharpe ratio employs.

Other commonly used financial metrics are MDD and the Calmar ratio , both of which are used to assess the risk involved in an investment strategy. Maximum drawdown describes the difference between the highest and lowest values between the start of a decline in peak value to the achievement of a new peak value, which indicates losses from past investments (Hayes 2020 ). The lower the MDD, the better the strategy, with zero value suggesting zero loss in investment capital. The Calmar ratio measures the MDD adjusted returns on capital to gauge the performance of an investment strategy. The higher the Calmar ratio, the better the strategy.
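A compact sketch of how these metrics can be computed from a series of daily strategy returns follows. The 252-trading-day year, zero risk-free rate, and the particular annualization conventions are simplifying assumptions, not the only way to define these quantities.

# Sketch: computing the financial evaluation metrics described above from a
# series of daily strategy returns. A 252-trading-day year and a zero
# risk-free rate are simplifying assumptions.
import numpy as np
import pandas as pd

def evaluate(daily_returns: pd.Series, risk_free_rate: float = 0.0) -> dict:
    equity = (1 + daily_returns).cumprod()                  # cumulative wealth
    years = len(daily_returns) / 252
    cagr = equity.iloc[-1] ** (1 / years) - 1               # annualised return
    volatility = daily_returns.std() * np.sqrt(252)         # annualised volatility
    excess = daily_returns - risk_free_rate / 252
    sharpe = excess.mean() / daily_returns.std() * np.sqrt(252)
    downside = daily_returns[daily_returns < 0].std() * np.sqrt(252)
    sortino = excess.mean() * 252 / downside
    drawdown = equity / equity.cummax() - 1
    max_drawdown = drawdown.min()                           # most negative drawdown
    calmar = cagr / abs(max_drawdown)
    return {"CAGR": cagr, "Volatility": volatility, "Sharpe": sharpe,
            "Sortino": sortino, "MDD": max_drawdown, "Calmar": calmar}

# Example: evaluate(pd.Series(np.random.normal(0.0005, 0.01, 1260)))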

Another metric considered important by the works reviewed was Value at Risk (VaR), which measures risk exposure by estimating the maximum loss of an investment over time using historical performance (Harper 2016 ).

Meanwhile, other well-known non-financial ML metrics commonly used are based on the accuracy of a model’s prediction. These metrics are calculated in terms of either the following confusion matrix or in terms of the difference between the derived and observed target values.

True Positive (TP) and True Negative (TN) are the correctly predicted positive and negative classes respectively. Subsequently, False Positive (FP) and False Negative (FN) are the incorrectly predicted positive and negative classes (Han et al. 2012 ).

The evaluation metrics in Table  7 are expected to be used as complementary metrics to the primary and more specific financial metrics in Table  6 . This is because the financial metrics can evaluate various investment strategies in the context of backtested data, which the ML metrics are not designed for. Section  4 demonstrates how these different evaluation metrics are combined across the works of literature that we reviewed (Table  7 ).

3.2.3 Lessons learned

This section has reviewed different types of deep ANN architectures that are commonly used in the stock market literature considering DL. The ANN landscape in this context is vast and evolving. We have focused on summarizing these architectures on the basis of their recurrence across different areas of specialization within the stock market. Explicitly recalling the architectures used should assist explanations of their usage as we proceed to our findings in Sect.  4 .

We have similarly detailed the expectations of modeling for the financial market and how these differ from the traditional ML approach, an important consideration for the rest of the survey. That is, although it is worthwhile applying methodologies and strategies across different areas of a discipline to advance scientific practice, we should endeavor to also attend to established practice and the reasoning behind that practice. This includes also understanding the kinds of metrics that should be used. In conducting this survey, we identified several works that used only ML metrics, such as accuracy and F-score, as evaluation metrics (Ntakaris et al. 2019 ; Lee and Yoo 2019 ; Kim and Kang 2019 ; Passalis et al. 2019 ; Ganesh and Rakheja 2018 ). Although this might be ideal for complementary metrics, the performance of an algorithm or algorithmic strategy must ultimately be relevant to the study domain. By more deeply exploring intra-disciplinary research in the computer science field, we begin to understand the space we open up and the value we confer in the context of established processes.

By highlighting various considerations and relevant metrics, we trust that we have facilitated computer science research’s exploration of ideas using stock market data and indeed contributed to the research in the broader econometric space. The next section presents this survey’s culmination, discussing how the findings relate to the previously discussed background and attempting to answer the study’s research questions and demonstrating the criteria employed to shortlist the literature reviewed.

4 Survey findings

4.1 Research methodology

This research work set out to investigate applications of DL in the stock market context by answering three overarching research questions:

Although many research works have used stock market data with DL in some form, we quickly discovered that many are not easily applicable in practice due to how the research has been conducted. Although we retrieved over 10,000 works Footnote 1 , most of the experiments are not formulated to provide insight for financial purposes and are therefore not directly applicable, with the most common formulation being as a traditional ML problem that assumes it is sufficient to split the data into training and test sets.

Recall that we categorized learning techniques by data availability in Sect.  3.1.2 . When the complete data are available to train the algorithm, it is defined as offline or batch learning. When that is not the case, and it is necessary to process the data in smaller, sequential phases, as in streaming scenarios or due to changes in data characteristics, we categorize the learning technique as online . Although ML applications in the stock market context are better classified as online learning problems, surprisingly, very few research papers approach the problem accordingly, instead mostly approaching it as an offline learning problem, a flawed approach (de Prado 2018 ).

For financial ML research to benefit market practitioners, the provided insight must be consistent with established domain norms. One generally accepted approach to achieving this is backtesting the algorithm or strategy using historical data, preferably across different periods (Bergmeir and Benítez 2012 ; Institute 2020 ). Although Sect.  3.2.1 discussed backtesting, we should re-iterate that backtesting is not a “silver bullet”, nor is it a guarantee of future results. However, it does assist evaluation of the performance of an algorithm across different periods. Financial time-series data are not IID, meaning the data distribution differs across different independent sets. This also means that there is no expectation that results across a particular period will produce similar performances in different periods, no matter the quality of the presented result. Meanwhile, the relevant performance evaluation criteria are those that are financially specific, as discussed in Sect.  3.2.2 . To this end, we ensured that the papers reviewed provide some indication of consideration of backtesting. An ordinary reference sufficed, even if the backtested results are not presented.

We used Google Scholar (Google 2020 ) as the search engine to find papers matching our research criteria. The ability to search across different publications and the sophistication of the query syntax (Ahrefs 2020 ) were invaluable to this process. While we also conducted spot searches of different publications and websites to validate that nothing was missed by our chosen approach, the query results from Google Scholar proved sufficient, notably even identifying articles that were missing from the results of direct searches on publication websites. We used the following query to conduct our searches:

“deep learning” AND “stock market” AND (“backtest” OR “back test” OR “back-test”)

This query searches for publications including the phrases “deep learning” , “stock market” , and any one of “backtest” , “back test” or “back-test” . We observed these three different spellings of “backtest” in different publications, suggesting the importance of catching all of these alternatives. This produced 185 results Footnote 2 , which include several irrelevant papers. For validation, we searched using Semantic Scholar (Scholar 2020 ), obtaining approximately the same number of journal and conference publications. We chose to proceed with Google Scholar because Semantic Scholar does not feature such algebraic query syntax, requiring that we search for the different combinations of “backtest” individually with the rest of the search query.

The search query construct provided us with the starting point for answering research questions (1) and (2). Then, we evaluated the relevance of the 185 publications to the research objective and considered how each study answered question (3). We objectively reviewed all query responses without forming an opinion on the rest of their experimental procedure, with the rationale that addressing the basic concerns of a typical financial analyst represents a good starting point. Consequently, we identified only 35 papers as relevant to the research objective. Table  8 quantifies the papers reviewed by publication and year of publication. It is interesting to observe the non-linear change in the number of publications over the last 3 years as researchers have become more conscious of some of these considerations.

4.2 Summary of findings

Section  3.1.3 explained the different architectures of the deep ANN that are commonly used in stock market experiments. Based on the works reviewed, we can categorize the algorithms into the following specializations:

Trade Strategy: Algorithmically generated methods or procedures for making buying and selling decisions in the stock market.

Price Prediction: Forecasting the future value of a stock or financial asset in the stock market. It is commonly used as a trading strategy.

Portfolio Management: Selecting and managing a group of financial assets for long term profit.

Market Simulation: Generating market data under various simulated what-if market scenarios.

Stock Selection: Selecting stocks in the stock market as part of a portfolio based on perceived or analyzed future returns. It is commonly used as a trading or portfolio management strategy.

Risk Management: Evaluating the risks involved in trading, to maximize returns.

Hedging Strategy: Mitigating the risk of investing in an asset by taking an opposite investment position in another asset.

Although a single specialization is usually the primary area of focus for a given paper, it is common to see at least one other specialization in some form. An example is testing a minor trade strategy in price prediction work or simulating market data for risk management. Table  9 illustrates the distribution of the different DL architectures across different areas of specialization for the studies reviewed by this survey. Architectures such as LSTM and DRL are more commonly used because of their inherent temporal and state awareness. In particular, LSTM is favorable due to its relevant characteristic of remembering states over a relatively long period, which price prediction and trade strategy applications, in particular, require. Novel use cases (e.g., Wang et al. 2019b ) combine LSTM and RL to perform remarkably well in terms of annualized returns. There are many such combinations in trade strategy and portfolio management, where state observability is of utmost importance.

Although FFNN is seldom used by itself, there are multiple instances of it being used alongside other approaches, such as CNN and RNN. Speaking of CNN, it is surprising how popular it is, considering it is more commonly used for image data. True to its nature, attempts have been made to train models using stock market chart images (Kusuma et al. 2019 ; Hu et al. 2018a ). Given its ability to localize features, CNN is also used with high-frequency market data to identify local time series patterns and extract useful features (Chong et al. 2017 ). Autoencoders and Restricted Boltzmann Machines (RBM) are also used for feature extraction, with the output fed into another kind of deep neural network architecture (Table 10 ).

We further examined the evaluation metrics used by the reviewed works. Recall that Sect.  3.2.2 presented the different financial and ML evaluation metrics observed by our review. As Table  11 shows, returns constitute the most commonly used comparison measure for obvious reasons, especially for trade strategy and price prediction; the most common objective is profit maximization. It is also common to see different derivations of returns across different time horizons, including daily, weekly, and annual returns (Wang et al. 2019c ; Théate and Ernst 2020 ; Zhang et al. 2020a ).

Although ML metrics such as accuracy and MSE are typically combined with financial metrics, it is expected that the primary focus remains financial metrics; hence, these are the most commonly observed.

The following observations can be made based on the quantified evaluation metrics presented in Table  11 :

Returns is the most common financial evaluation metric because it can more intuitively evaluate profitability.

Maximum drawdown and Sharpe ratio are also common, especially for trade strategy and price prediction specialization.

The Sortino and Calmar ratios are not as common, but they are useful, especially given the Sortino ratio improves upon the Sharpe ratio, and the Calmar ratio adds metrics related to risk assessment. Furthermore, neither is computationally expensive.

For completeness, some studies include ML evaluation metrics such as accuracy and precision; however, financial evaluation metrics remain the focus when backtesting.

Mean square error is the more common error type used (i.e., more common than MAE or MAPE).

4.2.1 Findings: trade strategy

A good understanding of the current and historical market state is expected before making buying and selling decisions. Therefore, it is understandable that DRL is particularly popular for trade strategy, especially in combination with LSTM. The feasibility of using DRL for stock market applications is addressed in Li et al. ( 2020 ), which also articulates the credibility of using it for strategic decision-making. That paper compares implementations of three different DRL algorithms with the Adaboost ensemble-based algorithm, suggesting that better performance is achieved by using Adaboost in a hybrid approach with DRL.

The authors of Wang et al. ( 2019c ) address challenges in quantitative financing related to balancing risks, the interrelationship between assets, and the interpretability of strategies. They propose a DRL model called AlphaStock that uses LSTM for state management to address the issue. For the interrelationship amongst assets, they propose a Cross-Asset Attention Network (CAAN) built on the attention mechanism of Vaswani et al. ( 2017 ). This research uses the buy-winners-and-sell-losers (BWSL) trading strategy and is optimized on the Sharpe ratio, evaluating performance according to profit and risk. The approach demonstrates good performance for cumulative wealth, performing over three times better than the market. Although there could be some questions regarding the way the training and test sets were divided, especially given cross-validation was not used, this work demonstrates an excellent implementation of a DL architecture using stock market data.

Elsewhere, Théate and Ernst ( 2020 ) maximize the Sharpe ratio using a state-of-the-art DRL architecture called the Trading Deep Q-Network (TDQN) and also propose a performance assessment methodology. To differentiate from the Deep Q-Network (DQN), which uses a CNN algorithm as the base, the TDQN uses an FFNN along with certain hyperparameter changes. This is compared with common baseline strategies, such as buy-and-hold, sell-and-hold, trend with moving average, and reversion with moving average, producing the conclusion that there is some room for performance improvements. Meanwhile, (Zhang et al. 2020d ) uses DRL as a trading strategy for futures contracts from the Continuously Linked Commodities (CLC) database for 2019. Fifty futures are investigated to understand how performance varies across different classes of commodities and equities. The model is trained specifically for the output trading position, with the objective function of maximizing wealth. While the literature also includes forex and other kinds of assets, we focused on stock/equities. Other DRL applications include (Chakole and Kurhekar 2020 ), which combines DRL with FFNN, and (Wu et al. 2019 ), which combines DRL with LSTM.

Among non-DRL architectures, the most common we observed were CNN and LSTM. In Hu et al. ( 2018b ), Candlestick charts are used as input for a CAE, primarily to capture non-linear stock dynamics, and long periods of historical data are represented as charts. The algorithm starts by clustering stocks by sector and selects top stocks based on returns within each cluster. This procedure outperforms the FTSE 100 index over 2,000 backtested trading days. It would be interesting to observe how this compares to using the numbers directly instead of using the chart representation.

Given that Moving Average Convergence/Divergence (MACD) is known to perform worse than expected in a stable market, Lei et al. ( 2020 ) use a Residual Network (ResNet) , a layer-skipping mechanism, to improve its effectiveness. The authors propose a strategy called MACD-KURT, which is based on ResNet as an algorithm and Kurtosis as a prediction target. Meanwhile, (Chen et al. 2018b ) uses a filterbank to generate 2D visualizations using historical time series data. Fed into CNN for a pair trading strategy, this helps to improve accuracy and profitability. It is also common to observe LSTM-based strategies, either for converting futures into options (Wu et al. 2020 ), in combination with Autoencoders for training market data (Koshiyama et al. 2020 ), or in more general trade strategy applications (Sun et al. 2019 ; Silva et al. 2020 ; Wang et al. 2020 ; Chalvatzis and Hristu-Varsakelis 2020 ).

4.2.2 Findings: price prediction

The Random Walk Hypothesis , popularized by Malkiel ( 1973 ), suggests that stock prices change in random ways, similar to a coin toss, precluding prediction. However, because price changes are influenced by factors other than historical price, numerous papers and practical applications combine all of these to attempt to obtain some insight into price movement. Given the temporal nature of buying and selling, the price prediction specialization also requires some degree of historical context. For this reason, RNN and LSTM are, unsurprisingly, often relied on. However, what is surprising is the novel use of CNN for this purpose, either as an independent algorithm or in combination with RNN algorithms.

For example, Wang et al. ( 2019a ) take inspiration from RNN applications involving observing repeating patterns in speech and video, proposing the Convolutional LSTM-based Variational Sequence-to-Sequence model with Attention (CLVSA) as a hybrid comprising RNN and convolutional RNN. The paper also introduces Kullback-Leibler divergence (KLD) to address overfitting in financial data. This work follows an optimal backtesting method involving training and testing in a sliding-window approach over 8 years. Specifically, from the start of the period, the model is trained on 3 years of data, evaluated on 1 week of data, and tested the following week. Then, the training regimen shifts forward by a week before being repeated until the end of the period. However, there is no indication of whether the model is updated (i.e., online learning) or a net-new model is introduced for each sliding window. The latter is suspected. Nonetheless, the experiment shows that the algorithm produces very high returns. Elsewhere, (Baek and Kim 2018 ) proposes an LSTM architecture called ModAugNet as a data augmentation approach designed to prevent overfitting of stock market data.

Although most algorithms use data from market trades, DeepLOB (Zhang et al. 2019 ) uses Limit Order Book (LOB) data with Google’s Inception Module CNN to infer local interaction and feed the output to an LSTM model. It uses a CNN filter to capture spatial structure in LOB and LSTM to capture time dependencies, achieving Accuracy/Precision/Recall/F1 in the 60–70% range. The study also performs a minor simulation to test a mock trade strategy using the model’s prediction. It would be interesting to see results on returns based on a full trade strategy or portfolio management.

In another use of LSTM with other architectures, Zhao et al. (2018) incorporate fundamental and technical indicators to create a market attention model featuring a temporal component that learns a representation of the stock market. They propose MarketSegNet, a convolutional Autoencoder architecture that uses images of numerical daily market activity to generate a generic market feature representation. The generated features are subsequently fed into an LSTM architecture to produce the prediction model. It would be interesting to compare the results of such an approach with a model using the actual numbers. Elsewhere, Zhang et al. (2020b) compare LSTM with two LSTM hybrid architectures, one using an Autoencoder and one using a CNN. Although the hybrid versions demonstrate better performance on accuracy tests, one hybrid performs only slightly better than the non-hybridized LSTM in terms of returns and Sharpe ratio. Meanwhile, Fang et al. (2019) combine a non-NN regression model with LSTM, concluding that the hybrid is more accurate than plain LSTM but less stable when backtested.

In terms of non-LSTM architectures, Wang et al. (2018) use a one-dimensional CNN for price prediction, with the results suggesting that the model can extract more generalized feature information than traditional algorithms. The authors claim this to be the first application of CNN to financial data and suggest that their method achieves a significantly higher Sharpe ratio than Support Vector Machines (SVM), FFNN, and simple buy-and-hold. Furthermore, the work proposes a weighted F-score that assigns priority to the different errors based on how critical they are, and it is suggested that this weighted F-score works better than the traditional F-score for financial data. Finally, Zhang et al. (2020c) achieve promising performance with a much simpler approach, using an Autoencoder algorithm for feature extraction alone.
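
The idea of weighting errors by how costly they are can be sketched as a cost-weighted F1 score. Wang et al. (2018) do not publish their exact weighting here, so the function below and its cost values (e.g., missed upward moves counting twice as much as false alarms) are illustrative assumptions only.

```python
import numpy as np

def cost_weighted_f1(y_true, y_pred, fp_cost: float = 1.0, fn_cost: float = 2.0) -> float:
    """F1-style score in which false positives and false negatives carry unequal costs."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0)) * fp_cost   # false alarms, weighted
    fn = np.sum((y_pred == 0) & (y_true == 1)) * fn_cost   # missed moves, weighted more
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```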

4.2.3 Findings: portfolio management

Portfolio management represents another specialization area that relies heavily on DRL. In Liang et al. (2018), three state-of-the-art gameplay and robotics DRL algorithms, namely Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and Policy Gradient (PG), are implemented for portfolio management. The paper also proposes a new training method intended to improve efficiency and returns in the Chinese stock market. This approach does not produce favorable results, with the authors discovering that their model needs more data to work sufficiently well in a bull market; adjusting the objective function does not help to alleviate the risk, which is deemed too complex. Nonetheless, it represents one of the earliest works to attempt to properly tackle the problem of conducting DL research using stock market data.

The authors of Park et al. (2020) also use DRL, specifically Q-learning, for optimal portfolio management across multiple assets. Starting from a formulated trading process, they use an MDP in which the action space represents the trading direction with respect to position size. They also use a mapping function to derive a reasonable trading strategy from the action space and simulate actions in that space, enabling them to obtain experience beyond the available data. For a baseline comparison, the authors use well-known strategies such as buy-and-hold, random selection, momentum (buying assets that improved and selling those that declined in the previous period, based on priority), and reversion (the opposite of momentum). Their experiments outperform the baseline comparisons in terms of overall returns.
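
To make the MDP framing concrete, the following is a minimal tabular Q-learning loop over a single synthetic price series, with discretized recent returns as states and short/flat/long as actions. This toy is an assumption-heavy illustration of the general approach, not the multi-asset formulation or mapping function of Park et al. (2020).

```python
import numpy as np

rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))  # synthetic price path
returns = np.diff(prices) / prices[:-1]

n_states, actions = 10, (-1, 0, 1)                 # short / flat / long
bins = np.quantile(returns, np.linspace(0, 1, n_states + 1)[1:-1])
states = np.digitize(returns, bins)                # discretize recent return as the state

Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.1, 0.95, 0.1
for t in range(len(returns) - 1):
    s = states[t]
    a = rng.integers(len(actions)) if rng.random() < eps else int(Q[s].argmax())
    reward = actions[a] * returns[t + 1]           # P&L of holding the chosen position
    s_next = states[t + 1]
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

print("Greedy action per state:", [actions[i] for i in Q.argmax(axis=1)])
```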

The authors of Guo et al. (2018) propose the Robust Log-Optimal Strategy (RLOS) and combine it with DRL into an ensemble of pattern-matching strategies (RLOSRL) for portfolio management. The approach is based on the log-optimal (logarithmically optimal) rate of return and approximates the log function using a Taylor expansion. It is compared against naïve averaging and follow-the-winner strategies as baselines. Both RLOS and RLOSRL perform better than all other approaches across multiple backtests, with consistently impressive returns; notably, RLOSRL performs best, potentially due to its state-aware DRL architecture. To help capture the environmental state, Wang and Wang (2019) use an FFNN with ResNet to address the overfitting problems associated with noisy financial data, applying the strategy to regime switching (statistical changes in the data series) and concluding that ResNet performs better than a regular FFNN.
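
For reference, the log-optimal objective and its Taylor approximation can be written as follows. The notation is generic (weights b on the simplex, gross price relatives x) and sketches the standard formulation rather than the exact derivation in Guo et al. (2018).

```latex
% Log-optimal portfolio objective (standard notation, not necessarily that of Guo et al. 2018).
% b: portfolio weight vector on the simplex; x: vector of gross price relatives for one period.
b^{*} \;=\; \arg\max_{b \in \Delta}\; \mathbb{E}\!\left[\log\!\left(b^{\top} x\right)\right],
\qquad
\Delta = \left\{\, b \in \mathbb{R}^{n} : b_i \ge 0,\ \textstyle\sum_{i} b_i = 1 \,\right\}

% A low-order Taylor expansion of log(z) around z = 1 makes the objective tractable:
\log\!\left(b^{\top} x\right) \;\approx\; \left(b^{\top} x - 1\right)
  \;-\; \tfrac{1}{2}\left(b^{\top} x - 1\right)^{2}.
```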

4.2.4 Findings: market simulation

Historical data are very useful and commonly used to evaluate performance over different known states. However, this relies entirely upon history: the state is fully known and encapsulated in past market and economic events, which introduces complications when unknown states or future what-if scenarios must be tested to ensure robust model performance in such circumstances. Consider, for example, that a SARS-like global pandemic had been predicted for several years before the COVID-19 outbreak of 2020; it would have been useful to know, before the pandemic, how the market might react. In this context, market data simulation is invaluable.

The authors of Maeda et al. (2020) propose a DRL framework to help improve the performance of DL algorithms by combining DRL and LSTM with simulated market data. By simulating the order books for limit, market, and cancel orders, they are able to maximize returns. This draws upon the premise that, because past market actions might not be a good indicator of the future, it is better to use simulated data for backtesting purposes. Moreover, specific scenarios that do not correspond to past situations can be created with simulated data, enabling the generation of data for a forecasted circumstance. This combines market simulation with the trade strategy specialization. For a baseline, the work compares against random market actions on the same simulated data, achieving consistently impressive results.

A different approach is taken in Raman and Leidner (2019), which uses 6 weeks of real market data to generate simulated data. The authors use a DRL model to decide on a trading action (sell, hold, or buy) under the simulated conditions, comparing the algorithm with other baseline strategies and comparing the simulated data with Monte Carlo simulations. It would be interesting to see comparisons over substantially longer time frames. Elsewhere, Buehler et al. (2020) introduce a financial time series market simulation that relies on a very small amount of training data, using the signatures of historical path segments known as "rough paths" (Vaswani et al. 2017; Boedihardjo et al. 2016) in combination with an Autoencoder. Interestingly, the authors conclude that the generated data are not significantly better than the market data and are useful for testing purposes but not for real applications.

4.2.5 Findings: stock selection

The stock selection problem is at the core of most, if not all, stock market specializations. It represents a hard problem that some deem impossible to solve: according to Malkiel (1973), a group of monkeys throwing darts at a financial page will perform as well as experts in the task of stock selection. Nonetheless, this has not stopped researchers from exploring the problem. Although the research focus usually extends beyond the singular action of selecting a stock, few studies really emphasize either this action or the reasoning behind it.

The authors of Zhang et al. (2020a) use a feature selection technique called DoubleEnsemble to identify key features in stock market data. This involves training sub-models (FFNN or gradient boosting ensembles; Zhang et al. 2020a) with weighted features to alleviate overfitting and to stabilize learning from noisy financial data. To avoid the stability issues and huge costs incurred by retraining models after feature removal, as traditional approaches do, a shuffling-based feature selection method is proposed: different feature sets are trained across different sample sets, and the loss increase attributable to each shuffled (missing) feature is measured. The authors backtest by hedging a position based on model predictions, with the results showing significantly improved returns and Sharpe ratio on China's A-share market. It would be interesting to see how this compares to traditional feature reduction methods, such as Principal Component Analysis (PCA), in terms of performance, compute cost, and returns.
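
The shuffling idea is closely related to generic permutation importance, sketched below: shuffle one feature column at a time and record how much the loss degrades. The actual DoubleEnsemble procedure (sample reweighting and sub-model ensembling) is more involved; the helper below is an illustrative approximation and its names are assumptions.

```python
import numpy as np

def shuffle_importance(model, X: np.ndarray, y: np.ndarray, loss_fn,
                       n_repeats: int = 5, seed: int = 0) -> np.ndarray:
    """Average loss increase when each feature column is shuffled.

    `model` is any fitted object exposing .predict(); larger values suggest
    features the model relies on more heavily.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])           # destroy feature j only
            importances[j] += loss_fn(y, model.predict(Xp)) - base
    return importances / n_repeats
```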

More sophisticated architectures have also been used. For instance, Yang et al. (2019) use CNN and LSTM for a stock trading strategy based on stock selection. Their proposal builds features directly from the Chinese market, and a purchase is made based on the model's prediction when the projected profit is ≥ 0.14%. The models perform significantly better than the CSI300 baseline in the Chinese market, which is impressive considering that transaction fees are included. Interestingly, the CNN-based architecture outperforms the LSTM-based architecture. A drawback of this study is that it does not provide a comparison with a simple baseline strategy, such as buy-and-hold.

Rather than constructing features using market data alone, Amel-Zadeh et al. (2020) base their predictions entirely on existing financial statement data (from Compustat), comparing RNN and LSTM with non-DL algorithms such as random forest and regression. These experiments achieve a mild, slightly-better-than-chance prediction rate of 53–59%, with the random forest model outperforming the DL algorithms in terms of returns. There is no evidence that lagged fundamentals are included as a factor in the feature engineering procedure.

4.2.6 Findings: risk management

Aiming to minimize risk while maximizing returns, risk management represents an important specialization that must be incorporated into other strategies. However, our findings reveal that limited attention is focused on this specialization. Nonetheless, the recent market crash of 2020, caused primarily by the COVID-19 pandemic (Wikipedia 2020a), is likely to renew interest in this line of research, with at least one study already motivated by these events.

That study, Arimond et al. (2020), compares FFNN, temporal CNN, and LSTM algorithms with a Hidden Markov Model (HMM) for estimating the VaR threshold. A VaR breach occurs when portfolio returns fall below the threshold. The model is trained to estimate the probability of regime change, referred to as regime switching, which is commonly modeled as the change in market condition from a bull market (trending up) to a bear market (trending down). By estimating the moment of a VaR breach, it is possible to mitigate the risk to the portfolio.
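
As a point of reference, the simplest (non-neural) way to obtain a VaR threshold is historical simulation: take an empirical quantile of past portfolio returns and flag days that fall below it. The sketch below shows only this baseline; Arimond et al. (2020) estimate the threshold with neural networks and an HMM rather than the plain quantile, and the window and confidence level here are assumptions.

```python
import numpy as np

def historical_var(returns: np.ndarray, alpha: float = 0.05) -> float:
    """One-period historical VaR: the alpha-quantile of past portfolio returns."""
    return float(np.quantile(returns, alpha))

def var_breaches(returns: np.ndarray, window: int = 250, alpha: float = 0.05) -> np.ndarray:
    """Flag days whose return falls below the VaR estimated from the prior window."""
    flags = np.zeros(len(returns), dtype=bool)
    for t in range(window, len(returns)):
        threshold = historical_var(returns[t - window:t], alpha)
        flags[t] = returns[t] < threshold      # True marks a VaR breach
    return flags
```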

4.2.7 Findings: hedging strategy

Similar to risk management, the hedging strategy specialization does not feature an extensive body of literature that fits our survey of backtested DL research in the stock market context. The authors of Ruf and Wang (2020) propose HedgeNet for generating a hedging strategy using an FFNN over one period. Rather than predicting an estimate of the option price and deriving the hedging strategy from it, the hedging ratio, the main quantity of interest, is predicted directly by the FFNN. This aligns with a recommendation from Bengio (1997).

Considering that hedging strategies rely on holding pairs of assets at opposite positions, it would be interesting to see applications of state-conscious algorithms, such as DRL or LSTM, in this context.

Table 12 presents the highlights of and problems with the studies reviewed, demonstrating that while all represent impressive work in different capacities, many insufficiently discuss model explainability, and none focus on the long-term investment horizon. Also, while these works mostly combine market and fundamental data, it remains difficult to include alternative data, such as news text or Twitter data, which could enrich the modeling process. This is largely due to the unavailability of such data, especially over long historical time windows. The next section elaborates on these challenges. As this area of research continues to mature, we hope that more attention will be paid to these issues and that researcher interest can influence the industry at large to make most of the cost-prohibitive data forms available for research purposes.

4.3 Lessons learned

This section has presented our methodology and findings. While numerous studies have used stock market data for ML, readers will notice that very few works perform the due diligence of backtesting as part of their experimentation. Of the over 10,000 publications identified, only 35 papers meet this criterion. We have reviewed and summarized these contributions. These studies primarily focus on several specialization areas, and we have reviewed them on the basis of those specializations. Notably, the works considered were mostly published in the last 3 years and mostly based on market data from the US and China.

Upon analyzing the specific work items and methodologies in these papers, several simple patterns become obvious. For example, tasks that depend heavily upon historical context, namely trading strategy and price prediction, commonly employ stateful architectures, such as DRL and LSTM, as the primary means of capturing past market activity. Interestingly, although many of these problems have been formulated as online learning problems, the literature has not substantially established that connection. One of this work's objectives has been to identify this blind spot so that, as the computer science research community matures in this area, it will be possible to leverage established practices to further improve the state-of-the-art.

The next section itemizes some of the interesting challenges identified during this survey and suggests future directions that can improve the field.

5 Challenges and future directions

Previous sections have discussed what it means to conduct backtested DL research in the stock market context and summarized current research pursuing such a direction. Although there has been increasing focus on this area in recent years, numerous research challenges clearly remain. This section summarizes these challenges and provides suggested research directions.

5.1 Challenges

5.1.1 Availability of historical market data

At the core of studies based on stock market analysis is the availability of consistently updated historical data. Unfortunately, such data are a premium product that is not readily available, especially at high levels of granularity (i.e., intraday and tick data). Paywalls often restrict access to such data, complicating their use for academic research, especially research without significant financial backing. Institutions such as Wharton Research Data Services (WRDS) (Wachowicz 2020) collaborate with academic institutions to provide access to some of these kinds of data. However, the degree of access is determined by the subscription level, which depends on the importance ascribed by the subscribing institution. Nonetheless, the data remain widely inaccessible to a larger pool of institutions, leaving as the only options either inconsistent, publicly available market data or paying the premium.

5.1.2 Access to supplementary data

Closely related to the previous issue is access to related data types, which can be used to improve performance on modeling tasks involving financial data. Examples include fundamental data (e.g., quarterly reports) and alternative data (e.g., news articles and tweets about the company of interest). It is important to differentiate these kinds of data because sources usually differ from those responsible for market data. Notably, Twitter recently announced API access for research purposes (Tornes and Truijillo 2021 ), which could help with this issue. However, there are many other kinds of potential supplementary data, and there remains some work to reach a state where such data is readily available. For example, it would be invaluable for news API services, such as webhose.io , to provide API access to supplementary news data for research purposes.

5.1.3 Long term investment horizon

Several of the studies reviewed consider a relatively short investment horizon, from a few days to a few months. Given that a significant share of investments in the stock market are associated with portfolios that span decades, such as retirement funds, buying and holding growth investments is attractive. Growth investing expects above-average returns from young public companies anticipated to grow significantly in the future. For example, Shopify (SHOP) IPO-ed at $17 in May 2015; as of February 2020, a share was worth ~$530, with the price ending the year at ~$1,100. This suggests that it was a growth investment at an early stage; identifying that characteristic early would have produced larger-than-average returns. Such patterns could be discovered by using supplementary data as discussed. By modeling similar historical growth investments as part of an investment strategy, it might be possible to identify newer investments that can produce handsome returns over long-term horizons.
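
To put the example in annualized terms, the compound annual growth rate (CAGR) implied by those approximate Shopify figures can be computed as below; the holding period of roughly 5.6 years is an approximation of May 2015 to December 2020.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1

# Approximate figures quoted above: $17 at IPO (May 2015) to ~$1,100 (Dec 2020).
print(f"{cagr(17, 1100, 5.6):.1%}")   # about 110% per year
```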

5.1.4 Effect of capital gains tax

Several studies draw conclusions on strategy without considering trading costs or taxation. This is more pronounced for short-term investments, for which tax rates are high (10–37% in the US) compared to long-term investments (0–20%). Thus, to accurately represent returns, these costs must be considered; however, this is seldom done.
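
A toy calculation illustrates the point, using the flat US rates quoted above as simplified assumptions (actual tax treatment depends on jurisdiction, bracket, and holding period):

```python
def after_tax_return(gross_return: float, tax_rate: float) -> float:
    """Net return after paying capital gains tax on the positive gain only."""
    gain = max(gross_return, 0.0)
    return gross_return - gain * tax_rate

gross = 0.30  # 30% gross gain on a closed position
print(f"short-term (37% bracket): {after_tax_return(gross, 0.37):.1%}")  # 18.9%
print(f"long-term  (20% bracket): {after_tax_return(gross, 0.20):.1%}")  # 24.0%
```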

5.1.5 Financial ML/DL framework

Many popular ML and DL frameworks, including scikit-learn (Pedregosa et al. 2011), TensorFlow (Abadi et al. 2016), Keras (Chollet et al. 2015), and PyTorch (Paszke et al. 2019), have improved the state-of-the-art. These frameworks are commonly used in both academic research and industrial, production-level use cases. Although these frameworks appeared frequently in the studies reviewed, implementations were generally bespoke to the respective financial considerations; that is, we observed no real attempt to extend existing frameworks with improvements based on these specialized works.

Stock market ML problems involve incrementally learning from time-series data. Although this represents an online learning problem, the similarity remains to be fully appreciated. For example, ideas commonly used for concept drift in online learning research (Lu et al. 2020) appear perfectly suited to regime switching in quantitative analysis research. Meanwhile, some ML frameworks dedicated to online learning research, such as scikit-multiflow (Montiel et al. 2018) and River (Montiel et al. 2020), have tools for handling concept drift and prequential evaluation built in.
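
The prequential (test-then-train) protocol that these frameworks formalize is simple to express. The sketch below is written without any particular library so as not to misrepresent an API; the predict_one/learn_one method names merely mirror the general interface style of streaming libraries and are assumptions here.

```python
import numpy as np

def prequential_mse(stream, model) -> float:
    """Test-then-train evaluation: predict each new point, score it, then learn from it.

    `stream` yields (x, y) pairs in time order; `model` exposes predict_one/learn_one.
    """
    errors = []
    for x, y in stream:
        y_hat = model.predict_one(x)   # evaluate on the unseen point first
        errors.append((y - y_hat) ** 2)
        model.learn_one(x, y)          # then update the model incrementally
    return float(np.mean(errors))

class OnlineMean:
    """Trivial incremental baseline: predicts the running mean of targets seen so far."""
    def __init__(self):
        self.n, self.mean = 0, 0.0
    def predict_one(self, x):
        return self.mean
    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n
```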

The absence of such frameworks for financial ML means that individual research teams must implement their ideas without attempting to integrate them into an open-source framework. Section 3.2.1 discusses protocols for ML research that involve validating results via backtesting. An accessible framework focused on DL research using financial data would help promote such ideas and allow research in this area to conform more closely to established industry practice. It would also enable researchers to contribute specific implementations that improve the state-of-the-art, avoiding the current siloed approach that precludes any real effort at cohesion.

5.2 Future directions

The challenges identified in the previous section lead to several ideas for future research in this area:

Applicability in practice This work’s focus has been on ensuring we attend to how previous works have been validated in practice. Industry applicability, trustworthiness, and usability (The Institute for Ethical AI & Machine Learning 2020 ; Gundersen et al. 2018 ) should be our core guiding forces as we expand computer science learnings and research into domain-specific applications such as the financial market. One approach is ensuring that we adhere to guiding protocols, such as backtesting, when conducting research experiments in the financial market context (Arnott et al. 2018 ). This aligns with pertinent AI research topics such as reproducibility and explainability (i.e., XAI).

Improvements in trust Although significant attention has recently been focused on AI trustworthiness, much work remains to be done. An important principle for building trust in AI is explicability, which entails creating explainable and accountable AI models (Thiebes et al. 2020). Ensuring that research is explicable further improves the chance of employing that research in real-world scenarios. Recall that Sect. 3.2.1 indicated that feature importance could provide explainable insights from input features, which, in turn, build trust. There remains substantial work to be done on this matter, as the summaries provided in Table 12 show, especially given the limited attention paid to explainability. Another important point of tension for generating trust in AI is reproducibility. Among other considerations, publications must be easy to validate by external researchers. Notably, Thiebes et al. (2020) provide a checklist that includes relevant statistical items as well as code and data availability. However, of the 35 papers reviewed, only seven (20%) provide the source code for their research. Ensuring that all published works include access to the source code and data would help increase trust, making industrial application more plausible.

Public availability of data One means of improving trust in AI research is the availability of public data that researchers can use as a benchmark. Unfortunately, because this is relatively uncommon for financial market research, relevant fundamental (i.e., quarterly reports), alternative (i.e., news and social media), and granular/intraday market data are often behind paywalls. This means that even if most researchers were to publish their source code, they still might not be able to publish their data due to legal implications. While efforts made by corporate organizations such as Twitter are laudable (Tornes and Truijillo 2021), there remains work to be done by the industry and researchers to make relevant research data available for this purpose. An ideal set would be historical market data over a long period, with corresponding fundamental and alternative data sets. Although WRDS (Wachowicz 2020) is a good source of such data for research purposes, research institutions must choose to subscribe, and the level of access provided varies with financial commitment.

Focus on long-horizon More emphasis should be placed on applying DL market strategies to long-horizon investments targeted at growth investing. As previously mentioned, significantly more gains can be expected in the long-term investment horizon (i.e., more than a year) by focusing on potential unicorns at their early stage. The fact that one common investment portfolio type is retirement funds, which feature a relatively long time span, makes a compelling case for considering modeling techniques focused on long-term returns. However, a potential drawback is that this complicates evaluating annualized metrics, especially for longer-term objectives. A hybrid approach might be to mix a short-term strategy with a vision for the long term. Additionally, employing alternative data, such as news articles, about not only the company of interest but also its competitors can enable longer-term horizons to be better forecast. Tracking geopolitical and environmental events and their potential impacts to "learn from the past" also represents an interesting direction for future study.

Financial DL frameworks Significant work has been done to apply ML to stock market research. However, unified frameworks remain uncommon, especially in DL research. Thus, a useful step would be to develop a financial DL toolbox for online learning from non-stationary financial data that are inherently volatile (Pesaranghader et al. 2016). Section 3.2.1 discussed the peculiarities of learning from non-stationary time-series data pertaining to the stock market. A unified financial DL toolbox, improved upon by different research groups, would help foster innovation based on newer ideas.

6 Conclusion

As DL becomes more common in financial research, it is apparent that attention is increasingly focused on ensuring that the research process conforms to procedures established in the financial domain. A recent example of this is the renewed attention on backtesting algorithms using historical data and domain-specific evaluation metrics. As neural processors become ubiquitous, traditionally compute-intensive algorithms become more attractive for online learning. Consequently, we expect to see DL increasingly applied to solving research problems using stock market data.

This survey reviewed backtested applications of DL in the stock market. The backtesting requirement indicates that the research has demonstrated some degree of due diligence, enabling consideration for real-world use. After describing the nature of stock market data and their common representations, before and after pre-processing for ML purposes, we summarized DL architectures, focusing on those used in the literature reviewed. This enabled the quick establishment of points of reference for discussing the architectures in the context of those studies.

While numerous studies have explored stock market applications of DL, we focused on those that demonstrate evidence of research methodology consistent with the domain and thus more likely to be considered by industry practitioners (Paleyes et al. 2020 ; The Institute for Ethical AI & Machine Learning 2020 ; Gundersen et al. 2018 ). In following this approach, it was hoped that this survey might serve as a basis for future research answering similar questions. To that end, we concluded the survey by identifying open challenges and suggesting future research directions. Our future work will aim to assist in addressing such challenges, especially through explorations of supplementary data and developing novel explainable financial DL frameworks.

Searched for “deep learning” AND “stock market” on Google Scholar

Query results as of November 5, 2020

Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al (2016) TensorFlow: a system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265–283

Aceto G, Ciuonzo D, Montieri A, Pescape A (2019) Mobile encrypted traffic classification using deep learning: experimental evaluation, lessons learned, and challenges. IEEE eTrans Netw Serv Manag 16(2):445–458

Adadi A, Berrada M (2018) Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6:52138–52160

Adosoglou G, Lombardo G, Pardalos PM (2020) Neural network embeddings on corporate annual filings for portfolio selection. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2020.114053

Ahrefs (2020) Google Search Operators: the complete list (42 Advanced Operators). https://ahrefs.com/blog/google-advanced-search-operators/

Amel-Zadeh A, Calliess JP, Kaiser D, Roberts S (2020) Machine learning-based financial statement analysis. https://doi.org/10.2139/ssrn.3520684

Arimond A, Borth D, Hoepner AGF, Klawunn M, Weisheit S (2020) Neural networks and value at risk. https://doi.org/10.2139/ssrn.3591996

Arnott RD, Harvey CR, Markowitz H (2018) A backtesting protocol in the era of machine learning. SSRN Electron J. https://doi.org/10.2139/ssrn.3275654

Baek Y, Kim HY (2018) ModAugNet: a new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst Appl 113:457–480. https://doi.org/10.1016/j.eswa.2018.07.019

Bao W, Liu Xy (2019) Multi-agent deep reinforcement learning for liquidation strategy analysis. arXiv: org/abs/1906.11046

Bengio Y (1997) Using a financial training criterion rather than a prediction criterion. Int J Neural Syst 8(4):433–443. https://doi.org/10.1142/S0129065797000422

Bergmeir C, Benítez JM (2012) On the use of cross-validation for time series predictor evaluation. Inf Sci 191:192–213

Boedihardjo H, Geng X, Lyons T, Yang D (2016) The signature of a rough path: uniqueness. Adv Math 293:720–737. https://doi.org/10.1016/j.aim.2016.02.011

Buehler H, Horvath B, Lyons T, Perez Arribas I, Wood B (2020) A data-driven market simulator for small data environments. https://doi.org/10.2139/ssrn.3632431

Castro LNd (2006) Fundamentals of natural computing (Chapman & Hall/Crc Computer and Information Sciences). Chapman & Hall/CRC, Boca Raton

Chakole J, Kurhekar M (2020) Trend following deep Q-Learning strategy for stock trading. Expert Syst 37:e12514. https://doi.org/10.1111/exsy.12514

Chalvatzis C, Hristu-Varsakelis D (2020) High-performance stock index trading via neural networks and trees. Appl Soft Comput 96:106567. https://doi.org/10.1016/j.asoc.2020.106567

Chen L, Qiao Z, Wang M, Wang C, Du R, Stanley HE (2018a) Which artificial intelligence algorithm better predicts the Chinese Stock Market? IEEE Access 6:48625–48633. https://doi.org/10.1109/ACCESS.2018.2859809

Chen YY, Chen WL, Huang SH (2018b) Developing arbitrage strategy in high-frequency pairs trading with filterbank CNN algorithm. In: Proceedings—2018 IEEE international conference on agents, ICA 2018, Institute of Electrical and Electronics Engineers Inc., pp 113–116, https://doi.org/10.1109/AGENTS.2018.8459920

Chollet F et al (2015) Keras. https://keras.io

Chong E, Han C, Park FC (2017) Deep learning networks for stock market analysis and prediction: methodology, data representations, and case studies. Expert Syst Appl 83:187–205. https://doi.org/10.1016/j.eswa.2017.04.030

Chow KV, Jiang W, Li J (2021) Does vix truly measure return volatility? In: Handbook of financial econometrics, mathematics, statistics, and machine learning. World Scientific, pp 1533–1559

Christina Majaski (2020) Fundamentals. https://www.investopedia.com/terms/f/fundamentals.asp

Conneau A, Kiela D, Schwenk H, Barrault L, Bordes A (2017) Supervised learning of universal sentence representations from natural language inference data. In: EMNLP 2017—conference on empirical methods in natural language processing, proceedings, https://doi.org/10.18653/v1/d17-1070 , arXiv: 1705.02364

de Prado ML (2018) Advances in financial machine learning, 1st edn. Wiley, New York

Day MY, Lee CC (2016) Deep learning for financial sentiment analysis on finance news providers. In: Proceedings of the 2016 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM 2016, Institute of Electrical and Electronics Engineers Inc., pp 1127–1134, https://doi.org/10.1109/ASONAM.2016.7752381

Derivative (2020) Derivative definition. https://www.investopedia.com/terms/d/derivative.asp

Easley D, López de Prado MM, O’Hara M (2012) The volume clock: insights into the high-frequency paradigm. J Portfolio Manag 39(1):19–29. https://doi.org/10.3905/jpm.2012.39.1.019

Fabozzi FJ, De Prado ML (2018) Being honest in backtest reporting: a template for disclosing multiple tests. J Portfolio Manag 45(1):141–147. https://doi.org/10.3905/jpm.2018.45.1.141

Fang Y, Chen J, Xue Z (2019) Research on quantitative investment strategies based on deep learning. Algorithms 12(2):35. https://doi.org/10.3390/a12020035

Ferguson R, Green A (2018) Deeply learning derivatives. arXiv: org/abs/1809.02233

François-Lavet V, Henderson P, Islam R, Bellemare MG, Pineau J (2018) An introduction to deep reinforcement learning. Found Trends Mach Learn 11(3–4):219–354. https://doi.org/10.1561/2200000071

Ganesh P, Rakheja P (2018) VLSTM: very long short-term memory networks for high-frequency trading. arXiv:1809.01506, https://ideas.repec.org/p/arx/papers/1809.01506.html

Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press, Cambridge

Google (2020) Google Scholar. https://scholar.google.ca/

Gundersen OE, Gil Y, Aha D (2018) On reproducible AI: towards reproducible research, open science, and digital scholarship in AI publications. AI Mag 39:56–68

Guo Y, Fu X, Shi Y, Liu M (2018) Robust log-optimal strategy with reinforcement learning. arXiv: org/abs/1805.00205

Haibe-Kains B, Adam GA, Hosny A, Khodakarami F, Waldron L, Wang B, McIntosh C, Kundaje A, Greene CS, Hoffman MM, Leek JT, Huber W, Brazma A, Pineau J, Tibshirani R, Hastie T, Ioannidis JP, Quackenbush J, Aerts HJ, Shraddha T, Kusko R, Sansone SA, Tong W, Wolfinger RD, Mason C, Jones W, Dopazo J, Furlanello C (2020) The importance of transparency and reproducibility in artificial intelligence research. Nature 586(7829):E14–E16

Han J, Kamber M, Pei J (2012) Data mining: concepts and techniques. Elsevier Inc., Amsterdam. https://doi.org/10.1016/C2009-0-61819-5

Hargrave M (2019) Sharpe ratio definition. https://www.investopedia.com/terms/s/sharperatio.asp

Harper D (2016) An introduction to value at risk (VAR). Investopedia pp 1–7, http://www.investopedia.com/articles/04/092904.asp

Hayes A (2020) Maximum Drawdown (MDD) Definition. https://www.investopedia.com/terms/m/maximum-drawdown-mdd.asp

Hinton G (2017) Boltzmann machines. In: Encyclopedia of machine learning and data mining. https://doi.org/10.1007/978-1-4899-7687-1_31

Hu G, Hu Y, Yang K, Yu Z, Sung F, Zhang Z, Xie F, Liu J, Robertson N, Hospedales T, Miemie Q (2018a) Deep stock representation learning: from candlestick charts to investment decisions. In: ICASSP, IEEE international conference on acoustics, speech and signal processing - proceedings, Institute of Electrical and Electronics Engineers Inc., vol 2018, April, pp 2706–2710. https://doi.org/10.1109/ICASSP.2018.8462215

Hu G, Hu Y, Yang K, Yu Z, Sung F, Zhang Z, Xie F, Liu J, Robertson N, Hospedales T, Miemie Q (2018b) Deep stock representation learning: from candlestick charts to investment decisions. In: ICASSP, IEEE international conference on acoustics, speech and signal processing—proceedings, Institute of Electrical and Electronics Engineers Inc., vol 2018, April, pp 2706–2710, https://doi.org/10.1109/ICASSP.2018.8462215

Hu Z, Zhao Y, Khushi M (2021) A survey of forex and stock price prediction using deep learning. Appl Syst Innov 4(1):9. https://doi.org/10.3390/asi4010009

Insights D (2019) AI leaders in financial services. www2.deloitte.com/us/en/insights/industry/financial-services/artificial-intelligence-ai-financial-services-frontrunners.html

Institute CF (2020) Backtesting—overview, how it works, common measures. https://corporatefinanceinstitute.com/resources/knowledge/trading-investing/backtesting/

Investingcom (2013) AAPL|Apple Stock Price. https://www.investing.com/equities/apple-computer-inc

Investopedia (2016) Volatility definition. https://www.investopedia.com/terms/v/volatility.asp

Ivanov S, D’yakonov A (2019) Modern deep reinforcement learning algorithms. arxiv: org/abs/1906.10025v2

Jiang W (2021) Applications of deep learning in stock market prediction: recent progress. Expert Syst Appl 184:115537. https://doi.org/10.1016/j.eswa.2021.115537

Kenton W (2019) Sortino ratio definition. https://www.investopedia.com/terms/s/sortinoratio.asp

Kenton W (2020) Rate of Return—RoR Definition. https://www.investopedia.com/terms/r/rateofreturn.asp

Kim S, Kang M (2019) Financial series prediction using attention lstm. arXiv: 1902.10877

Koshiyama A, Blumberg SB, Firoozye N, Treleaven P, Flennerhag S (2020) QuantNet: transferring learning across systematic trading strategies. arXiv: org/abs/2004.03445

Kusuma RMI, Ho TT, Kao WC, Ou YY, Hua KL (2019) Using deep learning neural networks and candlestick chart representation to predict stock market. arXiv: org/abs/1903.12258

Lee SI, Yoo SJ (2019) Multimodal deep learning for finance: integrating and forecasting international stock markets. arXiv: 1903.06478

Lei Y, Peng Q, Shen Y (2020) Deep learning for algorithmic trading: enhancing MACD strategy. In: ACM international conference proceeding series, Association for Computing Machinery, New York, NY, USA, pp 51–57, https://doi.org/10.1145/3404555.3404604

Li AW, Bastos GS (2020) Stock market forecasting using deep learning and technical analysis: a systematic review. IEEE Access 8:185232–185242. https://doi.org/10.1109/ACCESS.2020.3030226

Li X, Li Y, Zhan Y, Liu XY (2019) Optimistic bull or pessimistic bear: adaptive deep reinforcement learning for stock portfolio allocation. arXiv: org/abs/1907.01503

Li Y, Ni P, Chang V (2020) Application of deep reinforcement learning in stock trading strategies and stock forecasting. Computing 102(6):1305–1322. https://doi.org/10.1007/s00607-019-00773-w

Liang Z, Chen H, Zhu J, Jiang K, Li Y (2018) Adversarial deep reinforcement learning in portfolio management. arXiv: org/abs/1808.09940

Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2020) Learning under concept drift: a review. IEEE Trans Knowl Data Eng 31(12):2346–2363. https://doi.org/10.1109/TKDE.2018.2876857

Maeda I, DeGraw D, Kitano M, Matsushima H, Sakaji H, Izumi K, Kato A (2020) Deep reinforcement learning in agent based financial market simulation. J Risk Financ Manag 13(4):71. https://doi.org/10.3390/jrfm13040071

Malkiel BG (1973) A random walk down Wall Street, 1st edn. Norton, New York

Montiel J, Read J, Bifet A, Abdessalem T (2018) Scikit-multiflow: a multi-output streaming framework. J Mach Learn Res 19(72):1–5

Montiel J, Halford M, Mastelini SM, Bolmier G, Sourty R, Vaysse R, Zouitine A, Gomes HM, Read J, Abdessalem T, Bifet A (2020) River: machine learning for streaming data in python. arXiv: 2012.04740

Müller VC (2020) Ethics of Artificial Intelligence and Robotics. In: Zalta EN (ed) The Stanford Encyclopedia of Philosophy, winter, 2020th edn. Stanford University, Metaphysics Research Lab

Murphy CB (2019) Compound annual growth rate—CAGR definition. https://www.investopedia.com/terms/c/cagr.asp

Nascita A, Montieri A, Aceto G, Ciuonzo D, Persico V, Pescape A (2021) Xai meets mobile traffic classification: understanding and improving multimodal deep learning architectures. IEEE eTrans Netw Serv Manag 18(4):4225–4246

Ntakaris A, Mirone G, Kanniainen J, Gabbouj M, Iosifidis A (2019) Feature engineering for mid-price prediction with deep learning. IEEE Access 7:82390–82412. https://doi.org/10.1109/ACCESS.2019.2924353

O’Shea T, Hoydis J (2017) An introduction to deep learning for the physical layer. IEEE Trans Cogn Commun Netw 3(4):563–575. https://doi.org/10.1109/TCCN.2017.2758370

Ozbayoglu AM, Gudelek MU, Sezer OB (2020) Deep learning for financial applications: a survey. Appl Soft Comput 93:106384. https://doi.org/10.1016/j.asoc.2020.106384

Paleyes A, Urma RG, Lawrence N (2020) Challenges in deploying machine learning: a survey of case studies. arXiv: abs/2011.09926

Park H, Sim MK, Choi DG (2020) An intelligent financial portfolio trading strategy using deep Q-learning. Expert Syst Appl 158:113573. https://doi.org/10.1016/j.eswa.2020.113573

Passalis N, Tefas A, Kanniainen J, Gabbouj M, Iosifidis A (2019) Deep adaptive input normalization for time series forecasting. arXiv: 1902.07892

Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) Pytorch: an imperative style, high-performance deep learning library. In: Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R (eds) Advances in Neural Information Processing Systems 32, Curran Associates, Inc., pp 8024–8035, http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

Pesaranghader A, Viktor HL, Paquet E (2016) A framework for classification in data streams using multi-strategy learning. In: Calders T, Ceci M, Malerba D (eds) Discovery Science—19th international conference, DS 2016, Bari, Italy, October 19–21, 2016, Proceedings, Lecture Notes in Computer Science, vol 9956, pp 341–355, https://doi.org/10.1007/978-3-319-46307-0_22

Raman N, Leidner JL (2019) Financial market data simulation using deep intelligence agents. In: Demazeau Y, Matson E, Corchado JM, De la Prieta F (eds) Advances in practical applications of survivable agents and multi-agent systems: the PAAMS Collection. Springer, Cham, pp 200–211

Ruf J, Wang W (2020) Hedging with linear regressions and neural networks. Tech. rep. https://optionmetrics.com

Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536. https://doi.org/10.1038/323533a0

Russell S, Norvig P (2010) Artificial intelligence a modern approach, 3rd edn. https://doi.org/10.1017/S0269888900007724

Samek W, Wiegand T, Müller KR (2017) Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models. arXiv: org/abs/1708.08296v1

Scholar S (2020) AI-powered research tool. https://www.semanticscholar.org/

Seese D, Weinhardt C, Schlottmann F (2008) Handbook on information technology in finance. Springer, New York

Silva TR, Li AW, Pamplona EO (2020) Automated trading system for stock index using LSTM neural networks and risk management. In: Proceedings—2020 International Joint Conference on Neural Networks (IJCNN), Institute of Electrical and Electronics Engineers (IEEE), pp 1–8, https://doi.org/10.1109/ijcnn48605.2020.9207278

Soleymani F, Paquet E (2020) Financial portfolio optimization with online deep reinforcement learning and restricted stacked autoencoder-DeepBreath. Expert Syst Appl 156:113456. https://doi.org/10.1016/j.eswa.2020.113456

Sun T, Wang J, Ni J, Cao Y, Liu B (2019) Predicting futures market movement using deep neural networks. In: Proceedings—18th IEEE international conference on machine learning and applications, ICMLA 2019, Institute of Electrical and Electronics Engineers Inc., pp 118–125, https://doi.org/10.1109/ICMLA.2019.00027

The Institute for Ethical AI & Machine Learning (2020) The 8 principles for responsible development of AI & Machine Learning systems. https://ethical.institute/principles.html

Théate T, Ernst D (2020) An application of deep reinforcement learning to algorithmic trading. arXiv: org/abs/2004.06627

Thiebes S, Lins S, Sunyaev A (2020) Trustworthy artificial intelligence. Electron Markets. https://doi.org/10.1007/s12525-020-00441-4

Tornes A, Truijillo L (2021) Enabling the future of academic research with the Twitter API. https://blog.twitter.com/developer/en_us/topics/tools/2021/enabling-the-future-of-academic-research-with-the-twitter-api.html

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in Neural Information Processing Systems, Neural information processing systems foundation, vol 2017-Decem, pp 5999–6009, arXiv: org/abs/1706.03762v5

Wachowicz E (2020) Wharton Research Data Services (WRDS). J Bus Financ Librariansh 25(3–4):184–187. https://doi.org/10.1080/08963568.2020.1847552

Wang J, Wang L (2019) Residual Switching Network for Portfolio Optimization. arXiv: org/abs/1910.07564

Wang J, Sun T, Liu B, Cao Y, Wang D (2018) Financial markets prediction with deep learning. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA), pp 97–104. https://doi.org/10.1109/ICMLA.2018.00022

Wang J, Sun T, Liu B, Cao Y, Zhu H (2019a) CLVSA: a convolutional LSTM based variational sequence-to-sequence model with attention for predicting trends of financial markets. In: IJCAI International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence, vol 2019, August, pp 3705–3711. https://doi.org/10.24963/ijcai.2019/514

Wang J, Zhang Y, Tang K, Wu J, Xiong Z (2019b) AlphaStock: a buying-winners-and-selling-losers investment strategy using interpretable deep reinforcement attention networks. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. https://doi.org/10.1145/3292500.3330647 , arXiv: 1908.02646

Wang J, Zhang Y, Tang K, Wu J, Xiong Z (2019c) AlphaStock: a buying-winners-and-selling-losers investment strategy using interpretable deep reinforcement attention networks. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, Association for Computing Machinery, New York, NY, USA, KDD '19, pp 1900–1908. https://doi.org/10.1145/3292500.3330647

Wang J, Yang Q, Jin Z, Chen W, Pan T, Shen J (2020) Research on quantitative trading strategy based on LSTM. In: Proceedings of 2020 Asia-Pacific conference on image processing, electronics and computers, IPEC 2020, Institute of Electrical and Electronics Engineers Inc., pp 266–270. https://doi.org/10.1109/IPEC49694.2020.9115114

Wikipedia (2020a) 2020 stock market crash—Wikipedia. https://en.wikipedia.org/wiki/2020_stock_market_crash

Wikipedia (2020b) Neuron. https://en.wikipedia.org/wiki/Neuron

Wikipedia (2020c) Vanishing gradient problem. https://en.wikipedia.org/wiki/Vanishing_gradient_problem

Wikipedia (2020d) List of electronic trading protocols. Accessed 19 Aug 2020

Will Kenton (2020) Calmar ratio. Investopedia. https://www.investopedia.com/terms/c/calmarratio.asp

Wojtas M, Chen K (2020) Feature importance ranking for deep learning. arXiv: 2010.08973

Wu J, Wang C, Xiong L, Sun H (2019) Quantitative trading on stock market based on deep reinforcement learning. In: Proceedings of the international joint conference on neural networks, Institute of Electrical and Electronics Engineers Inc., vol 2019, July. https://doi.org/10.1109/IJCNN.2019.8851831

Wu JMT, Wu ME, Hung PJ, Hassan MM, Fortino G (2020) Convert index trading to option strategies via LSTM architecture. Neural Comput Appl. https://doi.org/10.1007/s00521-020-05377-6

Xiao C (2021) Introduction to deep learning for healthcare. Springer, Cham

Yang J, Li Y, Chen X, Cao J, Jiang K (2019) Deep learning for stock selection based on high frequency price-volume data. arXiv: org/abs/1911.02502

Yang SY, Yu Y, Almahdi S (2018) An investor sentiment reward-based trading system using Gaussian inverse reinforcement learning algorithm. Expert Syst Appl 114:388–401. https://doi.org/10.1016/j.eswa.2018.07.056

Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I (2010) Spark: Cluster computing with working sets. In: 2nd USENIX Workshop on Hot Topics in Cloud Computing, HotCloud 2010

Zhang Z, Zohren S, Roberts S (2019) DeepLOB: deep convolutional neural networks for limit order books. https://doi.org/10.1109/TSP.2019.2907260 , arXiv: 1808.03668

Zhang C, Li Y, Chen X, Jin Y, Tang P, Li J (2020a) DoubleEnsemble: a new ensemble method based on sample reweighting and feature selection for financial data analysis. arXiv: org/abs/2010.01265

Zhang H, Liang Q, Li S, Wang R, Wu Q (2020b) Research on stock prediction model based on deep learning. J Phys. https://doi.org/10.1088/1742-6596/1549/2/022124

Zhang H, Liang Q, Wang R, Wu Q (2020c) Stacked model with autoencoder for financial time series prediction. In: 15th international conference on computer science and education, ICCSE 2020, Institute of Electrical and Electronics Engineers (IEEE), pp 222–226. https://doi.org/10.1109/ICCSE49874.2020.9201745

Zhang Z, Zohren S, Roberts S (2020d) Deep reinforcement learning for trading. J Financ Data Sci 2(2):25–40. https://doi.org/10.3905/jfds.2020.1.030

Zhang J, Zhai J, Wang H (2021) A survey on deep learning in financial markets. In: Proceedings of the first international forum on financial mathematics and financial technology. Springer, pp 35–57

Zhao R, Deng Y, Dredze M, Verma A, Rosenberg D, Stent A (2018) Visual attention model for cross-sectional stock return prediction and end-to-end multimodal market representation learning. arXiv: org/abs/1809.03684

Author information

Authors and affiliations.

School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada

Kenniy Olorunnimbe & Herna Viktor

Corresponding author

Correspondence to Herna Viktor .

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1—Acronyms

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Olorunnimbe, K., Viktor, H. Deep learning in the stock market—a systematic survey of practice, backtesting, and applications. Artif Intell Rev 56 , 2057–2109 (2023). https://doi.org/10.1007/s10462-022-10226-0

Published : 30 June 2022

Issue Date : March 2023

DOI : https://doi.org/10.1007/s10462-022-10226-0

  • Deep learning
  • Machine learning
  • Neural network
  • Stock market
  • Financial market
  • Quantitative analysis
  • Backtesting
  • Practice and application

SYSTEMATIC REVIEW article

COVID and world stock markets: a comprehensive discussion

Shaista Jabeen

  • 1 Department of Management Sciences, Lahore College for Women University, Lahore, Pakistan
  • 2 Department of Management Sciences, National University of Modern Languages, Islamabad, Pakistan

The COVID-19 outbreak has disturbed victims' economic conditions and posed a significant threat to economies worldwide and their respective financial markets. The majority of the world's stock markets have suffered losses in the trillions of dollars, and international financial institutions were forced to reduce their forecasted growth for 2020 and the years to come. The current research deals with the impact of the COVID-19 pandemic on the global stock markets. It focuses on the contingent effects of previous and current pandemics on the financial markets and elaborates on the pandemic's impact on diverse pillars of the economy. Irrespective of all these destructive effects of the pandemic, there are still hopes for a sharp rise and speedy improvement in the performance of global stock markets.

Introduction

The world is experiencing the worst health and economic disaster in the shape of COVID-19 pandemic. Dealing with this pandemic is the most challenging task being faced by human beings since the Second World War ( Maqsood et al., 2021 ). Coronavirus has pushed the markets toward the danger zone. The market panic has been started. This disease is contagious even before it shows obvious symptoms. It is quite difficult to hold people in quarantine in this outbreak. That's the narrative, and we haven't gotten very far into it yet. So, the potential for market disruption because of a scary narrative is quite high.

—Robert James Shiller, Nobel Memorial Prize Winner in Economic Sciences, 2013.

The epidemiological perspectives are not required to be understood here. Currently, well-informed individuals ought to have some know-how about the basics of contagious diseases. Times of fear are also times of rumor and misinformation; knowledge is the antidote ( Baldwin et al., 2020 ). The COVID-19 outbreak was officially reported in the Wuhan City of China in December 2019 and has covered all continents of this globe other than Antarctica ( Hui et al., 2020 ). COVID-19 is a distinctive black swan event, and we were unaware of its existence, expansion, breadth, depth, magnitude, and even its eventual disappearance ( He P. et al., 2020 ; He Q. et al., 2020 ). The World Health Organization (WHO) officially declared COVID-19 a pandemic on 11th March 2020 ( Cucinotta and Vanelli, 2020 ). The pandemic has severely hit global economies ( Shafi et al., 2020 ). It has disrupted the life and lifestyle of almost everyone ( Aqeel et al., 2021 ). Almost no one has been left untouched. Another pandemic, of information and misinformation, is keeping pace with it, spreading fear and anxiety ( Koley and Dhole, 2020 ). The outbreak has changed the outlook of this globe within no time at all. Human beings are struggling with the long-lasting effects of this disease and the unforgettable reality of their existence, which has never happened before. The pandemic has affected more than 107 million people, with around 2.3 million casualties, and the number of cases is escalating day by day. The alarming point is the growth factor of this disease, where 100 contaminated cases create another 10,000 within a very limited time ( Bagchi et al., 2020 ).

The people from this generation have seen wars. They have seen the collapse of the Soviet Union. They have seen extremely dangerous terrorist attacks. They have seen the burst of financial bubbles, and they have seen the effects of climate change. However, they had not seen anything like the coronavirus before. A similar case has not existed for more than one hundred years. They were not ready for it, and they did not know how to respond to it. Since it was something that no one had any prior experience with, the pandemic has also led to reconsidering some things which were always previously thought either right or wrong ( Sharma, 2021 ).

The COVID-19 pandemic has spread globally, made millions of people sick, and triggered an international response spearheaded by the World Health Organization to stop its spread. From Wuhan, China, it spread like wildfire. The virus has now visited almost every nation in the world, bringing helplessness and death with it. None are spared, and in some way or another, almost everyone has become a victim. In a recent message, the WHO warned that the worst is yet to come. The coronavirus has not only triggered disease and death; it has affected almost every aspect of human life. There is a long list of disruptions to daily life: cities and states under lockdown; global sporting events, weddings, social events, and ceremonies postponed; all of this has elicited a global crisis. Moreover, industries worldwide have been affected; stock markets have recorded historic downfalls; and the airline, travel, tourism, and hospitality sectors are the major victims of this pandemic. A significant disaster is job loss across various sectors ( Koley and Dhole, 2020 ).

Crucial and groundbreaking strategies are required to protect not only human lives but also to safeguard economies and uplift economic growth and financial health. Nations are exposed to a global health crisis, the like of which has not occurred for a century. This crisis is killing human beings, enhancing human distress, and upsetting the lives of individuals. This can be considered a sort of human, social, and economic crisis ( Mishra, 2020 ). The best efforts by governments from every country have failed to halt its spread: cities were put under lockdown; people were advised to stay at home; international borders were closed; travel bans at local, national and international level were imposed; markets, schools, universities and shopping complexes were closed. Quarantine and self-isolation have been advised to stop the spread of COVID-19. The virus has triggered an unprecedented global crisis which led the WHO to provide technical guidance for government authorities, healthcare workers, and other key stakeholders to respond to community spread ( Koley and Dhole, 2020 ).

In today's intermingled economies, the COVID-19 pandemic arrived as a global shock that affects both the demand and supply sides concurrently. Rapidly growing infection limits labor supply and hurts productivity, while supply disruptions are compounded by social distancing, lockdowns, and industry closures. On the demand side, disruption is caused by reduced consumption, unemployment, and income loss, and these weakened economic prospects depress company investment. The unpredictability about the path, timing, enormity, and impact of COVID-19 could create a vicious cycle of redundancy, lower consumption, and business closures, leading to financial distress. Identifying and measuring this extraordinary shock is the key challenge for any empirical analysis of the pandemic. Its unprecedented nature makes it difficult to recognize its non-linear effects and cross-country spillovers, and to quantify unobserved factors when composing forecasts ( Chudik et al., 2020 ).

International institutions including the FAO, ILO, IFAD, and WHO jointly declared this pandemic a global challenge to food systems, public health, trade, and industry. Overwhelming social and economic disruptions put tens of millions of individuals at risk of falling below the poverty line. According to one approximation, by the end of the year the number of undernourished people, ~690 million at present, could increase by up to 32 million. The pandemic also poses an existential threat to a considerable number of business ventures. Of the world's 3.3 billion workforce, roughly half are at risk of losing their livelihoods. Many of these are informal workers with limited access to productive assets and quality health care, and the majority lack social protection. Due to lockdowns, they lost their means of earning and became unable to feed themselves and their families, because for most of them daily food depends upon daily wages. Such a devastating effect on the entire food chain has exposed its vulnerability. Farmers have no access to markets and can neither buy inputs nor sell their output, which results in reduced harvests. In addition, market shutdowns, trade limitations, border closures, and containment measures dislocate food supply chains nationally and internationally, which badly undermines healthy diets. Small-scale farmers are a soft target of COVID-19, and the nutrition and food security of the most marginalized populations are placed under threat as income earners fall ill, die, or otherwise lose their work ( Chriscaden, 2020 ).

Even with vaccines under development, the path of recovery remains difficult to gauge. To illustrate the pandemic's economic impact, the following sections and figures summarize the real statistics reported so far.

Impact on Jobs

A report published by the OECD (2020) shows the impact of COVID-19 and its containment measures on OECD economies, where people were prohibited from going to work, resulting in a significant drop in business activity and extraordinary job losses. In some countries, millions have been moved to reduced hours, with many people working up to ten times fewer hours, and the rate of outright job loss is also very high. Some people are more exposed to this shock than others: young people and women workers are at greater risk because they tend to hold less secure and lower-skilled jobs, and they are concentrated in the industries most affected by this unprecedented shock, including restaurants, cafés, and tourism.

Causing Recession

The worldwide economic downturn caused by the COVID-19 pandemic forced the Organisation for Economic Co-operation and Development (OECD), the International Monetary Fund (IMF), and the World Bank (WB) to revise their forecasts, reporting a significant decline in projected growth rates between late 2019 and mid-2020. Such deterioration can be seen in the IMF figures, in which the global economic growth forecast fell from +3.4% to −4.4% between October 2019 and October 2020. In the same way, the OECD revised its forecast downward from +2.9% in November 2019 to −4.5% in September 2020, and in June 2020 it anticipated the blow of another wave of infections.

Impact on Travel

The travel industry is one of the industries most acutely damaged by lockdowns, border closures, and suspended flight operations. Airlines are not only canceling flights; customers are also refraining from holidays and business trips. Each subsequent wave of COVID-19 has forced national and international airlines to promulgate new travel restrictions and tighten their policies. Data for 2020 from the flight-tracking service Flightradar24 reveal a huge drop in the number of flights worldwide, suggesting that recovery is still a long way off ( Jones et al., 2021 ).

Impact on Tourism

Tourism is another industry badly affected by this unprecedented pandemic. The World Tourism Organization (UNWTO, 2020) marked the pandemic as a serious threat to the travel and tourism sector. Many jurisdictions restricted international travel to limit the spread of the virus, and some fully closed their borders, resulting in a massive decline in demand. In 2020, the sector lost ~1 billion international tourist arrivals, equivalent to US$1.1 trillion in international tourism receipts. This decline could cause an ~US$2 trillion loss in global GDP, over 2% of global GDP in 2019. While predicting a rebound in the global tourism industry, the UNWTO presented extended scenarios for 2021 to 2024: global tourism should start to recover in the second half of 2021, but it will take 2.5 to 4 years to return to 2019 levels.

Impact on Stock Markets

Capital markets are at the front line of any country's economy, and stock markets are considered an indicator of the economy ( He P. et al., 2020 ; He Q. et al., 2020 ). The COVID-19 outbreak has damaged the economic conditions of affected countries and posed a significant threat to worldwide economies and their respective financial markets ( Barro et al., 2020 ; Ramelli and Wagner, 2020 ). The majority of the world's stock markets have suffered trillion-dollar losses ( Lyócsa et al., 2020 ), and international financial institutions were forced to reduce their growth forecasts for 2020 and the years to come ( Boone et al., 2020 ). The root cause of such severe declines is the exposure of stock markets to systemic risks; the global financial crisis of 2008, for instance, had previously pushed these markets into a meltdown ( Dang and Nguyen, 2020 ). The current pandemic has affected global stock markets far more than the SARS outbreak of 2003, because over the intervening 17 years China has developed tremendously, becoming a leading economy of the world and a global production hub manufacturing highly demanded technology products ( Alameer et al., 2019 ).

Effects of Previous Pandemics on Stock Markets

Scholars have argued that previous pandemics triggered fragile stock markets ( Chen et al., 2018 ) and impeded stock market participants' decision-making capacity by reducing their active involvement in stock market trading ( Dong and Heo, 2014 ). The literature provides empirical evidence of stock market reactions to significant systematic events, showing the cyclical nature of these reactions and the factors that affect the markets ( Keating, 2001 ). The historical performance of stock markets during influenza and other major epidemics has been documented in previous literature. Similarly, scholars have examined the influence of significant events on stock markets, e.g., Severe Acute Respiratory Syndrome (SARS) ( Chen et al., 2018 ), natural disasters ( Caporale et al., 2019 ), corporate events ( Ranju and Mallikarjunappa, 2019 ), public news, and political events ( Bash and Alsaifi, 2019 ). Other studies have demonstrated that SARS in 2003 weakened the Taiwanese economy ( Chen et al., 2007 ) and regional stock markets ( Chen et al., 2018 ).

Previous studies have comprehensively examined the association between outbreaks and stock market performance. Kalra et al. (1993) investigated the disaster at the Soviet Chernobyl nuclear power plant. Delisle (2003) recognized that the effects of SARS (2003) were of greater intensity than those of the Asian financial crisis. Nippani and Washer (2004) investigated the effects of SARS on global financial markets and found that it influenced the markets of China and Vietnam. Lee and McKibbin (2004) reported the strong effect of SARS on human beings and financial integration. Loh (2006) documented a robust linkage between SARS and airline stock performance in Canada, China, Hong Kong, Singapore, and Thailand, and illustrated that aviation stocks are more sensitive than non-aviation stocks. McKibbin and Sidorenko (2006) investigated the impact of an influenza epidemic on global economic growth under scenarios of varying magnitude (slight, moderate, and intense). Moreover, Chen et al. (2007) noticed the negative effects of SARS on hotel industry stock prices in Taiwan and also investigated the significant influence of SARS on four major stock markets of Asia and China. Nikkinen et al. (2008) found that the 9/11 incident affected global stock prices, although markets recovered rapidly. Al Rjoub (2009) also studied the influence of financial crises on the stock market.

Besides, Kaplanski and Levy (2010) studied the effect of aviation accidents on stock returns and established that price fluctuations are sensitive to such incidents. Al Rjoub (2011) and Al Rjoub and Azzam (2012) investigated the impact of the Mexican tequila crisis (1994), the Asian-Russian financial crisis (1997–98), the 9/11 incident, the Iraq war (2004), the financial crisis of 2005, and the global financial crisis (2008–09) on stock return behavior on Jordan's stock exchange. Righi and Ceretta (2011) established that the European debt crisis (2010) increased the risk profile of European markets, especially the German, French, and British markets. Schwert (2011) explored the variability in US stock prices during the financial crisis. Mctier et al. (2011) found a negative impact of flu on the intensity of trading activity and stock returns in the USA. Rengasamy (2012) examined the effect of the Eurozone sovereign debt crisis on the stock index returns and volatility of Brazil, Russia, India, China, and South Africa. Karlsson and Nilsson (2014) found a negative impact of the 1918 Spanish flu epidemic on capital returns. Lanfear et al. (2018) explored the effect of extreme weather events such as hurricanes on stock returns as a window on disaster risk. Chen et al. (2018) examined the influence of SARS on Asian financial markets.

Brief Overview of Literature

Studies have examined how the COVID-19 outbreak affected the performance of global stock markets ( Ahmar and del Val, 2020 ; Al-Awadhi et al., 2020 ; Liu et al., 2020 ; Zhang et al., 2020 ). The pandemic decreased investors' confidence in the stock market as market uncertainty was very high ( Liu et al., 2020 ). Iyke (2020) explained that COVID-19 has robust and persistent negative effects on the global economy. Ahmar and del Val (2020) used ARIMA and SutteARIMA to forecast the short-term impact of COVID-19 on Spain's IBEX index and found SutteARIMA to be the better statistical method for forecasting such impact.
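SutteARIMA is not part of standard forecasting libraries, but the plain ARIMA side of the comparison made by Ahmar and del Val (2020) can be sketched with statsmodels. The file name, the (1, 1, 1) order, and the 30-day horizon below are illustrative assumptions rather than the paper's settings.

```python
# Minimal ARIMA benchmark sketch (assumed order and horizon, not the paper's settings).
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical CSV with a 'Date' index and a 'Close' column for the IBEX 35.
prices = pd.read_csv("ibex35_daily.csv", index_col="Date", parse_dates=True)["Close"]

# Fit an ARIMA(1, 1, 1) on closing prices; differencing (d=1) handles the trend.
model = ARIMA(prices, order=(1, 1, 1))
fitted = model.fit()

# Forecast the next 30 trading days and print the first few point forecasts.
forecast = fitted.forecast(steps=30)
print(forecast.head())
```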

Moreover, Alam et al. (2020) explained that the pandemic hit Australia's capital market hard right from the start of 2020. The stock market showed a bearish trend overall, though some sectors were at high risk while others performed well. The researchers focused on initial volatility and sectoral returns in eight different sectors, analyzing the data with the event study method and a 10-day window around the official announcements of COVID-19 events in Australia. The findings revealed that some sectors performed well on the day of the announcement, while others showed good performance after the announcement, except the transportation sector, which performed poorly.
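Event studies like the one by Alam et al. (2020) follow a standard recipe: expected returns are estimated over a pre-event window (commonly with the market model), and abnormal returns are then cumulated over a short window around the announcement. The sketch below illustrates that logic with numpy/pandas; the 120-day estimation window and 10-day event window echo the windows mentioned in this section, but the return series, column names, and event date are hypothetical.

```python
# Market-model event study sketch: abnormal returns (AR) and cumulative AR (CAR).
import numpy as np
import pandas as pd

def event_study_car(stock_ret, market_ret, event_date, est_len=120, event_win=10):
    """Estimate the market model on `est_len` days before the event and
    cumulate abnormal returns over `event_win` days starting at the event."""
    pre = stock_ret.loc[:event_date].iloc[-(est_len + 1):-1]   # estimation window
    mkt_pre = market_ret.loc[pre.index]

    # OLS market model: R_it = alpha + beta * R_mt + e_it
    beta, alpha = np.polyfit(mkt_pre.values, pre.values, deg=1)

    post = stock_ret.loc[event_date:].iloc[:event_win]         # event window
    expected = alpha + beta * market_ret.loc[post.index]
    abnormal = post - expected
    return abnormal, abnormal.cumsum()

# Hypothetical daily return series indexed by date:
# ar, car = event_study_car(returns["sector"], returns["market"], "2020-03-11")
```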

The pandemic has posed severe challenges to global economies ( Wang et al., 2021 ) and has also created mental health issues ( Abbas et al., 2021 ). Chowdhury et al. (2020) examined the impact of COVID-19 on economic activities and stock markets worldwide, targeting 12 countries from four continents over January-April 2020 using panel data. The stock market impact was measured with the event study method and the economic impact with a panel vector autoregressive model; the results showed strongly negative effects of the pandemic variables on stock returns. Singh et al. (2020) investigated the influence of COVID-19 on the stock markets of G-20 states, using an event study to measure abnormal returns and panel data to explain their causes. The sample covered 58 days of the post-COVID event period, as reported by international media, and 120 days before the event; the findings exhibited significant negative abnormal returns during the event days. Liu et al. (2020) also examined the pandemic's effect on the stock markets of the most affected countries using an event study and likewise revealed negative effects of COVID-19 on stock market performance.

He P. et al. (2020) and He Q. et al. (2020) also used the event study method to explore the impact of COVID-19 on Chinese industries and stock market performance. Some industries (mining, environment, etc.) were severely affected by the pandemic, whereas others (manufacturing, education, etc.) faced only limited effects of the outbreak. Machmuddah et al. (2020) used the event study method to observe consumer goods' share prices before and after COVID-19, collecting daily stock prices and trading volumes for both periods; significant differences were observed between daily closing prices and trading volumes before and after the pandemic. Liu et al. (2020) applied an event study to the short-term impact of the outbreak on the stock market indices of 21 countries strongly affected by the pandemic (Italy, the UK, Germany, etc.), finding that Asian markets suffered more severe negative effects than those elsewhere. Khatatbeh et al. (2020) also applied the event study method to the daily stock prices of selected countries' stock indices and found a significant negative impact on returns.

Al-Awadhi et al. (2020) investigated the association between the pandemic and outcomes in the Chinese stock market, showing that both COVID-19 cases and deaths affected the stock returns of different firms. Baker et al. (2020) claimed that COVID-19 affected the US stock market more strongly than previous epidemics, including the Spanish Flu: eighteen market jumps were observed between February and March 2020, the largest since 1990, driven by lockdowns and production cuts. Ozili and Arun (2020) described how COVID-19 uncertainty and the fear of losing profit wiped roughly US$6 trillion off global stock markets during the week of 24th February 2020, while the S&P 500 index lost about US$5 trillion in value; their research also demonstrated the significant influence of the pandemic on the opening, highest, and lowest stock indices in the US. Ngwakwe (2020) illustrated the influence of the COVID-19 outbreak on selected stock indices (SSE, Euronext, and DJIA) by collecting data for 50 days before and 50 days within the pandemic, observing differential effects across markets: DJIA stock returns decreased and SSE returns increased, whereas the S&P 500 and Euronext 100 showed insignificant effects.
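Studies in the vein of Al-Awadhi et al. (2020) regress daily stock returns on COVID-19 case and death growth in a panel setting. A stripped-down fixed-effects version of that kind of regression is sketched below with statsmodels; the data layout and variable names (`ret`, `case_growth`, `death_growth`, `firm`) are hypothetical, not the authors' dataset.

```python
# Sketch of a panel-style regression of daily returns on COVID-19 growth variables.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format panel: one row per firm per trading day.
panel = pd.read_csv("covid_panel.csv", parse_dates=["date"]).dropna()
# Expected columns: firm, date, ret, case_growth, death_growth

# Firm fixed effects via C(firm); standard errors clustered by firm.
model = smf.ols("ret ~ case_growth + death_growth + C(firm)", data=panel)
result = model.fit(cov_type="cluster", cov_kwds={"groups": panel["firm"]})

print(result.params[["case_growth", "death_growth"]])
```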

He P. et al. (2020) and He Q. et al. (2020) examined the direct effects and spillovers of COVID-19 on stock markets, using daily return data collected from China, Italy, South Korea, France, Spain, Germany, Japan, and the USA; the findings showed negative short-term effects of COVID-19 on the stock indices. Zhang et al. (2020) elaborated on the impact of pandemic fear on the pattern of systematic and country-specific risk in global financial markets, explaining the volatile nature of these markets and the large effect of uncertain market conditions on financial market risk. Sobieralski (2020) evaluated the effect of COVID-19 on employment in the aviation industry. The stock returns of Chinese and US stocks declined to record levels. Qin et al. (2020) investigated the influence of the outbreak on oil markets.

Sansa (2020) examined the association between recorded COVID-19 cases and the financial market systems of the SSE and DJIA during March 2020. Aslam et al. (2020) studied the impact of COVID-19 on 56 global stock market indices (developed, developing, emerging, and frontier) using network analysis. Topcu and Gulal (2020) found that the impact on Asian markets was greater than on European markets. Ashraf (2020) explained that confirmed cases affect the stock market more strongly than deaths. Czech et al. (2020) used a TGARCH model and found a negative impact of COVID-19 on the Visegrad countries' stock market indices, discovering that the markets were most seriously affected when the disease's status changed from epidemic to pandemic.
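Czech et al. (2020) capture the asymmetric response of volatility with a threshold GARCH specification. A comparable asymmetric model (a GJR-GARCH, where negative shocks can raise volatility more than positive ones) can be fitted with the Python `arch` package, as sketched below; the generic `index_returns` series and the (1, 1, 1) order are assumptions for illustration, not a reproduction of their estimation.

```python
# Asymmetric (threshold-style) GARCH sketch using the `arch` package.
import pandas as pd
from arch import arch_model

# Hypothetical series of daily percentage returns for a stock index.
index_returns = pd.read_csv("index_returns.csv", index_col=0, parse_dates=True).squeeze()

# GJR-GARCH(1,1,1): the o=1 term lets negative shocks raise conditional volatility
# more than positive shocks of the same size (the asymmetry/threshold effect).
model = arch_model(index_returns, mean="Constant", vol="GARCH", p=1, o=1, q=1, dist="t")
result = model.fit(disp="off")

print(result.summary())                 # parameter estimates, incl. the asymmetry term
forecast = result.forecast(horizon=5)   # 5-day-ahead conditional variance forecast
print(forecast.variance.iloc[-1])
```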

Zhang et al. (2020) also investigated the influence of COVID-19 on the stock markets of 10 countries, concluding that European stock markets showed strong connectedness during the outbreak, whereas US markets did not play a leading role before or during the pandemic. Okorie and Lin (2021) documented financial contagion effects during the pandemic. Corbet et al. (2020) presented an interesting insight: the pandemic strongly affected companies whose names resembled the virus, even though their businesses had nothing to do with it.

The current research work presents a comprehensive discussion of the past, present, and future of world stock markets. To achieve its aims, it offers a brief yet inclusive account of developments in the renowned stock markets, focusing on leading market indices from different regions and explaining the actual position of several well-known indices with the help of real-time, data-based graphs. Its major contribution is presenting the diverse views of traditional and behavioral finance regarding the behavior of stock market participants.

A General Debate About Stock Markets Performance

Global stock markets have recorded historic declines. On 23 March 2020, the S&P 500 Index stood roughly 35% below its record high of 18 February 2020. In no time at all, the intensity of this record fall became comparable with the financial crisis of 2008, Black Monday of 1987, and the Great Depression crash of October-November 1929 ( Helppie McFall, 2011 ). Fernandes (2020) also explained that the US S&P 500 index fell by about 30% during March 2020, and described the UK and German stock markets as performing even worse than the US market, with returns falling by 37% and 33%, respectively. However, the worst performers among global stock markets were Brazil (−48%) and Colombia (−47%).

Japan's market index dropped more than 20% from its record high of December 2019. The S&P 500 Index and Dow Jones declined by 20% in March 2020, and the Nikkei Index reported a similar downfall. The Colombo Stock Exchange witnessed a 9% drop in share value and experienced three market halts during mid-March 2020. The Indonesian stock market followed a similar decline, opening in April 2020 with a 64.06-point fall. The UK FTSE index plunged by 29.72%. The DAX (Germany) dropped by 33.37%, the CAC (France) by 33.63%, the NIKKEI (Japan) by 26.85%, and the SENSEX (India) by 17.74% ( Machmuddah et al., 2020 ). The Shanghai Composite went down to 2,660.17 points on 23rd March 2020, a decline of 12.49% compared to December 2019. The KOSPI touched a peak of 2,204.21 points on 27th December 2019 and dropped to a low of 1,457.64 on 19th March 2020, a fall of 33.87%. The BSE SENSEX reached its highest level of 41,681.54 points on 20th December 2019 and plunged to 25,981 points on 23rd March 2020 due to the COVID-19 outbreak, a decline of 37.66%. The FTSE 100 showed an upward trend on 27th December 2019 with a record index of 7,644.90 points, but the pandemic pushed it down to 4,993.89 points, a 34.67% decline. The NASDAQ 100 Index reached 8,778.31 points on 26th December 2019 and felt the negative effects of the COVID outbreak by touching 7,006.92 points, a decline of 20.17%. Moreover, the MOEX showed a bullish trend on 27th December 2019 with an index value of 3,050.47 points and reflected the effects of COVID-19 by falling to 2,112.64 points, a decline of 30.74%.

Besides, the FTSE MIB reached a record level of 24,003.64 points on 20th December 2019 and then fell to 14,894.44 points on 12th March 2020 due to the pandemic, a decline of 37.94%. The Nikkei 225 demonstrated an upward trend with a peak of 24,066.12 on 17th December 2019 and fell to a low of 16,552.83 points on 19th March 2020, a decline of 31.21%. The CAC 40 stood at 6,037.39 points on 27th December 2019 and subsequently suffered a sharp fall of 37.80% by 18th March 2020. The DAX exhibited an ascending trend with a peak value of 13,407.66 on 16th December 2019 and dropped to 8,441.71 on 18th March 2020, a decline of 37.04%. Moving forward, the S&P/TSX climbed to 17,180.15 on 24th December 2019 and then showed the devastating effects of COVID-19 with a sharp decline of 34.64% by 23rd March 2020. Besides, the FTSE/JSE stood at 3,513.21 points on 20th November 2019 and was affected by the outbreak with a decline of 36.37% on 23rd March 2020 (Investopedia).
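The peak-to-trough figures quoted above all follow the same arithmetic: decline = (peak − trough) / peak. A quick check of a few of the index values reported in this section is sketched below; the numbers are taken from the text, while the helper function itself is purely illustrative.

```python
# Verify the peak-to-trough percentage declines quoted for selected indices.
def pct_decline(peak: float, trough: float) -> float:
    """Percentage fall from `peak` to `trough`."""
    return (peak - trough) / peak * 100

quoted = {
    "KOSPI": (2204.21, 1457.64),   # text reports a 33.87% drop
    "DAX":   (13407.66, 8441.71),  # text reports 37.04%
    "MOEX":  (3050.47, 2112.64),   # text reports 30.74%
}

for name, (peak, trough) in quoted.items():
    print(f"{name}: {pct_decline(peak, trough):.2f}% decline")
# KOSPI: 33.87%, DAX: 37.04%, MOEX: 30.74% -- matching the figures above.
```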

However, global stock markets regained ground and demonstrated a bullish trend during April 2020. The S&P 500 index increased by 29% and recovered the strong position it had held in August 2019 ( Cox et al., 2020 ). The Shanghai Composite index rose a further 8.22% by May 2020. The KOSPI showed a bullish trend, increasing by 27.05%. Similarly, the BSE SENSEX recaptured ground and touched 33,717.62 points on 30th April 2020, a rise of 22.94%. The FTSE 100 secured an 18.33% increase, reaching 6,115.25 points on 29th April 2020. The NASDAQ 100 touched 9,485.02 points on 20th May 2020, a rise of 26.12%. The MOEX showed an upward trend with a 74.64% increase on 13th April 2020. The BOVESPA regained 23.56% by 29th April 2020 and touched 83,170.80 points. The FTSE MIB rebounded and reached 18,067.29 on 29th April 2020. On 20th May 2020, the NIKKEI Index climbed to 20,595.15 points, an increase of 19.62%. Moreover, the CAC 40 revived by 19.61% by 29th April 2020, and the DAX strengthened with a 24.79% increase by 20th May 2020. The S&P/TSX touched 15,228 points on 29th April 2020, and the FTSE/JSE recovered by 27.09% by 20th May 2020, shaking off the outbreak's negative effect (Investopedia).

Stock market indices worldwide are commonly categorized as Major Stock Indices, Global Stock Indices, World Stock Indices, etc. The major world stock market indices and their respective countries are presented in Table 1 .


Table 1 . Major world market indices.

Graphical Representation of Some Leading Indices

Source of all figures: tradingeconomics.com.

Figure 1 represents the stock market performance of the S&P ASX 50 index of Australia. The index was performing well during January 2020, when COVID-19 was in its initial phase. March, however, proved to be a nightmare: the index plunged to its lowest level as COVID-19 spread rapidly and hit the majority of nations. The index revived during April 2020, and a gradual, limited bullish trend was observed. In spite of this revival, the index has not regained its peak, as the world is still facing the third wave of the pandemic.


Figure 1 . S&P ASX 50 (Australia). Source: tradingeconomics.com . Reproduced with permission.

Figure 2 exhibits the stock market conditions of the DAX (Germany). The market performed well until February 2020, showed a bearish trend in March 2020, then increased gradually, and finally regained the position it held before the pandemic.


Figure 2 . DAX (Germany). Source: tradingeconomics.com . Reproduced with permission.

Figure 3 demonstrates the situation of the Dow Jones Industrial Average, one of the USA's leading indices. A pattern similar to the previous indices was observed: a bullish trend before March 2020, followed by a bearish trend during March-April 2020. The index then regained ground slowly, and the revival led to strong upward movements.


Figure 3 . Dow Jones industrial averages (USA). Source: tradingeconomics.com . Reproduced with permission.

Figure 4 illustrates the trend of the CAC 40, the leading index of France. The index was at its peak during February 2020, but a sudden jerk was observed during March 2020, when it touched its lowest points. The index has recovered quite slowly and has not yet regained its previous position; fluctuations can still be noticed.


Figure 4 . CAC40 (France). Source: tradingeconomics.com . Reproduced with permission.

The variations in the FTSE 100 index, reflecting Europe's market conditions, can be seen in Figure 5 . A bullish trend can be observed before March 2020, followed by an extreme bearish trend in which the index fell to historic lows during March 2020. Upward movements began in April 2020, but progress was slow, and the index is still in a gradual recovery phase.


Figure 5 . FTSE-100 (Europe). Source: tradingeconomics.com . Reproduced with permission.

The SENSEX is a famous stock market index of India, and Figure 6 represents its performance. Although the index was not performing very well in January 2020, it felt the effects of COVID-19 during March 2020. An extremely slow revival was observed after March, with the index staying at that pace for some time. However, gradual upward trends eventually led the index to its highest peak in 2021, as shown in the figure.


Figure 6 . SENSEX (India). Source: tradingeconomics.com . Reproduced with permission.

Figure 7 depicts Japan's famous index, the Nikkei 225. The index was in a recovery phase during January 2020; however, a bearish trend set in with the outbreak. After March 2020 a recovery began, though movements remained largely static for some time. Nevertheless, this slow recovery finally reached its highest peak in 2021.


Figure 7 . Nikkei-225 (Japan). Source: tradingeconomics.com . Reproduced with permission.

Figure 8 deals with the performance of the NASDAQ, another leading US index. At the start of 2020 its performance was below average, and it ultimately reached its lowest points in March 2020 under the effects of COVID-19. The index then escalated gradually and has since jumped to its peak level.


Figure 8 . NASDAQ (USA). Source: tradingeconomics.com . Reproduced with permission.

Referring to Figure 9 , the PSX-100 of Pakistan was performing well before the sharp rise of COVID-19 in Pakistan. However, March 2020 proved to be a terrible month: the index plunged and touched its lowest level. A slow revival followed, which ultimately hit its highest points during 2021, as shown in Figure 9 .


Figure 9 . PSX-100 (Pakistan). Source: tradingeconomics.com . Reproduced with permission.

The performance of the S&P 500 index, a prominent US index, appears similar to that of the NASDAQ; however, both its pre-COVID performance and its recovery after March 2020 were better than those of the NASDAQ. Currently, the index has reached its highest level, as shown in Figure 10 .


Figure 10 . S&P 500 (USA). Source: tradingeconomics.com . Reproduced with permission.

Figure 11 exhibits the Shanghai Stock Exchange Composite index of China, the country where the COVID-19 pandemic originated. The SSE is an outperforming index of China; however, it was severely affected by the pandemic. The index declined from the start of the outbreak up to June 2020, followed by a sharp rise, and with a continued gradual increase it has now reached its maximum points.


Figure 11 . SSE composite (China). Source: tradingeconomics.com . Reproduced with permission.

Implications for Stock Market Participants

The current study has implications for market participants and policymakers, i.e., investors, managers, corporations, and governments. Investors must have the market know-how to channel their resources into favorable avenues even during contingent market conditions such as those seen during COVID-19, and they can take guidance from managers and policymakers in this regard. In this way, investors can make rational investment decisions, develop their portfolios and risk-management strategies, and focus on diversification to avoid losses during a pandemic situation.

Managers are key stakeholders of financial markets; having experienced the risky nature of stocks during the pandemic, they can take preventive measures accordingly. Managers can also increase the confidence level of investors, which leads them to make long-term investments.

Governments can also play a vital role in cushioning the outbreak through tax rebates and interest-free loans. They can facilitate national markets by relaxing lending policies and providing short-term loans on relaxed terms, and they can conduct surveys to assist investors in reducing their uncertainty.

Policymakers can develop successful methodologies for balancing financial investments during the outbreak, focusing on understanding the dynamics of stock markets when devising effective strategies. They can also integrate policies to cope with the financial and economic impacts of the COVID-19 outbreak, with the emphasis on improving stock market stability.

Unlocking the Future in Post-COVID-19 World

The aim of realizing sustainable growth, a major challenge for all economies, has been underscored by the novel virus. The pandemic has shown that economies cannot merely aspire to their goals; they must actually reach the milestones of a robust global economy. The greatest accomplishments are always achieved through the heights of determination, and history is always there to provide lessons for the future. Nearly 75 years ago, amid World War II and one of Britain's most difficult hours, Winston Churchill inspired the whole nation not with slogans to “reconsider what is achievable” but with a firm determination to “never surrender.” Now, at the most difficult time of the century, we are required to continue our fight for what the world needs instead of reconsidering the sustainable development goals. This crisis requires a determined global effort to “build back better” by making a big reset to reach where we were before ( Kharas and McArthur, 2020 ). Moreover, world leaders must chart a new course of action for improving the functioning of the international financial and monetary system, making it strong enough to cope with any such crisis in the future ( Coulibaly and Prasad, 2020 ).

The stock markets have faced their worst situation in the last 30 years, business operations have abruptly failed, and various economic sectors have been critically affected. However, one positive outcome of the COVID-19 pandemic is the pressure on businesses to be innovative and redefine their operations. One example is the tech community, which has been working to help communities adopt technology to deal with the pandemic's challenges. Such technological innovations allow specific divisions of organizations, or even whole organizations, to carry on their operations irrespective of the current contingent situation. Certainly, the world and multinational business models will face diverse post-virus issues. Following the COVID-19 pandemic, nations will see new policies relating to restructuring and operational strategies: strategic workforce planning, including remote staff planning, flexible conventions, worker proficiency, best practices, and HR strategies; crisis response and business continuity planning, with risk-control strategies and measures; financial resources to weather future unforeseen events; cloud-enabled IT infrastructure (and the attendant improved cybersecurity procedures); and redundant sourcing of necessities (inventory, materials, and people) ( David, 2020 ).

The prospects of the stock markets and the wider economy rest on the availability and accessibility of vaccines. Optimism about the vaccine has revitalized investors' appetite for hotels, energy firms, and airlines, whereas others, brutally affected by the pandemic, have been forced to sell their market shares. Stock markets largely trade on the sentiment that tomorrow may be better than today, leading to a fundamental and perhaps enduring sea change, and the development of more vaccines would pave the way for more optimism. What this demonstrates is that while the virus is not yet beaten, it is beatable, and that ray of light has lit up stock markets around the world. As usual, some stock market participants will look for something else to worry about ( Jack, 2020 ).

The current study deals with the impact of the COVID-19 pandemic on the global stock markets. It has focused on the contingent effects of previous and current pandemics on financial markets and has also elaborated on the impact of the pandemic on diverse pillars of the economy. The pandemic has severely hit worldwide markets and posed challenges for economists, policymakers, heads of state, international financial institutions, regulatory authorities, and health institutions in dealing with the long-lasting effects of the outbreak. It has opened our eyes to the need to concentrate our efforts on protecting both the future health of citizens and their financial well-being. In the current pandemic situation, the stock markets continue to absorb the effects of COVID-19, and this back-and-forth is ongoing. The majority of the world's stock markets have suffered trillion-dollar losses ( Lyócsa et al., 2020 ), and international financial institutions such as the IMF and the World Bank have been forced to reduce their growth forecasts for 2020 and the years to come ( Boone et al., 2020 ). Global stock markets recorded historic declines: March 2020 saw an unusual drop in most worldwide indices, including the S&P 500, NASDAQ, NIKKEI, SSE Composite, CAC 40, and DAX. However, the markets regained ground and demonstrated a bullish trend during April 2020. Irrespective of these cyclical effects of the pandemic, there is still hope for a sharp rise and speedy improvement in global stock markets' performance. Moreover, these past events have become a key for mankind to gain insights for better future planning ( Su et al., 2021 ).

Behavioral vs. Conventional Finance

The two polar strands of finance, traditional finance and behavioral finance, also shed light on the psychology of investors during the COVID-19 pandemic. In traditional finance, investors behave rationally, and this rational attitude restrains them from imitating the decisions of others. Investors gather the basic facts and figures about the stock markets through their own efforts; consequently, the fear of future losses compels them to sell their stocks and the market shows a bearish trend ( Jabeen and Rizavi, 2021 ). The same happened in the world's stock markets during the peak of the pandemic, when sudden jerks were observed across global stock markets ( Jabeen and Farhan, 2020 ).

On the other hand, behavioral finance consists of a set of theories which focus on the irrationality of investors; this viewpoint of investor irrationality is its foundation. Irrationality leads investors to follow the decisions of other investors while setting aside their own information. In such a context, investors place confidence in the decisions of others because they feel that others may possess better information and skills ( Jabeen and Rizavi, 2021 ). As a result, panic market conditions lead investors to blindly follow others in order to protect their investments, and the market again depicts a bearish trend, the one seen during the COVID-19 outbreak.

This debate shows that traditional finance and behavioral finance describe the same mechanism during the COVID-19 pandemic, even though these two pillars of finance deal with opposing investor behaviors, i.e., rational and irrational. In both scenarios, investors sold their shares, and consequently a bearish trend was observed.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Abbas, J., Wang, D., Su, Z., and Ziapour, A. (2021). The role of social media in the advent of COVID-19 pandemic: crisis management, mental health challenges and implications. Risk Manag. Healthc. Policy 14, 1917–1932. doi: 10.2147/RMHP.S284313


Ahmar, A. S., and del Val, E. B. (2020). SutteARIMA: short-term forecasting method, a case: Covid-19 and stock market in Spain. Sci. Total Environ. 729:138883. doi: 10.1016/j.scitotenv.2020.138883

Al Rjoub, S. A. (2011). Business cycles, financial crises, and stock volatility in Jordan stock exchange. Int. J. Econ. Perspect. 5:21. doi: 10.2139/ssrn.1461819


Al Rjoub, S. A., and Azzam, H. (2012). Financial crises, stock returns and volatility in an emerging stock market: the case of Jordan. Journal of Economic Studies 39, 178–211. doi: 10.1108/01443581211222653

Al Rjoub, S. A. M. (2009). Business cycles, financial crises, and stock volatility in Jordan stock exchange. Soc. Sci. Electron. Publish. 31, 127–132.


Alam, M. M., Wei, H., and Wahid, A. N. (2020). COVID-19 outbreak and sectoral performance of the Australian stock market: an event study analysis. Aust. Econ. Pap. 60, 482–495. doi: 10.1111/1467-8454.12215

Alameer, Z., Elaziz, M. A., Ewees, A. A., Ye, H., and Jianhua, Z. (2019). Forecasting copper prices using hybrid adaptive neuro-fuzzy inference system and genetic algorithms. Nat. Resour. Res. 28, 1385–1401. doi: 10.1007/s11053-019-09473-w

Al-Awadhi, A. M., Alsaifi, K., Al-Awadhi, A., and Alhammadi, S. (2020). Death and contagious infectious diseases: impact of the COVID-19 virus on stock market returns. J. Behav. Exp. Finance 27:100326. doi: 10.1016/j.jbef.2020.100326

Aqeel, M., Shuja, K. H., Abbas, J., and Ziapour, A. (2021). The influence of illness perception, anxiety, and depression disorders on student mental health during COVID-19 outbreak in Pakistan: a web-based cross-sectional survey. Int. J. Hum. Rights Healthc. 14, 1–20. doi: 10.1108/IJHRH-10-2020-0095

Ashraf, B. N. (2020). Stock markets' reaction to COVID-19: cases or fatalities? Res. Int. Bus. Finance 54:101249. doi: 10.1016/j.ribaf.2020.101249

Aslam, F., Mohmand, Y. T., Ferreira, P., Memon, B. A., Khan, M., and Khan, M. (2020). Network analysis of global stock markets at the beginning of the coronavirus disease (Covid-19) outbreak. Borsa Istanb. Rev. 20, 49-61. doi: 10.1016/j.bir.2020.09.003

Bagchi, B., Chatterjee, S., Ghosh, R., and Dandapat, D. (2020). Coronavirus outbreak and the great lockdown: impact on oil prices and major stock markets across the globe (Heidelberg; Singapore: Springer), 112. doi: 10.1007/978-981-15-7782-6

Baker, S. R., Bloom, N., Davis, S. J., Kost, K. J., Sammon, M. C., and Viratyosin, T. (2020). The unprecedented stock market impact of COVID-19 (No. w26945). Natl. Bur. Econ. Res . doi: 10.3386/w26945

Baldwin, R. E., and Weder, B. (2020). Economics in the Time of COVID-19 . Washington, DC: CEPR Press.

Barro, R. J., Ursua, J. F., and Weng, J. (2020). The coronavirus and the great influenza epidemic: lessons from the “Spanish Flu” for the coronavirus' potential effects on mortality and economic activity. Natl. Bur. Econ. Res . doi: 10.3386/w26866

Bash, A., and Alsaifi, K. (2019). Fear from uncertainty: an event study of Khashoggi and stock market returns. J. Behav. Exp. Finance 23, 54–58. doi: 10.1016/j.jbef.2019.05.004

Boone, L., Haugh, D., Pain, N., Salins, V., and Boone, L. (2020). Tackling the Fallout from COVID-19. Economics in the Time of COVID-19 (London: CEPR Press), 37.

Caporale, G. M., Plastun, A., and Makarenko, I. (2019). Force majeure events and stock market reactions in Ukraine. Invest. Manag. Financ. Innov. 16, 334–345. doi: 10.21511/imfi.16(1)0.2019.26

Chen, M. H., Jang, S. S., and Kim, W. G. (2007). The impact of the SARS outbreak on Taiwanese hotel stock performance: an event-study approach. Int. J. Hosp. Manag. 26, 200–212. doi: 10.1016/j.ijhm.2005.11.004

Chen, M. P., Lee, C. C., Lin, Y. H., and Chen, W. Y. (2018). Did the SARS epidemic weaken the integration of Asian stock markets? Evidence from smooth time-varying cointegration analysis. Econ. Res. Ekon. Istraz. 3, 908–926. doi: 10.1080/1331677X.2018.1456354

Chowdhury, E. K., Khan, I. I., and Dhar, B. K. (2020). Catastrophic impact of Covid-19 on the global stock markets and economic activities. Bus. Soc. Rev. 1–24. doi: 10.1111/basr.12219

Chriscaden, K. (2020). Impact of COVID-19 on People's Livelihoods, Their Health and our Food Systems [Joint statement by ILO, FAO, IFAD and WHO] . Available online at: https://www.who.int/news/item/13-10-2020-impact-of-covid-19-on-people's-livelihoods-their-health-and-our-food-systems (accessed February 13, 2021).

Chudik, A., Mohaddes, K., Pesaran, M. H., Raissi, M., and Rebucci, A. (2020). Economic Consequences of Covid-19: A Counterfactual Multi-Country analysis. VoxEU.Org . Available online at: https://voxeu.org/article/economic-consequences-covid-19-multi-country-analysis (accessed October 19, 2020).

Corbet, S., Hou, Y., Hu, Y., Lucey, B., and Oxley, L. (2020). Aye Corona! The contagion effects of being named Corona during the COVID-19 pandemic. Finance Res. Lett. 38:2020. doi: 10.2139/ssrn.3561866

Coulibaly, B., and Prasad, E. (2020). “The international monetary and financial system: how to fit it for purpose?,” in Reimagining the Global Economy: Building Back Better in a Post-COVID-19 World . Washington, DC: The Brookings Institution.

Cox, J., Greenwald, D. L., and Ludvigson, S. C. (2020). What explains the COVID-19 stock market. Natl. Bur. Econ. Res . doi: 10.3386/w27784

Cucinotta, D., and Vanelli, M. (2020). WHO declares COVID-19 a pandemic. Acta Biomed. 91, 157–160. doi: 10.23750/abm.v91i1.9397

Czech, K., Wielechowsk, M., Kotyza, P., Benešová, I., and Laputková, A. (2020). Shaking stability: COVID-19 impact on the Visegrad group countries' financial markets. Sustain. Times 12:6282. doi: 10.3390/su12156282

Dang, T. L., and Nguyen, T. M. H. (2020). Liquidity risk and stock performance during the financial crisis. Res. Int. Bus. Finance 52:101165. doi: 10.1016/j.ribaf.2019.101165

David, S. (2020). International Business Models for a Post-COVID-19 World HLB. HLB The Global Advisory and Accounting Network . Available online at: https://www.hlb.global/international-business-models-for-a-post-covid-19-world/ (accessed February 23, 2021).

Delisle, J. (2003). SARS, greater China, and the pathologies of globalization and transition. Orbis 47, 587–604. doi: 10.1016/S0030-4387(03)00076-0

Dong, G. N., and Heo, Y. (2014). Flu epidemic, limited attention and analyst forecast behavior. SSRN Working Paper. doi: 10.2139/ssrn.3353255

Fernandes, N. (2020). Economic effects of coronavirus outbreak (COVID-19) on the world economy . doi: 10.2139/ssrn.3557504

He, P., Sun, Y., Zhang, Y., and Li, T. (2020). COVID−19's impact on stock prices across different sectors—an event study based on the chinese stock market. Emerg. Mark. Finance Trade 56, 2198–2212. doi: 10.1080/1540496X.2020.1785865

He, Q., Liu, J., Wang, S., and Yu, J. (2020). The impact of COVID-19 on stock markets. Econ. Political Stud. 8, 275–288. doi: 10.1080/20954816.2020.1757570

Helppie McFall, B. (2011). Crash and wait? The impact of the great recession on the retirement plans of older Americans. Am. Econ. Rev. 101, 40–44. doi: 10.1257/aer.101.3.40

Hui, D. S., Azhar, D. I., Madani, T. A., Ntoumi, F., Kock, R., Dar, O., et al. (2020). The continuing 2019 nCoV epidemic threat of novel coronaviruses to global health—the latest 2019 novel coronavirus outbreak in Wuhan, China. Int. J. Infect. Dis. 91, 264–266. doi: 10.1016/j.ijid.2020.01.009

Iyke, B. N. (2020). The disease outbreak channel of exchange rate return predictability: evidence from COVID-19. Emerg. Mark. Finance Trade 56, 2277–2297. doi: 10.1080/1540496X.2020.1784718

Jabeen, S., and Rizavi, S. S. (2021). Long term and short term herding prospects: evidence from Pakistan stock exchange. Abasyn J. Manag. Sci. 14, 119–144. doi: 10.34091/AJSS.14.1.08

Jabeen, S., and Farhan, M. (2020). COVID-19: the pandemic's impact on economy and stock markets. J. Manag. Sci. 14, 29–43.

Jack, S. (2020). Covid-19: Global Stock Markets Rocket on Vaccine Hopes. BBC News . Available online at: https://www.bbc.com/news/business-54874108 (accessed November 9, 2020).

Jones, L., Palumbo, D., and Brown, D. (2021). Coronavirus: How the Pandemic has Changed the World Economy. BBC News . Available online at: https://www.bbc.com/news/business-51706225 (accessed January 24, 2021).

Kalra, R., Henderson, G. V., and Raines, G. A. (1993). Effects of the chernobyl nuclear accident on utility share prices. Q. J. Bus. Econ. 32, 52–77.

Kaplanski, G., and Levy, H. (2010). Sentiment and stock prices: the case of aviation disasters. J. Financ. Econ. 95, 174–201. doi: 10.1016/j.jfineco.2009.10.002

Karlsson, M., and Nilsson, S. (2014). The impact of the 1918 Spanish flu epidemic on economic performance in Sweden: an investigation into the consequences of an extraordinary mortality shock. J. Health Econ. 36, 1–9. doi: 10.1016/j.jhealeco.2014.03.005

Keating, J. (2001). An investigation into the cyclical incidence of dengue feve. Soc. Sci. Med. 53, 1587–1597. doi: 10.1016/S0277-9536(00)00443-3

Kharas, H., and McArthur, J. W. (2020). “Sustainable development goals: how can they be a handrail for recovery?,” in Reimagining the Global Economy: Building Back Better in a Post-COVID-19 World (Washington, DC. The Brookings Institution). Available online at: https://www.brookings.edu/multi-chapter-report/reimagining-the-global-economy-building-back-better-in-a-post-covid-19-world

Khatatbeh, I. N., Hani, M. B., and Abu-Alfoul, M. N. (2020). The impact of COVID-19 pandemic on global stock markets: an event study. Int. J. Econ. Bus. 8, 505–514. doi: 10.35808/ijeba/602

Koley, T. K., and Dhole, M. (2020). COVID-19 pandemic: The Deadly Coronavirus Outbreak in the 21st Century, 1st Edn . Oxfordshire: Routledge. doi: 10.4324/9781003095590

Lanfear, M. G., Lioui, A., and Siebert, M. G. (2018). Market anomalies and disaster risk: evidence from extreme weather events. J. Financial Mark. 46:100477. doi: 10.1016/j.finmar.2018.10.003

Lee, J. W., and McKibbin, W. J. (2004). “Estimating the global economic costs of SARS,” in Learning from SARS: Preparing for the Next Disease Outbreak , eds S. Knobler, A. Mahmoud, and S. Lemon, et al. (Washington, DC: National Academies Press).

Liu, H., Manzoor, A., Wang, C., Zhang, L., and Manzoor, Z. (2020). The COVID-19 outbreak and affected countries stock markets response. Int. J. Environ. Res. Public Health 17:2800. doi: 10.3390/ijerph17082800

Loh, E. (2006). The impact of SARS on the performance and risk profile of airline stocks. Int. J. Transp. Econ. 33, 401–422.

Lyócsa, Š., Baumöhl, E., Výrost, T., and Molná, P. (2020). Fear of the coronavirus and the stock markets. Finance Res. Lett. 36:101735. doi: 10.1016/j.frl.2020.101735

Machmuddah, Z., Utomo, S. T., Suhartono, E., Ali, S., and Ghulam, W. A. (2020). Stock market reaction to COVID-19: evidence in customer goods sector with the implication for open innovation. J. Open Innov. Technol. Mark. Complex. 6:99. doi: 10.3390/joitmc6040099

Maqsood, A., Abbas, J., Rehman, G., and Mubeen, R. (2021). The paradigm shift for educational system continuance in the advent of COVID-19 pandemic: mental health challenges and reflections. Curr. Res. Behav. Sci. 2, 1–5. doi: 10.1016/j.crbeha.2020.100011

McKibbin, W. J., and Sidorenko, A. A. (2006). Global Macroeconomic Consequences of Pandemic Influenza. Sydney: Lowy Institute for International Policy, 79.

Mctier, B. C., Tse, Y., and Wald, J. K. (2011). Do stock markets catch the flu? J. Financ. Quant. Anal. 48, 979–1000. doi: 10.1017/S0022109013000239

Mishra, M. K. (2020). The World After COVID-19 and Its Impact on Global Economy . Kiel: ZBW–Leibniz Information Centre for Economics. Available online at: https://www.econstor.eu/handle/10419/215931

Ngwakwe, C. V. (2020). Effect of COVID-19 pandemic on global stock market values: a differential analysis. Economica 16, 261–275.

Nikkinen, J., Omran, M. M., and Sahlstr, M. P. (2008). Stock returns and volatility following the september 11 attacks: evidence from 53 equity markets. Int. Rev. Financial Anal. 17, 27–46. doi: 10.1016/j.irfa.2006.12.002

Nippani, S., and Washer, K. M. (2004). SARS: a non-event for affected countries' stock markets? Appl. Financial Econ. 14, 1105–1110 doi: 10.1080/0960310042000310579

OECD (2020). OECDEmployment Outlook 2020 Facing the Jobs Crisis. Paris: OECD. Available online at: http://www.oecd.org/employment-outlook/2020 (accessed February 13, 2021).

Okorie, D. I., and Lin, B. (2021). Stock markets and the COVID-19 fractal contagion effects. Finance Res. Lett. 38:101640.


Ozili, P. K., and Arun, T. (2020). Spillover of COVID-19: impact on the global economy. doi: 10.2139/ssrn.3562570

Qin, M., Zhang, Y. C., and Su, C. W. (2020). The essential role of pandemics: a fresh insight into the oil market. Energy Res. Lett. 1, 1–6. doi: 10.46557/001c.13166

Ramelli, S., and Wagner, A. F. (2020). Feverish stock price reactions to COVID-19. Rev. Corp. Finance Stud. 9, 622–655. doi: 10.1093/rcfs/cfaa012

Ranju, P. K., and Mallikarjunappa, T. (2019). Spillover effect of MandA announcements on acquiring firms' rivals: evidence from India. Glob. Bus. Rev. 20, 692–707. doi: 10.1177/0972150919837080

Rengasamy, E. (2012). Sovereign debt crisis in the euro zone and its impact on the BRICS's stock index returns and volatility. Econ. Finance Rev. 2, 37–46.

Righi, M. B., and Ceretta, P. S. (2011). Analyzing the structural behavior of volatility in the major European markets during the Greek crisis. Econ. Bull . 31, 3016–3029.

Sansa, N. A. (2020). The impact of the COVID-19 on the financial markets: evidence from China and USA. Electron. Res. J. Soc. Sci. Human. 2:11. doi: 10.2139/ssrn.3567901

Schwert, G. W. (2011). Stock volatility during the recent financial crisis. Eur. Financ. Manag. 17, 789–805. doi: 10.1111/j.1468-036X.2011.00620.x

Shafi, M., Liu, J., and Ren, W. (2020). Impact of COVID-19 pandemic on micro, small, and medium-sized enterprises operating in Pakistan. Res. Glob. 2:100018. doi: 10.1016/j.resglo.2020.100018

Sharma, P. (2021). Coronavirus News, Markets and AI: The COVID-19 Diaries . Oxfordshire: Routledge. doi: 10.4324/9781003138976-1

Singh, B., Dhall, R., Narang, S., and Rawat, S. (2020). The outbreak of COVID-19 and stock market responses: an event study and panel data analysis for G-20 countries. Glob. Bus. Rev. 1–26. doi: 10.1177/0972150920957274

Sobieralski, J. B. (2020). Covid-19 and airline employment: insights from historical uncertainty shocks to the industry. Transp. Res. Interdiscip. Perspect. 5:100123. doi: 10.1016/j.trip.2020.100123

Su, Z., McDonnell, D., Cheshmehzangi, A., Abbas, J., Li, X., and Cai, Y. (2021). The promise and perils of Unit 731 data. BMJ Glob. Health 6, 1–4. doi: 10.1136/bmjgh-2020-004772

Topcu, M., and Gulal, O. S. (2020). The impact of COVID-19 on emerging stock markets. Finance Res. Lett. 36, 101691.

UNWTO (2020). Impact Assessment of the COVID-19 Outbreak on International Tourism UNWTO . Available online at: https://www.unwto.org/impact-assessment-of-the-covid-19-outbreak-on-international-tourism (accessed February 13, 2021).

Wang, C., Wang, D., Duan, K., and Mubeen, R. (2021). Global financial crisis, smart lockdown strategies, and the COVID-19 spillover impacts: a global perspective implications from Southeast Asia. Front. Psychiatry 12:643783. doi: 10.3389/fpsyt.2021.643783

Zhang, D., Hu, M., and Ji, Q. (2020). Financial markets under the global pandemic of COVID-19. Finance Res. Lett. 36:101528. doi: 10.1016/j.frl.2020.101528

Keywords: COVID-19, stock markets, market indices, behavioral finance, SARS

Citation: Jabeen S, Farhan M, Zaka MA, Fiaz M and Farasat M (2022) COVID and World Stock Markets: A Comprehensive Discussion. Front. Psychol. 12:763346. doi: 10.3389/fpsyg.2021.763346

Received: 23 August 2021; Accepted: 30 September 2021; Published: 28 February 2022.


Copyright © 2022 Jabeen, Farhan, Zaka, Fiaz and Farasat. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Muhammad Farhan, muhammad.farhan@numl.edu.pk



Macroeconomics and Finance: The Role of the Stock Market

The treatment of the stock market in finance and macroeconomics exemplifies many of the important differences in perspective between the two fields. In finance, the stock market is the single most important market with respect to corporate investment decisions. In contrast, macroeconomic modelling and policy discussion assign a relatively minor role to the stock market in investment decisions. This paper explores four possible explanations for this neglect and concludes that macro analysis should give more attention to the stock market. Despite the frequent jibe that "the stock market has forecast ten of the last six recessions," the stock market is in fact a good predictor of the business cycle and the components of GNP. We examine the relative importance of the required return on equity compared with the interest rate in the determination of the cost of capital, and hence, investment. In this connection, we review the empirical success of the Q theory of investment, which relates investment to stock market valuations of firms. One of the explanations for the neglect of the stock market in macroeconomics may be the view that because the stock market fluctuates excessively, rational managers will pay little attention to the market in formulating investment plans. This view is shown to be unfounded by demonstrating that rational managers will react to stock price changes even if the stock market fluctuates excessively. Finally, we review the extremely important issue of whether the market does fluctuate excessively, and conclude that while not ruled out on a priori theoretical grounds, the empirical evidence for such excess fluctuations has not been decisive.
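The Q theory mentioned in the abstract ties a firm's investment to the ratio of its stock market valuation to the replacement cost of its capital: when Q = market value / replacement cost exceeds one, installing new capital is worth more than it costs. A toy numerical illustration, with entirely hypothetical figures, is sketched below.

```python
# Toy illustration of the Q-theory decision rule (hypothetical numbers).
def tobins_q(equity_value: float, debt_value: float, replacement_cost: float) -> float:
    """Average Q: market value of the firm over the replacement cost of its capital."""
    return (equity_value + debt_value) / replacement_cost

q = tobins_q(equity_value=800.0, debt_value=200.0, replacement_cost=900.0)
print(f"Q = {q:.2f}")   # 1.11: the market values installed capital above its cost,
                        # so Q theory predicts the firm should expand investment.
```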


  • Open access
  • Published: 28 August 2020

Short-term stock market price trend prediction using a comprehensive deep learning system

Jingyi Shen and M. Omair Shafiq (ORCID: orcid.org/0000-0002-1859-8296)

Journal of Big Data, volume 7, Article number: 66 (2020)


In the era of big data, deep learning for predicting stock market prices and trends has become even more popular than before. We collected 2 years of data from the Chinese stock market and propose a comprehensive customization of feature engineering and a deep learning-based model for predicting the price trend of stock markets. The proposed solution is comprehensive, as it includes pre-processing of the stock market dataset, utilization of multiple feature engineering techniques, and a customized deep learning-based system for stock market price trend prediction. We conducted comprehensive evaluations of frequently used machine learning models and conclude that our proposed solution outperforms them due to the comprehensive feature engineering that we built. The system achieves overall high accuracy for stock market trend prediction. With the detailed design and evaluation of prediction term lengths, feature engineering, and data pre-processing methods, this work contributes to the stock analysis research community in both the financial and technical domains.

Introduction

The stock market is one of the major fields to which investors are dedicated; thus, stock market price trend prediction is always a hot topic for researchers from both the financial and technical domains. In this research, our objective is to build a state-of-the-art prediction model for price trend prediction, focusing on short-term price trends.

As concluded by Fama in [ 26 ], financial time series prediction is known to be a notoriously difficult task due to the generally accepted, semi-strong form of market efficiency and the high level of noise. Back in 2003, Wang et al. in [ 44 ] already applied artificial neural networks to stock market price prediction and focused on volume as a specific feature of the stock market. One of their key findings was that volume was not effective in improving the forecasting performance on the datasets they used, which were the S&P 500 and DJI. Ince and Trafalis in [ 15 ] targeted short-term forecasting and applied a support vector machine (SVM) model to stock price prediction. Their main contribution is a comparison between the multi-layer perceptron (MLP) and SVM, finding that SVM outperformed MLP in most scenarios, although the result was also affected by different trading strategies. In the meantime, researchers from financial domains were applying conventional statistical methods and signal processing techniques to analyzing stock market data.

Optimization techniques, such as principal component analysis (PCA), were also applied in short-term stock price prediction [ 22 ]. Over the years, researchers have not only focused on stock price-related analysis but have also tried to analyze stock market transactions such as volume burst risks, which expands the stock market analysis research domain and indicates that this research domain still has high potential [ 39 ]. As artificial intelligence techniques evolved in recent years, many proposed solutions attempted to combine machine learning and deep learning techniques based on previous approaches, and then proposed new metrics that serve as training features, such as Liu and Wang [ 23 ]. This type of previous work belongs to the feature engineering domain and can be considered the inspiration for the feature extension ideas in our research. Liu et al. in [ 24 ] proposed a convolutional neural network (CNN) as well as a long short-term memory (LSTM) neural network based model to analyze different quantitative strategies in stock markets. The CNN serves the stock selection strategy by automatically extracting features from quantitative data, and is followed by an LSTM that preserves the time-series features to improve profits.

The latest work also proposes a similar hybrid neural network architecture, integrating a convolutional neural network with a bidirectional long short-term memory to predict the stock market index [ 4 ]. While researchers frequently proposed different neural network solution architectures, this raised further discussion about whether the high cost of training such models is worth the result.

There are three key contributions of our work: (1) a new dataset extracted and cleansed, (2) a comprehensive feature engineering procedure, and (3) a customized long short-term memory (LSTM) based deep learning model.

We built the dataset ourselves from an open-sourced data API called Tushare [ 43 ]. The novelty of our proposed solution is that we propose feature engineering along with a fine-tuned system instead of just an LSTM model. We observed the gaps in previous works and propose a solution architecture with a comprehensive feature engineering procedure before training the prediction model. The success of the feature extension method working with recursive feature elimination algorithms opens doors for many other machine learning algorithms to achieve high accuracy scores for short-term price trend prediction, and it proves the effectiveness of our proposed feature extension as feature engineering. We further introduce our customized LSTM model, which improves the prediction scores in all the evaluation metrics. The proposed solution outperformed the machine learning and deep learning-based models in similar previous works.

The remainder of this paper is organized as follows. "Survey of related works" section describes the survey of related works. "The dataset" section provides details on the data that we extracted from the public data sources and the dataset prepared. "Methods" section presents the research problems, methods, and design of the proposed solution; detailed technical design with algorithms and how the model is implemented are also included in this section. "Results" section presents comprehensive results and evaluation of our proposed model, comparing it with the models used in most of the related works. "Discussion" section provides a discussion and comparison of the results. "Conclusion" section presents the conclusion. This research paper has been built based on Shen [ 36 ].

Survey of related works

In this section, we discuss related works. We reviewed the related work in two different domains: technical and financial, respectively.

Kim and Han in [ 19 ] built a model as a combination of artificial neural networks (ANN) and genetic algorithms (GAs) with discretization of features for predicting the stock price index. The data used in their study include the technical indicators as well as the direction of change in the daily Korea stock price index (KOSPI). They used data containing samples of 2928 trading days, ranging from January 1989 to December 1998, and give their selected features and formulas. They also applied optimization of feature discretization, a technique that is similar to dimensionality reduction. The strength of their work is that they introduced GA to optimize the ANN. However, the work also has limitations: first, the number of input features and processing elements in the hidden layer is fixed at 12 and not adjustable; another limitation is in the learning process of the ANN, where the authors only focused on two factors in optimization. Still, they believed that GA has great potential for feature discretization optimization. Our initialized feature pool refers to their selected features. Qiu and Song in [ 34 ] also presented a solution to predict the direction of the Japanese stock market based on an optimized artificial neural network model. In this work, the authors utilize genetic algorithms together with artificial neural network based models and name it a hybrid GA-ANN model.

Piramuthu in [ 33 ] conducted a thorough evaluation of different feature selection methods for data mining applications. He used four datasets, namely credit approval data, loan defaults data, web traffic data, and the Tam and Kiang data, and compared how different feature selection methods optimized decision tree performance. The feature selection methods he compared included probabilistic distance measures (the Bhattacharyya measure, the Matusita measure, the divergence measure, the Mahalanobis distance measure, and the Patrick-Fisher measure) and inter-class distance measures (the Minkowski distance measure, the city block distance measure, the Euclidean distance measure, the Chebychev distance measure, and the nonlinear (Parzen and hyper-spherical kernel) distance measure). The strength of this paper is that the author evaluated both probabilistic distance-based and several inter-class feature selection methods. Besides, the author performed the evaluation on different datasets, which reinforced the strength of this paper. However, the evaluation algorithm was a decision tree only, so we cannot conclude whether the feature selection methods would perform the same on a larger dataset or a more complex model.

Hassan and Nath in [ 9 ] applied the Hidden Markov Model (HMM) to stock market forecasting on the stock prices of four different airlines. They reduced the states of the model to four: the opening price, closing price, the highest price, and the lowest price. The strong point of this paper is that the approach does not need expert knowledge to build a prediction model. However, this work is limited to the airline industry and evaluated on a very small dataset, so it may not lead to a prediction model with generality. It is one of the approaches in stock market prediction related works that could be exploited for comparison. The authors selected a maximum of 2 years as the date range of the training and testing dataset, which provided us a date range reference for our evaluation part.

Lei in [ 21 ] exploited a Wavelet Neural Network (WNN) to predict stock price trends. The author also applied Rough Set (RS) for attribute reduction as an optimization. Rough Set was utilized to reduce the stock price trend feature dimensions and was also used to determine the structure of the Wavelet Neural Network. The dataset of this work consists of five well-known stock market indices, i.e., (1) the SSE Composite Index (China), (2) the CSI 300 Index (China), (3) the All Ordinaries Index (Australia), (4) the Nikkei 225 Index (Japan), and (5) the Dow Jones Index (USA). Evaluation of the model was based on different stock market indices, and the result was convincing with generality. Using Rough Set to optimize the feature dimension before processing reduces the computational complexity. However, the author only stressed the parameter adjustment in the discussion part but did not specify the weakness of the model itself. Meanwhile, we also note that the evaluations were performed on indices; the same model may not have the same performance if applied to a specific stock.

Lee in [ 20 ] used a support vector machine (SVM) along with a hybrid feature selection method to carry out prediction of stock trends. The dataset in this research is a sub-dataset of the NASDAQ Index in the Taiwan Economic Journal Database (TEJD) in 2008. The feature selection part used a hybrid method, in which supported sequential forward search (SSFS) played the role of the wrapper. Another advantage of this work is the detailed procedure of parameter adjustment with performance recorded under different parameter values. The clear structure of the feature selection model is also instructive for the primary stage of model structuring. One of the limitations was that the performance of SVM was compared to a back-propagation neural network (BPNN) only and not to other machine learning algorithms.

Sirignano and Cont leveraged a deep learning solution trained on a universal feature set of financial markets in [ 40 ]. The dataset included buy and sell records of all transactions and cancellations of orders for approximately 1000 NASDAQ stocks through the order book of the stock exchange. The NN consists of three layers with LSTM units and a final feed-forward layer with rectified linear units (ReLUs), with the stochastic gradient descent (SGD) algorithm as the optimizer. Their universal model was able to generalize and cover stocks other than the ones in the training data. Though they mentioned the advantages of a universal model, the training cost was still expensive. Meanwhile, due to the inexplicit programming of the deep learning algorithm, it is unclear whether useless features contaminated the model when the data were fed in. It would have been better if a feature selection step had been performed before training the model, as an effective way to reduce the computational complexity.

Ni et al. in [ 30 ] predicted stock price trends by exploiting SVM and performed fractal feature selection for optimization. The dataset they used is the Shanghai Stock Exchange Composite Index (SSECI), with 19 technical indicators as features. Before processing the data, they optimized the input data by performing feature selection. When finding the best parameter combination, they also used a grid search method with k-fold cross-validation. Besides, the evaluation of different feature selection methods is comprehensive. As the authors mentioned in their conclusion, they only considered technical indicators but not macro and micro factors in the financial domain. The source of the datasets that the authors used was similar to our dataset, which makes their evaluation results useful to our research. They also mentioned k-fold cross-validation when testing hyper-parameter combinations.

McNally et al. in [ 27 ] leveraged RNN and LSTM for predicting the price of Bitcoin, optimized by using the Boruta algorithm for the feature engineering part, which works similarly to the random forest classifier. Besides feature selection, they also used Bayesian optimization to select LSTM parameters. The Bitcoin dataset ranged from 19 August 2013 to 19 July 2016. They used multiple optimization methods to improve the performance of the deep learning methods. The primary problem of their work is overfitting. The research problem of predicting the Bitcoin price trend has some similarities with stock market price prediction; hidden features and noise embedded in the price data are threats to this work. The authors treated the research question as a time sequence problem. The best part of this paper is the feature engineering and optimization part; we could replicate the methods they exploited in our data pre-processing.

Weng et al. in [ 45 ] focused on short-term stock price prediction by using ensemble methods of four well-known machine learning models. The dataset for this research comprises five sets of data, obtained from three open-sourced APIs and an R package named TTR. The machine learning models they used are (1) a neural network regression ensemble (NNRE), (2) a random forest with unpruned regression trees as base learners (RFR), (3) AdaBoost with unpruned regression trees as base learners (BRT), and (4) a support vector regression ensemble (SVRE). This is a thorough study of ensemble methods specified for short-term stock price prediction. With background knowledge, the authors selected eight technical indicators and performed a thoughtful evaluation on five datasets. The primary contribution of this paper is that they developed a platform for investors using R, which does not require users to input their own data but instead calls APIs to fetch the data from online sources directly. From the research perspective, they only evaluated the prediction of the price for 1 up to 10 days ahead but did not evaluate terms longer than two trading weeks or shorter than 1 day. The primary limitation of their research was that they only analyzed 20 U.S.-based stocks, so the model might not generalize to other stock markets or may need further revalidation to see whether it suffers from overfitting problems.

Kara et al. in [ 17 ] also exploited ANN and SVM in predicting the movement of a stock price index. The dataset they used covers the period from January 2, 1997, to December 31, 2007, of the Istanbul Stock Exchange. The primary strength of this work is its detailed record of parameter adjustment procedures. The weaknesses are that neither the technical indicators nor the model structure has novelty, and the authors did not explain how their model performed better than other models in previous works; thus, more validation on other datasets would help. They explained how ANN and SVM work with stock market features and also recorded the parameter adjustment. The implementation part of our research could benefit from this previous work.

Jeon et al. in [ 16 ] performed research on a millisecond interval-based big dataset by using pattern graph tracking to complete stock price prediction tasks. The dataset they used is a millisecond interval-based big dataset of historical stock data from KOSCOM, from August 2014 to October 2014, of 10-15 GB capacity. The authors applied Euclidean distance and Dynamic Time Warping (DTW) for pattern recognition. For feature selection, they used stepwise regression. They completed the prediction task using an ANN, with Hadoop and RHive for big data processing. Their results are based on outputs processed by a combination of SAX and the Jaro-Winkler distance. Before processing the data, they generated aggregated data at 5-min intervals from discrete data. The primary strength of this work is the explicit structure of the whole implementation procedure. However, they exploited a relatively old model, and another weakness is that the overall time span of the training dataset is extremely short. It is difficult to access millisecond interval-based data in real life, so the model is not as practical as a daily-based data model.

Huang et al. in [ 12 ] applied a fuzzy-GA model to complete the stock selection task. They used the stocks of the 200 largest companies by market capitalization listed on the Taiwan Stock Exchange as the investment universe. The yearly financial statement data and the stock returns were taken from the Taiwan Economic Journal (TEJ) database at www.tej.com.tw/ for the period from 1995 to 2009. They constructed the fuzzy membership function with model parameters optimized by GA and extracted features for optimizing stock scoring. The authors proposed an optimized model for the selection and scoring of stocks. Different from a prediction model, the authors focused more on stock rankings, selection, and performance evaluation, and their structure is more practical for investors. However, in the model validation part, they did not compare the model with existing algorithms but only with benchmark statistics, which makes it challenging to identify whether GA would outperform other algorithms.

Fischer and Krauss in [ 5 ] applied long short-term memory (LSTM) to financial market prediction. The dataset they used is the S&P 500 index constituents from Thomson Reuters. They obtained all month-end constituent lists for the S&P 500 from December 1989 to September 2015, then consolidated the lists into a binary matrix to eliminate survivor bias. The authors also used RMSprop as an optimizer, which is a mini-batch version of rprop. The primary strength of this work is that the authors used the latest deep learning technique to perform predictions. However, they relied on the LSTM technique alone and lacked background knowledge of the financial domain. Although the LSTM outperformed the standard DNN and logistic regression algorithms, the authors did not mention the effort required to train an LSTM with long-time dependencies.

Tsai and Hsiao in [ 42 ] proposed a solution combining different feature selection methods for the prediction of stocks. They used the Taiwan Economic Journal (TEJ) database as the data source, with data from 2000 to 2007. In their work, they used a sliding window method combined with multi-layer perceptron (MLP) based artificial neural networks with back propagation as their prediction model. They also applied principal component analysis (PCA) for dimensionality reduction, and genetic algorithms (GA) and classification and regression trees (CART) to select important features. They did not rely on technical indices only; instead, they also included both fundamental and macroeconomic indices in their analysis. The authors also reported a comparison of feature selection methods. The validation part was done by combining the model performance statistics with statistical analysis.

Pimenta et al. in [ 32 ] leveraged an automated investing method by using multi-objective genetic programming and applied it to the stock market. The dataset was obtained from the Brazilian stock exchange (BOVESPA), and the primary techniques they exploited were a combination of multi-objective optimization, genetic programming, and technical trading rules. For optimization, they leveraged genetic programming (GP) to optimize decision rules. The novelty of this paper was in the evaluation part: they included a historical period that was a critical moment of Brazilian politics and economics when performing validation, which reinforced the generalization strength of their proposed model. When selecting the sub-dataset for evaluation, they also set criteria to ensure more asset liquidity. However, the baseline of the comparison was too basic, and the authors did not perform any comparison with other existing models.

Huang and Tsai in [ 13 ] conducted a filter-based feature selection assembled with a hybrid self-organizing feature map (SOFM) support vector regression (SVR) model to forecast the Taiwan index futures (FITX) trend. They divided the training samples into clusters to marginally improve the training efficiency. The authors proposed a comprehensive model, which was a combination of two novel machine learning techniques in stock market analysis. Besides, the feature selection optimizer was applied before the data processing to improve the prediction accuracy and reduce the computational complexity of processing daily stock index data. Though they optimized the feature selection part and split the sample data into small clusters, it was already strenuous to train this model on daily stock index data. It would be difficult for this model to predict trading activities at shorter time intervals, since the data volume would increase drastically. Moreover, the evaluation is not strong enough, since they set a single SVR model as a baseline but did not compare the performance with other previous works, making it difficult for future researchers to identify why the SOFM-SVR model outperforms other algorithms.

Thakur and Kumar in [ 41 ] also developed a hybrid financial trading support system by exploiting multi-category classifiers and random forest (RAF). They conducted their research on stock indices from NASDAQ, DOW JONES, S&P 500, NIFTY 50, and NIFTY BANK. The authors proposed a hybrid model combining random forest (RF) algorithms with a weighted multicategory generalized eigenvalue support vector machine (WMGEPSVM) to generate "Buy/Hold/Sell" signals. Before processing the data, they used random forest (RF) for feature pruning. The authors proposed a practical model designed for real-life investment activities, which can generate three basic signals for investors to refer to. They also performed a thorough comparison of related algorithms. However, they did not mention the time and computational complexity of their work. Meanwhile, an unignorable issue of their work was the lack of financial domain knowledge background: investors regard index data as one of the attributes but cannot take a signal from indices to trade a specific stock directly.

Hsu in [ 11 ] assembled feature selection with a back propagation neural network (BNN) combined with genetic programming to predict the stock/futures price. The dataset in this research was obtained from the Taiwan Stock Exchange Corporation (TWSE). The authors introduced the background knowledge in detail, but the weakness of their work is the lack of a dataset description. The model is a combination of models proposed by other previous works. Though we did not see novelty in this work, we can still conclude that the genetic programming (GP) algorithm is accepted in the stock market research domain. To reinforce validation strength, it would be good to consider adding GP models into the evaluation if the model is predicting a specific price.

Hafezi et al. in [ 7 ] built a bat-neural network multi-agent system (BN-NMAS) to predict stock prices. The dataset was obtained from the Deutsche Bundesbank. They also applied the Bat algorithm (BA) for optimizing neural network weights. The authors illustrated their overall structure and logic of system design in clear flowcharts. However, since very few previous works have been performed on DAX data, it is difficult to tell whether the proposed model retains its generality when migrated to other datasets. The system design and feature selection logic are fascinating and worth referring to. Their findings on optimization algorithms are also valuable for research in the stock market price prediction domain; it is worth trying the Bat algorithm (BA) when constructing neural network models.

Long et al. in [ 25 ] conducted a deep learning approach to predict stock price movement. The dataset they used is the Chinese stock market index CSI 300. For predicting the stock price movement, they constructed a multi-filter neural network (MFNN) with stochastic gradient descent (SGD) and a back propagation optimizer for learning NN parameters. The strength of this paper is that the authors exploited a novel hybrid model constructed from different kinds of neural networks, which provides inspiration for constructing hybrid neural network structures.

Atsalakis and Valavanis in [ 1 ] proposed a neuro-fuzzy system, composed of a controller named the Adaptive Neuro Fuzzy Inference System (ANFIS), to achieve short-term stock price trend prediction. The noticeable strength of this work is the evaluation part: not only did they compare their proposed system with popular data models, but they also compared it with investment strategies. The weakness we found in their proposed solution is that the solution architecture lacks an optimization part, which might limit the model performance. Since our proposed solution also focuses on short-term stock price trend prediction, this work is instructive for our system design. Meanwhile, by comparing with the popular trading strategies of investors, their work inspired us to compare the strategies used by investors with the techniques used by researchers.

Nekoeiqachkanloo et al. in [ 29 ] proposed a system with two different approaches for stock investment. The strengths of their proposed solution are obvious. First, it is a comprehensive system that consists of data pre-processing and two different algorithms to suggest the best investment portions. Second, the system is embedded with a forecasting component, which retains the features of the time series. Last but not least, their input features are a mix of fundamental features and technical indices that aim to fill the gap between the financial and technical domains. However, their work has a weakness in the evaluation part: instead of evaluating the proposed system on a large dataset, they chose 25 well-known stocks, and there is a high possibility that well-known stocks share some common hidden features.

As another related recent work, Idrees et al. [ 14 ] published a time series-based prediction approach for the volatility of the stock market. ARIMA is not a new approach in the time series prediction research domain; their work focuses more on the feature engineering side. Before feeding the features into ARIMA models, they designed three steps for feature engineering: analyze the time series, identify whether the time series is stationary, and perform estimation by plotting ACF and PACF charts to look for parameters. The only weakness of their proposed solution is that the authors did not perform any customization of the existing ARIMA model, which might limit the potential for improving system performance.

One of the main weaknesses found in the related works is the limited data pre-processing mechanisms built and used. Technical works mostly tend to focus on building prediction models; when they select features, they list all the features mentioned in previous works, run a feature selection algorithm, and select the best-voted features. Related works in the investment domain have shown more interest in behavior analysis, such as how herding behaviors affect stock performance, or how the percentage of the firm's common stock held by inside directors affects the performance of a certain stock. These behaviors often need a pre-processing procedure of standard technical indices and investment experience to recognize.

In the related works, a thorough statistical analysis is often performed on a special dataset to derive new features rather than performing feature selection. Some data, such as the fluctuation percentage of a certain index, have been proven to be effective on stock performance. We believe that extracting new features from data and then combining such features with existing common technical indices will significantly benefit the existing and well-tested prediction models.

The dataset

This section details the data that were extracted from the public data sources and the final dataset that was prepared. Stock market-related data are diverse, so we first compared the related works from the survey of financial research on stock market data analysis to specify the data collection directions. After collecting the data, we defined a data structure for the dataset. Below, we describe the dataset in detail, including the data structure and the data tables in each category of data with the segment definitions.

Description of our dataset

In this section, we describe the dataset in detail. This dataset consists of 3558 stocks from the Chinese stock market. Besides the daily price data and daily fundamental data of each stock ID, we also collected the suspending and resuming history, top 10 shareholders, etc. We list two reasons why we chose 2 years as the time span of this dataset: (1) most investors perform stock market price trend analysis using the data within the latest 2 years, and (2) using more recent data benefits the analysis result. We collected data through the open-sourced API, namely Tushare [ 43 ]; meanwhile, we also leveraged a web-scraping technique to collect data from Sina Finance web pages and the SWS Research website.

Data structure

Figure 1 illustrates all the data tables in the dataset. We collected four categories of data in this dataset: (1) basic data, (2) trading data, (3) finance data, and (4) other reference data. All the data tables can be linked to each other by a common field called "Stock ID". It is a unique stock identifier registered in the Chinese stock market. Table 1 shows an overview of the dataset.

Figure 1: Data structure for the extracted dataset

Table 1 lists the field information of each data table as well as the category to which each data table belongs.

Methods

In this section, we present the proposed methods and the design of the proposed solution. Moreover, we introduce the architecture design as well as the algorithmic and implementation details.

Problem statement

We analyzed the best possible approach for predicting short-term price trends from different aspects: feature engineering, financial domain knowledge, and prediction algorithm. Then we addressed three research questions in each aspect, respectively: How can feature engineering benefit model prediction accuracy? How do findings from the financial domain benefit prediction model design? And what is the best algorithm for predicting short-term price trends?

The first research question is about feature engineering. We would like to know how the feature selection method benefits the performance of prediction models. From the abundance of previous works, we can conclude that stock price data are embedded with a high level of noise and that there are correlations between features, which makes price prediction notoriously difficult. That is also the primary reason most of the previous works introduced a feature engineering part as an optimization module.

The second research question is evaluating the effectiveness of the findings we extracted from the financial domain. Different from previous works, besides the common evaluation of data models such as training costs and scores, our evaluation emphasizes the effectiveness of the newly added features that we extracted from the financial domain. We introduce some features from the financial domain; however, we only obtained some specific findings from previous works, and the related raw data needs to be processed into usable features. After extracting related features from the financial domain, we combine them with other common technical indices to vote out the features with a higher impact. There are numerous features said to be effective in the financial domain, and it would be impossible for us to cover all of them. Thus, how to appropriately convert findings from the financial domain into a data processing module of our system design is a hidden research question that we attempt to answer.

The third research question is which algorithm we should use to model our data. In previous works, researchers have been putting effort into exact price prediction. We decompose the problem into predicting the trend and then the exact number; this paper focuses on the first step. Hence, the objective has been converted into resolving a binary classification problem, while finding an effective way to eliminate the negative effect brought by the high level of noise. Our approach is to decompose the complex problem into sub-problems with fewer dependencies, resolve them one by one, and then compile the resolutions into an ensemble model as an aiding system for investing behavior reference.

In previous works, researchers have used a variety of models for predicting stock price trends. Since most of the best-performing models are based on machine learning techniques, in this work we compare our approach with the best-performing machine learning models in the evaluation part to answer this research question.

Proposed solution

The high-level architecture of our proposed solution can be separated into three parts. First is the feature selection part, to guarantee the selected features are highly effective. Second, we look into the data and perform the dimensionality reduction. The last part, which is the main contribution of our work, is to build a prediction model of target stocks. Figure 2 depicts a high-level architecture of the proposed solution.

Figure 2: High-level architecture of the proposed solution

There are various ways to classify different categories of stocks. Some investors prefer long-term investments, while others show more interest in short-term investments. It is common to see stock-related reports showing an average performance while the stock price is increasing drastically; this is one of the phenomena indicating that stock price prediction has no fixed rules, so finding effective features before training a model on the data is necessary.

In this research, we focus on short-term price trend prediction. Initially, we only have the raw data with no labels, so the very first step is to label the data. We mark the price trend by comparing the current closing price with the closing price of n trading days ago, where n ranges from 1 to 10, since our research focuses on the short term. If the price trend goes up, we mark it as 1; otherwise, we mark it as 0. To be more specific, we use the indices of the (n - 1)th day to predict the price trend of the nth day.
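As a minimal illustration of this labeling rule (not the exact code used in this work), assuming the closing prices are held in a pandas Series, the ground truth for a horizon of n trading days could be produced as follows:

```python
import pandas as pd

def label_trend(close: pd.Series, n: int) -> pd.Series:
    """1 if the close is higher than the close n trading days earlier, else 0."""
    up = (close > close.shift(n)).astype(int)
    return up.iloc[n:]  # the first n days have no earlier reference, so drop them

# Example: labels for a 5-trading-day horizon
# labels = label_trend(daily_df["close"], n=5)   # daily_df is a hypothetical DataFrame
```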

According to the previous works, some researchers who applied both financial domain knowledge and technical methods on stock data were using rules to filter the high-quality stocks. We referred to their works and exploited their rules to contribute to our feature extension design.

However, to ensure the best performance of the prediction model, we look into the data first. There are a large number of features in the raw data; involving all of them would not only drastically increase the computational complexity but also cause side effects if we later perform unsupervised learning in further research. So, we leverage recursive feature elimination (RFE) to ensure all the selected features are effective.

We found that most of the previous works in the technical domain analyzed all stocks, while in the financial domain researchers prefer to analyze specific investment scenarios. To fill the gap between the two domains, we decided to apply a feature extension based on the findings we gathered from the financial domain before starting the RFE procedure.

Since we plan to model the data as time series, the more features there are, the more complex the training procedure will be. So, we leverage dimensionality reduction by using randomized PCA at the beginning of our proposed solution architecture.

Detailed technical design elaboration

This section provides an elaboration of the detailed technical design as being a comprehensive solution based on utilizing, combining, and customizing several existing data preprocessing, feature engineering, and deep learning techniques. Figure  3 provides the detailed technical design from data processing to prediction, including the data exploration. We split the content by main procedures, and each procedure contains algorithmic steps. Algorithmic details are elaborated in the next section. The contents of this section will focus on illustrating the data workflow.

Figure 3: Detailed technical design of the proposed solution

Based on the literature review, we select the most commonly used technical indices and then feed them into the feature extension procedure to get the expanded feature set. We select the most effective i features from the expanded feature set, then feed the data with the i selected features into the PCA algorithm to reduce the dimension to j features. After we get the best combination of i and j, we process the data into the finalized feature set and feed it into the LSTM [ 10 ] model to get the price trend prediction result.
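The search for the best combination of i and j can be organized as a simple grid search; the sketch below is only an outline, and rfe_select, pca_reduce, and lstm_cv_score are hypothetical placeholders for the RFE, PCA, and LSTM steps detailed later.

```python
def search_best_i_j(X, y, i_grid=(10, 20, 30, 40), j_grid=(5, 10, 15, 20, 25)):
    """Try every (i, j) pair and keep the one with the best validation score."""
    best_i, best_j, best_score = None, None, float("-inf")
    for i in i_grid:
        X_i = rfe_select(X, y, n_features=i)        # keep the i most effective features
        for j in (j for j in j_grid if j <= i):     # PCA cannot return more than i components
            X_j = pca_reduce(X_i, n_components=j)   # compress to j principal components
            score = lstm_cv_score(X_j, y)           # train/validate the LSTM on this setup
            if score > best_score:
                best_i, best_j, best_score = i, j, score
    return best_i, best_j, best_score
```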

The novelty of our proposed solution is that we will not only apply the technical method on raw data but also carry out the feature extensions that are used among stock market investors. Details on feature extension are given in the next subsection. Experiences gained from applying and optimizing deep learning based solutions in [ 37 , 38 ] were taken into account while designing and customizing feature engineering and deep learning solution in this work.

Applying feature extension

The first main procedure in Fig.  3 is the feature extension. In this block, the input data is the most commonly used technical indices concluded from related works. The three feature extension methods are max–min scaling, polarizing, and calculating fluctuation percentage. Not all the technical indices are applicable for all three of the feature extension methods; this procedure only applies the meaningful extension methods on technical indices. We choose meaningful extension methods while looking at how the indices are calculated. The technical indices and the corresponding feature extension methods are illustrated in Table  2 .

After the feature extension procedure, the expanded features are combined with the most commonly used technical indices, i.e., input data with output data, and fed into the RFE block as input data in the next step.

Applying recursive feature elimination

After the feature extension above, we explore the most effective i features by using the Recursive Feature Elimination (RFE) algorithm [ 6 ]. We estimate all the features by two attributes, coefficient and feature importance. We also limit the number of features removed from the pool to one per step, which means we remove one feature at each step and retain all the relevant features. The output of the RFE block is then the input of the next step, which refers to PCA.
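A minimal sketch of this step with scikit-learn is shown below; the linear SVR estimator matches the ranking estimator mentioned later in the evaluation, while the default of 30 retained features is an assumption for illustration.

```python
from sklearn.feature_selection import RFE
from sklearn.svm import SVR

def rfe_select(X, y, n_features=30):
    """Eliminate one feature per iteration until n_features remain."""
    rfe = RFE(estimator=SVR(kernel="linear"),   # coefficients of the linear SVR rank the features
              n_features_to_select=n_features,
              step=1)                           # remove exactly one feature at each step
    rfe.fit(X, y)
    return rfe.support_, rfe.ranking_           # boolean mask of kept features and full ranking
```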

Applying principal component analysis (PCA)

The very first step before leveraging PCA is feature pre-processing, because some of the features after RFE are percentage data while others are very large numbers; i.e., the outputs from RFE are in different units, which would affect the principal component extraction result. Thus, before feeding the data into the PCA algorithm [ 8 ], feature pre-processing is necessary. We also illustrate its effectiveness and compare methods in the "Results" section.

After performing feature pre-processing, the next step is to feed the processed data with the selected i features into the PCA algorithm to reduce the feature matrix scale to j features. This step retains as many effective features as possible while reducing the computational complexity of training the model. This research work also evaluates the best combination of i and j, which gives relatively better prediction accuracy while cutting computational consumption; the result can be found in the "Results" section as well. After the PCA step, the system gets a reshaped matrix with j columns.
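A rough sketch of this stage, assuming scikit-learn's randomized PCA and standard scaling as the pre-processing step (the specific scaler is an assumption), looks like this:

```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def pca_reduce(X_selected, n_components=20):
    """Standardize the i RFE-selected features, then project onto j principal components."""
    X_std = StandardScaler().fit_transform(X_selected)   # put mixed-unit features on one scale
    pca = PCA(n_components=n_components, svd_solver="randomized")
    X_pc = pca.fit_transform(X_std)
    acr = pca.explained_variance_ratio_.sum()            # accumulative contribution rate
    return X_pc, acr
```

The returned accumulative contribution rate can be checked against the 85% threshold discussed in the algorithm elaboration.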

Fitting long short-term memory (LSTM) model

PCA reduces the dimensions of the input data, but data pre-processing is still mandatory before feeding the data into the LSTM layer. The reason for adding the data pre-processing step before the LSTM model is that the input matrix formed by principal components has no time steps, while one of the most important parameters of training an LSTM is the number of time steps. Hence, we have to reshape the matrix into corresponding time steps for both the training and testing datasets.

After performing the data pre-processing part, the last step is to feed the training data into the LSTM and evaluate the performance using the testing data. As a variant of the RNN, even with one LSTM layer the NN structure is still a deep neural network, since it can process sequential data and memorize its hidden states through time. An LSTM layer is composed of one or more LSTM units, and an LSTM unit consists of cells and gates to perform classification and prediction based on time series data.

The LSTM structure is formed by two layers. The input dimension is determined by j after the PCA algorithm. The first layer is the input LSTM layer, and the second layer is the output layer. The final output is 0 or 1, indicating whether the stock price trend prediction is going down or going up, as a supporting suggestion for investors making their next investment decision.
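A minimal Keras sketch of this two-layer structure is given below; the number of LSTM units, the optimizer, and the loss are illustrative assumptions, since those choices are left to the ModelCompile() step described later.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def build_lstm(n_time_steps, n_components):
    """One LSTM layer over the principal-component sequence, one sigmoid output unit."""
    model = Sequential([
        LSTM(32, input_shape=(n_time_steps, n_components)),  # 32 units is an assumed size
        Dense(1, activation="sigmoid"),                       # probability of an upward trend
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```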

Design discussion

Feature extension is one of the novelties of our proposed price trend predicting system. In the feature extension procedure, we use technical indices to collaborate with the heuristic processing methods learned from investors, which fills the gap between the financial research area and technical research area.

Since we propose a system for price trend prediction, feature engineering is extremely important to the final prediction result. Not only is the feature extension method helpful to guarantee that we do not miss potentially correlated features, but the feature selection method is also necessary for pooling the effective features. The more irrelevant features are fed into the model, the more noise is introduced. Each main procedure is carefully considered as contributing to the whole system design.

Besides the feature engineering part, we also leverage LSTM, the state-of-the-art deep learning method for time-series prediction, which guarantees that the prediction model can capture both complex hidden patterns and time-series related patterns.

It is known that the training cost of deep learning models is expensive in both time and hardware; another advantage of our system design is the optimization procedure, PCA. It retains the principal components of the features while reducing the scale of the feature matrix, thus helping the system save the training cost of processing the large time-series feature matrix.

Algorithm elaboration

This section provides comprehensive details on the algorithms we built while utilizing and customizing different existing techniques, including details about the terminologies, parameters, and optimizers. Following the legend on the right side of Fig. 3, we note the algorithm steps as octagons, all of which can be found in this "Algorithm elaboration" section.

Before diving deep into the algorithm steps, here is a brief introduction to the data pre-processing: since we will go through supervised learning algorithms, we also need to program the ground truth. The ground truth of this research is programmed by comparing the closing price of the current trading date with the closing price of the previous trading date the user wants to compare with. We label a price increase as 1; otherwise, the ground truth is labeled as 0. Because this research work is not only focused on predicting the price trend of a specific period of time but on the short term in general, the ground truth processing is done according to a range of trading days. Since the algorithms do not change with the prediction term length, we can regard the term length as a parameter.

The algorithmic details are elaborated as follows: the first algorithm is the hybrid feature engineering part for preparing high-quality training and testing data, corresponding to the Feature extension, RFE, and PCA blocks in Fig. 3; the second algorithm is the LSTM procedure block, including time-series data pre-processing, NN construction, training, and testing.

Algorithm 1: Short-term stock market price trend prediction—applying feature engineering using FE + RFE + PCA

The function FE corresponds to the feature extension block. For the feature extension procedure, we apply three different processing methods to translate the findings from the financial domain into a technical module in our system design. Since not all the indices are applicable for expanding, we only choose the proper method(s) for certain features to perform the feature extension (FE), according to Table 2.

The normalize method preserves the relative frequencies of the terms and transforms the technical indices into the range [0, 1]. Polarize is a well-known method often used by real-world investors; sometimes they prefer to consider whether the technical index value is above or below zero, so we program some of the features using the polarize method and prepare them for RFE. Max-min (or min-max) scaling [ 35 ] is a transformation method often used as an alternative to zero-mean and unit-variance scaling. Another well-known method used is fluctuation percentage, where we transform the technical index fluctuation percentage into the range [-1, 1].
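One possible reading of these four extension methods in pandas is sketched below; the exact formulas (in particular, how "normalize" preserves relative frequencies) are assumptions made for illustration.

```python
import pandas as pd

def normalize(s: pd.Series) -> pd.Series:
    """Scale by the column total so relative magnitudes are preserved."""
    return s / s.abs().sum()

def polarize(s: pd.Series) -> pd.Series:
    """1 if the index value is above zero, otherwise 0."""
    return (s > 0).astype(int)

def max_min_scale(s: pd.Series) -> pd.Series:
    """Map values linearly into [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

def fluctuation_percentage(s: pd.Series) -> pd.Series:
    """Day-over-day fractional change, clipped into [-1, 1]."""
    return s.pct_change().clip(-1, 1)
```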

The function RFE() in the first algorithm refers to recursive feature elimination. Before we perform the training data scale reduction, we have to make sure that the features we selected are effective; ineffective features not only drag down the classification precision but also add more computational complexity. For the feature selection part, we choose recursive feature elimination (RFE). As explained in [ 45 ], the process of recursive feature elimination can be split into the ranking algorithm, resampling, and external validation.

The ranking algorithm fits the model to the features and ranks them by their importance to the model. We set the parameter to retain i features; each iteration of feature selection retains the Si top-ranked features, then refits the model and assesses the performance again to begin another iteration. The ranking algorithm eventually determines the top Si features.

The RFE algorithm is known to suffer from the over-fitting problem. To eliminate the over-fitting issue, we run the RFE algorithm multiple times on randomly selected stocks as the training set and ensure all the features we select are high-weighted. This procedure is called data resampling. Resampling can be built as an optimization step forming an outer layer of the RFE algorithm.
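The resampling layer can be expressed as a voting loop around the single-run RFE step; in the sketch below, run_rfe_on is a hypothetical helper standing in for that step, and the subset size and number of rounds are assumptions.

```python
import random
from collections import Counter

def vote_features(stock_ids, n_features=30, n_rounds=10):
    """Run RFE on random subsets of stocks and keep the most frequently selected features."""
    votes = Counter()
    for _ in range(n_rounds):
        subset = random.sample(stock_ids, k=max(1, len(stock_ids) // 3))
        selected = run_rfe_on(subset, n_features)   # placeholder: returns selected feature names
        votes.update(selected)
    return [name for name, _ in votes.most_common(n_features)]
```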

The last part of our hybrid feature engineering algorithm is for optimization purposes. For the training data matrix scale reduction, we apply Randomized principal component analysis (PCA) [ 31 ], before we decide the features of the classification model.

Financial ratios of a listed company are used to present its growth ability, earning ability, solvency, etc. Each financial ratio consists of a set of technical indices; each time we add a technical index (or feature), another column of data is added to the data matrix, resulting in lower training efficiency and redundancy. If non-relevant or less relevant features are included in the training data, they will also decrease the precision of classification.

\(ACR = \sum_{i=1}^{j} \lambda_{i} \Big/ \sum_{i=1}^{n} \lambda_{i}\), where \(\lambda_{i}\) is the i-th largest eigenvalue of the covariance matrix, j is the number of principal components retained, and n is the total number of features.

The above equation represents the explanation power, for the original data, of the principal components extracted by the PCA method. If the ACR is below 85%, the PCA method would be unsuitable due to a loss of original information. Because the covariance matrix is sensitive to the order of magnitude of the data, there should be a data standardization procedure before performing PCA. The commonly used standardization methods are mean-standardization and normal-standardization, noted as given below:

Mean-standardization: \(X_{ij}^{*} = X_{ij} / \overline{X_{j}}\), where \(\overline{X_{j}}\) represents the mean value.

Normal-standardization: \(X_{ij}^{*} = (X_{ij} - \overline{X_{j}})/s_{j}\), where \(\overline{X_{j}}\) represents the mean value and \(s_{j}\) is the standard deviation.

The array fe_array is defined according to Table 2: row numbers map to the features, and columns 0, 1, 2, 3 denote the extension methods of normalize, polarize, max-min scale, and fluctuation percentage, respectively. We then fill in the values of the array by the rule that 0 stands for no need to expand and 1 means the corresponding extension method needs to be applied to that feature. The final algorithm of data preprocessing using RFE and PCA is illustrated in Algorithm 1.
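A small illustrative fragment of how such an array can drive the extension step is shown below; the 0/1 values and the feature list here are made up for the example, as the real mapping comes from Table 2.

```python
import numpy as np

feature_names = ["SLOWK", "SLOWD", "RSI_5"]          # example technical indices only
# columns: 0 = normalize, 1 = polarize, 2 = max-min scale, 3 = fluctuation percentage
fe_array = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
])

def extend_features(df, extenders):
    """Apply, per feature, only the extension methods flagged with 1 in fe_array.
    `extenders` maps a column index of fe_array to an extension function."""
    out = df.copy()
    for row, name in enumerate(feature_names):
        for col, func in extenders.items():
            if fe_array[row, col] == 1:
                out[f"{name}_{func.__name__}"] = func(df[name])
    return out
```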

Algorithm 2: Price trend prediction model using LSTM

After the principal component extraction, we get the scale-reduced matrix, which means the i most effective features are converted into j principal components for training the prediction model. We utilized an LSTM model and added a conversion procedure for our stock price dataset. The detailed algorithm design is illustrated in Algorithm 2. The function TimeSeriesConversion() converts the principal components matrix into time series by shifting the input data frame according to the number of time steps [ 3 ], i.e., the term length in this research. The processed dataset consists of the input sequence and the forecast sequence. In this research, the parameter LAG is 1, because the model detects the pattern of feature fluctuation on a daily basis, while N_TIME_STEPS varies from 1 trading day to 10 trading days. The functions DataPartition(), FitModel(), and EvaluateModel() are regular steps without customization. The NN structure design, optimizer decision, and other parameters are described in the function ModelCompile().
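A rough reading of TimeSeriesConversion() is sketched below: it slices the principal-component matrix into overlapping windows of N_TIME_STEPS rows and pairs each window with the label LAG days ahead; the exact shapes are assumptions.

```python
import numpy as np

def time_series_conversion(pc_matrix, labels, n_time_steps, lag=1):
    """Build (samples, n_time_steps, j) windows and their forecast targets."""
    X, y = [], []
    for t in range(n_time_steps, len(pc_matrix) - lag + 1):
        X.append(pc_matrix[t - n_time_steps:t])   # window of past principal components
        y.append(labels[t + lag - 1])             # label `lag` days after the window ends
    return np.asarray(X), np.asarray(y)
```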

Results

Some procedures impact the efficiency but do not affect the accuracy or precision, and vice versa, while other procedures may affect both efficiency and the prediction result. To fully evaluate our algorithm design, we structure the evaluation part by main procedures and evaluate how each procedure affects the algorithm performance. First, we evaluated our solution on a machine with a 2.2 GHz i7 processor and 16 GB of RAM. Furthermore, we also evaluated our solution on an Amazon EC2 instance with a 3.1 GHz processor, 16 vCPUs, and 64 GB of RAM.

In the implementation part, we expanded 20 features into 54 features and retained the 30 features that are the most effective. In this section, we discuss the evaluation of feature selection. The dataset was divided into two different subsets, i.e., training and testing datasets. The test procedure included two parts: one testing dataset is for feature selection, and another is for model testing. We denote the feature selection dataset and the model testing dataset as DS_test_f and DS_test_m, respectively.

We randomly selected two-thirds of the stock data by stock ID for RFE training and denote the dataset as DS_train_f; all the data consist of full technical indices and expanded features throughout 2018. The estimator of the RFE algorithm is SVR with a linear kernel. We rank the 54 features by voting and get 30 effective features, then process them using the PCA algorithm to perform dimension reduction and reduce the features to 20 principal components. The rest of the stock data forms the testing dataset DS_test_f, used to validate the effectiveness of the principal components we extracted from the selected features. We reformed all the data from 2018 as the training dataset of the data model, noted as DS_train_m. The model testing dataset DS_test_m consists of the first 3 months of data in 2019, which has no overlap with the dataset we utilized in the previous steps. This approach is to prevent the hidden problem caused by overfitting.

Term length

To build an efficient prediction model, instead of modeling the data as time series, we determined to use 1-day-ahead indices data to predict the price trend of the next day. We tested the RFE algorithm on a range of short terms from 1 day to 2 weeks (ten trading days) to evaluate how the commonly used technical indices correlate with price trends. For evaluating the prediction term length, we fully expanded the features as in Table 2 and fed them to RFE. During the test, we found that different term lengths have different levels of sensitivity to the same indices set.

We get the close price of the first trading date and compare it with the close price of the nth trading date. Since we are predicting the price trend, we do not consider term lengths whose cross-validation score is below 0.5. After the test, as we can see from Fig. 4, there are three term lengths that are most sensitive to the indices we selected from the related works. They are n = {2, 5, 10}, which indicates that price trend predictions every other day, every week, and every 2 weeks using the indices set are likely to be more reliable.

Figure 4: How term lengths affect the cross-validation score of RFE

These curves have different patterns: for the length of 2 weeks, the cross-validation score increases with the number of features selected; if the prediction term length is 1 week, the cross-validation score decreases when more than 8 features are selected; for every-other-day price trend prediction, the best cross-validation score is achieved by selecting 48 features, and the score merely fluctuates with the number of features selected; biweekly prediction requires 29 features to achieve the best score. In Table 3, we list the top 15 effective features for these three term lengths. In the next step, we evaluate the RFE result for these three term lengths, as shown in Fig. 4.

We compare the RFE output feature set with the all-original feature set as a baseline: the all-original set consists of n features, and we choose the n most effective features from the RFE output to evaluate the result using a linear SVR. We used two different approaches to evaluate feature effectiveness. The first method combines all the data into one large matrix and evaluates it by running the RFE algorithm once. The second method runs RFE for each individual stock and determines the most effective features by voting.
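
The per-stock voting approach can be sketched as follows; this is a hypothetical illustration in which `stock_data` maps stock IDs to feature matrices and labels sharing one feature order, and the helper name is ours.

```python
# Illustrative sketch of per-stock RFE followed by voting across stocks.
from collections import Counter
from sklearn.svm import SVR
from sklearn.feature_selection import RFE

def vote_features(stock_data, feature_names, n_select=30):
    votes = Counter()
    for stock_id, (X, y) in stock_data.items():
        rfe = RFE(estimator=SVR(kernel="linear"), n_features_to_select=n_select)
        rfe.fit(X, y)
        # Each stock votes for the features RFE kept for it.
        votes.update(name for name, kept in zip(feature_names, rfe.support_) if kept)
    # The features supported by the most stocks win the vote.
    return [name for name, _ in votes.most_common(n_select)]
```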

Feature extension and RFE

From the result of the previous subsection, we can see that when predicting the price trend every other day or bi-weekly, the best result is achieved by selecting a large number of features. Within the selected features, some features produced by the extension methods rank higher than the original features, which shows that the feature extension method is useful for optimizing the model. Feature extension affects both precision and efficiency; in this part we only discuss the precision aspect and leave the efficiency part to the next step, since PCA is the most effective method for training-efficiency optimization in our design. We evaluated how feature extension affects RFE and use the test result to measure the improvement gained by involving feature extension.

We further test the effectiveness of feature extension, i.e., whether polarizing, max–min scaling, and calculating the fluctuation percentage work better than the original technical indices. The best case for this test is the weekly prediction, since it has the fewest effective features selected. From the result of the last section, we know that the best cross-validation score appears when 8 features are selected. The test consists of two steps: the first tests the feature set formed by original features only (in this case, only SLOWK, SLOWD, and RSI_5 are included), and the second tests the feature set of all 8 features selected in the previous subsection. We performed the test by defining the simplest DNN model with three layers.
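
A minimal sketch of such a three-layer DNN is shown below, assuming a Keras-style implementation; the layer widths, activations, and optimizer are our assumptions rather than values reported for the experiments.

```python
# Hypothetical "simplest DNN with three layers" used for the feature-set comparison.
import tensorflow as tf

def build_simple_dnn(n_features):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),   # binary trend output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["binary_accuracy"])
    return model

# build_simple_dnn(3) covers {SLOWK, SLOWD, RSI_5}; build_simple_dnn(8) covers
# the full set of 8 RFE-selected weekly features.
```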

The normalized confusion matrices for the two feature sets are illustrated in Fig. 5. The left one is the confusion matrix for the feature set with expanded features, and the right one is the test result using original features only. The true-positive and true-negative precisions improve by 7% and 10%, respectively, which shows that our feature extension design is reasonably effective.

Figure 5. Confusion matrix validating feature extension effectiveness.

Feature reduction using principal component analysis

PCA affects algorithm performance in terms of both prediction accuracy and training efficiency. Since this part should be evaluated with the NN model, we again defined the simplest three-layer DNN model used in the previous step to perform the evaluation. This part introduces the evaluation method and results for the optimization part of the model from the perspectives of computational efficiency and accuracy.

In this section, we choose bi-weekly prediction for a use-case analysis, since it has a smoothly increasing cross-validation score curve and, unlike every-other-day prediction, it has already excluded more than 20 ineffective features. In the first step, we select all 29 effective features and train the NN model without performing PCA, which creates a baseline of accuracy and training time for comparison. To evaluate accuracy and efficiency, we set the number of principal components to 5, 10, 15, 20, and 25. Table 4 records how the number of features affects model training efficiency, and the stacked bar chart in Fig. 6 illustrates how PCA affects training efficiency. Table 6 shows the accuracy and efficiency analysis of different feature pre-processing procedures. The times reported in Tables 4 and 6 are based on experiments conducted on a standard user machine, to show the viability of our solution with limited or average resources.
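
The component sweep behind Table 4 and Fig. 6 can be sketched as below; this is an illustrative outline in which `build_model` stands for any factory returning a compiled Keras-style model (for instance the build_simple_dnn sketch above), and the epoch count is an assumption.

```python
# Illustrative sketch: reduce the 29 effective features to k principal components
# and time the data preparation and training stages separately.
import time
from sklearn.decomposition import PCA

def sweep_pca_components(X_train, y_train, build_model,
                         component_counts=(5, 10, 15, 20, 25), epochs=10):
    results = {}
    for k in component_counts:
        t0 = time.time()
        X_k = PCA(n_components=k).fit_transform(X_train)    # data preparation stage
        prep_seconds = time.time() - t0

        model = build_model(k)
        t1 = time.time()
        model.fit(X_k, y_train, epochs=epochs, verbose=0)    # training stage
        train_seconds = time.time() - t1

        results[k] = {"prep_s": prep_seconds, "train_s": train_seconds}
    return results
```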

Figure 6. Relationship between feature number and training time.

We also list the confusion matrix of each test in Fig. 7. The stacked bar chart shows that the overall time spent on training the model decreases as fewer features are selected, and that the PCA method is significantly effective in optimizing training dataset preparation. For the time spent on the training stage itself, PCA is not as effective as it is for the data preparation stage, possibly because the simple structure of the NN model limits how drastic the optimization effect of PCA can be.

Figure 7. How the number of principal components affects evaluation results.

Table 5 indicates that the overall prediction accuracy is not drastically affected by reducing the dimension. However, accuracy alone cannot fully confirm that PCA has no side effect on model prediction, so we also examined the confusion matrices of the test results.

From Fig. 7 we can conclude that PCA does not have a severe negative impact on prediction precision. The true-positive and false-positive rates are barely affected, while the false-negative and true-negative rates change by 2% to 4%. Besides evaluating how the number of selected features affects training efficiency and model performance, we also tested how data pre-processing procedures affect the training procedure and prediction results. Normalization and max–min scaling are the most commonly used pre-processing steps performed before PCA, since the measurement units of the features vary, and such scaling is said to improve subsequent training efficiency.

We ran another test in which pre-processing steps were added before extracting 20 principal components from the original dataset, and compared the elapsed training time and prediction precision. The test results lead to different conclusions. From Table 6 we can conclude that feature pre-processing does not have a significant impact on training efficiency, but it does influence model prediction accuracy. Moreover, the first confusion matrix in Fig. 8 indicates that without any feature pre-processing, the false-negative and true-negative rates are severely affected, while the true-positive and false-positive rates are not. If normalization is performed before PCA, both the true-positive and true-negative rates decrease by approximately 10%. This test also shows that the best feature pre-processing method for our feature set is max–min scaling.
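
The three pre-processing variants compared in Table 6 and Fig. 8 can be expressed as simple scikit-learn pipelines; this is a sketch under the assumption that the raw, standardized, and max–min-scaled variants differ only in the scaler placed before PCA.

```python
# Illustrative pipelines: no scaling, standardization, and max–min scaling,
# each followed by the same 20-component PCA.
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

def make_preprocessors(n_components=20):
    return {
        "raw":      make_pipeline(PCA(n_components=n_components)),
        "standard": make_pipeline(StandardScaler(), PCA(n_components=n_components)),
        "min_max":  make_pipeline(MinMaxScaler(), PCA(n_components=n_components)),
    }

# Each pipeline is fit on the same training features, and the resulting
# components feed the same three-layer DNN, so only the scaler differs.
```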

Figure 8. Confusion matrices of different feature pre-processing methods.

In this section, we discuss and compare the results of our proposed model, other approaches, and the most related works.

Comparison with related works

From the previous works, we found that the most commonly used models for short-term stock market price trend prediction are the support vector machine (SVM), multilayer perceptron artificial neural network (MLP), Naive Bayes classifier (NB), random forest classifier (RAF), and logistic regression classifier (LR). The comparison test case is again bi-weekly price trend prediction; to evaluate the best result of all models, we keep all 29 features selected by the RFE algorithm. For the MLP evaluation, to test whether the number of hidden layers affects the metric scores, we denoted the layer number as n and tested n = {1, 3, 5} with 150 training epochs for all tests. We found only slight differences in model performance, which indicates that the number of MLP layers hardly affects the metric scores.
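
A hedged sketch of this baseline comparison is given below, using scikit-learn defaults except where noted; the hyperparameter choices shown here are our assumptions, not the exact settings used in the experiments.

```python
# Illustrative baseline comparison on the same 29 RFE-selected features.
import time
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

BASELINES = {
    "SVM": SVC(),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=150),  # assumed width
    "NB":  GaussianNB(),
    "RAF": RandomForestClassifier(),
    "LR":  LogisticRegression(max_iter=1000),
}

def compare_baselines(X_train, y_train, X_test, y_test):
    report = {}
    for name, clf in BASELINES.items():
        t0 = time.time()
        clf.fit(X_train, y_train)
        preds = clf.predict(X_test)
        report[name] = {
            "train_s": time.time() - t0,
            "accuracy": accuracy_score(y_test, preds),
            "confusion": confusion_matrix(y_test, preds),
        }
    return report
```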

From the confusion matrices in Fig. 9, we can see that all the machine learning models perform well when trained with the full feature set selected by RFE. From the perspective of training time, the NB model was the most efficient to train. The LR algorithm costs less training time than the other algorithms while achieving prediction results similar to those of more costly models such as SVM and MLP. The RAF algorithm achieved a relatively high true-positive rate but performed poorly in predicting negative labels. Our proposed LSTM model achieves a binary accuracy of 93.25%, a significantly high precision for predicting the bi-weekly price trend. For this model, we also pre-processed the data through PCA to obtain five principal components and then trained for 150 epochs. The learning curve of our proposed solution, based on feature engineering and the LSTM model, is illustrated in Fig. 10. Its confusion matrix is the figure on the right in Fig. 11, and detailed metric scores can be found in Table 9.
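
For concreteness, a minimal sketch of a 2-layer stacked LSTM trained on the five principal components is shown below; the unit counts, batch size, and single-time-step input shape are assumptions on our part, while the 150 epochs and the binary objective follow the description above.

```python
# Hypothetical sketch of the 2-layer stacked LSTM classifier.
import tensorflow as tf

def build_stacked_lstm(time_steps=1, n_features=5):
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(64, return_sequences=True,
                             input_shape=(time_steps, n_features)),
        tf.keras.layers.LSTM(32),                        # second stacked LSTM layer
        tf.keras.layers.Dense(1, activation="sigmoid"),  # bi-weekly trend (up/down)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["binary_accuracy"])
    return model

# model = build_stacked_lstm()
# model.fit(X_train_pca, y_train, epochs=150, batch_size=64, validation_split=0.1)
```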

Figure 9. Model prediction comparison—confusion matrices.

Figure 10. Learning curve of the proposed solution.

Figure 11. Proposed model prediction precision comparison—confusion matrices.

The detailed evaluation results are recorded in Table 7. We also discuss the evaluation results further in the next section.

Because the result structure of our proposed solution is different from that of most related works, it would be difficult to make a naïve comparison with previous works. For example, it is hard to find the exact accuracy of price trend prediction in most related works, since the authors prefer to report the gain rate of simulated investment. The gain rate is a processed number based on simulated investment tests; sometimes one correct investment decision with a large trading volume can achieve a high gain rate regardless of the price trend prediction accuracy. Besides, a unique and heuristic innovation of our proposed solution is that we transform the problem of directly predicting an exact price into two sequential problems: we predict the price trend first and focus on building an accurate binary classification model, which constructs a solid foundation for predicting the exact price change in future work. Besides the different result structure, the datasets studied in previous works also differ from ours. Some previous works involve news data to perform sentiment analysis and exploit the SE part as another system component to support their prediction model.

The latest related work we can compare with is Zubair et al. [47]; the authors use multiple R-squared for model accuracy measurement. Multiple R-squared, also called the coefficient of determination, shows the strength of the predictor variables in explaining the variation in stock returns [28]. They used three datasets (KSE 100 Index, Lucky Cement Stock, Engro Fertilizer Limited) to evaluate their proposed multiple regression model and achieved 95%, 89%, and 97%, respectively. Except for the KSE 100 Index, the datasets chosen in this related work are individual stocks; thus, we choose the evaluation result on the first dataset of their proposed model.
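
For clarity, the coefficient of determination follows the standard definition (stated here for reference, not quoted from [47]): for observations \(y_{i}\), predictions \(\hat{y}_{i}\), and sample mean \(\bar{y}\),

\[
R^{2} = 1 - \frac{\sum_{i}\left(y_{i} - \hat{y}_{i}\right)^{2}}{\sum_{i}\left(y_{i} - \bar{y}\right)^{2}},
\]

so a value close to 1 means the regression explains most of the variation in stock returns.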

We list the leading stock price trend prediction model performances in Table 8. Judging from the comparable metrics, the metric scores of our proposed solution are generally better than those of the related works. Rather than concluding arbitrarily that our proposed model outperforms the other models in related works, we first look into the dataset column of Table 8. Khaidem and Dey [18] only trained and tested their proposed solution on three individual stocks, which makes it difficult to prove the generalization of their proposed model. Ayo [2] analyzed stock data from the New York Stock Exchange (NYSE), but a weakness is that the analysis was performed only on the closing price, which is a feature embedded with high noise. Zubair et al. [47] trained their proposed model on both individual stocks and an index price, but as we mentioned in the previous section, an index price consists of only a limited number of features and stock IDs, which further affects the model training quality. For our proposed solution, we collected sufficient data from the Chinese stock market and applied the FE + RFE algorithm to the original indices to obtain more effective features; the comprehensive evaluation results on 3558 stock IDs reasonably demonstrate the generalization and effectiveness of our proposed solution in the Chinese stock market. However, Khaidem and Dey [18] and Ayo [2] chose to analyze the stock market in the United States, Zubair et al. [47] analyzed Pakistani stock market prices, and we obtained our dataset from the Chinese stock market; the policies of different countries might impact model performance, which needs further research to validate.

Proposed model evaluation—PCA effectiveness

Besides comparing performance across popular machine learning models, we also evaluated how the PCA algorithm optimizes the training procedure of the proposed LSTM model. We record in Fig. 11 the confusion matrices comparing training the model with 29 features and with five principal components. Model training using the full 29 features takes 28.5 s per epoch on average, whereas it takes only 18 s per epoch on average when training on the feature set of five principal components. PCA therefore improved the training efficiency of the LSTM model by 36.8% (i.e., (28.5 - 18)/28.5 ≈ 36.8%). The detailed metrics data are listed in Table 9. We provide a complexity analysis in the next section.

Complexity analysis of proposed solution

This section analyzes the complexity of our proposed solution. Long short-term memory differs from other NNs: it is a variant of the standard RNN that also has time steps with a memory and gate architecture. In previous work [46], the authors analyzed the complexity of RNN architectures. They introduced a method that regards an RNN as a directed acyclic graph and proposed the concept of recurrent depth, which helps analyze the intricacy of an RNN.

The recurrent depth is a positive rational number, denoted \(d_{rc}\). As \(n\) grows, \(d_{rc}\) measures the average maximum number of nonlinear transformations per time step. We then unfold the directed acyclic graph of the RNN and denote the processed graph as \(g_{c}\); we further denote \(C(g_{c})\) as the set of directed cycles in this graph. For a vertex \(v\), we write \(\sigma_{s}(v)\) for the sum of edge weights and \(l(v)\) for the length. The equation below holds under a mild assumption, which can be found in [46].
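
Based on the definitions above and the formulation of recurrent depth in [46], the equation takes the following form over the directed cycles of \(g_{c}\) (this is our reconstruction of the referenced equation rather than a verbatim quotation):

\[
d_{rc} = \max_{\vartheta \in C(g_{c})} \frac{\sigma_{s}(\vartheta)}{l(\vartheta)}.
\]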

They also found another crucial factor that impacts the performance of LSTMs: the recurrent skip coefficient. We denote \(s_{rc}\) as the reciprocal of the recurrent skip coefficient; note that \(s_{rc}\) is also a positive rational number.

According to the above definitions, our proposed model is a 2-layer stacked LSTM with \(d_{rc} = 2\) and \(s_{rc} = 1\). From the experiments performed in previous work, the authors also found that, when facing long-term dependency problems, LSTMs may benefit from decreasing the reciprocal of the recurrent skip coefficient and from increasing the recurrent depth. These empirical findings are useful for further enhancing the performance of our proposed model.

This work consists of three parts: data extraction and pre-processing of the Chinese stock market dataset, feature engineering, and a stock price trend prediction model based on long short-term memory (LSTM). We collected, cleaned, and structured 2 years of Chinese stock market data. We reviewed different techniques often used by real-world investors and developed a new algorithm component, named feature extension, which proved to be effective. We applied the feature extension (FE) approaches with recursive feature elimination (RFE), followed by principal component analysis (PCA), to build a feature engineering procedure that is both effective and efficient. The system is customized by assembling this feature engineering procedure with an LSTM prediction model, achieving a high prediction accuracy that outperforms the leading models in most related works. We also carried out a comprehensive evaluation of this work. By comparing the most frequently used machine learning models with our proposed LSTM model under the feature engineering part of our proposed system, we arrive at many heuristic findings that could become future research questions in both the technical and financial research domains.

Our proposed solution is a unique customization compared with previous works: rather than proposing yet another state-of-the-art LSTM model, we propose a fine-tuned and customized deep learning prediction system that combines comprehensive feature engineering with an LSTM model to perform the prediction. By building on the observations from previous works, we fill the gap between investors and researchers by proposing a feature extension algorithm before recursive feature elimination, and we obtain a noticeable improvement in model performance.

Though we have achieved a decent outcome with our proposed solution, this research has more potential for future work. During the evaluation procedure, we found that the RFE algorithm is not sensitive to term lengths other than 2-day, weekly, and bi-weekly. A more in-depth investigation into which technical indices influence the irregular term lengths would be a possible future research direction. Moreover, by combining the latest sentiment analysis techniques with feature engineering and a deep learning model, there is also high potential to develop a more comprehensive prediction system trained on diverse types of information such as tweets, news, and other text-based data.

Abbreviations

LSTM: Long short-term memory

PCA: Principal component analysis

RNN: Recurrent neural networks

ANN: Artificial neural network

DNN: Deep neural network

DTW: Dynamic time warping

RFE: Recursive feature elimination

SVM: Support vector machine

CNN: Convolutional neural network

SGD: Stochastic gradient descent

ReLU: Rectified linear unit

MLP: Multilayer perceptron

Atsalakis GS, Valavanis KP. Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Syst Appl. 2009;36(7):10696–707.


Ayo CK. Stock price prediction using the ARIMA model. In: 2014 UKSim-AMSS 16th international conference on computer modelling and simulation. 2014. https://doi.org/10.1109/UKSim.2014.67 .

Brownlee J. Deep learning for time series forecasting: predict the future with MLPs, CNNs and LSTMs in Python. Machine Learning Mastery. 2018. https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/

Eapen J, Bein D, Verma A. Novel deep learning model with CNN and bi-directional LSTM for improved stock market index prediction. In: 2019 IEEE 9th annual computing and communication workshop and conference (CCWC). 2019. pp. 264–70. https://doi.org/10.1109/CCWC.2019.8666592 .

Fischer T, Krauss C. Deep learning with long short-term memory networks for financial market predictions. Eur J Oper Res. 2018;270(2):654–69. https://doi.org/10.1016/j.ejor.2017.11.054 .


Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Mach Learn 2002;46:389–422.

Hafezi R, Shahrabi J, Hadavandi E. A bat-neural network multi-agent system (BNNMAS) for stock price prediction: case study of DAX stock price. Appl Soft Comput J. 2015;29:196–210. https://doi.org/10.1016/j.asoc.2014.12.028 .

Halko N, Martinsson PG, Tropp JA. Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 2011;53(2):217–88.


Hassan MR, Nath B. Stock market forecasting using Hidden Markov Model: a new approach. In: Proceedings—5th international conference on intelligent systems design and applications 2005, ISDA’05. 2005. pp. 192–6. https://doi.org/10.1109/ISDA.2005.85 .

Hochreiter S, Schmidhuber J. Long short-term memory. J Neural Comput. 1997;9(8):1735–80. https://doi.org/10.1162/neco.1997.9.8.1735 .

Hsu CM. A hybrid procedure with feature selection for resolving stock/futures price forecasting problems. Neural Comput Appl. 2013;22(3–4):651–71. https://doi.org/10.1007/s00521-011-0721-4 .

Huang CF, Chang BR, Cheng DW, Chang CH. Feature selection and parameter optimization of a fuzzy-based stock selection model using genetic algorithms. Int J Fuzzy Syst. 2012;14(1):65–75. https://doi.org/10.1016/J.POLYMER.2016.08.021 .

Huang CL, Tsai CY. A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Expert Syst Appl. 2009;36(2 PART 1):1529–39. https://doi.org/10.1016/j.eswa.2007.11.062 .

Idrees SM, Alam MA, Agarwal P. A prediction approach for stock market volatility based on time series data. IEEE Access. 2019;7:17287–98. https://doi.org/10.1109/ACCESS.2019.2895252 .

Ince H, Trafalis TB. Short term forecasting with support vector machines and application to stock price prediction. Int J Gen Syst. 2008;37:677–87. https://doi.org/10.1080/03081070601068595 .

Jeon S, Hong B, Chang V. Pattern graph tracking-based stock price prediction using big data. Future Gener Comput Syst. 2018;80:171–87. https://doi.org/10.1016/j.future.2017.02.010 .

Kara Y, Acar Boyacioglu M, Baykan ÖK. Predicting direction of stock price index movement using artificial neural networks and support vector machines: the sample of the Istanbul Stock Exchange. Expert Syst Appl. 2011;38(5):5311–9. https://doi.org/10.1016/j.eswa.2010.10.027 .

Khaidem L, Dey SR. Predicting the direction of stock market prices using random forest. 2016. pp. 1–20.

Kim K, Han I. Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Syst Appl. 2000;19:125–32. https://doi.org/10.1016/S0957-4174(00)00027-0 .

Lee MC. Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Syst Appl. 2009;36(8):10896–904. https://doi.org/10.1016/j.eswa.2009.02.038 .

Lei L. Wavelet neural network prediction method of stock price trend based on rough set attribute reduction. Appl Soft Comput J. 2018;62:923–32. https://doi.org/10.1016/j.asoc.2017.09.029 .

Lin X, Yang Z, Song Y. Expert systems with applications short-term stock price prediction based on echo state networks. Expert Syst Appl. 2009;36(3):7313–7. https://doi.org/10.1016/j.eswa.2008.09.049 .

Liu G, Wang X. A new metric for individual stock trend prediction. Eng Appl Artif Intell. 2019;82(March):1–12. https://doi.org/10.1016/j.engappai.2019.03.019 .

Liu S, Zhang C, Ma J. CNN-LSTM neural network model for quantitative strategy analysis in stock markets. 2017;1:198–206. https://doi.org/10.1007/978-3-319-70096-0 .

Long W, Lu Z, Cui L. Deep learning-based feature engineering for stock price movement prediction. Knowl Based Syst. 2018;164:163–73. https://doi.org/10.1016/j.knosys.2018.10.034 .

Malkiel BG, Fama EF. Efficient capital markets: a review of theory and empirical work. J Finance. 1970;25(2):383–417.

McNally S, Roche J, Caton S. Predicting the price of bitcoin using machine learning. In: Proceedings—26th Euromicro international conference on parallel, distributed, and network-based processing, PDP 2018. pp. 339–43. https://doi.org/10.1109/PDP2018.2018.00060 .

Nagar A, Hahsler M. News sentiment analysis using R to predict stock market trends. 2012. http://past.rinfinance.com/agenda/2012/talk/Nagar+Hahsler.pdf . Accessed 20 July 2019.

Nekoeiqachkanloo H, Ghojogh B, Pasand AS, Crowley M. Artificial counselor system for stock investment. 2019. ArXiv Preprint arXiv:1903.00955 .

Ni LP, Ni ZW, Gao YZ. Stock trend prediction based on fractal feature selection and support vector machine. Expert Syst Appl. 2011;38(5):5569–76. https://doi.org/10.1016/j.eswa.2010.10.079 .

Pang X, Zhou Y, Wang P, Lin W, Chang V. An innovative neural network approach for stock market prediction. J Supercomput. 2018. https://doi.org/10.1007/s11227-017-2228-y .

Pimenta A, Nametala CAL, Guimarães FG, Carrano EG. An automated investing method for stock market based on multiobjective genetic programming. Comput Econ. 2018;52(1):125–44. https://doi.org/10.1007/s10614-017-9665-9 .

Piramuthu S. Evaluating feature selection methods for learning in data mining applications. Eur J Oper Res. 2004;156(2):483–94. https://doi.org/10.1016/S0377-2217(02)00911-6 .

Qiu M, Song Y. Predicting the direction of stock market index movement using an optimized artificial neural network model. PLoS ONE. 2016;11(5):e0155133.

Scikit-learn. Scikit-learn Min-Max Scaler. 2019. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html . Retrieved 26 July 2020.

Shen J. Thesis, “Short-term stock market price trend prediction using a customized deep learning system”, supervised by M. Omair Shafiq, Carleton University. 2019.

Shen J, Shafiq MO. Deep learning convolutional neural networks with dropout—a parallel approach. ICMLA. 2018;2018:572–7.


Shen J, Shafiq MO. Learning mobile application usage—a deep learning approach. ICMLA. 2019;2019:287–92.

Shih D. A study of early warning system in volume burst risk assessment of stock with Big Data platform. In: 2019 IEEE 4th international conference on cloud computing and big data analysis (ICCCBDA). 2019. pp. 244–8.

Sirignano J, Cont R. Universal features of price formation in financial markets: perspectives from deep learning. Ssrn. 2018. https://doi.org/10.2139/ssrn.3141294 .


Thakur M, Kumar D. A hybrid financial trading support system using multi-category classifiers and random forest. Appl Soft Comput J. 2018;67:337–49. https://doi.org/10.1016/j.asoc.2018.03.006 .

Tsai CF, Hsiao YC. Combining multiple feature selection methods for stock prediction: union, intersection, and multi-intersection approaches. Decis Support Syst. 2010;50(1):258–69. https://doi.org/10.1016/j.dss.2010.08.028 .

Tushare API. 2018. https://github.com/waditu/tushare . Accessed 1 July 2019.

Wang X, Lin W. Stock market prediction using neural networks: does trading volume help in short-term prediction?. n.d.

Weng B, Lu L, Wang X, Megahed FM, Martinez W. Predicting short-term stock prices using ensemble methods and online data sources. Expert Syst Appl. 2018;112:258–73. https://doi.org/10.1016/j.eswa.2018.06.016 .

Zhang S. Architectural complexity measures of recurrent neural networks, (NIPS). 2016. pp. 1–9.

Zubair M, Fazal A, Fazal R, Kundi M. Development of stock market trend prediction system using multiple regression. Computational and mathematical organization theory. Berlin: Springer US; 2019. https://doi.org/10.1007/s10588-019-09292-7 .



Acknowledgements

This research is supported by Carleton University in Ottawa, ON, Canada. This paper is based on the thesis [ 36 ] of Jingyi Shen, supervised by M. Omair Shafiq at Carleton University, Canada, available at https://curve.carleton.ca/52e9187a-7f71-48ce-bdfe-e3f6a420e31a .

Funding: NSERC and Carleton University.

Author information

Authors and Affiliations

School of Information Technology, Carleton University, Ottawa, ON, Canada

Jingyi Shen & M. Omair Shafiq


Contributions

All authors read and approved the final manuscript.

Corresponding author

Correspondence to M. Omair Shafiq .

Ethics declarations

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Shen, J., Shafiq, M.O. Short-term stock market price trend prediction using a comprehensive deep learning system. J Big Data 7 , 66 (2020). https://doi.org/10.1186/s40537-020-00333-6


Received : 24 January 2020

Accepted : 30 July 2020

Published : 28 August 2020

DOI : https://doi.org/10.1186/s40537-020-00333-6


Keywords: Deep learning, Stock market trend, Feature engineering

