THE SEARCH FOR TIME-SERIES PREDICTABILITY-BASED ANOMALIES

This paper introduces a new algorithm for exploiting time-series predictability-based patterns to obtain an abnormal return, or alpha, with respect to a given benchmark asset pricing model. The algorithm proposes a deterministic daily market timing strategy that decides between being fully invested in a risky asset or in a risk-free asset, with the trading rule represented by a parametric perceptron. The optimal parameters are sought in-sample via differential evolution to directly maximize the alpha. Successively using two modern asset pricing models and two different portfolio weighting schemes, the algorithm was able to discover an undocumented anomaly in the United States stock market cross-section, both out-of-sample and using small transaction costs. The new algorithm represents a simple and flexible alternative to technical analysis and forecast-based trading rules, neither of which necessarily maximizes the alpha. This new algorithm was inspired by recent insights into representing reinforcement learning as evolutionary computation.


Introduction
This work introduces an algorithm designed to detect and profitably exploit the presence of time-series predictability-based anomalies. A market anomaly is a reliable and predictable pattern in the time-series or cross-section of asset returns that cannot be explained by a benchmark asset-pricing market model (Keim, 2008). An anomaly is typically demonstrated by rejecting the joint null hypothesis that the market is efficient and that asset returns behave according to a given benchmark asset-pricing model (Keim, 2008). In this context, the asset-basic mathematical operations and logical connectors. Significantly, the present work implies the novel proposition of instead representing such trading rules as neural networks, which are both considerably simpler to program than genetic programming rules and substantially more general, given the approximation properties of neural networks (cf. Kidger & Lyons, 2020). Importantly, in this context, the trading rules are the neural networks themselves (the policies), departing from more common neural-networks-as-forecasting-tools applications. This new approach was inspired by the work of Salimans, Ho, Chen, Sidor, and Sutskever (2017), which interpreted reinforcement learning as a general evolutionary algorithm form, broadly simplifying reinforcement learning programming. Such simplification may ultimately enable advances not only in other financial fields (including risk management, portfolio allocation, and market microstructure) but also in related economics fields (including stochastic games, real-time bidding, consumption and income dynamics, and adaptive experimental design) (cf. Charpentier et al., 2021). 
Third, the proposed algorithm incorporates a market timing strategy that has already considered trading costs when it determines its trading signals (when to buy or sell the risky asset), meaning that the algorithm's alpha is already optimized for transaction costs. This contrasts with the usual development of trading rules (e.g., development based on forecasts or technical analysis), where rules might no longer be optimal after introducing transaction costs.
Fourth, this study's approach reveals a previously unreported anomaly in the U.S. stock market cross-section. The time-series predictability-based anomaly is robust to changes in the benchmark asset-pricing model, from the Fama and French (2015) model to the Carhart (1997) model, and to changes in portfolio construction, from equal-weighted to value-weighted. The first result is important because it demonstrates that the algorithm successfully exploits predictability patterns in the data, independent of the benchmark used. The second result is important because investors can often more easily reproduce value-weighted portfolios than equal-weighted portfolios, with the former approach often requiring less rebalancing.
Fifth, the anomaly documented in this work was identified using exclusively out-of-sample data. Searching for anomalies is frequently a trial-and-error procedure, which can produce false results given data-mining bias. All of the abnormal results this work reports were obtained using data not seen by the algorithm during its development. This practice alleviates overfitting and data-mining concerns and strongly reinforces the conclusion that the anomaly found is real.
The rest of the paper is organized as follows: Section 1 provides a general overview of the literature associated with investment algorithms; Section 2 describes the algorithm for optimizing alpha directly and introduces the data used to test its efficacy; Section 3 provides the descriptive statistics of the (out-of-sample) data and the performance of the algorithm in terms of alphas for two types of size decile portfolios using two modern asset pricing models; the final section provides concluding remarks.

Literature review
The literature on investment algorithms (specifically, market-timing strategies based on past returns) is broad. It can be divided into four main branches: investment algorithms based on forecasts, investment algorithms based on conventional technical analysis, and investment algorithms based on reinforcement learning (or control theory in general, as in the case of dynamic programming), which includes the fourth branch, algorithms based on policy optimization.
Algorithms based on forecasts typically try to invest according to the best available forecast for the next day. Forecasts are based on different methodologies, from conventional ones such as autoregressive moving average models (Atsalakis & Valavanis, 2013) to those based on soft computing, such as neural networks or support-vector machines (Atsalakis & Valavanis, 2009; Henrique et al., 2019). The goal of such algorithms is to produce an optimal forecast and then to follow a simple investment strategy based on that forecast. An example of an investment strategy could be investing in a risky asset when the forecast for the risky asset's return is positive and investing in the riskless asset when the forecast for the risky asset's return is negative (Chan, 2017). Variants include investing in the risky asset only when a certain threshold for the risky asset's return is surpassed. A forecast is considered optimal or the best available because it minimizes an error measurement, such as the root-mean-squared error. However, an optimal forecast does not necessarily optimize risk-adjusted measures of total return, such as alpha, which the proposed algorithm does.
Algorithms based on conventional technical analysis, on the other hand, rely on investment indicators and rules often resembling folklore (Park & Irwin, 2007). While many traders follow these rules (cf. Lo & Hasanhodzic, 2010; Menkhoff, 2010), such rules do not generally have a substantial or scientific basis (Malkiel, 2007). Instead, they are heavily based on anecdotal experience or traditional beliefs. Although the empirical evidence in favor of technical analysis is controversial (Park & Irwin, 2007), some modern authors have tried to substantiate it (Han et al., 2013, 2016; Lo et al., 2000). Given the dubious basis, this paper distances itself from conventional technical analysis.
Algorithms based on reinforcement learning, or control theory, attempt to optimize a measurement of (risk-adjusted) total return (e.g., Bertoluzzo & Corazza, 2014; Cong et al., 2021; Kolm & Ritter, 2021; Mosavi et al., 2020; Pendharkar & Cusatis, 2018; Xufre Casqueiro & Rodrigues, 2006; Zhang et al., 2020). The investment process is seen in these kinds of algorithms as the generalization of a Markov decision-making process: an agent in a given state has to act in the market (e.g., invest in a risky or riskless asset), and the market returns a reward and a new state for the agent (Charpentier et al., 2021; Fischer, 2018). Specialized deep-reinforcement-learning techniques have recently begun to be used.
Algorithms based on policy optimization can be seen as a variant of reinforcement learning wherein the policy is optimized directly. Most commonly, policy optimization does not require computing the policy encoded in the V or Q function (Sutton & Barto, 2018), but a parametrized policy that generates a whole sequence of actions and states can be optimized directly based on the total (possibly risk-adjusted) reward it generates for a whole episode. (A parametrized policy here is an investment rule described by a finite number of parameters that indicates how to invest given a state.) Often, evolutionary algorithms are used to perform this optimization. An example of this kind of algorithm is the work of Brogaard and Zareei (2018). Like the current work, those researchers attempted to optimize alpha, but using genetic programming to find optimal-technical-analysis rules instead of, as this paper does, attempting to use a perceptron function of past returns to find the optimal policy directly. (Sometimes the other types of algorithms are presumed to be "augmented" by technical indicators, such as in Brogaard and Zareei (2018).) Reinforcement-learning literature is the newest and most regularly effective of the four different branches. Direct policy optimization is an innovation in this context. This work fills a gap in the literature by describing, for the first time, how alphas can be optimized directly in parametrized form without resorting to the traditional approach dictated by technical analysis and by introducing a more easily implemented and considerably faster new algorithm.

Data and methodology
To illustrate the identification of an anomaly based on predictability, the proposed algorithm is applied to the daily returns of the size-decile portfolios $j = 1, \ldots, 10$ of the United States market, with $j$ ranging from the decile with the smallest firms to the decile with the largest firms. The decile portfolios were constructed based on sorting NYSE, AMEX, and NASDAQ stocks into ten groups (deciles) according to market equity (size). Upon assigning stocks to portfolios, daily returns (including dividends) were calculated using equal weighting or value weighting. The portfolios were rebalanced at the end of each June using June's market equity and NYSE breakpoints. The portfolios for the period between July 1, 1963, and April 30, 2019, were obtained from Kenneth French's website, with returns measured in terms of percentage (computed using current dollar prices). Half the sample was used for training (7026 data points), and half the sample was used for testing the algorithm out-of-sample.
Let us begin by expressing the market timing strategy in terms of the linear parametric functional form that guides the investment process. In a market timing strategy, the investment rule used is of one particular form: it indicates when to leave or when to enter the market, that is, when to hold the risky or the riskless asset. In other words, the market timing strategy's naked return $R^{M}_{jt}$ at time $t$ according to the investment rule on the underlying $j$-th portfolio is given by Eq. (1), where $Pos_{t-1}$ is the position the investment strategy set the previous day (i.e., at $t-1$). $Pos_{t-1}$ (or $Pos_{t-2}$) takes the value of 1 if the investment rule dictates buying or holding the risky portfolio, obtaining return $R_{jt}$ for the portfolio, and it takes the value of 0 if the investment rule dictates staying out of the market and holding the risk-free asset. The 30-day T-bill is used as the risk-free asset, assuming that a transaction cost is charged for trading the risky portfolio but that there are no costs for trading the 30-day T-bill (e.g., Balduzzi & Lynch, 1999; Han et al., 2013; Lynch & Balduzzi, 2000).
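A form of Eq. (1) consistent with these definitions can be sketched as follows. The symbols $r_{ft}$ (the daily risk-free return) and $\tau$ (a proportional transaction cost per trade) are assumptions introduced here for illustration, and the cost term assumes a charge proportional to the change in position.

```latex
\begin{equation}
R^{M}_{jt} = Pos_{t-1}\, R_{jt} + \left(1 - Pos_{t-1}\right) r_{ft}
             - \tau \left| Pos_{t-1} - Pos_{t-2} \right|
\tag{1}
\end{equation}
```

Under this sketch, the strategy earns the risky return when $Pos_{t-1} = 1$, the T-bill return when $Pos_{t-1} = 0$, and pays $\tau$ only when the position set at $t-1$ differs from the one set at $t-2$.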
The measure of profitability or performance of the strategy is the Fama and French (2015) alpha, although another asset pricing model is used as a robustness check. The return used to measure the alpha is based on the zero-cost arbitrage portfolio, which is long in the market timing portfolio resulting from applying the investment rule and short in the underlying portfolio. That is, the excess return of the investment rule $IR_{jt}$ is computed relative to the buy-and-hold position on the risky portfolio; therefore, it is defined as:
$$IR_{jt} = R^{M}_{jt} - R_{jt}. \qquad (2)$$
According to the above, the performance measure for $IR_{jt}$ is the alpha $\alpha_j$ from the time series regression of the five-factor Fama and French (2015) model. That is, the alpha from the regression
$$IR_{jt} = \alpha_j + \beta_{j,\mathrm{MKT}}\, r_{\mathrm{MKT},t} + \beta_{j,\mathrm{SMB}}\, r_{\mathrm{SMB},t} + \beta_{j,\mathrm{HML}}\, r_{\mathrm{HML},t} + \beta_{j,\mathrm{RMW}}\, r_{\mathrm{RMW},t} + \beta_{j,\mathrm{CMA}}\, r_{\mathrm{CMA},t} + \varepsilon_{jt}, \qquad (3)$$
where $r_{\mathrm{MKT},t}$ is the daily excess return of the market, $r_{\mathrm{SMB},t}$ is the daily return of the small-minus-big (SMB) factor related to size, $r_{\mathrm{HML},t}$ is the daily return of the high-minus-low (HML) factor related to growth, $r_{\mathrm{RMW},t}$ is the daily return of the robust-minus-weak (RMW) factor related to operating profitability, and $r_{\mathrm{CMA},t}$ is the daily return of the conservative-minus-aggressive (CMA) factor related to investment aggressiveness. The Fama and French (2015) daily factors and the daily risk-free rate are taken from Kenneth French's website. See Fama and French (2015) for a complete description of the factor returns. The risk-free rate corresponds to the one-month Treasury bill rate (from Ibbotson Associates).
Before continuing, it is important to stress the main implication of the recorded alpha being measured using an algorithmic arbitrage market timing investment (long in the algorithm, short in the underlying decile portfolio). The implication is that, by the definition in Eq. (2), this work focuses on the degree to which the market timing strategy outperforms the size-decile portfolio after discounting the particular size-decile portfolio performance. Thus, if the market timing strategy naively advised to always buy and hold the size decile portfolio, then the daily returns of the algorithmic arbitrage market timing investment and its alpha would be zero every time (by Eq. (2)), as would the final cumulative return.
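The zero result for a naive buy-and-hold rule can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes returns in decimals, a proportional cost of 1 bp charged whenever the position changes, and hypothetical function names (`timing_returns`, `arbitrage_returns`) and toy data.

```python
def timing_returns(pos, risky, risk_free, cost=0.0001):
    """Strategy return for days t = 2, ..., T-1 (assumed form of Eq. (1)).

    pos[t] is the 0/1 position chosen at the close of day t. Returns are in
    decimals and `cost` is an assumed proportional cost per switch (1 bp).
    """
    out = []
    for t in range(2, len(risky)):
        held = pos[t - 1]                        # position carried into day t
        switched = abs(pos[t - 1] - pos[t - 2])  # 1 if the rule traded
        gross = held * risky[t] + (1 - held) * risk_free[t]
        out.append(gross - cost * switched)
    return out


def arbitrage_returns(pos, risky, risk_free, cost=0.0001):
    """IR_t = R^M_t - R_t: long the timing strategy, short the decile."""
    timed = timing_returns(pos, risky, risk_free, cost)
    return [m - r for m, r in zip(timed, risky[2:])]


# Toy data (illustrative only, not from the paper).
risky = [0.010, 0.020, -0.010, 0.030, 0.005]
risk_free = [0.0001] * 5
buy_and_hold = [1] * 5
print(arbitrage_returns(buy_and_hold, risky, risk_free))  # a list of zeros
```

Any rule that leaves the market at some point produces nonzero excess returns, which is what the alpha regression then evaluates.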
The proposal of the new investment strategy is to increase the likelihood of finding a market timing algorithm that automatically generates a high risk-adjusted return in terms of the alpha. On the one hand, rules from the technical analysis tradition are not used because their scientific status is highly doubtful. On the other hand, prediction rules ultimately only minimize the sum of the distances between the predicted values and the respective real values, thus not necessarily promoting maximum profitability. Instead, a market timing strategy that could succeed by the nature of its construction is sought after; that is, a strategy that could automatically maximize alpha.
Based on the previous framework, the equation that specifies the position of a general investment strategy in terms of a parametric functional form $f$ can be written as follows:
$$Pos_{t} = H\left(f(\mathbf{a}, \mathbf{R}_{jt})\right), \qquad (4)$$
where $H$ is the Heaviside function, $f$ is a general parametric function, $\mathbf{a}$ is a vector of parameters to be specified, and $\mathbf{R}_{jt}$ is a vector with the information of the past returns. Once the parameters $\mathbf{a}$ are known, the rules use the information available on past returns to establish when to buy or hold the risky portfolio, and when to sell it and hold the risk-free asset instead. Notice how, unlike standard predictive investment strategies, $H(f(\mathbf{a}, \mathbf{R}_{jt}))$ is not a prediction in itself but rather a rule about the investment position to be taken (just as a moving average rule establishes this position without making any prediction about the future return).
To make the problem more manageable, several simplifying assumptions can be made. For example, a linear function $f$ could be used, one of the simplest possible cases, i.e., a function $f(\mathbf{a}, \mathbf{R}_{jt}) = a_0 + \sum_{i=1}^{I} a_i R_{j,t-i+1}$, with $\mathbf{R}_{jt} = (R_{jt}, R_{j,t-1}, \ldots, R_{j,t-I+1})$. This form, used in (4), recalls the perceptron function employed in reinforcement learning for trading by Gold (2003). Specifically, the present work uses $I = 10$ past returns for the optimal control investment rule. Following this section, this simple parametric rule's ability to cover a broad spectrum of possible investment strategies will be appreciated.
In this framework, to build an optimal investment rule, one seeks values of the parameter vector $\mathbf{a}$ such that the alpha is maximized. In principle, this can be done by applying (2), (1), and (4) consecutively: the term $IR_{jt}$ in (3) can be expressed in terms of the rules $H(f(\mathbf{a}, \mathbf{R}_{jt}))$, and it is even possible to estimate the alpha in this regression by ordinary least squares as
$$\hat{\alpha}_j(\mathbf{a}) = \left[\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\,\mathbf{IR}_j(\mathbf{a})\right]_1,$$
that is, as the first element of the vector of estimated coefficients, where $\mathbf{X}$ is the matrix of regressors (a column of ones followed by the factor returns) and $\mathbf{IR}_j(\mathbf{a})$ is the vector of regressands expressed in terms of the rules $H(f(\mathbf{a}, \mathbf{R}_{jt}))$. However, in practice, there are two major obstacles to achieving the desired maximization. The first is that $\hat{\alpha}_j(\mathbf{a})$ is an enormous algebraic expression, specifically of $T_{\mathrm{training}} - 2$ addends, which may involve the optimization of an expression of thousands of terms. The second obstacle is more serious. Each of these terms contains expressions in terms of Heaviside functions. These expressions are not only highly nonlinear but also locally constant almost everywhere. They are also discontinuous and nondifferentiable, since the derivative of $H(x)$ is the Dirac delta function $\delta(x)$, which is a "generalized function" (i.e., a distribution) having the property of being zero everywhere except at 0, infinite at 0, and whose integral over the reals is 1.
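The linear rule can be illustrated with a short sketch. The function names are hypothetical, the convention $H(0) = 0$ is an assumption (the text does not fix the value at zero), and the toy example uses two lags rather than the ten used in the paper.

```python
def heaviside(x):
    """Heaviside step; the value at x == 0 is set to 0 by assumption."""
    return 1 if x > 0 else 0


def perceptron_positions(a, returns):
    """Pos_t = H(a[0] + sum_i a[i] * R_{t-i+1}) for i = 1, ..., I.

    `a` holds I + 1 parameters (constant first); `returns` is the daily
    return series of one portfolio. The first position is produced once
    I past returns are available.
    """
    n_lags = len(a) - 1
    positions = []
    for t in range(n_lags - 1, len(returns)):
        # Most recent return first: R_t, R_{t-1}, ..., R_{t-I+1}.
        window = returns[t - n_lags + 1 : t + 1][::-1]
        score = a[0] + sum(w * r for w, r in zip(a[1:], window))
        positions.append(heaviside(score))
    return positions


# Toy example with I = 2: invest when the last two returns sum to > 0.
toy_returns = [0.01, -0.02, 0.03, 0.01, -0.05]
print(perceptron_positions([0.0, 1.0, 1.0], toy_returns))  # [0, 1, 1, 0]
```

The output is a sequence of positions rather than return forecasts, which is the sense in which the rule itself is the policy.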
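The least-squares step can be sketched in a few lines. This is an illustrative pure-Python version that solves the normal equations by Gauss-Jordan elimination; the function name `ols_alpha` and the one-factor toy data are assumptions, and in practice a numerical linear algebra library would normally be used instead.

```python
def ols_alpha(y, factors):
    """Intercept (alpha) of the OLS regression y ~ 1 + factors.

    `factors` is a list of rows, one row of factor returns per day.
    The normal equations (X'X) b = X'y are solved by Gauss-Jordan
    elimination with partial pivoting; alpha is the first element of b.
    """
    X = [[1.0] + list(row) for row in factors]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # pivot row
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return A[0][k]


# Toy check: y = 0.5 + 2 * factor, so the intercept should be 0.5.
toy_factor = [[1.0], [2.0], [3.0], [4.0]]
toy_y = [2.5, 4.5, 6.5, 8.5]
print(ols_alpha(toy_y, toy_factor))
```

Because the regressand depends on the parameters only through 0/1 positions, this estimate is a step function of $\mathbf{a}$, which is why gradient-based optimizers are not applicable.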
To overcome the obstacles that the optimization involves, a heuristic method of optimization was used that, unlike classical optimization methods such as quasi-Newton methods or gradient descent, does not require that the function is continuous, changes locally or is even differentiable. The method chosen is inspired by the theory of evolution and is called differential evolution (DE) (Rocca et al., 2011;Storn & Price, 1997).
The method starts from an initial population of $m$ input vectors $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m\}$. This population is composed of vectors $\mathbf{a}$ randomly chosen as potential candidates to maximize the estimated alpha $\hat{\alpha}_j$. As the number of components of each candidate vector is $I + 1 = 11$, $m$ is chosen to be much higher than $I + 1$. Next, each element of this initial population "evolves" to generate a new candidate to solve the optimization problem through two consecutive mechanisms. First, by mutation, from three vectors of the initial population, $\mathbf{a}_d$, $\mathbf{a}_e$, and $\mathbf{a}_f$, one solution candidate is obtained as $\mathbf{a}' = \mathbf{a}_d + s(\mathbf{a}_e - \mathbf{a}_f)$, where $s$ is a scale factor less than one. Second, by crossing (interbreeding), a new candidate $\mathbf{a}''$ is obtained from candidate $\mathbf{a}'$ and another point $\mathbf{a}_g$ of the initial population. The new candidate takes the $i$-th coordinate of $\mathbf{a}'$ and replaces it with the $i$-th coordinate of $\mathbf{a}_g$ with probability $\rho$ or leaves it unchanged with probability $1 - \rho$. If $\hat{\alpha}_j(\mathbf{a}'') > \hat{\alpha}_j(\mathbf{a}_g)$, $\mathbf{a}''$ replaces $\mathbf{a}_g$ in the population; otherwise, $\mathbf{a}_g$ is not replaced. The end result is a new population, also with $m$ vectors.
The process is repeated iteratively, and the optimization stops when the optimal value of $\hat{\alpha}_j$ in the current population is within a numerically insignificant specified distance of the optimum in the previous population, as are the distances between the vectors corresponding to these optimum values (i.e., when the optimum converges based on predetermined tolerances). Given that, for the particular type of problem addressed here, there is no single solution vector (instead, the solution is a region), the tolerances used were relatively large (0.001 in each case).
The final result of the optimization is an optimal parameter vector $\mathbf{a}^{\bullet}$, which determines the positions taken by the investment strategy, $Pos_t = H(f(\mathbf{a}^{\bullet}, \mathbf{R}_{jt}))$, in a manner that optimizes the estimated Fama and French alpha in the training data (i.e., within the sample). These optimal parameters are also expected to work out-of-sample if there is no overfitting during the optimization.
Since the parametric form of the function chosen to establish the position is linear, i.e., $f(\mathbf{a}, \mathbf{R}_{jt}) = a_0 + \sum_{i=1}^{I} a_i R_{j,t-i+1}$, and depends on only $I = 10$ past returns, one wonders how general this parametric form is. Although it would have been possible to choose a more general functional form, for example, a nonlinear function with more parameters, the greater the number of parameters used and the complexity of the formula, the greater the possibility of overfitting, that is, the possibility of finding an investment rule that generates an optimal alpha in the training data but only a mediocre alpha in the test data (outside the sample). Indeed, the functional form used was chosen to be simple and have few parameters for this reason. Despite this, such a functional form turns out to be quite flexible, as is explained below.
First, it is trivial to see that any inequality of the form $\sum_{i=1}^{I} a_i R_{j,t-i+1} > 0$ (without a constant) or of the form $a_0 + \sum_{i=1}^{I} a_i R_{j,t-i+1} < 0$ (with the direction reversed) can be written as a special case of $f(\mathbf{a}, \mathbf{R}_{jt}) > 0$ by setting $a_0 = 0$ or by changing the sign of every parameter, respectively. Similarly, any inequality of the form $a_0 + \sum_{i=1}^{I} a_i R_{j,t-i+1} > c$ is also a special case of $f(\mathbf{a}, \mathbf{R}_{jt}) > 0$, with constant $a_0 - c$. Accordingly, neither the direction of the inequality, nor the presence or absence of a constant, nor the fact that the right side of the inequality is zero limits the generality of this type of investment strategy. Secondly, Zakamulin (2016) has shown that a wide variety of technical analysis indicators can be written in terms of a linear combination of returns. These include, among others, momentum indicators and general moving average indicators, including simple moving averages, simple moving average crossings, and linear, exponential, and reverse exponential moving averages. Consequently, all these indicators are implicitly considered when optimizing the selected investment strategy: if one of them maximized the estimated alpha, the optimization could recover it. So, according to the above, the parametric form of the chosen function, although parsimonious, is also quite general.
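The mutation, crossing, and replacement steps can be sketched as follows. This is a simplified illustration rather than the implementation used in the paper: it stops after a fixed number of generations instead of applying convergence tolerances, the default values of `m`, `s`, and `rho` are hypothetical, and the toy objective is a piecewise-constant stand-in for the estimated alpha.

```python
import random


def differential_evolution(objective, dim, m=40, s=0.6, rho=0.5,
                           generations=100, seed=0):
    """Maximize `objective` over vectors of length `dim`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(m)]
    fit = [objective(a) for a in pop]
    for _ in range(generations):
        new_pop, new_fit = [], []
        for g in range(m):
            others = [k for k in range(m) if k != g]
            d, e, f = rng.sample(others, 3)
            # Mutation: a' = a_d + s * (a_e - a_f).
            mutant = [pop[d][i] + s * (pop[e][i] - pop[f][i])
                      for i in range(dim)]
            # Crossing: each coordinate of a' is replaced by the matching
            # coordinate of a_g with probability rho.
            trial = [pop[g][i] if rng.random() < rho else mutant[i]
                     for i in range(dim)]
            trial_fit = objective(trial)
            # Replacement: keep a'' only if it improves on a_g.
            if trial_fit > fit[g]:
                new_pop.append(trial)
                new_fit.append(trial_fit)
            else:
                new_pop.append(pop[g])
                new_fit.append(fit[g])
        pop, fit = new_pop, new_fit
    best = max(range(m), key=lambda k: fit[k])
    return pop[best], fit[best]


def step_objective(a):
    """Toy discontinuous objective: counts coordinates above 0.5."""
    return sum(1 for x in a if x > 0.5)


best, best_fit = differential_evolution(step_objective, dim=3)
```

Only comparisons of objective values are needed, so the procedure applies equally to step-like objectives such as an alpha built from Heaviside positions.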
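As a concrete instance of an indicator written as a linear combination of returns, the sketch below compares a momentum rule stated on prices with the same rule stated on returns. It assumes log returns, for which the equivalence is exact (with simple returns it holds only approximately); the function names and the toy prices are hypothetical.

```python
import math


def momentum_from_prices(prices, k):
    """Invest (1) when today's price exceeds the price k days ago."""
    return [1 if prices[t] > prices[t - k] else 0
            for t in range(k, len(prices))]


def momentum_from_returns(log_returns, k):
    """Same rule as a perceptron: a_0 = 0 and a_1 = ... = a_k = 1."""
    a = [0.0] + [1.0] * k
    signals = []
    for t in range(k - 1, len(log_returns)):
        window = log_returns[t - k + 1 : t + 1][::-1]
        score = a[0] + sum(w * r for w, r in zip(a[1:], window))
        signals.append(1 if score > 0 else 0)
    return signals


# Toy prices (illustrative only).
prices = [100, 102, 101, 99, 103, 98, 97, 104]
log_returns = [math.log(prices[t] / prices[t - 1])
               for t in range(1, len(prices))]

print(momentum_from_prices(prices, 3))        # [0, 1, 0, 0, 1]
print(momentum_from_returns(log_returns, 3))  # [0, 1, 0, 0, 1]
```

Other moving-average rules differ only in the weights assigned to the lagged returns, so they correspond to other points in the same parameter space.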
Thirdly, the parametric form of the function used is nothing more than a perceptron; that is, an artificial neuron using the Heaviside function as the activation function. The learning here is achieved through differential evolution. As linear classifiers, perceptrons are very efficient if the training set is linearly separable, and this choice of function can be seen as a first approximation of a solution to the problem of finding an optimal investment algorithm.

Results and discussion
Table 1 shows the descriptive statistics of the testing sample from May 22, 1991, to April 30, 2019, for equal-weighted (Panel A) and value-weighted (Panel B) size-decile portfolios, respectively. Of the 20 portfolios, only the equal-weighted size-decile portfolio with the smallest firms has a mean significantly different from zero ($p = 0.02$). This portfolio is also the only one with a Sharpe ratio over one, a risk-adjusted measure that otherwise ranges from 0.47 to 0.57. Each portfolio is negatively skewed, except for the equal-weighted size-decile portfolio with the largest firms.
The algorithm was designed and tested using the Wolfram and Julia languages, as well as Microsoft Excel. The final results (presented in Table 2) were computed in the Wolfram and R languages on a 6th Generation Intel Core i7-6700HQ. Training and testing each investment rule for a given decile portfolio required approximately 20 seconds. Panel A of Table 2 shows the out-of-sample Fama and French (2015) alphas that the algorithm achieved for the ten equal-weighted size-decile portfolios, while Panel B shows the same for the value-weighted size-decile portfolios. To demonstrate that abnormal returns can survive transaction costs, a transaction cost of 1 basis point per trade was used for calculation as per Eq. (1). Given that this cost was recognized by Balduzzi and Lynch (1999) as the lower limit for transaction costs in the 1990s, it is highly probable that big institutional investors can now trade below that cost. (In any case, Appendix A in the Supplementary Material develops a sensitivity analysis of alpha in terms of transaction costs for further transparency.) The results clearly show alphas of significant economic importance in almost every portfolio, except for those with the largest firms. This could be because the excess returns of the arbitrage portfolios are less volatile, and therefore less suitable for a market-timing strategy. The alphas found in the equal-weighted size-decile portfolios grew from 0.09% to 12.12% annually as firm size decreased (Table 2, Panel A), while they grew from −3.73% to 12.03% annually as firm size decreased for the value-weighted portfolios (Table 2, Panel B).
However, trends were not monotonic. Using equal-weighted size-decile portfolios (Table 2, Panel A), the algorithm did not return significant alphas for size deciles 9 and 10, those featuring the largest firms. For size deciles 2-8, alphas ranged from 5.49% annually for decile 3 to 8.70% annually for decile 7. The alpha calculated increased to over 12% annually for size decile 1, which featured the smallest firms. For the value-weighted size-decile portfolios (Table 2, Panel B), again, the trend was not monotonic. The alpha returned by the algorithm for size decile 9 was not significant, while the alpha for size decile 10 was negative. Between size deciles 7 and 8, the alpha dropped from 7.40% annually for decile 7 to 4.06% annually for decile 8. Between size deciles 2 and 6, alphas ranged from 5.13% annually for size decile 3 to 7.19% annually for size decile 6. Alphas grew to 12.03% annually for the size decile 1 portfolio, which featured the smallest firms.
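The betas for the market and size factors were almost invariably negative for both the equal-weighted (Table 2, Panel A) and value-weighted (Table 2, Panel B) portfolios. As a robustness check, the alphas were also estimated using the Carhart (1997) four-factor model, that is, from the regression
$$IR_{jt} = \alpha_j + \beta_{j,\mathrm{MKT}}\, r_{\mathrm{MKT},t} + \beta_{j,\mathrm{SMB}}\, r_{\mathrm{SMB},t} + \beta_{j,\mathrm{HML}}\, r_{\mathrm{HML},t} + \beta_{j,\mathrm{MOM}}\, r_{\mathrm{MOM},t} + \varepsilon_{jt},$$
where $r_{\mathrm{MKT},t}$ is the daily excess return of the market, $r_{\mathrm{SMB},t}$ is the daily return of the small-minus-big (SMB) factor related to size, $r_{\mathrm{HML},t}$ is the daily return of the high-minus-low (HML) factor related to growth, and $r_{\mathrm{MOM},t}$ is the momentum factor. Panel C of Table 2 shows the out-of-sample results for equal-weighted size-decile portfolios, while Panel D shows the same for value-weighted size-decile portfolios under Carhart's model.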
The results show that the algorithm is not only capable of reproducing the predictability anomaly using those factors but is also capable of obtaining even (economically) higher alphas for eight of the ten portfolios when considering equal-weighted size-decile portfolios (Table 2, Panel C) and for ten of the ten portfolios when considering value-weighted size-decile portfolios (Table 2, Panel D).

Table 1 (Panel B). Note: Summary statistics for the daily returns of value-weighted size-decile portfolios for the testing sample between May 22, 1991, and April 30, 2019. *, **, and *** indicate a significant mean at the 0.1, 0.05, and 0.01 levels, respectively. "St. Deviation" stands for "Standard Deviation".

Table 2
Panel A. Fama and French (2015) algorithmic alphas for each equal-weighted size decile portfolio
Note: A Fama and French (2015) time series regression on the algorithmic arbitrage market timing investment (long in the algorithm and short in the underlying decile portfolio) was conducted for each equal-weighted size decile portfolio for the testing sample between May 22, 1991, and April 30, 2019. For each decile, the algorithm was trained on data from July 1, 1963, to June 6, 1991. A transaction cost of 1 bp was assumed for training and testing. The 30-day T-bill is used as the risk-free asset, and one of the ten NYSE/AMEX/NASDAQ equal-weighted market-cap size-decile portfolios is used as the risky asset. The betas correspond to each of the Fama and French (2015) factors: MKT represents the excess market return factor, SMB represents the small-minus-big factor related to size, HML represents the high-minus-low factor related to growth, RMW represents the robust-minus-weak factor related to operating profitability, and CMA represents the conservative-minus-aggressive factor related to investment aggressiveness. Alphas are annualized and presented as percentages. Bootstrap p-values based on 5,000 bootstrap replications are in brackets (see Appendix B in the Supplementary Materials for details). *, **, and *** indicate significance at the 0.1, 0.05, and 0.01 levels, respectively. Obs. and Adj. $R^2$ represent the number of observations and the adjusted coefficient of determination.

Panel B. Fama and French (2015) algorithmic alphas for each value-weighted size decile portfolio
Note: A Fama and French (2015) time series regression on the algorithmic arbitrage market timing investment (long in the algorithm and short in the underlying decile portfolio) was conducted for each value-weighted size decile portfolio for the testing sample between May 22, 1991, and April 30, 2019. For each decile, the algorithm was trained on data from July 1, 1963, to June 6, 1991. A transaction cost of 1 bp was assumed for training and testing. The 30-day T-bill is used as the risk-free asset, and one of the ten NYSE/AMEX/NASDAQ value-weighted market-cap size-decile portfolios is used as the risky asset. The betas correspond to each of the Fama and French (2015) factors: MKT represents the excess market return factor, SMB represents the small-minus-big factor related to size, HML represents the high-minus-low factor related to growth, RMW represents the robust-minus-weak factor related to operating profitability, and CMA represents the conservative-minus-aggressive factor related to investment aggressiveness. Alphas are annualized and presented as percentages. Bootstrap p-values based on 5,000 bootstrap replications are in brackets (see Appendix B in the Supplementary Materials for details). *, **, and *** indicate significance at the 0.1, 0.05, and 0.01 levels, respectively. Obs. and Adj. $R^2$ represent the number of observations and the adjusted coefficient of determination.

Panel C. Carhart (1997) algorithmic alphas for each equal-weighted size decile portfolio
Note: A Carhart (1997) time series regression on the algorithmic arbitrage market timing investment (long in the algorithm and short in the underlying decile portfolio) was conducted for each equal-weighted size decile portfolio for the testing sample between May 22, 1991, and April 30, 2019. For each decile, the algorithm was trained on data from July 1, 1963, to June 6, 1991. A transaction cost of 1 bp was assumed for training and testing. The 30-day T-bill is used as the risk-free asset, and one of the ten NYSE/AMEX/NASDAQ equal-weighted market-cap size-decile portfolios is used as the risky asset. The betas correspond to each of the Carhart (1997) factors: MKT represents the excess market return factor, SMB represents the small-minus-big factor related to size, HML represents the high-minus-low factor related to growth, and MOM represents the momentum factor. Alphas are annualized and presented as percentages. Bootstrap p-values based on 5,000 bootstrap replications are in brackets (see Appendix B in the Supplementary Materials for details). *, **, and *** indicate significance at the 0.1, 0.05, and 0.01 levels, respectively. Obs. and Adj. $R^2$ represent the number of observations and the adjusted coefficient of determination.

Panel D. Carhart (1997) algorithmic alphas for each value-weighted size decile portfolio
Note: A Carhart (1997) time series regression on the algorithmic arbitrage market timing investment (long in the algorithm and short in the underlying decile portfolio) was conducted for each value-weighted size decile portfolio for the testing sample between May 22, 1991, and April 30, 2019. For each decile, the algorithm was trained on data from July 1, 1963, to June 6, 1991. A transaction cost of 1 bp was assumed for training and testing. The 30-day T-bill is used as the risk-free asset, and one of the ten NYSE/AMEX/NASDAQ value-weighted market-cap size-decile portfolios is used as the risky asset. The betas correspond to each of the Carhart (1997) factors: MKT represents the excess market return factor, SMB represents the small-minus-big factor related to size, HML represents the high-minus-low factor related to growth, and MOM represents the momentum factor. Alphas are annualized and presented as percentages. Bootstrap p-values based on 5,000 bootstrap replications are in brackets (see Appendix B in the Supplementary Materials for details). *, **, and *** indicate significance at the 0.1, 0.05, and 0.01 levels, respectively. Obs. and Adj. $R^2$ represent the number of observations and the adjusted coefficient of determination.

General Note: The alphas in italics are significantly positive (abnormal) in the respective asset pricing model.
As firm size decreased, the alphas found for the equal-weighted size decile portfolios grew from 0.46% to 12.58% annually (Table 2, Panel C); for value-weighted portfolios, they grew from −3.67% to 13.03% (Table 2, Panel D). For the equal-weighted size decile portfolios ( Table 2, Panel C), the alpha calculated was insignificant for the two size decile portfolios featuring the largest firms. The alpha dropped monotonically between size decile portfolios 7 and 10, from 9.47% annually for 7 to 0.46% annually for 10. Between portfolios with size decile 2 and 6, the alpha ranged from 5.37% annually for size decile 3 to 7.69% annually for size decile 5. The alpha calculated reached over 12.5% annually for the portfolio featuring the smallest firms.
For the value-weighted portfolios (Table 2, Panel D), the alpha was negative or not significantly different from zero for the two portfolios with the largest firms. For size deciles 2 through 8, the alphas calculated by the algorithm ranged from 5.17% annually for size decile 8 to 8.20% annually for size decile 2. The algorithm obtained an alpha of over 13% annually for the portfolio with the smallest firms.
As with the Fama and French (2015) model, negative exposure (beta) to the market and size factors was found to be predominant for both the equal-weighted (Table 2, Panel C) and value-weighted portfolios (Table 2, Panel D). This, in turn, suggests that the algorithm constitutes a valuable hedge against exposure to these factors. The algorithm is also a valuable hedge against exposure to the book-to-market HML factor in three of the equal-weighted portfolios (Table 2, Panel C). In the case of the value-weighted portfolios, the algorithm also returned negative momentum betas for nine of the ten portfolios (Table 2, Panel D). In contrast, it returned positive momentum betas in five of the ten equal-weighted portfolios and betas statistically indistinguishable from zero in the other five (Table 2, Panel C).
In summary, the algorithm not only delivers statistically and economically significant positive risk-adjusted returns (alphas) for most portfolios under both the Fama and French (2015) and Carhart (1997) models but is also a valuable hedge against the market and size factors. In the case of Carhart's model, the algorithm also provides a valuable hedge against the momentum factor when value-weighted portfolios are considered.
In simpler terms, the algorithm, applied to different portfolios ordered by size, was able to generate an investment that greatly outperformed the buy-and-hold strategy out-of-sample, even after accounting for transaction costs and adjusting returns for the algorithm's exposure to some of the most widely employed risk factors. Furthermore, exposure to these risk factors was generally low or negative, implying that when risk due to these factors is high (for example, when the market is trending downward), the algorithm performs even better. It should be noted that these risk-adjusted returns were obtained directly from the algorithm's design and, thus, should be generalizable to other portfolios containing time-series patterns. Additionally, the algorithm is simpler and much faster than alternatives, such as those using genetic programming (cf. Brogaard & Zareei, 2018). Together, these results strengthen the notion that this algorithm represents a competitive choice for building market-timing strategies.

Conclusions
This paper has introduced a novel algorithm for automatically discovering anomalies based on the time-series predictability of asset returns. The proposed algorithm delivers a market timing strategy that decides, every day, whether to hold a risky asset or a riskless asset. The trading rule behind this decision is represented by a parametric perceptron function of past returns. The algorithm then directly optimizes the trading rule parameters for maximum alpha using differential evolution in-sample. In contrast to forecast-based trading rules, which minimize forecast error, or technical analysis trading rules, which often reflect traditional or subjective interpretations, the proposed algorithm can automatically accommodate any exploitable time-series patterns of returns for optimal risk-adjusted returns. Given its design, the algorithm can also incorporate transaction costs in the trading rule representation, which further distinguishes it from conventional technical analysis or prediction-based strategies, and deliver optimal alphas in the presence of transaction costs.
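The pipeline summarized above (perceptron rule on past returns, cost-aware strategy returns, alpha as the objective, differential evolution as the optimizer) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the number of lags, the input scaling, the single stand-in factor, the synthetic AR(1) returns, the population size, and the DE/rand/1/bin settings are all hypothetical choices, as are the function names. Only the 1 bp transaction cost comes from the reported experiments.

```python
import numpy as np

COST = 0.0001   # 1 bp per switch, as in the reported experiments
SCALE = 100.0   # assumption: feed returns in percent so weights and bias share a scale


def positions(params, returns, lags):
    """Perceptron rule: hold the risky asset on day t iff w . past + b > 0."""
    w, b = params[:lags], params[lags]
    # Row i holds returns[i : i + lags], the inputs for the decision on day i + lags.
    lagged = np.column_stack(
        [returns[k:len(returns) - lags + k] for k in range(lags)]
    )
    signal = (SCALE * lagged @ w + b > 0).astype(float)
    return np.concatenate([np.zeros(lags), signal])  # start in the risk-free asset


def neg_alpha(params, returns, rf, factors, lags, cost=COST):
    """Negative daily alpha of (strategy minus underlying), net of switching costs."""
    pos = positions(params, returns, lags)
    switches = np.abs(np.diff(pos, prepend=0.0))
    strategy = pos * returns + (1.0 - pos) * rf - cost * switches
    y = (strategy - returns)[lags:]  # long the rule, short the underlying
    X = np.column_stack([np.ones(len(y)), factors[lags:]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return -coef[0]


def differential_evolution(f, dim, rng, pop_size=20, gens=30, F=0.8, CR=0.9):
    """Textbook DE/rand/1/bin minimizer; returns (best vector, best value)."""
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    fit = np.array([f(x) for x in pop])
    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True  # keep at least one mutant component
            trial = np.where(cross, mutant, pop[i])
            f_trial = f(trial)
            if f_trial <= fit[i]:  # greedy selection: never accept a worse vector
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmin(fit))
    return pop[best], fit[best]


# Synthetic in-sample data: AR(1) returns and one noisy stand-in factor.
rng = np.random.default_rng(42)
T, LAGS = 600, 3
noise = rng.normal(0.0, 0.01, T)
returns = np.zeros(T)
for t in range(1, T):
    returns[t] = 0.3 * returns[t - 1] + noise[t]
factors = (returns + rng.normal(0.0, 0.01, T)).reshape(-1, 1)
rf = 0.0001  # constant daily risk-free return (assumption)


def objective(p):
    return neg_alpha(p, returns, rf, factors, LAGS)


best_params, best_value = differential_evolution(
    objective, LAGS + 1, np.random.default_rng(7)
)
best_daily_alpha = -best_value
```

Because the objective is the alpha itself rather than a forecast error, any change to the benchmark model only requires swapping the factor matrix passed to `neg_alpha`. An out-of-sample evaluation would apply `best_params` to data not used in the search.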
To demonstrate its capabilities, the algorithm was applied to size-decile portfolios representing a cross-section of the U.S. stock market, identifying an unreported out-of-sample anomaly using two of the most popular modern asset-pricing models and using different weighting schemes to construct the portfolios. The trading strategies obtained reasonably good (even as high as 12% annually) risk-adjusted performance in terms of the alpha for almost all of the examined portfolios.
One of the algorithm's current limitations concerns the complexities of establishing a market inefficiency. When testing efficiency, it is well-known that the rejection of efficiency can be caused by the market being truly inefficient or by the wrong asset-pricing model having been used to define normal returns. That is, reported abnormal returns might be the product of true exploitable predictability or the use of an inappropriate set of risk factors. Fortunately, if a new set of risk factors is required, and if exploitable patterns remain, the algorithm is sufficiently flexible to identify abnormal returns using the new set.
A second limitation of the algorithm is its use of a perceptron instead of a more general neural network architecture. Because the aim was to build a minimal working example, a perceptron was chosen, which also helped avoid overfitting. Nonetheless, we are currently working towards generalizing the algorithm to other architectures, which may ultimately produce even better results. Despite this limitation, the present study provides a new avenue for research by offering the possibility of an algorithm that can both calculate investment strategies that optimize the alpha and automatically search for out-of-sample anomalies.
A direct line of future research might involve generalizing the linear trading rule that determines purchase-and-sales orders [ ] to even more general nonlinear alternatives to assess whether such alternatives tend towards overfitting. Furthermore, the input variables used to construct the trading strategies could easily be extended beyond previous returns to incorporate other market variables. The algorithm could also be used to optimize other risk-adjusted measures, such as the Sharpe ratio or even total returns. Additionally, the algorithm is not restricted to the data selected for this study. Instead, its design makes it generalizable to other portfolios featuring exploitable trading patterns.
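Optimizing a different risk-adjusted measure amounts to changing the function that the optimizer scores. As an illustration only, an annualized Sharpe ratio of daily strategy returns could be computed as below; the 252-day annualization, the use of the sample standard deviation, and the function name `annualized_sharpe` are assumptions rather than choices made in this study.

```python
import math
import statistics


def annualized_sharpe(strategy_returns, risk_free_returns, periods_per_year=252):
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - f for r, f in zip(strategy_returns, risk_free_returns)]
    sd = statistics.stdev(excess)
    if sd == 0.0:
        raise ValueError("Sharpe ratio is undefined for constant excess returns")
    return statistics.mean(excess) / sd * math.sqrt(periods_per_year)


# Toy daily series with a zero risk-free rate.
example_sharpe = annualized_sharpe([0.01, -0.01, 0.02, 0.0], [0.0, 0.0, 0.0, 0.0])
```

A minimizer would score the negative of this value, in the same way that it would score the negative of the alpha.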