An Inside Look:
The following paragraphs are a follow-up to the post, “Quant Strategies: Will Returns Seen in Research Be Achieved?”
There is a reason why passive index fund assets have either surpassed traditional active and quant fund assets or will do so within two years. A majority of such funds carry diversification requirements in their strategies, cannot beat their benchmarks, and as a result have suffered continuous redemptions in recent years. But more central to this discussion, the “modern” models utilizing AI (machine learning) and big data have become both the trend in the asset management industry and the Achilles’ heel of quant funds.
While there are exceptions among underperforming quant funds, such as the granddaddy of the industry, the highly successful Renaissance Technologies, per Bank of America only 12% of US large-cap quant funds, investing in the broadest, most liquid securities market in the world, performed better than their benchmark for the year.
These same market conditions have slowed hedge fund growth. Failing to beat the market has brought about eight consecutive months of hedge fund closings, a record in fund outflows, and the fewest hedge fund launches since 2000.
Because of these enormous shifts in fund management cash flows, passive fund managers now control close to 50% of all US-domiciled fund assets, up from only 20% in 2009.
Inside the S&P 500, the percentage of stocks that beat the benchmark for the decade was 32%, versus practically half in the earlier decade. As it turns out, only a handful of stocks produced the majority of last year’s S&P gains: Apple and Microsoft accounted for nearly 20% of the market’s entire 2019 gain.
This means that, while traditional active fund managers allocated to Apple and Microsoft, they were unable to overweight these stocks given their investment policies, which typically require that no more than 3-5% of fund assets be invested in a single issue. This provides diversification benefits but leaves a fund unable to outperform its market-cap-weighted benchmark when a significant majority of the index’s performance is tied to a handful of stocks.
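The arithmetic behind that constraint can be sketched with hypothetical numbers (the weights and returns below are illustrative, not actual index or fund data):

```python
# Illustrative arithmetic: a stock that is 10% of a cap-weighted index,
# but capped at 4% in a fund by a single-issue concentration policy.
index_weight, fund_cap = 0.10, 0.04
stock_return = 0.80   # assumed annual return of the outperformer
other_return = 0.05   # assumed return of everything else

# Each portfolio's return is its weighted sum of the two pieces.
index_return = index_weight * stock_return + (1 - index_weight) * other_return
fund_return = fund_cap * stock_return + (1 - fund_cap) * other_return

# The fund trails purely because policy forbids matching the index weight.
print(f"index: {index_return:.1%}  fund: {fund_return:.1%}")
```

With these toy numbers the index earns 12.5% while the policy-constrained fund earns 8.0%, even though both hold the same winner.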
AI, the Driving Hiring Trend:
In October 2017, Bloomberg took a look at Wall Street’s hiring trend and AI in their article “Want to Protect Your Wall Street Job From Robots? Learn How to Code.”
It focused on Bank of America, Citibank, Goldman, and Wells Fargo. The accompanying chart depicts the trend, which Bloomberg summarized: “What’s clear is that few if any jobs in finance will be untouched as firms build computer systems capable of handling everything from routine tasks to trading and investing.”
Here’s the crux of what has happened to the asset management industry. Hiring has centered on AI, and specifically on hiring coders who trade rather than traders who code. The emphasis shifted to candidates with coding experience who know the techniques applicable to backtesting algorithmic code.
Here are some other articles related to the hiring trend and AI over the past three years.
February 26, 2017, Newsweek, “Goldman Sacked: How Artificial Intelligence Will Transform Wall Street“
December 21, 2017, CFA Institute, “Speaking Data Science with an Investment Accent“
October 7, 2018, Financial Times, “JPMorgan’s requirement for new staff: coding lessons.”
Investopedia captures the culmination of the hiring trend on Wall Street, detailing the AI asset management role in its post “Quants: The Rocket Scientists of Wall Street.” It describes the education and training required to land a job as one, noting, “Most of their time is spent working with computer code and numbers on a screen.”
The Achilles’ heel – Machine Learning:
The growth of AI in asset management has come in the form of algorithms. Asset management is one of the top industries where information-based decision making is becoming prevalent, and by far one of the most intensely studied topics among data scientists is “stock prediction.” Data such as stock prices have given data scientists and the asset management industry the impression that they are ideal candidates for prediction and forecasting.
There is no doubt about it: artificial intelligence in asset management, straddling specialized abilities in programming, math, and investment management, is sophisticated work.
To comprehend the need for caution, you have to understand the information gathering and development process. Frequently utilized inputs include GDP and other economic data, along with fundamental and price data on stocks. The primary issue with these types of data is referred to as serial correlation, also called sequential connection. Basically, it means that forward-looking price forecasts are highly associated with past price sequences. Using this technique to forecast is like driving a car using your rearview mirror.
To see this, one only needs to look at a model of day-to-day stock prices. One can easily see that the closing price for each day is firmly anchored to the earlier day’s closing price, putting aside a “minor” deviation. Remember, prices are a representation of a company’s equity, which is grounded in the accounting equation of a company’s net worth (assets versus liabilities), plus measures of investor sentiment and expectations for growth. The fundamentals of the company reflected in the stock price do not change much over time, especially day to day; therefore, prices do not change much on a daily basis.
For a machine-learning algorithm, this implies that a model can seem to perform sensibly well on return and loss forecasts simply by utilizing the earlier day’s price as the expectation of the present day’s price. This is unmistakably seen when the plotted price prediction closely mirrors a moving average, since it draws from the previous day’s price.
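A minimal sketch of this illusion, using a simulated random-walk price series (the price level and volatility parameters are assumptions for illustration): the naive “persistence” forecast that simply repeats yesterday’s close produces a deceptively small error while containing no information about future returns.

```python
import random

random.seed(42)

# Simulate a random-walk daily price series: each close is the prior
# close plus a small percentage move, as described in the text.
prices = [100.0]
for _ in range(500):
    prices.append(prices[-1] * (1 + random.gauss(0, 0.01)))

# Naive persistence "model": tomorrow's price = today's price.
predictions = prices[:-1]
actuals = prices[1:]

# Root Mean Square Error of the persistence forecast.
rmse = (sum((p - a) ** 2 for p, a in zip(predictions, actuals))
        / len(actuals)) ** 0.5

# Relative to the price level, the error looks tiny, yet the "model"
# has learned nothing; it only exploits serial correlation.
mean_price = sum(actuals) / len(actuals)
print(f"RMSE: {rmse:.2f} ({100 * rmse / mean_price:.1f}% of mean price)")
```

The low RMSE here is an artifact of serially correlated prices, not forecasting skill.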
In a comparable vein is the issue of stationarity. Many AI models assume that the parameters of price variance are constant, meaning the mean and standard deviation of the prices show no trend and do not change over time. But this is not the case in the real world.
One regularly utilized procedure to manage the issue is differencing: take the return between two distinct periods rather than the price level itself. The objective is to make the data “stationary” for ARIMA analysis. If we apply this to trending macroeconomic data, we get something that begins to look like a typical “independent and identically distributed” variable.
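A toy illustration of the differencing step, with an assumed upward-drifting price series: the price levels trend over time, while the period-to-period returns hover around a stable mean, which is the property ARIMA-style models require.

```python
import random

random.seed(7)

# Assumed toy setup: a trending (non-stationary) price series with
# a persistent upward drift plus daily noise.
prices = [100.0]
for _ in range(1000):
    prices.append(prices[-1] * (1 + 0.002 + random.gauss(0, 0.01)))

# First differences expressed as returns: the standard transform
# applied before fitting an ARIMA-style model.
returns = [(b - a) / a for a, b in zip(prices, prices[1:])]

def mean(xs):
    return sum(xs) / len(xs)

# Compare the first and second halves of each series: price levels
# drift upward, while returns keep a roughly constant mean.
half = len(returns) // 2
print("price means: ", mean(prices[:half]), mean(prices[half:]))
print("return means:", mean(returns[:half]), mean(returns[half:]))
```

The half-sample means make the contrast visible: the level series is non-stationary, the differenced series is approximately stationary.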
Most supervised machine learning techniques are fitted by estimating or optimizing a set of weights that minimize some objective function. In regression exercises, this function is regularly Root Mean Square Error (RMSE); in classification, cross-entropy. Minimizing such error has become the sole focus of many data scientists’ exercises.
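For concreteness, both objectives can be computed in a few lines (the predictions, probabilities, and labels below are made-up toy values):

```python
import math

# Regression objective: Root Mean Square Error between predictions
# and targets (toy values).
preds, targets = [1.1, 1.9, 3.2], [1.0, 2.0, 3.0]
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets))
                 / len(preds))

# Classification objective: binary cross-entropy between predicted
# probabilities and 0/1 labels (toy values).
probs, labels = [0.9, 0.2, 0.8], [1, 0, 1]
xent = -sum(l * math.log(p) + (1 - l) * math.log(1 - p)
            for l, p in zip(labels, probs)) / len(labels)

print(f"RMSE={rmse:.3f}  cross-entropy={xent:.3f}")
```

Both numbers measure fit to the data; neither says anything about drawdowns or the cost of being wrong, which is the point of the paragraphs that follow.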
In investing, finding a model that buys and sells with 95% precision would appear to be an incredible result; however, that accuracy customarily won’t reflect true portfolio performance, because it ignores the effect of a significant loss in a short period of time.
While the chance of loss may only be 5%, it is quite possible a strategy will endure a total wipeout of the gains accumulated by being right 95% of the time. This issue is much more significant for fund managers who suffer a big loss in a month or quarter, which subsequently results not only in a decline in performance but also in fund withdrawals. Data scientists who have never managed money professionally often lose sight of the balance required between returns and absolute losses greater than 10% when managing assets.
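A deterministic sketch of the arithmetic (the +1%/-30% payoffs are assumptions chosen only to illustrate the asymmetry): a strategy that is right 95% of the time can still destroy capital.

```python
# Deterministic illustration: right 95% of the time with small wins,
# wrong 5% of the time with a large loss (payoffs are hypothetical).
WIN_RETURN = 0.01     # +1% on each winning trade
LOSS_RETURN = -0.30   # -30% on each losing trade

equity = 1.0
wins = 0
trades = 200
for t in range(trades):
    if t % 20 == 19:            # every 20th trade is the 5% tail loss
        equity *= 1 + LOSS_RETURN
    else:
        equity *= 1 + WIN_RETURN
        wins += 1

accuracy = wins / trades        # 0.95, yet the account is deeply underwater
print(f"accuracy: {accuracy:.0%}, final equity multiple: {equity:.2f}")
```

With these payoffs, 190 small wins compound to roughly 6.6x, but the ten -30% losses multiply by about 0.03, leaving the account at a fraction of its starting capital despite 95% accuracy.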
One simply needs to look at the dynamics of losses greater than 10% to understand how using “modern” models wiped out Long-Term Capital Management.
What Buffett had to say about the collapse of LTCM is fascinating:
The whole LTCM is really fascinating because if you take Larry Hilibrand, Eric Rosenfeld, John Meriwether and the two Nobel prize winners. If you take the 16 of them, they have about as high an IQ as any 16 people working together in one business in the country, including Microsoft. An incredible amount of intellect in one room. Now you combine that with the fact that those people had extensive experience in the field they were operating in. These were not a bunch of guys who had made their money selling men’s clothing and all of a sudden went into the securities business. They had in aggregate, the 16, had 300 or 400 years of experience doing exactly what they were doing and then you throw in the third factor that most of them had most of their very substantial net worths in the businesses. Hundreds and hundreds of millions of their own money up (at risk), super high intellect and working in a field that they knew. Essentially they went broke. That to me is absolutely fascinating.

(Buffett Lecture at the University of Florida School of Business, October 15, 1998)
There is a historical backdrop to the advancement of crypto markets and AI, which can be broken into distinct periods.
Before the Hype (up to 2016):
The correlation between cryptocurrency prices was exceptionally loose. Specifically, altcoin prices were only loosely correlated to Bitcoin; only Litecoin, Monero, and Dash showed any relationship to Bitcoin.
The Bull Run (2017):
Prices of practically every altcoin moved with Bitcoin. The correlation shot up drastically, with Ripple demonstrating some resistance to Bitcoin prices.
The Market Crash (2018):
Price patterns changed gears as positive market sentiment around cryptocurrency dwindled. Bitcoin prices tumbled from $17,000 to around $6,600. During this time, the correlation between cryptocurrency prices was strong, with the weight of Bitcoin’s price drops pulling everything down.
Could AI help (or even supplant) a human in distinguishing a crash?
AI clustering has been a widely accepted technique for identifying patterns in cryptocurrency markets.
At first look, when doing a free-for-all scatter plot, you can recognize some correlations between cryptos and a few areas of ‘clusters.’ Utilizing different clustering techniques through unsupervised AI, one can recognize patterns from more recent periods. A simple trick is to reduce the dimensionality of the entire crypto world into a 2-dimensional problem and then cluster. Python permits you to do this with two or three lines of code, and AI procedures can recognize the various clusters entirely on their own.
However, because clustering is an unsupervised AI technique, it can’t identify a bubble. All that it can do is determine if a set of data points look similar. A human is still needed to understand whether it is a bubble, a bull trend (see red and green clusters), or a stagnant trend (gold line).
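The pipeline just described can be sketched as follows. Note that the returns matrix is synthetic (a planted group of co-moving coins plus independent ones), and the library choices (scikit-learn’s PCA and KMeans) are one reasonable option rather than a prescribed toolkit.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for daily returns: rows are days, columns are coins.
# Five coins co-move with a common "market" factor; five are independent.
n_days = 250
market = rng.normal(0, 0.02, n_days)
correlated = np.column_stack([market + rng.normal(0, 0.005, n_days)
                              for _ in range(5)])
independent = rng.normal(0, 0.02, (n_days, 5))
returns = np.hstack([correlated, independent])    # shape (250, 10)

# The "simple trick": reduce each coin (a column) to 2 dimensions,
# then let an unsupervised algorithm find the clusters on its own.
coords = PCA(n_components=2).fit_transform(returns.T)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coords)

print("cluster labels per coin:", labels)
```

The algorithm separates the co-moving group from the rest, but it has no concept of a bubble; labeling what a cluster means remains a human judgment.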
We can depend solely on models and backtests at our own risk. Your level of intelligence may not protect you. Experience may not protect you. Individuals contributing their own cash with you may not protect you. In the event that these things can’t protect you, what can? Maybe maintaining a strategic distance from overconfidence. Overconfidence in the form of an approach that doesn’t take into consideration the 5% chance of a significant loss. Overconfidence that if the 5% chance of loss occurs, the loss will be small.
The exercise here for the quantitative researcher is simple: comprehend the ramifications of your model and move away from being guided exclusively by error rates. Perform walk-forward testing as though you held your predicted trades through time, explore losses, and conduct sensitivity tests that punish your model for being wrong. Or better yet, don’t rely on model calculations as the sole base for the approach.
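A minimal walk-forward harness, under assumed toy data and a placeholder trailing-mean signal, shows the shape of the loop: fit only on past data, trade the next out-of-sample day, and track drawdown as well as return.

```python
import random

random.seed(3)

# Assumed toy data: 300 daily returns with a slight positive drift.
returns = [random.gauss(0.0003, 0.01) for _ in range(300)]

# Walk-forward loop: the "model" sees only a trailing window of past
# data, then trades the next out-of-sample day and rolls forward.
WINDOW = 60
equity, peak, worst_drawdown = 1.0, 1.0, 0.0

for t in range(WINDOW, len(returns)):
    train = returns[t - WINDOW:t]
    # Placeholder signal: long when the trailing mean return is positive.
    signal = 1 if sum(train) / len(train) > 0 else 0
    equity *= 1 + signal * returns[t]
    peak = max(peak, equity)
    worst_drawdown = min(worst_drawdown, equity / peak - 1)

print(f"final equity: {equity:.3f}, max drawdown: {worst_drawdown:.1%}")
```

Reporting maximum drawdown alongside final equity is exactly the kind of loss-aware sensitivity check the paragraph above calls for.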
Just having a decent investing model that shows a positive return isn’t sufficient; to comprehend why, we have to consider the risk-adjusted gain forgone from alternatives. So the question must be posed: if the performance of your model, when adjusted for trading costs, doesn’t substantially surpass, say, the S&P 500, are you truly putting your cash to good use? Portfolio managers are bound to a benchmark, and their skill is measured by surpassing it. Someone relying on a model to test an approach needs to realize that a simple machine learning model is not enough to know or prove that it will actually work in the real market.
Machine learning’s growth stems from a desire to turn finance into a complex field, like physics, and to use difficult equations to identify solutions. The majority of pure quant asset managers that rely entirely on models fail to compensate for their models’ blind spots, and many data scientists building models with financial and alternative data regularly fall into the trap of being overly impressed with backtesting results.
The high failure rate among quant funds includes smart beta, factor investing, statistical arbitrage, and CTAs. This industry problem can be summed up by the issue of false positives with back-testing.
Psychiatrists have long recognized the capacity of the human mind to find an intricate narrative in an arbitrary data plot. It becomes “obvious” that, by applying sophisticated math techniques to real-world data, one “should” be able to produce models that work in the real world.
In their paper, “Detection of False Investment Strategies Using Unsupervised Learning Methods,” Marcos Lopez de Prado of Cornell and Michael J. Lewis of NYU have composed a solution to weed out the academic papers, articles, and research that promote algorithmic investing techniques and approaches that suffer from false positives.
The False Strategy Theorem:
With David Bailey of the University of California at Davis, they developed the “False Strategy Theorem,” which holds that back-testing makes it consistently easy to discover noteworthy-appearing investing strategies that are merely false positives.
The authors have turned to the theorem for a solution that estimates the “family-wise false positive probability.” The solution has three essential real-world features:
1. Investment returns do not follow a bell curve.
2. It assesses the probability of extreme events with a strategy.
3. It doesn’t assume correlations are stationary.
The solution they came up with is referred to as ONC, for the “optimal number of clusters.” The key to their learning algorithm is clustered trials: given N series, the algorithm partitions the trials into an optimal number of K subgroups, such that each cluster has high intra-cluster correlation and low inter-cluster correlation. To read more, open: “A Data Science Solution to the Multiple-Testing Crisis in Financial Research.”
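A simplified sketch of that idea, not the authors’ exact algorithm: map correlations to distances, score several candidate values of K, and keep the best. The data here is synthetic, with three planted correlated groups so an “optimal number of clusters” is recoverable.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(2)

# Synthetic correlation matrix for N=12 return series built from three
# planted groups (toy data, not the paper's experiments).
N, groups = 12, 3
base = rng.normal(0, 1, (groups, 500))
series = np.vstack([base[i // (N // groups)] * 0.9 + rng.normal(0, 0.5, 500)
                    for i in range(N)])
corr = np.corrcoef(series)

# ONC-style idea (simplified): convert correlation to a distance,
# cluster for several K, and keep the K with the best silhouette,
# so clusters end up tight inside and well separated outside.
dist = np.sqrt(0.5 * (1 - corr))    # standard correlation distance
best_k, best_score = None, -1.0
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(dist)
    score = silhouette_score(dist, labels)
    if score > best_score:
        best_k, best_score = k, score

print("optimal number of clusters:", best_k)
```

On this planted example the search recovers the three groups; the paper’s actual procedure is more involved and should be consulted directly.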
Machine learning only provides some basic statistical techniques to calculate hypothetical performance. Developing a strategy that beats a benchmark requires close comprehension, intuition, and a balance of all the return and risk factors involved: not just value, momentum, and quality, but performance in a crash.
This type of approach may utilize what Boston Consulting Group’s Martin Reeves and Daichi Ueda refer to as an integrated strategy machine, described in the Harvard Business Review as “the collection of resources, both technological and human, that act in concert to develop and execute business strategies. It comprises a range of conceptual and analytical operations, including problem definition, signal processing, pattern recognition, abstraction and conceptualization, analysis, and prediction.”