Market Logic

Is data mining really the problem?

Posted in data mining, finance by mktlogic on August 12, 2009

In the WSJ, Jason Zweig writes,

“The stock market generates such vast quantities of information that, if you plow through enough of it for long enough, you can always find some relationship that appears to generate spectacular returns — by coincidence alone. This sham is known as ‘data mining.’

Every year, billions of dollars pour into data-mined investing strategies. No one knows if these techniques will work in the real world. Their results are hypothetical — based on ‘back-testing,’ or a simulation of what would have happened if the manager had actually used these techniques in the past, typically without incurring any fees, trading costs or taxes.”

I think I agree with the spirit of what Zweig says, but articles like this always bug me for a handful of reasons.

First, any investing thesis is either based on past data or based on no data at all. There are problems with back-testing relative to other ways of using data, but reliance on past data is not the problem.

Second, using the term “data mining” as a slur for sloppy exploratory data analysis is misleading. Most of what Zweig criticizes isn’t strictly data mining, and in fact his recommended alternatives are closer to actual data mining practice.

What Zweig actually seems to criticize are specification searching and parameter searching, and those really are problems. (What are the odds of not getting a t-stat greater than 2 somewhere if you try 50 variations on a model?) But that isn’t data mining.

Zweig does recommend some alternatives, but it’s worth mentioning the alternative implied by textbook statistical analysis: come up with an idea, test it under one specification, and let that be the end of it. I doubt that anyone actually does this. What people actually try to do is come up with an idea and test it under a handful of reasonable-seeming specifications, which runs a high risk of devolving into a statistically sloppy specification search. Or you can do real data mining, i.e. exhaustively or nearly exhaustively testing lots of models and using cross-validation and out-of-sample analysis. So the choice is really between testing a handful of specifications and testing lots of them.
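To put a rough number on that parenthetical, here is a quick simulation (my own illustration, not anything from Zweig’s article): regress pure noise on 50 unrelated noise predictors, one at a time, and count how many produce |t| > 2 by coincidence alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_specs = 100, 50

# The "returns" we are trying to predict: pure noise.
y = rng.standard_normal(n_obs)

significant = 0
for _ in range(n_specs):
    # Each candidate predictor is also pure noise, unrelated to y.
    x = rng.standard_normal(n_obs)
    beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)     # OLS slope
    resid = y - y.mean() - beta * (x - x.mean())      # regression residuals
    sxx = (x - x.mean()) @ (x - x.mean())
    se = np.sqrt(resid @ resid / (n_obs - 2) / sxx)   # slope standard error
    if abs(beta / se) > 2:
        significant += 1

# With a ~5% false-positive rate per specification, the chance of at
# least one "significant" result in 50 tries is about 1 - 0.95**50 ≈ 0.92.
print(f"{significant} of {n_specs} noise models look significant")
```

So a specification search over 50 variations almost always turns up something that looks spectacular and means nothing.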

Zweig actually recommends the use of out-of-sample analysis and giggle testing, a.k.a. asking “Does this idea make sense?”, but I have no idea why he mentions these as alternatives to data mining. Out-of-sample testing is standard practice in data mining. Giggle testing can be used in conjunction with any other approach, but it really just amounts to asking “Is this idea compatible with what I believed yesterday?”
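A small sketch of why out-of-sample testing is part of the data mining toolkit rather than an alternative to it (again my own illustration, on made-up noise data): pick the best-looking of 50 noise “signals” in-sample, then check it on held-out data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_specs = 200, 50

y = rng.standard_normal(n)                 # pure-noise "returns"
X = rng.standard_normal((n_specs, n))      # 50 candidate signals, all noise

half = n // 2
# Pick the signal with the highest in-sample correlation to y...
in_corr = [abs(np.corrcoef(x[:half], y[:half])[0, 1]) for x in X]
best = int(np.argmax(in_corr))
# ...then see how the winner does on the held-out second half.
out_corr = abs(np.corrcoef(X[best][half:], y[half:])[0, 1])

print(f"in-sample |corr| {in_corr[best]:.3f}, out-of-sample |corr| {out_corr:.3f}")
```

The in-sample winner is selected precisely because it looks good, so its held-out performance is the honest estimate.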

Zweig isn’t all wet. In fact, most of what he criticizes as data mining really is worthy of criticism. It just isn’t data mining.

What should we test when we test technical trading rules?

Posted in economics, finance by mktlogic on May 13, 2009

I recently read Mebane Faber’s paper, A Quantitative Approach to Tactical Asset Allocation, which is apparently quite popular on SSRN.

“The purpose of this paper is to present a simple quantitative method that improves the risk-adjusted returns across various asset classes. A simple moving average timing model is tested since 1900 on the United States equity market before testing since 1973 on other diverse and publicly traded asset class indices, including the Morgan Stanley Capital International EAFE Index (MSCI EAFE), Goldman Sachs Commodity Index (GSCI), National Association of Real Estate Investment Trusts Index (NAREIT), and United States government 10-year Treasury bonds. The approach is then examined in a tactical asset allocation framework where the empirical results are equity-like returns with bond-like volatility and drawdown.”

Early in the paper Faber presents a simple and mechanical market timing procedure:

BUY RULE: Buy when monthly price > 10-month SMA.
SELL RULE: Sell and move to cash when monthly price < 10-month SMA.

The remainder of the paper describes the performance of a hypothetical portfolio that adheres to these two rules.
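For concreteness, the two rules fit in a few lines. This is my own minimal sketch, assuming end-of-month prices and ignoring details such as dividends, execution timing, and trading costs that Faber’s paper addresses:

```python
import numpy as np

def sma_timing_positions(monthly_prices, lookback=10):
    """1 = invested, 0 = in cash, per the rule:
    buy when monthly price > lookback-month SMA,
    sell to cash when monthly price < lookback-month SMA."""
    prices = np.asarray(monthly_prices, dtype=float)
    positions = np.zeros(len(prices), dtype=int)
    for t in range(lookback - 1, len(prices)):
        sma = prices[t - lookback + 1 : t + 1].mean()
        positions[t] = 1 if prices[t] > sma else 0
    return positions

# A steadily rising series stays invested once the SMA is defined;
# a steadily falling one stays in cash.
print(sma_timing_positions(range(1, 21)))
```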

Faber’s procedure is, in fact, one instance of a class of procedures of the form “Buy (sell) when the price is above (below) the N-period SMA.” Whenever I see papers like his, I’m curious why the focus is on testing the instance rather than the class. That is, how would the strategy have performed if the wrong lookback had been used?

I’m sure there are plenty of similar classes of procedures based on channel breakouts, trendlines, volatility bands and so on that outperform buy-and-hold over the same period given the right lookback. Such procedures would probably also work well enough with non-optimal lookbacks.
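Testing the class rather than the instance just means sweeping the lookback and looking at the spread of outcomes. The sketch below does this for the SMA rule on a simulated price series (not real market data), so the numbers only illustrate lookback sensitivity, not any real strategy’s merit:

```python
import numpy as np

def sma_rule_return(prices, lookback):
    """Total return of the long-or-cash SMA rule for one lookback."""
    prices = np.asarray(prices, dtype=float)
    rets = prices[1:] / prices[:-1] - 1.0
    total = 1.0
    for t in range(lookback - 1, len(prices) - 1):
        sma = prices[t - lookback + 1 : t + 1].mean()
        if prices[t] > sma:          # invested over the next period, else cash
            total *= 1.0 + rets[t]
    return total - 1.0

# Simulated monthly prices: 50 years of a random walk with drift.
rng = np.random.default_rng(42)
prices = 100 * np.cumprod(1 + rng.normal(0.005, 0.04, 600))

results = {n: sma_rule_return(prices, n) for n in range(2, 25)}
best, worst = max(results.values()), min(results.values())
print(f"best lookback return {best:.2f}, worst {worst:.2f}")
```

Reporting the whole distribution of `results`, rather than the single best entry, is the test of the class.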

Another class of market timing procedures might be “Buy on Jan 1, 1900 and switch back and forth from assets to cash every N days.” I’m very sure that for the right values of N, this timing procedure can generate results better than buy-and-hold. I’m also confident that for the wrong values of N this procedure would work very poorly.
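That class is also trivial to simulate. The sketch below computes the outcome for several values of N on a made-up random-walk series; the point is only that the result depends on N, not the particular numbers:

```python
import numpy as np

def alternating_return(prices, n_days):
    """Hold the asset for n_days, sit in cash for n_days, repeat from day one."""
    prices = np.asarray(prices, dtype=float)
    rets = prices[1:] / prices[:-1] - 1.0
    total = 1.0
    for t, r in enumerate(rets):
        if (t // n_days) % 2 == 0:   # in the market during even-numbered blocks
            total *= 1.0 + r
    return total - 1.0

# Simulated daily prices: ~8 years of a random walk with slight drift.
rng = np.random.default_rng(7)
prices = 100 * np.cumprod(1 + rng.normal(0.0003, 0.01, 2000))
buy_hold = prices[-1] / prices[0] - 1.0

by_n = {n: alternating_return(prices, n) for n in (1, 5, 20, 60, 250)}
print(buy_hold, by_n)
```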

This isn’t a complaint about data mining for the right lookback to make a trading strategy seem better than it is. (To his credit, Faber’s paper includes out-of-sample tests of his procedure.) Rather, knowing how well a market timing procedure works for one particular lookback period is just not that useful. Since it is impossible to know the optimal lookback ahead of time, the relevant question, whether for tests of market efficiency or for active management, is “How well does the class of market timing procedures work when the lookback used is non-optimal?”