Separating signal from noise with Alphalens
Quantopian has open sourced the Python Alphalens library for the performance analysis of predictive stock factors. It integrates well with the Zipline backtesting library and the portfolio performance and risk analysis library pyfolio, which we will explore in the next chapter.
Alphalens facilitates the analysis of the predictive power of alpha factors with respect to the following:
- Correlation of the signals with subsequent returns
- Profitability of an equal- or factor-weighted portfolio based on (a subset of) the signals
- Turnover of factors to indicate the potential trading costs
- Factor performance during specific events
- Breakdowns of the preceding by sector
The analysis can be conducted using tearsheets or individual computations and plots. The tearsheets are illustrated in the online repository to save some space.
Creating forward returns and factor quantiles
To utilize Alphalens, we need to provide two inputs:
- Signals for a universe of assets, like those returned by the ranks of the MeanReversion factor
- The forward returns that we would earn by investing in an asset for a given holding period
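To make the second input concrete, the following minimal sketch computes forward returns from a price table using pandas only. The tickers, prices, and the helper name forward_returns are made up for illustration; Alphalens computes these values internally, and its exact conventions may differ.

```python
import pandas as pd

# Synthetic daily closes for two hypothetical tickers (illustration only)
dates = pd.bdate_range("2017-01-02", periods=8, name="date")
prices = pd.DataFrame(
    {"AAA": [100, 101, 102, 103, 104, 105, 106, 107],
     "BBB": [50, 49, 48, 47, 46, 45, 44, 43]},
    index=dates, dtype=float)
prices.columns.name = "asset"


def forward_returns(prices, periods):
    """Return earned by buying at each date and selling h rows later."""
    columns = {}
    for h in periods:
        # shift(-h) moves the price h rows ahead onto the entry date
        fwd = prices.shift(-h) / prices - 1
        # Long format with a (date, asset) index, one column per holding period
        columns[f"{h}D"] = fwd.stack()
    # Drop entry dates for which no forward return is available at all
    return pd.DataFrame(columns).dropna(how="all")


fwd = forward_returns(prices, periods=(1, 5))
print(fwd.head())
```

Entry dates near the end of the sample have no price far enough ahead, so their longer-horizon returns are missing. This is the same effect that leads Alphalens to report dropped entries in the forward returns computation.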
See the notebook 06_performance_eval_alphalens.ipynb for details.
We will recover the prices from the single_factor.pickle file as follows (and proceed in the same way for factor_data; see the notebook):
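import re
import pandas as pd

# Load the backtest results stored in the pickle file
performance = pd.read_pickle('single_factor.pickle')
# Combine the per-date price Series into one DataFrame (dates in rows, assets in columns)
prices = pd.concat([df.to_frame(d) for d, df in performance.prices.items()], axis=1).T
# Keep only the ticker that appears in square brackets in each column label
prices.columns = [re.findall(r"\[(.+)\]", str(col))[0] for col in prices.columns]
# Set the time component of each timestamp to midnight
prices.index = prices.index.normalize()
prices.info()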
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 755 entries, 2015-01-02 to 2017-12-29
Columns: 1661 entries, A to ZTS
dtypes: float64(1661)
We can generate the Alphalens input data, namely the factor signal and forward returns described previously, in the required format from the Zipline output using the get_clean_factor_and_forward_returns utility function. This function returns the signal quintiles and the forward returns for the given holding periods:
from alphalens.utils import get_clean_factor_and_forward_returns

HOLDING_PERIODS = (5, 10, 21, 42)
QUANTILES = 5
alphalens_data = get_clean_factor_and_forward_returns(factor=factor_data,
                                                      prices=prices,
                                                      periods=HOLDING_PERIODS,
                                                      quantiles=QUANTILES)
Dropped 14.5% entries from factor data: 14.5% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions). max_loss is 35.0%, not exceeded: OK!
The alphalens_data DataFrame contains the returns on an investment in the given asset on a given date for the indicated holding period, as well as the factor value—that is, the asset's MeanReversion ranking on that date and the corresponding quantile value:
The forward returns and the signal quantiles are the basis for evaluating the predictive power of the signal. Typically, a factor should deliver markedly different returns for distinct quantiles, such as negative returns for the bottom quintile of the factor values and positive returns for the top quintile.
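As a rough illustration of how signals can be mapped to quintiles, the sketch below bins synthetic factor values separately for each date with pandas.qcut. The data, the tickers, and the choice to rank before binning (which avoids duplicate bin edges when values tie) are assumptions made for this example and are not necessarily what Alphalens does internally.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2017-01-02", periods=3, name="date")
assets = [f"S{i:02d}" for i in range(10)]  # hypothetical tickers
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor = pd.Series(rng.normal(size=len(index)), index=index, name="factor")


def to_quantile(values, q=5):
    """Assign labels 1 (lowest values) to q (highest values)."""
    # Ranking first guarantees unique bin edges even if values tie
    return pd.qcut(values.rank(method="first"), q, labels=False) + 1


# Bin within each date so that quantiles are comparable across time
factor_quantile = (factor.groupby(level="date")
                   .transform(to_quantile)
                   .astype(int)
                   .rename("factor_quantile"))
print(factor_quantile.groupby(level="date").value_counts().head())
```

Binning per date rather than across the whole sample means that each quintile contains the same number of assets on every day, regardless of how the level of the factor drifts over time.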
Predictive performance by factor quantiles
As a first step, we would like to visualize the average period return by factor quantile. We can use the built-in function mean_return_by_quantile from the performance module and plot_quantile_returns_bar from the plotting module:
from alphalens.performance import mean_return_by_quantile
from alphalens.plotting import plot_quantile_returns_bar
mean_return_by_q, std_err = mean_return_by_quantile(alphalens_data)
plot_quantile_returns_bar(mean_return_by_q);
The result is a bar chart that breaks down the mean of the forward returns for the four different holding periods based on the quintile of the factor signal.
As you can see in Figure 4.9, the bottom quintiles yielded markedly more negative results than the top quintiles, except for the longest holding period:
Figure 4.9: Mean period return by factor quantile
The 10D holding period provides slightly better results for the first and fourth quintiles on average across the trading period.
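The idea behind the mean return by quantile can be sketched with plain pandas on synthetic data. The signal strength, noise level, and tickers below are invented, and subtracting the cross-sectional mean per date is one reasonable choice for this illustration rather than a statement about the library's defaults.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.bdate_range("2017-01-02", periods=250, name="date")
assets = [f"S{i:02d}" for i in range(50)]  # hypothetical tickers
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])

data = pd.DataFrame(index=index)
data["factor"] = rng.normal(size=len(index))
# Synthetic 5-day forward return: a weak signal buried in noise
data["5D"] = 0.002 * data["factor"] + rng.normal(scale=0.02, size=len(index))
# Quintile of the factor within each date (1 = lowest, 5 = highest)
data["factor_quantile"] = (data.groupby(level="date")["factor"]
                           .transform(lambda x: pd.qcut(x, 5, labels=False) + 1))

# Remove the cross-sectional mean so that each quintile is measured
# relative to the universe on the same date
demeaned = data["5D"] - data.groupby(level="date")["5D"].transform("mean")
mean_by_q = demeaned.groupby(data["factor_quantile"]).mean()
print(mean_by_q)
```

With a genuinely predictive factor, the quintile means should increase from the bottom to the top bin, which is the pattern the bar chart is meant to reveal.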
We would also like to see the performance over time of investments driven by each of the signal quintiles. To this end, we calculate daily as opposed to average returns for the 5D holding period. Alphalens adjusts the period returns to account for the mismatch between daily signals and a longer holding period (for details, see the Alphalens documentation):
from alphalens.plotting import plot_cumulative_returns_by_quantile
mean_return_by_q_daily, std_err = mean_return_by_quantile(alphalens_data,
                                                          by_date=True)
plot_cumulative_returns_by_quantile(mean_return_by_q_daily['5D'],
                                    period='5D');
The resulting line plot in Figure 4.10 shows that, for most of this 3-year period, the top two quintiles significantly outperformed the bottom two quintiles. However, as suggested by the previous plot, the signals by the fourth quintile produced slightly better performance than those by the top quintile due to their relative performance during 2017:
Figure 4.10: Cumulative return by quantile for a 5-day holding period
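One way to reconcile a multi-day holding period with daily observations is to convert each period return into the constant daily rate that compounds to it, and then to cumulate those daily rates. The sketch below shows this geometric conversion on made-up numbers; whether it matches the adjustment Alphalens applies in every detail is an assumption, and the helper name to_daily_rate is hypothetical.

```python
import numpy as np
import pandas as pd


def to_daily_rate(period_returns, holding_days):
    """Constant daily return that compounds to the given period return."""
    return (1 + period_returns) ** (1 / holding_days) - 1


# Made-up 5-day returns observed on three consecutive days
five_day = pd.Series([0.0100, -0.0050, 0.0200], name="5D")
daily = to_daily_rate(five_day, holding_days=5)

# Compound the daily rates into a cumulative return path
cumulative = (1 + daily).cumprod() - 1
print(cumulative)
```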
A factor that is useful for a trading strategy shows the preceding pattern, where cumulative returns develop along clearly distinct paths, because this allows for a long-short strategy with lower capital requirements and, correspondingly, lower exposure to the overall market.
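The long-short idea can be illustrated by taking the difference between the returns of the top and bottom quintiles. The daily numbers below are invented, and equal notional on the long and short sides is an assumption of this sketch.

```python
import pandas as pd

# Made-up daily mean returns for three of the quintiles (illustration only)
by_quantile = pd.DataFrame(
    {1: [-0.002, 0.001, -0.001],
     3: [0.000, 0.000, 0.001],
     5: [0.003, 0.001, 0.002]},
    index=pd.bdate_range("2017-01-02", periods=3))

# Long the top quintile, short the bottom quintile, equal notional per side
spread = by_quantile[5] - by_quantile[1]
cumulative_spread = (1 + spread).cumprod() - 1
print(cumulative_spread)
```

Because the short side offsets the long side, the spread depends on how far apart the quintile paths are rather than on the direction of the overall market.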
However, we also need to take the dispersion of period returns into account, rather than just the averages. To this end, we can rely on the built-in plot_quantile_returns_violin:
from alphalens.plotting import plot_quantile_returns_violin
plot_quantile_returns_violin(mean_return_by_q_daily);
This distributional plot, shown in Figure 4.11, highlights that the range of daily returns is fairly wide. Despite different means, the distributions overlap heavily, so that, on any given day, the performance of the different quintiles may differ only slightly:
Figure 4.11: Distribution of the period-wise return by factor quintile
While we focus on the evaluation of a single alpha factor, we simplify matters by ignoring practical issues related to trade execution; we will relax this simplification when we address proper backtesting in the next chapter. Some of these issues include:
- The transaction costs of trading
- Slippage, or the difference between the price at decision and trade execution, for example, due to the market impact
The information coefficient
Most of this book is about the design of alpha factors using ML models. ML is about optimizing some predictive objective, and in this section, we will introduce the key metrics used to measure the performance of an alpha factor. We will define alpha as the average return in excess of a benchmark.
This leads to the information ratio (IR), which measures the average excess return per unit of risk taken by dividing alpha by the tracking risk. When the benchmark is the risk-free rate, the IR corresponds to the well-known Sharpe ratio, and we will highlight crucial statistical measurement issues that arise in the typical case when returns are not normally distributed. We will also explain the fundamental law of active management, which breaks the IR down into a combination of forecasting skill and a strategy's ability to effectively leverage these forecasting skills.
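The definition of the IR translates into a few lines of NumPy. The sketch below is only illustrative: the return figures are invented, the function name is hypothetical, and scaling by the square root of the number of periods per year is a common convention that assumes returns are independent across periods.

```python
import numpy as np


def information_ratio(returns, benchmark, periods_per_year=252):
    """Mean active return divided by its standard deviation (tracking risk)."""
    active = np.asarray(returns, dtype=float) - np.asarray(benchmark, dtype=float)
    alpha = active.mean()              # average return in excess of the benchmark
    tracking_risk = active.std(ddof=1)  # sample standard deviation of active returns
    # Square-root-of-time scaling; assumes independent period returns
    return np.sqrt(periods_per_year) * alpha / tracking_risk


# Made-up per-period returns for a strategy and its benchmark
strategy = [0.02, 0.00, 0.03, 0.01]
benchmark = [0.01, 0.01, 0.01, 0.01]
ir = information_ratio(strategy, benchmark, periods_per_year=1)
print(round(ir, 4))
```

Passing a constant risk-free rate as the benchmark turns the same computation into the Sharpe ratio.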
The goal of alpha factors is the accurate directional prediction of future returns. Hence, a natural performance measure is the correlation between an alpha factor's predictions and the forward returns of the target assets.
It is better to use the non-parametric Spearman rank correlation coefficient, which measures how well the relationship between two variables can be described using a monotonic function, as opposed to the Pearson correlation, which measures the strength of a linear relationship.
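The difference between the two measures shows up clearly for a relationship that is monotonic but not linear. The sketch below uses made-up data and computes the Spearman coefficient as the Pearson correlation of the ranks, which is its definition.

```python
import numpy as np
import pandas as pd

x = pd.Series(np.arange(1, 11, dtype=float))
y = np.exp(x)  # strictly increasing in x, but far from linear

pearson = x.corr(y)                 # strength of the linear relationship
spearman = x.rank().corr(y.rank())  # Pearson correlation of the ranks

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

The rank correlation is perfect because the ordering of y matches the ordering of x exactly, whereas the Pearson coefficient is pulled down by the curvature.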
We can obtain the information coefficient (IC) using Alphalens, which relies on scipy.stats.spearmanr under the hood (see the repo for an example of how to use scipy directly to obtain p-values). The factor_information_coefficient function computes the period-wise correlation and plot_ic_ts creates a time-series plot with a 1-month moving average:
from alphalens.performance import factor_information_coefficient
from alphalens.plotting import plot_ic_ts
ic = factor_information_coefficient(alphalens_data)
plot_ic_ts(ic[['5D']])
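To show what a period-wise IC involves, the following sketch computes a rank correlation between factor values and forward returns for each date, and then smooths it with a 21-day moving average. The data are synthetic, the 21-day window is an assumption standing in for one trading month, and the helper rank_ic is hypothetical rather than part of Alphalens.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.bdate_range("2017-01-02", periods=60, name="date")
assets = [f"S{i:02d}" for i in range(100)]  # hypothetical tickers
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])

data = pd.DataFrame({"factor": rng.normal(size=len(index))}, index=index)
# Synthetic forward return with a weak dependence on the factor
data["5D"] = 0.1 * data["factor"] + rng.normal(size=len(index))


def rank_ic(group):
    """Spearman correlation of factor and forward return on a single date."""
    return group["factor"].rank().corr(group["5D"].rank())


ic = data.groupby(level="date").apply(rank_ic)  # one IC value per date
ic_ma = ic.rolling(window=21).mean()            # roughly one trading month
print(ic.describe())
```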
The time series plot in Figure 4.12 shows extended periods with significantly positive moving average IC. An IC of 0.05 or even 0.1 allows for significant outperformance if there are sufficient opportunities to apply this forecasting skill, as the fundamental law of active management will illustrate:
Figure 4.12: Moving average of the IC for 5-day horizon
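A quick calculation shows why a small IC can still matter. In its usual simplified form, the fundamental law of active management approximates the IR as the IC multiplied by the square root of the breadth, that is, the number of independent bets per year. The breadth values below are hypothetical and serve only to show the orders of magnitude.

```python
import math


def fundamental_law_ir(ic, breadth):
    """Approximate IR as IC times the square root of the number of independent bets."""
    return ic * math.sqrt(breadth)


for ic_value in (0.05, 0.10):
    for breadth in (100, 400):  # hypothetical numbers of independent bets per year
        ir = fundamental_law_ir(ic_value, breadth)
        print(f"IC={ic_value:.2f}, breadth={breadth}: IR ~ {ir:.2f}")
```

The approximation assumes that the bets are independent, which rarely holds exactly for daily signals on correlated assets, so the figures should be read as an upper bound rather than a forecast.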
A plot of the annual mean IC highlights how the factor's performance was historically uneven:
ic = factor_information_coefficient(alphalens_data)
ic_by_year = ic.resample('A').mean()
ic_by_year.index = ic_by_year.index.year
ic_by_year.plot.bar(figsize=(14, 6))
This produces the chart shown in Figure 4.13:
Figure 4.13: IC by year
An information coefficient below 0.05, as in this case, is low but significant and can produce positive residual returns relative to a benchmark, as we will see in the next section. The command create_summary_tear_sheet(alphalens_data) creates IC summary statistics.
The risk-adjusted IC results from dividing the mean IC by the standard deviation of the IC. The mean IC is also subjected to a two-sided t-test with the null hypothesis IC = 0 using scipy.stats.ttest_1samp:
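As a rough illustration, these summary statistics can be reproduced by hand. The IC values below are invented and the function name is hypothetical. The t-statistic is computed directly from its textbook definition; scipy.stats.ttest_1samp would additionally supply the p-value.

```python
import numpy as np


def ic_summary(ic_values):
    """Mean, standard deviation, risk-adjusted IC, and one-sample t-statistic."""
    ic = np.asarray(ic_values, dtype=float)
    mean = ic.mean()
    std = ic.std(ddof=1)  # sample standard deviation
    return {
        "mean": mean,
        "std": std,
        "risk_adjusted": mean / std,
        # t-statistic for the null hypothesis that the mean IC equals zero
        "t_stat": mean / (std / np.sqrt(ic.size)),
    }


summary = ic_summary([0.01, -0.01, 0.02, 0.00])  # made-up IC observations
print(summary)
```

The t-statistic equals the risk-adjusted IC scaled by the square root of the number of observations, so a modest risk-adjusted IC can still be statistically significant over a long enough sample.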
Factor turnover
Factor turnover measures how frequently the assets associated with a given quantile change, that is, how many trades are required to adjust a portfolio to the sequence of signals. More specifically, it measures the share of assets currently in a factor quantile that were not in that quantile in the last period. The following table is produced by this command:
from alphalens.tears import create_turnover_tear_sheet
create_turnover_tear_sheet(alphalens_data)
The share of assets that newly enter a quintile-based portfolio from one period to the next is fairly high, suggesting that trading costs pose a challenge to reaping the benefits from the predictive performance:
An alternative view on factor turnover is the correlation of the asset rank due to the factor over various holding periods, also part of the tear sheet:
Generally, more stability is preferable to keep trading costs manageable.
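Both views on turnover can be sketched on a toy example with four hypothetical assets and three dates. The helper names and data are invented for illustration, and the computations follow the definitions given in the text rather than any particular library implementation.

```python
import pandas as pd

dates = pd.bdate_range("2017-01-02", periods=3)

# Quantile membership per date (rows) and asset (columns); made-up values
quantiles = pd.DataFrame(
    {"A": [5, 5, 1], "B": [5, 1, 5], "C": [1, 5, 5], "D": [1, 1, 1]},
    index=dates)


def quantile_turnover(quantiles, q, lag=1):
    """Share of names in quantile q that were not in it `lag` rows earlier."""
    members = quantiles.eq(q)
    names = [set(members.columns[row]) for row in members.to_numpy()]
    shares = [len(cur - prev) / len(cur)
              for prev, cur in zip(names[:-lag], names[lag:])]
    return pd.Series(shares, index=quantiles.index[lag:])


# Made-up factor values for the same assets and dates
factor = pd.DataFrame(
    {"A": [0.9, 0.8, 0.1], "B": [0.7, 0.1, 0.9],
     "C": [0.2, 0.9, 0.8], "D": [0.1, 0.2, 0.3]},
    index=dates)


def rank_autocorrelation(factor, lag=1):
    """Correlation between the cross-sectional ranks on a date and `lag` rows earlier."""
    ranks = factor.rank(axis=1)
    values = [ranks.iloc[t].corr(ranks.iloc[t - lag])
              for t in range(lag, len(ranks))]
    return pd.Series(values, index=factor.index[lag:])


top_turnover = quantile_turnover(quantiles, q=5)
autocorr = rank_autocorrelation(factor)
print(top_turnover)
print(autocorr)
```

High turnover shares and low or negative rank autocorrelation both indicate that the portfolio implied by the factor changes substantially from one period to the next.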