TIPS: Two Long(ish) Time Frame Backtests

Stage setting done, so now onto the testing. How would a portfolio with TIPS have done compared to other allocations? Two backtests here that go back as far as I can in the new ETF-based investing era: one back to 2001, and another back to late 2009.

💡
This is the fifth post in my "RPC Investigates" series on Treasury Inflation-Protected Securities (TIPS). The first post introduced them and what I'm trying to do in this deep dive; the second was a Risk Parity Basics post explaining them from the ground up; the third differentiated between TIPS purchased individually and TIPS in funds. The fourth assembled five sources for more learning on TIPS.

We’re at the center of the conversation on TIPS by now, so to get a real handle on them, I fired up the ol’ portfolio backtester on Portfolio Visualizer to explore some questions. My goal with this search was to find the longest time frames I could backtest. In the next post, I’ll concentrate on narrower and more recent periods, especially times when inflation has been high, since that should be the ideal situation for TIPS.

For both of the backtests below, I’m using the chassis of the Golden Butterfly portfolio, since it is a simple, straightforward portfolio with a relatively large allocation to experiment with. Recall that this portfolio is:

  1. 20% Large-cap Growth stocks (using SPY when I specify an ETF)
  2. 20% Small-cap Value stocks (using VBR)
  3. 20% Long-term Treasuries (using TLT)
  4. 20% Gold (using GLD)

(By the way, these aren’t my preferred assets for these categories, but all have long histories so they won’t shorten the backtests).

That leaves 20% left in the portfolio. Originally, that would be Short-term Treasuries, for which I’ll use SHY as my chosen ETF. I’ll be using this 20% for the experiments and swapping it with other possibilities.
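The swap-one-slot setup can be sketched in a few lines of Python. This is only an illustration of the experiment design, not anything Portfolio Visualizer exposes: the `build_portfolio` helper and the `CORE` name are hypothetical, and the candidate tickers are simply the ones this post tests in the 20% slot.

```python
# The fixed 80% of the Golden Butterfly chassis, using the ETFs named above.
CORE = {"SPY": 0.20, "VBR": 0.20, "TLT": 0.20, "GLD": 0.20}


def build_portfolio(swap_ticker, swap_weight=0.20):
    """Return the core holdings plus one fund in the experimental slot.

    Hypothetical helper for illustration only.
    """
    portfolio = dict(CORE)
    portfolio[swap_ticker] = portfolio.get(swap_ticker, 0.0) + swap_weight
    total = sum(portfolio.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total:.4f}, expected 1.0")
    return portfolio


# Candidates for the last 20%: the original SHY plus the funds tested in this post.
candidates = ["SHY", "TIP", "STPZ", "LTPZ", "UUP", "DBC"]
variants = {ticker: build_portfolio(ticker) for ticker in candidates}
```

Each entry in `variants` is then one portfolio to feed into a backtest, with everything except the last 20% held constant.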

Correlations

Before turning to the backtests, here is a rundown of the correlations between the assets. This matrix tracks data since October 2009 (matching backtest #2, but more than eight years shy of backtest #1):

Some things to notice:

  1. TIPS really aren’t that much of a diversifier! If you take a look at TIP, the broad TIPS fund, you see correlations to the equity assets in the .2 range, which would be good if TIP itself were an equity - but it’s not. Short-term TIPS are even more correlated with equity funds, at .48 and .43 to large-cap growth and small-cap value.
  2. Another way to look at the TIPS correlations is to compare them to nominal treasuries. Regarding SPY, Long-term TIPS are .21, but regular Long-term Treasuries are -.25. Then Short-term TIPS are .48, while regular Short-term Treasuries are -.09. In both comparisons, the TIPS have a higher annualized return, but as for diversifying your portfolio, they won’t do as much as the nominal versions.
  3. I put in UUP, Invesco’s US Dollar Index Bullish Fund for reasons explained in backtest #2 below, but right away, its negative correlation to all sorts of assets just jumps off the page. With a modest but still positive annualized return and this level of negative correlation, UUP is now a fund that has my attention!
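For anyone who wants to rebuild a correlation matrix like this one by hand, here is a minimal sketch, assuming you have monthly returns for each fund as plain lists. The function names are my own, and the sample numbers are made up purely to show the mechanics; they are not real fund data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length return series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(returns):
    """Build a nested dict of pairwise correlations from {name: returns}."""
    names = list(returns)
    return {a: {b: pearson(returns[a], returns[b]) for b in names} for a in names}


# Made-up monthly returns (NOT real data): fund_b always moves opposite to fund_a.
sample = {
    "fund_a": [0.02, -0.01, 0.03, -0.02, 0.01],
    "fund_b": [-0.01, 0.005, -0.015, 0.01, -0.005],
}
matrix = correlation_matrix(sample)
```

In this toy data `fund_b` is a perfect mirror of `fund_a`, so their correlation comes out at -1. Real funds land somewhere in between, which is exactly what the matrix above is reporting.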

First Backtest, to January 2001

The longest backtest I could run used the “asset class allocation” tool on Portfolio Visualizer. In this case, I didn’t have to specify ETFs but could instead just choose the asset class, so for “TIPS,” I assume I am getting a broad-based TIPS fund with bonds of varying durations.

The great thing about the asset class allocation tool as opposed to the “asset allocation” tool is that you get a longer backtest, so here I was able to go back to January, 2001.

A drawback to this tool is that the number of asset classes is understandably limited - you can’t do TIPS of different durations or more exotic assets like a “Bullish Dollar” fund. In the end, I was left with three possibilities:

  1. The Golden Butterfly with 20% in TIPS
  2. The original Golden Butterfly with 20% in Short-term Treasuries
  3. The Golden Butterfly with 20% in Intermediate-term Treasuries (since the TIPS were most likely to average out to Intermediate-term, this made sense)

Here are the results for Backtest #1:

Backtest to January, 2001

Interesting results! First off, you’ll notice that TIPS wins for growth rate, just a hair over Intermediate-term Treasuries. Short-term Treasuries are not that far away, but still, .4% is enough of a gap over more than 20 years to be noticeable. On the other hand, the portfolio with TIPS got that growth with a bit more volatility, including a worse worst year and a bigger drawdown, and most significantly, the lowest Sharpe ratio. Not much of a difference, of course, but enough.
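If the statistics in that comparison are unfamiliar, here is a rough sketch of how growth rate, drawdown and Sharpe ratio can be computed from a list of annual returns. This is a simplification under my own assumptions: Portfolio Visualizer works from monthly data and its exact conventions may differ, and the function names are hypothetical.

```python
from math import sqrt


def cagr(annual_returns):
    """Compound annual growth rate over the whole period."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1 + r
    return growth ** (1 / len(annual_returns)) - 1


def max_drawdown(annual_returns):
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    balance = peak = 1.0
    worst = 0.0
    for r in annual_returns:
        balance *= 1 + r
        peak = max(peak, balance)
        worst = min(worst, balance / peak - 1)
    return worst


def sharpe_ratio(annual_returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in annual_returns]
    mean = sum(excess) / len(excess)
    variance = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / sqrt(variance)


# Made-up annual returns for illustration only (not from the backtest).
example = [0.10, -0.20, 0.05]
worst_year = min(example)
deepest_drop = max_drawdown(example)
```

The point of the sketch is the trade-off described above: a portfolio can win on `cagr` while losing on `max_drawdown` and `sharpe_ratio`, because the last two penalize a bumpier ride.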

Finally, for my favorite statistic, the Perpetual Withdrawal Rate (found on the “metrics” page once you run the backtest), TIPS score a nice little win. Even with the added volatility, the higher CAGR means that the Golden Butterfly with TIPS could have sustained a higher perpetual withdrawal rate over this time period. And since this is my longest backtest, I give it more credence.
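To show why a higher CAGR can lift the Perpetual Withdrawal Rate even with more volatility, here is a sketch under one simple reading of the statistic: the share of the balance you could withdraw at the end of each year and still finish with the same inflation-adjusted balance you started with. That reading is my assumption, and Portfolio Visualizer’s exact calculation may differ. The function names and sample returns are hypothetical.

```python
def perpetual_withdrawal_rate(real_annual_returns):
    """Share of balance withdrawable each year-end that preserves the real balance.

    Assumes withdrawals are a fixed percentage of the current balance and that
    the returns passed in are already inflation-adjusted.
    """
    growth = 1.0
    for r in real_annual_returns:
        growth *= 1 + r
    real_cagr = growth ** (1 / len(real_annual_returns)) - 1
    return 1 - 1 / (1 + real_cagr)


def simulate(real_annual_returns, rate, start=100.0):
    """Apply each year's return, then withdraw `rate` of the balance."""
    balance = start
    for r in real_annual_returns:
        balance *= 1 + r
        balance -= balance * rate
    return balance


# Made-up real returns for illustration only (not from the backtest).
sample_returns = [0.08, -0.03, 0.12, 0.05]
rate = perpetual_withdrawal_rate(sample_returns)
ending_balance = simulate(sample_returns, rate)  # lands back at the starting 100
```

Under this reading the rate depends only on the real growth rate over the period, which is why the portfolio with the highest CAGR can come out on top for this statistic despite a lower Sharpe ratio.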

On a side note, I had hoped to compare the TIPS portfolio with one holding commodities, but the data for commodities went back to just 2007, so it wasn’t much of a test. In case you were curious, though, here is the link for that test (TIPS, Short-term Treasuries and Commodities). In sum - TIPS won again over this time period, and commodities were dismal. They peaked in 2006 and slid for basically 15 years until the recent uptick, which you can see in the data.

Second Backtest, to October 2009

To gain more specificity, I ran a second backtest in which I was able to choose specific assets. The bad news is that this shortens the backtest, due to the relative youth of two of the funds:

The top three are three varieties of TIPS, including the largest ETF in the TIPS space, TIP. The inclusion of SHY is obvious as the original asset in the Golden Butterfly. I cheated a bit with UUP, and included it based on its great performance in 2022. This is a great example of hindsight being 20/20. It’s not a fund that was even on my radar before, but after seeing UUP’s rise (along with my own lived experience of being in Japan and watching the yen lose about 30% of its value compared to the dollar!), I put it in the backtest to see how it would have done over a longer period. The sixth fund is the biggest commodity fund I could find with an inception date before the PIMCO TIPS funds; PDBC is, in my opinion, a better choice as a commodity fund, but it is too young.

Since I had six funds, I had to run two backtests but used the same parameters and the same start date of October, 2009. Here are the results:

Backtest #2a: 1) TIP, 2) STPZ, 3) LTPZ; Backtest #2b: 1) SHY, 2) UUP, 3) DBC

Backtest to October, 2009

Observations:

  1. Long-term TIPS scored the highest CAGR of the six, Broad TIPS index second, and Short-term TIPS fifth. The fact that Long-term TIPS did well, like Long-term Treasuries, is not a surprise - that’s the added premium for duration risk. Notably, this portfolio winds up being 40% in Long-term bonds, half of that in nominal and half inflation-protected.
  2. The highest Sharpe ratio portfolio was the one with UUP, the Bullish Dollar fund, and it was substantially higher than the others with TIPS, all in the .8s. The portfolio with UUP would have had the shallowest drawdown and the best worst year. For diversification purposes, UUP is the choice, in line with our correlation figures above.
  3. In terms of the all-important Perpetual Withdrawal Rate, the portfolio with Long-term TIPS is the highest at 4.8%, despite the fact that it has the second-lowest Sharpe ratio. Its higher CAGR is enough to overcome its lack of diversification. There are also three portfolios tied at 4.6%, and the two lowest are both Short-term Treasuries, with the TIPS version edging out the nominal version by 20 basis points.
  4. The difference between Short-term TIPS and regular Short-term Treasuries wasn’t that significant, all things considered. There were higher returns with the TIPS version, but the nominal Treasuries had slightly calmer performance. Since this is the asset class investors look at as ballast, the modest steadiness of Short-term Treasuries is a virtue, not a vice.
  5. Commodities were the most volatile, and since their returns were not particularly strong, having 20% in gold and then another 20% in broad commodities does not seem like a winning formula. Going into these tests, I had imagined that commodities would do better, but as I mentioned, this particular 15-year stretch was horrible for commodities.

Conclusions

All in all, it was a better showing for TIPS than I thought would be the case. Long-term TIPS seem to be the choice if you are prioritizing growth rates, and in the first backtest where we couldn’t isolate Long-term TIPS, the broad TIPS index outperformed both short and intermediate bonds.

TIPS seem to deliver some reward, but don’t do much to dampen risk. The tests show that TIPS don’t have much of a diversifying effect on a portfolio. To start with, their correlations to stocks were substantially higher than those of nominal Treasuries. Next, in the first backtest, TIPS delivered the worst Sharpe ratio, and then in the second backtest, all three TIPS funds lost out to Short-term Treasuries, and lost out big time compared to the Bullish Dollar fund.

At the same time, the results for Perpetual Withdrawal Rate show that the trade-off appears worth it. TIPS funds won both backtests, and in the apples-to-apples comparison of Short-term TIPS and nominals in the second backtest, the TIPS version was higher.

Honestly, I didn’t expect this. I started the backtests thinking they would confirm the mini-test of inflation assets I did back in April. There I looked at performance from April 2021 to that point and found that commodities (in the form of PDBC) were far and away the best and that TIPS (VTIP) were far and away the worst. Seriously, it wasn’t even close! Not only did TIPS lose out to VOO, PDBC, GLDM and VNQ (though they did tie with VIOV), they didn’t even keep up with inflation.

So, actually, I’m pleased to see this test deliver some very different results. It’s making for a more complicated puzzle for me to figure out, but in the long run, it’s more interesting than a rout.

In the next post, I’ll return to the focus on inflation, and see if TIPS deliver any type of “crisis alpha” that you can count on when inflation, or even the fear of it, strikes.


Bonus Charts!

Breakdown of the six assets used in the second backtest, by year: