Quantifying Uncertainty in the Bitcoin Stock To Flow Model
In this article we look at Plan B (aka 100trillionusd)'s article 'Modeling Bitcoin's Value with Scarcity' through a probabilistic lens. I'm not advocating for or against the applicability of the model and others have discussed its validity. See "Challenging Plan B: a review of 'Modelling Bitcoin’s value with scarcity'" by @BurgerCrypto and "Falsifying Stock-to-Flow as a Model of Bitcoin Value" by @phraudsta. Here, we only try to get a better understanding of the uncertainty of the predictions.
Btw, this is not investment advice, do your own research, I'm not your lawyer, eat your vegetables, etc.
The linear regression of the log of the values seems reasonable, but the uncertainty (variability / error) is large enough, especially when brought back to a linear scale, that it is difficult to make useful specific predictions. Then again, that alone is useful information.
The prediction ranges at the halvening come out to be:
LNMKT is the log of the market cap for all of bitcoin and $/BTC is the corresponding dollar price per bitcoin given 18,400,000 coins exist at that time.
Though time has passed and more data is available, I'll use the data Plan B generously made available on Github. In the original article, as I understand it, Plan B uses a stock to flow ratio at the halvening of 50 and estimates 18,400,000 coins in existence. My code is on Github so that you can change these numbers if you like. Plan B also uses less data in the article than they later made available, so our numbers are close but not exact.
In Plan B's analysis they fit a line, y = a + b*x, with intercept/alpha of 14.6 and slope/beta of 3.3 to the log of the data, which is then used to predict a bitcoin price of ~$55k at the halvening. With the data they provide (which, again, is more data than they used for the article) a linear regression gives us an alpha of 14.17 and beta of 3.51 and a predicted price of ~$71k.
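As a rough check on that arithmetic, here is a minimal sketch of the point prediction. It assumes the fitted line is ln(market cap) = alpha + beta * ln(SF) with natural logs, which is my reading of the model; the function name and arguments are made up for illustration.

```python
import math

def predict_price(alpha, beta, stock_to_flow=50.0, coins=18_400_000):
    """Point prediction of $/BTC from a log-log fit (natural logs assumed)."""
    ln_market_cap = alpha + beta * math.log(stock_to_flow)
    market_cap = math.exp(ln_market_cap)  # back from log space to dollars
    return market_cap / coins

# Coefficients from the regression on the full Github dataset
price = predict_price(alpha=14.17, beta=3.51)
print(f"${price:,.0f} per BTC")
```

Because the prediction is exponentiated at the end, small changes in alpha or beta move the dollar figure a lot, which is why the rest of the article focuses on the spread of plausible coefficients.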
Something to keep in mind though, is that those are not the only feasible values for alpha and beta. They are just the best fit.
Possible Other Regressions
If we use PyMC3 to create a Bayesian Linear Regression model we can get estimates of alpha and beta, including standard deviations, which we can interpret as uncertainty. Additionally we can estimate an error parameter, epsilon, to represent noise/error in the model.

    import pymc3 as pm
    import theano

    # x, y: the log-scale data being regressed (defined earlier)
    x_shared = theano.shared(x)  # shared so x can be swapped later, e.g. for predictions

    with pm.Model() as model:
        # Wide priors on the intercept, slope and noise scale
        alpha = pm.Normal('alpha', mu=0, sd=20)
        beta = pm.Normal('beta', mu=0, sd=20)
        epsilon = pm.HalfCauchy('epsilon', beta=10)

        # The regression line, and the likelihood of the observed data around it
        mu = alpha + beta * x_shared
        y_likelihood = pm.Normal('y', mu=mu, sd=epsilon, observed=y)

        trace = pm.sample(draws=10000, tune=2000)
This gives us these parameter estimates.
When we plot the 95% HPD / credible interval, we see that alpha could range from 13.8 to 14.5 and beta between 3.4 and 3.7.
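For readers unfamiliar with the term, the HPD (highest posterior density) interval is the narrowest interval that contains the chosen share of the posterior samples. Below is a plain NumPy sketch of that idea. It is not the library's own implementation, and the samples are synthetic stand-ins rather than the trace from this analysis.

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    n = len(s)
    k = int(np.ceil(mass * n))  # number of samples the interval must cover
    if k < 1 or k > n:
        raise ValueError("mass must be in (0, 1]")
    # Width of every window of k consecutive sorted samples
    widths = s[k - 1:] - s[:n - k + 1]
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]

# Synthetic standard-normal draws, NOT the article's posterior samples
rng = np.random.default_rng(0)
fake_samples = rng.normal(size=20_000)
low, high = hpd_interval(fake_samples)
print(low, high)
```

For a symmetric, single-peaked posterior the HPD interval nearly coincides with the usual central interval; the two differ more when the distribution is skewed.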
We can plot lines using these feasible parameter ranges, along with the best fit, and see that they also are reasonable fits.
Using The Parameter Ranges To Make Predictions
We can also use the plausible parameter ranges to plot the bands where we'd expect to see 50% and 95% of the values.
And finally we can use the plausible parameter ranges to study the distribution of predictions for the LNMKT at the halvening (SF = 50). Think of this as making a prediction using many values of alpha, beta and epsilon and then examining the resulting predictions as a distribution.
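A minimal sketch of that idea, assuming x is ln(SF) and that posterior draws of alpha, beta and epsilon are available as equal-length arrays. The function name is hypothetical and the three draws in the usage line are made-up numbers just to show the call.

```python
import numpy as np

def predict_ln_market_cap(alpha, beta, epsilon, stock_to_flow=50.0, seed=0):
    """One simulated ln(market cap) per posterior draw.

    The arrays are used row by row, so any correlation between alpha and
    beta in the posterior is preserved.
    """
    alpha, beta, epsilon = (np.asarray(a, dtype=float)
                            for a in (alpha, beta, epsilon))
    rng = np.random.default_rng(seed)
    mu = alpha + beta * np.log(stock_to_flow)  # uncertainty in the line itself
    return rng.normal(loc=mu, scale=epsilon)   # plus noise around the line

# Made-up draws for illustration only, not values from the trace
draws = predict_ln_market_cap(
    alpha=[14.0, 14.2, 14.4],
    beta=[3.6, 3.5, 3.4],
    epsilon=[0.9, 1.0, 1.1],
)
print(draws)
```

With PyMC3 the inputs would typically be trace['alpha'], trace['beta'] and trace['epsilon'], giving one predicted value per posterior draw.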
Note that the X axis is still in log values. The final market cap predictions look normal in log space but become strongly skewed when brought back to a linear scale. I'm also showing the BTC price in dollars, but don't make the mistake of estimating the value halfway between two labels by averaging them: the axis is logarithmic, so the halfway point is well below the arithmetic midpoint.
I was not able to come up with a good linear space graph that I liked but if you have a good idea let me know.
Below is a simplified table of the mean prediction plus or minus 1 and 2 standard deviations. Note that the wide range in dollar estimates comes from the wide range in the original data, which forces the calculations to be done in log space.
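To see why a symmetric band in log space turns into such a lopsided dollar range, here is a small sketch. The inputs are illustrative round numbers that I picked, not the fitted values, and the function name is hypothetical.

```python
import math

def dollar_band(ln_mean, ln_sd, n_sd, coins=18_400_000):
    """(low, center, high) $/BTC for mean +/- n_sd standard deviations in log space."""
    def to_price(ln_market_cap):
        return math.exp(ln_market_cap) / coins
    return (to_price(ln_mean - n_sd * ln_sd),
            to_price(ln_mean),
            to_price(ln_mean + n_sd * ln_sd))

# Illustrative inputs only (assumed, not taken from the model output)
low, center, high = dollar_band(ln_mean=27.9, ln_sd=1.0, n_sd=1)
print(f"${low:,.0f}  ${center:,.0f}  ${high:,.0f}")

# Upside distance divided by downside distance equals exp(n_sd * ln_sd)
print((high - center) / (center - low))
```

Adding a standard deviation in log space multiplies the price by a fixed factor, and subtracting one divides by the same factor, so the center is the geometric mean of the two ends rather than their average.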
We can also plot the Empirical Cumulative Distribution Function (ECDF) to read off any confidence level you like. Again keep in mind that the X axis is in log space.
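The ECDF itself is simple to compute: sort the samples and give each one a cumulative probability. A minimal NumPy sketch, using synthetic samples rather than the model's predictions and hypothetical helper names:

```python
import numpy as np

def ecdf(samples):
    """Sorted values and the fraction of samples at or below each one."""
    x = np.sort(np.asarray(samples, dtype=float))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

def value_at(samples, level):
    """Smallest sample value whose ECDF reaches `level` (0 < level <= 1)."""
    x, p = ecdf(samples)
    idx = np.searchsorted(p, level, side="left")
    return x[min(idx, len(x) - 1)]

# Synthetic stand-in for the predicted log market caps
rng = np.random.default_rng(1)
fake_predictions = rng.normal(size=10_000)
print(value_at(fake_predictions, 0.5))   # roughly the median
print(value_at(fake_predictions, 0.95))  # 95% of samples fall at or below this
```

Values read off this way are still in log space and need to be exponentiated to get back to dollars.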
To me, it is a mixed bag. I like the idea of a $70k BTC, but the variance of even one standard deviation, $23.5k to $208.2k, makes it difficult to make specific useful predictions. I'd be curious to know whether updating the dataset with prices from Jan - Aug 2019 would reduce the variability. To do that study, I'd want to use Plan B's process exactly, which may not be feasible due to logistics.
Don't hesitate to let me know if you have any comments, questions or ideas on how to improve this analysis.