Variance Problem in Math



I have been doing market research and ran into a really annoying problem with using variances on prices. It affects pretty much the whole field of quantitative analysis, econometrics included.

The problem shows up with fractions and numbers < 1, and it persists in other cases too. Variance gives you wildly different distances for numbers that differ only in scale. The variance is the average of the squared differences between each value and the mean of the dataset, so it estimates a sort of average (squared) distance around the price. Because those differences are absolute rather than relative, it breaks down when the price swings across different orders of magnitude.

  • Let me give you an example. We generate 3 sets of random numbers with the same distance from each other, a factor of 100:

1.png

The sample size is 5000, so there is plenty of data for the variance to smooth out. Do you see the problem?

Basically we have 3 sets of random numbers, all spanning the same relative distance, so it would be logical to expect the variance of each set to be the same as well.

Because in a trading scenario:

  • Difference between 1$ and 100$ => 100x (9,900% profit)
  • Difference between 100$ and 10,000$ => 100x (9,900% profit)
  • Difference between 0.01$ and 1$ => 100x (9,900% profit)

So logically the distance should be exactly the same, regardless of how many digits there are or where the decimal point sits.
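To make the scale problem concrete, here is a minimal sketch. It is not the script behind the screenshot above: as a simplifying assumption it takes one random series (the `base` list is hypothetical) and rescales it to three price levels, so the relative moves are identical by construction while the ordinary variance changes by a factor of 100² at each step.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical base series: 5000 uniform draws between 1 and 100.
base = [random.uniform(1.0, 100.0) for _ in range(5000)]

# The same series expressed at three price scales.
series = {
    "0.01 to 1": [x / 100 for x in base],
    "1 to 100": base,
    "100 to 10,000": [x * 100 for x in base],
}

# Ordinary (population) variance of each rescaled series.
variances = {name: statistics.pvariance(values) for name, values in series.items()}

for name, var in variances.items():
    print(f"{name:>14}: variance = {var:.6g}")
```

Scaling every value by c multiplies the variance by c², which is why the three results sit four orders of magnitude apart even though the percentage swings are the same.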

So this is what I am talking about. When we are analyzing markets this kind of problem is unacceptable, since the measured distance between prices should not swing with the number of digits. So we need to normalize it.

Now I don't know how to describe the normalization function formally, but you can do it in Python with this one-liner:

variance = variance * 10**len(str(variance).split('.')[1]) # normalize variance

So the normalized variance in our case would look like this:

2.png

Multiplying each number by its coefficient, and then dividing by a common number, say 10^10, just to make it more readable. In quantitative analysis the unit hardly matters, only the ratio, so we can do this.

But this only works for numbers > 1. For the set with numbers < 1 it doesn't work. I am sure you could write a script to count the number of digits before and after the decimal point, so that this normalization technique would work.

But why bother, when we can use a much more elegant formula to calculate a bias-free variance?
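For what it's worth, one possible way to count digits on both sides of the decimal point is to take the floor of the base-10 logarithm. This is only a sketch of that idea, not the script used for the table above, and the helper names `order_of_magnitude` and `normalize_scale` are made up for illustration.

```python
import math

def order_of_magnitude(x):
    """Return the power of ten of the leading digit of x (x must be > 0)."""
    return math.floor(math.log10(x))

def normalize_scale(x):
    """Rescale x into the range [1, 10) by dividing out its power of ten."""
    return x / 10 ** order_of_magnitude(x)

# Works the same way for values above and below 1.
for value in (0.0123, 4567.0):
    print(value, order_of_magnitude(value), normalize_scale(value))
```

Unlike splitting the string on '.', this does not depend on how the float happens to be printed, so it also handles values that Python shows in scientific notation.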



Introducing "Rational Variance"


The general (differential) variance formula for a discrete probability distribution is this:

v1.png

I have discovered the rational variance formula, "rational" as in based on the ratio of 2 numbers rather than their difference:

1.png
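Since the formula images may not display, here is a sketch of both formulas in LaTeX. The first is the standard variance of a discrete distribution. The second is read off the Python function that follows, assuming the sum is meant to run over all N-1 consecutive ratios. The symbols (x_i, p_i, \mu, P_i, N) are notation chosen for this sketch and may differ from the images.

```latex
\[
\operatorname{Var}(X) = \sum_{i=1}^{n} p_i \,(x_i - \mu)^2,
\qquad \mu = \sum_{i=1}^{n} p_i \, x_i
\]

\[
\mathrm{RV}(P) = \frac{1}{N-1} \sum_{i=2}^{N} \left| \ln \frac{P_i}{P_{i-1}} \right|
\]
```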

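  • In Python it looks like this:

import math

def Rational_Variance(price):
    """Mean absolute log ratio between consecutive prices (all prices must be > 0)."""
    total = len(price)
    vari = 0.0

    # range(1, total) covers every consecutive pair, total - 1 ratios in all
    for i in range(1, total):
        vari += abs(math.log(price[i] / price[i - 1]))

    # average over the total - 1 ratios that were summed
    vari = vari / (total - 1)

    return vari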


Now this way it becomes really easy to work with variances. Going back to the example above, we can see that the values are now on the same playing field, with a small discrepancy due to the randomness.

1.png

So this removes the obvious problem when we are working with prices of different orders of magnitude, since we only work with ratios, not fixed numbers. The variance really becomes a variance of the % swing in a price set, which matters more to a trader than the actual value.
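As a quick check of that claim, here is a small self-contained sketch. The function `rational_variance` is a compact restatement of the formula for this example (averaging over every consecutive pair), and the `base` series is hypothetical test data, not the data behind the table above.

```python
import math
import random

def rational_variance(prices):
    """Mean absolute log ratio of consecutive prices (all prices must be > 0)."""
    ratios = [abs(math.log(b / a)) for a, b in zip(prices, prices[1:])]
    return sum(ratios) / len(ratios)

random.seed(1)  # fixed seed so the run is repeatable
base = [random.uniform(1.0, 100.0) for _ in range(5000)]

low = [p / 100 for p in base]    # roughly 0.01 to 1
high = [p * 100 for p in base]   # roughly 100 to 10,000

rv_low, rv_base, rv_high = (rational_variance(s) for s in (low, base, high))
print(rv_low, rv_base, rv_high)
```

Because each term depends only on the ratio of neighbouring prices, multiplying the whole series by a constant cancels out, so the three results agree up to floating-point rounding.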



Very good explanation. Where are you using this in your market research?

I am looking into popular benchmarking tools like the Sharpe ratio and analyzing how useful they are, so I have to measure the variance. This issue alone suggests that the Sharpe ratio is not much use if the price fluctuates across orders of magnitude. So I am using this corrected variance from now on, and I am continuing my research with it. I will publish the results later here on Steemit.

I will be looking for it.

This popped to my feed from a resteem. Good stuff, I will be following you :)
