Portfolio Returns - The Basics
June 28, 2008Thanks for visiting! If you like what you're reading, you may want to subscribe to my RSS feed.
I’m back from holiday and it’s time to start my new series on portfolio returns (as discussed in my previous post) with some basics on averages, limits and other interesting information you can extract from your data without too much effort. Since I’m working on a spreadsheet alongside these posts, I will also include some Excel references that you can use to shortcut the calculations.
Based on my outline, this is what I’m looking to cover in this post:
- arithmetic mean
- geometric mean
- frequency distribution / histograms
- maximum and minimum value
- positive / negative number of weeks / months / years
Arithmetic Mean
The arithmetic mean is more commonly referred to as the average (or just mean) and is calculated by summing all numbers of a list and subsequently dividing the result by the number of items in the list. Finding the average of three numbers A, B and C would involve forming the sum A + B + C and dividing by 3. It answers the question “If all quantities had the same value, what would that value have to be in order to achieve the same total?” and the calculation is relevant whenever the quantities concerned add together to produce a total. For instance, if you buy milk, bread and butter you will be expected to pay a total sum to cover the price of these three items and calculating the average would tell you what price a different product X would have to have so that you could buy three instances of X and end up paying the same amount.
The trouble with the simple average is that is greatly influenced by outliers, i.e. extreme values at either end of the spectrum. If you calculate the average of ten 5s and 1000 you will end up with (10 * 5 + 1000) / 11 = 95.45 which is unproportionately large and could give the impression (assuming the individual data points are not known) that the data is concentrated around values just short of 100. This problem is a simple example of skewed distributions where the mean of a data series doesn’t coincide with the median (the “middle value” if you were to list them all).
Finally, the arithmetic mean is not suitable for calculating stock performance since you’re looking for the product of the yearly performance, not the sum of those values. Consider an investment that loses 10% of its value in the first year and gains 30% in the second year. Now the annualised performance since you started investing is not [(-10%) + 30%]/2 = 10% but in fact 8.2% since the 30% increase only affects the 90% of your investment that’s left after the first year. To correctly calculate annualised performance rates like this we will have to use the geometric mean.
Excel Function: AVERAGE(number1, [number2], …)
Geometric Mean
The geometric mean is a value that gives an indication of the central tendency of a set of numbers (i.e. of a distribution) regardless of whether this distribution is normal or skewed. In other words, it shows the “typical value” of a list of values. Note that this is different from the most common value which is a mathematical value called “mode”. It answers the question “If all quantities had the same value, what would that value have to be in order to achieve the same product?” and hence is relevant whenever quantities multiply together to produce a product. The geometric mean is always less than or equal to the arithmetic mean.
To calculate the geometric mean you need to multiply all the values in your set and subsequently find a scientific calculator to take the nth root of the product where n is the number of numbers you previously multiplied. In the simplest case when calculating the geometric mean of two values A and B you would need to take the square root of A*B. Unfortunately, this also means that the geometric mean in its purest form can only be applied to positive values as (in conventional mathematics and without exploring imaginary values) you cannot take the root of a negative number.
While we all hope that our investments are going to increase year after year, we can’t simply assume this and happily apply the geometric mean without a little work first. To take negative growth into account, you will need to look at your percentage values from a slightly different angle. Let’s go back to our original example of a return of -10% in the first year and 30% in the second year. After the first year, only 90% of your original investment will remain - that is, your portfolio will stand at 0.9 instead of 1 (= 100%). In the second year, you’re more lucky and see a 30% growth of that remainder. If you would have had a fresh portfolio (i.e 100% = 1) you’d now be left with 130% of your money or 1.30. Since you only had 90% remaining we need to multiply 0.9 by 1.30 to get 1.17. This means that in relative terms to the amount you started with, your investment is worth 17% more after 2 years. To now calculate the annualised growth rate of your investment we need to take the square root of 1.17 which yields 1.0816 or a growth of 8.16% per annum.
The Excel function given below does not convert your percentages accordingly so the restriction with regards to positive values still applies. So far, I haven’t been able to find an in-built function that does the work for you which I find rather frustrating. It is extremely annoying to represent a 10% drop in my portfolio as 0.90 and if I don’t come across a more workable solution soon, I will probably end up writing a function myself
If you know something that I don’t please let me know before I spent hours writing an Excel Add-In from scratch.
Excel Function: GEOMEAN(number1, [number2], …)
Frequency Distribution
If you’re a visual thinker like myself, you will appreciate that I often find it more helpful to look at a graph instead of raw numbers. An obvious choice in the case of portfolio returns is a histogram which is based on a frequency distribution. At this stage, I won’t dwell on this topic for long as I haven’t got sufficient data yet to experiment with histograms and the associated frequency distributions yet. As soon as that is the case, I will almost certainly give you a little more insight into this topic.
For the moment, imagine that you have two buckets in which you need to throw little cards that represent your portfolio performance - whether you’re looking at weekly or monthly returns or any other values is fairly irrelevant as long as you have enough data. Your task now is to separate the positive returns from the negative ones by throwing them into their respective buckets. Once you’re done with that, you can draw a simple bar graph with the number of items per basket representing the height of an individual bar (assuming one unit width per bar). Voila - there is your first histogram
If you imagine that you could have as many buckets as you like, then you only need to come up with a criteria to decide which bucket contains what data to create whatever histogram you like. You could distinguish between negative and positive values, determine 5% steps or pick any other classification you might find useful. The only rule is that categories must not overlap - i.e. there is always one unique bucket for every single piece of data and at no point would any value (whether in your sample or not) match two or more buckets.
The interesting thing about histograms is that it’s not technically the height of a bar that determines the value it represents (unlike in a simple bar chart where only the height matters) but it’s the area that is covered by it. Hence you could draw a bar that’s one inch wide and three inches high and that would represent the same value as a bar that’s three inches wide and one inch high. For this reason the bars of a histogram have to be adjacent with no gaps in between.
Because you are free to choose however many buckets you want to use when constructing a histogram (the official term for my made-up bucket is “bin”) much research has been done to determine the optimal number of bins as too small a number might hide valuable insights while too many bins render the diagram useless. The level of granularity is crucial. One formula that has been put forward is k = [(max(x) - min(x)) / bin width h] where k represents the optimal number of bins and max(x) and min(x) denote the maximum and minimum values of your data respectively. Whatever value you calculate, you should round up to the nearest full value and use this to create your categories.
Excel 2007 includes an Add-In that creates a histogram from your data if presented in the correct way (data values in one column, bin limits in second column) and hence extends the basic FREQUENCY function that simply returned an array of values. While I have Excel 2007, the data arrangement it requires to execute properly doesn’t work very well if you’re planning to have more than just a histogram graph in your worksheet. Further, from the previews I have seen of the functionality, the histogram looks wrong as there are in fact gaps between the bars (which violates the rules given above). Hence, once more I find myself wondering whether I should invest the time and effort to create something a little more elegant myself. Stay tuned
Maximum and Minimum Value
Since I assume you graduated from primary school long ago, I don’t really need to explain the meaning of maximum and minimum value to you. I simply include it for completeness’s sake and to create a central place for all important Excel functions
Excel Function: MAX(number1, [number2], …) and MIN(number1, [number2], …)
Positive and Negative Number of Weeks
This is an interesting little insight I came across in the book I was reading about Asset Allocation not long ago. While it doesn’t tell you that much per se, it gives you valuable perspective and a sense of patience when you’re tempted to give up your long-term investing for short-term speculation. If you can see that your investment is up most weeks / months / years and only down in value occassionally, you might find it easier to sit out the rough patch - rest assured that other (better) times will come.
To avoid having to count these items manually every time you update the performance, you can use the Excel function COUNTIF that looks at a list of values and only counts the instances that fit a criteria given as part of the formula. By using the criteria “>0″ you will end up with a count of all positive values, while “<0″ returns a total of negative occurences.
Excel Function: COUNTIF(range, criteria)
I hope you enjoyed this first introduction into maths and basic statistics as I very much look forward to researching the upcoming posts! If you have any questions or concerns, just leave a comment and either myself or another reader will surely be able to help.
















