This assignment gets you started working with prices and returns. You should do everything in R even though we are only working with six different stocks and it is possible to do some things manually. If you have a sample of hundreds of stocks this might become more difficult.
1. Go to CANVAS and download the data file PS1Monthly.xlsx. The data was downloaded from the CRSP (Center for Research in Security Prices) database via WRDS (Wharton Research Data Services), which is available to LBS students. Note that the file contains data on Microsoft, Exxon Mobil (previously Exxon), General Electric, JP Morgan Chase (previously Chemical Banking and Chase Manhattan), Intel, and Citigroup (previously Primerica and Travelers Group). In addition, the columns vwretd (ewretd) and vwretx (ewretx) contain value-weighted (equal-weighted) total returns and returns excluding dividends for the CRSP index that contains stocks from NYSE, AMEX, and NASDAQ. Finally, sprtrn contains the total return for the S&P 500 Composite Index.
2. Use the holding period returns (RET) to create a total return index for the MSFT and GE stocks and the S&P 500 index, which shows the theoretical growth in value of an investment in the stock assuming that dividends are reinvested (normalize the start value to 1). That is, simply assume you invest $1 in each of the three assets and compound the returns. Do the same for the returns that abstract from dividend payments (i.e., use RETX instead). Plot the investments with and without dividends for each stock separately. How do dividends affect the results?
3. The holding period returns are simple returns. Generate a new variable that contains the corresponding log returns (LRET). Calculate the mean, variance, skewness, and kurtosis of the simple and the log returns. Plot the simple against the log returns for MSFT. Briefly discuss your results.
4. Go to CANVAS and download the data file PS1Daily.xlsx. This file contains two worksheets. HPRDaily contains the daily holding period returns for the six stocks, the S&P 500 Composite Index and the value-weighted market portfolio (including dividends) from CRSP. PricesDaily contains the prices for the six stocks and the S&P 500 Composite Index.
5. Construct a daily total return index for MSFT and GE stocks and the S&P 500 index and plot them against each other. Compare your results with the monthly total return indices from above. Are there any differences? Discuss.
6. As before, the holding period returns are simple returns. Create log returns. Calculate the mean, variance, skewness, and kurtosis of the simple and log returns. Compare your results with the results at the monthly frequency and discuss.
7. Compare the statistical properties of the log holding period return time series for both monthly and daily returns. Plot a histogram and discuss how the empirical distributions relate to the normal distribution.
8. Pick three stocks and the S&P 500 index (you can either use MSFT, GE, and JPM or adapt the code to pick three random stocks). You will need the holding period returns (both simple and log returns) and the total return indices you created.
9. Calculate the covariance matrix for the log return series, using both the returns and returns squared. Discuss your results briefly.
10. Plot the ACF (autocorrelation function) for returns, returns squared, and absolute returns. Discuss the results.
11. Use the three assets and make up a portfolio by assigning arbitrary portfolio weights. What does it imply if you keep the weights fixed over time?
12. Using the portfolio weights and assets from above, calculate the corresponding portfolio returns. Moreover, use the portfolio returns to calculate the evolution of a $1 investment in the portfolio over the sample period. Plot the result against the evolution of a $1 investment in each of the three stocks. Discuss the result.
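The compounding, log-return, and fixed-weight portfolio steps in the tasks above can be sketched as follows. This is an illustrative Python sketch only (the assignment itself is to be done in R); the function names and all numbers are made up for the example and are not taken from the CRSP files.

```python
import math

def wealth_index(simple_returns, start=1.0):
    """Compound simple returns into a total return index starting at `start`."""
    index = [start]
    for r in simple_returns:
        index.append(index[-1] * (1.0 + r))
    return index

def log_returns(simple_returns):
    """Convert simple returns R into log returns log(1 + R)."""
    return [math.log(1.0 + r) for r in simple_returns]

def portfolio_returns(asset_returns, weights):
    """Per-period portfolio return when weights are reset to `weights` every period."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]

# Hypothetical monthly simple returns for one asset (not CRSP data).
ret = [0.10, -0.05, 0.02]
index = wealth_index(ret)
lret = log_returns(ret)

# Hypothetical per-period returns of three assets and arbitrary fixed weights.
assets = [(0.10, 0.00, -0.02), (-0.05, 0.01, 0.03)]
weights = (0.5, 0.3, 0.2)
port = portfolio_returns(assets, weights)

# Summing log returns and exponentiating recovers the compounded index value.
print(index[-1], math.exp(sum(lret)), port)
```

Applying the same weights in every period, as `portfolio_returns` does, implicitly assumes the portfolio is rebalanced back to those weights each period.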
1. Portfolio theory with matrix algebra:
• Read the notes on portfolio algebra from Canvas. Also, make sure you understand the associated sample codes also available on Canvas.
• Calculate the means, the variance and the pairwise covariances for the three stocks MSFT, GE, and JPM for the sample period between 2/1/1990 and 31/12/2002.
• Define the following matrices that contain returns, expected returns, portfolio weights, and covariances:
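\[
R = \begin{pmatrix} r_{MSFT} \\ r_{GE} \\ r_{JPM} \end{pmatrix}, \quad
\mu = \begin{pmatrix} \mu_{MSFT} \\ \mu_{GE} \\ \mu_{JPM} \end{pmatrix}, \quad
x = \begin{pmatrix} x_{MSFT} \\ x_{GE} \\ x_{JPM} \end{pmatrix}, \quad
\Sigma = \begin{pmatrix}
\sigma_{MSFT}^2 & \sigma_{MSFT,GE} & \sigma_{MSFT,JPM} \\
\sigma_{GE,MSFT} & \sigma_{GE}^2 & \sigma_{GE,JPM} \\
\sigma_{JPM,MSFT} & \sigma_{JPM,GE} & \sigma_{JPM}^2
\end{pmatrix}
\]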
• Note that the expected portfolio return and variance equal:
\[
\mu_p = x'\mu, \qquad \sigma_p^2 = x'\Sigma x
\]
Further, the condition that the portfolio weights have to sum up to one can be expressed as \(x'\mathbf{1} = 1\).
• Calculate the return and standard deviation of a portfolio where you equal-weight the three stocks - call the portfolio e. Additionally, consider a portfolio y with a weight vector y’ = (0.8,0.4,−0.2). Calculate the risk-return tradeoff of y as well as its covariance with portfolio e.
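• In order to find the global minimum variance portfolio with weights \(m' = (m_{MSFT}, m_{GE}, m_{JPM})\), we have to solve the following problem:
\[
\min_{m} \; \sigma_{p,m}^2 = m'\Sigma m \quad \text{s.t.} \quad m'\mathbf{1} = 1
\]
With the Lagrangian \(L(m, \lambda) = m'\Sigma m + \lambda (m'\mathbf{1} - 1)\), the corresponding first order conditions are (check this by hand!)
\[
2\Sigma m + \lambda \mathbf{1} = 0, \qquad m'\mathbf{1} - 1 = 0
\]
Hence, the system is of the form
\[
\begin{pmatrix} 2\Sigma & \mathbf{1} \\ \mathbf{1}' & 0 \end{pmatrix}
\begin{pmatrix} m \\ \lambda \end{pmatrix}
= \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\quad \text{i.e.} \quad A_m z_m = b
\]
and the solution for \(z_m\) is then
\[
z_m = A_m^{-1} b
\]
The first three elements of \(z_m\) are the portfolio weights \(m' = (m_{MSFT}, m_{GE}, m_{JPM})\) for the global minimum variance portfolio. Calculate the variance and the expected return of the minimum variance portfolio.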
• Find another efficient portfolio, namely the efficient portfolio whose expected return equals the expected return of MSFT. Note that your minimization problem now becomes:
\[
\min_{x} \; \sigma_{p,x}^2 = x'\Sigma x \quad \text{s.t.} \quad x'\mathbf{1} = 1 \;\text{ and }\; \mu_p = x'\mu = \mu_{MSFT}
\]
Derive the solution as above in terms of portfolio weights and calculate them in your code. In addition, calculate the expected return and the variance of efficient portfolio x as well as its covariance with the global minimum variance portfolio.
• Plot the entire efficient frontier for the three risky assets.
• Now, rerun your code with sample moments for the three stocks MSFT, GE, and JPM for the sample period between 2/1/2003 and 31/12/2014.
• Finally, compare your results for the three assets across the two sample periods. Comment on potential problems that might arise when you base investment decisions on your analysis. Also discuss potential solutions to the problems mentioned.
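The global minimum variance calculation can be sketched by stacking the first order conditions \(2\Sigma m + \lambda \mathbf{1} = 0\) and \(m'\mathbf{1} = 1\) into one linear system and solving it. The Python sketch below is for illustration only (the course material uses R). The covariance matrix and means are hypothetical numbers, and `solve`, `gmv_weights`, and `quad_form` are names invented for this example.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def gmv_weights(sigma):
    """Weights solving [[2*Sigma, 1], [1', 0]] z = (0, ..., 0, 1)'."""
    n = len(sigma)
    a = [[2.0 * sigma[i][j] for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [0.0] * n + [1.0]
    z = solve(a, b)
    return z[:n]  # the last element of z is the Lagrange multiplier

def quad_form(x, sigma, y):
    """x' Sigma y: a variance when x equals y, otherwise a covariance between portfolios."""
    return sum(x[i] * sigma[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

# Hypothetical covariance matrix and expected returns (not estimated from the data).
sigma = [[0.010, 0.002, 0.001],
         [0.002, 0.008, 0.003],
         [0.001, 0.003, 0.012]]
mu = [0.03, 0.02, 0.025]

m = gmv_weights(sigma)
var_m = quad_form(m, sigma, m)
mu_m = sum(w * r for w, r in zip(m, mu))
e = [1 / 3] * 3                      # equal-weighted portfolio
cov_em = quad_form(e, sigma, m)
print(m, var_m, mu_m, cov_em)
```

A useful sanity check: the covariance of any fully invested portfolio with the global minimum variance portfolio equals the variance of the global minimum variance portfolio, so `cov_em` and `var_m` should agree up to rounding.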
2. Buffett’s Alpha:
• Go to CANVAS and download and read the paper "Buffett’s Alpha" by Frazzini, Kabiller, and Pedersen. (Hint: the following questions are about performance evaluation, which we discussed in the second lecture. In particular, we discussed the use of factor models when measuring performance.)
• Warren Buffett has a strong positive CAPM alpha. We normally think of a positive alpha as evidence of investment skill. Explain why a positive alpha is an indicator of skill.
• Is it possible for Buffett to both a) have a positive CAPM alpha and b) have no investment skill? Explain.
• After reading the paper, has your perception of Warren Buffett as one of the most prominent and successful investors in the world changed?
• What is the main purpose of this paper? Put differently, what is the authors’ goal in this paper?
• What factors do the authors use to try to explain Buffett’s investment performance? Explain which return exposures these factors are meant to capture.
• Should we give Buffett credit for his past high returns? Should we give him credit for future high returns if they are generated by maintaining his past factor exposures? Explain in detail.
1. Note: use value-weighted rather than equal-weighted returns for your analysis.
• Go to Kenneth French’s webpage https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html and download the 25 portfolios formed on size and book-to-market (25 Portfolios Formed on Size and Book-to-Market (5 × 5)) at the monthly frequency. In addition, also download the Fama/French 3 factors (Fama/French 3 Factors) at the monthly frequency.
• Run the following regressions:
\[
r_{i,t} - r_{f,t} = \alpha_i + \beta_i (r_{MKT,t} - r_{f,t}) + \epsilon_{i,t}
\]
where \(r_{i,t}\) is the return of one of the 25 portfolios formed on size and book-to-market, \(r_{MKT,t}\) is the return on the market, and \(r_{f,t}\) is the monthly riskless rate.
• Plot the mean realized excess returns of the 25 portfolios (i.e., for each portfolio, plot the time-series mean of \(r_{i,t} - r_{f,t}\)) against the predicted expected excess returns implied by the CAPM regressions above (i.e., the time-series mean of \(\hat{\beta}_i (r_{MKT,t} - r_{f,t})\)). Critically discuss the pricing ability of the CAPM.
• Now run the following regressions:
\[
r_{i,t} - r_{f,t} = \alpha_i + \beta_i (r_{MKT,t} - r_{f,t}) + \gamma_i r_{SMB,t} + \delta_i r_{HML,t} + \epsilon_{i,t}
\]
where \(r_{SMB,t}\) is the return on the size factor and \(r_{HML,t}\) is the return on the book-to-market factor.
As before, plot mean realized excess returns against the predicted excess returns implied by the regression model. How do the results compare to the ones from above?
• Now additionally download the 10 portfolios formed on operating profitability (Portfolios Formed on Operating Profitability - use Lo 10, Dec 2, ..., Dec 9, Hi 10), investment (Portfolios Formed on Investment - use Lo 10, Dec 2, ..., Dec 9, Hi 10), dividend yield (Portfolios Formed on Dividend Yield - use Lo 10, Dec 2, ..., Dec 9, Hi 10), and momentum (10 Portfolios Formed on Momentum); and the 17, 30, and 49 industry portfolios - all at the monthly frequency. Assess whether the Fama-French three factor model is able to price these different cross-sections. What do you conclude?
• You decide that you would like to enrich the Fama-French three factor model with a momentum factor (for some details on the momentum factor, have a look at the appendix of the lecture slides and read the paper "Momentum" by Jegadeesh and Titman on CANVAS). How would you do this? Explain in detail. (Hint: no need to do anything! Simply describe what you would do in order to add a momentum factor to your three factor model. Also note that you can assume access to Kenneth French’s data library.)
• Comment on the following statements:
– The Fama-French three factor model is as good as it gets. Hence, we should always use this model.
– As the Fama-French three factor model performs empirically superior to the CAPM, we might as well forget about CAPM entirely.
– Empirically, the best factor model has three factors.
– Empirically, the best factor model has five factors.
– The validity of CAPM cannot be tested empirically.
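The CAPM time-series regression and the two quantities plotted against each other (mean realized excess return and the CAPM-implied mean excess return) can be sketched for a single portfolio as follows. This is a Python illustration only, not the course's R code; `ols_single` is a hypothetical helper and the return series are made-up numbers, constructed so that the true alpha is 0.2 and the true beta is 1.5.

```python
from statistics import mean

def ols_single(y, x):
    """OLS of y on a constant and x; returns (alpha, beta)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical monthly excess returns in percent (not from the data library).
mkt_excess = [1.0, -2.0, 3.0, 0.5, -1.5]
port_excess = [1.7, -2.8, 4.7, 0.95, -2.05]  # built as 0.2 + 1.5 * mkt_excess

alpha, beta = ols_single(port_excess, mkt_excess)
realized = mean(port_excess)          # mean realized excess return
predicted = beta * mean(mkt_excess)   # CAPM-implied mean excess return
print(alpha, beta, realized, predicted)
```

By construction of OLS with an intercept, `realized - predicted` equals the estimated alpha, so the vertical distance of each portfolio from the 45-degree line in the plot is its pricing error.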
2. Modelling Financial Risk:
• Volatility modeling using R
– Go to CANVAS and download the data file PS1Daily.xlsx. This file contains two worksheets. HPRDaily contains the daily holding period returns for the six stocks, the S&P 500 Composite Index and the value-weighted market portfolio (including dividends) from CRSP. PricesDaily contains the prices for the six stocks and the S&P 500 Composite Index.
– Calculate log returns.
– Plot the ACF (autocorrelation function) for squared returns and absolute returns for the S&P 500 Composite Index. What do the ACFs tell you about the predictability of squared and absolute returns?
– Calculate time-varying volatility for Microsoft using a moving average model. That is, use the log returns of Microsoft (with 10-week and 20-week windows) and plot the result. Let’s denote the daily variance of Microsoft on day t as \(\sigma_t^2\). Further, suppose W is the window length and T the data length, then
where t = N,...,T
Discuss the results’ implications for portfolio choice in light of Modern Portfolio Theory.
– In a next step, select the daily log return series of the S&P 500 index. Plot the volatility time series using a moving average model as in the previous question above. Discuss your results for the market index in comparison to Microsoft.
– Plot the volatility time series that you get from an EWMA model for the log returns of the S&P 500 index. In order to initiate the EWMA model (i.e., at time t = 1), you need to make an assumption about the initial variance. What could be a sensible choice for this starting value? Try various starting values and discuss what happens. In addition, you need to take a stand on the λ-parameter that is crucial to the model. Sequentially, use the following values for λ: 0.5, 0.75, 0.94. What happens as you increase λ?
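The two volatility models can be sketched as follows, in Python for illustration only (the exercise is to be done in R). Two assumptions are made: the moving average estimator is taken to be the mean of the previous W squared returns (zero-mean simplification), and the EWMA recursion is taken to be the standard form \(\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_{t-1}^2\). The function names, the tiny window, and the return series are hypothetical; with daily data, a 10-week window would correspond to roughly 50 trading days.

```python
def ma_variance(returns, window):
    """Moving average variance: entry k is the mean of squared returns k .. k+window-1.

    Assumes a zero mean return, a common simplification at the daily frequency.
    """
    sq = [r * r for r in returns]
    return [sum(sq[t - window:t]) / window for t in range(window, len(sq) + 1)]

def ewma_variance(returns, lam, initial_variance):
    """EWMA recursion: var[t] = lam * var[t-1] + (1 - lam) * returns[t-1]**2."""
    variances = [initial_variance]
    for r in returns[:-1]:
        variances.append(lam * variances[-1] + (1.0 - lam) * r * r)
    return variances

# Hypothetical daily log returns (made-up numbers).
returns = [0.01, -0.02, 0.015, 0.0, -0.01]

ma = ma_variance(returns, window=2)

# One possible starting value: the average squared return over the full sample.
start = sum(r * r for r in returns) / len(returns)
for lam in (0.5, 0.75, 0.94):
    print(lam, ewma_variance(returns, lam, start))
```

A larger λ puts more weight on the previous variance estimate and less on the latest squared return, so the resulting series reacts more slowly and looks smoother.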
1. Measuring Financial Risk:
A short introduction to Value at Risk:
Value at risk (VaR) is a statistical measure of the riskiness of financial entities or portfolios of assets. It is defined as the maximum dollar (or any other currency) amount expected to be lost over a given time horizon, at a pre-defined confidence level. For example, if the 95% one-month VaR is $1 million, there is 95% confidence that over the next month the portfolio will not lose more than $1 million.
VaR can be calculated using different techniques. Under the parametric method, also known as variance-covariance method, VaR is calculated as a function of mean and standard deviation of the return series, assuming that returns are normally distributed.
The parametric method is the most widely used VaR method in practice, as one only needs to know the mean and the standard deviation of the return series of interest. However, the parametric method crucially relies on the assumption that returns are serially independent and normally distributed. These are potentially very restrictive assumptions, as we have discussed in lecture: 1) returns might be autocorrelated, in particular during a crisis, and 2) high frequency returns are clearly not normally distributed. On the bright side, these assumptions make the calculation of VaR much easier. First, assuming serial independence allows us to abstract from estimating serial correlations. Second, assuming normality allows for computation of a standard normal z score to determine the risk position with a degree of confidence right off a standard normal table. Hence, normality is an important assumption because it allows for the use of the normal distribution as a proxy for what the distribution of returns might look like.
An example of a parametric VaR calculation is as follows:
Monthly standard deviation (in $ terms): $50,000
Monthly mean (in $ terms): $35,000
Z Score for 95% confidence: 1.65
The monthly VaR with 95% confidence is:
35,000 - 50,000 (1.65) = -$47,500
Alternatively, one could express the VaR in percent rather than $. For example, suppose the daily mean return of a portfolio is 0.1%, the standard deviation is 5%, and we are interested in a 99% confidence level which implies a z score of 2.33.
The daily VaR with 99% confidence is:
0.1% - 5% (2.33) = - 11.55%
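The two worked examples can be reproduced with a few lines of code. The Python sketch below is for illustration only (the implementation part of this problem uses R), and the function names are invented for the example. It also shows how the z score can be obtained from the standard normal quantile function instead of a table, which gives about 1.645 rather than the rounded 1.65 at the 95% level.

```python
from statistics import NormalDist

def parametric_var_z(mean, std, z):
    """Parametric VaR from a z score read off a table: mean - z * std."""
    return mean - z * std

def parametric_var(mean, std, confidence):
    """Parametric VaR with z taken from the standard normal quantile at `confidence`."""
    z = NormalDist().inv_cdf(confidence)
    return mean - z * std

# Dollar example: monthly mean $35,000, standard deviation $50,000, z = 1.65.
print(parametric_var_z(35_000, 50_000, 1.65))

# Percent example: daily mean 0.1%, standard deviation 5%, z = 2.33.
print(parametric_var_z(0.1, 5.0, 2.33))

# Same dollar example with the exact 95% quantile instead of the rounded table value.
print(parametric_var(35_000, 50_000, 0.95))
```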
• Implementing Value at Risk (VaR) in R
– Pick three stocks from the daily stock returns data set (PS1Daily.xlsx) and transform these simple returns to log returns.
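– Estimate two volatility time series for each of these three stocks, one using an MA model (10 weeks) and one using an EWMA model (λ = 0.94, with the starting variance set to the sample variance of the returns computed over all T observations, where T is the number of observations of daily returns in your sample).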
– Based on these six time series (two volatility time series for each of the three stocks), calculate the one-day 95% Value-at-Risk (VaR) assuming normality. That is, you should use the estimated volatility time series together with the following formula for conditional VaR assuming normality
\[
VaR_{95\%,t} = \bar{r} - \Phi^{-1}(0.95) \times \sigma_t,
\]
where \(\bar{r}\) is the mean return (i.e., the average return of the return series of interest up to time t), \(\Phi^{-1}\) is the inverse of the standard normal cumulative distribution function and, hence, \(\Phi^{-1}(0.95) \approx 1.65\) (the z score!). Moreover, \(\sigma_t\) is your estimated volatility at time t.
– In a last step, you are supposed to "backtest" your VaR estimates. That is, for each stock you now have two VaR series as well as the realized returns. With this data, count the number of violations for each VaR series separately. In other words, count the negative realized returns that are more extreme than the VaR on the given day. For example, a violation of VaR occurs on a day when the realized return is -9% and the VaR is -8%. How many violations would you expect if your VaR estimates were accurate (i.e., true)? How many violations do you observe? What do you conclude?
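The backtesting count can be sketched as follows, in Python for illustration only (the exercise itself uses R). The function names and both series are hypothetical. The benchmark is that a correctly specified 95% VaR should be violated on about 5% of the days.

```python
def count_violations(realized, var_series):
    """Count days on which the realized return falls below that day's VaR."""
    return sum(1 for r, v in zip(realized, var_series) if r < v)

def expected_violations(n_days, confidence=0.95):
    """Expected number of violations if the VaR estimates are accurate."""
    return (1.0 - confidence) * n_days

# Hypothetical realized log returns and one-day 95% VaR estimates (made-up numbers).
realized = [-0.09, 0.01, -0.02, -0.05, 0.03]
var_95 = [-0.08, -0.08, -0.03, -0.04, -0.04]

violations = count_violations(realized, var_95)
print(violations, expected_violations(len(realized)))
```

Comparing the observed count with the expected count (or, equivalently, the observed violation rate with 5%) indicates whether a volatility model under- or overstates risk. With a real sample of several thousand days the comparison is far more informative than in this five-day toy series.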
2. Go to Kenneth French’s webpage https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html and download the 10 portfolios formed on operating profitability, investment, dividend yield, and momentum and the 49 industry portfolios at the monthly frequency. In addition, also download the Fama/French 3 factors at the monthly frequency.
3. Run a principal component analysis on the combined excess returns of the 10 portfolios formed on operating profitability, investment, dividend yield, and momentum and the 49 industry portfolios (hint: use the prcomp command discussed in lecture 5). How many components/factors are needed to explain 95% of the return variation?
4. Run the following regressions for the 10 portfolios formed on operating profitability, investment, dividend yield, and momentum and the 49 industry portfolios and save the adjusted \(R^2\) of every regression:
\[
r_{i,t} - r_{f,t} = \alpha_i + \beta_i (r_{MKT,t} - r_{f,t}) + \gamma_i r_{SMB,t} + \delta_i r_{HML,t} + \epsilon_{i,t}
\]
What are the average and median adjusted \(R^2\)? What is the standard deviation of the adjusted \(R^2\)?
5. Now, run the following regressions for the 10 portfolios formed on operating profitability, investment, dividend yield, and momentum and the 49 industry portfolios and save the adjusted \(R^2\) of every regression:
\[
r_{i,t} - r_{f,t} = \alpha_i + \beta_i r_{PC1,t} + \gamma_i r_{PC2,t} + \delta_i r_{PC3,t} + \epsilon_{i,t}
\]
(Hint: \(r_{PC1,t}\), \(r_{PC2,t}\), and \(r_{PC3,t}\) are simply the first three principal components. These components are returns themselves, since the principal components of excess returns are linear combinations (i.e., portfolios) of those excess returns. Also note that you can access the first three principal components in R as follows: pca$x[,1], pca$x[,2], and pca$x[,3].)
What are the average and median adjusted \(R^2\)? What is the standard deviation of the adjusted \(R^2\)? Compare your results with the ones from above and discuss.
6. What are the problems with factor models based on principal components?
7. Go to CANVAS and download the data file PS4Daily.xlsx. This file contains daily yield curve data for the United States between July 2 1981 and January 31 2020. In particular, you are given spot rates for 1-year, 2-years, ..., 20-years.
8. Use principal component analysis to examine the data. How many principal components are needed to explain the majority of the variation in the yields (hint: run prcomp on the yields and not on changes in yields)? Extract the first three components and plot them in a time series plot (again, you can extract them as discussed above).
9. Calculate the correlation between the first component and the 3-year yield, and between the second component and the difference between the 10-year and the 1-year yield. What is the economic intuition for these components?
10. The following materials discuss what happens when one uses historical data to determine optimal portfolio weights. In other words, you use historical return data to calculate the weights of the tangency portfolio, i.e., the portfolio with the highest Sharpe ratio (highest return per unit of risk). In a next step, you then invest accordingly and see what happens out of sample. The hope is that this procedure leads to very high risk-adjusted returns out of sample (i.e., high Sharpe ratios). However, as you can see from the additional lecture notes, a simple strategy that invests an equal amount in every asset (that is, 1/N of your wealth in each of the N assets) greatly outperforms the first strategy. What happened? The problem is that we tend to overfit the data in sample, which leads to bad results out of sample. What can we do to improve our performance? One possibility is to use a methodology that relies on shrinkage, such as ridge, shrinkage to the mean, or lasso. You don’t need to answer any questions in this part of the assignment. Your task is simply to download the relevant files and run the R code, which exemplifies how the various shrinkage methodologies can be implemented.
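To give a flavor of what shrinkage does, the sketch below pulls noisy sample mean returns toward their cross-sectional average before they would be fed into an optimizer. It is a Python illustration under stated assumptions and is not the R code provided with the course: the linear shrinkage rule, the intensity `delta`, the function names, and the numbers are all hypothetical choices made for this example.

```python
from statistics import mean

def shrink_to_grand_mean(sample_means, delta):
    """Pull each sample mean toward the cross-sectional average.

    delta = 0 keeps the sample means unchanged; delta = 1 replaces every
    mean by the grand mean, so the means no longer favor any asset.
    """
    grand = mean(sample_means)
    return [(1.0 - delta) * m + delta * grand for m in sample_means]

def equal_weights(n):
    """The 1/N benchmark: the same weight in each of the n assets."""
    return [1.0 / n] * n

# Hypothetical sample mean returns for three assets (made-up numbers).
sample_means = [0.03, 0.01, 0.02]

shrunk = shrink_to_grand_mean(sample_means, delta=0.5)
print(shrunk, equal_weights(len(sample_means)))
```

The intuition is that extreme sample means are partly estimation noise, so damping them reduces the extreme and unstable weights that an optimizer would otherwise produce. With full shrinkage the expected returns carry no information that favors one asset over another.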