Free sample   Data analysis assignment stock return and portfolio analysis

## Data Analysis Assignment: Stock Return & Portfolio Analysis

Question

Data Analysis Assignment Requirement: Please finish the tasks according to the requirements. All the tasks need to be finished by using R. Please present your results in the word file, copy all your R code in the end of this word file.

Data Analysis Assignment Task 1: Stock Return and Portfolio Analysis
In the file named as “Stock.csv”, you have been provided with the daily prices of three stocks from 2012 to 2018.

a. Plot and present the stock prices in time series with appropriate labels.

b. Calculate the log returns of all the three stocks and express them in percentages.

Pls report the descriptive statistics of log returns of three stocks in Table 1. Pls change the names in Table 1 to the stock names in your file.

Table 1: Summary Statistics

 Variable MSFT MMM INTC Min -0.1210331 -0-0707713 -0-0954321 Max 0.0994128 0.0574467 0.1003147 Mean 0.0008531 0.0005652 0.0004914 Median 0.0004509 0.0001858 0.0006775 SD 0.01452656 0.01043945 0.01489527 Skewness -0.08481595 -0.6662234 -0.05525918 Kurtosis 11.24066 8.315543 8.115963 N 1758 1758 1758

c. Pls report the correlation matrix of log returns of three stocks in Table 1. Pls change the names in Table 1 to the stock names in your file.

Table 2: Correlation Matrix

 Variable MSFT MMM INTC MSFT 1.0000000 0.4738475 0.5441834 MMM 0.4738475 1.0000000 0.4900549 INTC 0.5441834 0.4900549 1.0000000

MSFT and INTC stocks highly correlated among other stock variables whereas MSFT and MMM has weak correlation.

d. There is a file named as “FF3factors.csv” in your folder. Merge your stock data with the Fama French three factors data, which are all in percentages. Pls create a new variable Stock.Rf for all the three stocks, which equals to the difference between log return of each stock and RF (available in the FF3factors.csv), and run a regression of :

Stock.Rf=?+?Mkt.Rf+u.

Report the regression results for all the three stocks in tables (you can use only one table to summarize all the results or three separate tables).

For Stock 1: Stock.Rf=-0.0239145+0.0003671Mkt.Rf+u

For Stock 2: Stock.Rf=-0.0242046+0.0003117Mkt.Rf+u

For Stock 3: Stock.Rf=-0.0242776+0.0003581Mkt.Rf+u

e. Pls run regressions of :
Stock.Rf=?+1Mkt.Rf+2SMB+3HML+u.

Report the regression results for all the three stocks in tables (you can use only one table to summarize all the results or three separate tables).

For Stock 1:
Stock.Rf=-0.0239123+0.0004275Mkt.Rf+0.0002940SMB-0.0003500HML+u

For Stock 2:
Stock.Rf=-0.0242031+0.0004039Mkt.Rf+0.0005355SMB-0.0004452HML+u

For Stock 3:
Stock.Rf=-0.0242708+0.0004460Mkt.Rf+0.002439SMB-0.0006986HML+u

f. Pls make your comments by comparing the results in d and e.

The multiple regression coefficients have larger coefficient values shows that a unit change in a new variable may change the dependent variables more than a linear regression coefficient.

g. If an investor would like to form a portfolio with a targeted expected return of 0.05% and achieve the minimized standard deviation by investing in these three stocks in the “Stock.csv” file. If the daily riskfree rate is 0.02%, what is the Sharpe ratio of this optimal portfolio, given there is no short sale constraint? [hint: can use the library of “quadprog”] [pls provide your R code used to form the optimal portfolio in the end of the word file].

Sharpe Ratio for Stock 1 = 0.00121663

Sharpe Ratio for Stock 2 = 0.0006628876

Sharpe Ratio for Stock 3 = 0.003161835

h. If an investor would like to form a portfolio with a targeted expected return of 0.05% and achieve the minimized standard deviation by investing in these three stocks. If the daily risk free rate is 0.02%, what is the Sharpe ratio of this optimal portfolio, given there is short sale constraint (i.e., you cannot short sell the stocks)? [hint: can use the library of “quadprog”]

Zk=C0k+C1kRf

C0kand C1k are the constants where k is asset value. We can determine Zk can be determine as the function of Rf. Each stock investment can be determined from at various pointsRf.

i. By comparing your answer in g) and h), is there any difference in their Sharpe ratios? Why that’s the case?

When sharp ratio is greater it shows the best performance, if it has negative sharp ratios it means risk free rate is greater than the portfolio risk free rate.

j. Given there is no short sale constraint, pls draw an efficient frontier that satisfies the following conditions: 1. The range of the mean return of the portfolio is from 0.75*min of the mean return of three stocks to 1.25*max of the mean return of three stocks; 2. Pls create 500 optimal portfolios in this range; 3 Then pls plot all the risk-return combination of the 500 optimal portfolios (i.e., the efficient frontier). [Pls provide your R code in the end of word file]

Task 2: Stock Market Reaction to FOMC Announcement
US monetary policy can be measured by changes in the Federal funds rate, which are announced by the Federal Open Market Committee (FOMC) meetings. Given the importance of monetary policy to the whole economy, investors pay special attention to the FOMC announcements. In the attached dataset (“Event Study.csv”), we have collected data on FOMC announcements from 1989 to 2007 and the aggregate stock market reactions to it.

Here is a list of variables:

• Date: FOMC announcement date
• Return: S&P 500 index return on the announcement date (in percentage)
• Total change: change of Federal fund rate (in bps)
• Expected: investors’ expected change of Federal fund rate, which is estimated based on the futures contracts written on the Federal fund rate (in bps)
• Surprise: the difference between Total change and Expected, which measures the change that is out of expectation of investors (in bps)
• Scheduled: equals to 1 if the FOMC is scheduled, and 0 if not.

a) Run three regressions and report the estimation results according.
[Note: please report the coefficient and t-statistics; indicate significance with difference levels, and the adjusted R-squared of the regression.] [pls provide your R code in the end of word file]

In the first regression, Y is Return, X is Total change;

In the second regression, Y is Return, X is Expected;

In the third regression, Y is Return, X is Surprise.

Required R code:

y <- data1\$Return

x1 <- data1\$total.change

x2 <- data1\$surprise

x3 <- data1\$expected

x4 <- data1\$Scheduled

fit1 <- lm(y~x1)

summary(fit1)

fit2 <- lm(y~x3)

summary(fit2)

fit3 <- lm(y~x2)

summary(fit3)

b) Comparing the three regression results, which X variable explains Y the best? Why that is the case? [Note: you may link it with market efficiency and event study]

Variable surprise best explains the regression results in all three regression lines as this variable has high R-sq (coefficient of determination).

c) Create a variable which equals to Scheduled*Surprise, and then run a regression in which Y is Return, the first X variable is Surprise, and the second X variable is Scheduled*Surprise. Report the regression results. [pls provide your R code]

Required R code:

y <- data1\$Return

x1 <- data1\$total.change

x2 <- data1\$surprise

x3 <- data1\$expected

x4 <- data1\$Scheduled

z <- x4*x3

fit4 <- lm(y~z)

summary(fit4)

d) Based on the results, is the effect of Surprise on Return dependent on whether the FOMC meeting is scheduled or not? Please explain.

Based on results (coefficient and pvalue), we can conclude that meeting is not scheduled.

R code:

#log returns for Stock1

log_returns1 <- diff(log(Stock1), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns1){

p <- (log_returns1[i]/sum(log_returns1))*100

print(p)

}

summary(log_returns1)

N <- length(log_returns1)

k <- mean(log_returns1)

StandardDeviation<- sqrt(var(log_returns1))

m1=sum(log_returns1 - k)/N

m2=sum((log_returns1 - k)^2)/N

m3=sum((log_returns1 - k)^3)/N

m4=sum((log_returns1 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#log returns for Stock2

log_returns2 <- diff(log(Stock2), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns2){

p <- (log_returns2[i]/sum(log_returns2))*100

print(p)

}

summary(log_returns2)

N <- length(log_returns2)

k <- mean(log_returns2)

StandardDeviation<- sqrt(var(log_returns2))

m1=sum(log_returns2 - k)/N

m2=sum((log_returns2 - k)^2)/N

m3=sum((log_returns2 - k)^3)/N

m4=sum((log_returns2 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#log returns for Stock3

log_returns3 <- diff(log(Stock3), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns3){

p <- (log_returns3[i]/sum(log_returns3))*100

print(p)

}

summary(log_returns3)

N <- length(log_returns3)

k <- mean(log_returns3)

StandardDeviation<- sqrt(var(log_returns3))

m1=sum(log_returns3 - k)/N

m2=sum((log_returns3 - k)^2)/N

m3=sum((log_returns3 - k)^3)/N

m4=sum((log_returns3 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#Correlation Matrix of 3 Stocks

y <- data.frame(log_returns1,log_returns2,log_returns3)

cor(y)

RCode:

Stock1 <- data\$MSFT

Stock2 <- data\$MMM

Stock3 <- data\$INTC

Date <- data\$Date

plot(Date, Stock1,xlab = "Date", ylab = "Stock1", main = "Time Series Plot of Stock1")

plot(Date, Stock2,Xlab = "Date", ylab = "Stock2", main = "Time Series Plot of Stock2")

plot(Date, Stock3,Xlab = "Date", ylab = "Stock3", main = "Time Series Plot of Stock3")

par(mfrow=c(3,3))

#log returns for Stock1

log_returns1 <- diff(log(Stock1), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns1){

p <- (log_returns1[i]/sum(log_returns1))*100

print(p)

}

summary(log_returns1)

N <- length(log_returns1)

k <- mean(log_returns1)

StandardDeviation<- sqrt(var(log_returns1))

m1=sum(log_returns1 - k)/N

m2=sum((log_returns1 - k)^2)/N

m3=sum((log_returns1 - k)^3)/N

m4=sum((log_returns1 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#log returns for Stock2

log_returns2 <- diff(log(Stock2), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns2){

p <- (log_returns2[i]/sum(log_returns2))*100

print(p)

}

summary(log_returns2)

N <- length(log_returns2)

k <- mean(log_returns2)

StandardDeviation<- sqrt(var(log_returns2))

m1=sum(log_returns2 - k)/N

m2=sum((log_returns2 - k)^2)/N

m3=sum((log_returns2 - k)^3)/N

m4=sum((log_returns2 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#log returns for Stock3

log_returns3 <- diff(log(Stock3), lag=1)

#log_returns in percentages

p <- c()

for(i in log_returns3){

p <- (log_returns3[i]/sum(log_returns3))*100

print(p)

}

summary(log_returns3)

N <- length(log_returns3)

k <- mean(log_returns3)

StandardDeviation<- sqrt(var(log_returns3))

m1=sum(log_returns3 - k)/N

m2=sum((log_returns3 - k)^2)/N

m3=sum((log_returns3 - k)^3)/N

m4=sum((log_returns3 - k)^4)/N

Sk<- m3/StandardDeviation^3;Sk

Kurtosis <- m4/(m2)^2;Kurtosis

#Correlation Matrix of 3 Stocks

y <- data.frame(log_returns1,log_returns2,log_returns3)

cor(y)

#(d)

Mkt.RF<- data1\$Mkt.RF

SMB <- data1\$SMB

HML <- data1\$HML

RF <- data1\$RF;length(RF)

Stock1 <- data\$MSFT

log_returns1 <- diff(log(Stock1), lag=1)

m1 <- log_returns1-RF;m1

Stock.Rf1 <- m1-RF

fit <- lm(Stock.Rf1~Mkt.RF);fit

Stock2 <- data\$MMM

log_returns2 <- diff(log(Stock2), lag=1)

m2 <- log_returns2-RF;m2

Stock.Rf2 <- m2-RF

fit1 <- lm(Stock.Rf2~Mkt.RF);fit1

Stock3 <- data\$INTC

log_returns3 <- diff(log(Stock3), lag=1)

m3 <- log_returns3-RF;m3

Stock.Rf3 <- m3-RF

fit2 <- lm(Stock.Rf3~Mkt.RF);fit2

#(e)

fit3 <- lm(Stock.Rf1~Mkt.RF+SMB+HML);fit3

fit4 <- lm(Stock.Rf2~Mkt.RF+SMB+HML);fit4

fit5 <- lm(Stock.Rf3~Mkt.RF+SMB+HML);fit5

#(h)

SharpeRatio1 <- (0.05-0.02)/24.65827;SharpeRatio1

SharpeRatio2 <- (0.05-0.02)/45.25654;SharpeRatio2

SharpeRatio3 <- (0.05-0.02)/9.488162;SharpeRatio3

#(j)

Stock1 <- data\$MSFT

Stock2 <- data\$MMM

Stock3 <- data\$INTC

Date <- data\$Date

returns <- getReturns(names(stocks[-1]), freq="week")

eff.frontier<- function (returns, short="no", max.allocation=NULL,

covariance <- cov(returns)

print(covariance)

n <- ncol(covariance)

Amat<- matrix (1, nrow=n)

bvec<- 1

meq<- 1

# Then modify the Amat and bvec if short-selling is prohibited

if(short=="no"){

Amat<- cbind(1, diag(n))

bvec<- c(bvec, rep(0, n))

}

if(!is.null(max.allocation)){

if(max.allocation> 1 | max.allocation<0){

stop("max.allocation must be greater than 0 and less than 1")

}

if(max.allocation * n < 1){

stop("Need to set max.allocation higher; not enough assets to add to 1")

}

Amat<- cbind(Amat, -diag(n))

bvec<- c(bvec, rep(-max.allocation, n))

}

# Calculate the number of loops

loops <- risk.premium.up / risk.increment + 1

loop <- 1

eff <- matrix(nrow=loops, ncol=n+3)

# Now I need to give the matrix column names

colnames(eff) <- c(colnames(returns), "Std.Dev", "Exp.Return", "sharpe")

# Loop through the quadratic program solver

for (i in seq(from=0, to=risk.premium.up, by=risk.increment)){

dvec<- colMeans(returns) * i # This moves the solution along the EF

sol <- solve.QP(covariance, dvec=dvec, Amat=Amat, bvec=bvec, meq=meq)

eff[loop,"Std.Dev"] <- sqrt(sum(sol\$solution*colSums((covariance*sol\$solution))))

eff[loop,"Exp.Return"] <- as.numeric(sol\$solution %*% colMeans(returns))

eff[loop,"sharpe"] <- eff[loop,"Exp.Return"] / eff[loop,"Std.Dev"]

eff[loop,1:n] <- sol\$solution

loop <- loop+1

}

return(as.data.frame(eff))

}

# Run the eff.frontier function based on no short and 50% alloc. restrictions

eff <- eff.frontier(returns=returns\$R, short="no", max.allocation=.50,

y <- data2\$Return

x1 <- data2\$total.change

x2 <- data2\$surprise

x3 <- data2\$expected

x4 <- data2\$Scheduled

fit1 <- lm(y~x1)

summary(fit1)

fit2 <- lm(y~x3)

summary(fit2)

fit3 <- lm(y~x2)

summary(fit3)

#c

z <- x4*x3

fit4 <- lm(y~z)

summary(fit4)

References:

• Matloff, S.N. (2011). The Art of R Programming. No Starch Press.
• Crawley, Michael J. (2013). The R book. Data analysis assignment Chichester, West Sussex, United Kingdom :Wiley,
NEXT SAMPLE