투자분석 과제3

Question

Select any four stocks from the pool of market index. Make sure all four stocks belong to four different industries. You will estimate the single index model for each of the chosen stocks.

Collect the 60 recent monthly returns of the chosen stocks, and the monthly T-bill rates for the same period. Also, you need market index returns for the same period. Run regression model with this data. Report the alpha and beta estimates.
Interpret each estimate of alpha and beta. Consider the smallest and largest betas among the four stocks. To which industries do the two companies belong? Is the business consistent with the estimated beta for the two companies?
Use the first 30 months only and run the regression. Report the alpha and beta estimates.
Use the last 30 months only and run the regression. Report the alpha and beta estimates.
Are the three sets of estimates identical? Discuss the result of a), c) and d).

Answer

먼저, 제가 선정한 주식은 금융산업의 뱅크오브아메리카(BAC), 거래소산업의 시카고거래소그룹(CME), IT업종의 IBM(IBM), 외식업종의 맥도날드(MCD)입니다.

네 종류의 주식은 모두 S&P500 지수의 구성종목이며, 출처는 야후파이낸스입니다.

무위험이자율은 4 weeks T-bill rate이며, 출처는 미 재무부 홈페이지입니다.

2019~2023년 5년간 데이터를 월별로 취합하여 지수 및 주식의 월별수익률을 산출한 다음, 월환산 무위험이자율을 차감하여 초과수익률을 산출하였고, 각 주식의 초과수익률과 지수의 초과수익률에 대해서 단순선형회귀분석을 진행하였습니다.

rm(list=ls())
library(tidyverse)

# stocks and market index S&P500 from yahoo finance
bac <- read_csv("investment_hw/bac.csv") %>% tibble()
cme <- read_csv("investment_hw/cme.csv") %>% tibble()
ibm <- read_csv("investment_hw/ibm.csv") %>% tibble()
mcd <- read_csv("investment_hw/mcd.csv") %>% tibble()
spx <- read_csv("investment_hw/spx.csv") %>% tibble()
# risk-free rate is effective-FFR(federal funds rate)
tbill <- read_csv("investment_hw/tbill.csv") %>% tibble()

# set period 5years
strt_dd='20181201'
end_dd='20231231'

# tidy date
rfr <- tbill %>% mutate(tbill=`4 WEEKS BANK DISCOUNT`) %>% 
  mutate(y=paste0("20",substr(Date,nchar(Date)-1,nchar(Date)))) %>% 
  mutate(m=if_else(substr(Date,2,2)=="/",
                   paste0("0",substr(Date,1,1)),substr(Date,1,2))) %>% 
  mutate(d=if_else(substr(Date,2,2)=="/",
                   if_else(substr(Date,4,4)=="/",
                           paste0("0",substr(Date,3,3)),substr(Date,3,4)),
                   if_else(substr(Date,5,5)=="/",
                           paste0("0",substr(Date,4,4)),substr(Date,4,5)))) %>% 
  mutate(day=paste0(y,m,d)) %>% 
  select(day,tbill)

raw_data <- tibble()
raw_data <- bac %>% mutate(bac=`Adj Close`) %>% select(Date,bac) %>% 
  left_join(cme %>% mutate(cme=`Adj Close`) %>% select(Date,cme)) %>% 
  left_join(ibm %>% mutate(ibm=`Adj Close`) %>% select(Date,ibm)) %>% 
  left_join(mcd %>% mutate(mcd=`Adj Close`) %>% select(Date,mcd)) %>% 
  left_join(spx %>% mutate(spx=`Adj Close`) %>% select(Date,spx)) %>% 
  mutate(day=gsub("-","",Date)) %>% 
  mutate(year=substr(day,1,4)) %>% 
  mutate(month=substr(day,1,6)) %>% 
  left_join(rfr,by="day") %>% 
  filter(day>=strt_dd,day<=end_dd) %>% 
  select(year,month,day,bac,cme,ibm,mcd,spx,tbill,Date)

# using monthly return
monthly_raw <- tibble()
monthly_raw <- raw_data %>% 
  group_by(year,month) %>% 
  arrange(day %>% desc()) %>% 
  slice(1) %>% 
  pivot_longer(.,c("bac","cme","ibm","mcd","spx"),
               names_to = "name",values_to = "price") %>% 
  ungroup() %>% 
  arrange(name,year,month) %>% 
  mutate(return=price/lag(price)-1) %>%  # monthly return
  mutate(ex_return=return-tbill/100/12) %>% # excess return
  filter(as.integer(month)>=201901)

(a)

각 회귀분석의 결과로 산출된 절편(알파, 연환산)과 계수(베타)는 아래와 같습니다.

구 분	alpla	beta
Band of America	-0.0571	1.42
CME	0.0054	0.45
IBM	0.0439	0.76
McDonald’s	0.0352	0.71

regression <- tibble()
regression <- monthly_raw %>% 
  select(day,name,ex_return) %>% 
  pivot_wider(.,values_from = "ex_return", names_from = "name")

reg_bac <- lm(regression$bac~regression$spx) %>% coef() %>% as_tibble()
reg_cme <- lm(regression$cme~regression$spx) %>% coef() %>% as_tibble()
reg_ibm <- lm(regression$ibm~regression$spx) %>% coef() %>% as_tibble()
reg_mcd <- lm(regression$mcd~regression$spx) %>% coef() %>% as_tibble()

result_reg <- tibble(name=c("bac","cme","ibm","mcd"),
                     alpha=c(reg_bac$value[1],reg_cme$value[1],reg_ibm$value[1],reg_mcd$value[1])*12,
                     beta=c(reg_bac$value[2],reg_cme$value[2],reg_ibm$value[2],reg_mcd$value[2]))
result_reg

# A tibble: 4 × 3
  name     alpha  beta
  <chr>    <dbl> <dbl>
1 bac   -0.0571  1.42 
2 cme    0.00536 0.449
3 ibm    0.0439  0.758
4 mcd    0.0352  0.708

(b)

베타가 가장 높은 주식은 뱅크오브아메리카이며, 가장 낮은 주식은 시카고거래소그룹입니다.

일반적으로 금융업종이 기간산업인 거래소업종에 비해 변동성이 큰 편이며, 실제로 기간 중 BOA의 연환산 변동성이 33.4%로 CME의 20.9%를 크게 상회하였습니다.

물론 기간 중 변동성이 가장 낮았던 주식은 맥도날드(19.1%)로, 베타는 0.71이 산출되어 변동성과 베타의 관계가 선형적이지는 않지만, 거래소산업의 베타가 일반적으로 0.5수준임을 고려할 때 합리적인 결과가 산출된 것으로 보입니다.

monthly_raw %>% 
  group_by(name) %>% 
  summarise(vol=sd(ex_return)*sqrt(12))

# A tibble: 5 × 2
  name    vol
  <chr> <dbl>
1 bac   0.334
2 cme   0.209
3 ibm   0.240
4 mcd   0.191
5 spx   0.185

(c)~(d)

먼저, (a), (c), (d)의 결과를 각각 정리하면 아래와 같습니다.

연환산 Alpha 및 Beta

구 분	(a)$\alpha$	(c)$\alpha$	(d)$\alpha$	(a)$\beta$	(c)$\beta$	(d)$\beta$
Band of America	-0.0571	-0.0814	-0.0699	1.42	1.61	1.23
CME	0.0054	-0.0314	0.0196	0.45	0.56	0.35
IBM	0.0439	-0.0804	0.0940	0.76	1.14	0.41
McDonald’s	0.0352	-0.0241	0.0895	0.71	0.73	0.71

알파의 경우 앞 30개월의 결과가 낮은 경향이 있고, 베타의 경우 앞 30개월의 결과가 높은 경향이 있습니다.

앞 30개월의 기간은 2019.1~2021.6이며, 뒤 30개월은 2021.7~2023.12입니다.

기간의 특징적인 부분은 앞 30개월의 기간에 코로나19 팬데믹으로 인해 주가변동성이 굉장히 높고, 주식이 급락했던 시기를 포함하고 있다는 것 입니다.

따라서 앞 30개월 기간동안은 평균적으로 주식의 변동성이 높았고, 주가급락으로 인해 평균적인 수익률은 낮아 시장지수인 S&P500을 Underperform하였을 가능성이 높습니다.

이로 인해 시장지수 대비 초과수익률인 알파는 (-)를 기록하였으며, 베타는 전체기간(a)과 후반기간(d) 대비 높은 수치가 산출된 것으로 해석할 수 있습니다.

오히려, 최근 30개월간의 주식시장을 평균적인 흐름이라고 본다면, 뒤 30개월의 결과값이 해당 주식들의 일반적인 알파 및 베타라고 볼 수 있습니다. 따라서, 코로나19 기간동안의 왜곡된 값으로 인해 전체 기간의 알파가 과소평가, 베타가 과대평가되었을 가능성이 있습니다.

regression2 <- tibble()
regression2 <- monthly_raw %>% 
  select(day,name,ex_return) %>% 
  pivot_wider(.,values_from = "ex_return", names_from = "name") %>% 
  slice(1:30)

reg_bac2 <- lm(regression2$bac~regression2$spx) %>% coef() %>% as_tibble()
reg_cme2 <- lm(regression2$cme~regression2$spx) %>% coef() %>% as_tibble()
reg_ibm2 <- lm(regression2$ibm~regression2$spx) %>% coef() %>% as_tibble()
reg_mcd2 <- lm(regression2$mcd~regression2$spx) %>% coef() %>% as_tibble()

result_reg2 <- tibble(name=c("bac","cme","ibm","mcd"),
                     alpha2=c(reg_bac2$value[1],reg_cme2$value[1],
                              reg_ibm2$value[1],reg_mcd2$value[1])*12,
                     beta2=c(reg_bac2$value[2],reg_cme2$value[2],
                             reg_ibm2$value[2],reg_mcd2$value[2]))

regression3 <- tibble()
regression3 <- monthly_raw %>% 
  select(day,name,ex_return) %>% 
  pivot_wider(.,values_from = "ex_return", names_from = "name") %>% 
  slice(31:60)

reg_bac3 <- lm(regression3$bac~regression3$spx) %>% coef() %>% as_tibble()
reg_cme3 <- lm(regression3$cme~regression3$spx) %>% coef() %>% as_tibble()
reg_ibm3 <- lm(regression3$ibm~regression3$spx) %>% coef() %>% as_tibble()
reg_mcd3 <- lm(regression3$mcd~regression3$spx) %>% coef() %>% as_tibble()

result_reg3 <- tibble(name=c("bac","cme","ibm","mcd"),
                     alpha3=c(reg_bac3$value[1],reg_cme3$value[1],
                              reg_ibm3$value[1],reg_mcd3$value[1])*12,
                     beta3=c(reg_bac3$value[2],reg_cme3$value[2],
                             reg_ibm3$value[2],reg_mcd3$value[2]))

result <- tibble()
result <- result_reg %>% 
  left_join(result_reg2,by="name") %>% 
  left_join(result_reg3,by="name")

result

# A tibble: 4 × 7
  name     alpha  beta  alpha2 beta2  alpha3 beta3
  <chr>    <dbl> <dbl>   <dbl> <dbl>   <dbl> <dbl>
1 bac   -0.0571  1.42  -0.0814 1.61  -0.0699 1.23 
2 cme    0.00536 0.449 -0.0314 0.564  0.0196 0.345
3 ibm    0.0439  0.758 -0.0804 1.14   0.0940 0.413
4 mcd    0.0352  0.708 -0.0241 0.729  0.0895 0.713

# 투자분석 과제3 {.unnumbered} ## Question Select any four stocks from the pool of market index. Make sure all four stocks belong to four different industries. You will estimate the single index model for each of the chosen stocks. a) Collect the 60 recent monthly returns of the chosen stocks, and the monthly T-bill rates for the same period. Also, you need market index returns for the same period. Run regression model with this data. Report the alpha and beta estimates. b) Interpret each estimate of alpha and beta. Consider the smallest and largest betas among the four stocks. To which industries do the two companies belong? Is the business consistent with the estimated beta for the two companies? c) Use the first 30 months only and run the regression. Report the alpha and beta estimates. d) Use the last 30 months only and run the regression. Report the alpha and beta estimates. e) Are the three sets of estimates identical? Discuss the result of a), c) and d). ## Answer 먼저, 제가 선정한 주식은 금융산업의 뱅크오브아메리카(BAC), 거래소산업의 시카고거래소그룹(CME), IT업종의 IBM(IBM), 외식업종의 맥도날드(MCD)입니다. 네 종류의 주식은 모두 S&P500 지수의 구성종목이며, 출처는 야후파이낸스입니다. 무위험이자율은 4 weeks T-bill rate이며, 출처는 미 재무부 홈페이지입니다. 2019~2023년 5년간 데이터를 월별로 취합하여 지수 및 주식의 월별수익률을 산출한 다음, 월환산 무위험이자율을 차감하여 초과수익률을 산출하였고, 각 주식의 초과수익률과 지수의 초과수익률에 대해서 단순선형회귀분석을 진행하였습니다. ```{r} #| output: false rm(list=ls()) library(tidyverse) # stocks and market index S&P500 from yahoo finance bac <- read_csv("investment_hw/bac.csv") %>% tibble() cme <- read_csv("investment_hw/cme.csv") %>% tibble() ibm <- read_csv("investment_hw/ibm.csv") %>% tibble() mcd <- read_csv("investment_hw/mcd.csv") %>% tibble() spx <- read_csv("investment_hw/spx.csv") %>% tibble() # risk-free rate is effective-FFR(federal funds rate) tbill <- read_csv("investment_hw/tbill.csv") %>% tibble() # set period 5years strt_dd='20181201' end_dd='20231231' # tidy date rfr <- tbill %>% mutate(tbill=`4 WEEKS BANK DISCOUNT`) %>% mutate(y=paste0("20",substr(Date,nchar(Date)-1,nchar(Date)))) %>% mutate(m=if_else(substr(Date,2,2)=="/", paste0("0",substr(Date,1,1)),substr(Date,1,2))) %>% mutate(d=if_else(substr(Date,2,2)=="/", if_else(substr(Date,4,4)=="/", paste0("0",substr(Date,3,3)),substr(Date,3,4)), if_else(substr(Date,5,5)=="/", paste0("0",substr(Date,4,4)),substr(Date,4,5)))) %>% mutate(day=paste0(y,m,d)) %>% select(day,tbill) raw_data <- tibble() raw_data <- bac %>% mutate(bac=`Adj Close`) %>% select(Date,bac) %>% left_join(cme %>% mutate(cme=`Adj Close`) %>% select(Date,cme)) %>% left_join(ibm %>% mutate(ibm=`Adj Close`) %>% select(Date,ibm)) %>% left_join(mcd %>% mutate(mcd=`Adj Close`) %>% select(Date,mcd)) %>% left_join(spx %>% mutate(spx=`Adj Close`) %>% select(Date,spx)) %>% mutate(day=gsub("-","",Date)) %>% mutate(year=substr(day,1,4)) %>% mutate(month=substr(day,1,6)) %>% left_join(rfr,by="day") %>% filter(day>=strt_dd,day<=end_dd) %>% select(year,month,day,bac,cme,ibm,mcd,spx,tbill,Date) # using monthly return monthly_raw <- tibble() monthly_raw <- raw_data %>% group_by(year,month) %>% arrange(day %>% desc()) %>% slice(1) %>% pivot_longer(.,c("bac","cme","ibm","mcd","spx"), names_to = "name",values_to = "price") %>% ungroup() %>% arrange(name,year,month) %>% mutate(return=price/lag(price)-1) %>% # monthly return mutate(ex_return=return-tbill/100/12) %>% # excess return filter(as.integer(month)>=201901) ``` ### (a) **각 회귀분석의 결과로 산출된 절편(알파, 연환산)과 계수(베타)는 아래와 같습니다.** |구 분|alpla|beta| |:--:|----:|---:| |Band of America|-0.0571|1.42| |CME|0.0054|0.45| |IBM|0.0439|0.76| |McDonald's|0.0352|0.71| ```{r} regression <- tibble() regression <- monthly_raw %>% select(day,name,ex_return) %>% pivot_wider(.,values_from = "ex_return", names_from = "name") reg_bac <- lm(regression$bac~regression$spx) %>% coef() %>% as_tibble() reg_cme <- lm(regression$cme~regression$spx) %>% coef() %>% as_tibble() reg_ibm <- lm(regression$ibm~regression$spx) %>% coef() %>% as_tibble() reg_mcd <- lm(regression$mcd~regression$spx) %>% coef() %>% as_tibble() result_reg <- tibble(name=c("bac","cme","ibm","mcd"), alpha=c(reg_bac$value[1],reg_cme$value[1],reg_ibm$value[1],reg_mcd$value[1])*12, beta=c(reg_bac$value[2],reg_cme$value[2],reg_ibm$value[2],reg_mcd$value[2])) result_reg ``` ### (b) **베타가 가장 높은 주식은 뱅크오브아메리카이며, 가장 낮은 주식은 시카고거래소그룹**입니다. 일반적으로 금융업종이 기간산업인 거래소업종에 비해 변동성이 큰 편이며, 실제로 기간 중 BOA의 연환산 변동성이 33.4%로 CME의 20.9%를 크게 상회하였습니다. 물론 기간 중 변동성이 가장 낮았던 주식은 맥도날드(19.1%)로, 베타는 0.71이 산출되어 변동성과 베타의 관계가 선형적이지는 않지만, 거래소산업의 베타가 일반적으로 0.5수준임을 고려할 때 **합리적인 결과가 산출**된 것으로 보입니다. ```{r} monthly_raw %>% group_by(name) %>% summarise(vol=sd(ex_return)*sqrt(12)) ``` ### (c)~(d) 먼저, (a), (c), (d)의 결과를 각각 정리하면 아래와 같습니다. ***연환산 Alpha 및 Beta*** |구 분|(a)$\alpha$|(c)$\alpha$|(d)$\alpha$|(a)$\beta$|(c)$\beta$|(d)$\beta$| |:--:|----:|---:|---:|----:|---:|---:| |Band of America|-0.0571|-0.0814|-0.0699|1.42|1.61|1.23| |CME|0.0054|-0.0314|0.0196|0.45|0.56|0.35| |IBM|0.0439|-0.0804|0.0940|0.76|1.14|0.41| |McDonald's|0.0352|-0.0241|0.0895|0.71|0.73|0.71| **알파의 경우 앞 30개월의 결과가 낮은 경향이 있고, 베타의 경우 앞 30개월의 결과가 높은 경향**이 있습니다. 앞 30개월의 기간은 2019.1~2021.6이며, 뒤 30개월은 2021.7~2023.12입니다. 기간의 특징적인 부분은 **앞 30개월의 기간에 코로나19 팬데믹으로 인해 주가변동성이 굉장히 높고, 주식이 급락했던 시기를 포함**하고 있다는 것 입니다. 따라서 앞 30개월 기간동안은 평균적으로 주식의 변동성이 높았고, 주가급락으로 인해 평균적인 수익률은 낮아 시장지수인 S&P500을 Underperform하였을 가능성이 높습니다. 이로 인해 **시장지수 대비 초과수익률인 알파는 (-)를 기록**하였으며, **베타는 전체기간(a)과 후반기간(d) 대비 높은 수치가 산출**된 것으로 해석할 수 있습니다. 오히려, 최근 30개월간의 주식시장을 평균적인 흐름이라고 본다면, 뒤 30개월의 결과값이 해당 주식들의 일반적인 알파 및 베타라고 볼 수 있습니다. 따라서, **코로나19 기간동안의 왜곡된 값으로 인해 전체 기간의 알파가 과소평가, 베타가 과대평가되었을 가능성**이 있습니다. ```{r} regression2 <- tibble() regression2 <- monthly_raw %>% select(day,name,ex_return) %>% pivot_wider(.,values_from = "ex_return", names_from = "name") %>% slice(1:30) reg_bac2 <- lm(regression2$bac~regression2$spx) %>% coef() %>% as_tibble() reg_cme2 <- lm(regression2$cme~regression2$spx) %>% coef() %>% as_tibble() reg_ibm2 <- lm(regression2$ibm~regression2$spx) %>% coef() %>% as_tibble() reg_mcd2 <- lm(regression2$mcd~regression2$spx) %>% coef() %>% as_tibble() result_reg2 <- tibble(name=c("bac","cme","ibm","mcd"), alpha2=c(reg_bac2$value[1],reg_cme2$value[1], reg_ibm2$value[1],reg_mcd2$value[1])*12, beta2=c(reg_bac2$value[2],reg_cme2$value[2], reg_ibm2$value[2],reg_mcd2$value[2])) regression3 <- tibble() regression3 <- monthly_raw %>% select(day,name,ex_return) %>% pivot_wider(.,values_from = "ex_return", names_from = "name") %>% slice(31:60) reg_bac3 <- lm(regression3$bac~regression3$spx) %>% coef() %>% as_tibble() reg_cme3 <- lm(regression3$cme~regression3$spx) %>% coef() %>% as_tibble() reg_ibm3 <- lm(regression3$ibm~regression3$spx) %>% coef() %>% as_tibble() reg_mcd3 <- lm(regression3$mcd~regression3$spx) %>% coef() %>% as_tibble() result_reg3 <- tibble(name=c("bac","cme","ibm","mcd"), alpha3=c(reg_bac3$value[1],reg_cme3$value[1], reg_ibm3$value[1],reg_mcd3$value[1])*12, beta3=c(reg_bac3$value[2],reg_cme3$value[2], reg_ibm3$value[2],reg_mcd3$value[2])) result <- tibble() result <- result_reg %>% left_join(result_reg2,by="name") %>% left_join(result_reg3,by="name") result ```