Stephen Rush

UConn Commute

I started taking some photos along my commute from Providence to the main UConn campus in Storrs, CT. The fog effects on the last few photos are completely natural. I just added a frame to some and tweaked the color on a few.

Gonna Cut You Down

Most good songs might get one significant cover version. The truly great songs are remade over and over as artists find new interpretations of something that is universal. Here is the 1999 cover by Moby and the 2003 cover by Johnny Cash. Although the original song predates both by decades, each cover has something new to offer. If these versions peak your interest, check out the covers by Tom Jones, Marilyn Manson, and Elvis.

Ho Chi Minh City

IMG_20150604_193212 I traveled to Vietnam to present a paper at the 2nd Vietnam International Conference in Finance hosted at the University of Economics and Law in Ho Chi Minh City. The conference dinner was a marvelous buffet of Vietnamese seafood served in a pavilion along the river. IMG_20150605_224031

I also managed to visit Independence Palace, Bitexco Financial Tower, the post office, opera house, and City Hall.

Independence Palace

Nighthawks

I was attending the R/Finance conference in Chicago when I had the opportunity to visit the Art Institute of Chicago. Aside from recreating every scene in Ferris Bueller’s Day Off, I really liked seeing Edward Hopper’s Nighthawks in person.

Pastor and Stambaugh Liquidity

While there a many measures of liquidity, the metric in Pastor and Stambaugh (2003) is the standard used in empirical asset pricing. The first thing we need is daily data. Academics can use the CRSP data set but the idea can be replicated with free data. You will need Date, unique stock identifier (I use CUSIP but ticker works just as well), Volume, Return (make sure to include dividends), and the market return (Here I use the CRSP value weighted index but any broad value weighted index should work). Assuming you’ve already formatted the date properly and the data so everything is numeric and not factor or string, we start with a panel I’ve named ‘crsp’. I also recommend setting up a parallel backend to make use of all your CPU cores. Here I use the parallel package but if you have multiple machines be sure to check out doRedis for a relatively simple way to make the poor man’s computer cluster.

require(data.table); require(doParallel);
require(parallel); require(foreach);
registerDoParallel(detectCores()-1) # Keep 1 CPU core for the OS
crsp=fread(paste(path,'Data_Sets/WRDS_CRSP_LiquidityCalc_1993-2013.csv',sep=''),header=TRUE)
setnames(crsp,c('PERMNO','Date','Ticker','CUSIP','Price','Volume','Return','Shares','MktReturn'))
crsp=crsp[Price>5]; crsp[,MktCap:=1000*Shares*Price];
crsp=crsp[MktCap>1000000]
crsp[,Date:=as.Date(as.character(Date),format='%Y%m%d')];
crsp[,Month:=as.yearmon(Date)]; crsp[,Return:=as.numeric(Return)]
#### Create Metrics ##############################################
crsp[,Ret.e:=Return-MktReturn];
crsp[,SVolume:=Volume*(Ret.e/abs(Ret.e))]
crsp[,L.Ret.e:=c(NA,head(Ret.e,-1)),by='CUSIP']
crsp[,L.SVolume:=c(NA,head(SVolume,-1)),by='CUSIP']
crsp=crsp[,list(Date,Month,CUSIP,Ret.e,L.Ret.e,L.SVolume)]
#### Eliminate months with insufficient observations #############
crsp[,Obs:=length(na.omit(L.Ret.e)),by=c("CUSIP","Month")]
crsp=crsp[Obs>10]
#### Run model on daily data for each month ######################
model=Ret.e~L.Ret.e+L.SVolume
gammas=foreach(cusip=unique(crsp[,CUSIP]),.combine='rbind') %dopar% {
 stock=crsp[CUSIP==cusip];
 stock[,Gamma:=coef(lm(model,data=.SD))[3],by=c('Month')]
 stock=unique(stock[,list(Month,CUSIP,Gamma)]); return(stock)
}
save(gammas,file=paste(path,'Panels/Liquidity_Panel.RData',sep=''))

Ocean Gravity

Drift diving is where the diver finds an area with a strong current and plans to ride the current over the course of the dive rather than fight against it. This video uses a combination of drift diving and free diving to give some perspective on the size and power of the ocean relative to a human.

The Econometric Cycle

Use OLS
Quick results but assumptions are probably violated

Use Nonparametric Techniques
Results are not general enough because my sample isn’t big enough or lacks certain characteristics

Everything is a State Space Model!
Too many parameters → I don’t own a quantum computer.

Use GMM
Think more; use less parameters → Thinking is hard.

Use OLS
OLS is pretty robust

VPIN in R

This code is an implementation of Volume Synchronized Probability of Informed Trading by Easley, Lopez de Prado, and O’Hara (2012) published in the Review of Financial Studies. Easley et al. argue that the CDF of VPIN indicates order flow toxicity. This is their explanation of the flash crash in 2010. You can see slides to my R/Finance 2014 presentation on the topic.

This version of the code is not particularly fast and they are plenty of opportunities for a better programmer than me to tune it up for speed.

#### VPIN calculation #########################################################
#install.packages('fasttime',repos='http://www.rforge.net/')
require(data.table); require(fasttime); require(plyr)
# Assuming TAQ data is arranged in 1 year stock csv files
stock=fread('/TAQ_data.csv'); stock=stock[,1:3,with=FALSE]
setnames(stock,colnames(stock),c('DateTime','Price','Volume'));
stock[,DateTime:=paste(paste(substr(DateTime,1,4),substr(DateTime,5,6),
    substr(DateTime,7,8),sep='-'),substr(DateTime,10,17))]
setkey(stock,DateTime);
stock[,DateTime:=fastPOSIXct(DateTime,tz='GMT')]
stock=as.xts(stock)
# Now we have an xts data frame called 'stock' with a DateTime index and... 
# two columns: Price and Volume
# Vbucket=Number of volume buckets in an average volume day (Vbucket=50)
VPIN=function(stock,Vbucket) {
  stock$dP1=diff(stock[,'Price'],lag=1,diff=1,na.pad=TRUE)
  ends=endpoints(stock,'minutes')
  timeDF=period.apply(stock[,'dP1'],INDEX=ends,FUN=sum)
  timeDF$Volume=period.apply(stock[,'Volume'],INDEX=ends,FUN=sum)
  Vbar=mean(period.apply(timeDF[,'Volume'],INDEX=endpoints(timeDF,'days'),
    FUN=sum))/Vbucket
  timeDF$Vfrac=timeDF[,'Volume']/Vbar
  timeDF$CumVfrac=cumsum(timeDF[,'Vfrac'])
  timeDF$Next=(timeDF[,'CumVfrac']-floor(timeDF[,'CumVfrac']))/timeDF[,'Vfrac']
  timeDF[timeDF[,'Next']<1,'Next']=0
  timeDF$Previous=lag(timeDF[,'dP1'])*lag(timeDF[,'Next'])
  timeDF$dP2=(1-timeDF[,'Next'])*timeDF[,'dP1'] + timeDF[,'Previous']
  timeDF$Vtick=floor(timeDF[,'CumVfrac'])
  timeDF[,'Vtick']=timeDF[,'Vtick']-diff(timeDF[,'Vtick']); timeDF[1,'Vtick']=0
  timeDF=as.data.frame(timeDF); timeDF[,'DateTime']=row.names(timeDF)
  timeDF=ddply(as.data.frame(timeDF),.(Vtick),last)
  timeDF=as.xts(timeDF[,c('Volume','dP2','Vtick')],
    order.by=fastPOSIXct(timeDF$DateTime,tz='GMT'))
  timeDF[1,'dP2']=0
  timeDF$sigma=rollapply(timeDF[,'dP2'],Vbucket,sd,fill=NA)
  timeDF$sigma=na.fill(timeDF$sigma,"extend")
  timeDF$Vbuy=Vbar*pnorm(timeDF[,'dP2']/timeDF[,'sigma'])
  timeDF$Vsell=Vbar-timeDF[,'Vbuy']
  timeDF$OI=abs(timeDF[,'Vsell']-timeDF[,'Vbuy'])
  timeDF$VPIN=rollapply(timeDF[,'OI'],Vbucket,sum)/(Vbar*Vbucket)
  timeDF=timeDF[,c('VPIN')]; return(timeDF)
}
out=VPIN(stock,50)
###############################################################################

Here is what the original file looks like:

1993-01-04 09:35:25,10.375,5300,40,0,,N
1993-01-04 09:36:49,10.375,25000,40,0,,N
1993-01-04 09:53:06,10.375,100,40,0,,N
1993-01-04 10:04:13,10.375,200,40,0,,N
1993-01-04 10:04:20,10.375,100,40,0,,N
1993-01-04 10:24:42,10.375,1000,40,0,,N
1993-01-04 10:25:19,10.375,600,40,0,,N
1993-01-04 11:31:04,10.5,10000,40,0,,N
1993-01-04 12:13:09,10.5,200,0,0,,M
1993-01-04 12:13:38,10.5,200,0,0,,M

Stacey Kent

It baffles me why we don’t have more of this style of jazz which is simplistic enough to be musically accessible to everyone but substantial in chord structure and style. This is not the same as smooth jazz or elevator music. Raconte Moi is beautiful but Ces Petits Riens is my favorite.

Emacs and ESS on OS X for Beginners

When I first started learning R, I used R Commander as my GUI since it worked with all my operating systems and was relatively easy to setup and use. After a few months, I came to appreciate the simplicity of the RGui that is standard on R installation. Now, I am ready for something more advanced and have decided to try ESS (Emacs Speaks Statistics). Since I do not have any prior experience with Emacs and have not invested any time creating configuration settings and learning keyboard shortcuts, my main goal was to get a very Mac friendly Emacs version that just worked. OS X does come with Emacs installed but I didn’t want to go through the hassle of setting up ESS with the existing EMacs. I decided to use Aquamacs because it looks and behaves the most like a modern OS X application and has ESS enabled when installed. Emacs is available through the terminal but setting up and using ESS without menus does not provide an easier working environment than the RGUI or R Commander.

Emacs uses what are called meta commands since it originated before toolbars. To execute a meta command, type <Ctrl> u, <alt> x and the actual command. On Aquamacs, the right option key is set up to be the meta key so you can simply press <option> x before the actual command. You can also use the <esc> key as the meta key. To start R, press <option> x R <return> or <esc> x R <return>. You can also open an existing R script and click the R icon to start R. To customize ESS, select Customize Aquamacs -> Customize Group… from the Options menu and type ess at the prompt. You can also click the settings icon only after you have opened an R script. I like to have ESS start from the the same directory every time so I add my directory to the “ESS Directory” option and turn off the “ESS Ask for ESS Directory” option. Make sure to “Save for future sessions” before closing the buffer. I had already deleted R32 so I know that Emacs will default to R64 but I’m not sure how to work with both versions.

Update: I have now switched to RStudio since it is free, cross platform compatible, and simple to setup. I highly recommend it to any R user.