Tuesday, October 14, 2008

Simple analysis of stocks in R


First things first: getting data and getting it into R.

I like Yahoo! finance: http://finance.yahoo.com/
Choose a company you might be interested in, e.g., Questar gas (STR).
We'll want a lot of data, so choose Historical Prices from the options under Quotes on the left of the summary page.

Now, scroll to the bottom and choose Download to Spreadsheet. There you have your .csv file.

*See previous posts about getting .csv data into R*

Here is some sample R code to run a simple analysis and plot the data:

data<-read.csv(file.choose())
data
summary(data)
attach(data)
Date_3=1:length(Close)
data=cbind(data,Date_3)

lin=lm(Close~Date_3)


pdf(file='moola.pdf')
par(bg="snow", family="serif", ps=10)
plot(Close~Date_3, data=data, type="l", xlab="Date", sub="Questar",axes=F, ylab="Closing $")
abline(h=mean(Close), col="blue", lty=2)
abline(h=max(Close), col="lightblue", lty=5)
abline(h=min(Close), col="lightblue", lty=5)
abline(lin, col="green", lty=2)
axis(1,at=c(1,1000,2000,3000,4000, 5232),
labels=c("12/30/87","12/11/91", "11/24/95", "11/10/99","11/12/03","Oct 6, 2008"))
axis(2, label=T)
axis(2, at=mean(Close), labels="[Mean]")
dev.off()
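
The Date_3 index above just counts rows, so the axis labels have to be typed in by hand. A possible alternative, sketched here under the assumption that the downloaded file has a Date column in YYYY-MM-DD form and a Close column, is to convert Date with as.Date and let R label the axis. The small data frame below is made up so the example runs on its own:

```r
# Made-up stand-in for the downloaded file (hypothetical values, not real quotes)
stock <- data.frame(
  Date  = c("2008-10-06", "2008-10-03", "2008-10-02", "2008-10-01"),
  Close = c(30.1, 31.4, 32.0, 33.2)
)

stock$Date <- as.Date(stock$Date)     # text -> Date
stock <- stock[order(stock$Date), ]   # sort oldest first

# Linear trend of closing price over time (dates counted in days)
fit <- lm(Close ~ as.numeric(Date), data = stock)

plot(Close ~ Date, data = stock, type = "l", xlab = "Date", ylab = "Closing $")
abline(h = mean(stock$Close), col = "blue", lty = 2)   # mean line
abline(fit, col = "green", lty = 2)                    # trend line
```

With real data you would replace the made-up data frame with the result of read.csv(file.choose()).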

Saturday, October 4, 2008

Importing data into R

We'll cover two scenarios: 1) you have your data in a .csv (comma delimited) file, 2) you have your data in a .xls (Excel) file.

In either case, your data should be formatted so that the first row holds the variable names and the following rows hold the data:

age weight other
12 25 63
13 67 67
... ... ...
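
As a small sketch of that layout, here is the same table written as comma-delimited text and read with read.csv, which by default treats the first row as variable names. The text= argument is used only so the example runs without a file on disk:

```r
csv_text <- "age,weight,other
12,25,63
13,67,67"

d <- read.csv(text = csv_text)   # header = TRUE is the default for read.csv
names(d)                         # variable names taken from the first row
str(d)                           # one column per variable, one row per observation
```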

-.csv files-

For .csv files, R has a built-in function that you can use to sort of "upload" your data:
read.csv(file.choose())

This will open a little window where you can browse and select your .csv file.
Or, if you know the exact location of your file, you can use (I made a folder on my desktop and a file named data1.csv):

read.csv("C:\\Users\\Andrew\\Desktop\\Desktop Folder\\data1.csv")
(The double backslashes are needed because a single backslash is the escape character in R strings. Forward slashes work too.)
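
A quick sketch of how a path can be built without typing backslashes at all, using a temporary folder (not the Desktop path above) so that it runs on any machine:

```r
folder <- tempdir()                        # stand-in for your own folder
path   <- file.path(folder, "data1.csv")   # file.path() joins the pieces with "/"

# Write a tiny file so there is something to read back
write.csv(data.frame(age = c(12, 13), weight = c(25, 67)), path, row.names = FALSE)

d <- read.csv(path)
d

nchar("C:\\Users")   # 8: each \\ typed in the code is one backslash in the string
```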

-Excel files-

The package 'xlsReadWrite' has a function read.xls to read .xls files.
Funny thing is, this package doesn't come installed in R. No worries. Use the code:
install.packages('xlsReadWrite')
and follow the onscreen instructions.

Then, "load" this package with:
library(xlsReadWrite)
And you're ready to go.

The logic is the same as with the 'read.csv' command; just replace every '.csv' with '.xls':
read.xls(file.choose())
read.xls("C:\\Users\\Andrew\\Desktop\\Desktop Folder\\data1.xls")


-All code used-

read.csv(file.choose())
read.csv("C:\\Users\\Andrew\\Desktop\\Desktop Folder\\data1.csv")


install.packages('xlsReadWrite')
library(xlsReadWrite)


read.xls(file.choose())
read.xls("C:\\Users\\Andrew\\Desktop\\Desktop Folder\\data1.xls")

Thursday, October 2, 2008

Creating and exporting boxplots using R

For now, let's assume you want to enter your data directly (later I'll post on importing).

Let's say we just want to compare two variables, x & y. We can use the following code:

x=c(2, 3, 5, 6, 7, 4, 5, 7, 8)
y=c(4, 5, 11, 13, 16, 16, 17, 18, 24, 19)

You could also use the following to create random normal variables:

x=rnorm(10, mean=4, sd=2)
y=rnorm(10, mean=12, sd=2)
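
Because rnorm draws new numbers on every call, your boxplots will differ from run to run. If you want the same "random" data each time, one option is to set the seed first (the value 42 is arbitrary):

```r
set.seed(42)
x <- rnorm(10, mean = 4, sd = 2)
y <- rnorm(10, mean = 12, sd = 2)

# Resetting the seed reproduces the same draws
set.seed(42)
x_again <- rnorm(10, mean = 4, sd = 2)
identical(x, x_again)   # TRUE
```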

From here you can make a very simple boxplot using:

boxplot(x,y)

Let's add a few options: color (col="lightblue"), width of boxes (boxwex=.4), and cleaning up line type (lty=1). Putting this together:

boxplot(x,y, col="lightblue", boxwex=.4, lty=1)

This is looking a little better.

Let's add some x & y labels and main title:
boxplot(x,y, col="lightblue", boxwex=.4, lty=1, ylab="Whatever y is", xlab="Whatever x is", main="Boxplot sample for stats class")

Now something to label the x values and also a line at some y value of interest - say you know that y values over 10 are unhealthy:

boxplot(x,y, col="lightblue", boxwex=.4, lty=1, ylab="Whatever y is", xlab="Whatever x is", main="Boxplot sample for stats class")
axis(1,at=c(1, 2),labels=c("x", "y"))
abline(h=10, lty=2, lwd=2, col="blue")
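
An alternative to the separate axis() call, offered as a sketch rather than a replacement: boxplot has a names argument that labels the groups directly. Saving the result also lets you look at the numbers behind each box:

```r
x <- c(2, 3, 5, 6, 7, 4, 5, 7, 8)
y <- c(4, 5, 11, 13, 16, 16, 17, 18, 24, 19)

# names= labels the two boxes; b keeps the statistics that were drawn
b <- boxplot(x, y, names = c("x", "y"), col = "lightblue", boxwex = .4, lty = 1)
abline(h = 10, lty = 2, lwd = 2, col = "blue")

b$names   # the group labels
b$stats   # one column per box: lower whisker, hinge, median, hinge, upper whisker
```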

Finally, let's export this thing as a pdf (you can also export as jpeg, etc, but pdf looks best).

Final code looks like:
pdf("whatev.pdf")
par(family="serif", bg="beige")
boxplot(x,y, col="lightblue", boxwex=.4, lty=1, ylab="Whatever y is", xlab="Whatever x is", main="Boxplot sample for stats class")
axis(1,at=c(1, 2),labels=c("x", "y"))
abline(h=10, lty=2, lwd=2, col="blue")
dev.off()
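
Other file formats follow the same pattern: open a device, draw, then close it with dev.off(). Here is a sketch for JPEG, assuming your R build has JPEG support (checked with capabilities) and writing to a temporary file whose name is only for illustration:

```r
x <- c(2, 3, 5, 6, 7, 4, 5, 7, 8)
y <- c(4, 5, 11, 13, 16, 16, 17, 18, 24, 19)

out <- tempfile(fileext = ".jpg")   # illustrative location; use your own file name

if (capabilities("jpeg")) {
  jpeg(out, width = 800, height = 600, quality = 90)   # size in pixels
  boxplot(x, y, col = "lightblue", boxwex = .4, lty = 1)
  dev.off()                                            # closes the device and finishes the file
}
file.exists(out)
```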

Final plot looks like:
[Image: the finished boxplot]
Note: I actually had to use JPEG because of Blogger constraints.




Play around with the options and have fun!