Weather surface plots with R

8 Dec


This code will help you make surface temperature (or dew point) plots. It is designed for the state of Florida, but with some modifications you can adapt it to a different state.

I know, I know… Comic Sans is a bit bubbly, and maybe a tad unprofessional. But this plot is cute, and it was meant for social media and auto-tweeting. You can, of course, use your favorite font 🙂

Side note. To use the font of your choice, you will need the extrafont package.
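If you do go the custom-font route, the one-time setup looks roughly like this (a sketch of extrafont's documented workflow; the `device = "win"` argument assumes you are on Windows, like the paths in this post):

```r
#One-time font setup with the extrafont package (run once per machine)
library(extrafont)
font_import()              #scans the fonts installed on your system (slow, one-time)
loadfonts(device = "win")  #registers them with the Windows graphics device
```

After that, `family = "Comic Sans MS"` (or any installed font) works in base graphics calls like `text()`.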

Step 1: We gotta download the necessary data.

  • The first is GIS data. Go to this website and download “SE 1 SHP Maps s_28au12.7z” – you will need to uncompress the files. I know this can be done in R, but I was struggling with that part. Lemme know if there is a simple way.
  • Then download the .csv files ‘merged.csv’ and ‘fl.merged.csv’
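On the uncompressing question: base R's `unzip()` only handles .zip, so one option is to shell out to 7-Zip from R (a sketch; it assumes 7-Zip is installed and on your PATH, and the download URL, which is missing here like the other links in this post, still has to be filled in):

```r
#Download the .7z archive and extract it by calling 7-Zip from R.
#The URL is a placeholder - fill in the real link from the site above.
download.file("", destfile = "s_28au12.7z", mode = "wb")
system('7z x s_28au12.7z -o"SE 1 SHP Maps s_28au12"') #7z's -o flag sets the output folder
```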





Step 2: Get to coding.

###Lisa Vaughn

###I honestly do not know if I need ALL of these packages. 
#But I am scared to delete one of them. 
#You know how it is... Sorry.
library(rvest)      #read_html, html_nodes, html_table (scraping)
library(rgdal)      #readOGR, for the shapefile
library(sp)         #over(), for masking the grid to Florida
library(maps)       #map(), state outlines
library(MBA)        #, for interpolating the surface
library(fields)     #as.image.SpatialGridDataFrame, image.plot
library(colorspace) #heat_hcl color palette
library(extrafont)  #custom fonts
#The first time you install the extrafont package, you will 
#need to type font_import() into the command line.
#You don't have to do this - you can use the default font.
#library(twitteR) #If you want to tweet out your plot!

#Get the date. We'll use this later for the plot. 
date         <- format(Sys.time(), "%b %d, %Y %I%p %Z")

#fl.merged is a subset of merged. I handpicked the cities I wanted to use.
#If you want to do Texas, for example, you will simply need to create a subset of merged.csv
#containing the Texas cities you want to use. 
#There *will*, however, be other modifications to make if you choose a different state.
#You will see in the comments below.
fl.merged          <- read.csv("C:/Users/Lisa/Desktop/currents/fl.merged.csv", 
                               stringsAsFactors = FALSE)

fl.merged$dewpoint <- rep(NA, nrow(fl.merged)) #create a column for dewpoint
fl.merged$temp     <- rep(NA, nrow(fl.merged)) #create a column for temp
total              <- nrow(fl.merged)

#Set up a progress bar for our 'for' loop.
pb                 <- txtProgressBar(min = 0, max = total, style = 3)

#This 'for' loop will collect the temp and dew point for each of the cities in our
#fl.merged .csv file.
for(i in 1:total){
  link        <- paste("", fl.merged[i,1], ".html", sep = "") #base URL missing here
  doc         <- read_html(link)
  tables      <- doc %>% html_nodes(xpath = "/html/body/table") %>% html_table(fill = TRUE)
  data        <- data.frame(tables[[4]], stringsAsFactors = FALSE)
  #I realize this next step is not entirely necessary, but it's good to give the columns names
  #in case you want to use other variables in the future.
  names(data)          <- c("date", "time", "wind", "vis", "weather", "sky",
                            "temp", "dp", "6hrmax", "6hrmin", "rh", "windchill", 
                            "heatindex", "altimeter", "sealevel_mb", "precip1", "precip2",
                            "precip3") #"precip3" is an assumption; the obs table has three precip columns
  num.rows              <- nrow(data)
  data                  <- data[-c(1,2,(num.rows-2):num.rows), ]  #Get rid of first two rows.
  new.num.rows          <- nrow(data)
  data                  <- data[(new.num.rows:1),] #re-order data.  Earliest obs first.
  fl.merged$temp[i]     <- data$temp[7] #Add temp to fl.merged
  fl.merged$dewpoint[i] <- data$dp[8]   #Add dew point to fl.merged
  setTxtProgressBar(pb, i) #This is for the progress bar.
}
close(pb) #closing the progress bar.

fl.outline <- map('state', 'florida', fill = FALSE, plot = FALSE) #Get the lon/lat for outline of florida. (the variable name was lost here; 'fl.outline' is a stand-in)

#Read in the GIS data. This takes a little bit of time.
state     <- readOGR("C:/Users/Lisa/Documents/Weather/Weather/SE 1 SHP Maps s_28au12/s_28au12",
                     "s_28au12", verbose = FALSE)

fl        <- state[state$STATE == "FL", ] #pick out just florida data

xyz       <- data.frame(-fl.merged$long, fl.merged$lat, fl.merged$dewpoint)
#xyz is a data frame with 3 columns: x axis, y axis, and the variable we are trying to plot (dew point or temperature)
#So, to plot temperature, we would have to change the fl.merged$dewpoint to fl.merged$temperature

mba.bbox  <- c(-89, -79, 25, 32) #These are the axes limits for the plot (in this case, lon & lat)

#surf stands for 'surface' (comes from MBA package)
surf.w                  <-, 150, 150, extend = TRUE,
                                    sp = TRUE, b.box = mba.bbox)$xyz.est  #The 150,150 is the resolution. Make it smaller to speed up processing time.

surf.w@data             <- surf.w@data * (!, fl)))
surf.w$z[surf.w$z == 0] <- NA
surf.w$z                <- surf.w$z * .010 #2.23693629

png("C:/Users/Lisa/Desktop/currents/current.temps.png") #where we want to save the image.
par(mar=rep(1,4), bg = "#43a2ca")
image(as.image.SpatialGridDataFrame(surf.w), asp = 1.25,
      col = rev(heat_hcl(10)),
      axes = FALSE, #set axes=TRUE if you want to see the lat/lon
      xlim = c(-87, -80))  #From the 'fields' package.

#Can use image.plot() instead of just image() to get a legend on the side.
plot(fl, add = TRUE)

text(-fl.merged$long + .25, fl.merged$lat + 0.25, label = fl.merged$STATION,
     cex = .75, family = "Comic Sans MS", font = 2)

text(-fl.merged$long + 0.25, fl.merged$lat, label = fl.merged$dewpoint,
     cex = 1.25, family = "Comic Sans MS", font = 2)

text(-83, 31.55, "Current Dew point", family = "Comic Sans MS", font = 2,
     cex = 2.25)

text(-83, 31.1, date, family = "Comic Sans MS")

dev.off() #close the png device so the image file actually gets written

I know there is always a more elegant and succinct way of writing code. Feel free to share with me how you made it better 🙂

Happy coding.

Texas bound

30 Nov

Today was my last day at WPTV and WFLX in West Palm Beach. I am returning to my home state of Texas where I will be close to my family.

My future career path is a little unknown right now. I am exploring my options in the data science field.

I have lots of blessings in the coming months – I get to spend the holidays with my family, and I am getting married in January 2016.

In my spare time, I can assure you, I will be writing a lot of R code. Maybe even some packages.



last day

How to program a Twitter bot

26 Nov

I have made many “Tweet bots” over the last year, but I’m just now getting around to sharing.

One of the simplest ones I have created is the “current conditions” auto-tweet. It simply scrapes the weather observation data from the NWS website and then tweets it out.

I have the R script set up to run in batch mode a couple of times a day (times specified by me in my Task Scheduler).



You must first create an app in Twitter via the Twitter Developer Website.

You must then obtain a consumer key and secret. Here is a good how-to.

Play around with the R Twitter package (twitteR) and familiarize yourself with its functions.

I use the httr package for Twitter Authentication.

###This script will scrape the latest conditions from the NWS for KPBI and
###tweet out the current conditions including an image of sky cover.

###Have to first go through all the twitter stuff:

library(httr)    #for the OAuth handshake below
library(twitteR) #for setup_twitter_oauth() and updateStatus()

#detach("package:Rfacebook", unload=TRUE)

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)

setwd("C:/Users/Lisa/Desktop/facebook") #setting this folder as the working directory because that is where the facebook/twitter api stuff is.

api_key             <- "ABC123"
api_secret          <- "YOURSECRET123"
access_token        <- "ZYZ123"
access_token_secret <- "BLAHBLAHBLAH123"

options(httr_oauth_cache = TRUE)


# 1. Find OAuth settings for twitter:

# 2. Register an application at
#    Insert your values below - if secret is omitted, it will look it up in
#    the TWITTER_CONSUMER_SECRET environmental variable.
#    Make sure to set callback url to ""
myapp <- oauth_app("twitter", key = api_key, secret=api_secret)

# 3. Get OAuth credentials
twitter_token <- oauth1.0_token(oauth_endpoints("twitter"), myapp)

# 4. Use API
req <- GET("",
           config(token = twitter_token))

# 5. Get authorization
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)

############## Now scrape the NWS data
#This function takes in a vector and returns the very last element.
get.first.last <- function(x){
  length <- length(x)
  x[length]
}

#library(gdata) #for dropping levels.
library(XML)   #for html scraping.
library(rvest) #read_html, html_nodes, html_table

theurl   <- ""
#tables  <- readHTMLTable(theurl)
doc      <- read_html(theurl)
tables   <- doc %>% html_nodes(xpath = "/html/body/table") %>% html_table(fill = TRUE)

data <- data.frame(tables[[4]], stringsAsFactors = FALSE)
names(data) <- c("date", "time", "wind", "vis", "weather", "sky",
                 "temp", "dp", "6hrmax", "6hrmin", "rh", "windchill", 
                 "heatindex", "altimeter", "sealevel_mb", "precip1", "precip2",
                 "precip3") #"precip3" is an assumption; the obs table has three precip columns

num.rows <- nrow(data)

data         <- data[-c(1,2,(num.rows-2):num.rows), ]  #Get rid of first two rows.
new.num.rows <- nrow(data)
data         <- data[(new.num.rows:1),] #re-order data.  Earliest obs first.  

format(Sys.time(), "%a %b %d %X %Y")
month <- format(Sys.time(), "%m")
year  <- format(Sys.time(), "%Y")

date.time <- paste(paste(year, paste(month, paste(data$date), sep = "-"), sep = "-"), 
                   data$time) #the second argument was cut off; data$time is an assumption
#date.time <- as.POSIXlt(date.time)
#date.time <- strftime(date.time, format = "%m-%d-%Y %H:%M")

data$date       <- paste(data$date)
data$temp       <- as.numeric(paste(data$temp))
data$heatindex  <- as.numeric(paste(data$heatindex)) <- paste(data[nrow(data), ]$weather)
current.temp    <- data[nrow(data), ]$temp
current.dp      <- paste(data[nrow(data), ]$dp)
current.hi      <- data$heatindex[nrow(data)]

###Get image
url <- ""
url <- read_html(url)
image_url <- url %>%  
  html_nodes("#current-conditions-body img") %>% 
  html_attr("src") #the last step was cut off; html_attr("src") pulls the image address
link = paste("", image_url, sep = '')
download.file(link, destfile = "C:/Users/Lisa/Desktop/currents/currents.png", 
              mode = "wb")

#Set some conditions for the text.
a <- " " #default, so 'a' is defined even when the condition below doesn't fire
if(current.temp >= 90 & as.numeric(current.dp) > 69){a <- c("Hot & humid!")}

x <- c("Right now:", "Currently:", "Current Conditions:", "Right now weather:")
y <- sample(x, 1) #Choose a phrase at random so it doesn't look like a tweet bot

tweet.text <- paste(y, a,
                    current.temp, "degrees", 
                    "in West Palm.")

tweet.text <- trim(tweet.text)

random <- runif(1, 1, 5*60) #wait between 1 second and 5 minutes, just to randomize a little
Sys.sleep(random)
#updateStatus(paste(tweet.text, ""))
updateStatus(tweet.text, mediaPath = "C:/Users/Lisa/Desktop/currents/currents.png")

Once you’ve got this done, you’ll need to set up a batch file. I do this by creating a new R script (within Rstudio) and typing the following:

"C:\Program Files\R\R-3.1.2\bin\x64\R.exe" CMD BATCH "C:\Users\Lisa\Desktop\currents\currents.r"

Then, save this file as a ‘.bat’ file instead of a ‘.r’ file – this is very simple to do in Rstudio. (I don’t know about the R console…)

It is this ‘.bat’ file that you will want to trigger using your computer’s task scheduler.
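You can also register that trigger without clicking through the Task Scheduler GUI, by calling Windows' schtasks from R (a sketch; the task name and time below are made up):

```r
#Register the .bat file with Windows Task Scheduler from inside R.
#"WeatherTweetAM" and the 7 AM start time are hypothetical - pick your own.
system('schtasks /Create /TN "WeatherTweetAM" /TR "C:\\Users\\Lisa\\Desktop\\currents\\currents.bat" /SC DAILY /ST 07:00')
```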

The likelihood of Fred getting “close”

30 Aug

Tropical Storm Fred has formed off the coast of Africa as of 5AM 8/30/2015.

fred satellite image

Storms that form THIS close to Africa rarely make it to the U.S. You may hear them referred to as “fish storms” that “curve out to sea”. That’s because the only thing these storms usually interact with are fish.

I found that only 10% of storms that form as far away as Fred make it close to the U.S.*

A couple of these storms are quite infamous – The Okeechobee Hurricane of 1928 and the Great New England Hurricane of 1938.

Able in 1952 was the only hurricane that season to make landfall in the U.S. Dora in 1964 was the first time (in our short record book) Jacksonville got hurricane-force winds.


long track 2

*10% of storms that formed east of 20W longitude made it west of 75W longitude. Fred formed at 18.9W longitude.  There were 42 total storms I found that formed east of 20W longitude.
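The footnote's bookkeeping can be sketched in a few lines of base R (the track table below is made up for illustration; it is not the real hurricane data):

```r
#Toy track table: one row per storm fix, longitudes in degrees (negative = west).
tracks <- data.frame(
  storm = c("A", "A", "B", "B", "C", "C"),
  lon   = c(-18, -76, -19, -60, -17, -80)
)
genesis     <- tapply(tracks$lon, tracks$storm, function(x) x[1]) #first fix = genesis longitude
westmost    <- tapply(tracks$lon, tracks$storm, min)              #farthest west each storm reached
formed.east <- names(genesis)[genesis > -20]                      #formed east of 20W
made.it     <- sum(westmost[formed.east] < -75)                   #later crossed west of 75W
proportion  <- made.it / length(formed.east)
```

Run against the full storm record, `proportion` is the 10% figure quoted above.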

Interactive weather observation graphs

10 Aug

I recently took the tutorial on Shiny by Garrett Grolemund (we were classmates at Rice!), and found it very thorough.

If you aren’t familiar with Shiny – it is a way to create web applications using just R (without HTML knowledge). R sort of “translates” to HTML and creates a web page. You still have to be pretty proficient at the R language though.
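To get a feel for the structure, a minimal Shiny app is just a UI object plus a server function (a toy sketch, not the weather-observation app itself):

```r
library(shiny)

#UI: one slider and one plot placeholder
ui <- fluidPage(
  sliderInput("n", "Number of observations:", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

#Server: re-draws the histogram whenever the slider moves
server <- function(input, output){
  output$hist <- renderPlot(hist(rnorm(input$n)))
}

shinyApp(ui, server) #launches the app in your browser
```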

Email me if you are interested in the R code, or just enjoy the interactive app!

Check it out!

app-3 pic

Exploratory analysis of hurricane data

16 Jul

I have been doing a lot of exploration with ggplot2 and my hurricane dataset.

This map was made using some code I posted previously. I just added some additional stuff to make this specific plot.  You will have to go through the entire data collection process from the previous post in order to append this code below.

library(ggplot2) #plotting
library(ggmap)   #get_map and ggmap
library(plyr)    #ddply and revalue

data <- read.csv("C:/Documents and Settings/165180/Desktop/code/hurricane.csv") #you will have to get this through the previous post!
data$name.year <- paste(data$year, data$name)

map <- get_map(location = c(lon = -80, lat = 30), zoom = 4)
p <- ggmap(map)

july <- data[data$month == 7, ]

july$type <- rep(0, nrow(july))

july$type[which(july$STAT == "HURRICANE-1" | july$STAT == "HURRICANE-2" | july$STAT == "HURRICANE-3" | july$STAT == "HURRICANE-4")] <- "HURRICANE"

ddply(july, .(name.year), function(x){any(x == "HURRICANE")})

#I want the stat to be either hurricane or x. so I can plot.
july$STAT <- revalue(july$STAT, c("DEPRESSION (ESTIMATE" = "x", "DISSIPATED" = "x",
                                  "LOW" = "x",
                                  "TROPICAL WAVE" = "x", "XXX" = "x"))

cbPalette <- c("#d4b9da", "#91003f", "#df65b0", "#e7298a", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
p + geom_path(data = july, aes(LON, LAT, group = name.year), size = 1, col = "grey") +
  geom_point(data = july, aes(LON, LAT, col = STAT), pch = 18, size = 4) +
  scale_colour_manual(values = cbPalette) +
  ggtitle("July storms since 1851") +
  theme(plot.title = element_text(size = rel(2)))

july storms

Facebook data PCA

8 Jul


I wanted to do this exploration because I have noticed some posts get a huge reach with very few shares/likes/comments.  I have also noticed some posts have a large number of shares/likes/comments but reach a small number of people.

I did principal component analysis on Facebook data from our last 2,000 posts (4/16/2015 – 7/7/2015). Principal components can reveal structure in data which you cannot see otherwise.

There are four variables – number of likes, number of comments, number of shares, and reach. Reach is the number of unique people who saw the post.
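On simulated data (the real Facebook numbers are not public), the analysis itself is essentially one call to prcomp; scaling matters because reach lives on a much larger scale than the other three variables:

```r
#Simulate post metrics roughly shaped like the four variables described above.
set.seed(1)
posts <- data.frame(
  likes    = rpois(200, 50),
  comments = rpois(200, 10),
  shares   = rpois(200, 5),
  reach    = rpois(200, 5000)
)
pc <- prcomp(posts, center = TRUE, scale. = TRUE) #scale. = TRUE puts all four variables on equal footing
summary(pc)    #proportion of variance explained by each component
scores <- pc$x #one row of PC scores per post; pairs of these columns are what get plotted
```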

I broke down the data by type – link, photo, status update, or video.

Below are 3 combinations of principal components. The most striking observation (to me) is that video posts (purple) can get a huge reach with very few shares/likes/comments.  However, just being a video post does not guarantee a large reach.

I know this is pretty well-known about Facebook videos.  It is neat, however, to see it in the data structure.

pc12 pc14 pc34


Code for the ggplot2 biplot:
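A minimal, self-contained sketch of a ggplot2 biplot (the data and type labels here are simulated stand-ins, not the original post's code): points are PC scores colored by post type, arrows are the variable loadings.

```r
library(ggplot2)

#Toy stand-ins so the sketch runs on its own: a PCA fit and a vector of post types.
set.seed(1)
posts <- data.frame(likes = rpois(200, 50), comments = rpois(200, 10),
                    shares = rpois(200, 5), reach = rpois(200, 5000))
pc   <- prcomp(posts, center = TRUE, scale. = TRUE)
type <- sample(c("link", "photo", "status", "video"), 200, replace = TRUE)

scores   <- data.frame(PC1 = pc$x[, 1], PC2 = pc$x[, 2], type = type)
loadings <- data.frame(pc$rotation[, 1:2], var = rownames(pc$rotation))

ggplot(scores, aes(PC1, PC2, colour = type)) +
  geom_point(alpha = 0.6) +
  geom_segment(data = loadings, inherit.aes = FALSE,
               aes(x = 0, y = 0, xend = 3 * PC1, yend = 3 * PC2), #loadings scaled up for visibility
               arrow = arrow(length = unit(0.2, "cm"))) +
  geom_text(data = loadings, inherit.aes = FALSE,
            aes(3.3 * PC1, 3.3 * PC2, label = var))
```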



