scrappy: A Simple Web Scraper

The goal of scrappy is to provide simple functions to scrape data from different websites for academic purposes.

Installation

You can install the released version of scrappy from CRAN with:

install.packages("scrappy")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("villegar/scrappy")

Example

NEWA @ Cornell University

NEWA is the Network for Environment and Weather Applications at Cornell University. Website: http://newa.cornell.edu

# Create RSelenium session
rD <- RSelenium::rsDriver(browser = "firefox", port = 4548L, verbose = FALSE)

# Call scrappy
out <- scrappy::newa_nrcc(client = rD$client,
                          year = 2020,
                          month = 12, # December
                          station = "gbe", # Geneva (Bejo) station
                          save_file = FALSE) # Don't save output to a CSV file

# Stop the Selenium server
rD$server$stop()
#> [1] TRUE

Partial output from the previous example:

Date/Time	Air Temp (℉)	RH (%)	Wind Spd (mph)	Wind Dir (degrees)	Solar Rad (langleys)	Dewpoint (℉)	Station
12/31/2020 23:00 EST	33.1	82	2.8	264	0	28	gbe
12/31/2020 22:00 EST	33.0	80	3.3	250	0	28	gbe
12/31/2020 21:00 EST	32.8	81	2.6	261	0	28	gbe
12/31/2020 20:00 EST	32.5	84	1.7	277	0	28	gbe
12/31/2020 19:00 EST	32.9	81	2.1	279	0	28	gbe
12/31/2020 18:00 EST	33.3	79	3.0	272	0	28	gbe
12/31/2020 17:00 EST	33.5	78	3.9	274	1	27	gbe
12/31/2020 16:00 EST	34.1	74	4.9	272	7	27	gbe
12/31/2020 15:00 EST	33.8	72	7.1	277	8	26	gbe
12/31/2020 14:00 EST	34.4	70	7.9	276	13	26	gbe
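
The scraped records usually need some tidying before analysis. The following is a minimal sketch, not part of scrappy itself, of parsing the timestamps, converting temperature to Celsius, and sorting the rows chronologically. It assumes the result is a data frame. The object `out_sample` and its ASCII column names (`date_time`, `air_temp_f`, `station`) are hypothetical stand-ins built from three rows of the partial output above, so the block runs without a Selenium session.

```r
# Stand-in for `out`, rebuilt from three rows of the partial output above.
# Assumption: the real result is a data frame; the column names here are
# simplified, hypothetical ASCII versions of the headers shown in the table.
out_sample <- data.frame(
  date_time  = c("12/31/2020 23:00 EST",
                 "12/31/2020 22:00 EST",
                 "12/31/2020 21:00 EST"),
  air_temp_f = c(33.1, 33.0, 32.8),
  station    = "gbe",
  stringsAsFactors = FALSE
)

# Drop the "EST" suffix and parse the timestamps at a fixed UTC-5 offset.
# "Etc/GMT+5" uses the POSIX sign convention, so it means five hours behind UTC.
out_sample$date_time <- as.POSIXct(
  sub(" EST$", "", out_sample$date_time),
  format = "%m/%d/%Y %H:%M",
  tz = "Etc/GMT+5"
)

# Convert air temperature from Fahrenheit to Celsius.
out_sample$air_temp_c <- (out_sample$air_temp_f - 32) * 5 / 9

# The scraped rows are listed newest first; reorder them chronologically.
out_sorted <- out_sample[order(out_sample$date_time), ]
```

Every timestamp in the sample is labelled EST, so a single fixed offset is enough here. Rows labelled with a different zone would need their own offset before parsing.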