Skip to content

Latest commit

 

History

History
40 lines (30 loc) · 1.13 KB

README.md

File metadata and controls

40 lines (30 loc) · 1.13 KB

sparkedatools

The missing SparklyR EDA toolkit (for use in R). Quick, efficient, and easy to use.

Wrap and graph the outputs from the SparkEDA package for Spark.

You will need a few things to get started.

  1. Grab the SparkEDA Jar from my website

  2. Download this repository to your R home or other location Note: You will need devtools installed in R (as this package is not yet up on CRAN

install.packages("devtools")
library("devtools")

Then

devtools::install_github('GabeChurch/sparkedatools')
  1. Edit your SparklyR Configuration (in R)

You need to add the SparkEDA jar for the package to work in R.

conf = spark_config()

This is the important line

#This is the configuration option
conf$'sparklyr.jars.default'= "/system/path/to/sparkeda_2.11-2.07.jar"
sc = spark_connect(master = "yarn-client", config = conf, version = '2.3.2')

The ORDER IS IMPORTANT. Must be BEFORE you have connected and AFTER you have instantiated your spark_config in R.

  1. Enjoy being able to visualize and understand your giant data-sets like never before!