Installing Spark 1.6.0 on Mac OS X (10.11) El Capitan


With the release of Spark 1.6.0 on 4th January 2016, I wanted to write a quick guide to getting it up and running on your local laptop. In my case that’s a MacBook running OS X 10.11.3, El Capitan.

Prerequisites

First you need to make sure you have version 7 or 8 of the Java SE Development Kit (JDK) installed correctly. Installers for both Java 7 and Java 8 are available from Oracle’s Java SE download pages.


To install, simply download the installer and run the package.

Running java -version from the command line will confirm that everything is installed correctly and you’re ready to go.


Why Java 8?

In a single word? Lambda. Java 8 supports lambda expressions, much like Python, so you can save yourself the time spent writing anonymous inner classes, which in Java 7 look like this:
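A minimal sketch of that Java 7 style, assuming a plain-JDK word count rather than real Spark code so that it compiles without Spark on the classpath. The FlatMapFunction and Function2 interfaces are stand-ins declared locally and modelled loosely on Spark’s Java function interfaces, and WordCountAnonymous and countWords are names made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountAnonymous {

    // Hypothetical stand-ins modelled on Spark's Java function interfaces,
    // declared locally so the sketch compiles without Spark.
    interface FlatMapFunction<T, R> {
        Iterable<R> call(T input);
    }

    interface Function2<A, B, R> {
        R call(A first, B second);
    }

    // Splits each line into words, then adds 1 to a running count per word.
    static Map<String, Integer> countWords(List<String> lines) {
        // Java 7 style: every small function needs a full anonymous class.
        FlatMapFunction<String, String> splitter = new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String line) {
                return Arrays.asList(line.split(" "));
            }
        };

        Function2<Integer, Integer, Integer> adder = new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer a, Integer b) {
                return a + b;
            }
        };

        // TreeMap keeps the words in alphabetical order.
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            for (String word : splitter.call(line)) {
                Integer current = counts.get(word);
                counts.put(word, current == null ? 1 : adder.call(current, 1));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("to be or not to be")));
    }
}
```

Running main prints {be=2, not=1, or=1, to=2}. Most of those lines are ceremony wrapped around two one-line method bodies.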

Instead, word count becomes something like this:
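A minimal sketch of the lambda style, assuming plain Java 8 streams instead of Spark’s Java API so that it runs with nothing but a JDK. WordCountLambda and countWords are names made up for illustration. Spark’s Java API accepts lambdas in the same way wherever Java 7 needed an anonymous class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCountLambda {

    // Word count with each step written as a lambda or method reference.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // One lambda replaces a whole anonymous class for splitting lines.
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // Group identical words and count them; TreeMap sorts the keys.
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("to be or not to be")));
    }
}
```

Running main prints {be=2, not=1, or=1, to=2}, and the whole word count now fits in a single expression.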

Downloading Spark

Head on over to spark.apache.org and look for the download link, either in the menu bar or on the right-hand side.


Pre-Built or DIY?

If you’re just starting out, I’d recommend getting one of the pre-built Spark distributions. If you’re after something more specific and want to build it yourself, you’ll need to make sure you have both Maven and Scala installed on your Mac before running the build.

On the download page you first need to select the Spark release and the package type. I’d advise getting the latest Spark release, 1.6.0 in this case, and the package pre-built for Hadoop 2.6.


Finally, change the download type to Direct Download and then click on the download link. The file is 289 MB.

Making sure it all works

Move the file to your home directory, or wherever you want to install Spark, and unzip it by double-clicking. For the pre-built packages that’s it. Spark is installed!

Open a terminal, navigate to the directory you just unzipped and run one of the Spark examples, for instance ./bin/run-example SparkPi 10, to make sure everything is working.

You’ll see a fair amount of log output, and towards the bottom you should see a line reading Pi is roughly 3.14…
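If you’re wondering why the answer is only roughly 3.14, it’s because the SparkPi example estimates Pi by random sampling rather than computing it exactly. Here is a minimal single-machine sketch of the same idea in plain Java, with no Spark involved. The class name, method name, sample count and seed are my own choices for illustration, and the real example spreads the sampling across Spark tasks.

```java
import java.util.Random;

public class PiEstimate {

    // Monte Carlo estimate: the fraction of random points in the square
    // [-1, 1] x [-1, 1] that land inside the unit circle approaches pi / 4.
    static double estimatePi(int samples, long seed) {
        Random random = new Random(seed);
        int inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = random.nextDouble() * 2 - 1;
            double y = random.nextDouble() * 2 - 1;
            if (x * x + y * y < 1) {
                inside++;
            }
        }
        // Scale the hit ratio by 4 to turn pi / 4 into pi.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println("Pi is roughly " + estimatePi(1000000, 42L));
    }
}
```

More samples give a tighter estimate, which is why the digits after 3.14 can differ from run to run when the sampling isn’t seeded.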

And that’s it, Spark is installed and ready to run. Very simple.

