SparkMonitor

Installation

Prerequisites

Quick Install

pip install sparkmonitor
jupyter nbextension install sparkmonitor --py --user --symlink 
jupyter nbextension enable sparkmonitor --py --user            
jupyter serverextension enable --py --user sparkmonitor
ipython profile create && echo "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')" >>  $(ipython profile locate default)/ipython_kernel_config.py

Detailed Instructions

  1. Install the Python package from the latest tagged GitHub release. The package contains the JavaScript resources and the listener JAR file.
    pip install sparkmonitor
  2. Install and enable the frontend extension. The first command symlinks (--symlink) the frontend extension into the Jupyter configuration directory. The second command configures the frontend extension to load on notebook startup.
    jupyter nbextension install sparkmonitor --py --user --symlink
    jupyter nbextension enable sparkmonitor --py --user
  3. Configure the server extension to load when the notebook server starts.
    jupyter serverextension enable --py --user sparkmonitor
  4. Create the default profile configuration files (skip this step if the config file already exists).
    ipython profile create
  5. Configure the kernel to load the extension on startup. This command appends a line to the configuration file in the user's home directory.
    echo "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')" >> $(ipython profile locate default)/ipython_kernel_config.py
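
Because the echo command appends unconditionally, running it a second time leaves a duplicate line in the config file. The following is a minimal Python sketch of an idempotent alternative. The helper name ensure_kernel_extension is hypothetical and not part of sparkmonitor, and the sketch assumes the caller passes in the path to ipython_kernel_config.py rather than discovering it.

```python
from pathlib import Path

# The exact line that the installation step appends to the kernel config.
EXTENSION_LINE = "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')"


def ensure_kernel_extension(config_path):
    """Append EXTENSION_LINE to config_path unless it is already present.

    Hypothetical helper, not part of sparkmonitor. Returns True if the
    file was modified and False if the line was already there.
    """
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    if any(line.strip() == EXTENSION_LINE for line in text.splitlines()):
        return False
    # Make sure the appended line starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + EXTENSION_LINE + "\n")
    return True
```

The path passed in would typically be the ipython_kernel_config.py file inside the directory printed by ipython profile locate default.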
    

Configuration

By default, the Spark Web UI runs on localhost:4040. If it runs elsewhere, set the environment variables SPARKMONITOR_UI_HOST and SPARKMONITOR_UI_PORT to override the default hostname (localhost) and port (4040) used by the Spark UI proxy.
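
As an illustration of how such an override usually resolves, here is a minimal Python sketch that reads the two variables and falls back to the documented defaults. The function name spark_ui_address and the example hostname are assumptions made for this sketch; sparkmonitor's own lookup code may differ.

```python
import os

DEFAULT_HOST = "localhost"
DEFAULT_PORT = 4040


def spark_ui_address(environ=None):
    """Return (host, port) for the Spark UI, honoring the two overrides.

    Illustrative only; this is not sparkmonitor's actual implementation.
    """
    env = os.environ if environ is None else environ
    host = env.get("SPARKMONITOR_UI_HOST", DEFAULT_HOST)
    port = int(env.get("SPARKMONITOR_UI_PORT", DEFAULT_PORT))
    return host, port


# No overrides set: the defaults apply.
print(spark_ui_address({}))  # ('localhost', 4040)

# Both overrides set (the hostname here is a made-up example).
print(spark_ui_address({
    "SPARKMONITOR_UI_HOST": "spark-driver.example.com",
    "SPARKMONITOR_UI_PORT": "4041",
}))  # ('spark-driver.example.com', 4041)
```

The variables need to be set in the environment of the notebook server process before it starts, since that is where the Spark UI proxy runs.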

Build from Source

Building the extension involves three parts:

  1. Bundle and minify the JavaScript.
  2. Compile the Scala listener into a JAR file.
  3. Package and install the Python package.
git clone https://github.com/krishnan-r/sparkmonitor
cd sparkmonitor/extension
# Build the JavaScript
yarn install
yarn run webpack
# Build the SparkListener Scala JAR
cd scalalistener/
sbt package
# Install the Python package (-e installs it in editable mode for development)
cd ..  # back to sparkmonitor/extension
pip install -e .
# The sparkmonitor Python package is now installed. Configure it with Jupyter as described above.
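
The build commands rely on several command-line tools being available. As a convenience, the following Python sketch checks whether they can be found on PATH before a build is attempted. The tool list is simply read off the commands in this section, and the function name missing_build_tools is hypothetical; it is not part of the sparkmonitor repository.

```python
import shutil

# Command-line tools invoked by the build commands in this section.
BUILD_TOOLS = ("git", "yarn", "sbt", "pip")


def missing_build_tools(tools=BUILD_TOOLS, which=shutil.which):
    """Return the names in tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Missing build tools: " + ", ".join(missing))
    else:
        print("All build tools found on PATH.")
```

The which parameter exists only so the lookup can be swapped out in tests; in normal use the default shutil.which is sufficient.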