Example Use Case - Spark Tutorial Notebooks
Introduction
This use case runs a few notebooks used at CERN for training in Apache Spark.
They test a wide range of Spark APIs including reading data from files.
Notebooks
Environment
- These notebooks were run with a local Apache Spark installation, using 1 executor and 4 cores, inside a Docker container based on Scientific Linux CERN 6.
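As a rough illustration of the environment above, a notebook could request a comparable local session as sketched below. This is a minimal sketch, not the configuration actually used at CERN: the `local[4]` master string is an assumption that matches "1 executor and 4 cores", and the application name and the `build_session` helper are hypothetical.

```python
# Assumed settings: the text above only states a local installation
# with 1 executor and 4 cores; everything else here is illustrative.
LOCAL_CONF = {
    "spark.master": "local[4]",          # one local JVM running 4 worker threads
    "spark.app.name": "spark-tutorial",  # hypothetical application name
}


def build_session(conf=LOCAL_CONF):
    """Create (or reuse) a SparkSession from a plain dict of settings."""
    # Imported lazily so the settings can be inspected without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session()` from a notebook cell would then return the session that the tutorial cells run against.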
Monitoring the Notebook
- The extension shows all the jobs that have been run from a cell.
- The stages of each job are shown in an expanded view that can be collapsed individually.
data:image/s3,"s3://crabby-images/84c56/84c5640a0b83527a08587c0be9996505d7d7c173" alt="6"
- An aggregated view of resource usage is provided as a graph of the number of active tasks against the available executor cores. This gives insight into whether the job is blocked on I/O or waiting for other results, and shows how well the tasks are parallelized across the cores of a cluster.
data:image/s3,"s3://crabby-images/e50e4/e50e45bb2a32c5f3e71b3f884011b878e462fcfc" alt="3"
data:image/s3,"s3://crabby-images/c386a/c386a162c4fee5f1f81f9e32f864777d4a73f2bb" alt="7"
- An event timeline shows the overall picture of what is happening in the cluster, split into jobs, stages, and tasks.
data:image/s3,"s3://crabby-images/b7ec0/b7ec08d4c8741fa8eb833f18f139fb33d9eef76b" alt="2"
data:image/s3,"s3://crabby-images/50c72/50c72006c906cfeb73ca0f7519ce52a3e222df13" alt="8"
- The timeline groups the tasks running on each executor.
- It shows the time each task spends in its various phases. Taken together, this gives insight into whether the workload is I/O bound or CPU bound. This feature can be toggled using a checkbox.
- Clicking an item on the timeline shows its details in a pop-up. For jobs and stages, this is the Spark Web UI page. For tasks, a custom pop-up is shown with various details.
data:image/s3,"s3://crabby-images/e62a2/e62a2692cff5644dd33d5796703fb30a715c4522" alt="5"
- For more advanced details, the extension provides access to the Spark Web UI through a server proxy. Advanced users can use this for in-depth analysis.
data:image/s3,"s3://crabby-images/6baa1/6baa18e2ff9e0a39e853e754ff7bedf37adfe0b7" alt="1"
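To make the aggregated resource view and the per-phase task timing described above more concrete, the sketch below derives both from a list of task records. It is only an illustration under stated assumptions, not the extension's implementation: the `Task` record, its field names, and the phase labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; times are in seconds since the job started."""
    start: float
    end: float
    phases: dict = field(default_factory=dict)  # phase name -> seconds spent


def active_task_series(tasks):
    """Return [(time, active_count)] at each instant the active count changes."""
    events = []
    for t in tasks:
        events.append((t.start, 1))   # a task starts
        events.append((t.end, -1))    # a task ends
    # Tuples sort by time first; at equal times, ends (-1) come before starts (+1).
    events.sort()
    series, active = [], 0
    for time, delta in events:
        active += delta
        if series and series[-1][0] == time:
            series[-1] = (time, active)  # merge events at the same instant
        else:
            series.append((time, active))
    return series


def phase_shares(tasks):
    """Return the fraction of total task time spent in each phase."""
    totals = {}
    for t in tasks:
        for name, secs in t.phases.items():
            totals[name] = totals.get(name, 0.0) + secs
    grand = sum(totals.values())
    return {n: s / grand for n, s in totals.items()} if grand else {}


AVAILABLE_CORES = 4  # matches the 4 cores of the environment described earlier
tasks = [
    Task(0, 4, {"compute": 3.0, "shuffle_read": 1.0}),
    Task(1, 3, {"compute": 1.0, "shuffle_read": 1.0}),
    Task(2, 6, {"compute": 4.0}),
]
series = active_task_series(tasks)
peak = max(count for _, count in series)
shares = phase_shares(tasks)
print(series, peak, AVAILABLE_CORES, shares)
```

In this toy input the peak of 3 active tasks stays below the 4 available cores, which is the kind of gap the graph makes visible. A large share of time outside the compute phase would hint at an I/O-bound workload.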