Enable Logging for Completed Applications

For Legacy Support Purposes Only

 

This article explains how to view historical logs of completed applications. The Instaclustr Spark Console provides the Spark Application UI for running applications. However, once an application completes, its logs are deleted and only the Spark worker logs remain accessible.

To retain the Application UI for completed applications, configure the Spark History Server on the machine used to submit Spark jobs to the cluster.

Prerequisites

  • An Instaclustr Spark cluster. See step 1 here for a guide on how to create one.
  • A Spark Client machine used to submit Spark jobs to the cluster. See steps 2, 3, and 4 here for a guide on how to create and use one.

Spark History Server Configuration

On the Spark Client machine, copy the default configuration template spark-defaults.conf.template (in the conf folder of the Spark installation) to a new file called spark-defaults.conf in the same folder. Spark applications load this file automatically the next time they start.
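As a minimal sketch, the copy can be done from a shell. The helper name create_spark_defaults and the example installation path are illustrative assumptions, not part of Spark; the template itself ships in the conf folder of a standard Spark distribution.

```shell
# Hypothetical helper (not part of Spark): create spark-defaults.conf from the
# template shipped in the conf folder of the Spark installation.
# $1 = root of the Spark installation (adjust to your own path)
create_spark_defaults() {
  cp "$1/conf/spark-defaults.conf.template" "$1/conf/spark-defaults.conf"
}

# Example usage (installation path is an example):
# create_spark_defaults /home/ubuntu/spark-2.1.1-bin-hadoop2.6
```

Note that a plain cp overwrites an existing spark-defaults.conf, so back up the file first if one is already present.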

Next, add lines to the spark-defaults.conf file that enable event logging and tell the History Server where to find the event logs.
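A minimal sketch of such lines, written as a shell helper that appends them to the file. The three property names are standard Spark settings; the helper name add_event_log_settings and the /tmp/spark-events folder (Spark's documented default location) are assumptions to adapt to your environment.

```shell
# Hypothetical helper (not part of Spark): append event log settings to
# spark-defaults.conf.
# $1 = root of the Spark installation
# $2 = event log folder (assumed default: /tmp/spark-events)
add_event_log_settings() {
  event_log_dir="${2:-/tmp/spark-events}"
  cat >> "$1/conf/spark-defaults.conf" <<EOF
# Record an event log for each application so its UI can be rebuilt later
spark.eventLog.enabled           true
# Folder that applications write their event logs to
spark.eventLog.dir               file://$event_log_dir
# Folder that the History Server reads event logs from (same folder)
spark.history.fs.logDirectory    file://$event_log_dir
EOF
}

# Example usage (installation path is an example):
# add_event_log_settings /home/ubuntu/spark-2.1.1-bin-hadoop2.6 /tmp/spark-events
```

Pointing both the application and the History Server at the same folder is what lets the server find the logs that completed applications leave behind.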

See here for more information on Spark History Server configuration.

Now create the event log folder on the Spark Client machine.
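For example, assuming the event log folder is /tmp/spark-events (an assumption; the path must match the event log location configured in spark-defaults.conf), the folder could be created like this:

```shell
# Assumption: event logs live in /tmp/spark-events unless EVENT_LOG_DIR is set.
EVENT_LOG_DIR="${EVENT_LOG_DIR:-/tmp/spark-events}"

# -p creates missing parent folders and does not fail if the folder exists
mkdir -p "$EVENT_LOG_DIR"
```

The user that submits Spark jobs and the user that runs the History Server both need access to this folder.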

Start the History Server

Now that configuration is complete, start the Spark History Server from within your Spark installation folder.
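A minimal sketch of the start command, assuming a standard Spark distribution where the start script lives in the sbin folder. The wrapper name start_history_server and the example installation path are illustrative assumptions.

```shell
# Hypothetical wrapper (not part of Spark) around the start script that ships
# in the sbin folder of a standard Spark distribution.
# $1 = root of the Spark installation
start_history_server() {
  # Run in a subshell so the caller's working directory is left unchanged
  (cd "$1" && ./sbin/start-history-server.sh)
}

# Example usage (installation path is an example):
# start_history_server /home/ubuntu/spark-2.1.1-bin-hadoop2.6
```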

When the History Server starts, it will print the location of the log file it will write to in case you want to tail or review it, for example:

Starting org.apache.spark.deploy.history.HistoryServer, logging to /home/ubuntu/spark-2.1.1-bin-hadoop2.6/logs/spark-admin-org.apache.spark.deploy.history.HistoryServer-1-ip-172.19.34.19.out

View the History Server

With the History Server running, browse to https://<Spark client IP>:18080/. The Spark History Server page should be displayed.

To see running (incomplete) applications, click Show incomplete applications.

Note: The Download button does not work for applications run in client mode. To download logs for an application run in client mode, remove the attempt ID from the URL, for example: https://<host>:18080/api/v1/applications/app-20180709045720-0050/1/logs becomes https://<host>:18080/api/v1/applications/app-20180709045720-0050/logs.
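The URL rewrite described in the note can be sketched as a small shell helper. The name strip_attempt_id is hypothetical, and the host in the usage line is a placeholder.

```shell
# Hypothetical helper (not part of Spark): drop the attempt ID segment from a
# History Server log download URL, so .../applications/<app-id>/1/logs
# becomes .../applications/<app-id>/logs. Other URLs pass through unchanged.
strip_attempt_id() {
  printf '%s\n' "$1" | sed -E 's#(/applications/[^/]+)/[0-9]+/logs$#\1/logs#'
}

# Example usage (host is a placeholder; the endpoint returns a zip archive):
# curl -o logs.zip "$(strip_attempt_id "https://<host>:18080/api/v1/applications/app-20180709045720-0050/1/logs")"
```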

By Instaclustr Support