
Streamline Databricks Workflows with Azure DevOps Release Pipelines
The process of developing and deploying applications is complex, time-consuming, and often error-prone. Release pipelines help streamline this process and automate the deployment of code and data. Databricks is a popular cloud-based platform used for data engineering, data science, and machine learning tasks, and Azure DevOps is a powerful tool for managing the entire software development lifecycle, including build and release management. In this blog, “Streamline Databricks Workflows with Azure DevOps Release Pipelines”, we will explore how to build release pipelines for Databricks using Azure DevOps and walk through the steps required to set one up. By the end of this post, you will have a good understanding of how to build efficient and reliable release pipelines for Databricks.
Introduction
In my last blog we discussed the build pipeline; in this blog we will discuss the release pipeline. I will provide a step-by-step guide to developing and deploying Azure DevOps release pipelines for Databricks.
DevOps Release Pipeline Steps
STEP 1: Define the Release Pipeline
First, we have to set up the release pipeline. Select Pipelines, click Releases, and it will take you to the new release pipeline.

STEP 2: Select Empty Job Template
Now you need to choose a template for the release pipeline. Use the Empty job template so we can customize it fully.

STEP 3: Add the Artifacts
The release pipeline has two separate sections: Artifacts and Stages. Under Artifacts, select the files dropped by the build pipeline and specify the location of the artifacts, as depicted in the diagram below.

STEP 4: Enable Continuous Deployment Trigger
We need to enable the continuous deployment trigger so that the release pipeline is triggered automatically as soon as the build pipeline produces a new build.

STEP 5: Define environment variables for the release pipeline
Azure DevOps invokes the Databricks CLI to connect to the remote Databricks cluster and deploy the code, so we need to create variables in Azure DevOps that store the instance-specific information for the workspace the code will be released to. Here are the variables and their significance; a short sketch of how the CLI picks them up follows at the end of this step.
| Parameter name used by Databricks Connect | What does it mean? |
| --- | --- |
| DATABRICKS_HOST | The https://adb-XXXXX.azuredatabricks.net part of the Databricks workspace URL. |
| DATABRICKS_TOKEN | An API token generated from Databricks, used for REST API calls. |
| DATABRICKS_CLUSTER_ID | Code can be deployed to any cluster, so this is the unique cluster ID in the cluster URL: https://adb-xxx.azuredatabricks.net/?o=xxx#setting/clusters/XXX/configuration |
This is how it will look inside Azure DevOps.

Here are the step-by-step instructions for creating the Token.

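The legacy Databricks CLI reads DATABRICKS_HOST and DATABRICKS_TOKEN from the environment, so wiring these pipeline variables to the CLI in an inline script on the release agent could look like the following sketch (the verification command is only an example):
export DATABRICKS_HOST=$(DATABRICKS_HOST)
export DATABRICKS_TOKEN=$(DATABRICKS_TOKEN)
# Quick sanity check that the CLI can reach the workspace
databricks workspace ls /Shared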
STEP 6: Configure Release Agent
We will deploy an Azure virtual machine as the agent for the release pipeline. The virtual machine image should match the one on the Azure Databricks cluster as closely as possible. For example, Databricks Runtime 10.4 LTS runs Ubuntu 20.04.4 LTS, which maps to the Ubuntu 20.04 virtual machine image in the Azure Pipelines agent pool. The Databricks runtime release notes list the exact OS for each runtime; for example, Runtime 13.0 runs on Ubuntu 22.04.2 LTS according to its release notes. Always make sure you use the OS version supported by the runtime you are targeting.

STEP 7: Set the Python Version for the release agent
Now that we have the VM, we need to install the required version of Python and the build tools on the virtual machine for testing and packaging the Python code. Make sure that the Python version matches the version installed on your remote Azure Databricks cluster (a short verification sketch follows below).

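For example, assuming the cluster runs Databricks Runtime 10.4 LTS (which ships Python 3.8), an inline script on the agent could verify the interpreter and prepare the packaging tools like this; the version itself is usually pinned with the release pipeline's Use Python version task:
# Should report a version matching the cluster, e.g. Python 3.8.x for DBR 10.4 LTS
python3 --version
# Packaging and test tooling used in the later steps
python3 -m pip install --upgrade pip setuptools wheel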
STEP 8: Unpackage the Build Artifact from the Build Pipeline
In my previous blog, when we created the build pipeline, we zipped the build artifacts and dropped them in the artifact location. In this step we extract the zipped build artifact so it can be used for deployment (a sketch of the extraction follows below).

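If the build pipeline dropped the artifacts as a zip archive, an inline script on the agent could extract them roughly like this; the archive name DatabricksBuild.zip is hypothetical, and the built-in Extract files task achieves the same thing:
# Hypothetical archive name; adjust to whatever your build pipeline publishes
unzip -o $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/DatabricksBuild.zip -d $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/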
STEP 9: Install the Databricks CLI and Unit Test XML Reporting
We will install the Databricks CLI and the unittest-xml-reporting Python package on the release agent, as both are used in the upcoming steps (see the sketch below).

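One way to do this in an inline script on the agent (both package names as published on PyPI):
pip install databricks-cli unittest-xml-reporting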
STEP 10: Deploy the notebook to the workspace
Now we have to import the Python notebook from the artifacts directory into the Databricks workspace. That is exactly what this CLI command does.
databricks workspace import --language=PYTHON --format=SOURCE --overwrite $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/notebooks/dbxdemo-notebook.py /Shared/dbxdemo-notebook.py
Let’s break down this command so you can tweak it to your own needs:
- databricks workspace import: the command to import a file or notebook into the Databricks workspace.
- --language=PYTHON: specifies the language of the file or notebook being imported; in this case, a Python notebook.
- --format=SOURCE: specifies the format of the file or notebook being imported; in this case, a source file.
- --overwrite: if a notebook with the same name already exists in the Databricks workspace, it is overwritten.
- $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/notebooks/dbxdemo-notebook.py: the path to the notebook file in the artifacts directory.
- /Shared/dbxdemo-notebook.py: the path where the notebook will be imported into the Databricks workspace.

STEP 11: Copy Python Wheel to the workspace
Now we will copy a Python library wheel file from the artifacts directory to the Databricks File System (DBFS) in the Databricks workspace. Let me explain how this command works.
databricks fs cp --overwrite $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/libraries/python/libs/dbxdemo-0.1.0-py3-none-any.whl dbfs:/libraries/python/libs/dbxdemo-0.1.0-py3-none-any.whl
- databricks fs cp: the command to copy files between the local file system and DBFS.
- --overwrite: if a file with the same name already exists at the destination, it is overwritten.
- $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/libraries/python/libs/dbxdemo-0.1.0-py3-none-any.whl: the path to the Python library wheel file in the artifacts directory.
- dbfs:/libraries/python/libs/dbxdemo-0.1.0-py3-none-any.whl: the path where the file will be copied to in DBFS.

STEP 12: Install the Python Wheel library to cluster
This step runs a Python script called “installWhlLibrary.py”, which installs a Python wheel library on a Databricks cluster. Let me explain the command:
$(Release.PrimaryArtifactSourceAlias)/Databricks/cicd-scripts/installWhlLibrary.py --shard=$(DATABRICKS_HOST) --token=$(DATABRICKS_TOKEN) --clusterid=$(DATABRICKS_CLUSTER_ID) --libs=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/libraries/python/libs/ --dbfspath=/libraries/python/libs
- $(Release.PrimaryArtifactSourceAlias)/Databricks/cicd-scripts/installWhlLibrary.py: the path to the Python script in the artifacts directory that installs the library.
- --shard=$(DATABRICKS_HOST): the hostname of the Databricks workspace to connect to.
- --token=$(DATABRICKS_TOKEN): the Databricks access token to use for authentication.
- --clusterid=$(DATABRICKS_CLUSTER_ID): the ID of the Databricks cluster where the library will be installed.
- --libs=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/libraries/python/libs/: the path to the directory containing the Python library wheel file in the artifacts directory.
- --dbfspath=/libraries/python/libs: the DBFS path where the library will be installed on the Databricks cluster.

This Python script installs Python .whl libraries on a Databricks cluster. It uses the Databricks REST API to interact with the cluster and performs the following steps:
- Parse the command-line arguments using getopt.
- Walk the local file path specified in libspath to generate the list of .whl files to evaluate.
- For each library in the list, evaluate whether it needs to be installed, uninstalled and reinstalled, or left as is.
- If the library is not found on the cluster, install it using the installLib function.
- If the library is found on the cluster, uninstall it using the uninstallLib function, restart the cluster using restartCluster, and then install it again using installLib.
The getLibStatus function determines whether a library is already installed on the cluster and what its current status is. The main function is the entry point of the script and calls the other functions to perform the library installation.
# installWhlLibrary.py
#!/usr/bin/python3
import json
import requests
import sys
import getopt
import time
import os


def main():
    shard = ''
    token = ''
    clusterid = ''
    libspath = ''
    dbfspath = ''

    try:
        # Short options that take a value need a trailing colon.
        opts, args = getopt.getopt(sys.argv[1:], 'hs:t:c:l:d:',
                                   ['shard=', 'token=', 'clusterid=', 'libs=', 'dbfspath='])
    except getopt.GetoptError:
        print(
            'installWhlLibrary.py -s <shard> -t <token> -c <clusterid> -l <libs> -d <dbfspath>')
        sys.exit(2)

    for opt, arg in opts:
        if opt == '-h':
            print(
                'installWhlLibrary.py -s <shard> -t <token> -c <clusterid> -l <libs> -d <dbfspath>')
            sys.exit()
        elif opt in ('-s', '--shard'):
            shard = arg
        elif opt in ('-t', '--token'):
            token = arg
        elif opt in ('-c', '--clusterid'):
            clusterid = arg
        elif opt in ('-l', '--libs'):
            libspath = arg
        elif opt in ('-d', '--dbfspath'):
            dbfspath = arg

    print('-s is ' + shard)
    print('-t is ' + token)
    print('-c is ' + clusterid)
    print('-l is ' + libspath)
    print('-d is ' + dbfspath)

    # Generate the list of .whl files from walking the local path.
    libslist = []
    for path, subdirs, files in os.walk(libspath):
        for name in files:
            name, file_extension = os.path.splitext(name)
            if file_extension.lower() in ['.whl']:
                print('Adding ' + name + file_extension.lower() + ' to the list of .whl files to evaluate.')
                libslist.append(name + file_extension.lower())

    for lib in libslist:
        dbfslib = 'dbfs:' + dbfspath + '/' + lib
        print('Evaluating whether ' + dbfslib + ' must be installed, or uninstalled and reinstalled.')

        if getLibStatus(shard, token, clusterid, dbfslib) is not None:
            print(dbfslib + ' status: ' + getLibStatus(shard, token, clusterid, dbfslib))
            if getLibStatus(shard, token, clusterid, dbfslib) == "not found":
                print(dbfslib + ' not found. Installing.')
                installLib(shard, token, clusterid, dbfslib)
            else:
                print(dbfslib + ' found. Uninstalling.')
                uninstallLib(shard, token, clusterid, dbfslib)
                print("Restarting cluster: " + clusterid)
                restartCluster(shard, token, clusterid)
                print('Installing ' + dbfslib + '.')
                installLib(shard, token, clusterid, dbfslib)


def uninstallLib(shard, token, clusterid, dbfslib):
    # Mark the wheel for uninstallation; it is removed when the cluster restarts.
    values = {'cluster_id': clusterid, 'libraries': [{'whl': dbfslib}]}
    requests.post(shard + '/api/2.0/libraries/uninstall', data=json.dumps(values), auth=("token", token))


def restartCluster(shard, token, clusterid):
    # Restart the cluster and poll (up to roughly five minutes) until it reaches a stable state.
    values = {'cluster_id': clusterid}
    requests.post(shard + '/api/2.0/clusters/restart', data=json.dumps(values), auth=("token", token))

    p = 0
    waiting = True
    while waiting:
        time.sleep(30)
        clusterresp = requests.get(shard + '/api/2.0/clusters/get?cluster_id=' + clusterid,
                                   auth=("token", token))
        clusterjson = clusterresp.text
        jsonout = json.loads(clusterjson)
        current_state = jsonout['state']
        print(clusterid + " state: " + current_state)
        if current_state in ['TERMINATED', 'RUNNING', 'INTERNAL_ERROR', 'SKIPPED'] or p >= 10:
            break
        p = p + 1


def installLib(shard, token, clusterid, dbfslib):
    values = {'cluster_id': clusterid, 'libraries': [{'whl': dbfslib}]}
    requests.post(shard + '/api/2.0/libraries/install', data=json.dumps(values), auth=("token", token))


def getLibStatus(shard, token, clusterid, dbfslib):
    # Return the install status of the wheel on the cluster, or "not found".
    resp = requests.get(shard + '/api/2.0/libraries/cluster-status?cluster_id=' + clusterid, auth=("token", token))
    libjson = resp.text
    d = json.loads(libjson)
    if d.get('library_statuses'):
        statuses = d['library_statuses']
        for status in statuses:
            if status['library'].get('whl'):
                if status['library']['whl'] == dbfslib:
                    return status['status']
    # No libraries found, or this wheel is not among them.
    return "not found"


if __name__ == '__main__':
    main()
STEP 13: Create Integration Test Directories
Here we create the integration test directories and install the test dependencies. Let’s understand these commands.
This step runs three commands: two mkdir commands that create log directories in the artifacts directory, and a pip install that installs the Python test packages.
mkdir -p $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/logs/json
mkdir -p $(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/logs/xml
pip install pytest requests
- The first two commands create directories named “json” and “xml” inside a “logs” directory, which in turn sits under “$(Release.PrimaryArtifactSourceAlias)/Databricks” in the artifacts directory. The -p option ensures that any missing parent directories are created.
- The pip command installs two Python packages: pytest, a testing framework for Python, and requests, a package for making HTTP requests in Python.

STEP 14: Run Notebooks and Understand the executenotebook.py Python Script
This step runs a Python script called “executenotebook.py”, which executes the Databricks notebooks and saves the results in JSON log files. Let’s break down the command (the fully assembled command is shown after the list):
- $(Release.PrimaryArtifactSourceAlias)/Databricks/cicd-scripts/executenotebook.py: the path to the Python script in the artifacts directory that executes the notebooks.
- --shard=$(DATABRICKS_HOST): the hostname of the Databricks workspace to connect to.
- --token=$(DATABRICKS_TOKEN): the Databricks access token to use for authentication.
- --clusterid=$(DATABRICKS_CLUSTER_ID): the ID of the Databricks cluster where the notebooks will be executed.
- --localpath=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/notebooks: the path to the directory containing the Databricks notebooks in the artifacts directory.
- --workspacepath=/Shared: the path where the notebooks will be uploaded in the Databricks workspace.
- --outfilepath=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/logs/json: the path where the JSON log files containing the results of the notebook execution will be saved.
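Putting these options together, the full command for this step, mirroring the format of the step 12 command, looks like this:
$(Release.PrimaryArtifactSourceAlias)/Databricks/cicd-scripts/executenotebook.py --shard=$(DATABRICKS_HOST) --token=$(DATABRICKS_TOKEN) --clusterid=$(DATABRICKS_CLUSTER_ID) --localpath=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/notebooks --workspacepath=/Shared --outfilepath=$(System.ArtifactsDirectory)/$(Release.PrimaryArtifactSourceAlias)/Databricks/logs/json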

The Python script “executenotebook.py” is designed to execute a set of notebooks on a Databricks cluster. The script takes command-line arguments to configure the execution, including the Databricks cluster and authentication information, local and workspace paths to the notebooks, and an output file path.
The script first parses the command-line arguments using the getopt module and prints them to the console. Then it walks the local path to discover the notebooks and generates the list of notebooks to execute.
The script then executes each notebook in the list by submitting a job run request to Databricks. The request specifies the notebook to execute, the cluster to execute on, and a timeout value. The script then waits for the job to complete by polling the Databricks REST API until the run’s state indicates that it has terminated, hit an internal error, or was skipped.
If an output file path is specified, the script writes the JSON response from the run to a file named after the run ID in the specified output directory.
Finally, the main() function calls the steps above in order, and the if __name__ == '__main__': block at the end of the script ensures that main() is only called when the script is run directly, not when it is imported as a module.
# executenotebook.py
#!/usr/bin/python3
import json
import requests
import os
import sys
import getopt
import time


def main():
    shard = ''
    token = ''
    clusterid = ''
    localpath = ''
    workspacepath = ''
    outfilepath = ''

    try:
        # Short options that take a value need a trailing colon.
        opts, args = getopt.getopt(sys.argv[1:], 'hs:t:c:l:w:o:',
                                   ['shard=', 'token=', 'clusterid=', 'localpath=', 'workspacepath=', 'outfilepath='])
    except getopt.GetoptError:
        print(
            'executenotebook.py -s <shard> -t <token> -c <clusterid> -l <localpath> -w <workspacepath> -o <outfilepath>')
        sys.exit(2)

    for opt, arg in opts:
        if opt == '-h':
            print(
                'executenotebook.py -s <shard> -t <token> -c <clusterid> -l <localpath> -w <workspacepath> -o <outfilepath>')
            sys.exit()
        elif opt in ('-s', '--shard'):
            shard = arg
        elif opt in ('-t', '--token'):
            token = arg
        elif opt in ('-c', '--clusterid'):
            clusterid = arg
        elif opt in ('-l', '--localpath'):
            localpath = arg
        elif opt in ('-w', '--workspacepath'):
            workspacepath = arg
        elif opt in ('-o', '--outfilepath'):
            outfilepath = arg

    print('-s is ' + shard)
    print('-t is ' + token)
    print('-c is ' + clusterid)
    print('-l is ' + localpath)
    print('-w is ' + workspacepath)
    print('-o is ' + outfilepath)

    # Generate the list of notebooks from walking the local path.
    notebooks = []
    for path, subdirs, files in os.walk(localpath):
        for name in files:
            fullpath = path + '/' + name
            # Remove the localpath to the repo but keep the workspace path.
            fullworkspacepath = workspacepath + path.replace(localpath, '')
            name, file_extension = os.path.splitext(fullpath)
            if file_extension.lower() in ['.scala', '.sql', '.r', '.py']:
                row = [fullpath, fullworkspacepath, 1]
                notebooks.append(row)

    # Run each notebook in the list.
    for notebook in notebooks:
        nameonly = os.path.basename(notebook[0])
        workspacepath = notebook[1]

        name, file_extension = os.path.splitext(nameonly)
        # workspacepath removes the extension, so now add it back.
        fullworkspacepath = workspacepath + '/' + name + file_extension

        print('Running job for: ' + fullworkspacepath)
        values = {'run_name': name, 'existing_cluster_id': clusterid, 'timeout_seconds': 3600,
                  'notebook_task': {'notebook_path': fullworkspacepath}}

        resp = requests.post(shard + '/api/2.0/jobs/runs/submit',
                             data=json.dumps(values), auth=("token", token))
        runjson = resp.text
        print("runjson: " + runjson)
        d = json.loads(runjson)
        runid = d['run_id']

        # Poll the run until it reaches a terminal state (or the polling limit is hit).
        i = 0
        waiting = True
        while waiting:
            time.sleep(10)
            jobresp = requests.get(shard + '/api/2.0/jobs/runs/get?run_id=' + str(runid),
                                   data=json.dumps(values), auth=("token", token))
            jobjson = jobresp.text
            print("jobjson: " + jobjson)
            j = json.loads(jobjson)
            current_state = j['state']['life_cycle_state']
            runid = j['run_id']
            if current_state in ['TERMINATED', 'INTERNAL_ERROR', 'SKIPPED'] or i >= 12:
                break
            i = i + 1

        # Save the final run JSON so it can be evaluated in a later step.
        if outfilepath != '':
            file = open(outfilepath + '/' + str(runid) + '.json', 'w')
            file.write(json.dumps(j))
            file.close()


if __name__ == '__main__':
    main()
STEP 15: Create and Evaluate Notebook Test Results with the evaluatenotebookruns.py Python Script
This Python script evaluates the notebook runs. Let’s understand this Python script:
The Python script “evaluatenotebookruns.py” contains a unit test class named “TestJobOutput” with two test methods: “test_performance” and “test_job_run”.
The purpose of this script is to evaluate the output of Databricks notebook runs by processing JSON log files located in the specified path (self.test_output_path) and asserting that they meet certain criteria.
In the “test_performance” method, each JSON log file is loaded and the execution duration of the notebook run is checked. The execution_duration field is reported in milliseconds, so if the duration exceeds 100,000 ms the test fails; otherwise it succeeds.
In the “test_job_run” method, each JSON log file is loaded and the job’s result state is checked. If any job run failed, the test fails; otherwise it succeeds.
After running all tests, the results are output to an XML file named “TEST-report.xml” using the xmlrunner module. The output is transformed into an XML format that can be read by continuous integration tools such as Azure DevOps or Jenkins.
This script is intended to be run in a continuous integration (CI) environment to ensure that all Databricks notebook runs meet certain criteria and pass specific tests.

# evaluatenotebookruns.py
#!/usr/bin/python3
import io
import xmlrunner
from xmlrunner.extra.xunit_plugin import transform
import unittest
import json
import glob
import os


class TestJobOutput(unittest.TestCase):

    # Path on the release agent that holds the JSON run logs produced in the previous step.
    test_output_path = '<path-to-json-logs-on-release-agent>'

    def test_performance(self):
        # Fail if any notebook run took longer than 100,000 ms.
        path = self.test_output_path
        statuses = []

        for filename in glob.glob(os.path.join(path, '*.json')):
            print('Evaluating: ' + filename)
            data = json.load(open(filename))
            duration = data['execution_duration']
            if duration > 100000:
                status = 'FAILED'
            else:
                status = 'SUCCESS'
            statuses.append(status)

        self.assertFalse('FAILED' in statuses)

    def test_job_run(self):
        # Fail if any notebook run did not finish successfully.
        path = self.test_output_path
        statuses = []

        for filename in glob.glob(os.path.join(path, '*.json')):
            print('Evaluating: ' + filename)
            data = json.load(open(filename))
            status = data['state']['result_state']
            statuses.append(status)

        self.assertFalse('FAILED' in statuses)


if __name__ == '__main__':
    out = io.BytesIO()

    unittest.main(testRunner=xmlrunner.XMLTestRunner(output=out),
                  failfast=False, buffer=False, catchbreak=False, exit=False)

    with open('TEST-report.xml', 'wb') as report:
        report.write(transform(out.getvalue()))
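For reference, a hedged sketch of how this evaluation script might be invoked on the release agent (the cicd-scripts location mirrors the earlier steps and is an assumption; test_output_path inside the script must point at the logs/json directory created in step 13):
python3 $(Release.PrimaryArtifactSourceAlias)/Databricks/cicd-scripts/evaluatenotebookruns.py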
STEP 16: Publish Test Results
This step publishes the Python unit test results, produced in JUnit XML format, to the test results file path so the results can be viewed in Azure DevOps.

STEP 17: Test the end-to-end Execution
When you run the end-to-end pipeline, this is how it will look in the release pipeline.

Conclusion
In this blog, Streamline Databricks Workflows with Azure DevOps Release Pipelines, we learned that the process of developing a release pipeline involves multiple steps, including selecting the right job template, adding artifacts, defining environment variables, setting up the release agent, and configuring continuous deployment triggers. It also involves deploying the notebook to the workspace, installing the necessary dependencies, running tests, and publishing test results. While the process may seem complex, it can help automate the deployment process, reduce errors, and ensure that the workspace is always up-to-date with the latest changes. By following the steps outlined in this blog post, you can create a reliable and robust release pipeline that meets your organization’s specific needs and requirements.