About App Workbench 4.0

The BlueData App Workbench is a tool that allows you to create, assemble, and share Dockerized applications that can run on the BlueData EPIC platform. App Workbench version 4.0 (also called bdwb) is a Python-based CLI framework that provides a rich set of APIs, macros, and a shell to:

  • Create Docker images from Dockerfiles based on BlueData images.
  • Orchestrate the runtime environment for single and multi-node deployments.
  • Package and load images into the BlueData EPIC App Store (that is, as new Catalog entries).

This functionality allows EPIC Platform Administrators to manage, create, and update the preferred applications and versions of applications available to their end users in the BlueData App Store.

The App Workbench focuses mainly on the following three use cases:

  1. Modify or upgrade an existing Hadoop or Spark distribution in the App Store. For example:
    • Modify Cloudera CDH version 6.1 and add a security patch to the base image.
    • Create a new CDH version 6.1 image starting from an existing CDH 5.12.1 image.
  2. Add a new application as an edge node to a Hadoop or Spark cluster with auto-provisioning. For example:
    • Spark with Jupyter Notebook could be one team's preferred tool for operational analytics; those end users may want “Jupyter with Spark on Hadoop” as an edge node for new Hadoop deployments.
    • Users with a cluster dedicated to ETL may want NiFi/HDF as an edge node, pre-wired for immediate use.
  3. Create new images for Big Data applications and frameworks. For example:
    • Data science teams often need tools beyond Hadoop and Spark – they'll want to add Kafka, Cassandra, and other applications like H2O for their testing, development, prototyping, and experimentation.
    • A user running Spark on YARN in a Hadoop cluster may be interested in trying Spark 2.3.1 or Spark 2.4.0 standalone, and they may want to add tools such as Zeppelin or Jupyter notebooks and Spark Job Server to their Spark clusters.

The BlueData EPIC platform and App Workbench support all of the scenarios described above, so organizations can maintain and run multiple applications and tools in parallel to cover a wide variety of Big Data use cases.

How the App Workbench Works

Note: You must have App Workbench installed as described in Installing App Workbench before running the bdwb tool.

As shown in the command line example below, bdwb supports the following:

  • Interactive shell
  • Inline help for commands and subcommands
  • Autocomplete and contextual help
  • A rich command set that includes commands to build images, configure Docker instances, and define new catalog entries and options.
                        [root@yav-157 ~]#
                        [root@yav-157 ~]# bdwb    <-- This command launches the interactive App Workbench shell.
                        BlueData workbench version 4.0.
                        Executing in interactive mode.
                        bdwb> help
                        
                        Documented commands (type help <topic>):
                        ----------------------------------------
                        EOF        baseimg  catalog        define  image  role     sources
                        appconfig  builder  clusterconfig  exit    logo   service  workbench

                        attach     service  document
                        
                        Undocumented commands:
                        ----------------------
                        help
                        
                        bdwb> help image    <-- Inline help for commands.
                        usage: image [-h] {load,pull,package,list,download,build,push,registry} ...
                        
                        Container image management for the catalog entry.
                        
                        optional arguments:
                          -h, --help        show this help message and exit
                        
                        Subcommands:
                          {load,pull,package,list,download,build,push,registry}
                            load                Load an image
                            list                List the configured container image.
                            pull                Pull an image from the registry.
                            push                Push a docker image to a registry and refer to it in
                                                the catalog entry metadata.
                            build               Build a catalog image from a Dockerfile. No additional
                                                action is taken.
                            package             Package an image in the local registry into catalog
                                                bundle as a file.
                            download            Download the image file from a HTTP url and add it to
                                                the catalog entry.
                            registry            Registry information for the
                        bdwb> help image list    <-- Extended help for sub-commands.
                        usage: image list [-h]
                        
                        optional arguments:
                          -h, --help  show this help message and exit
                        bdwb> .
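
The shell behavior shown above (a help index, per-command usage text, and tab completion) can be illustrated with Python's standard cmd and argparse modules. This is a minimal sketch only: the source says bdwb is Python-based but does not say how it is implemented, so the WorkbenchShell class and its two subcommands are hypothetical stand-ins, not bdwb code.

```python
import argparse
import cmd
import io


class WorkbenchShell(cmd.Cmd):
    """Toy interactive shell; an illustration, NOT the real bdwb implementation."""

    prompt = "bdwb> "

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # An argparse parser supplies the usage text for the 'image' command.
        self.image_parser = argparse.ArgumentParser(
            prog="image",
            description="Container image management for the catalog entry.",
        )
        sub = self.image_parser.add_subparsers(dest="subcommand")
        sub.add_parser("list", help="List the configured container image.")
        sub.add_parser("pull", help="Pull an image from the registry.")

    def do_image(self, line):
        """Container image management for the catalog entry."""
        try:
            args = self.image_parser.parse_args(line.split())
        except SystemExit:
            # argparse exits on -h or invalid input; keep the shell alive.
            return
        if args.subcommand is None:
            self.stdout.write(self.image_parser.format_usage())
        else:
            self.stdout.write("would run: image %s\n" % args.subcommand)

    def help_image(self):
        # 'help image' prints the argparse help instead of the docstring.
        self.stdout.write(self.image_parser.format_help())

    def complete_image(self, text, line, begidx, endidx):
        # Tab completion for the subcommand names.
        return [name for name in ("list", "pull") if name.startswith(text)]

    def do_exit(self, line):
        """Leave the shell."""
        return True


# Drive the shell without a terminal and capture what it prints.
captured = io.StringIO()
shell = WorkbenchShell(stdout=captured)
shell.onecmd("help")
shell.onecmd("help image")
print(captured.getvalue())
```

Calling shell.cmdloop() instead of onecmd() would start the interactive prompt. Pairing one argparse parser with each top-level command is a design choice of this sketch; it keeps the usage text and the argument parsing in one place.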

To add a new application or modify an existing one, users can either run these commands interactively or create a .wb file containing a series of commands. Much like a Python script, running a .wb file runs all of the commands in that file. Below is a sample .wb file; note the #!/usr/bin/env bdwb line at the top, which tells the system to run the file with bdwb.

                        #! /usr/bin/env bdwb
                        ##################################################################
                        #                                                               #
                        #  Sample workbench instructions for building Apache Kafka 1.0  #
                        #                                                               #
                        #################################################################
                        
                        # YOUR_ORGANIZATION_NAME must be replaced with a valid organization name. Please
                        # refer to 'help builder organization' for details.
                        #
                        # Organization name required for tracking.
                        builder organization --name YOUR_ORGANIZATION_NAME
                        
                        ## Begin a new catalog entry
                        catalog new --distroid kafka --name "Apache Kafka 1.0"                         \
                                    --desc "Apache Kafka is an open-source stream platform             \
                                            developed by the Apache Software Foundation written        \
                                            in Scala. The project aims to provide a unified,           \
                                            high-throughput, low-latency platform for handling         \
                                            real-time data feeds."                                     \
                                    --categories Kafka --version 1.0
                        
                        ## Define all node roles for the virtual cluster.
                        # Sample role definition in a multi-node/multi-role cluster.
                        role add worker 1
                        
                        ## Define all services that are available to the virtual cluster.
                        # Services that are started automatically during deployment.
                        service add --srvcid kafka_broker --name "Kafka service" --port 9092
                        
                        ## Define run time placement of the services.
                        clusterconfig new --configid default
                        
                        # Instructions for autogenerating a simple appconfig bundle.
                        appconfig autogen --pkgfile server.properties --dest /usr/lib/kafka/*/config/server.properties
                        
                        # The order of services defined here is the order in which they are brought up.
                        # The services defined here are only registered and started if the specific
                        # node is expected to run the service.
                        appconfig autogen --srvcid kafka_broker --sysv kafka_broker
                        
                        appconfig package
                        
                        ## Logo: a picture file that can be included in the Catalog.
                        logo file --filepath appconfig/Logo_Kafka.png
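
Because a .wb file is just a series of commands, it can help to list the commands a script contains before running it. The Python sketch below does that under two assumptions drawn only from the sample above: lines starting with # are comments, and a trailing backslash continues a command onto the next line. The logical_commands helper is hypothetical and is not bdwb's own parser.

```python
def logical_commands(script_text):
    """Return the commands in a .wb script, one string per command."""
    commands = []
    pending = []  # pieces of a command continued with a trailing backslash
    for raw in script_text.splitlines():
        line = raw.strip()
        if not pending and (not line or line.startswith("#")):
            continue  # blank line, comment, or the #! line
        if line.endswith("\\"):
            pending.append(line[:-1].strip())
            continue
        pending.append(line)
        commands.append(" ".join(pending))
        pending = []
    if pending:  # the file ended in the middle of a continued command
        commands.append(" ".join(pending))
    return commands


# A shortened version of the sample script, using only commands shown above.
SAMPLE = r'''#!/usr/bin/env bdwb
## Begin a new catalog entry
catalog new --distroid kafka --name "Apache Kafka 1.0" \
            --categories Kafka --version 1.0

## Define all node roles for the virtual cluster.
role add worker 1
service add --srvcid kafka_broker --name "Kafka service" --port 9092
'''

for command in logical_commands(SAMPLE):
    print(command)
```

Note that joining continued lines this way collapses the padding inside a quoted value that spans several lines, such as the --desc text in the full sample, so the result is suitable for review rather than for feeding back to the tool.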

The following screenshot shows the App Store in the BlueData EPIC web interface, followed by the corresponding binaries from a specific BlueData installation:



                        [root@yav-028 install]# ls -lt
                        total 0
                        drwxr-sr-x. 2 root root 208 Oct 24 21:00 bdcatalog-centos7-bluedata-helloworld-3.0
                        drwxrwsr-x. 2 root root 202 Oct 24 19:19 bdcatalog-centos-bluedata-splunk63-1.1
                        drwxr-xr-x. 2 root root 202 Oct 24 19:13 bdcatalog-centos6-bluedata-cdh5101-4.0
                        drwxr-xr-x. 2 root root 180 Oct 24 19:07 bdcatalog-centos-bluedata-mesos-1.2
                        drwxr-xr-x. 2 root root 192 Oct 24 19:03 bdcatalog-ubuntu-bluedata-ubuntu16--1.4
                        drwxr-xr-x. 2 root root 197 Oct 24 19:00 bdcatalog-centos-bluedata-cassandra-2.7
                        drwxr-xr-x. 2 root root 172 Oct 24 18:59 bdcatalog-rhel-bluedata-rhel6-2.6
                        drwxr-xr-x. 2 root root 174 Oct 24 18:59 bdcatalog-rhel-bluedata-rhel7-2.0
                        drwxr-xr-x. 2 root root 195 Oct 24 18:53 bdcatalog-centos-bluedata-cdh591-1.0
                        drwxr-xr-x. 2 root root 202 Oct 24 18:44 bdcatalog-centos-bluedata-spark211n-1.0
                        drwxr-xr-x. 2 root root 197 Oct 24 18:39 bdcatalog-centos-bluedata-spark201-1.2
                        drwxr-xr-x. 2 root root 197 Oct 24 18:30 bdcatalog-centos-bluedata-mapr510-2.3
                        drwxr-xr-x. 2 root root 191 Oct 24 18:25 bdcatalog-centos-bluedata-cdh57-2.1
                        drwxr-xr-x. 2 root root 186 Oct 24 18:19 bdcatalog-centos-bluedata-centos6-2.6
                        drwxr-xr-x. 2 root root 222 Oct 24 18:07 bdcatalog-centos-bluedata-hdp26-ambari-3.0
                        drwxr-xr-x. 2 root root 188 Oct 24 17:56 bdcatalog-centos6-bluedata-appwb-3.1.3
                        drwxr-xr-x. 2 root root 207 Oct 24 17:55 bdcatalog-centos-bluedata-cass_2_1_10-2.5
                        drwxr-xr-x. 2 root root 220 Oct 24 17:48 bdcatalog-centos-bluedata-hdp24-ambari-1.7
                        drwxr-xr-x. 2 root root 217 Oct 24 17:43 bdcatalog-centos-bluedata-rstudio136sp210-3.0
                        drwxr-xr-x. 2 root root 188 Oct 24 17:40 bdcatalog-centos-bluedata-centos7-2.0
                        drwxr-xr-x. 2 root root 222 Oct 24 17:30 bdcatalog-centos-bluedata-hdp25-ambari-3.0
                        drwxr-xr-x. 2 root root 185 Oct 24 17:19 bdcatalog-centos-bluedata-kafka-1.4
                        drwxr-xr-x. 2 root root 192 Oct 24 17:19 bdcatalog-centos-bluedata-epictest-1.0
                        drwxr-xr-x. 2 root root 211 Oct 24 17:14 bdcatalog-centos-bluedata-spark201-edge-2.3
                        drwxr-xr-x. 2 root root 193 Oct 24 17:11 bdcatalog-centos-bluedata-spark16-1.8
                        drwxr-xr-x. 2 root root 208 Oct 24 17:08 bdcatalog-centos-bluedata-spark16-edge-2.1
                        drwxr-xr-x. 2 root root 224 Oct 24 17:00 bdcatalog-centos-bluedata-hdp24-edge-1.0
                        drwxr-xr-x. 2 root root 217 Oct 24 16:55 bdcatalog-centos-bluedata-cdh591-edge-1.1
                        drwxr-xr-x. 2 root root 212 Oct 24 16:50 bdcatalog-centos-bluedata-cdh57-edge-1.4
                        drwxr-xr-x. 2 root root 226 Oct 24 16:33 bdcatalog-centos-bluedata-hdp26-edge-3.0
                        drwxr-xr-x. 2 root root 226 Oct 24 16:23 bdcatalog-centos-bluedata-bluedata-hdp-edge25-3.0
                        drwxr-xr-x. 2 root root 205 Oct 24 16:16 bdcatalog-centos6-bluedata-bluedata-cdh5101-edge-2.6
                        [root@yav-028 install]# .
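
The directory names in this listing appear to follow a bdcatalog-<os>-<organization>-<id>-<version> pattern. That layout is inferred from the listing alone and is not documented in this section, so treat the Python sketch below as an assumption-laden illustration of how such names might be split for an inventory report. The CatalogDir type and parse_catalog_dir function are hypothetical names.

```python
from typing import NamedTuple


class CatalogDir(NamedTuple):
    os_name: str
    organization: str
    distro_id: str
    version: str


def parse_catalog_dir(name):
    """Split a name using the ASSUMED bdcatalog-<os>-<org>-<id>-<version> layout."""
    prefix, os_name, organization, rest = name.split("-", 3)
    if prefix != "bdcatalog":
        raise ValueError("not a catalog directory: %r" % name)
    # The version is whatever follows the last hyphen; the ID may contain hyphens.
    distro_id, _, version = rest.rpartition("-")
    return CatalogDir(os_name, organization, distro_id, version)


for name in (
    "bdcatalog-centos-bluedata-kafka-1.4",
    "bdcatalog-centos-bluedata-hdp26-ambari-3.0",
    "bdcatalog-centos6-bluedata-cdh5101-4.0",
):
    print(parse_catalog_dir(name))
```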