diff --git a/docs/index.rst b/docs/index.rst
index d6a3ff2ae..4220a45f6 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -30,6 +30,19 @@ Welcome to DeepForge's documentation!
     deployment/overview.rst
     deployment/native.rst
 
+.. toctree::
+    :maxdepth: 1
+    :caption: Step-by-Step Guides
+
+    walkthrough/introduction.rst
+    walkthrough/creating-pipelines.rst
+    walkthrough/creating-operations.rst
+    walkthrough/creating-neural-networks.rst
+    walkthrough/executing-pipelines.rst
+    walkthrough/viewing-executions.rst
+    walkthrough/CIFAR-10-classifier.rst
+    walkthrough/redshift-estimator.rst
+
 .. toctree::
     :maxdepth: 1
     :caption: Tutorials and Examples
diff --git a/docs/walkthrough/CIFAR-10-classifier.rst b/docs/walkthrough/CIFAR-10-classifier.rst
new file mode 100644
index 000000000..4f8581007
--- /dev/null
+++ b/docs/walkthrough/CIFAR-10-classifier.rst
@@ -0,0 +1,454 @@
+CIFAR-10 Classifier
+-------------------
+This guide provides step-by-step instructions on how to create a full pipeline for training and evaluating a simple image classification neural network. This example uses the `CIFAR-10 dataset `_. This guide assumes that the reader has a basic understanding of the DeepForge interface. New users are recommended to review the `step-by-step guides `_ before attempting the process described in this guide.
+
+Pipeline Overview
+=================
+.. figure:: images/cifar-pipeline-blank.png
+    :align: center
+    :scale: 50 %
+
+This guide will give a step-by-step process beginning with a new, blank pipeline (shown above) and ending with the pipeline shown below that will create, train, and evaluate a CIFAR-10 classifier.
+
+.. figure:: images/cifar-pipeline-final.png
+    :align: center
+    :scale: 50 %
+
+GetCifarData Operation
+======================
+To create your first operation, click on the floating red button in the bottom right of the pipeline editor workspace, and click on the *New Operation* option that appears.
+
+This operation provides the pipeline with the training and testing data that will be used by later operations. In many cases, this will be accomplished with *Input* operations, but it may be preferable in some cases to retrieve the data programmatically.
+
+The first step in any operation should be giving it a name, defining its attributes, and defining its inputs and outputs. These steps are best performed in the right-side panel of the operation editor.
+
+Our GetCifarData operation will produce four outputs, representing the images and labels from the training and testing sets. This operation does not require any inputs or attributes.
+
+.. figure:: images/get-data-io.png
+    :align: center
+    :scale: 50 %
+
+The next step in creating any operation is defining its implementation. This is performed in the left panel using the Python programming language. Every operation is defined as a Python class that must include an *execute* function. Arbitrary code can be included in the operation, but it will only run if it is called, directly or indirectly, from within the *execute* function.
+
+CIFAR-10 is a very common benchmarking dataset. As such, the popular Keras neural network library provides a simple method for directly downloading and using the data. The code for doing this is relatively straightforward and is shown below.
+
+.. code-block:: python
+
+    from keras.datasets import cifar10
+
+    class GetCifarData():
+
+        def execute(self):
+            print('Retrieving CIFAR-10 train_imgs')
+
+            # Retrieve CIFAR-10 data. load_data() returns a 2-tuple of 2-tuples.
+            # The left-hand side decomposes these tuples into four separate variables.
+            (train_imgs,train_labels),(test_imgs,test_labels) = cifar10.load_data()
+
+            print('CIFAR-10 train_imgs successfully retrieved')
+            print('Training set shape: {shape}'.format(shape=train_imgs.shape))
+            print('Testing set shape: {shape}'.format(shape=test_imgs.shape))
+
+            return train_imgs, train_labels, test_imgs, test_labels
+
+When finished, return to the pipeline and use the add operation button again to add the new operation to the pipeline. At this point, you should have a single operation with four outputs, as shown below:
+
+.. figure:: images/get-cifar-pipeline.png
+    :align: center
+    :scale: 50%
+
+TrainCifar Operation
+====================
+The next operation will create and train the neural network classifier.
+
+Once again, our first step after naming is to define the inputs and outputs of the operation. Unlike the previous operation, two attributes should be added: *batch_size* and *epochs*. Batch size is the number of training samples that the model will be trained on at a time, and epochs is the number of times that each training sample will be given to the model. Both are important hyperparameters for a neural network. For this guide, the attributes are defined as shown below, but the default values can be changed as desired by the reader.
+
+.. figure:: images/train-cifar-attr.png
+    :align: center
+    :scale: 50%
+
+This operation will require two inputs (images and labels) and a reference to a neural network architecture. Finally, the operation produces one output, which is the trained classifier model. After all inputs, outputs, and attributes have been added, the structure of the operation should appear similar to the following:
+
+.. figure:: images/train-cifar-io.png
+    :align: center
+    :scale: 50%
+
+The code for this operation follows the standard procedure for creating and training a Keras network, as shown below.
Note that the attributes must be assigned to class variables in the *__init__* function in order to be used in the *execute* function. Also note that we do not need to import the keras library explicitly here, because the architecture object already comes with all the currently needed Keras functions attached.
+
+.. code-block:: python
+
+    class TrainCifar():
+
+        # Runs when preparing the operation for execution
+        def __init__(self, architecture, batch_size=32, epochs=20):
+            print("Initializing Trainer")
+
+            # Saves attributes as class variables for later use
+            self.arch = architecture
+            self.epochs = epochs
+            self.batch_size = batch_size
+            return
+
+        # Runs when the operation is actually executed
+        def execute(self, images, labels):
+            print("Initializing Model")
+
+            # Compiles the neural network architecture, specifying the loss and
+            # optimizer. Other losses and optimizers can be used as desired
+            self.arch.compile(loss='sparse_categorical_crossentropy',
+                              optimizer='adam',
+                              metrics=['sparse_categorical_accuracy'])
+            print("Model Initialized Successfully")
+
+            print("Beginning Training")
+            print("Training images shape:", images.shape)
+            print("Training labels shape:", labels.shape)
+
+            # Train the model on the given inputs (images) and outputs (labels)
+            # using the specified training options.
+            self.arch.fit(images,
+                          labels,
+                          batch_size=self.batch_size,
+                          epochs=self.epochs,
+                          verbose=2)
+
+            print("Training Complete")
+
+            # Saves the model in a new variable. This is necessary so that the
+            # output of the operation is named 'model'
+            model = self.arch
+
+            return model
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the **GetCifarData** operation as shown below. Specifically, the *train_imgs* and *train_labels* outputs from **GetCifarData** should be connected to the *images* and *labels* inputs to **TrainCifar** respectively.
Hovering over the circles representing each input or output will display the full name of that element. This should help to ensure that the correct inputs and outputs are matched together.
+
+Note that the architecture cannot be selected from within the pipeline editor until after the `Neural Network Architecture`_ section of this guide is completed.
+
+.. figure:: images/cifar-gt.png
+    :align: center
+    :scale: 50 %
+
+Neural Network Architecture
+===========================
+
+This section will describe how to create a simple, but effective, Convolutional Neural Network for classifying CIFAR-10 images. In particular, this section gives instructions on creating a slightly simplified `VGG network `_. The basic structure of this network is a series of four feature detection blocks, followed by a densely connected classifier block.
+
+For specifics on how to create a new network and how to use the neural network editor interface, consult the `Creating Neural Networks `_ walkthrough.
+
+Beginning from a blank network, the first step when building a network is to create an Input layer by clicking anywhere on the workspace.
+
+For reference during design, the full architecture can be found `here `_.
+
+.. figure:: images/vgg-blank.png
+    :align: center
+    :scale: 25%
+
+This Input layer requires that either the *shape* or *batch_shape* attribute be defined. Because our data is composed of 32x32 pixel RGB images, the *shape* of our input should be (32,32,3).
+
+.. figure:: images/vgg-input.png
+    :align: center
+    :scale: 25%
+
+The four feature detector blocks are each composed of two **Conv2D** layers followed by a **MaxPooling2D** layer. The settings for the first **Conv2D** and **MaxPooling2D** layers are shown below.
+
+Every **Conv2D** layer requires that the *filters* and *kernel_size* attributes be defined. Each **Conv2D** layer in this network will use a *kernel_size* (window size) of (3,3), a stride of (1,1), and will use ReLU as the activation function.
They should all also use *same* as the padding so that the size of the input does not change during convolution. For the first pair of **Conv2D** layers, the number of filters will be 32. + +.. figure:: images/vgg-block-conv.png + :align: center + :scale: 50% + +Every **MaxPooling2D** layer requires that the *pool_size* (window size) attribute be defined. In this network, all **MaxPooling2D** layers will use a pool_size of (2,2), a stride of (2,2), and padding set to *valid*. These settings will result in the size of the image being cut in half at every pooling. + +.. figure:: images/vgg-block-pool.png + :align: center + :scale: 50% + +A total of four of these convolutional blocks should be created in sequence. The only difference between each block is that the number of filters used in the **Conv2D** layers in each block should double after each pooling. In other words, the value of *filters* should be 32 for the first **Conv2D** layer, 64 for the third **Conv2D** layer, 128 for the fifth, and so on. + +After the last convolutional block comes the classifier block. The first layer in this block is a **Flatten** layer, which converts the convolved image into a 1D vector that can be fed into the following **Dense** layers. The **Flatten** layer has no attributes to change. + +There are a total of three **Dense** layers in this classifier, with the first two using the same attribute values. Every **Dense** layer requires that the *units* (output length) attribute be defined. + +For the first two **Dense** layers, the number of units used will be 2048, and the activation function used will be ReLU, as shown below. + +.. figure:: images/vgg-class-block-dense.png + :align: center + :scale: 50% + +The final **Dense** layer will actually provide the output probability density function for the model. As such, the number of units should be the number of categories in the data (in this case 10). 
This last layer also uses the *softmax* activation function, which ensures that the output is a vector whose sum is 1.
+
+.. figure:: images/vgg-class-block-out.png
+    :align: center
+    :scale: 50%
+
+Optionally, an **Output** layer may be added after the final **Dense** layer. This layer explicitly marks the output of a model. When there is only one output, such as in this network, it may be excluded, and the lowest layer in the model will be assumed to be the output layer.
+
+PredictCifar Operation
+======================
+
+This operation uses the model created by **TrainCifar** to predict the class of a set of input images. This operation has no attributes, takes a model and images as input, and produces a set of predicted labels (named *pred_labels*), resulting in the following structure:
+
+.. figure:: images/predict-cifar-io.png
+    :align: center
+    :scale: 50%
+
+The code for this operation is short and straightforward, with only one peculiarity. The *predict* function does not provide a prediction directly, instead providing a `probability density function (pdf) `_ over the available classes. For example, a CIFAR-10 classifier's output for a single input may be [0, 0.03, 0.9, 0.02, 0, 0, 0.05, 0, 0, 0], which indicates that the model is predicting that the likelihood that the image falls into each category is 0% for category 1, 3% for category 2, 90% for category 3, and so on. This requires taking the argmax of every output of the model to determine which class has been ruled the most likely.
+
+.. 
code-block:: python
+
+    import numpy as np
+
+    class PredictCifar():
+
+        def execute(self, images, model):
+            print('Predicting Image Categories')
+
+            # Predicts the PDF for the input images
+            pred_labels = model.predict(images)
+
+            # Converts PDFs into scalar predictions
+            pred_labels = np.argmax(pred_labels, axis=1)
+
+            print('Predictions Generated')
+
+            return pred_labels
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the *test_imgs* output from **GetCifarData** and the *model* output from **TrainCifar** should be connected to the *images* and *model* inputs to **PredictCifar** respectively.
+
+.. figure:: images/cifar-gtp.png
+    :align: center
+    :scale: 50%
+
+EvalCifar Operation
+===================
+
+This operation evaluates the outputs from the classifier and produces a confusion matrix that can be helpful for determining where the shortcomings of the model lie.
+
+.. figure:: images/cifar-eval-output.png
+    :align: center
+    :scale: 50%
+
+This operation requires no attributes and produces no output variables. It requires two inputs in the form of *true_labels* and *pred_labels*. The structure of this operation is shown below:
+
+.. figure:: images/eval-cifar-io.png
+    :align: center
+    :scale: 50%
+
+With this operation, the code becomes a bit more complex, as we build the visualization with the tools provided by the `matplotlib.pyplot library `_. The code below is annotated with comments describing the purpose of all graphing commands. Also of note is that the expected input *true_labels* is a 2-dimensional array, where the second dimension is of length 1. This is because of a quirk of Keras that requires this structure for training and automatic evaluation. To ease calculations, the first step taken is to flatten this array to one dimension.
+
+.. 
code-block:: python
+
+    import matplotlib.pyplot as plt
+    import numpy as np
+
+    class EvalCifar():
+
+        def execute(self, pred_labels, true_labels):
+
+            # Reduces the dimensionality of true_labels by 1
+            # ex. [[1],[4],[5],[2]] becomes [1, 4, 5, 2]
+            true_labels = true_labels[:,0]
+
+            # Builds a confusion matrix from the lists of labels
+            cm = self.buildConfustionMatrix(pred_labels, true_labels)
+
+            # Calculates the overall accuracy of the model from the raw counts
+            # acc = (# correct) / (# samples)
+            acc = np.trace(cm) / np.sum(cm)
+
+            # Normalizes each row so that its values lie in the range [0,1]
+            cm = cm / cm.sum(axis=1, keepdims=True)
+
+            # Display the confusion matrix as a grayscale image, mapping the
+            # intensities to a green colorscale rather than the default gray
+            plt.imshow(cm, cmap=plt.get_cmap('Greens'))
+
+            # Adds a title to the image. Also reports accuracy below the title
+            plt.title('CIFAR-10 Confusion Matrix\naccuracy={:0.3f}'.format(acc))
+
+            # Labels the ticks on the two axes (placed at positions [0,1,2,...,9])
+            # with the category names
+            bins = np.arange(10)
+            catName = ['plane','car','bird',
+                       'cat','deer','dog','frog',
+                       'horse','ship','truck']
+            plt.xticks(bins, catName, rotation=45)
+            plt.yticks(bins, catName)
+
+            # Determines value at the center of the color scale
+            mid = (cm.max() + cm.min()) / 2
+
+            for i in range(10):
+                for j in range(10):
+                    # Prints the value of each cell to three decimal places.
+ # Colors text so that white text is printed on dark cells + # and black text on light cells + plt.text(j, i, '{:0.3f}'.format(cm[i, j]), + ha='center', va='center', + color='white' if cm[i, j] > mid else 'black') + + # Labels the two axes + plt.ylabel('True label') + plt.xlabel('Predicted label') + + plt.tight_layout() + + # Displays the plot + plt.show() + + def buildConfustionMatrix(self, pred_labels, true_labels): + # Creates an empty matrix of size 10 x 10 + mat = np.zeros((10,10)) + + # Computes count of times that image with true label t is + # assigned predicted label p + for p, t in zip(pred_labels, true_labels): + mat[t][p] += 1 + + return mat + +After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the *test_labels* outputs from **GetCifarData** and the *pred_labels* output from **PredictCifar** should be connected to the *true_labels* and *pred_labels* inputs to **EvalCifar** respectively. + +.. figure:: images/cifar-gtpe.png + :align: center + :scale: 50% + +ViewCifar Operation +=================== + +This operation displays a random subset of images, along with the predicted and actual categories in which those images belong. Such a visualization might be helpful for seeing what kind of images are being misclassified and for what reason. + +.. figure:: images/cifar-view-output.png + :align: center + :scale: 50% + +This operation includes an attribute *num_images* for specifying the number of images that should be drawn from the testing set and displayed. As with the attributes in TrainCifar, this attribute should be given a type of integer and will be given the default value of 16. + +.. figure:: images/view-cifar-attr.png + :align: center + :scale: 50% + +This operation produces no outputs and requires three inputs: the images, the associated true labels, and the associated predicted labels. The overall structure is shown. + +.. 
figure:: images/view-cifar-io.png
+    :align: center
+    :scale: 50%
+
+As with the previous operation, the code for this operation is slightly more complex and has been annotated with comments describing each command.
+
+.. code-block:: python
+
+    from matplotlib import pyplot as plt
+    import numpy as np
+    import math
+
+    class ViewCifar():
+        def __init__(self, num_images=16):
+            self.num_images = num_images
+
+            return
+
+        def execute(self, pred_labels, true_labels, images):
+            # Reduces the dimensionality of true_labels by 1
+            # ex. [[1],[4],[5],[2]] becomes [1, 4, 5, 2]
+            true_labels = true_labels[:,0]
+
+            # Chooses a random selection of indices representing the chosen images
+            orig_indices = np.arange(len(images))
+            indices = np.random.choice(orig_indices, self.num_images, replace=False)
+
+            # Extracts the images and labels represented by the chosen indices
+            images = np.take(images, indices, axis=0)
+            pred_labels = np.take(pred_labels, indices, axis=0)
+            true_labels = np.take(true_labels, indices, axis=0)
+
+            # Calculates the number of rows and columns needed to arrange the
+            # images in as square a shape as possible
+            num_cols = math.ceil(math.sqrt(self.num_images))
+            num_rows = math.ceil(self.num_images / num_cols)
+
+            # Creates a collection of subplots, with one cell per image
+            fig, splts = plt.subplots(num_rows, num_cols, sharex=True, sharey=True)
+
+            catName = ['plane','car','bird',
+                       'cat','deer','dog','frog',
+                       'horse','ship','truck']
+
+            for i in range(self.num_images):
+
+                # Determines the current row and column location
+                col = i % num_cols
+                row = i // num_cols
+
+                # Displays the current image
+                splts[row,col].imshow(images[i])
+                splts[row,col].axis('off')
+
+                # Retrieves the text label equivalent of the numerical labels
+                p_cat = catName[pred_labels[i]]
+                t_cat = catName[true_labels[i]]
+
+                # Displays the category labels, with the true label colored green
+                # and in the top-left corner and the predicted label colored red
+                # and in the top-right corner
+                splts[row,col].text(8,0,t_cat,ha='center',va='bottom',color='green')
+                splts[row,col].text(24,0,p_cat,ha='center',va='bottom',color='red')
+
+            # Displays the figure
+            plt.show()
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the *test_labels* output from **GetCifarData**, the *test_imgs* output from **GetCifarData**, and the *pred_labels* output from **PredictCifar** should be connected to the *true_labels*, *images*, and *pred_labels* inputs to **ViewCifar** respectively.
+
+With this, we have a full pipeline ready for execution.
+
+.. figure:: images/cifar-pipeline-final.png
+    :align: center
+    :scale: 50%
+
+Execution and Results
+=====================
+
+With the pipeline fully prepared, it is time to execute it. To do this, go to the pipeline editor workspace, hover over the red *Add Operation* button, and click the floating blue *Execute Pipeline* button.
+
+.. figure:: images/cifar-execute-button.png
+    :align: center
+    :scale: 50%
+
+A dialog box will open where the settings for the current execution must be defined. All required inputs are detailed below.
+
+.. figure:: images/cifar-execute-dialog.png
+    :align: center
+    :scale: 50%
+
+The *Basic Options* section includes two settings. The first is the name to be used for identifying the execution. An execution's name must be unique within the project, and if a name is given here that has already been used for an execution in the same project, a number will be appended to the given name automatically. The debug option allows for individual operations to be edited and rerun after execution. This is useful during pipeline development and allows for easier debugging or tuning.
+
+.. figure:: images/cifar-execute-basic.png
+    :align: center
+    :scale: 50%
+
+The *Compute Options* section allows configuration of the compute backend to be used for execution.
The specific inputs required here will vary with the selected compute backend. For instance, the `SciServer Compute `_ backend requires login credentials and the selection of a compute domain.
+
+.. figure:: images/cifar-execute-compute.png
+    :align: center
+    :scale: 50%
+
+The *Storage Options* section allows configuration of the storage backend to be used during execution. This backend is where all files used during execution and created as output from the pipeline will be stored. The specific inputs required here will vary with the selected storage backend. For instance, the **SciServer Files Service** backend requires login credentials, the selection of a storage volume, and the type of the volume.
+
+.. figure:: images/cifar-execute-storage.png
+    :align: center
+    :scale: 50%
+
+When all settings have been specified, click **Run** to begin execution. For information on how to check execution status, consult the `Viewing Executions `_ walkthrough.
+
+To view the output of the execution, go to the *Executions* tab and check the box next to the desired execution.
+
+.. figure:: images/cifar-select-execution.png
+    :align: center
+    :scale: 50%
+
+For a more detailed and larger view of individual figures, click on the name of the execution to view its status page and open the console output for the desired operation. In the bottom left is a set of buttons for switching between console output and graph output for that operation.
+
+.. figure:: images/cifar-execution-eval.png
+    :align: center
+    :scale: 50 %
diff --git a/docs/walkthrough/creating-neural-networks.rst b/docs/walkthrough/creating-neural-networks.rst
new file mode 100644
index 000000000..b7fbdc81d
--- /dev/null
+++ b/docs/walkthrough/creating-neural-networks.rst
@@ -0,0 +1,116 @@
+Creating Neural Networks
+------------------------
+
+This page will guide you through the steps needed to create a neural network architecture for use in pipelines.
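While building an architecture, it helps to know what shape of data each layer should produce; as noted later on this page, hovering over a connection between layers reveals that shape. The arithmetic behind the most common cases can be sketched in plain Python. The helper below is purely illustrative — it is not part of DeepForge or Keras — but it follows the usual Keras padding conventions:

```python
import math

def conv_output_size(size, kernel, stride=1, padding='valid'):
    """Spatial output size of a convolution or pooling layer.

    'same' padding preserves size / stride (rounded up), while
    'valid' padding only uses full windows.
    """
    if padding == 'same':
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# A 32-pixel-wide CIFAR-10 image through a 3x3 'same' convolution keeps
# its size, and a 2x2 max-pool with stride 2 then halves it.
after_conv = conv_output_size(32, 3, stride=1, padding='same')
after_pool = conv_output_size(after_conv, 2, stride=2)
print(after_conv, after_pool)  # 32 16
```

Checking these numbers against the shapes reported by the editor is a quick way to confirm that a network is transforming the data as intended.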
+
+Importing Resource Libraries
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Neural networks and other models can be created from the *Resources* tab on the sidebar. Before any models can be created in this tab, you must import the associated library into the project using the red floating button in the bottom right of the workspace.
+
+.. figure:: images/resources-blank.png
+    :align: center
+    :scale: 50%
+
+In the box that appears, you will see a list of libraries that are available for import.
+
+.. figure:: images/resources-import-keras.png
+    :align: center
+    :scale: 50%
+
+Clicking the download icon will install that library and allow creation of associated models. The keras library, for instance, allows for the creation of neural network models.
+
+.. figure:: images/resources-import-keras-after.png
+    :align: center
+    :scale: 50%
+
+Creating a New Architecture
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+After any library has been imported, new models can be created by hovering over the import library button and clicking the floating blue button that appears. This will generate a blank model and automatically open that model's workspace.
+
+.. figure:: images/resources-new.png
+    :align: center
+    :scale: 50%
+
+Clicking anywhere in the workspace will add the first layer of the architecture, which will always be an input layer. Just as with pipelines, these architectures are represented by a flowchart, with each node representing a single layer in the neural network.
+
+.. figure:: images/vgg-blank.png
+    :align: center
+    :scale: 50%
+
+Editing Network Layers
+~~~~~~~~~~~~~~~~~~~~~~
+Clicking on a layer allows for changing the parameters of that layer. Many of these parameters can be left undefined, but some layers require that specific parameters be given. If a layer has not been supplied with the necessary parameters, or if there is some other error encountered at that layer when building the network, the layer will be highlighted with a red border.
Hovering the mouse over the layer will reveal the error. Hovering over a newly created **Input** layer, for example, shows us that the layer requires that either the shape or batch_shape parameter be defined.
+
+.. figure:: images/vgg-input-error.png
+    :align: center
+    :scale: 50%
+
+The effects of some parameters may not be immediately clear from the name alone. For unfamiliar parameters, hovering over the name of the parameter will reveal a short description.
+
+.. figure:: images/vgg-input-hover.png
+    :align: center
+    :scale: 50%
+
+In addition, clicking on the **?** icon in the top right of the expanded layer will display documentation on the layer as a whole, including descriptions of all available parameters.
+
+.. figure:: images/vgg-input-doc.png
+    :align: center
+    :scale: 50%
+
+Adding Additional Layers
+~~~~~~~~~~~~~~~~~~~~~~~~
+To add additional layers, you can click on the arrow icons on the top or bottom of any layer. The icon will become a + icon, and clicking again will open a menu from which the desired layer type can be chosen.
+
+.. figure:: images/vgg-add-layer.png
+    :align: center
+    :scale: 50%
+
+.. figure:: images/network-new-layer.png
+    :align: center
+    :scale: 50%
+
+Layers can also be removed from the network by expanding the layer and clicking the red X icon in the top left. Two layers that already exist in the network can be linked by clicking on the output icon of one layer and the input icon of another. A given layer can have any number of other layers as inputs or outputs. Some layers, such as the **Dense** layer, however, expect only one input and will give an error when multiple inputs are detected.
+
+.. figure:: images/network-multi-io.png
+    :align: center
+    :scale: 50%
+
+It is optional, though recommended, that the network be concluded with an **Output** layer. A network may include multiple outputs, in which case all outputs must be given an **Output** layer.
If no **Output** layer is included, the last layer in the network will be treated as the sole output.
+
+.. figure:: images/network-multi-out.png
+    :align: center
+    :scale: 50%
+
+Connections Between Layers
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+When two layers are connected, they will be joined by a black arrow that indicates the flow of data through the network. Hovering over these arrows will reveal the shape of the data, which can help with analyzing the network to ensure that the data is being transformed as desired.
+
+.. figure:: images/network-connect-hover.png
+    :align: center
+    :scale: 50%
+
+Connections can also be removed and layers separated by clicking on the unwanted arrow and then clicking on the red X icon that appears.
+
+.. figure:: images/network-connect-delete.png
+    :align: center
+    :scale: 50%
+
+Exporting Architectures
+~~~~~~~~~~~~~~~~~~~~~~~
+Keras models can also be exported as Python code. Clicking the red arrow button in the bottom right of the workspace will display a window for configuring the code generation. After making any optional changes to the configuration, clicking *Run* will generate the code.
+
+.. figure:: images/vgg-generate-keras.png
+    :align: center
+    :scale: 50%
+
+After successful generation, hovering over the red arrow button and clicking on the floating gray list button will provide a list of all exported architectures.
+
+.. figure:: images/vgg-gen-keras-view-res.png
+    :align: center
+    :scale: 50%
+
+Clicking on *Details* will provide some metadata about the export, as well as a link to download the generated file. This file can then be incorporated into a Python project.
+
+.. 
figure:: images/vgg-gen-keras-view-res-details.png + :align: center + :scale: 50% diff --git a/docs/walkthrough/creating-operations.rst b/docs/walkthrough/creating-operations.rst new file mode 100644 index 000000000..40f0899fb --- /dev/null +++ b/docs/walkthrough/creating-operations.rst @@ -0,0 +1,65 @@ +Creating Operations +------------------- + +When adding an operation to a pipeline, new operations can be created by clicking the *New Operation* option. This will open the operation editor for the new operation. + +.. figure:: images/cifar-pipeline-blank.png + :align: center + :scale: 50% + +This editor can also be reached for existing operations by clicking the **** icon when editing an operation's attributes. + +.. figure:: images/cifar-operation-io.png + :align: center + :scale: 50% + +This editor has two primary views for editing the operation. The left view allows editing the underlying code of the operation directly. The right view provides a graphical means of adding inputs, outputs, and attributes. + +.. figure:: images/new-operation.png + :align: center + :scale: 50% + +Editing the Operation Interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.. figure:: images/new-operation-orig.png + :align: center + :scale: 50% + +Clicking on the operation in the right view will allow editing the operation interface. The operation interface consists of design time parameters (attributes and references) as well as inputs and outputs generated at runtime. + +Attributes can be added by clicking the *New Attribute* label, which will open a dialog box where you can define the name, type, default value, and other metadata about that attribute. This dialog box can be viewed again to edit the attribute by clicking on the name of the attribute in the right-side view. + +.. figure:: images/train-cifar-epochs.png + :align: center + :scale: 50% + +Inputs and outputs can be added using the blue arrow icons. 
Any number of inputs and outputs can be added to an operation, but each should be given a unique name.
+
+.. figure:: images/new-operation-io.png
+    :align: center
+    :scale: 50%
+
+Using the plus icon, references to resources can be added to the operation. These resources will usually be some form of neural network. As with inputs and outputs, any number of resources can be added to an operation.
+
+.. figure:: images/train-cifar-io.png
+    :align: center
+    :scale: 50%
+
+The paint brush icon allows editing the color of the operation, but this is purely aesthetic and does not affect the operation's underlying logic.
+
+Implementing the Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. figure:: images/operation-code-editor.png
+    :align: center
+    :scale: 50%
+
+In the left-side view, the underlying logic of the operation is implemented using Python. The code here can be edited freely. All operations are defined by a class with the same name as the operation. This class has two primary functions associated with it. The first is the *__init__* function, which will appear automatically when creating the operation's first attribute. This function runs when the operation is initialized and is primarily used for the creation of class variables and the processing of attributes. Note that operation attributes will not be accessible from other functions and must be assigned to a class variable in this function to be used elsewhere. The second primary function is the *execute* function. This is the function that runs when the operation is executed. Any number of other classes and functions can be created in the code editor, but they will not run unless called from within the *execute* function. The outputs of the *execute* function will also be the outputs of the operation.
+
+Importing Libraries
+~~~~~~~~~~~~~~~~~~~
+.. 
figure:: images/operation-depen.png + :align: center + :scale: 50% + +Python libraries can be used within an operation by importing them, which is usually done above the operation class. Any library that is installed on the compute backend's Python environment can be imported as normal, but more niche libraries that are available through pip or anaconda need to be specified as dependencies for the operation by clicking the *Environment* tab on the right side. The dependencies described here should be defined using the same syntax as in a `conda environment file `_. + diff --git a/docs/walkthrough/creating-pipelines.rst b/docs/walkthrough/creating-pipelines.rst new file mode 100644 index 000000000..b4947f0f7 --- /dev/null +++ b/docs/walkthrough/creating-pipelines.rst @@ -0,0 +1,27 @@ +Creating Pipelines +------------------ +From the home view in the *Pipelines* tab, you are presented with a list of the pipelines that have already been created. Clicking on any of these pipelines will allow editing of that pipeline. To create a new, empty pipeline, click on the red button in the bottom right corner of the workspace. + +.. figure:: images/pipelines-view.png + :align: center + :scale: 50% + +The basic unit of work in a pipeline is the operation. Operations can be added using the red button in the bottom right corner of the workspace. + +.. figure:: images/cifar-pipeline-blank.png + :align: center + :scale: 50% + +After an operation has been added, the attributes of that operation can be changed by clicking on the operation and then clicking on the current value of that attribute. + +.. figure:: images/cifar-gt.png + :align: center + :scale: 50% + +Operation inputs and outputs are represented by blue circles that are visible after clicking on the operation. Blue circles on the top of the operation represent inputs, while circles on the bottom represent outputs. The red X circle can be clicked to remove an operation from the pipeline.
This does not remove it from the set of available operations. The **** icon will open the operation editor view. Holding alt while clicking this icon will instead create a copy of the operation and open the new copy's editor. + +.. figure:: images/cifar-operation-io.png + :align: center + :scale: 50% + +Operations can be connected by clicking on an output of one operation and then clicking on an input of another operation (or vice versa). All input and output connections are optional, though missing inputs may cause errors depending upon the operation's internal logic. diff --git a/docs/walkthrough/executing-pipelines.rst b/docs/walkthrough/executing-pipelines.rst new file mode 100644 index 000000000..a502f512d --- /dev/null +++ b/docs/walkthrough/executing-pipelines.rst @@ -0,0 +1,60 @@ +Executing Pipelines +------------------- + +This page will guide you through the steps needed to execute a finished pipeline. + +Executing within DeepForge +~~~~~~~~~~~~~~~~~~~~~~~~~~ +Finished pipelines can be conveniently executed from within DeepForge. To do so, navigate to the desired pipeline's workspace, hover over the red + button in the bottom right, and click on the blue arrow button. This will open a dialog box for defining how to execute the pipeline. The configuration options are split into several sections. Once all information has been provided, clicking the blue *Run* button will begin execution. The information provided can also be saved for future executions by checking the box in the bottom left. + +.. figure:: images/cifar-execute-dialog.png + :align: center + :scale: 50% + +Basic Options +^^^^^^^^^^^^^ +Here you will define the name of the execution. Execution names are unique identifiers and cannot be repeated. If the given name has already been used in that project, an index will be added to the execution name automatically (i.e. *test* becomes *test_2*).
Upon starting execution, the execution name will also be added to the project version history as a tag. + +Here, the pipeline can also be set to run in debug mode, which allows editing the operations and re-running the pipeline with the edited operations after creation. If debug mode is not selected, the execution will only use the version of each operation that existed when the pipeline was first executed. Debug mode can be helpful when creating and testing pipelines before deployment. + +.. figure:: images/cifar-execute-basic.png + :align: center + :scale: 50% + +Credentials for Pipeline Inputs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +This section requires inputting the credentials for accessing all artifacts used in the pipeline. Because each artifact can be located in different storage backends or different accounts within that backend, each artifact must be provided with its own credentials. If no input artifacts are used in the pipeline, this section will not be present. + +.. figure:: images/redshift-execute-creds.png + :align: center + :scale: 50% + +Compute Options +^^^^^^^^^^^^^^^ +In this section, you will select from the available compute backends. In the examples shown, for instance, the SciServer Compute service will be used. Each compute backend may require additional information, such as login credentials or the computation resources that should be used. + +.. figure:: images/cifar-execute-compute.png + :align: center + :scale: 50% + +Storage Options +^^^^^^^^^^^^^^^ +Here, the storage backend must be chosen from the available options. As with the compute options, SciServer's Files Service is used here as an example. Each backend may require additional input, such as login credentials and the desired storage location. This storage backend and location will be where all files created during execution are stored. This includes both files used during execution, such as data passed between operations, and artifacts created using Output operations. + +..
figure:: images/cifar-execute-storage.png + :align: center + :scale: 50% + +Manual Execution +~~~~~~~~~~~~~~~~ +If desired, pipelines can be executed manually by advanced users. Hovering over the red + icon in the pipeline's workspace and clicking the yellow export button that appears will open a dialog box for exporting the pipeline. + +.. figure:: images/export-pipeline.png + :align: center + :scale: 50% + +Any artifacts used in Input operations will require the login credentials for the backend and account where the artifact is stored. Clicking the blue *Run* button in the bottom right will generate the execution files for the pipeline and automatically download them in a zip file. In this zip folder are all the files normally generated for execution. The simplest way to execute this pipeline is to run the top-level *main.py* file. + +.. figure:: images/export-pipeline-dialog.png + :align: center + :scale: 50% diff --git a/docs/walkthrough/images/artifact-blank.png b/docs/walkthrough/images/artifact-blank.png new file mode 100644 index 000000000..a36a19017 Binary files /dev/null and b/docs/walkthrough/images/artifact-blank.png differ diff --git a/docs/walkthrough/images/artifact-import-upload.png b/docs/walkthrough/images/artifact-import-upload.png new file mode 100644 index 000000000..6f7e826d9 Binary files /dev/null and b/docs/walkthrough/images/artifact-import-upload.png differ diff --git a/docs/walkthrough/images/cifar-eval-output.png b/docs/walkthrough/images/cifar-eval-output.png new file mode 100644 index 000000000..05a510017 Binary files /dev/null and b/docs/walkthrough/images/cifar-eval-output.png differ diff --git a/docs/walkthrough/images/cifar-execute-basic.png b/docs/walkthrough/images/cifar-execute-basic.png new file mode 100644 index 000000000..ae76966c0 Binary files /dev/null and b/docs/walkthrough/images/cifar-execute-basic.png differ diff --git a/docs/walkthrough/images/cifar-execute-button.png b/docs/walkthrough/images/cifar-execute-button.png 
new file mode 100644 index 000000000..5ccfb3954 Binary files /dev/null and b/docs/walkthrough/images/cifar-execute-button.png differ diff --git a/docs/walkthrough/images/cifar-execute-compute.png b/docs/walkthrough/images/cifar-execute-compute.png new file mode 100644 index 000000000..bc2198eff Binary files /dev/null and b/docs/walkthrough/images/cifar-execute-compute.png differ diff --git a/docs/walkthrough/images/cifar-execute-dialog.png b/docs/walkthrough/images/cifar-execute-dialog.png new file mode 100644 index 000000000..f797ef838 Binary files /dev/null and b/docs/walkthrough/images/cifar-execute-dialog.png differ diff --git a/docs/walkthrough/images/cifar-execute-storage.png b/docs/walkthrough/images/cifar-execute-storage.png new file mode 100644 index 000000000..c6225aedb Binary files /dev/null and b/docs/walkthrough/images/cifar-execute-storage.png differ diff --git a/docs/walkthrough/images/cifar-execution-eval.png b/docs/walkthrough/images/cifar-execution-eval.png new file mode 100644 index 000000000..03f499b33 Binary files /dev/null and b/docs/walkthrough/images/cifar-execution-eval.png differ diff --git a/docs/walkthrough/images/cifar-execution-in-progress.png b/docs/walkthrough/images/cifar-execution-in-progress.png new file mode 100644 index 000000000..dbb27a740 Binary files /dev/null and b/docs/walkthrough/images/cifar-execution-in-progress.png differ diff --git a/docs/walkthrough/images/cifar-gt.png b/docs/walkthrough/images/cifar-gt.png new file mode 100644 index 000000000..920ca9ee2 Binary files /dev/null and b/docs/walkthrough/images/cifar-gt.png differ diff --git a/docs/walkthrough/images/cifar-gtp.png b/docs/walkthrough/images/cifar-gtp.png new file mode 100644 index 000000000..0947e47d7 Binary files /dev/null and b/docs/walkthrough/images/cifar-gtp.png differ diff --git a/docs/walkthrough/images/cifar-gtpe.png b/docs/walkthrough/images/cifar-gtpe.png new file mode 100644 index 000000000..b60f860d9 Binary files /dev/null and 
b/docs/walkthrough/images/cifar-gtpe.png differ diff --git a/docs/walkthrough/images/cifar-operation-io.png b/docs/walkthrough/images/cifar-operation-io.png new file mode 100644 index 000000000..996a7e9b4 Binary files /dev/null and b/docs/walkthrough/images/cifar-operation-io.png differ diff --git a/docs/walkthrough/images/cifar-pipeline-blank.png b/docs/walkthrough/images/cifar-pipeline-blank.png new file mode 100644 index 000000000..c1e4e6be1 Binary files /dev/null and b/docs/walkthrough/images/cifar-pipeline-blank.png differ diff --git a/docs/walkthrough/images/cifar-pipeline-final.png b/docs/walkthrough/images/cifar-pipeline-final.png new file mode 100644 index 000000000..a504b1269 Binary files /dev/null and b/docs/walkthrough/images/cifar-pipeline-final.png differ diff --git a/docs/walkthrough/images/cifar-select-execution.png b/docs/walkthrough/images/cifar-select-execution.png new file mode 100644 index 000000000..4ea58640a Binary files /dev/null and b/docs/walkthrough/images/cifar-select-execution.png differ diff --git a/docs/walkthrough/images/cifar-view-output.png b/docs/walkthrough/images/cifar-view-output.png new file mode 100644 index 000000000..2033744d0 Binary files /dev/null and b/docs/walkthrough/images/cifar-view-output.png differ diff --git a/docs/walkthrough/images/eval-cifar-io.png b/docs/walkthrough/images/eval-cifar-io.png new file mode 100644 index 000000000..f05e811e3 Binary files /dev/null and b/docs/walkthrough/images/eval-cifar-io.png differ diff --git a/docs/walkthrough/images/export-pipeline-dialog.png b/docs/walkthrough/images/export-pipeline-dialog.png new file mode 100644 index 000000000..ae0867ae6 Binary files /dev/null and b/docs/walkthrough/images/export-pipeline-dialog.png differ diff --git a/docs/walkthrough/images/export-pipeline.png b/docs/walkthrough/images/export-pipeline.png new file mode 100644 index 000000000..f825823c9 Binary files /dev/null and b/docs/walkthrough/images/export-pipeline.png differ diff --git 
a/docs/walkthrough/images/get-cifar-pipeline.png b/docs/walkthrough/images/get-cifar-pipeline.png new file mode 100644 index 000000000..5ec4e9e91 Binary files /dev/null and b/docs/walkthrough/images/get-cifar-pipeline.png differ diff --git a/docs/walkthrough/images/get-data-io.png b/docs/walkthrough/images/get-data-io.png new file mode 100644 index 000000000..33b51a0a9 Binary files /dev/null and b/docs/walkthrough/images/get-data-io.png differ diff --git a/docs/walkthrough/images/incep-full.png b/docs/walkthrough/images/incep-full.png new file mode 100644 index 000000000..8af26adad Binary files /dev/null and b/docs/walkthrough/images/incep-full.png differ diff --git a/docs/walkthrough/images/incep-incep-block-1.png b/docs/walkthrough/images/incep-incep-block-1.png new file mode 100644 index 000000000..2f777e0e2 Binary files /dev/null and b/docs/walkthrough/images/incep-incep-block-1.png differ diff --git a/docs/walkthrough/images/incep-incep-block-2.png b/docs/walkthrough/images/incep-incep-block-2.png new file mode 100644 index 000000000..c4e61a552 Binary files /dev/null and b/docs/walkthrough/images/incep-incep-block-2.png differ diff --git a/docs/walkthrough/images/incep-incep-block-3.png b/docs/walkthrough/images/incep-incep-block-3.png new file mode 100644 index 000000000..6f02c584f Binary files /dev/null and b/docs/walkthrough/images/incep-incep-block-3.png differ diff --git a/docs/walkthrough/images/incep-input-block.png b/docs/walkthrough/images/incep-input-block.png new file mode 100644 index 000000000..a0c58a7ea Binary files /dev/null and b/docs/walkthrough/images/incep-input-block.png differ diff --git a/docs/walkthrough/images/incep-output.png b/docs/walkthrough/images/incep-output.png new file mode 100644 index 000000000..883619fa7 Binary files /dev/null and b/docs/walkthrough/images/incep-output.png differ diff --git a/docs/walkthrough/images/network-connect-delete.png b/docs/walkthrough/images/network-connect-delete.png new file mode 100644 index 
000000000..4b128ddd4 Binary files /dev/null and b/docs/walkthrough/images/network-connect-delete.png differ diff --git a/docs/walkthrough/images/network-connect-hover.png b/docs/walkthrough/images/network-connect-hover.png new file mode 100644 index 000000000..1743bd04e Binary files /dev/null and b/docs/walkthrough/images/network-connect-hover.png differ diff --git a/docs/walkthrough/images/network-multi-io.png b/docs/walkthrough/images/network-multi-io.png new file mode 100644 index 000000000..dd0ff5dc8 Binary files /dev/null and b/docs/walkthrough/images/network-multi-io.png differ diff --git a/docs/walkthrough/images/network-multi-out.png b/docs/walkthrough/images/network-multi-out.png new file mode 100644 index 000000000..3b9deb98e Binary files /dev/null and b/docs/walkthrough/images/network-multi-out.png differ diff --git a/docs/walkthrough/images/network-new-layer.png b/docs/walkthrough/images/network-new-layer.png new file mode 100644 index 000000000..40c7ad3a5 Binary files /dev/null and b/docs/walkthrough/images/network-new-layer.png differ diff --git a/docs/walkthrough/images/new-operation-io.png b/docs/walkthrough/images/new-operation-io.png new file mode 100644 index 000000000..56e0ca260 Binary files /dev/null and b/docs/walkthrough/images/new-operation-io.png differ diff --git a/docs/walkthrough/images/new-operation-orig.png b/docs/walkthrough/images/new-operation-orig.png new file mode 100644 index 000000000..d5771a969 Binary files /dev/null and b/docs/walkthrough/images/new-operation-orig.png differ diff --git a/docs/walkthrough/images/new-operation.png b/docs/walkthrough/images/new-operation.png new file mode 100644 index 000000000..54f160599 Binary files /dev/null and b/docs/walkthrough/images/new-operation.png differ diff --git a/docs/walkthrough/images/operation-code-editor.png b/docs/walkthrough/images/operation-code-editor.png new file mode 100644 index 000000000..43fcf62a7 Binary files /dev/null and 
b/docs/walkthrough/images/operation-code-editor.png differ diff --git a/docs/walkthrough/images/operation-depen.png b/docs/walkthrough/images/operation-depen.png new file mode 100644 index 000000000..4d988976b Binary files /dev/null and b/docs/walkthrough/images/operation-depen.png differ diff --git a/docs/walkthrough/images/output-artifacts.png b/docs/walkthrough/images/output-artifacts.png new file mode 100644 index 000000000..64b25559d Binary files /dev/null and b/docs/walkthrough/images/output-artifacts.png differ diff --git a/docs/walkthrough/images/pipeline-view-exec.png b/docs/walkthrough/images/pipeline-view-exec.png new file mode 100644 index 000000000..269f8f2ff Binary files /dev/null and b/docs/walkthrough/images/pipeline-view-exec.png differ diff --git a/docs/walkthrough/images/pipelines-view.png b/docs/walkthrough/images/pipelines-view.png new file mode 100644 index 000000000..3a7733d8b Binary files /dev/null and b/docs/walkthrough/images/pipelines-view.png differ diff --git a/docs/walkthrough/images/predict-cifar-io.png b/docs/walkthrough/images/predict-cifar-io.png new file mode 100644 index 000000000..c9dfa2029 Binary files /dev/null and b/docs/walkthrough/images/predict-cifar-io.png differ diff --git a/docs/walkthrough/images/redshift-eval-depen.png b/docs/walkthrough/images/redshift-eval-depen.png new file mode 100644 index 000000000..7826fcc8e Binary files /dev/null and b/docs/walkthrough/images/redshift-eval-depen.png differ diff --git a/docs/walkthrough/images/redshift-eval-io.png b/docs/walkthrough/images/redshift-eval-io.png new file mode 100644 index 000000000..428aaebb8 Binary files /dev/null and b/docs/walkthrough/images/redshift-eval-io.png differ diff --git a/docs/walkthrough/images/redshift-eval-res.png b/docs/walkthrough/images/redshift-eval-res.png new file mode 100644 index 000000000..4ced4ed30 Binary files /dev/null and b/docs/walkthrough/images/redshift-eval-res.png differ diff --git 
a/docs/walkthrough/images/redshift-execute-creds.png b/docs/walkthrough/images/redshift-execute-creds.png new file mode 100644 index 000000000..ecc404e68 Binary files /dev/null and b/docs/walkthrough/images/redshift-execute-creds.png differ diff --git a/docs/walkthrough/images/redshift-final.png b/docs/walkthrough/images/redshift-final.png new file mode 100644 index 000000000..768161940 Binary files /dev/null and b/docs/walkthrough/images/redshift-final.png differ diff --git a/docs/walkthrough/images/redshift-inputs.png b/docs/walkthrough/images/redshift-inputs.png new file mode 100644 index 000000000..dc596f00c Binary files /dev/null and b/docs/walkthrough/images/redshift-inputs.png differ diff --git a/docs/walkthrough/images/redshift-pdfvis-io.png b/docs/walkthrough/images/redshift-pdfvis-io.png new file mode 100644 index 000000000..2146061a4 Binary files /dev/null and b/docs/walkthrough/images/redshift-pdfvis-io.png differ diff --git a/docs/walkthrough/images/redshift-pdfvis-res.png b/docs/walkthrough/images/redshift-pdfvis-res.png new file mode 100644 index 000000000..fc04bb4ee Binary files /dev/null and b/docs/walkthrough/images/redshift-pdfvis-res.png differ diff --git a/docs/walkthrough/images/redshift-predict-io.png b/docs/walkthrough/images/redshift-predict-io.png new file mode 100644 index 000000000..0547acf8e Binary files /dev/null and b/docs/walkthrough/images/redshift-predict-io.png differ diff --git a/docs/walkthrough/images/redshift-t.png b/docs/walkthrough/images/redshift-t.png new file mode 100644 index 000000000..8b00de932 Binary files /dev/null and b/docs/walkthrough/images/redshift-t.png differ diff --git a/docs/walkthrough/images/redshift-tp.png b/docs/walkthrough/images/redshift-tp.png new file mode 100644 index 000000000..8e9ca3c3d Binary files /dev/null and b/docs/walkthrough/images/redshift-tp.png differ diff --git a/docs/walkthrough/images/redshift-tpe.png b/docs/walkthrough/images/redshift-tpe.png new file mode 100644 index 
000000000..7c659a609 Binary files /dev/null and b/docs/walkthrough/images/redshift-tpe.png differ diff --git a/docs/walkthrough/images/redshift-tpep.png b/docs/walkthrough/images/redshift-tpep.png new file mode 100644 index 000000000..2e0cd0d1e Binary files /dev/null and b/docs/walkthrough/images/redshift-tpep.png differ diff --git a/docs/walkthrough/images/redshift-train-io.png b/docs/walkthrough/images/redshift-train-io.png new file mode 100644 index 000000000..9eb11448c Binary files /dev/null and b/docs/walkthrough/images/redshift-train-io.png differ diff --git a/docs/walkthrough/images/resources-blank.png b/docs/walkthrough/images/resources-blank.png new file mode 100644 index 000000000..90802c0c1 Binary files /dev/null and b/docs/walkthrough/images/resources-blank.png differ diff --git a/docs/walkthrough/images/resources-import-keras-after.png b/docs/walkthrough/images/resources-import-keras-after.png new file mode 100644 index 000000000..16b017218 Binary files /dev/null and b/docs/walkthrough/images/resources-import-keras-after.png differ diff --git a/docs/walkthrough/images/resources-import-keras.png b/docs/walkthrough/images/resources-import-keras.png new file mode 100644 index 000000000..cb38bfbcb Binary files /dev/null and b/docs/walkthrough/images/resources-import-keras.png differ diff --git a/docs/walkthrough/images/resources-new.png b/docs/walkthrough/images/resources-new.png new file mode 100644 index 000000000..a52865654 Binary files /dev/null and b/docs/walkthrough/images/resources-new.png differ diff --git a/docs/walkthrough/images/status-tracker-selected.png b/docs/walkthrough/images/status-tracker-selected.png new file mode 100644 index 000000000..ff7727cfb Binary files /dev/null and b/docs/walkthrough/images/status-tracker-selected.png differ diff --git a/docs/walkthrough/images/train-cifar-attr.png b/docs/walkthrough/images/train-cifar-attr.png new file mode 100644 index 000000000..2fb291138 Binary files /dev/null and 
b/docs/walkthrough/images/train-cifar-attr.png differ diff --git a/docs/walkthrough/images/train-cifar-epochs.png b/docs/walkthrough/images/train-cifar-epochs.png new file mode 100644 index 000000000..75c5c1cef Binary files /dev/null and b/docs/walkthrough/images/train-cifar-epochs.png differ diff --git a/docs/walkthrough/images/train-cifar-io.png b/docs/walkthrough/images/train-cifar-io.png new file mode 100644 index 000000000..bcadb3260 Binary files /dev/null and b/docs/walkthrough/images/train-cifar-io.png differ diff --git a/docs/walkthrough/images/vgg-add-layer.png b/docs/walkthrough/images/vgg-add-layer.png new file mode 100644 index 000000000..94fc6b4e2 Binary files /dev/null and b/docs/walkthrough/images/vgg-add-layer.png differ diff --git a/docs/walkthrough/images/vgg-blank.png b/docs/walkthrough/images/vgg-blank.png new file mode 100644 index 000000000..c3dc5ef20 Binary files /dev/null and b/docs/walkthrough/images/vgg-blank.png differ diff --git a/docs/walkthrough/images/vgg-block-conv.png b/docs/walkthrough/images/vgg-block-conv.png new file mode 100644 index 000000000..512e18132 Binary files /dev/null and b/docs/walkthrough/images/vgg-block-conv.png differ diff --git a/docs/walkthrough/images/vgg-block-pool.png b/docs/walkthrough/images/vgg-block-pool.png new file mode 100644 index 000000000..dd09bafe0 Binary files /dev/null and b/docs/walkthrough/images/vgg-block-pool.png differ diff --git a/docs/walkthrough/images/vgg-class-block-dense.png b/docs/walkthrough/images/vgg-class-block-dense.png new file mode 100644 index 000000000..1842af5a6 Binary files /dev/null and b/docs/walkthrough/images/vgg-class-block-dense.png differ diff --git a/docs/walkthrough/images/vgg-class-block-out.png b/docs/walkthrough/images/vgg-class-block-out.png new file mode 100644 index 000000000..39271108e Binary files /dev/null and b/docs/walkthrough/images/vgg-class-block-out.png differ diff --git a/docs/walkthrough/images/vgg-full.png b/docs/walkthrough/images/vgg-full.png 
new file mode 100644 index 000000000..67a3547d2 Binary files /dev/null and b/docs/walkthrough/images/vgg-full.png differ diff --git a/docs/walkthrough/images/vgg-gen-keras-view-res-details.png b/docs/walkthrough/images/vgg-gen-keras-view-res-details.png new file mode 100644 index 000000000..2657855fe Binary files /dev/null and b/docs/walkthrough/images/vgg-gen-keras-view-res-details.png differ diff --git a/docs/walkthrough/images/vgg-gen-keras-view-res.png b/docs/walkthrough/images/vgg-gen-keras-view-res.png new file mode 100644 index 000000000..6b6bede79 Binary files /dev/null and b/docs/walkthrough/images/vgg-gen-keras-view-res.png differ diff --git a/docs/walkthrough/images/vgg-generate-keras.png b/docs/walkthrough/images/vgg-generate-keras.png new file mode 100644 index 000000000..51eaea2f2 Binary files /dev/null and b/docs/walkthrough/images/vgg-generate-keras.png differ diff --git a/docs/walkthrough/images/vgg-input-doc.png b/docs/walkthrough/images/vgg-input-doc.png new file mode 100644 index 000000000..1d1004e26 Binary files /dev/null and b/docs/walkthrough/images/vgg-input-doc.png differ diff --git a/docs/walkthrough/images/vgg-input-error.png b/docs/walkthrough/images/vgg-input-error.png new file mode 100644 index 000000000..432363dc2 Binary files /dev/null and b/docs/walkthrough/images/vgg-input-error.png differ diff --git a/docs/walkthrough/images/vgg-input-hover.png b/docs/walkthrough/images/vgg-input-hover.png new file mode 100644 index 000000000..696fce166 Binary files /dev/null and b/docs/walkthrough/images/vgg-input-hover.png differ diff --git a/docs/walkthrough/images/vgg-input.png b/docs/walkthrough/images/vgg-input.png new file mode 100644 index 000000000..c9f36b6d7 Binary files /dev/null and b/docs/walkthrough/images/vgg-input.png differ diff --git a/docs/walkthrough/images/view-cifar-attr.png b/docs/walkthrough/images/view-cifar-attr.png new file mode 100644 index 000000000..a198c2126 Binary files /dev/null and 
b/docs/walkthrough/images/view-cifar-attr.png differ diff --git a/docs/walkthrough/images/view-cifar-io.png b/docs/walkthrough/images/view-cifar-io.png new file mode 100644 index 000000000..9ee76e7f6 Binary files /dev/null and b/docs/walkthrough/images/view-cifar-io.png differ diff --git a/docs/walkthrough/images/view-compute-window.png b/docs/walkthrough/images/view-compute-window.png new file mode 100644 index 000000000..0ba4cf8d3 Binary files /dev/null and b/docs/walkthrough/images/view-compute-window.png differ diff --git a/docs/walkthrough/images/view-compute.png b/docs/walkthrough/images/view-compute.png new file mode 100644 index 000000000..f1432b98d Binary files /dev/null and b/docs/walkthrough/images/view-compute.png differ diff --git a/docs/walkthrough/images/view-graphical-output.png b/docs/walkthrough/images/view-graphical-output.png new file mode 100644 index 000000000..0bc1d3341 Binary files /dev/null and b/docs/walkthrough/images/view-graphical-output.png differ diff --git a/docs/walkthrough/introduction.rst b/docs/walkthrough/introduction.rst new file mode 100644 index 000000000..553d54841 --- /dev/null +++ b/docs/walkthrough/introduction.rst @@ -0,0 +1,8 @@ +Introduction +============ +This tutorial provides detailed instructions for creating a complete DeepForge project from scratch. The motivating examples for this walkthrough will be a simple image classification task using `CIFAR-10 `_ as our dataset and a more complex astronomical redshift estimation task using data from the `Sloan Digital Sky Survey `_. + +The overall process of creating projects is centered around the creation of data processing **pipelines** that will be executed to generate the data, visualizations, models, etc. that we need. This guide begins with a detailed walkthrough on how to create pipelines and all their constituent parts. This introductory walkthrough is then followed by detailed walkthroughs on how to create a pair of useful pipelines using the motivating examples.
+ +.. figure:: images/pipelines-view.png + :align: center \ No newline at end of file diff --git a/docs/walkthrough/redshift-estimator.rst b/docs/walkthrough/redshift-estimator.rst new file mode 100644 index 000000000..17fa7dfd5 --- /dev/null +++ b/docs/walkthrough/redshift-estimator.rst @@ -0,0 +1,445 @@ +Redshift Estimator +------------------ + +This guide provides instructions on how to create a full pipeline for training and evaluating a convolutional neural network on the task of predicting astronomical redshift values given images of galaxies. It provides an approach that is simplified from work by `Pasquet et al. `_ The data referenced and used in this guide was obtained from the `Sloan Digital Sky Survey Data Release 3 `_, retrieved via `SciServer's CasJobs Service `_, and processed using `Astromatic's SWarp tool `_. +This guide assumes that the reader has a basic understanding of the DeepForge interface and how to create basic pipelines. New users are recommended to review the `step-by-step guides `_ before attempting the process described in this guide. + +Pipeline Overview +================= +This guide describes how to create a pipeline that will create, train, and evaluate a model that estimates photometric redshift. Each of the sections below details how to create a piece of the final pipeline and how each piece is connected. + +.. figure:: images/redshift-final.png + :align: center + :scale: 50% + +Input Operations +================ + +While it is possible to retrieve the data needed for model creation programmatically in many cases, this guide makes use of the **Input** operations to load preprocessed data. This is in the interest of both simplicity and generalizability to datasets and data sources different from those used in this tutorial. + +This pipeline uses four **Input** operations. These operations provide the training input images, training output values, testing input images, and testing output values.
For the purposes of this tutorial, the input images are structured as a 4D numpy array of shape (N, 64, 64, 5), where N is the number of images. The outputs are 1D numpy arrays of length N. The process described in this tutorial will work for images that are not 64\*64 pixels in size and that use any number of color channels, requiring only a slight change in the neural network. + +.. figure:: images/redshift-inputs.png + :align: center + :scale: 50% + +Each **Input** operation requires an artifact to have been added to the project. To do this, go to the artifacts view and click either of the two floating buttons in the bottom right of the workspace (one button only appears on hover). + +.. figure:: images/artifact-blank.png + :align: center + :scale: 50% + +With these buttons, you can either upload a local file to one of the storage backends or import a file that already exists within a storage backend. + +.. figure:: images/artifact-import-upload.png + :align: center + :scale: 50% + +By default, all artifacts are treated as python `pickle objects `_. Using other forms of serialized data, such as `FITS `_ or `npy `_ files, requires defining a custom serializer in the *Custom Serialization* view, which is not covered in this tutorial. + +TrainRedshift Operation +======================= + +The first custom operation will create and train the neural network classifier. + +Two attributes should be added: *batch_size* and *epochs*. *batch_size* is the number of training samples that the model will be trained on at a time, and *epochs* is the number of times that each training sample will be given to the model. Both are important hyperparameters for training a neural network. For this guide, the attributes are defined as shown below, but the exact default values can be changed as desired by the reader. + +This operation will require two inputs (images and labels) and a neural network architecture.
+Finally, the operation produces one output, which is the trained classifier model. After all inputs, outputs, and attributes have been added, the structure of the operation should appear similar to the following:
+
+.. figure:: images/redshift-train-io.png
+    :align: center
+    :scale: 50%
+
+The code for this operation follows the standard procedure for creating and training a Keras network, with one minor caveat. The method used by Pasquet et al., on which this pipeline is based, formulates redshift prediction as a classification problem. Because the labels used in this tutorial are floating point values, they must be converted into a categorical format. This is the purpose of the *to_categorical* function. The code for this operation is shown below.
+
+.. code-block:: python
+
+    import numpy as np
+
+    class TrainRedshift():
+        def __init__(self, architecture,
+                     epochs=20,
+                     batch_size=32):
+            self.arch = architecture
+            self.epochs = epochs
+            self.batch_size = batch_size
+
+            # Maximum expected redshift value and number of bins to be used in classification
+            # step. The max_val will need to change to be reasonably close to the maximum
+            # redshift of your dataset. The number of bins must match the output shape of the
+            # architecture but may be tuned as a hyperparameter. Both can optionally be made
+            # attributes of the operation.
+            self.max_val = 0.4
+            self.num_bins = 180
+            return
+
+        def execute(self, images, labels):
+            print("Initializing Model")
+
+            # Initialize the model
+            self.arch.compile(loss='sparse_categorical_crossentropy',
+                              optimizer='adam',
+                              metrics=['sparse_categorical_accuracy'])
+            print("Model Initialized Successfully")
+
+            print("Beginning Training")
+            print("Training images shape:", images.shape)
+            print("Training labels shape:", labels.shape)
+
+            # Train the model on the images and the labels. Labels are converted to categorical
+            # data because the architecture expects an index to an output vector of length 180
+            self.arch.fit(images,
+                          self.to_categorical(labels),
+                          epochs=self.epochs,
+                          verbose=2)
+
+            print("Training Complete")
+
+            # Saves the model in a new variable. This is necessary so that the
+            # output of the operation is named 'model'
+            model = self.arch
+            return model
+
+        # Converts floating point labels to categorical vectors. The result for a given input
+        # label is a 1D vector of length 1 whose value is the index representing the range in
+        # which the label falls. For example, if the max_val is 0.4 and the num_bins is 4, the
+        # possible indices are 0-3, representing the ranges [0,0.1), [0.1,0.2), [0.2,0.3), and
+        # [0.3,0.4] respectively. So, a label of 0.12 results in an output of [1]
+        def to_categorical(self, labels):
+            return np.array(labels) // (self.max_val / self.num_bins)
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the **Input** operations as shown below. Specifically, the training images and training labels should be connected to the *images* and *labels* inputs of **TrainRedshift** respectively.
+
+Note that the architecture cannot be selected from within the pipeline editor until after the `Neural Network Architecture`_ section of this guide is completed.
+
+.. figure:: images/redshift-t.png
+    :align: center
+    :scale: 50%
+
+Neural Network Architecture
+===========================
+This section will describe how to create a convolutional neural network for estimating redshift from images. In particular, this section gives instructions on creating an `Inception-v1 network `_. The basic structure of this network is an input block, followed by a series of five inception blocks and a densely connected classifier block. These blocks are each described in order below.
+
+For reference during design, the full architecture can be found `here `_.
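Before building the architecture, it can help to sanity-check the classification formulation that ties **TrainRedshift** to the network's 180-unit softmax output. The standalone numpy sketch below is illustrative only (it is not a DeepForge operation, and the helper names are invented for this example): it bins a redshift label the same way *to_categorical* does, then recovers a scalar value from a one-hot stand-in "pdf" using the same bin-center weighted average the prediction step will use later.

```python
import numpy as np

# Values matching TrainRedshift; a standalone sketch, not an operation
max_val, num_bins = 0.4, 180
step = max_val / num_bins          # width of each redshift bin (~0.0022)

def to_bin_index(labels):
    # Same floor-division binning as TrainRedshift.to_categorical
    return np.array(labels) // step

def from_pdf(pdfs):
    # Weighted average of bin centers; arange(num_bins) sidesteps any
    # floating point end-point surprises when generating the centers
    bin_centers = (np.arange(num_bins) + 0.5) * step
    return np.sum(bin_centers * np.asarray(pdfs), axis=1)

idx = int(to_bin_index([0.1234])[0])    # 0.1234 falls in bin 55
pdf = np.zeros((1, num_bins))
pdf[0, idx] = 1.0                       # a one-hot stand-in for a network output
print(from_pdf(pdf))                    # recovers the bin center, ~0.1233
```

A round trip like this makes the resolution of the formulation concrete: with 180 bins over [0, 0.4], predictions can only ever be recovered to within roughly 0.0022 in redshift.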
+
+Input Block
+^^^^^^^^^^^
+The input block begins, as with all network architectures, with an **Input** layer. The shape of this layer should be the shape of the input images (64\*64\*5 in this case). This input feeds into a 5\*5 **Conv2D** layer with 64 filters and linear activation. The activation here is linear because the layer is to be activated by the **PReLU** layer that follows. The input block is finished with an **AveragePooling2D** layer with a window size and stride of 2. Note that all layers use *same* padding to prevent changes in data shape due to the window size.
+
+.. figure:: images/incep-input-block.png
+    :align: center
+    :scale: 50%
+
+Inception Blocks
+^^^^^^^^^^^^^^^^
+The five inception blocks fall into one of three designs. Blocks 1 and 3 share the same design, as do blocks 2 and 4. Each of the three designs is described in more detail below. Take note throughout these subsections that every **Conv2D** layer is followed by a **PReLU** layer using the default attribute values. In addition, all **AveragePooling2D** layers use attribute values of (2,2) for both *pool_size* and *strides* and *same* for *padding*. In the interest of brevity, this will not be pointed out in each subsection.
+
+Inception Blocks 1 and 3
+~~~~~~~~~~~~~~~~~~~~~~~~
+Blocks 1 and 3 each begin with an **AveragePooling2D** layer. This is the same layer pictured at the bottom of the input block and blocks 2 and 4. The output of this layer is fed into 4 separate **Conv2D** layers that all have a *kernel_size* of 1\*1. Two of these new layers feed into another **Conv2D** layer, one with *kernel_size* 3\*3 and another with *kernel_size* 5\*5. Another of the original **Conv2D** layers feeds into an **AveragePooling2D** layer. Finally, the remaining original **Conv2D** layer, along with the **AveragePooling2D** layer and the two new **Conv2D** layers, all feed into a **Concatenate** layer. For reference, the expected structure is shown below.
+
+.. figure:: images/incep-incep-block-1.png
+    :align: center
+    :scale: 50%
+
+Inception Blocks 2 and 4
+~~~~~~~~~~~~~~~~~~~~~~~~
+Blocks 2 and 4 are laid out mostly identically to blocks 1 and 3, with the exception of the first and last layers. The first layer in these blocks is the **Concatenate** layer from the end of the previous block. In addition, another **AveragePooling2D** layer is added after the **Concatenate** layer at the end of the block. For reference, the expected structure is shown below.
+
+.. figure:: images/incep-incep-block-2.png
+    :align: center
+    :scale: 50%
+
+Inception Block 5
+~~~~~~~~~~~~~~~~~
+Block 5 is laid out mostly identically to blocks 1 and 3. The only difference is that one of the two branches with two **Conv2D** layers is omitted. Specifically, the branch in which the second layer has a *kernel_size* of 5\*5 is left out. For reference, the expected structure is shown below.
+
+.. figure:: images/incep-incep-block-3.png
+    :align: center
+    :scale: 50%
+
+Conv2D Attributes
+~~~~~~~~~~~~~~~~~
+All **Conv2D** layers in the architecture use a stride of 1, *same* padding, and a *linear* activation function. The only attributes that vary between the various layers are the number of *filters* and the *kernel_size*. Notice in the diagrams above that every **Conv2D** layer is marked with an identifying letter. The table below gives the correct values for *filters* and *kernel_size* for every layer in each inception block.
+
++-----------+---------------+---------------+---------------+---------------+---------------+
+|           |    Block 1    |    Block 2    |    Block 3    |    Block 4    |    Block 5    |
++-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
+|Conv2D     |filters|kernel |filters|kernel |filters|kernel |filters|kernel |filters|kernel |
++-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
+|     a     |  48   | (1,1) |  64   | (1,1) |  92   | (1,1) |  92   | (1,1) |  92   | (1,1) |
++-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
+|     b     |       | (1,1) |       | (1,1) |       | (1,1) |       | (1,1) |       | (1,1) |
++-----------+       +-------+       +-------+       +-------+       +-------+  128  +-------+
+|     c     |  64   | (3,3) |  92   | (3,3) |  128  | (3,3) |  128  | (3,3) |       | (3,3) |
++-----------+       +-------+       +-------+       +-------+       +-------+-------+-------+
+|     d     |       | (5,5) |       | (5,5) |       | (5,5) |       | (5,5) |               |
++-----------+-------+-------+-------+-------+-------+-------+-------+-------+---------------+
+
+Classifier Block
+^^^^^^^^^^^^^^^^
+
+The classifier block begins with a **Flatten** layer to reshape the data into a 1D vector. This feeds into a **Dense** layer with 1096 units and ReLU activation. The next layer is a **Dropout** layer intended to help prevent overfitting. The dropout rate used here is 0.3, but this may require tuning to fit the dataset most appropriately. Finally, a **Dense** layer using softmax activation produces the final output. This final layer must use the same value for *units* as the *num_bins* variable used in the various operations. An optional **Output** layer may also be included but is unnecessary as long as the **Dense** layer is the lowest layer in the architecture.
+
+.. figure:: images/incep-output.png
+    :align: center
+    :scale: 50%
+
+PredictRedshift Operation
+=========================
+This operation uses the model created by **TrainRedshift** to predict the redshift values of a set of input images.
+This operation has no attributes, takes a model and a set of images as input, and produces a set of predicted values (named *labels*) and the associated probability density functions that resulted in those values (named *pdfs*). The structure of the operation is as shown below:
+
+.. figure:: images/redshift-predict-io.png
+    :align: center
+    :scale: 50%
+
+The *model.predict* function results in a probability density function (PDF) over all redshift values in the allowed range [0,0.4]. In order to get scalar values for predictions, a weighted average is taken for each PDF where the value being averaged is the redshift value represented by that bin and the weight is the PDF value at that bin (i.e. how likely it is that the value represented by that bin is the actual redshift value).
+
+.. code-block:: python
+
+    import numpy as np
+
+    class PredictRedshift():
+
+        def execute(self, images, model):
+            # See the comment in TrainRedshift.__init__() about these values
+            max_val = 0.4
+            num_bins = 180
+            step = max_val / num_bins
+
+            # Generates PDF for the redshift of each image
+            pdfs = model.predict(images)
+            bin_starts = np.arange(0, max_val, step)
+
+            # Regresses prediction to a scalar value. Essentially a weighted average
+            # where the weights are the pdf values for each bin and the values are
+            # the midpoints of the ranges represented by each bin.
+            labels = np.sum((bin_starts + (step / 2)) * pdfs, axis=1)
+
+            return pdfs, labels
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the *test images* **Input** operation and the *model* output from **TrainRedshift** should be connected to the *images* and *model* inputs to **PredictRedshift** respectively.
+
+.. figure:: images/redshift-tp.png
+    :align: center
+    :scale: 50%
+
+EvalRedshift Operation
+======================
+This operation creates a figure for evaluating the accuracy of the redshift model.
+The resulting figure (shown on the right in the image below) plots the true redshift value against the predicted value. The further a point falls from the diagonal dotted line, the more incorrect that prediction is.
+
+.. figure:: images/redshift-eval-res.png
+    :align: center
+    :scale: 50%
+
+This operation has no attributes and produces no output. It requires two inputs in the form of a list of predicted redshift values (*pt*) and a list of actual redshift values (*gt*). The structure of the operation is as shown below:
+
+.. figure:: images/redshift-eval-io.png
+    :align: center
+    :scale: 50%
+
+The code for this operation is below and is heavily annotated to explain the various graphing functions.
+
+.. code-block:: python
+
+    import numpy as np
+    from properscoring import crps_gaussian
+    import matplotlib.pyplot as plt
+
+    class EvalRedshift():
+
+        def execute(self, gt, pt):
+            print('Evaluating model')
+
+            # Calculates various metrics for later display. For more info, see
+            # section 4.1 of Pasquet et al.
+            residuals = (pt - gt) / (gt + 1)
+            pred_bias = np.average(residuals)
+            dev_MAD = np.median(np.abs(residuals - np.median(residuals))) * 1.4826
+            frac_outliers = np.count_nonzero(np.abs(residuals) > (dev_MAD * 5)) / len(residuals)
+            crps = np.average(crps_gaussian(pt, np.mean(pt), np.std(pt)))
+
+            # Creates the figure and gives it a title
+            plt.figure()
+            plt.title('Redshift Confusion Scatterplot')
+
+            # Plots all galaxies where the x-value is the true redshift of a galaxy and the
+            # y-value is the predicted redshift value of a galaxy
+            plt.scatter(gt, pt)
+
+            # Creates a dashed black line representing the line on which a perfect prediction
+            # would lie. This line has a slope of 1 and goes from the origin to the maximum
+            # redshift (predicted or actual)
+            maxRS = max(max(gt), max(pt))
+            endpoints = [0, maxRS]
+            plt.plot(endpoints, endpoints, '--k')
+
+            # Creates a formatted string with one metric per line. Prints metrics to three
+            # decimal places
+            metricStr = 'pred_bias: {pb:.03f}\n' + \
+                        'MAD Deviation: {dm:.03f}\n' + \
+                        'Fraction of Outliers: {fo:.03f}\n' + \
+                        'Avg. CRPS: {ac:.03f}'
+            formattedMetrics = metricStr.format(pb=pred_bias,
+                                                dm=dev_MAD,
+                                                fo=frac_outliers,
+                                                ac=crps)
+
+            # Prints the metrics string at the top left of the figure
+            plt.text(0, maxRS, formattedMetrics, va='top')
+
+            # Labels axes and displays figure
+            plt.ylabel('Predicted Redshift')
+            plt.xlabel('True Redshift')
+            plt.show()
+
+            return
+
+Notice in the above code that a new library is used to calculate one of the metrics. This library is not standard and is not included in many default environments. Because of this, the library needs to be added to the environment at runtime by going to the *Environment* tab in the operation editor and defining the operation dependencies as shown below. Operation dependencies are defined in the style of a `conda environment file `_.
+
+.. figure:: images/redshift-eval-depen.png
+    :align: center
+    :scale: 50%
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the test values **Input** operation and the *labels* output from **PredictRedshift** should be connected to the *gt* and *pt* inputs to **EvalRedshift** respectively.
+
+.. figure:: images/redshift-tpe.png
+    :align: center
+    :scale: 50%
+
+PdfVisRedshift Operation
+========================
+This operation creates another figure for evaluating the accuracy of the redshift model as shown below. Compared to the output of the **EvalRedshift** operation, this figure provides a more zoomed-in picture of individual predictions. Each of the subplots is a plot of the probability density function for a randomly chosen input image. The red and green lines indicate the predicted and actual redshift values of the image respectively.
+
+.. figure:: images/redshift-pdfvis-res.png
+    :align: center
+    :scale: 50%
+
+This operation has one attribute, *num_images*, and produces no output. It requires three inputs in the form of a list of predicted redshift values (*pt*), a list of actual redshift values (*gt*), and a list of probability density functions (*pdfs*). The structure of the operation is as shown below:
+
+.. figure:: images/redshift-pdfvis-io.png
+    :align: center
+    :scale: 50%
+
+The code for this operation is below and is heavily annotated to explain the various graphing functions.
+
+.. code-block:: python
+
+    import numpy as np
+    import matplotlib.pyplot as plt
+    import math
+
+    class PdfVisRedshift():
+        def __init__(self, num_images=9):
+
+            # Calculates the number of rows and columns needed to arrange the images in
+            # as square a shape as possible
+            self.num_images = num_images
+            self.num_cols = math.ceil(math.sqrt(num_images))
+            self.num_rows = math.ceil(num_images / self.num_cols)
+
+            self.max_val = 0.4
+            return
+
+        def execute(self, gt, pt, pdfs):
+
+            # Creates a collection of subfigures. Because each prediction uses the same bins,
+            # x-axes are shared.
+ fig, splts = plt.subplots(self.num_rows, + self.num_cols, + sharex=True, + sharey=False) + + # Chooses a random selection of indices representing the chosen images + random_indices = np.random.choice(np.arange(len(pt)), + self.num_images, + replace=False) + + # Extracts the pdfs and redshifts represented by the chosen indices + s_pdfs = np.take(pdfs, random_indices, axis=0) + s_pt = np.take(pt, random_indices, axis=0) + s_gt = np.take(gt, random_indices, axis=0) + + # Creates a list of the lower end of the ranges represented by each bin + x_range = np.arange(0, self.max_val, self.max_val / pdfs.shape[1]) + + for i in range(self.num_images): + col = i % self.num_cols + row = i // self.num_cols + + # Creates a line graph from the current image's pdf + splts[row,col].plot(x_range, s_pdfs[i],'-') + + # Creates two vertical lines to represent the predicted value (red) and the + # actual value (green) + splts[row,col].axvline(s_pt[i], color='red') + splts[row,col].axvline(s_gt[i], color='green') + + # Creates a formatted string with one metric per line. Prints metrics to three + # decimal places. d (delta) is how far off the prediction was from the actual value + metricString = 'gt={gt:.03f}\npt={pt:.03f}\n \u0394={d:.03f}' + metricString = metricString.format(gt = s_gt[i], + pt = s_pt[i], + d = abs(s_gt[i]-s_pt[i])) + + # Determines whether the metrics should be printed on the left or right of the + # figure. If prediction is on the left end, the right side should be more clear + # and should be the chosen side. 
+                alignRight = s_pt[i] <= self.max_val / 2
+
+                # Adds the metric string to the figure at the top of the subfigure (which is the
+                # max value of that pdf)
+                splts[row,col].text(self.max_val if alignRight else 0,
+                                    np.max(s_pdfs[i]),
+                                    metricString,
+                                    va='top',
+                                    ha='right' if alignRight else 'left')
+
+            # Automatically tweaks margins and positioning of the graph
+            plt.tight_layout()
+            plt.show()
+
+After the operation is fully defined, it needs to be added to the workspace and connected to the previous operations as shown below. Specifically, the *labels* and *pdfs* outputs from **PredictRedshift** and the test values **Input** operation should be connected to the *pt*, *pdfs*, and *gt* inputs to **PdfVisRedshift** respectively.
+
+.. figure:: images/redshift-tpep.png
+    :align: center
+    :scale: 50%
+
+Output Operations
+=================
+**Output** operations are special operations that allow saving python objects generated during execution. For instance, in this tutorial, it might be useful to save the trained model and the generated predictions for later use or analysis. Shown below is the result of adding two **Output** operations to the pipeline to save these two objects.
+
+.. figure:: images/redshift-final.png
+    :align: center
+    :scale: 50%
+
+Objects created in this way will be saved in the execution working directory (defined in *Execution Options* when executing a pipeline) under the name given to the operation's *saveName* attribute. Objects saved in this manner will also be automatically added to the list of available artifacts for use in other pipelines.
+
+.. figure:: images/output-artifacts.png
+    :align: center
+    :scale: 50%
+
+Execution and Results
+=====================
+As with all pipelines, this pipeline can be executed using the red floating button in the bottom right of the pipeline editor view.
+In addition to the normal settings that are always included, this pipeline (as with any pipeline using **Input** operations) requires additional credentials for each artifact being used.
+
+.. figure:: images/redshift-execute-creds.png
+    :align: center
+    :scale: 50%
+
+To view the output of the execution, go to the *Executions* tab and check the box next to the desired execution.
+
+.. figure:: images/redshift-eval-res.png
+    :align: center
+    :scale: 50%
+
+For a more detailed and larger view of individual figures, click on the name of the execution to view its status page and open the console output for the desired operation. In the bottom left is a set of buttons for switching between console output and graph output for that operation.
+
+.. figure:: images/redshift-pdfvis-res.png
+    :align: center
+    :scale: 50%
diff --git a/docs/walkthrough/viewing-executions.rst b/docs/walkthrough/viewing-executions.rst
new file mode 100644
index 000000000..4b2b1ff24
--- /dev/null
+++ b/docs/walkthrough/viewing-executions.rst
@@ -0,0 +1,70 @@
+Viewing Executions
+------------------
+
+This page will guide you through monitoring the execution of pipelines and viewing the output of finished executions.
+
+Monitoring Executions
+~~~~~~~~~~~~~~~~~~~~~
+After an execution has been started through DeepForge, its status can be checked in several ways.
+
+Viewing Execution Status
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+While in the workspace for a pipeline, the bottom left corner shows a list of all executions associated with the pipeline. Clicking on the name of an execution will open the status tracker for that execution.
+
+.. figure:: images/pipeline-view-exec.png
+    :align: center
+    :scale: 50%
+
+An alternative method of getting to this screen is to go to the *Executions* tab for a list of all executions for the current project. In this view, clicking the name of the desired execution will also open the status tracker.
+
+.. figure:: images/cifar-select-execution.png
+    :align: center
+    :scale: 50%
+
+In the status tracker view, the current status of the execution is displayed at the operation level. Each operation is colored based upon its status, with the following possible states:
+
+* Gray - Awaiting Execution
+* Yellow - Currently Executing
+* Green - Execution Finished Successfully
+* Orange - Execution Cancelled
+* Red - Error Encountered During Execution
+
+.. figure:: images/cifar-execution-in-progress.png
+    :align: center
+    :scale: 50%
+
+Also in this view, clicking on an operation will reveal the attribute values used for this execution. Clicking the blue monitor icon in the top right of a selected operation will open the console output for that operation.
+
+.. figure:: images/status-tracker-selected.png
+    :align: center
+    :scale: 50%
+
+Viewing the Compute Dashboard
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+In the top left of the webpage, clicking the DeepForge logo will open the DeepForge main dropdown menu. In this menu is an option named *View Compute*.
+
+.. figure:: images/view-compute.png
+    :align: center
+    :scale: 50%
+
+This option opens a dialog box that displays the current execution status of the connected compute backends. Each backend will have its own tab with information specific to that backend. The SciServer Compute tab includes a direct link to SciServer's Compute Dashboard, where the status and output of current and past executions can be viewed.
+
+.. figure:: images/view-compute-window.png
+    :align: center
+    :scale: 50%
+
+Viewing Execution Output
+~~~~~~~~~~~~~~~~~~~~~~~~
+Execution output can be viewed in one of two major ways. Textual output that is printed to the console can be viewed by going to the `execution status tracker `_, selecting the operation that produces the desired output, and clicking on the blue monitor icon in the top right of the operation. For operations generating matplotlib figures, a set of buttons in the bottom left will allow swapping between console output and matplotlib figures.
+
+.. figure:: images/view-graphical-output.png
+    :align: center
+    :scale: 50%
+
+Graphical output, which will generally be generated using a graphical library like `Matplotlib `_, can be viewed from the *Executions* tab on the sidebar. Beside each execution is a checkbox. Activating a checkbox will display the graphical output generated during that execution. Selecting multiple boxes will display the output from all selected executions together.
+
+.. figure:: images/cifar-select-execution.png
+    :align: center
+    :scale: 50%
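As a final note on console output: whatever an operation prints inside its *execute* function is exactly what appears in the console view described above. The minimal operation below is a hypothetical example (it is not part of the walkthrough pipelines; the class name and metric choices are invented here) showing how a few print calls become console lines, using the same residual metric style as **EvalRedshift**.

```python
class ReportMetrics():
    """Hypothetical operation used only to illustrate console output."""

    def execute(self, gt, pt):
        # Every print here shows up in the operation's console tab
        residuals = [(p - g) / (g + 1) for g, p in zip(gt, pt)]
        print('samples:', len(residuals))
        print('mean residual: {:.03f}'.format(sum(residuals) / len(residuals)))
        print('max abs residual: {:.03f}'.format(max(abs(r) for r in residuals)))

# Standalone call to show the three console lines this would produce
ReportMetrics().execute(gt=[0.10, 0.20, 0.30], pt=[0.12, 0.18, 0.33])
```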