Merged
Changes from 2 commits
Binary file added docs/examples/application-pipelines.png
Binary file added docs/examples/basic-io.png
Binary file added docs/examples/display-cifar.png
Binary file added docs/examples/display-rand-img.png
Binary file added docs/examples/download.png
66 changes: 66 additions & 0 deletions docs/examples/redshift.rst
@@ -0,0 +1,66 @@
Redshift Example Project
========================

This project provides a small collection of generalized pipelines for training and using redshift estimation models. It is designed to be simple to use, requiring only that the configuration parameters of individual nodes be defined where necessary. The most involved alteration most users should need is the definition of additional architectures in the **Resources** tab. Note that any newly defined architecture should have an output length and input shape that match the *num_bins* and *input_shape* configuration parameters used in the various pipelines.

Pipeline list
-------------

* `Train Test Single`_
* `Train Test Compare`_
* `Download Train Evaluate`_
* `Train Predict`_
* `Predict Pretrained`_
* `Test Pretrained`_
* `Download SDSS`_
* `Download Train Predict`_

.. * `Visualize Predictions`_
.. * `Train Visualize`_

.. figure:: application-pipelines.png
   :align: center
   :width: 75%

Pipelines
---------

Train Test Single
~~~~~~~~~~~~~~~~~
Trains and evaluates a single CNN model. Uses predefined artifacts that contain the training and testing data. For this and all training pipelines, each artifact should contain a single numpy array. Input arrays should be 4D arrays of shape **(n, y, x, c)**, where n = number of images, y = image height, x = image width, and c = number of color channels. Output (label) arrays should be of shape **(n,)**.
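As a concrete illustration of the expected artifact shapes (the sizes here are hypothetical examples, not values required by the pipelines):

.. code-block:: python

    import numpy as np

    # Hypothetical sizes: 100 RGB images of 32x32 pixels.
    n, y, x, c = 100, 32, 32, 3

    # Input artifact: one 4D array of shape (n, y, x, c).
    train_imgs = np.random.rand(n, y, x, c)

    # Label artifact: one continuous redshift value per image, shape (n,).
    train_labels = np.random.rand(n)

    assert train_imgs.shape == (100, 32, 32, 3)
    assert train_labels.shape == (100,)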

.. Visualize Predictions
.. ~~~~~~~~~~~~~~~~~~~~~


Train Test Compare
~~~~~~~~~~~~~~~~~~
Trains and evaluates two CNN models and compares effectiveness of the models.

Download Train Evaluate
~~~~~~~~~~~~~~~~~~~~~~~
Downloads SDSS images, trains a model on the images, and evaluates the model on a separate set of downloaded images. Care should be taken when defining your own CasJobs query to ensure that all queried galaxies for training have a redshift value below the **Train** node’s *max_val* configuration parameter’s value.

Train Predict
~~~~~~~~~~~~~
Trains a single CNN model and uses the newly trained model to predict the redshift value of another set of galaxies.

Predict Pretrained
~~~~~~~~~~~~~~~~~~
Predicts the redshift value of a set of galaxies using a pre-existing model that is saved as an artifact.

Test Pretrained
~~~~~~~~~~~~~~~
Evaluates the performance of a pre-existing model that is saved as an artifact.

.. Train Visualize
.. ~~~~~~~~~~~~~~~


Download SDSS
~~~~~~~~~~~~~
Downloads SDSS images and saves them as artifacts. These can be used in conjunction with the other pipelines that rely on artifacts rather than images retrieved at execution time.

Download Train Predict
~~~~~~~~~~~~~~~~~~~~~~
Downloads SDSS images and uses some of them to train a model before using that model to predict the redshift values of the remaining galaxies.
163 changes: 163 additions & 0 deletions docs/examples/rs-tutorial.rst
@@ -0,0 +1,163 @@
Tutorial Project - Redshift
===========================

Pipeline list
-------------
1. `Basic Input/Output`_
2. `Display Random Image`_
3. `Display Random CIFAR-10`_
4. `Train CIFAR-10`_
5. `Train-Test`_
6. `Train-Test-Compare`_
7. `Download-Train-Evaluate`_

.. 6. `Visualize Predictions`_

Pipelines
---------


Basic Input/Output
~~~~~~~~~~~~~~~~~~
This pipeline provides one of the simplest examples of a pipeline possible in DeepForge. Its sole purpose is to create an array of numbers, pass the array from the first node to the second node, and print the array to the output console.

.. figure:: basic-io.png
   :align: center


.. code-block:: python

    import numpy

    class GenArray():
        def __init__(self, length=10):
            self.length = length
            return

        def execute(self):
            arr = list(numpy.random.rand(self.length))
            return arr


Display Random Image
~~~~~~~~~~~~~~~~~~~~
.. figure:: display-rand-img.png
   :align: center

This pipeline’s primary purpose is to show how graphics can be output and viewed. A random noise image is generated and displayed using matplotlib’s pyplot library. Any graphic displayed using the **plt.show()** function can be viewed in the executions tab.

.. code-block:: python

    from matplotlib import pyplot as plt
    from random import randint

    class DisplayImage():
        def execute(self, image):
            if len(image.shape) == 4:
                image = image[randint(0, image.shape[0] - 1)]
            plt.imshow(image)
            plt.show()

Display Random CIFAR-10
~~~~~~~~~~~~~~~~~~~~~~~
.. figure:: display-cifar.png
   :align: center

As with the previous pipeline, this pipeline simply displays a single image. The image from this pipeline, however, is more meaningful, as it is drawn from the commonly used CIFAR-10 dataset. This pipeline seeks to provide an example of the input being used in the next pipeline while providing an example of how the data can be obtained. This is important for users who seek to develop their own pipelines, as CIFAR-10 data generally serves as an effective baseline for testing and development of new CNN architectures or training processes.

Also note, as shown in the figure above, that it is not necessary to utilize all of the outputs of a given node. Unless specifically handled, however, it is generally inappropriate for an input to be left undefined.

.. code-block:: python

    from keras.datasets import cifar10

    class GetDataCifar():
        def execute(self):
            ((train_imgs, train_labels),
             (test_imgs, test_labels)) = cifar10.load_data()
            return train_imgs, train_labels, test_imgs, test_labels

Train CIFAR-10
~~~~~~~~~~~~~~
.. figure:: train-basic.png
   :align: center

This pipeline gives a very basic example of how to create, train, and evaluate a simple CNN. The primary takeaway should be the overall structure of a training pipeline, which in most cases follows these steps:

1. Load data
2. Define the loss, optimizer, and other metrics
3. Compile model, with loss, metrics, and optimizer, using the **compile()** method
4. Train model using the **fit()** method, which requires the training inputs and outputs
5. Output the trained model for serialization and/or utilization in subsequent nodes

.. code-block:: python

    import numpy as np
    import keras

    class TrainBasic():
        def __init__(self, model, epochs=20, batch_size=32, shuffle=True):
            self.model = model
            self.epochs = epochs
            self.batch_size = batch_size
            self.shuffle = shuffle
            return

        def execute(self, train_imgs, train_labels):
            opt = keras.optimizers.rmsprop(lr=0.001)
            self.model.compile(loss='sparse_categorical_crossentropy',
                               optimizer=opt,
                               metrics=['sparse_categorical_accuracy'])
            self.model.fit(train_imgs,
                           train_labels,
                           batch_size=self.batch_size,
                           epochs=self.epochs,
                           shuffle=self.shuffle,
                           verbose=2)
            model = self.model
            return model

.. code-block:: python

    class EvalBasic():
        def __init__(self):
            return

        def execute(self, model, test_imgs, test_labels):
            results = model.evaluate(test_imgs, test_labels, verbose=0)
            for i, metric in enumerate(model.metrics_names):
                print(metric, '-', results[i])
            return results

Train-Test
~~~~~~~~~~
.. figure:: train-basic.png
   :align: center

This pipeline provides an example of how one might train and evaluate a redshift estimation model. For the training process, there are two primary additions that should be noted.

First, the **Train** class has been given a function named **to_categorical**. Because we are using categorization models for redshift estimation in this tutorial, the keras model expects the output labels to be either one-hot vectors or a single integer where the position/value indicates the range in which the true redshift value falls. This function converts the continuous redshift values into the necessary discrete, categorical format.
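A minimal sketch of such a conversion (not the project's actual implementation; the *num_bins* and *max_val* defaults here are assumed for illustration) might bin the continuous values like this:

.. code-block:: python

    import numpy as np

    def to_categorical_sketch(z, num_bins=180, max_val=0.4):
        """Convert continuous redshifts into one-hot bin vectors."""
        # Bin edges spanning [0, max_val]; values at or above max_val
        # are clipped into the last bin.
        edges = np.linspace(0, max_val, num_bins + 1)
        idx = np.clip(np.digitize(z, edges) - 1, 0, num_bins - 1)
        onehot = np.zeros((len(z), num_bins))
        onehot[np.arange(len(z)), idx] = 1
        return onehot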

Second, a class has been provided to give an example of how researchers may define their own Sequence for training. Sequences are helpful in that they allow alterations to be made to the data during training. In the example given here, the **SdssSequence** class provides the ability to rotate or flip images before every epoch, which will hopefully improve the robustness of the final model.

The evaluation node has also been updated to provide metrics more in line with redshift estimation. Specifically, it calculates the fraction of outlier predictions, the model's prediction bias, the deviation in the MAD scores of the model output, and the average Continuous Ranked Probability Score (CRPS) of the output.
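A rough sketch of the first three of these metrics (the outlier threshold and the use of residuals normalized by 1 + z are assumptions for illustration, and CRPS is omitted for brevity):

.. code-block:: python

    import numpy as np

    def redshift_metrics(z_true, z_pred, outlier_thresh=0.05):
        # Residuals normalized by 1 + z, as is common in photo-z evaluation.
        dz = (z_pred - z_true) / (1 + z_true)
        bias = np.mean(dz)
        # NMAD: scaled median absolute deviation from the median residual.
        nmad = 1.4826 * np.median(np.abs(dz - np.median(dz)))
        outlier_frac = np.mean(np.abs(dz) > outlier_thresh)
        return bias, nmad, outlier_frac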


.. Visualize Predictions
.. ~~~~~~~~~~~~~~~~~~~~~


Train-Test-Compare
~~~~~~~~~~~~~~~~~~
.. figure:: train-compare.png
   :align: center

This pipeline gives a more complicated example of how to create visualizations that may be helpful for understanding the effectiveness of a model. The **EvalCompare** node provides a simple comparison visualization of two models.


Download-Train-Evaluate
~~~~~~~~~~~~~~~~~~~~~~~
.. figure:: download.png
   :align: center

This pipeline provides an example of how data can be retrieved and utilized in the same pipeline. The previous pipelines use manually uploaded artifacts. In many real cases, users may desire to retrieve novel data or more specific data using SciServer’s CasJobs API. In such cases, the **DownloadSDSS** node here makes downloading data relatively simple for users. It should be noted that the data downloaded is not in a form easily usable by our models and first requires moderate preprocessing, which is performed in the **Preprocessing** node. This general structure of download-process-train is a common pattern, as data is rarely supplied in a clean, immediately usable format.
Binary file added docs/examples/train-basic.png
Binary file added docs/examples/train-compare.png
Binary file added docs/examples/train-single.png
Binary file added docs/fundamentals/artifacts_tab.png
Binary file added docs/fundamentals/custom_serializer.png
Binary file added docs/fundamentals/custom_utils.png
Binary file added docs/fundamentals/execute_pipeline.png
Binary file added docs/fundamentals/execution_finished.png
Binary file added docs/fundamentals/executions_tab.png
Binary file added docs/fundamentals/import_artifact.png
105 changes: 105 additions & 0 deletions docs/fundamentals/interface.rst
@@ -0,0 +1,105 @@
DeepForge Interface
===================
The DeepForge editor interface is separated into six views for defining all of the necessary features of your desired project. Each view is described below. You can switch between views at any time by clicking the appropriate icon on the left side of the screen. In order, the tabs are:

+---------------+--------------------------+
| |tabs| | - Pipelines_ |
| | - Executions_ |
| | - Resources_ |
| | - Artifacts_ |
| | - `Custom Utils`_ |
| | - `Custom Serialization`_|
+---------------+--------------------------+

.. |tabs| image:: interface_tabs.png

Pipelines
---------
.. figure:: pipelines_tab.png
   :align: center
   :width: 75%

In the initial view, all pipelines that currently exist in the project are displayed. New pipelines can be created using the red plus symbol in the bottom right. From this screen, existing pipelines can also be opened for editing, deleted, or renamed. Pipelines in this list are arranged automatically by the system and cannot be manually reordered in the current implementation.

Pipeline editing
~~~~~~~~~~~~~~~~
.. figure:: pipeline_example.png
   :align: center
   :width: 50%

Pipelines are composed of a directed graph of nodes, where each node is an isolated python module. Nodes are added to a pipeline using the red plus button in the bottom right of the workspace. Any node that has previously been defined in the project can be added to the pipeline, or new operations can be created as needed. Arrows in the workspace indicate the passing of data between nodes. These arrows are created by clicking on the desired output (bottom circle) of the first node and then clicking on the desired input (top circle) of the second node. Clicking on a node also gives the options to delete (red X), edit (blue </>), or change its attributes. Information on editing nodes can be found in `Custom Operations <custom_operations.rst>`_.

Pipelines are executed by clicking the yellow play button in the bottom right of the workspace. In the window that appears, you can name the execution, select a computation platform, and select a storage platform. The computation platform can be either SciServer's Compute service or a WebGME platform. The available storage platforms are SciServer's Files service and Amazon's S3 service. The chosen storage option will be used for storing both the output objects defined in the pipeline and all files used in execution of the pipeline. Login credentials will be required for the SciServer computation service, either storage service, and each individual input node in the pipeline.

.. figure:: execute_pipeline.png
   :align: center
   :width: 75%

Executions
----------
.. figure:: executions_tab.png
   :align: center
   :width: 75%

This view allows the review of previous pipeline executions. Clicking on any execution will display any plotted data generated by the pipeline, and selecting multiple executions will display all of the selected plots together. Clicking the provided links will open either the associated pipeline or a trace of the execution (shown below). The blue icon in the top right of every node allows viewing the text output of that node. The execution trace can be viewed during execution to check the status of a running job. During execution, the color of a node indicates its current status. The possible statuses are:

- **Dark gray**: Awaiting initialization
- **Light gray**: Awaiting execution
- **Yellow**: Currently executing
- **Green**: Successfully finished execution
- **Red**: Execution failed

.. figure:: execution_finished.png
   :align: center
   :width: 50%

Resources
---------
.. figure:: resources_tab.png
   :align: center
   :width: 75%

This view shows the neural network resources available to your pipelines. From this view, resources can be created, deleted, and renamed. Resources are arranged by the DeepForge system and cannot be manually reordered.

.. figure:: neural_network.png
   :align: center
   :width: 50%

As with pipelines, the neural networks are depicted as directed graphs. Each node in the graph corresponds to a single layer or operation in the network (information on operations can be found on the `keras website <https://keras.io/api/>`_). Clicking on a node provides the ability to change the attributes of that layer, delete the layer, or add new layers before or after the current node. Many operations require that certain attributes be defined before use. The Conv2D node pictured above, for example, requires that the *filters* and *kernel_size* attributes be defined. If these are left as *<none>*, a visual indicator will show that there is an error to help prevent mistakes. In order to ease analysis and development, hovering over any connecting line will display the shape of the data as it moves between the given layers.

Artifacts
---------
.. figure:: artifacts_tab.png
   :align: center
   :width: 75%

In this view, you can see all artifacts that are available to your pipelines. These artifacts can be used in any pipeline through the inclusion of the built-in **Input** node. Artifacts are, by default, only supported in the form of either keras models (such as those created using the `keras.model.save_model <https://keras.io/api/models/model_saving_apis/#save_model-function>`_ function) or python pickle objects. Other artifact types can also be used, but require the definition of a `custom serialization <Custom Serialization_>`_. A new artifact can be created in one of three ways. First, artifacts are automatically created during the execution of any pipeline that includes the built-in **Output** node. Second, artifacts can be directly uploaded in this view using the red upload button in the bottom right of the workspace. Using this option will also upload the artifact to the storage platform specified in the popup window. Finally, artifacts that already exist in one of the storage platforms can be imported using the blue import button in the bottom right of the workspace.

|import| |upload|

.. |import| image:: import_artifact.png
   :width: 45%
.. |upload| image:: upload_artifact.png
   :width: 45%


Custom Utils
------------
.. figure:: custom_utils.png
   :align: center
   :width: 75%

This view allows the creation and editing of custom utility modules. Utilities created here can be imported into any pipeline node. For example, the *swarp_config_string* shown above can be printed out in a node using the following code:

.. code-block:: python

    import utils.swarp_string as ss
    print(ss.swarp_config_string)

Custom Serialization
--------------------
.. figure:: custom_serializer.png
   :align: center
   :width: 75%

In this view, you can create custom serialization protocols for the creation and use of artifacts that are neither python pickle objects nor keras models. To create a serialization, you will need to define two functions, one for serialization and one for deserialization. These functions must then be passed as arguments to the *deepforge.serialization.register* function as shown in the commented code above. The serializer and deserializer should have the same signatures as the **dump** and **load** functions, respectively, from python's `pickle module <https://docs.python.org/3/library/pickle.html>`_.
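As a hedged sketch, a serializer/deserializer pair mirroring the **pickle.dump** and **pickle.load** signatures could look like the following. The JSON format here is purely illustrative, and the registration call is left commented because its exact arguments should follow the commented template shown in the editor:

.. code-block:: python

    import json

    def save_json(obj, outfile):
        # Same signature as pickle.dump: (object, writable binary file).
        outfile.write(json.dumps(obj).encode('utf-8'))

    def load_json(infile):
        # Same signature as pickle.load: readable binary file -> object.
        return json.loads(infile.read().decode('utf-8'))

    # Registration would then follow the commented template in the editor,
    # e.g. deepforge.serialization.register(..., save_json, load_json)
    # (exact arguments per the in-editor example; assumed here).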
Binary file added docs/fundamentals/interface_tabs.png
Binary file added docs/fundamentals/neural_network.png
Binary file added docs/fundamentals/pipeline_example.png
Binary file added docs/fundamentals/pipelines_tab.png
Binary file added docs/fundamentals/resources_tab.png
Binary file added docs/fundamentals/upload_artifact.png
8 changes: 8 additions & 0 deletions docs/index.rst
@@ -17,6 +17,7 @@ Welcome to DeepForge's documentation!
   :maxdepth: 1
   :caption: Fundamentals

   fundamentals/interface.rst
   fundamentals/custom_operations.rst
   fundamentals/integration.rst

@@ -29,6 +30,13 @@ Welcome to DeepForge's documentation!
   deployment/overview.rst
   deployment/native.rst

.. toctree::
   :maxdepth: 1
   :caption: Example Projects

   examples/rs-tutorial.rst
   examples/redshift.rst

.. toctree::
   :maxdepth: 1
   :caption: Reference