Example

The DTaaS API follows the OpenAPI specification, and its documentation is self-hosted under /v1/doc. This example uses the Python SDK, which implements an interface to the API, but the API can also be used without the SDK if you need to interface with DTaaS from another language.
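Because the API is plain HTTP, any client can reach it. As a minimal sketch, the snippet below builds the documentation URL for a deployment. The host dtaas.example.com is a placeholder; only the /v1/doc path comes from the text above.

```python
from urllib.parse import urljoin


def doc_url(base_url: str) -> str:
    """Return the self-hosted API documentation URL for a DTaaS deployment."""
    # Normalise to exactly one trailing slash so urljoin appends the path
    # instead of replacing the last path segment.
    return urljoin(base_url.rstrip("/") + "/", "v1/doc")


if __name__ == "__main__":
    # Placeholder host; substitute the address of your own deployment.
    print(doc_url("https://dtaas.example.com"))
```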

SDK Authentication

The SDK works automatically from within DTaaS Servables, and it also works externally provided you can authenticate to DTaaS. Within a running Servable, the SDK is authenticated automatically because DTaaS creates an API key for that Servable and mounts it in the Servable as a secret. Outside of that context, use the sdk login command to obtain authentication tokens.

The SDK also has the concept of Profiles. Low-level Python profile management is covered in the SDK's profile documentation. The CLI can interact with Profiles through the following commands:

sdk login
sdk list-profiles
sdk use-profile
sdk remove-profile

All other CLI commands accept --profile as an argument. For example, sdk --profile profilename list apps executes the command as the profile “profilename”. When using sdk login, --profile sets which profile will be set up with the given credentials, either username/password or an API key. sdk list-profiles lists all profiles and indicates which one is currently active. sdk use-profile changes the active profile, while sdk remove-profile removes the currently active profile.
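When scripting the CLI from Python, the --profile argument can be prepended in one place. The helper below is a hypothetical convenience wrapper, not part of the SDK; it only assembles the argument list in the form shown above.

```python
import shlex
from typing import List, Optional


def sdk_command(*args: str, profile: Optional[str] = None) -> List[str]:
    """Build an argv list for the sdk CLI, optionally selecting a profile."""
    argv = ["sdk"]
    if profile is not None:
        # --profile is placed before the subcommand, as in the example above.
        argv += ["--profile", profile]
    argv += list(args)
    return argv


if __name__ == "__main__":
    argv = sdk_command("list", "apps", profile="profilename")
    # Print the command; hand argv to subprocess.run(argv, check=True) to run it.
    print(shlex.join(argv))
```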

Design Guidelines

A Data Trust Application should be designed so that individual Data Assets are Servables, and the user-facing Application coordinates the lifecycle of and sharing of those Servables among users. The stock DTaaS Admin Dashboard can provide that functionality on a generic level, but building a customized Application allows for a better user experience.

The DTaaS platform makes building certain kinds of systems easier by providing tools to solve common problems:

  • Not all Users have access to the Application running on a single Server.

  • Data lives in multiple locations and must stay in place.

  • There is no pre-existing centralized Identity Management or Access Control.

Example Project

This page gives an introduction to DTaaS modules and a minimal example project. The example assumes you have the SDK installed and have access to a running DTaaS instance. A basic understanding of Docker and Python 3 is also required to follow along.

Overall Project Flow

The general steps to develop a DTaaS project are as follows:

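Step 1: Project Requirements and Architecture, describing how each prototype will interact.
Step 2: Create a DTaaS project with a Manifest that declares the Prototypes.
Step 3: Prototype Development with the fewest number of dependencies.
Step 4: Uploading and Running the Project.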

It is important to know how many prototypes the project requires and what kind of communication they need. This helps determine which prototype types best fit the project.

Step 1: Project Requirements and Architecture

This project will demonstrate how to use a Pipeline, App, and Daemon, and show how they interact with each other using DTaaS. It will be a minimal example that displays possible interactions.

The purpose of a Daemon is to provide a service to be used by other Servables. Communication with a Daemon happens directly between Pods and Services in Kubernetes.

Our example Daemon will be a simple HTTP server.

An App is meant to be a user-facing web application. Our example App will be a very simple application that collects data from the Daemon, and displays the data to the user.

Pipelines are intended to complete some kind of task. Our example Pipeline will coordinate the other two prototypes. First it will create and run the Daemon, followed by creating and running the App while configuring the App to interface with the Daemon.

Step 2: Create a DTaaS project

A DTaaS project consists of a Manifest file that declares Prototypes. We will use the following Manifest:

MANIFEST
{
    "hello-daemon": {
        "name": "hello-daemon",
        "tag": "1",
        "type": "daemon",
        "dependencies": {},
        "arguments": {
            "volume_location": "/data",
            "volume_size": 1
        },
        "capabilities": {},
        "config-capabilities": {}
    },
    "hello-app": {
        "name": "hello-app",
        "tag": "1",
        "type": "app",
        "dependencies": {
            "hello-daemon": "hello-daemon"
        },
        "arguments": {},
        "capabilities": {},
        "config-capabilities": {
            "daemon-id": ["READ"]
        }
    },
    "hello-pipeline": {
        "name": "hello-pipeline",
        "tag": "1",
        "type": "pipeline",
        "dependencies": {
            "hello-app": "hello-app",
            "hello-daemon": "hello-daemon"
        },
        "arguments": {},
        "capabilities": {
            "hello-app": [
                "CREATE",
                "READ"
            ],
            "hello-daemon": [
                "CREATE",
                "READ"
            ]
        },
        "config-capabilities": {}
    }
}

In this example, we will be building the docker images ourselves. The name and tag specified will be how we tag the images we build so that the SDK can find and upload them.

Specifying the dependencies allows DTaaS to ensure that a Prototype cannot be removed from the system until all other Prototypes that depend on it are also removed.
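The removal rule can be illustrated with a small sketch that works purely on the Manifest data. It is not how DTaaS implements the check; it only shows the ordering the rule implies, and it assumes that the keys of each dependencies block name other Manifest entries.

```python
from typing import Dict, List

# Only the fields relevant to dependency tracking, copied from the Manifest.
MANIFEST = {
    "hello-daemon": {"dependencies": {}},
    "hello-app": {"dependencies": {"hello-daemon": "hello-daemon"}},
    "hello-pipeline": {
        "dependencies": {"hello-app": "hello-app", "hello-daemon": "hello-daemon"}
    },
}


def dependents_of(manifest: Dict[str, dict], name: str) -> List[str]:
    """Return the Prototypes that declare a dependency on `name`."""
    return sorted(
        other for other, spec in manifest.items() if name in spec["dependencies"]
    )


def removal_order(manifest: Dict[str, dict]) -> List[str]:
    """Order Prototypes so each is removed only after all of its dependents."""
    remaining = dict(manifest)
    order: List[str] = []
    while remaining:
        # A Prototype is removable once nothing still present depends on it.
        removable = sorted(n for n in remaining if not dependents_of(remaining, n))
        if not removable:
            raise ValueError("dependency cycle in manifest")
        for name in removable:
            order.append(name)
            del remaining[name]
    return order


if __name__ == "__main__":
    print(dependents_of(MANIFEST, "hello-daemon"))
    print(removal_order(MANIFEST))
```

With this Manifest, hello-daemon is blocked by both other Prototypes, so it can only be removed last.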

Because hello-pipeline is responsible for deploying hello-app and hello-daemon, it requires Create Capability on them. Because hello-app will interface with the running Daemon, it needs Read Capability on it to check that the Daemon exists and is running, and to find its address in order to make requests to it. Because we don’t know the id of the Daemon until runtime, this Capability is requested with config-capabilities, which states that we will need Read Capability on the value of the daemon-id key in our config. The Pipeline will have to provide that config when it creates the App.
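To make the config-capabilities idea concrete, the sketch below shows how a request keyed by config name could be turned into a request keyed by object id once the config is known. This illustrates the concept only and is not the DTaaS implementation; the function name and the id "daemon-1234" are made up.

```python
from typing import Dict, List

# The config-capabilities block of hello-app, copied from the Manifest.
CONFIG_CAPABILITIES = {"daemon-id": ["READ"]}


def resolve_config_capabilities(
    config_capabilities: Dict[str, List[str]], config: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each requested config key to the object id found in the config."""
    requested: Dict[str, List[str]] = {}
    for key, capabilities in config_capabilities.items():
        if key not in config:
            # The creator of the App must supply every key named in the request.
            raise KeyError(f"config is missing required key {key!r}")
        requested[config[key]] = list(capabilities)
    return requested


if __name__ == "__main__":
    # "daemon-1234" is a made-up id standing in for the id known at runtime.
    runtime_config = {"daemon-id": "daemon-1234"}
    print(resolve_config_capabilities(CONFIG_CAPABILITIES, runtime_config))
```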

Step 3: Prototype Development

Daemon

Dockerfile
FROM python:3.7-slim

RUN pip install flask

COPY ./hello-daemon.py /hello-daemon.py

ENTRYPOINT ["python3", "/hello-daemon.py"]

hello-daemon.py
from flask import Flask

app = Flask(__name__)


@app.route('/hello')
def hello():
    # The only endpoint: a fixed greeting for callers of the Daemon.
    return 'Hello, DTaaS!'


if __name__ == '__main__':
    # Listen on all interfaces so other Servables can reach the Daemon.
    app.run(host='0.0.0.0', port=8080)

This sets up a Flask server that listens on address 0.0.0.0 and port 8080. An HTTP GET request against /hello should return an HTTP 200 OK response with a body of “Hello, DTaaS!”. Because Daemon objects are not exposed through the DTaaS HTTP proxy, any port or protocol could be used here. It is up to the caller of the Daemon to understand the interfaces that the Daemon exposes.
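Since the Daemon's interface is a contract between it and its callers, Flask itself is optional. As an illustration only, the sketch below swaps in Python's standard-library http.server for Flask, serves the same /hello endpoint, and exercises it with a local client. It binds to 127.0.0.1 on a free port for the demonstration rather than 0.0.0.0:8080.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/hello":
            body = b"Hello, DTaaS!"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Keep the demonstration quiet; the default logs each request to stderr.
        pass


def fetch_hello(port):
    """Act as a caller of the Daemon: GET /hello and return (status, body)."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("GET", "/hello")
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        connection.close()


def demo():
    """Start the server on a free local port, call it once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_hello(server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```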

App

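Dockerfile
FROM python:3.7-slim

# hello-app.py imports requests, so install it explicitly alongside Flask.
RUN pip install flask requests

# The SDK wheel must exist inside the image before pip can install it.
# This assumes the wheel file sits next to this Dockerfile in the build context.
COPY ./sightline-0.15.3-py3-none-any.whl /tmp/sightline-0.15.3-py3-none-any.whl
RUN python3 -m pip install /tmp/sightline-0.15.3-py3-none-any.whl

COPY ./hello-app.py /hello-app.py

ENTRYPOINT ["python3", "/hello-app.py"]

hello-app.py
import requests

from flask import Flask

from sightline.simon.instance import Instance
from sightline.simon.utils import current_config


app = Flask(__name__)

# The config passed to App.create by the Pipeline; it carries the daemon-id.
config = current_config()

# Look up the running Daemon to discover its address.
daemon = Instance.from_id(config["daemon-id"])
daemon_url = f"http://{daemon.ipv4_address}:8080/hello"


@app.route('/')
def serve_web_frontend():
    # Fetch the greeting from the Daemon on every page load.
    response = requests.get(daemon_url)
    return f"The Daemon says, {response.text}"


if __name__ == "__main__":
    # Apps must serve HTTP on port 8080 so the DTaaS HTTP proxy can expose them.
    app.run(host="0.0.0.0", port=8080)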

Because Apps are meant to be served to end users in a browser, the requirements for an App are more constrained. HTTP is mandatory, as is running the server on port 8080 so that the DTaaS HTTP proxy can expose it.

Web applications also have concerns about authentication and authorization. DTaaS has two mechanisms to deal with this: an automatic authenticating proxy, and an OpenID Connect-compliant Identity Provider that Apps can use to authenticate DTaaS users. The DTaaS HTTP proxy can enforce access control for Apps, but that enforcement only works at a very coarse level. DTaaS does not understand the HTTP API exposed by Apps, so it can only allow or deny all HTTP requests. If granular access control is desired, the automatic access control can be disabled on App creation, and the App can use OpenID Connect with the DTaaS Identity Provider to authenticate users.

In this example, because our API is simple, we rely on automatic access control and do not write any additional code for granular access control.

This App also includes the Sightline SDK. The SDK interfaces with the DTaaS API to discover the Daemon address, allowing the App to talk to the Daemon. The config object from the SDK is the configuration passed into the App.create call in the Pipeline to create the App, which is where we are provided with the daemon-id.
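Because the App cannot start without the daemon-id, it can help to fail early with a clear message when the key is missing. The sketch below uses two hypothetical helpers that are not part of the SDK; the port and path match the Daemon defined earlier, and the address 10.0.0.5 is a placeholder.

```python
from typing import Mapping


def require_daemon_id(config: Mapping[str, str]) -> str:
    """Return config["daemon-id"], or fail early with an actionable message."""
    daemon_id = config.get("daemon-id")
    if not daemon_id:
        raise RuntimeError(
            'App config has no "daemon-id"; the creator of the App must '
            "pass it in the config given to App.create"
        )
    return daemon_id


def hello_url(ipv4_address: str, port: int = 8080) -> str:
    """Build the URL of the Daemon's /hello endpoint from its address."""
    return f"http://{ipv4_address}:{port}/hello"


if __name__ == "__main__":
    # Placeholder values; in the App these come from current_config() and
    # from the Instance looked up by id.
    print(require_daemon_id({"daemon-id": "daemon-1234"}))
    print(hello_url("10.0.0.5"))
```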

Pipeline

Dockerfile
FROM python:3.7-slim

# The SDK wheel must exist inside the image before pip can install it.
# This assumes the wheel file sits next to this Dockerfile in the build context.
COPY ./sightline-0.15.3-py3-none-any.whl /tmp/sightline-0.15.3-py3-none-any.whl
RUN python3 -m pip install /tmp/sightline-0.15.3-py3-none-any.whl

COPY ./hello-pipeline.py /hello-pipeline.py

ENTRYPOINT ["python3", "/hello-pipeline.py"]

hello-pipeline.py
from sightline.simon.app import App
from sightline.simon.daemon import Daemon
from sightline.simon.instance import Resources
from sightline.simon.capability import resolve_delegation_request


# Both Servables share the same modest resource request.
resources = Resources(cpu_cores=1, ram=1024, disk=1024, gpu_count=0, shm=0)

# Create and start the Daemon first so that its id is known.
daemon = Daemon.create(
    "hello-daemon:1",
    name="hello-daemon",
    config={},
    resources=resources)
daemon_instance = daemon.run()


# Create the App and tell it which Daemon to talk to.
app = App.create(
    "hello-app:1",
    name="hello-app",
    config={
        "daemon-id": daemon_instance.id
    },
    resources=resources)
app_instance = app.run()

# Running the App will request Capabilities on the daemon we pass in to
# the config. This grants that request.
resolve_delegation_request(app_instance.pending_capabilities)

The Pipeline deploys the Daemon and the App, demonstrating DTaaS object creation and Capability delegation. Running this as a Pipeline isn’t strictly necessary. For this use case it is possible and reasonable to write a stand-alone deployment script that is identical to the Pipeline. The only difference is that a stand-alone script running outside of DTaaS has to authenticate first, so that the SDK can obtain the access token it needs to make API calls.

Step 4: Uploading and Running the Project

In order to upload the Prototypes, first use docker to build the images and tag them the same way that the Manifest refers to them.

$ docker build -t hello-app:1 ./hello-app
$ docker build -t hello-daemon:1 ./hello-daemon
$ docker build -t hello-pipeline:1 ./hello-pipeline
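The tags used above have to match the Manifest exactly. As a sketch, they can be derived from the Manifest data instead of being typed by hand. The example below copies only the name and tag fields and assumes, as in the commands above, that each Prototype lives in a directory named after it.

```python
from typing import Dict, List

# name and tag for each Prototype, copied from the Manifest.
MANIFEST = {
    "hello-daemon": {"name": "hello-daemon", "tag": "1"},
    "hello-app": {"name": "hello-app", "tag": "1"},
    "hello-pipeline": {"name": "hello-pipeline", "tag": "1"},
}


def image_reference(spec: Dict[str, str]) -> str:
    """Return the name:tag string that the Docker image must carry."""
    return f"{spec['name']}:{spec['tag']}"


def build_commands(manifest: Dict[str, Dict[str, str]]) -> List[List[str]]:
    """Return one docker build argv per Prototype in the Manifest."""
    return [
        ["docker", "build", "-t", image_reference(spec), f"./{key}"]
        for key, spec in manifest.items()
    ]


if __name__ == "__main__":
    for command in build_commands(MANIFEST):
        # Print only; pass each list to subprocess.run(command, check=True)
        # to actually build the image.
        print(" ".join(command))
```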

To upload the Prototypes, you first need an authenticated context to be able to use the SDK to interact with DTaaS. The canonical way to do this is to authenticate, and then export the authentication context into your environment. From there it will be used by future calls to the SDK, or by child processes that inherit your environment variables (for example, calling into the API from a Python script).

$ sdk login

Note

If DTaaS is deployed without TLS, add the --insecure flag.

This may require the use of a web browser to complete, as DTaaS can be configured to require authentication with a 3rd-party Identity Provider.

$ eval $(sdk export)
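The reason exporting works for Python scripts is ordinary environment inheritance: a child process receives a copy of its parent's environment. The sketch below demonstrates that mechanism only. The variable name EXAMPLE_DTAAS_TOKEN is hypothetical, since the variables that sdk export actually sets are not listed here.

```python
import os
import subprocess
import sys


def child_sees(name: str, value: str) -> str:
    """Set a variable in this process and return what a child process sees."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        # No env= argument is passed, so the child inherits os.environ.
        result = subprocess.run(
            [sys.executable, "-c", f"import os; print(os.environ.get({name!r}, ''))"],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    finally:
        # Restore the parent's environment to its earlier state.
        if previous is None:
            del os.environ[name]
        else:
            os.environ[name] = previous


if __name__ == "__main__":
    # Hypothetical variable name, used only to show inheritance.
    print(child_sees("EXAMPLE_DTAAS_TOKEN", "abc123"))
```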

Because we want to upload Prototypes, we also have to configure our local docker client to authenticate with the DTaaS registry. This can be done with:

$ sdk docker-configure [host] [port]

Here, [host] and [port] are the deployment location of DTaaS and the port used to expose the registry (usually 5000 by default). Then upload the project:

$ sdk upload
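For orientation, Docker addresses an image in a private registry as host:port/name:tag. The sketch below builds that standard form from the registry host and port described above. Whether sdk upload retags images this way internally is not stated here, and dtaas.example.com is a placeholder host.

```python
def registry_address(host: str, port: int = 5000) -> str:
    """Return the host:port address of the registry."""
    return f"{host}:{port}"


def registry_image(host: str, name: str, tag: str, port: int = 5000) -> str:
    """Return a registry-qualified image reference in Docker's standard form."""
    return f"{registry_address(host, port)}/{name}:{tag}"


if __name__ == "__main__":
    # Placeholder host; 5000 is the usual default registry port mentioned above.
    print(registry_image("dtaas.example.com", "hello-app", "1"))
```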

Once your project is uploaded, we can run it. We’ll be launching the Pipeline from a stand-alone Python script.

run-pipeline.py
from sightline.simon.capability import resolve_delegation_request
from sightline.simon.pipeline import Pipeline
from sightline.simon.instance import Resources
from sightline.simon.app import App

resources = Resources(cpu_cores=1, ram=1024, disk=1024, gpu_count=0, shm=0)

# Create the Pipeline from the uploaded image and start it.
pipeline = Pipeline.create(
    "hello-pipeline:1",
    name="hello-pipeline",
    resources=resources,
    config={})
instance = pipeline.run()

# Grant the Capabilities that the running Pipeline requested.
resolve_delegation_request(instance.pending_capabilities)

Save this as run-pipeline.py and then run it:

$ python3 ./run-pipeline.py

After the Pipeline has completed, the App can be found either in the outputs field of the Pipeline or by selecting the correct App from App.list(). The App’s URL is available as app.url, and visiting that URL in a browser should display the App that you defined earlier.
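Selecting the right App from a list can be sketched as follows. The FakeApp type is a stand-in for whatever App.list() returns. The url attribute comes from the text above, while the name attribute is an assumption based on the name passed to App.create, and the URL shown is a placeholder.

```python
from collections import namedtuple
from typing import Iterable

# Stand-in for SDK App objects; only the attributes used here are modelled.
FakeApp = namedtuple("FakeApp", ["name", "url"])


def find_app(apps: Iterable, name: str):
    """Return the single App with the given name, or raise LookupError."""
    matches = [app for app in apps if app.name == name]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one App named {name!r}, found {len(matches)}")
    return matches[0]


if __name__ == "__main__":
    # Placeholder data; with the SDK this list would come from App.list().
    apps = [
        FakeApp(name="other-app", url="https://dtaas.example.com/apps/other"),
        FakeApp(name="hello-app", url="https://dtaas.example.com/apps/hello"),
    ]
    print(find_app(apps, "hello-app").url)
```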