DataKitchen DataOps Documentation

Orders

The processing of recipe variations in kitchens.

The processing of a recipe variation is known as running an Order. Orders are specific to a kitchen-recipe-variation combination.

Order ID
Each order is assigned an Order ID, which is unique across all kitchens.

    **Example Order ID:** 887d01ba-365e-11e9-8394-0242ac110002
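The example ID has the shape of a UUID. Assuming Order IDs are always UUID strings (an inference from the example, not something this page states), a client-side sanity check might look like the following minimal sketch. The helper name `is_well_formed_order_id` is hypothetical.

```python
import uuid


def is_well_formed_order_id(value: str) -> bool:
    """Return True if value is a canonical, hyphenated UUID string.

    Assumption: Order IDs are UUIDs, as the documented example suggests.
    """
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # Require the hyphenated form used in the documentation example.
    return str(parsed) == value.lower()


print(is_well_formed_order_id("887d01ba-365e-11e9-8394-0242ac110002"))
print(is_well_formed_order_id("not-an-order-id"))
```

A check like this only confirms the format. It does not confirm that the order exists in any kitchen.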

UI Orders Summary

A list of past and present orders is available via the UI on the Orders page. Users can filter the order list based on recipe, variation, ID, status, and time.

CLI Orders Summary

A list of past and present orders is available via the CLI by way of the order-list command.

~ $ dk order-list -k eric_dev
Current context is: default
YYYY-MM-DD HH:MM:SS - Get Order information for Kitchen eric_dev

ORDER SUMMARY (order ID: af325758-7bdf-11e9-adc1-0242ac110002)
Kitchen:        eric_dev
Recipe:         test_sources_sinks
Variation:      on_demand
Schedule:       now
Status:         COMPLETED_ORDER

  1.  ORDER RUN (OrderRun ID: 958dcdb4-7bdf-11e9-b517-0ad19868a29a)
        OrderRun Status OrderRun Completed
        Start time:     YYYY-MM-DD HH:MM:SS EDT
        End time:       YYYY-MM-DD HH:MM:SS EDT
        Duration:       0:00:03 (H:M:S)

ORDER SUMMARY (order ID: 1ae54d8a-4c05-11e9-a18b-0242ac110002)
Kitchen:        eric_dev
Recipe:         new_recipe
Variation:      Variation1
Schedule:       now
Status:         COMPLETED_ORDER

  1.  ORDER RUN (OrderRun ID: 19672852-4c05-11e9-8b62-0ad19868a29a)
        OrderRun Status OrderRun Completed
        Start time:     YYYY-MM-DD HH:MM:SS EDT
        End time:       YYYY-MM-DD HH:MM:SS EDT
        Duration:       0:00:03 (H:M:S)

ORDER SUMMARY (order ID: cf2612c2-4c03-11e9-a18b-0242ac110002)
Kitchen:        eric_dev
Recipe:         new_recipe
Variation:      Variation1
Schedule:       now
Status:         COMPLETED_ORDER

  1.  ORDER RUN (OrderRun ID: cbc23fca-4c03-11e9-9f13-0ad19868a29a)
        OrderRun Status OrderRun Completed
        Start time:     YYYY-MM-DD HH:MM:SS EDT
        End time:       YYYY-MM-DD HH:MM:SS EDT
        Duration:       0:00:03 (H:M:S)

ORDER SUMMARY (order ID: 0f55c348-4b4f-11e9-8ef7-0242ac110002)
Kitchen:        eric_dev
Recipe:         test_sources_sinks
Variation:      on_demand
Schedule:       now
Status:         COMPLETED_ORDER

  1.  ORDER RUN (OrderRun ID: 17b0b1ec-4b4f-11e9-b31c-0ad19868a29a)
        OrderRun Status OrderRun Completed
        Start time:     YYYY-MM-DD HH:MM:SS EDT
        End time:       YYYY-MM-DD HH:MM:SS EDT
        Duration:       0:00:06 (H:M:S)

ORDER SUMMARY (order ID: 3d5b927c-4a73-11e9-9320-0242ac110002)
Kitchen:        eric_dev
Recipe:         prospect_demo
Variation:      on_demand_demo
Schedule:       now
Status:         COMPLETED_ORDER

  1.  ORDER RUN (OrderRun ID: 3c84fa1e-4a73-11e9-ae2f-0ad19868a29a)
        OrderRun Status OrderRun Completed
        Start time:     YYYY-MM-DD HH:MM:SS EDT
        End time:       YYYY-MM-DD HH:MM:SS EDT
        Duration:       0:01:15 (H:M:S)
~ $ dk order-list --help



Current context is: default
Usage: dk order-list [OPTIONS]

  List Orders in a Kitchen.

  Examples:

  1) Basic usage with no paging, 5 orders, 3 order runs per order.

  dk order-list

  2) Get first, second and third page, ten orders per page, two order runs
  per order.

  dk order-list --start 0  --order_count 10 --order_run_count 2

  dk order-list --start 10 --order_count 10 --order_run_count 2

  dk order-list --start 20 --order_count 10 --order_run_count 2

  3) Get first five orders per page, two order runs per order, for recipe
  recipe_name

  dk order-list --recipe recipe_name --order_count 5 --order_run_count 2

Options:
  -k, --kitchen TEXT              Filter results for kitchen only
  -s, --start INTEGER             Start offset for displaying orders
  -oc, --order_count INTEGER      Number of orders to display
  -orc, --order_run_count INTEGER
                                  Number of order runs to display, for each
                                  order
  -r, --recipe TEXT               Filter results for this recipe only
  --help                          Show this message and exit.
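The paging examples in the help text advance --start by the page size. A minimal sketch of that arithmetic follows, using only the flags listed above; the function name `order_list_pages` is hypothetical, and the sketch builds argument lists without invoking the CLI.

```python
from typing import Iterator, List, Optional


def order_list_pages(
    pages: int,
    order_count: int = 10,
    order_run_count: int = 2,
    kitchen: Optional[str] = None,
) -> Iterator[List[str]]:
    """Yield one `dk order-list` argument list per page.

    The start offset advances by order_count for each page, matching
    the 0 / 10 / 20 progression in the help text.
    """
    for page in range(pages):
        args = ["dk", "order-list"]
        if kitchen is not None:
            args += ["--kitchen", kitchen]
        args += [
            "--start", str(page * order_count),
            "--order_count", str(order_count),
            "--order_run_count", str(order_run_count),
        ]
        yield args


for argv in order_list_pages(pages=3, kitchen="eric_dev"):
    print(" ".join(argv))
```

Each yielded list could be passed to subprocess.run on a machine where the dk CLI is installed and configured.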

Order Runs

Each order contains at least one Order Run. Order runs represent the distinct, one-time processing of a recipe variation within a specific kitchen. An order may be associated with multiple order runs if the order was configured with some degree of repetition. For example, an order may be submitted to run daily, generating a distinct order run record each day.

The platform displays order run data and results on the Order Run Details page, and the same data may be accessed via command line. See Order Run Details for more information.

Order Run ID
Each order run is given a unique ID called an Order Run ID.

    **Example order run ID:** 802213bc-3662-11e9-9f7b-0ad19868a29a

Each Order Run Has Its Own Deep-Linked URL

Schedule

A recipe must contain at least one configured Schedule (known as mesos-setting-list in the JSON configuration), which indicates when an order should be processed, its repetition properties, and its retry interval. Recipe variations may share schedules, and schedules may be configured but remain unused.

Schedules may be accessed via deep links in the following format.
https://cloud.datakitchen.io/dk/index.html#/recipeConfig/customercode/kitchenname/recipename
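A deep link in this format can be assembled from its three path segments. The sketch below assumes that each segment should be percent-encoded, which this page does not confirm; `recipe_config_link` is a hypothetical helper.

```python
from urllib.parse import quote

BASE_URL = "https://cloud.datakitchen.io/dk/index.html#/recipeConfig"


def recipe_config_link(customer_code: str, kitchen: str, recipe: str) -> str:
    """Build a recipe configuration deep link from its three segments.

    Assumption: segments are percent-encoded so that characters such as
    '/' or spaces cannot change the path structure.
    """
    segments = (customer_code, kitchen, recipe)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


print(recipe_config_link("customercode", "kitchenname", "recipename"))
```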

Configure schedules in the web app

  1. On the Recipes page, select your recipe.
  2. Click the Schedules tab.
  3. For new schedules, click the Add Schedule button.
  4. For existing schedules, click the Actions menu and select Edit.

Configure schedules in the CLI

Using the command line, access a recipe's variations.json file to configure schedules, specifically within the mesos-setting-list.

Mesos Settings

A recipe may contain many configured mesos settings, with each specifying its own schedule and resources allocation.

{
    "variation-list": {
        "daily_6_am_edt_multi_node_graph": {
            "graph-setting": "graph",
            "mesos-setting": "daily_6_am_edt"
        },
        "on_demand_multi_node_graph": {
            "graph-setting": "graph"
        },
        "on_demand_single_node_graph": {
            "graph-setting": "graph_single_node"
        },
        "on_demand_single_node_graph_different_dataset": {
            "graph-setting": "graph_single_node",
            "mesos-setting": "on_demand",
            "override-setting": [
                "different_dataset"
            ]
        }
    },
    "graph-setting-list": {
        "graph": [
            [
                "node1",
                "node2"
            ]
        ],
        "graph_single_node": [
            [
                "node1",
                "node2"
            ]
        ]
    },
    "mesos-setting-list": {
        "daily_6_am_edt": {
            "schedule": "0 6 * * *",
            "scheduleTimeZone": "America/New_York",
            "epsilon": 1800,
            "max-ram": 512,
            "max-disk": 10240
        }
    },
    "override-setting-list": {
        "different_dataset": {
            "source_data": "dataset2"
        }
    }
}
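Because variations refer to graph, mesos, and override settings by name, a misspelled name is easy to miss. The following minimal sketch lists references that have no matching definition. It assumes that every referenced name is expected to appear in the corresponding list, which the platform may not strictly require; the helper name and the sample configuration are hypothetical.

```python
import json
from typing import Dict, List

# Key used inside a variation -> top-level list expected to define the name.
REFERENCE_KEYS = {
    "graph-setting": "graph-setting-list",
    "mesos-setting": "mesos-setting-list",
    "override-setting": "override-setting-list",
}


def find_dangling_references(config: Dict) -> List[str]:
    """Return one message per setting name that is referenced but undefined."""
    problems = []
    for variation, settings in config.get("variation-list", {}).items():
        for key, list_name in REFERENCE_KEYS.items():
            if key not in settings:
                continue
            value = settings[key]
            # override-setting holds a list of names; the others hold one name.
            names = value if isinstance(value, list) else [value]
            defined = config.get(list_name, {})
            for name in names:
                if name not in defined:
                    problems.append(
                        f"{variation}: {key} '{name}' not in {list_name}"
                    )
    return problems


# Hypothetical configuration with one undefined schedule name.
sample = json.loads("""
{
  "variation-list": {
    "daily_run": {"graph-setting": "graph", "mesos-setting": "daily_6_am_edt"},
    "adhoc_run": {"graph-setting": "graph", "mesos-setting": "nightly"}
  },
  "graph-setting-list": {"graph": [["node1", "node2"]]},
  "mesos-setting-list": {"daily_6_am_edt": {"schedule": "0 6 * * *"}}
}
""")

for problem in find_dangling_references(sample):
    print(problem)
```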

Syntax

Order schedules consumed by Kubernetes-based DataKitchen agents follow cron syntax.

Note that the order scheduler supports standard cron statements, including the following expressions.

  • Both 0s and 7s can be used for Sunday. For example, orders scheduled for 0 20 * * 7 or 0 20 * * 0 both run at 20:00 on Sunday.
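The equivalence of 0 and 7 can be illustrated with a small normalizer. This is a minimal sketch, not the scheduler's own parser; `normalize_day_of_week` is a hypothetical helper that handles only plain comma-separated numbers in the day-of-week field and leaves ranges, steps, and `*` untouched.

```python
def normalize_day_of_week(cron_expr: str) -> str:
    """Rewrite a 5-field cron expression so that Sunday is always 0."""
    fields = cron_expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")

    parts = fields[4].split(",")
    if all(part.isdigit() for part in parts):
        days = [int(part) for part in parts]
        if any(day > 7 for day in days):
            raise ValueError("day-of-week values must be between 0 and 7")
        # Map 7 to 0, drop duplicates, and keep a stable ascending order.
        normalized = sorted({0 if day == 7 else day for day in days})
        fields[4] = ",".join(str(day) for day in normalized)
    return " ".join(fields)


print(normalize_day_of_week("0 20 * * 7"))
print(normalize_day_of_week("0 20 * * 0"))
```

Both calls print the same expression, which is why the two schedules in the bullet above run at the same time.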

Schedules Are Most Easily Configured via the UI

Web app forms provide scheduling configuration using simple dropdown forms. Alternatively, DateTime schedules may be converted to cron syntax using open source tools.

On Demand

Recipe variations that lack an applied schedule run on-demand by default. The web app also provides an option to run a scheduled recipe variation once, on-demand. This option is presented in the order confirmation modal.

Recurring

Schedules may be configured to recur at some interval, for example, every weekday at 5 AM IST.

"mesos-setting-list": {
        "daily_5_am_ist_schedule": {
            "schedule": "0 5 * * 1-5",
            "scheduleTimeZone": "Asia/Calcutta",
            "epsilon": 1800
        }
    },
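To see what the expression "0 5 * * 1-5" means in practice, the sketch below computes the next weekday 05:00 slot after a given moment. It mirrors the meaning of the cron expression rather than the scheduler's implementation, and it assumes IST can be modelled as a fixed UTC+05:30 offset. The function name is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: IST is a fixed UTC+05:30 offset with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def next_weekday_run(now: datetime, run_at: time = time(5, 0)) -> datetime:
    """Return the first Monday-to-Friday occurrence of run_at after now."""
    local_now = now.astimezone(IST)
    candidate = datetime.combine(local_now.date(), run_at, tzinfo=IST)
    # Step one day at a time until the slot is in the future and on a weekday
    # (weekday() returns 0 for Monday through 4 for Friday).
    while candidate <= local_now or candidate.weekday() > 4:
        candidate += timedelta(days=1)
    return candidate


# 2019-05-17 was a Friday; after 05:00 the next slot is Monday 2019-05-20.
print(next_weekday_run(datetime(2019, 5, 17, 6, 0, tzinfo=IST)))
```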

Timezone Support

Order scheduling provides timezone support with a convenient search feature. A list of supported timezone values may be found here.

If no timezone is applied, Coordinated Universal Time (UTC) is used by default.

Retry Interval

Recipe schedules also include a retry interval configuration known as an Epsilon value. This entry specifies the period of time during which the system can retry an order run execution in the event it is delayed from its start time. Retry periods are recorded in seconds, with no default.
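One plausible reading of the Epsilon value is a window that opens at the scheduled start time and closes epsilon seconds later. The sketch below encodes that reading only; the inclusive boundaries and the function name `within_retry_window` are assumptions, not documented platform behavior.

```python
from datetime import datetime, timedelta


def within_retry_window(
    scheduled_start: datetime, now: datetime, epsilon_seconds: int
) -> bool:
    """Return True if now falls between the scheduled start and start + epsilon."""
    deadline = scheduled_start + timedelta(seconds=epsilon_seconds)
    return scheduled_start <= now <= deadline


scheduled = datetime(2019, 5, 17, 6, 0)
# With epsilon = 1800, a run delayed by 20 minutes is still inside the window,
# while a run delayed by 31 minutes is not.
print(within_retry_window(scheduled, datetime(2019, 5, 17, 6, 20), 1800))
print(within_retry_window(scheduled, datetime(2019, 5, 17, 6, 31), 1800))
```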

Editing Existing Order Schedules

Unlike changes to the rest of the recipe and kitchen configuration, changes to a recipe variation's schedule (mesos-setting-list) are not reflected in orders that are already running.

Editing Schedules for Active Orders

Edits to timing and runtime settings have no impact on existing, active orders. Thus, when editing the timing and runtime settings for an active order, the existing order must be stopped and a new order submitted after the settings changes have been applied.

Runtime Resources

Recipes also contain at least one configured resource allocation (also known as Runtime Setting), which allocates memory and disk space on the Agent Machine for each order run that is processed. These upper bounds are configured to prevent runaway resource consumption due to user error.

Runtime settings are configured in the same location as recipe Schedules, both from the web app and command line.

Configure resources in the web app

  1. On the Recipes page, select your recipe.
  2. Click the Schedules tab.
  3. Open a Schedule dialog.
    • For new schedules, click the Add Schedule button.
    • For existing schedules, click the Actions menu and select Edit.
  4. Find the RAM and Disk Space fields at the bottom.

Resource allocation is configured in variations.json, specifically within the mesos-setting-list.

"mesos-setting-list": {
        "daily_9_pm_edt_schedule": {
            "schedule": "0 21 * * 1-5",
            "scheduleTimeZone": "Eastern (EDT)",
            "epsilon": 1800,
            "max-ram": 1024,
            "max-disk": 2048
        }
    },

Memory

The default memory runtime setting allocates 1024 MB of memory for each order run. This is sufficient memory for most order runs. This setting is configured in variations.json via max-ram.

The memory required for an order run = RAM used for processing the order + active memory caches + some Docker overhead.

Disk Space

The default disk space runtime setting allocates 2048 MB of disk space for each order run. This is sufficient disk space for most order runs. This setting is configured in variations.json via max-disk.

The disk space required for an order run = space needed for any write operations + space for the recipe file sizes. The disk space usage reported in logs does not include any resource requirements of shared infrastructure, such as container images and shared volumes used for file exchange within a node.
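The two formulas above can be turned into a quick estimate and compared against max-ram and max-disk. This is a minimal sketch; all measurements in the example are hypothetical and the function names are illustrative.

```python
def required_memory_mb(processing_mb: int, cache_mb: int, docker_overhead_mb: int) -> int:
    """Memory = processing RAM + active memory caches + Docker overhead."""
    return processing_mb + cache_mb + docker_overhead_mb


def required_disk_mb(write_mb: int, recipe_files_mb: int) -> int:
    """Disk = space for write operations + space for the recipe files."""
    return write_mb + recipe_files_mb


def fits(required_mb: int, allocated_mb: int) -> bool:
    """Return True if the estimate stays within the configured upper bound."""
    return required_mb <= allocated_mb


# Hypothetical measurements checked against the default allocations.
memory = required_memory_mb(processing_mb=700, cache_mb=250, docker_overhead_mb=100)
disk = required_disk_mb(write_mb=1500, recipe_files_mb=200)
print(memory, fits(memory, 1024))  # 1050 False
print(disk, fits(disk, 2048))      # 1700 True
```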

Best Practices

The default resource allocation settings handle the vast majority of orders, so most users never need to adjust them. When an order is resource-constrained and a new allocation must be considered, users can follow a straightforward workflow to arrive at the appropriate allocation:

  • Review the order run logs for clear error messages about resource constraints.
  • Note the cumulative memory and disk space consumed at the point of order run failure.
  • Check the failure point in the order run graph for components with memory leaks or an abnormally large volume of collected data.
  • Increase the resource allocation for the recipe variation's mesos-setting.
  • Cancel any scheduled orders for the recipe variation.
  • Run a test order. If it passes, reschedule the main order. If it fails, repeat the steps above.
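The allocation increase in the workflow above amounts to editing max-ram or max-disk for one entry in mesos-setting-list. A minimal sketch of that edit on an already loaded variations.json dictionary follows; the helper name `raise_allocation` is hypothetical, and refusing to lower a value is a design choice of this sketch rather than a platform rule.

```python
import copy
from typing import Dict, Optional


def raise_allocation(
    config: Dict,
    setting_name: str,
    max_ram: Optional[int] = None,
    max_disk: Optional[int] = None,
) -> Dict:
    """Return a copy of config with larger max-ram and/or max-disk values (MB)."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    setting = updated["mesos-setting-list"][setting_name]
    for key, new_value in (("max-ram", max_ram), ("max-disk", max_disk)):
        if new_value is None:
            continue
        current = setting.get(key)
        if current is not None and new_value < current:
            raise ValueError(f"{key} would shrink from {current} to {new_value}")
        setting[key] = new_value
    return updated


config = {
    "mesos-setting-list": {
        "daily_9_pm_edt_schedule": {"max-ram": 1024, "max-disk": 2048}
    }
}
updated = raise_allocation(config, "daily_9_pm_edt_schedule", max_ram=2048)
print(updated["mesos-setting-list"]["daily_9_pm_edt_schedule"])
```

As the workflow notes, the new values take effect only for orders submitted after the change.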

Use in Container Nodes
Before adding a script to a Container Node in a recipe, examine its memory consumption on your local machine. A memory leak in a standalone script will also affect every order run that includes that script. See also GPC Resource Allocation for more information.

Use in Ingredient Nodes
The use of ingredients requires special resource allocation consideration. When an order run encounters an Ingredient Node, a temporary kitchen is generated and the Ingredient referenced by the ingredient node runs there, with its status, runtime variables, and logs being communicated back to its parent order run upon completion. As a result, use of ingredients requires allocating sufficient resources to cover the distinct order runs associated with ingredient nodes.

Editing Existing Order Resource Allocations

Editing Resource Allocations for Recipes with Existing Orders

Edits to resource allocations have no impact on existing, active orders. Thus, when editing the resource allocations for a recipe with active order(s), the existing order(s) must be stopped and new order(s) submitted after the configuration changes have been applied.

Running Orders

An order is always associated with a kitchen-recipe-variation combination.

Running Orders via the Web App

Users can run orders for existing recipe variations with saved graphs from several locations in the web app but only from within the associated kitchen.

Run an order from the Recipes page.

Run an order from the variation graph view.

Confirmation dialog that appears after choosing to run an order.

Running CLI Orders

Run orders via the command line using the order-run command.

~ $ dk order-run --kitchen eric_dev --recipe test_sources_sinks on_demand --yes
Current context is: default
YYYY-MM-DD HH:MM:SS - Create an Order:
        Kitchen: eric_dev
        Recipe: test_sources_sinks
        Variation: on_demand

Order ID is: 9cf3efec-9684-11e9-8f1d-0242ac110002
~ $ dk order-run --help



Current context is: default
Usage: dk order-run [OPTIONS] VARIATION

  Run an Order: cook a Recipe Variation.  Run an Order for a single, specific 
  Recipe graph node using the --node option.

Options:
  -k, --kitchen TEXT  Kitchen name
  -r, --recipe TEXT   Recipe name
  -n, --node TEXT     Name of the node to run
  -p, --params TEXT   Overrides passed as parameters
  -y, --yes           Force yes
  --help              Show this message and exit.
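For scripted use, the order-run command line can be assembled programmatically. The sketch below uses only the options shown in the help text and builds the argument list without running it; `order_run_args` is a hypothetical helper, and encoding --params as compact JSON is an assumption based on the shape of the CLI examples on this page.

```python
import json
import shlex
from typing import Dict, List, Optional


def order_run_args(
    kitchen: str,
    recipe: str,
    variation: str,
    node: Optional[str] = None,
    params: Optional[Dict[str, str]] = None,
    yes: bool = False,
) -> List[str]:
    """Build the argument list for a `dk order-run` invocation."""
    args = ["dk", "order-run", "--kitchen", kitchen, "--recipe", recipe, variation]
    if node is not None:
        args += ["--node", node]  # run a single graph node only
    if params:
        # Assumption: overrides are passed as one compact JSON object.
        args += ["--params", json.dumps(params, separators=(",", ":"))]
    if yes:
        args.append("--yes")  # skip the confirmation prompt
    return args


argv = order_run_args("eric_dev", "test_sources_sinks", "on_demand", yes=True)
print(shlex.join(argv))
```

Passing the list to subprocess.run avoids shell quoting problems with the JSON value; shlex.join is used here only to display the equivalent command line.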

Run Variation Once

Users may run an entire recipe variation's graph in a kitchen even if the recipe variation has an existing, scheduled, recurring order in place.

Run a variation one time using the run arrow next to the variation name or Run Variation buttons on the Recipes and Variations pages.

Information about the order run from running a variation once appears on the Order Run Details page. See Order Run Details for more information.

Order-Time Overrides

When submitting an order, users have the option of passing overrides at runtime. Order-Time Overrides take precedence over all other parameters feeding into the recipe.

Order-Time Overrides in the Web App
When users submit orders from the web app, the Run Variation dialog opens and displays the compiled values of all recipe variation input parameters. Users may proceed with these values or edit them selectively.

Order-Time CLI Overrides
Use the order-run command with the --params option to pass order-time overrides via the CLI.

~ $ dk order-run --kitchen eric_dev --recipe test_sources_sinks on_demand --params '{"email":"[email protected]"}' --yes
Current context is: default
YYYY-MM-DD HH:MM:SS - Create an Order:
        Kitchen: eric_dev
        Recipe: test_sources_sinks
        Variation: on_demand

Order ID is: 55b49efa-9680-11e9-9d6a-0242ac110002
The following variables will be overridden:
email: eric.veleker+[email protected]

Run One Node

When building graph nodes, it is often most efficient to run the node under development on its own rather than the full graph of which it is a component. Rather than saving a new variation with a new graph, users may take advantage of the run-one-node feature.

Run-One-Node Considerations

  • The entire variation will get compiled and checked for errors, even though only one node will execute.
  • Any upstream dependencies, such as changes to a database's state or setting variables during execution, will be missing. This may cause errors or test failures during the single node Order.
  • The single node order will count as part of the variation's metrics and test history, and this may skew those statistics.

Run One Node via UI

Running a single node in a graph makes it easy to test iteratively while building a recipe. From a recipe variation graph, open the desired node in the Node Editor and click the Run Node button.

Run a single node from the Node Editor

The run of a single node is also listed on the Orders page and has its own Order Run Details page. See Order Run Details for more information.

Run One Node via CLI

Run one node via the command line using the order-run command with the --node option.

~ $ dk order-run --kitchen eric_dev --recipe test_sources_sinks on_demand --node action_node_s3_source --yes


Current context is: default
YYYY-MM-DD HH:MM:SS - Create an Order:
        Kitchen: eric_dev
        Recipe: test_sources_sinks
        Variation: on_demand
        Node: action_node_s3_source
  
Order ID is: c9dc1bd2-4ef6-11e8-a766-080027aceabc

Reproducibility

Each order run record includes the following:

  • An identical copy of the recipe code and configuration that was processed
  • All parameters and runtime variables with their compiled values per the override hierarchy
  • A full copy of order run logs
  • Identification of data inputs and outputs
  • Identification of leveraged infrastructure instances
  • Test result metadata
  • Process timing metadata

Order run records thus assist back-testing use cases and are often more than adequate to meet governance and regulatory requirements.

Order Run Archive

An optional, automated daily backup of a kitchen's order run records to S3 is available on the Configure Kitchen page.

Backup Timing

Order run backups to a configured archive location occur each day at 6 PM EST.

"backup":{
      "last_backup":null,
      "backup_enabled":false,
      "export_test_data":false,
      "s3_bucket":null,
      "s3_secret_key":null,
      "s3_access_key":null,
      "export_node_timing":false,
      "target_folder":null,
      "export_key_timing":false
   },
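The fragment above shows the archive disabled with every S3 field set to null. A pre-flight check for an enabled archive might look like the following sketch. It assumes that an enabled backup needs the bucket and both keys, which this page does not state explicitly; `missing_backup_fields` is a hypothetical helper.

```python
from typing import Dict, List

# Assumption: these three fields are required once backup_enabled is true.
REQUIRED_WHEN_ENABLED = ("s3_bucket", "s3_access_key", "s3_secret_key")


def missing_backup_fields(backup: Dict) -> List[str]:
    """Return the S3 fields that are empty although backups are enabled."""
    if not backup.get("backup_enabled"):
        return []  # nothing to check while the archive is switched off
    return [key for key in REQUIRED_WHEN_ENABLED if not backup.get(key)]


# Hypothetical configuration: enabled, but credentials not yet filled in.
backup = {
    "backup_enabled": True,
    "s3_bucket": "example-archive-bucket",
    "s3_access_key": None,
    "s3_secret_key": None,
    "target_folder": None,
}
print(missing_backup_fields(backup))
```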

Updated 4 days ago
