DataKitchen DataOps Documentation



action (test)

A test action defines what is to be done when a test fails. Test actions can be set to log, warning, or stop-on-error.

action node

A node type containing a data source but no data sink. Performs some action. Contains a directory named /actions that contains the files directing the data work to be performed.


Adds a given number of days (positive or negative) to a date time object.


Adds a given number of months (positive or negative) to a date time object.


Adds a given number of weeks (positive or negative) to a date time object.


Adds a given number of years (positive or negative) to a date time object.
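
The four date-arithmetic entries above can be pictured with Python's standard library. This is only a sketch of the semantics, not DataKitchen's implementation: the helper names add_days, add_weeks, add_months, and add_years are hypothetical, and clamping to the last valid day of the target month is an assumption.

```python
import calendar
from datetime import datetime, timedelta


def add_days(dt, days):
    """Shift a datetime by a positive or negative number of days."""
    return dt + timedelta(days=days)


def add_weeks(dt, weeks):
    """Shift a datetime by a positive or negative number of weeks."""
    return dt + timedelta(weeks=weeks)


def add_months(dt, months):
    """Shift by whole months, clamping the day when the target month is shorter (assumed)."""
    # Work with a zero-based month count so negative shifts wrap correctly.
    total = dt.year * 12 + (dt.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


def add_years(dt, years):
    """Shift by whole years, reusing the month logic so Feb 29 is clamped."""
    return add_months(dt, 12 * years)


start = datetime(2020, 1, 31, 9, 30)
print(add_days(start, -1))
print(add_weeks(start, 2))
print(add_months(start, 1))
print(add_years(start, 1))
```

Days and weeks map directly onto fixed-length intervals, while months and years do not, which is why the sketch needs an explicit rule for dates such as January 31.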


The Mesos slave that runs DataKitchen Recipes. The UI displays the status across all available Agents if more than one is configured. As long as one Agent is available, the status will show as green. Additionally, the Agent status in the UI shows the total available memory and disk space for all Kitchens. The Agent status is refreshed every 30 seconds. An Agent should be provided with sufficient disk space to run the upper limit of expected simultaneous OrderRuns. Note that the Agent will compute the disk space only for the partition where the working directory is set.

analytic container



Denotes the Keys to which a test will be applied. The test will loop over these keys, assigning each value to the test variable. This is an older syntax that is best used for tests containing historical calculations.



Parses a string representation of a boolean value.


The DataKitchen UI supports Google Chrome, Mozilla Firefox, and Microsoft Edge.

built-in functions

Functions that support miscellaneous operations on variables.

built-in variables

Read-only variables available for each order run that provide details regarding the run.


clean previous run

An option for DKCC's kitchen-merge-preview command that allows the user to wipe any existing local files that remain after a previously aborted kitchen-merge-preview.


The default name of the local directory to which compiled versions of Recipes are written. It results from running the recipe-compile command.


container id

The Docker Container ID associated with the containers used for container nodes is recorded as part of each OrderRun's MongoDB record. This value can be accessed via the UI on the OrderRun details page, specifically under the OrderRun details header (container node -> progress.json -> json -> notebook_progress -> DKContainer -> container-progress -> container-id).

container node

A node type that is encapsulated by a container. Provides flexibility to build nodes using GUI tools or other tools not currently supported by the default node types.


A local configuration for DKCC associated with a specific customer account. A default context configuration is required. Additional contexts may be configured if desired.


A DKCC command used to delete a local context configuration.


A DKCC command that lists all local context configurations.


A DKCC command used to create a new context, or switch to another existing context.


See "clean previous run"


A system/built-in variable set equal to the name of the current Kitchen.


A system/built-in variable set equal to the ID of the current Order.


A system/built-in variable set equal to the ID of the current OrderRun.


A system/built-in variable set equal to the name of the current Variation for which the Recipe is being compiled.


datamapper node

A node type that maps data between a data source and a data sink.


View and sign The DataOps Manifesto.


DataKitchen's default i/o connector to put data from Recipe Nodes.


DataKitchen's default i/o connector to get data to Recipe Nodes.


Returns a string representation of a date time object according to a given format.


Returns a date time object according to a given string representation and a format.
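
As a rough illustration of the two entries above (date time to string, and string back to date time), the following sketch uses Python's strftime and strptime. The format string is an assumed example; the source does not state which format directives DataKitchen accepts.

```python
from datetime import datetime

# Assumed example format; not taken from the DataKitchen documentation.
FMT = "%Y-%m-%d %H:%M:%S"

moment = datetime(2021, 3, 4, 5, 6, 7)

# Date time object -> string representation, according to a format.
text = moment.strftime(FMT)

# String representation + format -> date time object.
parsed = datetime.strptime(text, FMT)

print(text)
print(parsed == moment)
```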


Used for the decryption of file-based Data Sources. Its value should point to a Vault Secret that is stored as text, not binary.


Used for the decryption of file-based Data Sources if and only if the key has a passphrase set. Its value should point to a Vault Secret that is stored as text, not binary.

description.json (Node)

A file contained by each Recipe node containing "type" and "name" fields.

description.json (Recipe)

One of the 4 core Recipe configuration files. Contains "recipe-name", "nodes-to-use", "edges-to-use", and "recipe-emails" fields.

directed-acyclic-graph (DAG)

A finite directed graph with no directed cycles. DAGs consist of Nodes and Edges.
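
The "no directed cycles" property is what makes it possible to order nodes for execution. A minimal sketch, assuming Python 3.9+ for the standard graphlib module; the node names and the execution_order helper are hypothetical and are not part of DataKitchen:

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical graph: each node maps to the set of nodes it depends on.
edges = {
    "load": {"start"},
    "transform": {"load"},
    "report": {"transform"},
    "end": {"report", "transform"},
}


def execution_order(graph):
    """Return one valid node order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"graph is not acyclic: {exc.args[1]}") from exc


order = execution_order(edges)
print(order)
```

Passing a graph such as {"a": {"b"}, "b": {"a"}} raises ValueError, because neither node can run before the other.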

disk space

Disk space resources allocated for a given OrderRun.


The acronym for DKCloudCommand, DataKitchen's command line tool.


Denotes a Node type Action Node.


Denotes a Node type Container Node.


Denotes a Node type DataMapper Node.


Denotes a Node type Ingredient Node.


Denotes a Node type Synchronize Node.



The connection between two or more nodes in a Recipe-Variation graph.


Used for the encryption of S3 and SFTP datasinks. Its value should point to a Vault Secret that is stored as text, not binary.


If the scheduler misses the scheduled run time for any reason, it will still run the job if the delay time is within this interval. This is configured as part of the timing and runtime settings within the variations.json configuration file. Follows ISO 8601 syntax.
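
The rule above can be sketched as a comparison between the delay and an ISO 8601 duration. This is an illustration only: parse_duration and should_run are hypothetical helpers, the parser handles just the day/hour/minute/second subset of ISO 8601, and the value "PT30S" is an assumed example rather than a documented default.

```python
import re
from datetime import datetime, timedelta

# Subset of ISO 8601 durations: PnDTnHnMnS (no years, months, or weeks).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a duration such as 'PT30S' or 'P1DT2H' into a timedelta."""
    match = _DURATION.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


def should_run(scheduled, actual, epsilon):
    """Run a late job only if its delay is shorter than the epsilon interval."""
    delay = actual - scheduled
    return timedelta(0) <= delay < parse_duration(epsilon)


scheduled = datetime(2021, 1, 1, 12, 0, 0)
print(should_run(scheduled, scheduled + timedelta(seconds=10), "PT30S"))
print(should_run(scheduled, scheduled + timedelta(seconds=45), "PT30S"))
```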



A DKCC command that compiles a file with the variable and override values associated with a provided Recipe Variation name.


A DKCC command used to delete one or more Recipe files. Delete all files within a directory to delete the directory itself.


Leverages a locally-configured tool to display a two-pane window that compares a local version of a file against its remote counterpart.


A DKCC command that leverages a locally-configured tool to display a three-pane window that compares a file (remote-copies) across two Kitchens. Source Kitchen, Target Kitchen, and base version of the file are displayed.


The base filepath to be referenced inside a container node.


A DKCC command that marks a conflicted file within a Recipe as resolved, so that a merge can be completed.


A DKCC command that updates the server copy of one or more existing Recipe files based on local changes, or adds one or more new Recipe files. Requires inclusion of a change message.


Parses a string representation of a floating point number.



A container that is used to get data. Inside its /docker-share directory, only the config.json file can parse jinja templates.


A DKCC command that is used to set up a GIT repository for a customer account. Authentication remains via a centralized GIT user, but commits are tagged with the appropriate DK username and email address.


A customer-level Vault whose Secrets are accessible by all Kitchens. All users may edit its connection settings. It may be disabled or used in conjunction with Kitchen-Vaults. OrderRuns look to the Global-Vault for Secrets only if the value cannot first be found within a connected Kitchen-level Vault.




A field in a Container Node's notebook.json configuration file that specifies the tag of the Docker image to be pulled when building the container. This is an optional field. When it is not populated, the value defaults to "latest."


A Recipe and its outputs, which can be reused by another Recipe without requiring reprocessing of the reusable code. As of v1.0.62, for Ingredients to function properly, they need to exist in the Kitchen for which they will be used, or in the case of creating a Child Kitchen via the Wizard, the Ingredients must exist in the parent Kitchen so that they may be inherited by the child.

ingredient node

A node type that calls an Ingredient.


The name of the Recipe that has been declared as an Ingredient. This field is configured in the notebook.json file of Ingredient Nodes.


A dictionary of configuration found within an Ingredient Node's notebook.json file. Specifies the metadata to be passed from the Ingredient OrderRun to its Parent OrderRun, including the polling interval.


Parses a string representation of an integer number.



DataKitchen configuration file format.



A configuration flag used when defining tests. When set to true, keep-history retains the value of a variable across runs to provide for historical comparisons.


A substep of work within a Recipe node. Multiple keys may exist within a node and are executed in the order they are presented within the node. Tests are applied to the output of keys. A row count as the final portion of a SQL query is an example. All keys within a node are processed before the processing of tests.


Kitchens are virtual workspaces, tied to a release environment, where people build, manage, and run data pipelines. Think of them much like factories containing assembly lines.


A DKCC command that deletes a Kitchen. Deleting a Kitchen will not delete any child Kitchens, but will instead create orphan kitchens.


A history of all changes that have occurred to either the definition of the kitchen environment or the Recipe code and configuration. The history of changes is filterable. Each change provides a detailed diff view of file changes, automated change messages, and optional user messages.

kitchen-level overrides

These overrides supersede variables.json baseline values as well as overrides defined in variations.json. Unlike the values they override, kitchen-level overrides sit outside of the Recipe content in version control and thus are best used for defining infrastructure. For example, the schema name compiled in development versus production Kitchens. Because they are defined at the Kitchen-level, kitchen-level overrides are applied to all Recipes within a Kitchen. These overrides may be defined via Kitchen details in the UI or via the kitchen-config command when using DKCloudCommand.
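
The precedence described above (variables.json baseline, then variations.json overrides, then kitchen-level overrides) can be sketched as layered dictionary updates. The resolve_variables helper and the variable names are hypothetical; this shows the ordering only, not how DataKitchen compiles Recipes.

```python
def resolve_variables(baseline, variation_overrides, kitchen_overrides):
    """Merge the three layers so that later layers win on conflicting names."""
    resolved = dict(baseline)              # values from variables.json
    resolved.update(variation_overrides)   # overrides from variations.json
    resolved.update(kitchen_overrides)     # kitchen-level overrides win last
    return resolved


# Hypothetical values for illustration.
baseline = {"schema": "default_schema", "row_limit": 100}
variation_overrides = {"row_limit": 500}
kitchen_overrides = {"schema": "dev_schema"}

compiled = resolve_variables(baseline, variation_overrides, kitchen_overrides)
print(compiled)
```

Because the kitchen-level layer is applied last, the same Recipe compiles against a different schema in each Kitchen without any change to the versioned Recipe files.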


A file stored in MongoDB that contains the configuration for the Kitchen Wizard. Note that additional Wizard settings, specifically required variables that appear as text fields in the Wizard, are configured not via the kitchen-settings.json file but the variations.json file that is part of the Ingredient Recipe.

kitchen staff

Defines the set of users who have access to a Kitchen. A user will see all existing Kitchens, but access will be blocked to those for which they do not have Kitchen staff rights. A user need not have access to the master Kitchen.

kitchen status

Appears in the UI as part of the Kitchen list. Indicates whether there was an error with a wizard step during Kitchen creation. If an error exists, additional details will appear.


An optionally-inheritable custom Vault connection type where access to Secrets is limited to connected Kitchens. Management of Kitchen-Vault connection settings is limited to Kitchen Staff. Each Kitchen may only be connected to a single Kitchen-Vault at a time. Secrets in these Vaults will override Secret values in the Global-Vault if the Secret paths are identical.
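
The lookup order between a Kitchen-Vault and the Global-Vault can be sketched as follows. Plain dictionaries stand in for the Vaults, and resolve_secret and the secret paths are hypothetical; the sketch only illustrates that an identical path in the Kitchen-Vault takes priority.

```python
def resolve_secret(path, kitchen_vault, global_vault):
    """Look in the Kitchen-Vault first, then fall back to the Global-Vault.

    Either vault may be None to model a Kitchen with no connected
    Kitchen-Vault or a disabled Global-Vault.
    """
    if kitchen_vault is not None and path in kitchen_vault:
        return kitchen_vault[path]
    if global_vault is not None and path in global_vault:
        return global_vault[path]
    raise KeyError(path)


# Hypothetical secret paths and values.
global_vault = {"db/password": "global-value", "s3/key": "global-key"}
kitchen_vault = {"db/password": "kitchen-value"}

print(resolve_secret("db/password", kitchen_vault, global_vault))
print(resolve_secret("s3/key", kitchen_vault, global_vault))
```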

kitchen wizard

A UI feature that guides "clickers" through the process of creating, deleting, and merging Kitchens. Wizards can also be configured to perform other tasks like adding Ingredients to Kitchens and creating schemas and clusters.



A file location in {USER_HOME}/.dk that denotes the currently installed version of DKCC. This is used to prompt the user to upgrade DKCC when applicable.


Loads a .csv file and returns its contents as a list of tuples, which can be iterated with jinja loop expressions. The file is read at compile time, so it must reference an existing resource file in the Recipe. The path is absolute; the built-in variable WorkDir can be used as a helper to locate the file in the Recipe.
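
For readers unfamiliar with the "list of tuples" shape, here is a minimal Python sketch using the standard csv module. The load_rows helper and the sample data are hypothetical, and whether DataKitchen's function includes the header row is not stated in the source; this sketch keeps it.

```python
import csv
import io


def load_rows(handle):
    """Read CSV text from an open file-like object into a list of tuples."""
    return [tuple(row) for row in csv.reader(handle)]


# In-memory stand-in for a resource file inside a Recipe.
sample = io.StringIO("region,rows\neast,10\nwest,20\n")
rows = load_rows(sample)

# Iterate the tuples, much as a jinja loop expression would.
for region, count in rows[1:]:
    print(region, count)
```

Note that every value arrives as a string, so numeric columns need an explicit conversion before comparison.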

local tool

A local tool configured with DKCC to provide two-pane file diffs and three-pane file merges.


OrderRun logs integrate logging data for all tools in your toolchain that are orchestrated as part of any given Recipe Variation.



A field within a DataMapper's notebook.json file. Defines the names and keys for sources and sinks.


The default Kitchen which is the parent to all subsequent Kitchen lineages and also a parent unto itself. The master Kitchen cannot be deleted.


The setting, in MB, for the upper limit of disk space made available to an Agent for a Recipe to run. This value is configured via a Variation's Mesos Settings configuration. The default setting is 2048 MB. This is the minimum disk space allowed. Additional disk space may be required depending on the volume of data being handled by the Recipe. Ingredient nodes are treated as wholly separate Recipes and thus require their own dedicated disk space.


The setting for the upper limit of RAM space for a container containing a Recipe; configured via Variation Mesos configuration. The default setting is 1024 MB.


Designates a DataKitchen Agent constraint. When an Agent constraint is applied to a Kitchen via a mesos-group, orders from that kitchen will only be picked up and cooked by Agents with a matching mesos-group tag. This can be used to segment orders across release environments and across cloud providers/on-prem.


A hash format that may be generated when a file is loaded by a data source.


Also known as Timing & Runtime Settings. Contains the configuration for a specific combination of order scheduling and resource allocation. Found within each recipe's variations.json within the mesos-setting-list.


A list of configured mesos-settings located in each recipe's variations.json file.



Encapsulates a unit of work in a data analytics workflow, which can be code or a GUI tool with configuration. Nodes can be thought of as steps in a data workflow. Nodes contain one or more keys and should contain one or more tests. A node first processes preconditions, retrieves data, performs tests, and finally, processes postconditions.


Used to override the nodes used for the graph for a recipe variation. Paired with edges-to-use.


A configuration file used for datamapper and container nodes.


An object representing current date time.



The submission of a specific recipe variation for execution. Orders may be run on demand or scheduled to commence at some point in the future. Orders possess a unique order ID and may contain one or more unique order runs.

order id

The unique ID assigned to every order. Available during runtime via CurrentOrderID.


A specific instance of the running of a recipe variation. Multiple order runs will exist for a given order if that order is configured to repeat. Each order run has a unique order run ID and a run record.


In a recipe that runs an Ingredient, set this value to true to allow the logs for tests of log type, for the order run in the generated child kitchen, to be read into the parent kitchen.


In a recipe that runs an Ingredient, set this value to true to allow the logs for tests of warning type, for the order run in the generated child kitchen, to be read into the parent kitchen.


For a recipe that runs an ingredient, this value is used to set the specific time/cadence that the recipe checks the status of the auto-generated child kitchen where the order run takes place for the Ingredient itself. An example value, found in a notebook.json file, is [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
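
A list like the example above reads naturally as a widening polling schedule. The sketch below shows one way such a list could drive status checks. The wait_for_completion helper is hypothetical, the unit of the intervals (seconds here) is an assumption, and what DataKitchen does once the list is exhausted is not stated in the source.

```python
import time


def wait_for_completion(is_done, intervals, sleep=time.sleep):
    """Wait for each interval in turn, checking status after every wait.

    Returns True as soon as is_done() reports completion, or False if the
    intervals run out first.
    """
    for delay in intervals:
        sleep(delay)
        if is_done():
            return True
    return False


# Demo with a fake sleep so nothing actually waits.
waited = []
statuses = iter([False, False, True])
finished = wait_for_completion(
    lambda: next(statuses),
    [2, 4, 8, 16, 32],
    sleep=waited.append,
)
print(finished, waited)
```

Doubling the interval keeps early checks responsive for short runs while avoiding constant polling of long ones.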


The name of the kitchen marked as history kitchen, to be used in metrics.

orderrun id

The unique ID assigned to every order run. Available during runtime via CurrentOrderRunID.

orphan kitchen

Deleting a kitchen does not delete any of its child kitchens but rather orphans them. Orphan kitchens can subsequently be deleted or merged "diagonally" into other kitchens via the "Advanced Merge" feature.


The name of the file returned by a container node. Defined per key.


See variable overrides and recipe overrides


parent kitchen

The kitchen from which a child kitchen is directly created, and from which it inherits all selected recipes. Inheritance also applies to kitchen staff and recipe overrides. When merging between a parent and child, the merge should first be processed from parent down to child (parent=source).


A type of connection configured in an FTP data source or sink.


See synchronize node


The ID of the previous OrderRun.


An array containing the previous order run IDs for this recipe variation.


A file, specific to each order run, that contains details about the ongoing run.

python scripts

Python scripts that are used as part of recipes are placed inside containers (container nodes or data sources). Other python scripts may be used strictly for development work, though they never run against production servers. These scripts are likely included as /resources files in GHE.


quality assurance

Do DataOps and implement automated tests across your recipes to catch errors in place and resolve them before they are ever released to Production.



A combination of pipeline assets and DataKitchen configuration files that define a Graph of executable steps. Each Recipe contains four core configuration files, a /resources sub-directory, and additional sub-directories specific to each contained Node. A given Recipe may consist of multiple Variations, each with its own variable override set saved in variations.json.

recipe overrides

See kitchen-level overrides.


The name of the Recipe for the current OrderRun.


The level of detail included in each OrderRun record makes its processing fully reproducible, thus satisfying regulatory requirements.


A standard recipe directory that contains files leveraged by the recipe. Node-specific files can either be stored here or within their respective nodes.


Order runs that have failed or have been manually stopped may be resumed, either from the UI or via DKCC's orderrun-resume command. Resumed order runs ignore nodes that were previously completed successfully and will rerun the entirety of failed nodes, even if a failed node did have some keys that completed successfully. Note that a new container is used for a resumed order run.
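
The resume rule above operates at node granularity, which the following sketch illustrates. The nodes_to_rerun helper, the node names, and the status strings are hypothetical; treating nodes that never started as needing to run is an assumption.

```python
def nodes_to_rerun(node_order, node_status):
    """Return the nodes a resumed run executes, in their original order.

    Nodes that completed successfully are skipped. Failed nodes are rerun
    in full, regardless of how many of their keys had succeeded.
    """
    return [name for name in node_order if node_status.get(name) != "completed"]


# Hypothetical run: 'transform' failed after some of its keys succeeded.
node_order = ["load", "transform", "report"]
node_status = {"load": "completed", "transform": "failed"}

print(nodes_to_rerun(node_order, node_status))
```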


A built-in feature of data sources and sinks whereby the row count of a file is set to a variable that can be consumed by a downstream test. Row counts are not available for binary files.

run record

Each order run generates a distinct run record that contains a full copy of the recipe code that was run, a record of the specific infrastructure on which the run occurred, compiled values for all relevant overrides, timing information, and test data. With a run record, an order run becomes fully reproducible.



Recipes may be scheduled to run on a recurring basis. If changes are applied to a Recipe between scheduled order runs, the recipe need not be rescheduled. See here for syntax details.


The time an order run was scheduled to execute per its order's schedule. Used with actual runtime to calculate the delay for a run, which will always be less than the configured Epsilon interval. Users can use this variable in any JSON, SQL, or text file. Python files (.py) and shell scripts (.sh) cannot use this variable directly. If using in a non-analytic container, this variable can be passed as a command line: /bin/bash -c "echo ScheduledOrderRunTime > somefile"

schedule delay

The time elapsed between when an order run was scheduled to kick off and its actual start time. This delay time is always less than the configured Epsilon interval, otherwise, the scheduled order run is skipped.


A sensitive value stored securely via encryption in the vault. A filter is applied to the system that prevents Secrets from being displayed in order run logs. Sometimes this filter overwrites non-Secrets in logs (but not in compiled files).

serving states

The states provided by the dk orderrun-info --runstatus command response: PLANNED_SERVING, ACTIVE_SERVING, COMPLETED_SERVING, STOPPED_SERVING, SERVING_ERROR, SERVING_RERAN, UNKNOWN


A hash format that may be generated when a file is loaded by a data source.




The designated "from" Kitchen for kitchen merge previews and kitchen merges.


The specific key from a data source to be mapped to a specific key in a data sink as part of an explicit mapping in a datamapper node's notebook.json file.


The specific data source to be mapped to a specific data sink as part of an explicit mapping in a datamapper node's notebook.json file.


Indicates the state of an order or order run. Recall that orders may contain multiple order runs if they are scheduled or if they have been stopped and resumed. Possible order status values include: Active, Complete, Stopped, Error. Possible order run values include: "", Active, Completed, Error in.


A test action that stops an order run when a test fails.


Returns a string representation of any object.

synchronize node

A node type that does no data work but serves as a placeholder or convergence point. The default recipe template contains two of these nodes. Its "type" is denoted as "DKNode_NoOp."



The designated "to" kitchen for kitchen merge previews and kitchen merges.


Defines a standard recipe structure that can be leveraged when creating a recipe via DKCC. Templates currently include those that match Quickstart1 (default), Quickstart2, and Quickstart3.


Defined within a recipe node, a test is applied to a key within said node. Tests are configured with the following fields: "test-variable", "type", "applies-to-keys", "action", "keep-history", "test-logic", "test-compare", and "test-metric." A single key can be used by multiple tests by first creating a test, populating a variable as part of said test, then using that variable in other tests. The test suite holds onto variable values within a given node.


Legacy syntax, though still supported. Sub-field to test-logic. Declares how to compare test-variable to test-metric.


Declares the type of the variable being tested as datetime.


Declares the type of the variable being tested as float.


Declares the type of the variable being tested as integer.


Declares the type of the variable being tested as string.


Contains a logic statement evaluating the test-variable.


Legacy syntax, though still supported. Sub-field to test-logic. Parent field to optional historic-calculation and historic-metric fields when performing a historic comparison test. Declares the value the test-variable will be compared against. Can be a literal (100, "100"), a date expression, or a variable name (runtime or key-associated).




API session tokens are valid for 4 hours after which long-running connection sessions must be renewed.


The variable that holds the value being tested.







type (test)

The type of test performed. This is used to evaluate the value of test-variable as a specific datatype. If the value cannot be cast to the specified type, an error will be thrown. If omitted, the user is responsible for ensuring the correctness of datatypes used in test-logic: [test-contents-as-date, test-contents-as-float, test-contents-as-integer, test-contents-as-string].
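
The casting behavior described above can be sketched as a lookup from test type to a conversion function. This is an illustration under stated assumptions, not DataKitchen's implementation: cast_test_variable is a hypothetical helper, and ISO 8601 input for the date type is assumed because the source does not specify the accepted date formats.

```python
from datetime import datetime

# Map each documented type name to a Python conversion (date parsing is assumed ISO 8601).
CASTS = {
    "test-contents-as-integer": int,
    "test-contents-as-float": float,
    "test-contents-as-string": str,
    "test-contents-as-date": datetime.fromisoformat,
}


def cast_test_variable(value, test_type=None):
    """Cast a test variable to the declared type, or pass it through if none is given."""
    if test_type is None:
        # Type omitted: the caller is responsible for datatype correctness.
        return value
    try:
        return CASTS[test_type](value)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"cannot cast {value!r} using {test_type}") from exc


print(cast_test_variable("42", "test-contents-as-integer"))
print(cast_test_variable("2021-01-01", "test-contents-as-date"))
```

Casting up front means a row count read as the string "42" compares numerically rather than lexically in the test logic.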


user home

The default location where DKCC installs its hidden /.dk folder. This contains a DKCC config file and latest version file. Defined in documentation by {USER_HOME}.



The process of confirming that all recipe files can be properly compiled during an Order-Run. Validation is processed at the variation level for both DKCC and the UI at the time of any version control changes. With DKCC users may also validate a recipe variation aside from any version control changes.


Values set in the variables.json file for runtime flexibility. Can be overridden.


One of the 4 core recipe configuration files. Stores values for Jinja templates, which allows for runtime flexibility. Values are stored as plain text or point to a path within the Vault.

variable overrides

Values set in the variations.json file for runtime flexibility. Override values set in variables.json. Can be overridden.


One of the 4 core recipe configuration files. Contains a list of defined recipe variations as well as settings for the Environment, Mesos, and Overrides. Also defines the "active-variation."


An encrypted store of secrets in DynamoDB. See global-vault and kitchen-vault.



Warns the user when a test fails but does not stop an order run.


Leveraged when getting or putting files with templated naming formats, specifically to cycle through large numbers of files. For example, wildcards may be used to pull files based on the date in their name while ignoring the time portion of the name. Wildcard errors only halt an order run if they reference a non-existent filepath; absence of files matching wildcards will not stop an order run.
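
The date-but-not-time example above can be sketched with shell-style patterns from Python's fnmatch module. DataKitchen's actual wildcard syntax is not described in the source, so the glob-style pattern, the match_files helper, and the file names are all assumptions.

```python
import fnmatch


def match_files(filenames, pattern):
    """Return the names that match a shell-style wildcard pattern, sorted."""
    return sorted(name for name in filenames if fnmatch.fnmatchcase(name, pattern))


# Hypothetical file names of the form sales_<date>_<time>.csv
available = [
    "sales_20210105_083000.csv",
    "sales_20210105_171500.csv",
    "sales_20210106_083000.csv",
]

# Match on the date portion while ignoring the time portion.
print(match_files(available, "sales_20210105_*.csv"))

# No matching files yields an empty list rather than an error.
print(match_files(available, "sales_20210107_*.csv"))
```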


A field in kitchen.json that summarizes the order runs processed by the cooking of ingredients via the kitchen wizard.


The current working directory, also known as the recipe root directory.






