DataOps users who prefer to operate at the command line rather than the DataKitchen web app GUI can do so via DKCloudCommand, the command line tool for interacting with the DataKitchen API.
PyPI Package Index
DataKitchen recommends that you first review the DataOps Web App topics as companions to this demonstration of DKCloudCommand. The GUI resources introduce concepts like kitchens, recipes, variations, orders, and ingredients. In these command line topics, we'll cover those concepts in less detail and focus on building a recipe from scratch.
DKCloudCommand requires Python version 3.6 or newer. DataKitchen recommends Python 3.8. A specific Python version can be installed within a virtual environment, as described in the next section.
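Before installing, the version requirement above can be checked from the shell. The following is a minimal sketch, assuming `python3` is on your PATH; `check_python_version` is a hypothetical helper name, not part of DKCloudCommand.

```shell
# Hypothetical helper: succeeds only if python3 is version 3.6 or newer.
check_python_version() {
  python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)'
}

# Show the interpreter version, then report whether it meets the minimum.
python3 --version
if check_python_version; then
  echo "Python version OK"
else
  echo "Python 3.6 or newer is required"
fi
```

If the check fails, one of the virtual environment options described below can supply a suitable interpreter.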
DKCloudCommand is best installed via pip3, which is already installed if you are using Python 3 (version 3.6 or newer).
We recommend that all users install DKCloudCommand within a virtual environment. Two great options, conda and virtualenv, are described in detail below. You may choose to use either.
Next, execute the series of commands below to set up a conda virtual environment named DataKitchen where we will later install DKCloudCommand.
    # Create a conda virtual environment named DataKitchen that uses Python 3.7
    ~ $ conda create -n DataKitchen python=3.7

    # Activate the newly-created conda virtual environment (Windows)
    ~ $ activate DataKitchen
    (DataKitchen) ~ $

    # Activate the newly-created conda virtual environment (macOS)
    ~ $ source activate DataKitchen
    (DataKitchen) ~ $

    # When finished using the conda virtual environment, deactivate it
    (DataKitchen) ~ $ source deactivate
    ~ $
Execute the following series of commands to install virtualenv and create a virtual environment named DataKitchen where we will later install DKCloudCommand.
    # Install virtualenv
    ~ $ pip install virtualenv

    # Create and enter a directory to house local virtual environments
    ~ $ mkdir virtualenvs
    ~ $ cd virtualenvs

    # Create a virtual environment named DataKitchen that uses Python 3.7
    ~ $ virtualenv DataKitchen --python=python3.7

    # Activate the newly-created DataKitchen virtual environment
    ~ $ source DataKitchen/bin/activate
    (DataKitchen) ~ $

    # When finished using the virtual environment, deactivate it
    (DataKitchen) ~ $ deactivate
    ~ $
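Whichever option you choose, it can be useful to confirm that the environment is active before installing anything into it. The sketch below relies on the `VIRTUAL_ENV` variable that virtualenv sets on activation and the `CONDA_DEFAULT_ENV` variable that conda sets; `active_env_name` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: print the name of the active virtual environment.
# Returns a non-zero status when no environment is active.
active_env_name() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    # virtualenv exports the full path of the environment directory
    basename "$VIRTUAL_ENV"
  elif [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
    # conda exports the environment name directly
    printf '%s\n' "$CONDA_DEFAULT_ENV"
  else
    return 1
  fi
}

# Report the current state.
if name="$(active_env_name)"; then
  echo "Active environment: $name"
else
  echo "No virtual environment is active"
fi
```

With the DataKitchen environment activated, the report should name DataKitchen. If it says that no environment is active, repeat the activation step for your chosen tool.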
DKCloudCommand can also be run inside a custom container.
Python package requirements will be installed or updated when installing DKCloudCommand.
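The installation step itself can be sketched as follows, to be run inside the activated virtual environment. This is an illustration under assumptions rather than an official procedure: the package name `DKCloudCommand` is assumed to match the project's PyPI listing, and `install_dkcloudcommand` and `dkcloudcommand_installed` are hypothetical helper names. Calling pip as `python3 -m pip` keeps the installation tied to the environment's own interpreter.

```shell
# Assumed PyPI package name; confirm it against the PyPI listing.
DK_PACKAGE="DKCloudCommand"

# Hypothetical helper: install or upgrade the package, along with its
# Python requirements, using the pip that belongs to the active python3.
install_dkcloudcommand() {
  python3 -m pip install --upgrade "$DK_PACKAGE"
}

# Hypothetical helper: succeed only if the package is present in the
# current environment.
dkcloudcommand_installed() {
  python3 -m pip show "$DK_PACKAGE" > /dev/null 2>&1
}
```

After activating the DataKitchen environment, run `install_dkcloudcommand`, then `dkcloudcommand_installed && echo "DKCloudCommand is installed"` to confirm the result.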
A number of high-quality open-source tools are available to assist with building and editing data analytics pipelines. DataKitchen's flexibility as a platform allows it to work well with any tools in your existing toolchain. In this guide we'll make use of the open-source tools highlighted below:
PyCharm Community Edition (Multi-Platform)
DBeaver is open source, can be downloaded without network-admin privileges, and automatically downloads and manages required database drivers.
SQL Workbench (Multi-Platform)
SQL Workbench is open source.
Configuring Visual File Diff & Merge Tools with DKCloudCommand
Configuration for file diff and merge tools is documented in the next section.