DataKitchen DataOps Documentation

DataMapper Nodes

Nodes that map data from Data Sources to Data Sinks.

This node type moves data from a data source to a data sink by "mapping" the data between the two locations. While the data is in transit, the node can also perform an operation on it or transform it.

Description.json

{
   "type" : "DKNode_DataMapper",
   "description": "[YOUR DESCRIPTION HERE]"
}

Mapping Arbitrary Lists of Files Using Wildcards

Every DataMapper node contains a list of mappings between the Data Sources and Data Sinks configured for the node. Each mapping references a source name and a sink name, along with the specific keys to be mapped between them, since each source or sink may contain multiple keys.

With wildcards, every key in a data source automatically creates an association in the data sink.
For more details, see the wildcard configuration for data sources and data sinks.
The mappings field can also contain other explicitly defined mappings; in that case, both the wildcard mappings and the explicit mappings apply.

{
  "mappings": {
    "orders": {
      "source-name": "orders_source",
      "source-key": "daily",
      "sink-name": "orders_target",
      "sink-key": "daily"
    },
    "invoices": {
      "source-name": "invoices_source",
      "source-key": "daily",
      "sink-name": "invoices_target",
      "sink-key": "daily"
    }
  }
}
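As a rough illustration of the mapping structure above, the following Python sketch checks that each entry under "mappings" carries the four fields shown in the example. Treating all four fields as required is an assumption made for this sketch, and `find_incomplete_mappings` is a hypothetical helper, not part of the DataKitchen platform.

```python
# Fields every explicit mapping is ASSUMED to need (taken from the example above).
REQUIRED_FIELDS = ("source-name", "source-key", "sink-name", "sink-key")


def find_incomplete_mappings(node_config):
    """Return {mapping name: [missing fields]} for mappings that lack a field."""
    problems = {}
    for name, mapping in node_config.get("mappings", {}).items():
        missing = [field for field in REQUIRED_FIELDS if field not in mapping]
        if missing:
            problems[name] = missing
    return problems


config = {
    "mappings": {
        "orders": {
            "source-name": "orders_source",
            "source-key": "daily",
            "sink-name": "orders_target",
            "sink-key": "daily",
        },
        "invoices": {
            "source-name": "invoices_source",
            "source-key": "daily",
            "sink-name": "invoices_target",
            "sink-key": "daily",
        },
    }
}

# An empty result means every mapping names both ends of the transfer.
print(find_incomplete_mappings(config))
```

A check like this could be run before committing a node's configuration to catch a mapping that names a source but no sink.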

The following configuration convention is used when mapping an undefined number of files:

{
  "name" : "move-converted-orders",
  "wildcard-will-automatically-create-mappings":  [
    {
      "data-source": "converted-orders",
      "data-sink": "store-sftp"
    }
  ],  
  "mappings": {}   
}
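One way to picture the wildcard convention above is as an expansion step that turns each data-source/data-sink pair into one mapping per matched key. The Python sketch below is only a conceptual model, not the platform's implementation: `expand_wildcard_mappings`, the `source_keys` input, and the generated mapping names are all hypothetical.

```python
def expand_wildcard_mappings(node_config, source_keys):
    """Build per-key mappings from the wildcard pairs in a node configuration.

    source_keys is a hypothetical input: {data source name: [keys that the
    source's wildcard matched]}. Explicit mappings are kept, so both apply.
    """
    mappings = dict(node_config.get("mappings", {}))
    pairs = node_config.get("wildcard-will-automatically-create-mappings", [])
    for pair in pairs:
        source, sink = pair["data-source"], pair["data-sink"]
        for key in source_keys.get(source, []):
            # The generated name is illustrative; setdefault keeps any
            # explicit mapping that already uses the same name.
            mappings.setdefault(
                f"{source}:{key}",
                {
                    "source-name": source,
                    "source-key": key,
                    "sink-name": sink,
                    "sink-key": key,
                },
            )
    return mappings


node = {
    "name": "move-converted-orders",
    "wildcard-will-automatically-create-mappings": [
        {"data-source": "converted-orders", "data-sink": "store-sftp"}
    ],
    "mappings": {},
}

expanded = expand_wildcard_mappings(
    node, {"converted-orders": ["orders-1.csv", "orders-2.csv"]}
)
for name, mapping in expanded.items():
    print(name, "->", mapping["sink-name"], mapping["sink-key"])
```

The sketch reuses the source key as the sink key, which mirrors the statement that every key in the data source creates an association in the data sink.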

Properties

| Field | Description | Required |
| --- | --- | --- |
| name | | |
| source-name | | |
| source-key | | |
| sink-name | | |
| sink-key | | |
| data-source | | |
| data-sink | | |
| unzip-file | Tells the mapper to unzip the file picked up from the data source. | optional, default false |
| gunzip-file | Tells the mapper to apply gzip decompression to the file picked up from the data source. | optional, default false |
| bunzip2-file | Tells the mapper to apply bzip2 decompression to the file picked up from the data source. | optional, default false |
| gzip-file | Tells the mapper to apply gzip compression to the files sent to the data sink. | optional, default false |
| do-sed-command | Uses the Unix sed command to process the file as it passes from the data source to the data sink. This field takes a valid sed expression as input. | optional |
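To make the effect of the compression properties listed above concrete, the following Python sketch applies the equivalent standard-library transformations to a file's bytes. It is an assumption-laden illustration: `apply_file_options` is a hypothetical function, the order of operations (decompress first, compress last) is assumed, and unzip-file and do-sed-command are left out.

```python
import bz2
import gzip


def apply_file_options(data: bytes, options: dict) -> bytes:
    """Apply the decompression/compression flags to a file's contents.

    All flags default to False, matching the documented defaults.
    """
    if options.get("gunzip-file", False):
        data = gzip.decompress(data)  # source file arrives gzip-compressed
    if options.get("bunzip2-file", False):
        data = bz2.decompress(data)   # source file arrives bzip2-compressed
    if options.get("gzip-file", False):
        data = gzip.compress(data)    # compress before writing to the sink
    return data


payload = b"id,total\n1,9.50\n"

# A gzip-compressed source file is delivered uncompressed to the sink.
delivered = apply_file_options(gzip.compress(payload), {"gunzip-file": True})
print(delivered == payload)
```

With no flags set, the bytes pass through unchanged, which corresponds to the documented default of false for each property.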

Examples

Example 1

{
    "type": "DKNode_DataMapper",
    "description": ""
}
{
    "metadata": {
        "name": "mapper"
    },
    "wildcard-will-automatically-create-mappings": [
        {
            "data-source": "sftp_source",
            "data-sink": "s3_sink"
        }
    ],
    "mappings": {}
}
{
    "type": "DKDataSource_SFTP",
    "name": "sftp_datasource",
    "username": "exos",
    "hostname": "#{vault://dk-sftp-03/host}",
    "port": 22,
    "wildcard": "i\\S*txt",
    "wildcard-key-prefix": "sftp/in/",
    "pem_file": "#{vault://dk-sftp-03/exos.pem}",
    "keys": {}
}
{
    "type": "DKDataSink_S3",
    "name": "s3_datasink",
    "public-bucket": false,
    "s3-secret-key": "#{vault://s3_schema/s3-secret-key}",
    "s3-access-key": "#{vault://s3_schema/s3-access-key}",
    "bucket": "#{vault://s3_schema/bucket}",
    "wildcard": "wc-*.txt",
    "wildcard-key-prefix": "wildcard-test",
    "f": "x"
}
{}

(source)

Example 2

{
    "type": "DKNode_DataMapper",
    "description":  ""
}
{
    "mappings": {
        "map1": {
            "source-name": "source",
            "source-key": "key1",
            "sink-name": "sink",
            "sink-key": "key1"
        }
    }
}
{
    "name" : "s3_datasource",
    "type" : "DKDataSource_S3",
    "config" : {{s3config}},
    "keys" : {
        "key1" : {
            "file-key" : "dk-test/bucketgroup1/bucketa/file.csv",
            "use-only-file-key" : true,
            "set-runtime-vars": {
                "md5" : "md5"
            }
        }
    }
}
{
    "name" : "sftp_datasink",
    "type" : "DKDataSink_SFTP",
    "config" : {{sftpconfig}},
    "keys" : {
        "key1" : {
            "file-key" : "sftp-test/file-{{today}}-{{md5}}.csv",
            "use-only-file-key" : true
        }
    } 
}
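In Example 2, the source key stores a runtime variable named md5 and the sink's file-key refers to it as {{md5}} alongside {{today}}. The Python sketch below shows one plausible reading of that flow: hash the file contents, then substitute the variables into the file-key template. The helper `render_file_key`, the sample contents, and the date format used for today are assumptions; the platform's actual templating and the value of {{today}} may differ.

```python
import hashlib


def render_file_key(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value (simplified stand-in
    for the platform's templating, which this sketch does not reproduce)."""
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template


file_contents = b"id,total\n1,9.50\n"  # hypothetical contents of file.csv

runtime_vars = {
    # MD5 hex digest of the source file, as "set-runtime-vars" suggests.
    "md5": hashlib.md5(file_contents).hexdigest(),
    # ASSUMED date format; the source does not define what {{today}} holds.
    "today": "2024-01-31",
}

print(render_file_key("sftp-test/file-{{today}}-{{md5}}.csv", runtime_vars))
```

Embedding the digest in the sink file name gives each delivered file a name that changes whenever its contents change.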

Updated 13 days ago
