File Settings for SQL Connectors

When a recipe node is configured with a SQL database as a source or sink that writes to or reads from a file, the system needs to know the file format and how to map and format the data.

The Node Editor offers settings to define these mappings for container nodes and datamapper nodes.

  • For SQL data sources, find the Output File Settings section when you add a mapping on the Inputs tab of a container node or the Mappings tab of a datamapper node.
  • For SQL data sinks, find the Input File Settings section when you add a mapping on the Outputs tab of a container node or the Mappings tab of a datamapper node.

SQL Data Source to Delimited File Output

For example, in a container node configured with a PostgreSQL database as the data source, a mapping added to the Inputs tab in the Node Editor includes fields that define settings for the source's output file. In this example, the output file is a comma-delimited text file (CSV).

Delimited Text File Settings

For a SQL data source, the platform retrieves data from the database and stores it in Python objects. When the platform writes to a delimited file, it translates the data from those objects according to these settings.

Similarly, when the platform reads data from a delimited file to write to a SQL data sink, it must translate the file's contents back into Python objects.
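
To make this concrete, the sketch below shows the kind of translation these settings drive when a source's output is written to a delimited file. It is illustrative only, not DataKitchen's implementation; the dictionary keys mirror the setting names documented in the table below, except the true/false keys, which are invented here for readability.

```python
# Minimal sketch (not DataKitchen's implementation): serializing one database
# row of Python objects into a delimited line using settings like those below.
from datetime import date, datetime

settings = {                      # hypothetical values mirroring the documented defaults
    "col-delimiter": ",",
    "row-delimiter": "\r\n",
    "date-format": "%Y-%m-%d",
    "datetime-format": "%Y-%m-%d %H:%M:%S",
    "int-format": "%d",
    "decimal-format": "%f",
    "null-value": "",
    "true-value": "True",         # invented key names; the docs list a single
    "false-value": "False",       # bool-format key for the True/False values
}

def to_text(value):
    """Translate a single Python object into its delimited-file representation."""
    if value is None:
        return settings["null-value"]
    if isinstance(value, bool):
        return settings["true-value"] if value else settings["false-value"]
    if isinstance(value, datetime):
        return value.strftime(settings["datetime-format"])
    if isinstance(value, date):
        return value.strftime(settings["date-format"])
    if isinstance(value, int):
        return settings["int-format"] % value
    if isinstance(value, float):
        return settings["decimal-format"] % value
    return str(value)

row = [42, 3.5, date(2020, 8, 27), None, True, "free text"]
line = settings["col-delimiter"].join(to_text(v) for v in row) + settings["row-delimiter"]
# line == "42,3.500000,2020-08-27,,True,free text\r\n"
```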

Field: Insert column names as first row (insert-column-names)
Available for: source output
Description: If checked, the system creates a header row of database column names when writing data to the delimited file.

Field: Skip first row as column names (first-row-column-names)
Available for: sink input
Description: If checked, the system ignores the first line of the delimited file when writing the contents to the SQL database.

Field: File format (format)
Available for: source output and sink input
Description: The file type that the system reads from or writes to.
  • Data sources support binary, delimited text, or JSON output files.
  • Data sinks support binary or delimited text input files.
Note: BigQuery data sinks support delimited text and JSON.

Field: Column Delimiter (col-delimiter)
Available for: source output and sink input
Description: The character used in the delimited file to separate the data fields that correspond to database table columns. The default value , separates fields with commas.

Field: Row Delimiter (row-delimiter)
Available for: source output
Description: The character sequence used in the delimited file to separate the lines that correspond to database table rows. The default value \r\n separates rows with a carriage return followed by a line feed.

Field: Date Format (date-format)
Available for: source output and sink input
Description: Python datetime format for dates in the delimited file. The default value %Y-%m-%d might be written to a file as 2020-08-27. See https://strftime.org/ for syntax.

Field: Datetime Format (datetime-format)
Available for: source output and sink input
Description: Python datetime format for datetime values in the delimited file. The default value %Y-%m-%d %H:%M:%S might be written to a file as 2020-08-27 14:30:22. See https://strftime.org/ for syntax.

Field: Time Format (time-format)
Available for: source output and sink input
Description: Python datetime format for times in the delimited file. The default value %H:%M:%S might be written to a file as 14:30:22. See https://strftime.org/ for syntax.

Field: Integer Format (int-format)
Available for: source output and sink input
Description: Python string format for integers in the delimited file. The default value %d formats integers in decimal notation. See https://pyformat.info/ for the old-style % syntax.

Field: Decimal Format (decimal-format)
Available for: source output and sink input
Description: Python string format for decimal numbers in the delimited file. The default value %f formats floating-point numbers. See https://pyformat.info/ for the old-style % syntax.

Field: Null Value (null-value)
Available for: source output and sink input
Description: How null values from a SQL source are written to a CSV file, and which strings from a CSV are interpreted as null when writing to a SQL sink. The default value is an empty string, but users could specify NULL or None instead.

Field: True Value (bool-format)
Available for: source output and sink input
Description: How a value stored as True in a SQL source is written to a CSV file, and which strings from a CSV are interpreted as True when writing to a SQL sink. The default value is True, but users could specify Yes or 1 instead.

Field: False Value (bool-format)
Available for: source output and sink input
Description: How a value stored as False in a SQL source is written to a CSV file, and which strings from a CSV are interpreted as False when writing to a SQL sink. The default value is False, but users could specify No or 0 instead.
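
The date, time, and number formats above are ordinary Python strftime and old-style % format strings, so the documented defaults can be verified directly in Python:

```python
# Demonstrates the default format strings from the table above.
from datetime import datetime

ts = datetime(2020, 8, 27, 14, 30, 22)
print(ts.strftime("%Y-%m-%d"))            # 2020-08-27            (Date Format)
print(ts.strftime("%Y-%m-%d %H:%M:%S"))   # 2020-08-27 14:30:22   (Datetime Format)
print(ts.strftime("%H:%M:%S"))            # 14:30:22              (Time Format)
print("%d" % 1024)                        # 1024                  (Integer Format)
print("%f" % 3.14159)                     # 3.141590              (Decimal Format)
```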

SQL Connectors that Support Advanced I/O File Settings

The settings above are available for sources and sinks using DB2, Microsoft SQL Server, MySQL, PostgreSQL, Snowflake, Teradata, and Vertica. Note that BigQuery only supports the Column Delimiter setting.
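
For these connectors, the parenthesized names in the table (format, col-delimiter, date-format, and so on) are the setting keys. As a rough sketch only, and assuming the settings are expressed as simple key-value pairs, a source-output mapping might carry delimited-file settings along these lines; the surrounding structure is illustrative, not the exact configuration schema.

```python
# Illustrative only: the keys are the documented setting names, but the
# structure and the "format" value are assumptions, not the real schema.
output_file_settings = {
    "format": "delimited",            # assumed value for a delimited text file
    "insert-column-names": True,      # write a header row of column names
    "col-delimiter": "|",             # pipe-separated columns instead of commas
    "row-delimiter": "\r\n",
    "date-format": "%Y-%m-%d",
    "datetime-format": "%Y-%m-%d %H:%M:%S",
    "time-format": "%H:%M:%S",
    "int-format": "%d",
    "decimal-format": "%f",
    "null-value": "NULL",             # write NULL instead of an empty string
}
```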

Binary File Input to SQL Data Sink

While there are no data mapping settings for binary files used as output from SQL data sources or as input to SQL data sinks, users can select Binary as the file format. Binary files are non-text, machine-readable data files such as executables, configuration files, drivers, and libraries. Application data may also be stored in binary files.

Likewise, there are no settings for JSON output files used with SQL data sources.

For example, in a datamapper node configured with a Microsoft SQL Server database as the data sink, a mapping in the Node Editor includes a section to define settings for the sink's input file.
