Skip to main content
Skip to main content
Edit this page

Handling other formats

Earlier examples of loading JSON data assume the use of JSONEachRow (ndjson). We provide examples of loading JSON in other common formats below.

Array of JSON objects

One of the most popular forms of JSON data is having a list of JSON objects in a JSON array, like in this example:

Let’s create a table for this kind of data:

To import a list of JSON objects, we can use a JSONEachRow format (inserting data from list.json file):

We have used a FROM INFILE clause to load data from the local file, and we can see the import was successful:

Handling NDJSON (line delimited JSON)

Many apps can log data in JSON format so that each log line is an individual JSON object, like in this file:

The same JSONEachRow format is capable of working with such files:

JSON object keys

In some cases, the list of JSON objects can be encoded as object properties instead of array elements (see objects.json for example):

ClickHouse can load data from this kind of data using the JSONObjectEachRow format:

Specifying parent object key values

Let’s say we also want to save values in parent object keys to the table. In this case, we can use the following option to define the name of the column we want key values to be saved to:

Now, we can check which data is going to be loaded from the original JSON file using file() function:

Note how the id column has been populated by key values correctly.

JSON Arrays

Sometimes, for the sake of saving space, JSON files are encoded in arrays instead of objects. In this case, we deal with a list of JSON arrays:

In this case, ClickHouse will load this data and attribute each value to the corresponding column based on its order in the array. We use JSONCompactEachRow format for this:

Importing individual columns from JSON arrays

In some cases, data can be encoded column-wise instead of row-wise. In this case, a parent JSON object contains columns with values. Take a look at the following file:

ClickHouse uses the JSONColumns format to parse data formatted like that:

A more compact format is also supported when dealing with an array of columns instead of an object using JSONCompactColumns format:

Saving JSON objects instead of parsing

There are cases you might want to save JSON objects to a single String (or JSON) column instead of parsing it. This can be useful when dealing with a list of JSON objects of different structures. Let's take this file where we have multiple different JSON objects inside a parent list:

We want to save original JSON objects into the following table:

Now we can load data from the file into this table using JSONAsString format to keep JSON objects instead of parsing them:

And we can use JSON functions to query saved objects:

Note that JSONAsString works perfectly fine in cases we have JSON object-per-line formatted files (usually used with JSONEachRow format).

Schema for nested objects

In cases when we're dealing with nested JSON objects, we can additionally define schema and use complex types (Array, Object Data Type or Tuple) to load data:

Accessing nested JSON objects

We can refer to nested JSON keys by enabling the following settings option:

This allows us to refer to nested JSON object keys using dot notation (remember to wrap those with backtick symbols to work):

This way we can flatten nested JSON objects or use some nested values to save them as separate columns.

Skipping unknown columns

By default, ClickHouse will ignore unknown columns when importing JSON data. Let’s try to import the original file into the table without the month column:

We can still insert the original JSON data with 3 columns into this table:

ClickHouse will ignore unknown columns while importing. This can be disabled with the input_format_skip_unknown_fields settings option:

ClickHouse will throw exceptions in cases of inconsistent JSON and table columns structure.

BSON

ClickHouse allows exporting to and importing data from BSON encoded files. This format is used by some DBMSs, e.g. MongoDB database.

To import BSON data, we use the BSONEachRow format. Let’s import data from this BSON file:

We can also export to BSON files using the same format:

After that, we’ll have our data exported to the out.bson file.