Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs for json_table #23761

Merged
merged 1 commit into from
Jan 17, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
234 changes: 234 additions & 0 deletions docs/src/main/sphinx/functions/json.md
Original file line number Diff line number Diff line change
Expand Up @@ -1138,6 +1138,240 @@ FROM customers
| 102 | 'missing' |
| 103 | 'missing' |


(json-table)=
## json_table

The `json_table` clause extracts a table from a JSON value. Use this clause to
transform JSON data into a relational format, making it easier to query and
analyze. Use `json_table` in the `FROM` clause of a
[`SELECT`](select-json-table) statement to create a table from JSON data.

```text
JSON_TABLE(
json_input,
json_path [ AS path_name ]
[ PASSING value AS parameter_name [, ...] ]
COLUMNS (
column_definition [, ...] )
[ PLAN ( json_table_specific_plan )
| PLAN DEFAULT ( json_table_default_plan ) ]
[ { ERROR | EMPTY } ON ERROR ]
)
```

The `COLUMNS` clause supports the following `column_definition` arguments:

```text
column_name FOR ORDINALITY
| column_name type
[ FORMAT JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ] ]
[ PATH json_path ]
[ { WITHOUT | WITH { CONDITIONAL | UNCONDITIONAL } } [ ARRAY ] WRAPPER ]
[ { KEEP | OMIT } QUOTES [ ON SCALAR STRING ] ]
[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT expression } ON EMPTY ]
michaeleby1 marked this conversation as resolved.
Show resolved Hide resolved
[ { ERROR | NULL | DEFAULT expression } ON ERROR ]
| NESTED [ PATH ] json_path [ AS path_name ] COLUMNS ( column_definition [, ...] )
```

`json_input` is a character string or a binary string. It must contain a single
JSON item.

`json_path` is a string literal containing the path mode specification and the
path expression. It follows the syntax rules described in
[](json-path-syntax-and-semantics).

```text
'strict ($.price + $.tax)?(@ > 99.9)'
'lax $[0 to 1].floor()?(@ > 10)'
```

In the `PASSING` clause, pass values as named parameters that the `json_path`
expression can reference.

```text
PASSING orders.totalprice AS o_price,
orders.tax % 10 AS o_tax
```

Use named parameters to reference the values in the path expression. Prefix
named parameters with `$`.

```text
'lax $?(@.price > $o_price || @.tax > $o_tax)'
```

You can also pass JSON values in the `PASSING` clause. Use `FORMAT JSON` to
specify the format and `ENCODING` to specify the encoding:

```text
PASSING orders.json_desc FORMAT JSON AS o_desc,
michaeleby1 marked this conversation as resolved.
Show resolved Hide resolved
orders.binary_record FORMAT JSON ENCODING UTF16 AS o_rec
```

The `json_path` value is case-sensitive. The SQL identifiers are uppercase. Use
michaeleby1 marked this conversation as resolved.
Show resolved Hide resolved
quoted identifiers in the `PASSING` clause:

```text
'lax $.$KeyName' PASSING nation.name AS KeyName --> ERROR; no passed value found
'lax $.$KeyName' PASSING nation.name AS "KeyName" --> correct
```

The `PLAN` clause specifies how to join columns from different paths. Use
`OUTER` or `INNER` to define how to join parent paths with their child paths.
Use `CROSS` or `UNION` to join siblings.

`COLUMNS` defines the schema of your table. Each `column_definition` specifies
how to extract and format your `json_input` value into a relational column.

`PLAN` is an optional clause to control how to process and join nested JSON
data.
michaeleby1 marked this conversation as resolved.
Show resolved Hide resolved

`ON ERROR` specifies how to handle processing errors. `ERROR ON ERROR` throws an
error. `EMPTY ON ERROR` returns an empty result set.

`column_name` specifies a column name.

`FOR ORDINALITY` adds a row number column to the output table, starting at `1`.
Specify the column name in the column definition:

```text
row_num FOR ORDINALITY
```

`NESTED PATH` extracts data from nested levels of a `json_input` value. Each
`NESTED PATH` clause can contain `column_definition` values.

The `json_table` function returns a result set that you can use like any other
table in your queries. You can join the result set with other tables or
combine multiple arrays from your JSON data.

You can also process nested JSON objects without parsing the data multiple
times.

Use `json_table` as a lateral join to process JSON data from another table.

### Examples

The following query uses `json_table` to extract values from a JSON array and
return them as rows in a table with three columns:

```sql
SELECT
mosabua marked this conversation as resolved.
Show resolved Hide resolved
*
FROM
json_table(
'[
{"id":1,"name":"Africa","wikiDataId":"Q15"},
{"id":2,"name":"Americas","wikiDataId":"Q828"},
{"id":3,"name":"Asia","wikiDataId":"Q48"},
{"id":4,"name":"Europe","wikiDataId":"Q51"}
]',
'strict $' COLUMNS (
NESTED PATH 'strict $[*]' COLUMNS (
id integer PATH 'strict $.id',
name varchar PATH 'strict $.name',
wiki_data_id varchar PATH 'strict $."wikiDataId"'
)
)
);
```

| id | child | wiki_data_id |
| -- | --------- | ------------- |
| 1 | Africa | Q1 |
| 2 | Americas | Q828 |
| 3 | Asia | Q48 |
| 4 | Europe | Q51 |

The following query uses `json_table` to extract values from an array of nested
JSON objects. It flattens the nested JSON data into a single table. The example
query processes an array of continent names, where each continent contains an
array of countries and their populations.

The `NESTED PATH 'lax $[*]'` clause iterates through the continent objects,
while the `NESTED PATH 'lax $.countries[*]'` iterates through each country
within each continent. This creates a flat table structure with four rows
combining each continent with each of its countries. Continent values repeat for
each of their countries.

```sql
SELECT
*
FROM
json_table(
'[
{"continent": "Asia", "countries": [
{"name": "Japan", "population": 125.7},
{"name": "Thailand", "population": 71.6}
]},
{"continent": "Europe", "countries": [
{"name": "France", "population": 67.4},
{"name": "Germany", "population": 83.2}
]}
]',
'lax $' COLUMNS (
NESTED PATH 'lax $[*]' COLUMNS (
continent varchar PATH 'lax $.continent',
NESTED PATH 'lax $.countries[*]' COLUMNS (
country varchar PATH 'lax $.name',
population double PATH 'lax $.population'
)
)
));
```

| continent | country | population |
| ---------- | --------- | ------------- |
| Asia | Japan | 125.7 |
| Asia | Thailand | 71.6 |
| Europe | France | 67.4 |
| Europe | Germany | 83.2 |

The following query uses `PLAN` to specify an `OUTER` join between a parent path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the next two examples use and empty array of input and the whole syntax is not really explained I think we either have to update these now .. or revisit in a follow up PR (I prefer that so we can ship this PR) .. but as it stands I think the two examples are basically a headscratcher for the reader and barely useful unless you know and understand already.

and a child path:

```sql
SELECT
*
FROM
JSON_TABLE(
'[]',
'lax $' AS "root_path"
COLUMNS(
a varchar(1) PATH 'lax "A"',
NESTED PATH 'lax $[*]' AS "nested_path"
COLUMNS (b varchar(1) PATH 'lax "B"'))
PLAN ("root_path" OUTER "nested_path")
);
```

| a | b |
| ---- | ---- |
| A | null |

The following query uses `PLAN` to specify an `INNER` join between a parent path
and a child path:

```sql
SELECT
*
FROM
JSON_TABLE(
'[]',
'lax $' AS "root_path"
COLUMNS(
a varchar(1) PATH 'lax "A"',
NESTED PATH 'lax $[*]' AS "nested_path"
COLUMNS (b varchar(1) PATH 'lax "B"'))
PLAN ("root_path" INNER "nested_path")
);
```

| a | b |
| ---- | ---- |
| null | null |

(json-array)=
## json_array

Expand Down
7 changes: 7 additions & 0 deletions docs/src/main/sphinx/sql/select.md
Original file line number Diff line number Diff line change
Expand Up @@ -1277,6 +1277,13 @@ LEFT JOIN UNNEST(checkpoints) AS t(checkpoint) ON TRUE;

Note that in case of using `LEFT JOIN` the only condition supported by the current implementation is `ON TRUE`.

(select-json-table)=
## JSON_TABLE

`JSON_TABLE` transforms JSON data into a relational table format. Like `UNNEST`
and `LATERAL`, use `JSON_TABLE` in the `FROM` clause of a `SELECT` statement.
For more information, see [`JSON_TABLE`](json-table).

## Joins

Joins allow you to combine data from multiple relations.
Expand Down
Loading