Miscellaneous

Storing JSON data in the Oracle database

In this posting  you learn how to:

  • Store JSON data in the Oracle Database
  • Pick a good data type for your JSON data
  • Identify rows and columns  that contain JSON data
  • Ensure that only JSON data can be inserted into a table

Find all columns that store JSON data

Prerequisites:

  • You need the Oracle Database Release 12.1.0.2 or later
  • Ensure that the COMPATIBLE init.ora parameter is set to 12.1.0.2 or later (you may see an error message otherwise)

To learn about JSON visit [http://json.org/ or http://en.wikipedia.org/wiki/JSON]

Oracle stores JSON data using existing data types. This allows you to choose the best data type for you use case and use existing clients. Also you get cross functional completeness right away (import export, partitioning, replication, etc). JSON can be stored as VARCHAR2 (up to 32767 bytes), CLOB and BLOB.
CLOB and BLOB have no length limitations.

Internally, CLOB encodes characters as UCS2 (similar to UTF16) which means every character takes up two bytes. BLOB does not perform such re-encoding but instead stores the Unicode (UTF8) bytes unmodified – thus requiring half the storage size for ASCII characters, and half the IO to load it.

For this example we use VARCHAR2 as data type.
In our example we store a list of color names and their codes.

 CREATE TABLE colorTab (
id NUMBER,
color VARCHAR2(32767)
); 

(In order to allow the 32k length for VARCHAR2 the init.ora
parameter MAX_STRING_SIZE has to be set to EXTENDED. See [https://docs.oracle.com/database/121/REFRN/refrn10321.htm#REFRN10321].
Or replace 32767 with 4000).

Let’s insert a few rows:

INSERT INTO colorTab VALUES(1, ‘
{
“color”: “black”,
“rgb”: [0,0,0],
“hex”: “#000000”
}
‘);

INSERT INTO colorTab VALUES(2, ‘
{
“color”: “orange red”,
“rgb”: [255,69,0],
“hex”: “#FF4500”
}
‘);

INSERT INTO colorTab VALUES(3, ‘
{
color: “gold”,
“rgb”: [255,215,0],
“hex”: “#FFD700 ”
}
‘);

INSERT INTO colorTab VALUES(4, ‘I am not valid JSON’);

COMMIT;

As one can see the last insert statement inserts data that is not syntactically correct JSON.

How can we identify such a rows?
And how can we prevent invalid JSON from being inserted
in the first place?

This brings us to the first new JSON operator: ‘IS JSON’. It
allows us to filter column values.

SELECT id, color FROM colorTab WHERE color IS JSON;

(Instead of ‘IS JSON’ one can use ‘IS NOT JSON’ to find all rows which are not JSON.)

If you look carefully at the third row (gold) you’ll see that the field name ‘color’ is not inside quotes. Technically, this makes the data invalid JSON because key names need to be enquoted but not everyone follows that rule. The IS JSON operator still accepts this row as JSON because our JSON parser by default runs in a tolerant mode. But not every JSON tool accepts unquoted names (try out http://jsonlint.com/).

So for JSON you’re sending to others you may want to be extra nice and make sure it complies to all rules. You can do this by running IS JSON in a strict mode:

SELECT id, color FROM colorTab WHERE color IS JSON STRICT;

The JSON specification (unfortunately) says nothing about duplicate key names within the same object. For instance this is technically valid JSON although it has two keys with the name name.

{
color“: “black”,
“rgb”: [0,0,0],
“hex”: “#000000”,
color“: “white”
}

Again, try this out in  http://jsonlint.com/,

In our experience duplicate key names occur vary rarely but if they do they are big troublemakers. So how can we avoid them? The clause ‘WITH UNIQUE KEYS’ checks that keys are not repeated within the same object level (keys with the same name can occur at multiple places in the data but not in the same object).

SELECT id, color FROM colorTab WHERE color IS JSON STRICT WITH UNIQUE KEYS;

(Checking for unique keys comes at a price. It slows down IS JSON because names have to me memorized and checked for duplicates. So use it carefully!)

So now that we understand IS JSON how can we use it to prevent non-JSON data to be inserted into our table in the first place?
The answer is a check constraint. Lets add a constraint to our table:

TRUNCATE TABLE colorTab;

ALTER TABLE colorTab ADD CONSTRAINT ensure_json CHECK (color IS JSON);

Now, when inserting any non-JSON row will be rejected with
ORA-02290: check constraint (SYS.ENSURE_JSON) violated

Also, since we have added a IS JSON check constraint the table/column will show up in a new dictionary view:

SELECT * FROM USER_JSON_COLUMNS;

TABLE_NAME COLUMN_NAME FORMAT DATA_TYPE
COLORTAB COLOR TEXT VARCHAR2

The view’s are named USER_JSON_COLUMNS, DBA_JSON_COLUMNS and ALL_JSON_COLUMNS. They rely on the presence of an IS JSON check constraint.

This is the first posting on this blog about JSON. You’ll find the JSON Developer’s Guide here: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adjsn/toc.htm

Source link

Leave a Reply

Your email address will not be published. Required fields are marked *