Error Code: 22037

PostgreSQL Error 22037: Duplicate JSON Object Keys

📦 PostgreSQL
📋

Description

Error 22037, 'non unique keys in a json object', indicates that a JSON object being processed or stored in PostgreSQL defines the same key more than once. The JSON specification (RFC 8259) recommends that keys within a JSON object be unique, and PostgreSQL raises this error when uniqueness is explicitly enforced, for example via the `WITH UNIQUE KEYS` clause of the SQL/JSON constructors and the `IS JSON` predicate (PostgreSQL 16 and later). The error typically occurs during data insertion, updates, or when JSON functions receive malformed input. Note that a plain cast to `jsonb` does not raise this error; it silently keeps the last value for each duplicated key.
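For illustration, most JSON parsers resolve duplicates silently rather than erroring, which is why bad data often goes unnoticed until PostgreSQL enforces uniqueness. Python's standard parser, for example, keeps the last value:

```python
import json

# Python's json module accepts duplicate keys and silently keeps the
# last value -- the duplicate never raises an error here.
doc = '{"a": 1, "b": 2, "a": 3}'
parsed = json.loads(doc)
print(parsed)  # {'a': 3, 'b': 2}
```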
💬

Error Message

non unique keys in a json object
🔍

Known Causes

3 known causes
⚠️
Malformed JSON Input
An external source or user provides a JSON string where an object explicitly defines the same key multiple times, violating JSON uniqueness rules.
⚠️
Application Logic Error
Application code constructing JSON data (e.g., using `jsonb_build_object` or similar functions) inadvertently creates objects with duplicate keys.
⚠️
Data Transformation Issue
During data import, migration, or transformation processes, source data is mapped or converted to JSON incorrectly, leading to non-unique keys.
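A minimal sketch of how application code can produce such input: building JSON by string concatenation bypasses the deduplication a dict would provide, and a strict parse catches the problem before the string reaches PostgreSQL. (The field names below are illustrative.)

```python
import json

# Each fragment is valid on its own, but naive string concatenation
# repeats "id" in the final object.
base = '"id": 1, "name": "widget"'
extra = '"id": 2, "price": 9.99'  # also carries an "id" field
payload = "{" + base + ", " + extra + "}"

# A strict parse with object_pairs_hook surfaces the duplicate early.
def reject_duplicates(pairs):
    keys = [k for k, _ in pairs]
    dupes = sorted({k for k in keys if keys.count(k) > 1})
    if dupes:
        raise ValueError(f"duplicate JSON keys: {dupes}")
    return dict(pairs)

try:
    json.loads(payload, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate JSON keys: ['id']
```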
🛠️

Solutions

3 solutions available

1. Identify and Remove Duplicate Keys During Data Ingestion medium

Modify your data loading process to detect and remove duplicate JSON keys before inserting into PostgreSQL.

1
When inserting JSON data, preprocess the JSON string to remove duplicate keys. This can be done in your application code or a staging script.
-- A plain cast to jsonb already deduplicates: the last value for each key wins.
SELECT '{"a": 1, "b": 2, "a": 3}'::jsonb;
-- Result: {"a": 3, "b": 2}
-- Example in Python (json.loads keeps the last value for a duplicated key;
-- the object_pairs_hook below makes that behavior explicit):
-- import json
-- def remove_duplicate_keys(json_string):
--     def keep_last(pairs):
--         unique_data = {}
--         for key, value in pairs:
--             unique_data[key] = value
--         return unique_data
--     return json.dumps(json.loads(json_string, object_pairs_hook=keep_last))
-- print(remove_duplicate_keys('{"a": 1, "b": 2, "a": 3}'))  # {"a": 3, "b": 2}
2
Ensure your PostgreSQL table uses the `jsonb` data type for the column storing JSON data. `jsonb` handles duplicate keys by keeping the last occurrence during parsing.
ALTER TABLE your_table ALTER COLUMN your_json_column TYPE jsonb USING your_json_column::jsonb;
3
If you are inserting directly from a file, use a script that cleans the JSON before insertion.
# Run the cleaning script first, then insert the result. (Note: for untrusted
# input, prefer a parameterized client over shell interpolation.)
cleaned=$(python your_cleaning_script.py < input.json)
psql -h your_host -U your_user -d your_db \
  -c "INSERT INTO your_table (your_json_column) VALUES ('$cleaned'::jsonb);"
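A possible implementation of the cleaning step inside your_cleaning_script.py (a placeholder name): keep the last value for any duplicated key, mirroring what a jsonb cast would do.

```python
import json

def clean_json(raw):
    # Keep the last occurrence of each key, like PostgreSQL's jsonb parser.
    def keep_last(pairs):
        out = {}
        for key, value in pairs:
            out[key] = value  # later occurrences overwrite earlier ones
        return out
    return json.dumps(json.loads(raw, object_pairs_hook=keep_last))

print(clean_json('{"a": 1, "b": 2, "a": 3}'))  # {"a": 3, "b": 2}
```

When saved as a script, wrap `clean_json` with `sys.stdin.read()` and `sys.stdout.write()` so it can sit in a shell pipeline.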

2. Clean Existing Data with Duplicate Keys medium

Run a SQL query to identify and correct existing JSON data with non-unique keys.

1
Identify rows whose stored text contains JSON objects with repeated keys. This check only makes sense for `text` or `json` columns: the `jsonb` type resolves duplicates at parse time (keeping the last value), so a `jsonb` column can never contain them.
SELECT id, your_json_column FROM your_table WHERE your_json_column::text ~ '"([^"]+)"\s*:.*"\1"\s*:'; -- Simplified check: PostgreSQL regexes support back references, but this can misfire on nested objects and will not catch all cases.
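Because a regex over serialized JSON can misfire on nested structures, a more reliable offline check is to parse each row's raw text in application code. A sketch in Python:

```python
import json

# Parse the raw column text and report duplicated keys at any nesting
# level; json.loads calls the hook once per object, innermost first.
def find_duplicate_keys(raw_json):
    dupes = []

    def hook(pairs):
        keys = [k for k, _ in pairs]
        dupes.extend(k for k in sorted(set(keys)) if keys.count(k) > 1)
        return dict(pairs)

    json.loads(raw_json, object_pairs_hook=hook)
    return dupes

print(find_duplicate_keys('{"a": 1, "b": {"x": 1, "x": 2}, "a": 3}'))  # ['x', 'a']
```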
2
Normalize the affected rows through a round trip to `jsonb`, which implicitly resolves duplicate keys by keeping the last occurrence. This applies to columns of type `json`, since that type preserves the input text verbatim, duplicates included.
UPDATE your_table SET your_json_column = your_json_column::jsonb::json WHERE your_json_column IS NOT NULL; -- For a text column, cast back with ::jsonb::text instead. Adjust the WHERE clause to limit the update as needed.
3
For more precise control, you can rebuild the JSON object yourself, which is useful if you need a resolution policy other than "keep the last" (e.g., concatenating values) via a custom function. For the standard behavior, the cast to `jsonb` is sufficient.
-- Example of the rebuild (jsonb parsing has already kept the last value
-- for 'a', so jsonb_each sees unique keys):
SELECT jsonb_object_agg(key, value)
FROM jsonb_each('{"a": 1, "b": 2, "a": 3}'::jsonb);
-- Result: {"a": 3, "b": 2}

-- To apply this to a json column in your table (use with caution and backup):
UPDATE your_table
SET your_json_column = (
  SELECT jsonb_object_agg(key, value)  -- keeps the last value when a key repeats
  FROM json_each(your_json_column)
)::json
WHERE json_typeof(your_json_column) = 'object';
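If "keep the last" is not the right policy, the resolution is often easiest to express in application code before the data reaches PostgreSQL. A sketch of one alternative policy (collecting duplicate values into a list) using Python's `object_pairs_hook`:

```python
import json

# Collect values of duplicated keys into a list instead of discarding
# earlier occurrences.
def collect_duplicates(pairs):
    out = {}
    for key, value in pairs:
        if key in out:
            prev = out[key]
            out[key] = (prev if isinstance(prev, list) else [prev]) + [value]
        else:
            out[key] = value
    return out

merged = json.loads('{"a": 1, "b": 2, "a": 3}', object_pairs_hook=collect_duplicates)
print(json.dumps(merged))  # {"a": [1, 3], "b": 2}
```

Note that this sketch conflates a duplicated key with a key whose value is already a list; adapt the policy to your data.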

3. Enforce JSON Schema Validation advanced

Use JSON Schema validation (typically via an extension such as `pg_jsonschema`, or a custom function) to prevent malformed JSON from being inserted.

1
Define a JSON schema that specifies the expected structure, allowed keys, and data types of your JSON objects.
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "integer"}
  },
  "additionalProperties": false
}
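Before wiring the schema into the database, it can be useful to test documents against it from application code. A minimal stdlib-only sketch that mirrors just this schema (a full JSON Schema implementation needs a library such as jsonschema):

```python
import json

# Mirrors the schema above: "name" must be a string, "age" an integer,
# and no other keys are allowed (additionalProperties: false).
ALLOWED = {"name": str, "age": int}

def matches_schema(raw_json):
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        return False
    for key, value in data.items():
        expected = ALLOWED.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly
        if expected is None or isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True

print(matches_schema('{"name": "ada", "age": 36}'))  # True
print(matches_schema('{"name": "ada", "id": 1}'))    # False
```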
2
Create a CHECK constraint on your table that uses a JSON schema validation function (e.g., `jsonb_matches_schema` from the `pg_jsonschema` extension, or a custom function; check your extension's documentation for the exact name and signature).
-- Install the extension if you haven't already:
-- CREATE EXTENSION pg_jsonschema;

ALTER TABLE your_table
ADD CONSTRAINT enforce_json_schema
CHECK (jsonb_matches_schema('your_schema_definition'::json, your_json_column));

-- Note: JSON Schema validation runs after the JSON has been parsed, and the
-- parser has already collapsed duplicate keys (keeping the last value), so a
-- schema cannot detect them. 'additionalProperties: false' prevents
-- unexpected key names, not repeated ones.
3
A uniqueness check is only meaningful against the text-preserving `json` type: by the time data is `jsonb`, the parser has already collapsed duplicates, so such a constraint would always pass. For a `json` column, a custom PL/pgSQL function can compare total and distinct key counts.
CREATE OR REPLACE FUNCTION check_unique_json_keys(json_data json) RETURNS BOOLEAN AS $$
DECLARE
  key_count int;
  distinct_key_count int;
BEGIN
  IF json_typeof(json_data) <> 'object' THEN
    RETURN TRUE; -- Or FALSE if only objects are allowed
  END IF;

  -- json_each preserves duplicate pairs, unlike jsonb_each.
  SELECT count(*) INTO key_count FROM json_each(json_data);
  SELECT count(DISTINCT key) INTO distinct_key_count FROM json_each(json_data);

  RETURN key_count = distinct_key_count; -- Checks top-level keys only.
END;
$$ LANGUAGE plpgsql;

ALTER TABLE your_table
ADD CONSTRAINT enforce_unique_json_keys
CHECK (check_unique_json_keys(your_json_column));
🔗

Related Errors

5 related errors