Unique Constraint
The UNIQUE
constraint ensures that a column in a table contains no duplicate values. Like, if you add an UNIQUE
constraint to a column, then no two rows can have the exact/same value for that column.
The documentation is more nuanced.
A
UNIQUE
constraint is similar to aPRIMARY KEY
constraint, except that a single table may have any number ofUNIQUE
constraints. For eachUNIQUE
constraint on the table, each row must contain a unique combination of values in the columns identified by theUNIQUE
constraint. For the purposes ofUNIQUE
constraints,NULL
values are considered distinct from all other values, including other NULLs. As withPRIMARY KEY
s, aUNIQUE
table-constraint clause must contain only column names — the use of expressions in an indexed-column of aUNIQUE
table-constraint is not supported.
There are three points to note, I'll simplify the points:
- If you add an
UNIQUE
constraint to a column, then no two rows can have the exact/same value for that column in that table. - The
UNIQUE
constraint considersNULL
values as distinct i.e. it cannot distinguish between two rows havingNULL
values, hence two rows will pass theUNIQUE
constraint. (Better to useUNIQUE
andNOT NULL
) - The
UNIQUE
constraint is tied to one or more columns, so you cannot add custom expressions to theUNIQUE
constraint. (Better to useUNIQUE INDEX
to create custom expression-based uniqueness or index for querying data.)
Adding a UNIQUE constraint
To add a unique constraint to a column, simply add the UNIQUE
keyword to the column definition:
Let's keep the table simple with just three fields: name
, email
, and age
.
CREATE TABLE users (
name TEXT,
email TEXT UNIQUE,
age INTEGER CHECK (age > 0)
);
Now, lets insert a few records
INSERT INTO users (name, email, age) VALUES('alice', 'alice@wonderland.com', 12);
INSERT INTO users (name, email, age) VALUES('bob', 'bob@wonderland.com', 13);
Now, lets try to insert a record with the existing email value:
INSERT INTO users (name, email, age) VALUES('charlie', 'alice@wonderland.com', 14);
This will immediately fail the UNIQUE
constraint on the email, and the record will not be inserted.
sqlite> INSERT INTO users (name, email, age) VALUES('charlie', 'alice@wonderland.com', 14);
Runtime error: UNIQUE constraint failed: users.email (19)
Notice, as the constraint is only on the email column, it's not on the name or age columns.
INSERT INTO users (name, email, age) VALUES('alice', 'charlie@wonderland.com', 12);
This will succeed, as the UNIQUE
constraint is only on the email column. Hence, we can conclude that the UNIQUE
constraint is on a column level when stated next to the column while creating the table.
SELECT * FROM users;
+-------+------------------------+-----+
| name | email | age |
+-------+------------------------+-----+
| alice | alice@wonderland.com | 12 |
| bob | bob@wonderland.com | 13 |
| alice | charlie@wonderland.com | 12 |
+-------+------------------------+-----+
sqlite>
Let's understand the problem first with NULL
values in UNIQUE constraint.
Adding a UNIQUE constraint with NULL values
What if I insert a record with a NULL
value for the email
column?
INSERT INTO users (name, email, age) VALUES('charlie', NULL, 14);
--OR
INSERT INTO users (name, age) VALUES('charlie', 14);
That went in, it created the record, which is fine, so far, as there was no previous entry with NULL
email in the table, or rather the email
column.
SELECT * FROM users;
However, what happens if we try to insert another record with a NULL
value for the email
column?
INSERT INTO users (name, email, age) VALUES('dave', NULL, 15);
-- OR
INSERT INTO users (name, age) VALUES('dave', 15);
This goes in as well, Hhh? It shouldn't have right? If alice@wonderland.com
is not allowed twice then why should NULL
be allowed?
That's what the documentation clearly stated:
For the purposes of
UNIQUE
constraints,NULL
values are considered distinct from all other values, including other NULLs.
That is clear as crystal, SQLite won't distinguish between two NULL
values in the email
column. Hence, we can't have duplicate NULL
values in the email
column.
For that we need to set email as NOT NULL
as well as UNIQUE
inorder to avoid populating NULL
values in the email
column.
Adding UNIQUE Constraint on multiple columns
If you wondered, can we add unique constraint on multiple columns? Yes, we can and there are subtle variations on how you interpret those as, multiple as in two or more columns unique independently or the combination of two or more columns being unique.
- Unique constraint independent of each other
- Combinational Unique constraint
Unique constraint independent of each other
Let's say we want to add constraint to the user table, which could have more than one column that needs to be unique, like example, the phone_number
for two users cannot be same, and so on.
DROP TABLE users;
CREATE TABLE users (
name TEXT,
email TEXT UNIQUE,
phone_number TEXT UNIQUE,
government_id TEXT UNIQUE,
age INTEGER CHECK (age > 0),
bio TEXT
);
In the above table schema, we have three unique constraints, email
, phone_number
, and government_id
. Each unique constraint is independent of each other. Let's take a look at a few cases in order to understand it better.
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('bob', 'bob@wonderland.com', '1234567891', 'DEF456', 13, 'I am bob');
SELECT * FROM users;
That is the simple case, However what would happen if one user has the same phone_number
, and government_id
but different email
?
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'new_alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
sqlite> INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'new_alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
Runtime error: UNIQUE constraint failed: users.government_id (19)
Nope, it is independent of each constraint, no two rows can have the same government_id
or phone_number
and even the email
column.
Now, what if we want to allow phone_number
, government_id
and email_id
to have a unique combination? Like the three of the columns together should be unique across the table, now with that they won't be unique across the single column, they are combined.
Combinational Unique constraint
In a combinational unique constraint, we can add two or more columns like UNIQUE(colum_1, column_2, ... column_n)
as a combinational constraint.
In this type of constraint:
- Not necessarily each column needs to be unique, the combination of the
N
columns should be unique
DROP TABLE users;
CREATE TABLE users (
name TEXT,
email TEXT NOT NULL,
phone_number TEXT NOT NULL,
government_id TEXT NOT NULL,
age INTEGER CHECK (age > 0),
bio TEXT,
UNIQUE(email, phone_number, government_id)
);
-- OR
-- CREATE TABLE users(
-- name TEXT,
-- email TEXT NOT NULL,
-- phone_number TEXT NOT NULL,
-- age INTEGER CHECK (age > 0),
-- bio TEXT,
-- UNIQUE(email, phone_number)
-- );
I am just making it up, its not secure, like combination of email
, phone_number
and government_id
is not secure, you can debate about that, but it's not a data-integrity and schema-design class, this post is about understanding the UNIQUE
column constraint with combination of two or more columns.
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('bob', 'bob@wonderland.com', '1234567891', 'DEF456', 13, 'I am bob');
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'new_alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
SELECT * FROM users;
That works! We have 1234567890
as the phone_number
and ABC123
as the government_id
which is same, but email
is different. So, the combination of email
, phone_number
and government_id
is unique.
sqlite> SELECT * FROM users;
+-------+--------------------------+--------------+---------------+-----+------------+
| name | email | phone_number | government_id | age | bio |
+-------+--------------------------+--------------+---------------+-----+------------+
| alice | alice@wonderland.com | 1234567890 | ABC123 | 12 | I am alice |
| bob | bob@wonderland.com | 1234567891 | DEF456 | 13 | I am bob |
| alice | new_alice@wonderland.com | 1234567890 | ABC123 | 12 | I am alice |
+-------+--------------------------+--------------+---------------+-----+------------+
sqlite> INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('bob', 'new_alice@wonderland.com', '0987654321', 'ABC123', 12, 'I am alice');
sqlite> SELECT * FROM users;
+-------+--------------------------+--------------+---------------+-----+------------+
| name | email | phone_number | government_id | age | bio |
+-------+--------------------------+--------------+---------------+-----+------------+
| alice | alice@wonderland.com | 1234567890 | ABC123 | 12 | I am alice |
| bob | bob@wonderland.com | 1234567891 | DEF456 | 13 | I am bob |
| alice | new_alice@wonderland.com | 1234567890 | ABC123 | 12 | I am alice |
| bob | new_alice@wonderland.com | 0987654321 | ABC123 | 12 | I am alice |
+-------+--------------------------+--------------+---------------+-----+------------+
sqlite>
INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('alice', 'new_alice@wonderland.com', '1234567890', 'ABC123', 12, 'I am alice');
sqlite> INSERT INTO users (name, email, phone_number, government_id, age, bio)
VALUES('bob', 'new_alice@wonderland.com', '0987654321', 'ABC123', 12, 'I am alice');
Runtime error: UNIQUE constraint failed: users.email, users.phone_number, users.government_id (19)
sqlite>
Now this will fail, as the email
, phone_number
, government_id
combination is already unique.
Primary key vs Unique Constraint
Both of these do the same thing, but UNIQUE
can be added to multiple columns however, the PRIMARY KEY
should only be one, but it can contain multiple columns. In other words, UNIQUE constraint is column-level, however each table can only have one PRIMARY KEY
that will be used to indentify the uniqueness among the rows.
PRIMARY KEY = UNIQUE + NOT NULL
Primary key is also a shorthand for UNIQUE
and NOT NULL
, if you add a PRIMARY KEY
to a column or combination of columns for a table, you needn't specify UNIQUE
and NOT NULL
, it already does that for you.
So, that is the general difference of UNIQUE
and PRIMARY KEY
, we'll explore the PRIMARY KEY
in later posts.