Commit ca6a239

Add UUIDv8, use the official format for UUIDv7 (#66)
* Add UUIDv8 with nanosecond precision
* Change UUIDv7 to millisecond precision with increased entropy
1 parent a07d329 commit ca6a239

5 files changed

Lines changed: 105 additions & 54 deletions

README.md

Lines changed: 29 additions & 34 deletions
@@ -7,7 +7,7 @@ New time-based UUID formats which are suited for use as a database key.
77
[![Python versions supported](https://img.shields.io/pypi/pyversions/uuid6.svg?logo=python)](https://pypi.org/project/uuid6/)
88
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
99

10-
This module extends immutable UUID objects (the UUID class) with the functions `uuid6()` and `uuid7()` from [the IETF draft][draft repository].
10+
This module extends immutable UUID objects (the UUID class) with the functions `uuid6()`, `uuid7()`, and `uuid8()` from [the IETF draft][draft repository].
1111

1212
## Install
1313

@@ -18,7 +18,7 @@ pip install uuid6
1818
## Usage
1919

2020
```python
21-
from uuid6 import uuid6, uuid7
21+
from uuid6 import uuid6, uuid7, uuid8
2222

2323
my_uuid = uuid6()
2424
print(my_uuid)
@@ -27,8 +27,20 @@ assert my_uuid < uuid6()
2727
my_uuid = uuid7()
2828
print(my_uuid)
2929
assert my_uuid < uuid7()
30+
31+
my_uuid = uuid8()
32+
print(my_uuid)
33+
assert my_uuid < uuid8()
3034
```
3135

36+
## Which UUID version should I use?
37+
38+
> Implementations SHOULD utilize UUID version 7 over UUID version 1 and 6 if possible.
39+
40+
UUID version 7 features a time-ordered value field derived from the widely implemented and well known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded, as well as improved entropy characteristics over versions 1 or 6.
41+
42+
If your use case requires greater granularity than UUID version 7 can provide, you might consider UUID version 8. UUID version 8 doesn't provide as good entropy characteristics as UUID version 7, but it uses a timestamp with nanosecond precision.
43+
3244
## UUIDv6 Field and Bit Layout
3345

3446
```
@@ -47,8 +59,6 @@ assert my_uuid < uuid7()
4759

4860
## UUIDv7 Field and Bit Layout
4961

50-
### [Draft 04][draft 04]
51-
5262
```
5363
0 1 2 3
5464
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -63,7 +73,7 @@ assert my_uuid < uuid7()
6373
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6474
```
6575

66-
### This implementation
76+
## UUIDv8 Field and Bit Layout
6777

6878
```
6979
0 1 2 3
@@ -80,13 +90,13 @@ assert my_uuid < uuid7()
8090
```
8191

8292
- `unix_ts_ms`: 48 bit big-endian unsigned number of Unix epoch timestamp with millisecond level of precision
83-
- `ver`: The 4 bit UUIDv7 version (0111)
93+
- `ver`: The 4 bit UUIDv8 version (1000)
8494
- `subsec_a`: 12 bits allocated to sub-second precision values
8595
- `var`: 2 bit UUID variant (10)
8696
- `subsec_b`: 8 bits allocated to sub-second precision values
8797
- `rand`: The remaining 54 bits are filled with [cryptographically strong random data][python randbits]
8898

89-
20 extra bits dedicated to sub-second precision provide nanosecond resolution. The `unix_ts` and `subsec` fields guarantee the order of UUIDs generated within the same nanosecond by monotonically incrementing the timer.
99+
20 extra bits dedicated to sub-second precision provide nanosecond resolution. The `unix_ts_ms`, `subsec_a`, and `subsec_b` fields guarantee the order of UUIDs generated within the same nanosecond by monotonically incrementing the timer.
90100

91101
## Performance
92102

@@ -96,34 +106,19 @@ Run the shell script [bench.sh][bench] to test on your own machine.
96106

97107
MacBook Air 2020
98108
```
99-
Python 3.10.2
100-
Mean +- std dev: 1.02 us +- 0.01 us
101-
Mean +- std dev: 1.13 us +- 0.02 us
102-
Mean +- std dev: 2.33 us +- 0.02 us
103-
Mean +- std dev: 1.91 us +- 0.02 us
104-
+-----------+---------+-----------------------+-----------------------+-----------------------+
105-
| Benchmark | uuid1 | uuid4 | uuid6 | uuid7 |
106-
+===========+=========+=======================+=======================+=======================+
107-
| timeit | 1.02 us | 1.13 us: 1.11x slower | 2.33 us: 2.29x slower | 1.91 us: 1.87x slower |
108-
+-----------+---------+-----------------------+-----------------------+-----------------------+
109-
```
110-
111-
Google [Cloud Shell][cloud shell] VM
112-
```
113-
Python 3.9.2
114-
Mean +- std dev: 12.6 us +- 0.5 us
115-
Mean +- std dev: 3.06 us +- 0.14 us
116-
Mean +- std dev: 6.42 us +- 0.37 us
117-
Mean +- std dev: 4.94 us +- 0.24 us
118-
+-----------+---------+-----------------------+-----------------------+-----------------------+
119-
| Benchmark | uuid1 | uuid4 | uuid6 | uuid7 |
120-
+===========+=========+=======================+=======================+=======================+
121-
| timeit | 12.6 us | 3.06 us: 4.11x faster | 6.42 us: 1.95x faster | 4.94 us: 2.54x faster |
122-
+-----------+---------+-----------------------+-----------------------+-----------------------+
109+
Python 3.10.4
110+
Mean +- std dev: 870 ns +- 11 ns
111+
Mean +- std dev: 1.17 us +- 0.01 us
112+
Mean +- std dev: 2.18 us +- 0.02 us
113+
Mean +- std dev: 1.60 us +- 0.02 us
114+
Mean +- std dev: 1.78 us +- 0.02 us
115+
+-----------+--------+-----------------------+-----------------------+-----------------------+-----------------------+
116+
| Benchmark | uuid1 | uuid4 | uuid6 | uuid7 | uuid8 |
117+
+===========+========+=======================+=======================+=======================+=======================+
118+
| timeit | 870 ns | 1.17 us: 1.35x slower | 2.18 us: 2.51x slower | 1.60 us: 1.84x slower | 1.78 us: 2.04x slower |
119+
+-----------+--------+-----------------------+-----------------------+-----------------------+-----------------------+
123120
```
124121

125-
[draft repository]: https://github.com/uuid6/uuid6-ietf-draft
126-
[draft 04]: https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04#section-5.2
127-
[cloud shell]: https://cloud.google.com/shell/docs
122+
[draft repository]: https://github.com/ietf-wg-uuidrev/rfc4122bis
128123
[python randbits]: https://docs.python.org/3/library/secrets.html#secrets.randbits
129124
[bench]: https://github.com/oittaa/uuid6-python/blob/main/bench.sh
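
The ordering guarantee described in the README hunk above (bumping the timer when two calls land on the same tick) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's code: `MonotonicTimestamp` is a hypothetical helper name, whereas the module keeps the last issued value in module-level variables such as `_last_v8_timestamp`.

```python
import time


class MonotonicTimestamp:
    """Hypothetical helper: hands out strictly increasing integer timestamps."""

    def __init__(self):
        self._last = None

    def next(self, now=None):
        # Fall back to the wall clock in nanoseconds when no reading is supplied.
        if now is None:
            now = time.time_ns()
        # If the clock did not advance (or went backwards), step one tick past
        # the previously issued value so ordering is preserved.
        if self._last is not None and now <= self._last:
            now = self._last + 1
        self._last = now
        return now
```

One consequence of this approach is that, under a burst of calls within the same tick, the issued timestamps run slightly ahead of the real clock until it catches up.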

bench.sh

Lines changed: 2 additions & 1 deletion
@@ -6,5 +6,6 @@ python -m pyperf timeit -q -o "${TESTDIR}/uuid1.json" -s "import uuid" "uuid.uui
66
python -m pyperf timeit -q -o "${TESTDIR}/uuid4.json" -s "import uuid" "uuid.uuid4()"
77
python -m pyperf timeit -q -o "${TESTDIR}/uuid6.json" -s "import uuid6" "uuid6.uuid6()"
88
python -m pyperf timeit -q -o "${TESTDIR}/uuid7.json" -s "import uuid6" "uuid6.uuid7()"
9-
python -m pyperf compare_to --table "${TESTDIR}/uuid1.json" "${TESTDIR}/uuid4.json" "${TESTDIR}/uuid6.json" "${TESTDIR}/uuid7.json"
9+
python -m pyperf timeit -q -o "${TESTDIR}/uuid8.json" -s "import uuid6" "uuid6.uuid8()"
10+
python -m pyperf compare_to --table "${TESTDIR}/uuid1.json" "${TESTDIR}/uuid4.json" "${TESTDIR}/uuid6.json" "${TESTDIR}/uuid7.json" "${TESTDIR}/uuid8.json"
1011
rm -rf -- "${TESTDIR}"

src/uuid6/__init__.py

Lines changed: 31 additions & 12 deletions
@@ -39,7 +39,7 @@ def __init__(
3939
if not 0 <= int < 1 << 128:
4040
raise ValueError("int is out of range (need a 128-bit value)")
4141
if version is not None:
42-
if not 6 <= version <= 7:
42+
if not 6 <= version <= 8:
4343
raise ValueError("illegal version number")
4444
# Set the variant to RFC 4122.
4545
int &= ~(0xC000 << 48)
@@ -62,6 +62,8 @@ def time(self) -> int:
6262
| (self.time_hi_version & 0x0FFF)
6363
)
6464
if self.version == 7:
65+
return self.int >> 80
66+
if self.version == 8:
6567
return (self.int >> 80) * 10**6 + _subsec_decode(self.subsec)
6668
return super().time
6769

@@ -76,12 +78,13 @@ def _subsec_encode(value: int) -> int:
7678

7779
_last_v6_timestamp = None
7880
_last_v7_timestamp = None
81+
_last_v8_timestamp = None
7982

8083

8184
def uuid6(clock_seq: int = None) -> UUID:
8285
r"""UUID version 6 is a field-compatible version of UUIDv1, reordered for
83-
improved DB locality. It is expected that UUIDv6 will primarily be
84-
used in contexts where there are existing v1 UUIDs. Systems that do
86+
improved DB locality. It is expected that UUIDv6 will primarily be
87+
used in contexts where there are existing v1 UUIDs. Systems that do
8588
not involve legacy UUIDv1 SHOULD consider using UUIDv7 instead.
8689
8790
If 'clock_seq' is given, it is used as the sequence number;
@@ -98,21 +101,20 @@ def uuid6(clock_seq: int = None) -> UUID:
98101
_last_v6_timestamp = timestamp
99102
if clock_seq is None:
100103
clock_seq = secrets.randbits(14) # instead of stable storage
101-
node = secrets.randbits(48)
102104
time_high_and_time_mid = (timestamp >> 12) & 0xFFFFFFFFFFFF
103105
time_low_and_version = timestamp & 0x0FFF
104106
uuid_int = time_high_and_time_mid << 80
105107
uuid_int |= time_low_and_version << 64
106108
uuid_int |= (clock_seq & 0x3FFF) << 48
107-
uuid_int |= node
109+
uuid_int |= secrets.randbits(48)
108110
return UUID(int=uuid_int, version=6)
109111

110112

111113
def uuid7() -> UUID:
112114
r"""UUID version 7 features a time-ordered value field derived from the
113115
widely implemented and well known Unix Epoch timestamp source, the
114116
number of milliseconds seconds since midnight 1 Jan 1970 UTC, leap
115-
seconds excluded. As well as improved entropy characteristics over
117+
seconds excluded. As well as improved entropy characteristics over
116118
versions 1 or 6.
117119
118120
Implementations SHOULD utilize UUID version 7 over UUID version 1 and
@@ -121,16 +123,33 @@ def uuid7() -> UUID:
121123
global _last_v7_timestamp
122124

123125
nanoseconds = time.time_ns()
124-
if _last_v7_timestamp is not None and nanoseconds <= _last_v7_timestamp:
125-
nanoseconds = _last_v7_timestamp + 1
126-
_last_v7_timestamp = nanoseconds
126+
timestamp_ms, _ = divmod(nanoseconds, 10**6)
127+
if _last_v7_timestamp is not None and timestamp_ms <= _last_v7_timestamp:
128+
timestamp_ms = _last_v7_timestamp + 1
129+
_last_v7_timestamp = timestamp_ms
130+
uuid_int = (timestamp_ms & 0xFFFFFFFFFFFF) << 80
131+
uuid_int |= secrets.randbits(76)
132+
return UUID(int=uuid_int, version=7)
133+
134+
135+
def uuid8() -> UUID:
136+
r"""UUID version 8 features a time-ordered value field derived from the
137+
widely implemented and well known Unix Epoch timestamp source, the
138+
number of nanoseconds seconds since midnight 1 Jan 1970 UTC, leap
139+
seconds excluded."""
140+
141+
global _last_v8_timestamp
142+
143+
nanoseconds = time.time_ns()
144+
if _last_v8_timestamp is not None and nanoseconds <= _last_v8_timestamp:
145+
nanoseconds = _last_v8_timestamp + 1
146+
_last_v8_timestamp = nanoseconds
127147
timestamp_ms, timestamp_ns = divmod(nanoseconds, 10**6)
128148
subsec = _subsec_encode(timestamp_ns)
129149
subsec_a = subsec >> 8
130150
subsec_b = subsec & 0xFF
131-
rand = secrets.randbits(54)
132151
uuid_int = (timestamp_ms & 0xFFFFFFFFFFFF) << 80
133152
uuid_int |= subsec_a << 64
134153
uuid_int |= subsec_b << 54
135-
uuid_int |= rand
136-
return UUID(int=uuid_int, version=7)
154+
uuid_int |= secrets.randbits(54)
155+
return UUID(int=uuid_int, version=8)
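
For readers who want to see the new UUIDv7 layout in isolation, here is a minimal sketch using only the standard library. It assumes the layout shown in the hunk above (48-bit millisecond timestamp in the top bits, the remaining bits random) and sets the version and variant bits by hand, because the standard library's `uuid.UUID` constructor may reject `version=7` on older Python releases. `sketch_uuid7` is a hypothetical name, and the monotonic guard on `_last_v7_timestamp` is deliberately left out.

```python
import secrets
import time
import uuid


def sketch_uuid7(timestamp_ms=None):
    """Illustrative only: build a version 7 UUID from a millisecond timestamp."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 10**6
    # Top 48 bits: Unix timestamp in milliseconds.
    value = (timestamp_ms & 0xFFFFFFFFFFFF) << 80
    # Fill the lower 76 bits with random data; the version and variant bits
    # are overwritten below, which leaves 74 random bits in the result.
    value |= secrets.randbits(76)
    # Variant: the two bits at positions 63-62 become binary 10.
    value &= ~(0xC000 << 48)
    value |= 0x8000 << 48
    # Version: the four bits at positions 79-76 become binary 0111.
    value &= ~(0xF000 << 64)
    value |= 7 << 76
    return uuid.UUID(int=value)
```

Passing a fixed `timestamp_ms` makes the timestamp prefix of the resulting string reproducible, which is convenient when comparing against a known test vector.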

test/test_uuid6.py

Lines changed: 38 additions & 3 deletions
@@ -3,10 +3,11 @@
33
from unittest.mock import patch
44
from uuid import uuid1
55

6-
from uuid6 import UUID, uuid6, uuid7
6+
from uuid6 import UUID, uuid6, uuid7, uuid8
77

88
REGEX_UUID6 = r"^[0-9a-f]{8}-[0-9a-f]{4}-6[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
99
REGEX_UUID7 = r"^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
10+
REGEX_UUID8 = r"^[0-9a-f]{8}-[0-9a-f]{4}-8[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
1011
YEAR_IN_NS = 3600 * 24 * 36525 * 10**7
1112

1213

@@ -29,6 +30,15 @@ def test_uuid7_generation(self):
2930
self.assertLess(uuid7_1, uuid7_2)
3031
uuid7_1 = uuid7_2
3132

33+
def test_uuid8_generation(self):
34+
uuid8_1 = uuid8()
35+
self.assertEqual(uuid8_1.version, 8)
36+
for _ in range(1000):
37+
self.assertRegex(str(uuid8_1), REGEX_UUID8)
38+
uuid8_2 = uuid8()
39+
self.assertLess(uuid8_1, uuid8_2)
40+
uuid8_1 = uuid8_2
41+
3242
def test_invalid_int(self):
3343
with self.assertRaises(ValueError):
3444
_ = UUID(int=-1)
@@ -55,6 +65,15 @@ def test_uuid7_same_nanosecond(self, mocktime):
5565
self.assertLess(uuid7_1, uuid7_2)
5666
uuid7_1 = uuid7_2
5767

68+
@patch("uuid6._last_v8_timestamp", 1)
69+
@patch("time.time_ns", return_value=1234)
70+
def test_uuid8_same_nanosecond(self, mocktime):
71+
uuid8_1 = uuid8()
72+
for _ in range(1000):
73+
uuid8_2 = uuid8()
74+
self.assertLess(uuid8_1, uuid8_2)
75+
uuid8_1 = uuid8_2
76+
5877
@patch("uuid6._last_v6_timestamp", 1)
5978
@patch("secrets.randbits", return_value=678)
6079
@patch("time.time_ns", return_value=12345)
@@ -96,25 +115,41 @@ def test_uuid7_far_in_future(self):
96115
self.assertLess(uuid_prev, uuid_cur)
97116
uuid_prev = uuid_cur
98117

118+
@patch("uuid6._last_v8_timestamp", 1)
119+
def test_uuid8_far_in_future(self):
120+
with patch("time.time_ns", return_value=1):
121+
uuid_prev = uuid8()
122+
for i in range(1, 8000, 10):
123+
with patch("time.time_ns", return_value=i * YEAR_IN_NS):
124+
uuid_cur = uuid8()
125+
self.assertLess(uuid_prev, uuid_cur)
126+
uuid_prev = uuid_cur
127+
99128
def test_time(self):
100129
uuid_1 = uuid1()
101130
uuid_6 = uuid6()
102131
self.assertAlmostEqual(uuid_6.time / 10**7, uuid_1.time / 10**7, 3)
103132
cur_time = time_ns()
104133
uuid_7 = uuid7()
105-
self.assertAlmostEqual(uuid_7.time / 10**9, cur_time / 10**9, 3)
134+
self.assertAlmostEqual(uuid_7.time / 10**3, cur_time / 10**9, 2)
135+
uuid_8 = uuid8()
136+
self.assertAlmostEqual(uuid_8.time / 10**9, cur_time / 10**9, 3)
106137

107138
def test_zero_time(self):
108139
uuid_6 = UUID(hex="00000000-0000-6000-8000-000000000000")
109140
self.assertEqual(uuid_6.time, 0)
110141
uuid_7 = UUID(hex="00000000-0000-7000-8000-000000000000")
111142
self.assertEqual(uuid_7.time, 0)
143+
uuid_8 = UUID(hex="00000000-0000-8000-8000-000000000000")
144+
self.assertEqual(uuid_8.time, 0)
112145

113146
def test_max_time(self):
114147
uuid_6 = UUID(hex="ffffffff-ffff-6fff-bfff-ffffffffffff")
115148
self.assertEqual(uuid_6.time, 1152921504606846975)
116149
uuid_7 = UUID(hex="ffffffff-ffff-7fff-bfff-ffffffffffff")
117-
self.assertEqual(uuid_7.time, 281474976710656000000)
150+
self.assertEqual(uuid_7.time, 281474976710655)
151+
uuid_8 = UUID(hex="ffffffff-ffff-8fff-bfff-ffffffffffff")
152+
self.assertEqual(uuid_8.time, 281474976710656000000)
118153

119154
def test_multiple_arguments(self):
120155
with self.assertRaises(TypeError):

test/test_vectors.py

Lines changed: 5 additions & 4 deletions
@@ -6,7 +6,7 @@
66

77
class TestVectors(unittest.TestCase):
88
"""
9-
https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04#appendix-B
9+
https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis#appendix-C
1010
"""
1111

1212
@patch("uuid6._last_v6_timestamp", 1)
@@ -17,10 +17,11 @@ def test_uuid6_hex_from_time(self, mocktime, mockrand):
1717
self.assertEqual(str(uuid_6), "1ec9414c-232a-6b00-b3c8-9e6bdeced846")
1818

1919
@patch("uuid6._last_v7_timestamp", 1)
20+
@patch("secrets.randbits", return_value=0xCC3 << 64 | 0x1 << 60 | 0x8C4DC0C0C07398F)
2021
@patch("time.time_ns", return_value=0x17F22E279B0 * 10**6)
21-
def test_uuid7_hex_from_time(self, mocktime):
22+
def test_uuid7_hex_from_time(self, mocktime, mockrand):
2223
uuid_7 = uuid7()
23-
self.assertEqual(str(uuid_7)[:15], "017f22e2-79b0-7")
24+
self.assertEqual(str(uuid_7), "017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
2425

2526
def test_uuid6_time_from_hex(self):
2627
uuid_6 = UUID(hex="1EC9414C-232A-6B00-B3C8-9E6BDECED846")
@@ -30,7 +31,7 @@ def test_uuid6_time_from_hex(self):
3031

3132
def test_uuid7_time_from_hex(self):
3233
uuid_7 = UUID(hex="017F22E2-79B0-7CC3-98C4-DC0C0C07398F")
33-
self.assertEqual(uuid_7.time // 10**6, 1645557742000)
34+
self.assertEqual(uuid_7.time, 1645557742000)
3435

3536

3637
if __name__ == "__main__":
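
The updated `test_uuid7_time_from_hex` expectation can be reproduced without the package, assuming the layout from this commit in which the top 48 bits of a UUIDv7 hold the Unix timestamp in milliseconds. The sketch below uses only the standard library; `uuid7_unix_ms` and `uuid7_datetime` are hypothetical helper names, not part of `uuid6`.

```python
import uuid
from datetime import datetime, timezone


def uuid7_unix_ms(value):
    """Return the 48-bit millisecond timestamp stored in a UUIDv7."""
    return value.int >> 80


def uuid7_datetime(value):
    """Convert the embedded timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(uuid7_unix_ms(value) / 1000, tz=timezone.utc)


# Test vector taken from the diff above.
example = uuid.UUID(hex="017F22E2-79B0-7CC3-98C4-DC0C0C07398F")
print(uuid7_unix_ms(example))   # 1645557742000
print(uuid7_datetime(example))  # 2022-02-22 19:22:22+00:00
```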
