Avoid fromisoformat when parsing UTC date times serialized by DjangoJSONEncoder
The problem
I ran into this issue today when using fromisoformat on a date time stored in a JSONField. At first, I thought it was data provided by a third party, but it turned out that a Python datetime object was being saved into the JSONField and the real culprit was DjangoJSONEncoder.
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
>>> from datetime import datetime, timezone
>>> from django.core.serializers.json import DjangoJSONEncoder
>>> DjangoJSONEncoder().default(datetime(2022,1,1).replace(tzinfo=timezone.utc))
'2022-01-01T00:00:00Z'
>>> datetime_str = _ # _ is the value of the last expression when using the interpreter
>>> datetime.fromisoformat(datetime_str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid isoformat string: '2022-01-01T00:00:00Z'
The field was defined on a model like this:
class MyModel(models.Model):
    data = fields.JSONField(encoder=json.DjangoJSONEncoder, null=True)
The fix
To avoid this, you can manually replace the 'Z' with '+00:00' or use a library such as dateutil:
>>> from dateutil.parser import parse
>>> parse('2022-01-01T00:00:00Z')
datetime.datetime(2022, 1, 1, 0, 0, tzinfo=tzutc())
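If you would rather not pull in dateutil, the manual replacement can be sketched as a small helper. This is only a minimal sketch: the name parse_utc_isoformat is made up for illustration and is not part of Django or the standard library, and it assumes the rest of the string is already in a form that fromisoformat accepts.

```python
from datetime import datetime, timezone


def parse_utc_isoformat(value: str) -> datetime:
    """Parse an ISO 8601 string, tolerating a trailing 'Z' for UTC.

    Hypothetical helper: swaps the 'Z' suffix for the equivalent
    '+00:00' offset, then hands the result to fromisoformat.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_utc_isoformat("2022-01-01T00:00:00Z")
print(parsed)  # 2022-01-01 00:00:00+00:00
assert parsed == datetime(2022, 1, 1, tzinfo=timezone.utc)
```

Strings that already carry a numeric offset, or no offset at all, pass straight through to fromisoformat unchanged.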
The reason
But why does this happen in the first place? Well, it turns out that Python's fromisoformat currently implements only a subset of ISO 8601 and doesn't parse the "Z" suffix, as detailed in Issue 35829 (datetime: parse "Z" timezone suffix in fromisoformat()) and Issue 15873 (datetime: add ability to parse RFC 3339 dates and times) on the Python tracker, specifically this comment:
> The better is the enemy of the good here. Given the history of this issue, I would rather accept a well documented restrictive parser than wait for a more general code to be written. Note that we can always relax the parsing rules in the future.
This is in fact the exact reason why I wrote the isoformat parser like I did, because ISO 8601 is actually a quite expansive standard, and this is the least controversial subset of the features. In fact, I spent quite a bit of time on adapting the general purpose ISO8601 parser I wrote for dateutil *into* one that only accepts the output of isoformat() because it places a minimum burden on ongoing support, so it's not really a matter of waiting for a more general parser to be written.
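To make the "only accepts the output of isoformat()" point concrete, here is a small sketch of the round trip the parser was designed for. The variable names are just for illustration.

```python
from datetime import datetime, timezone

dt = datetime(2022, 1, 1, tzinfo=timezone.utc)

# isoformat() writes UTC as a numeric offset, never as "Z".
serialized = dt.isoformat()
print(serialized)  # 2022-01-01T00:00:00+00:00

# fromisoformat() is the inverse of isoformat(), so this round trip works.
round_tripped = datetime.fromisoformat(serialized)
assert round_tripped == dt
```

Because isoformat() never emits the "Z" suffix itself, a parser restricted to its output has no reason to accept it.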
The current implementation of DjangoJSONEncoder follows the ECMAScript Language Specification (ECMA-262 Edition 5.1), which specifies dates in the format YYYY-MM-DDTHH:mm:ss.sssZ:
class DjangoJSONEncoder(json.JSONEncoder):
    def default(self, o):
        # See "Date Time String Format" in the ECMA-262 specification.
        if isinstance(o, datetime.datetime):
            r = o.isoformat()
            if o.microsecond:
                r = r[:23] + r[26:]
            if r.endswith("+00:00"):
                r = r[:-6] + "Z"
            return r
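To see what those two string slices do without needing Django installed, the datetime branch can be copied into a standalone function. This is a sketch for illustration only: ecma262_format is a made-up name, and the body simply mirrors the excerpt above.

```python
from datetime import datetime, timedelta, timezone


def ecma262_format(dt: datetime) -> str:
    """Mirror the datetime branch of the encoder excerpt (illustrative copy)."""
    r = dt.isoformat()
    if dt.microsecond:
        # Keep the first three fractional digits (milliseconds) and drop the
        # last three, leaving any UTC offset that follows intact.
        r = r[:23] + r[26:]
    if r.endswith("+00:00"):
        # Swap the numeric UTC offset for the "Z" suffix.
        r = r[:-6] + "Z"
    return r


utc_dt = datetime(2022, 1, 1, 12, 30, 45, 123456, tzinfo=timezone.utc)
print(ecma262_format(utc_dt))  # 2022-01-01T12:30:45.123Z

# Only a +00:00 offset becomes "Z"; other offsets and naive values are untouched.
plus_two = datetime(2022, 1, 1, tzinfo=timezone(timedelta(hours=2)))
print(ecma262_format(plus_two))  # 2022-01-01T00:00:00+02:00
print(ecma262_format(datetime(2022, 1, 1)))  # 2022-01-01T00:00:00
```

So only timezone-aware datetimes with a zero offset end up with the "Z" suffix that trips up fromisoformat.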
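So the mismatch comes from DjangoJSONEncoder emitting the ECMA-262 "Z" suffix, which falls outside the subset accepted by the fromisoformat implementation. Just be careful when pulling data out of a JSONField containing serialized datetimes.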
Meta
- Written - 23rd February 2022
Code tested on
- python 3.8.10
- django 3.2.10