Regex matching "|" separated values for Union types

  annotations, python, python-3.x, python-typing, regex

I’m trying to match type annotations like int | str, and use regex substitution to replace them with the equivalent string Union[int, str].

Desired substitutions (before and after):

  • str|int|bool -> Union[str,int,bool]
  • Optional[int|str] -> Optional[Union[int,str]]
  • dict[str | int, list[B | C | Optional[D]]] -> dict[Union[str,int], list[Union[B,C,Optional[D]]]]
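For reference, the substitutions listed above can also be produced without a regex, by tracking bracket depth while scanning the string. This is only a minimal sketch, not a tested solution: the helper names split_top_level and to_union are made up for illustration, and it assumes the annotation contains no string literals with "|", "," or brackets inside them (for example Literal["a|b"] would be mishandled).

```python
from typing import List


def split_top_level(text: str, sep: str) -> List[str]:
    """Split text on sep, ignoring separators nested inside [...]."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return [p.strip() for p in parts]


def to_union(annotation: str) -> str:
    """Rewrite PEP 604 unions (X | Y) in an annotation string as Union[X,Y]."""
    members = split_top_level(annotation, "|")
    if len(members) > 1:
        # A union at this nesting level: convert each member recursively.
        return "Union[" + ",".join(to_union(m) for m in members) + "]"
    # No union at this level: recurse into the subscript arguments, if any.
    head, bracket, rest = annotation.strip().partition("[")
    if not bracket:
        return head
    args = split_top_level(rest[:-1], ",")  # rest[:-1] drops the closing "]"
    return head + "[" + ", ".join(to_union(a) for a in args) + "]"


print(to_union("dict[str | int, list[B | C | Optional[D]]]"))
```

Because only separators at depth 0 are treated as split points, the leading dict[ and the trailing ] are never pulled into a Union expression, which is the part a flat regex struggles with.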

I’ve put together a regex (linked below) showing what I’ve attempted so far, but it’s not really working the way I’d want. For example, it seems to consume an unnecessary ] at the end. I also can’t get it to exclude text outside the expression: the dict[str above is included in the match, but ideally it would be left out, since dict[ is not part of the Union expression. Also, regex subroutines don’t seem to be supported by the re module in Python. Here is where I got the idea to use that from.

Regex Demo

Additional Info

This is mainly to support the PEP 604 syntax for Python 3.7+, which requires annotations to be forward-declared (e.g. declared as strings), as otherwise builtin types don’t support the | operator before Python 3.10.

Here’s some sample code that I came up with:

from __future__ import annotations
from datetime import date
from decimal import Decimal
from typing import Optional

class A:
    field_1: str|int|bool
    field_2: int  |  str  |  bool
    field_3: Decimal|date|str
    field_4: str|Optional[int]
    field_5: Optional[int|str]
    field_6: dict[str | int, list[B | C | Optional[D]]]

class B: ...
class C: ...
class D: ...

For Python versions earlier than 3.10, I use a __future__ import to avoid the error below:

TypeError: unsupported operand type(s) for |: 'type' and 'type'

This essentially converts all annotations to strings, as below:

>>> A.__annotations__
{'field_1': 'str | int | bool', 'field_2': 'int | str | bool', 'field_3': 'Decimal | date | str', 'field_4': 'str | Optional[int]', 'field_5': 'Optional[int | str]', 'field_6': 'dict[str | int, list[B | C | Optional[D]]]'}

But in code (say in another module), I want to evaluate the annotations in A. This works in Python 3.10, but fails in Python 3.7 through 3.9, even though the __future__ import supports forward-declared annotations.

>>> from typing import get_type_hints
>>> hints = get_type_hints(A)

Traceback (most recent call last):
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'type' and 'type'

It seems the best approach to make this work is to replace all occurrences of int | str (for example) with Union[int, str]. Then, with typing.Union included in the additional localns used to evaluate the annotations, it should be possible to evaluate PEP 604-style annotations for Python 3.7+.
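To illustrate the evaluation half of that idea, here is a rough sketch that assumes the annotation strings have already been rewritten into Union[...] form. The helper name evaluate_rewritten and the rewritten dict are hypothetical. It calls eval directly with Union supplied through localns, mirroring the eval call visible in the traceback above, rather than going through get_type_hints.

```python
from typing import Any, Dict, Optional, Union


def evaluate_rewritten(annotations: Dict[str, str],
                       globalns: Dict[str, Any]) -> Dict[str, Any]:
    """Evaluate annotation strings that already use Union[...] syntax."""
    # Supplying Union here makes it resolvable even if the module that
    # defines the class never imported it.
    localns = {"Union": Union}
    return {name: eval(text, globalns, localns)
            for name, text in annotations.items()}


# Hypothetical, already-rewritten annotations for two of the fields in A.
rewritten = {
    "field_1": "Union[str,int,bool]",
    "field_5": "Optional[Union[int,str]]",
}

# globalns stands in for the namespace of the module that defines A.
hints = evaluate_rewritten(rewritten, {"Optional": Optional})
print(hints)
```

In real use, globalns would presumably be the defining module’s namespace (for example vars(sys.modules[A.__module__])), so that names such as B, C and D resolve as well.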
