Mypy plugin features for NumPy? #181

@tyralla

We recently started writing a (still experimental) Mypy plugin to improve Mypy's understanding of some "dynamic" aspects of HydPy. So far, it works well and should become an important aid for checking HydPy's source and client code in --strict mode and for typing shapes more precisely.

While tracking down the remaining causes of unexpected Any occurrences, I realised that they are frequently due to the use of NumPy functions. An example that I (at least) found really surprising:

assert_type(numpy.log(1.0), Any)
assert_type(numpy.log(numpy.float64(1.0)), Any)

Or, less surprisingly, shape typing is inaccurate:

In_: TypeAlias = numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]]
Out: TypeAlias = numpy.ndarray[tuple[Any, ...], numpy.dtype[numpy.float64]]
in_: In_
assert_type(numpy.cumsum(in_), Out)

It should be possible to improve things substantially with further plugin functions. Here is a very quickly prototyped callback function for cumsum:

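from mypy.checker import TypeChecker
from mypy.nodes import TypeInfo
from mypy.plugin import FunctionContext
from mypy.types import (
    CallableType,
    Instance,
    LiteralType,
    NoneType,
    TupleType,
    Type,
    get_proper_type,
)


def cumsum_hook(context: FunctionContext) -> Type: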
    if (
        isinstance(api := context.api, TypeChecker)
        and (len(a_types := context.arg_types[0]) == 1)
        and isinstance(a_inst := get_proper_type(a_types[0]), Instance)
        and isinstance(a_type := a_inst.type, TypeInfo)
        and a_type.has_base("numpy.ndarray")
    ):
        if (
            (len(axis_types := context.arg_types[1]) == 1)
            and isinstance(axis_inst := get_proper_type(axis_types[0]), Instance)
            and not isinstance(axis_inst, NoneType)
        ):
            shape_type = a_inst.args[0]
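        else:
            # Without an axis, cumsum flattens its input, so the result is 1-D. Its
            # length is known only if every input dimension is an integer literal.
            length: int | None = None
            if isinstance(tuple_type := get_proper_type(a_inst.args[0]), TupleType):
                length = 1
                for item in tuple_type.items:
                    item = get_proper_type(item)
                    if isinstance(item, LiteralType) and isinstance(item.value, int):
                        length *= item.value
                    else:
                        length = None
                        break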
            int_ = api.named_type("builtins.int")
            shape_type = TupleType(
                [int_] if length is None else [LiteralType(length, fallback=int_)],
                fallback=api.named_type("builtins.tuple"),
            )
        if (
            (len(dtype_types := context.arg_types[2]) == 1)
            and isinstance(dtype_inst := get_proper_type(dtype_types[0]), CallableType)
        ):
            dtype_type = api.named_type("numpy.dtype").copy_modified(args=[dtype_inst.ret_type])
        else:
            dtype_type = a_inst.args[1]
        return a_inst.copy_modified(args=(shape_type, dtype_type) + a_inst.args[2:])
    return context.default_return_type

Even the targeted "shape math" seems to work:

V: TypeAlias = numpy.ndarray[tuple[int],  numpy.dtype[numpy.float64]]
V2: TypeAlias = numpy.ndarray[tuple[int],  numpy.dtype[numpy.int64]]
M: TypeAlias = numpy.ndarray[tuple[int, int],  numpy.dtype[numpy.float64]]
v: V
m: M
L23: TypeAlias = numpy.ndarray[tuple[Literal[2], Literal[3]],  numpy.dtype[numpy.float64]]
l23: L23
L6: TypeAlias = numpy.ndarray[tuple[Literal[6]],  numpy.dtype[numpy.float64]]

assert_type(numpy.cumsum(v), V)
assert_type(numpy.cumsum(m), V)
assert_type(numpy.cumsum(m, axis=1), M)
assert_type(numpy.cumsum(v, dtype=numpy.int64), V2)
assert_type(numpy.cumsum(l23, axis=0), L23)
assert_type(numpy.cumsum(l23), L6)
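
The shape rule that these assertions exercise can be written down at the value level. This is a sketch for illustration only: the helper cumsum_shape and the alias Dim are hypothetical names, and None stands in for a dimension that is only known to be an int (as in tuple[int, int]), while an actual integer corresponds to a Literal dimension.

```python
from math import prod
from typing import Optional

# None models a dimension typed as plain `int`; an integer models a Literal.
Dim = Optional[int]


def cumsum_shape(shape: tuple[Dim, ...], axis: Optional[int] = None) -> tuple[Dim, ...]:
    """Return the result shape of cumsum for the given input shape."""
    if axis is not None:
        # With an explicit axis, cumsum keeps the input shape.
        return shape
    # Without an axis, the input is flattened to one dimension.
    known = [dim for dim in shape if dim is not None]
    if len(known) != len(shape):
        # At least one dimension is unknown, so the length is unknown too.
        return (None,)
    return (prod(known),)


print(cumsum_shape((2, 3)))          # corresponds to cumsum(l23)
print(cumsum_shape((2, 3), axis=0))  # corresponds to cumsum(l23, axis=0)
print(cumsum_shape((None, None)))    # corresponds to cumsum(m)
```

The plugin callback applies the same rule, only on Mypy's type objects instead of runtime values.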

However, it seems strange to include plugin functions that address NumPy in our Mypy plugin for HydPy.

I see the following options:

  1. Ignore the problem and accept writing many ignore statements, inaccurate shape type hints, and so on.
  2. Address the most important NumPy-related features in our HydPy-Mypy plugin.
  3. Add a slightly more complete NumPy-Mypy plugin into the HydPy package, so that users can activate/deactivate it separately.
  4. Develop an independent Mypy plugin for NumPy. (But making it reasonably versatile would, of course, require much more work than normal HydPy applications need. Has nobody started on this already?)
  5. Add such a plugin to NumPy. (Which seems unrealistic, as they have deprecated their old one.)
  6. Add such a plugin to Mypy. (Also unlikely, because Mypy has no internal plugins besides one for attrs, and this one seems to exist for historical reasons.)
