We have recently started writing a (still experimental) Mypy plugin to increase Mypy's understanding of some "dynamic" aspects of HydPy. So far, it works well and will be an important aid for checking HydPy's source and client code in --strict mode and for typing shapes more precisely.
When searching for the remaining causes of unexpected Any occurrences, I realised that they are frequently due to the use of NumPy functions. An Any example that (at least to me) is really surprising:
assert_type(numpy.log(1.0), Any)
assert_type(numpy.log(numpy.float64(1.0)), Any)
Or, less surprisingly, shape typing is inaccurate:
In_: TypeAlias = numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]]
Out: TypeAlias = numpy.ndarray[tuple[Any, ...], numpy.dtype[numpy.float64]]
in_: In_
assert_type(numpy.cumsum(in_), Out)
It should be possible to improve things substantially with additional plugin functions. Here is a very rapidly prototyped callback function for cumsum:
from mypy.checker import TypeChecker
from mypy.nodes import TypeInfo
from mypy.plugin import FunctionContext
from mypy.types import (
    CallableType, Instance, LiteralType, TupleType, Type, get_proper_type
)


def cumsum_hook(context: FunctionContext) -> Type:
    if (
        isinstance(api := context.api, TypeChecker)
        and (len(a_types := context.arg_types[0]) == 1)
        and isinstance(a_inst := get_proper_type(a_types[0]), Instance)
        and isinstance(a_type := a_inst.type, TypeInfo)
        and a_type.has_base("numpy.ndarray")
    ):
        # Shape: an integer axis keeps the input shape. An explicit axis=None
        # is typed as NoneType, which is not an Instance, so it takes the
        # else branch.
        if (
            (len(axis_types := context.arg_types[1]) == 1)
            and isinstance(get_proper_type(axis_types[0]), Instance)
        ):
            shape_type = a_inst.args[0]
        else:
            # No axis: the result is flattened. Multiply the lengths if all of
            # them are literal; otherwise fall back to a plain int.
            length: int | None = None
            if isinstance(tuple_type := get_proper_type(a_inst.args[0]), TupleType):
                length = 1
                for item in tuple_type.items:
                    if isinstance(item, LiteralType) and isinstance(item.value, int):
                        length *= item.value
                    else:
                        length = None
                        break
            int_ = api.named_type("builtins.int")
            shape_type = TupleType(
                [int_] if length is None else [LiteralType(length, fallback=int_)],
                fallback=api.named_type("builtins.tuple"),
            )
        # Dtype: an explicit dtype argument (a type object) overrides the input's.
        if (
            (len(dtype_types := context.arg_types[2]) == 1)
            and isinstance(dtype_inst := get_proper_type(dtype_types[0]), CallableType)
        ):
            dtype_type = api.named_type("numpy.dtype").copy_modified(args=[dtype_inst.ret_type])
        else:
            dtype_type = a_inst.args[1]
        return a_inst.copy_modified(args=(shape_type, dtype_type) + a_inst.args[2:])
    return context.default_return_type
Even the targeted "shape math" seems to work:
V: TypeAlias = numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]]
V2: TypeAlias = numpy.ndarray[tuple[int], numpy.dtype[numpy.int64]]
M: TypeAlias = numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]]
v: V
m: M
L23: TypeAlias = numpy.ndarray[tuple[Literal[2], Literal[3]], numpy.dtype[numpy.float64]]
l23: L23
L6: TypeAlias = numpy.ndarray[tuple[Literal[6]], numpy.dtype[numpy.float64]]
assert_type(numpy.cumsum(v), V)
assert_type(numpy.cumsum(m), V)
assert_type(numpy.cumsum(m, axis=1), M)
assert_type(numpy.cumsum(v, dtype=numpy.int64), V2)
assert_type(numpy.cumsum(l23, axis=0), L23)
assert_type(numpy.cumsum(l23), L6)
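The shape rule these assertions exercise can be written down independently of Mypy. The following is only an illustrative sketch in plain Python, not part of the plugin: the helper `cumsum_shape` and the aliases `Dim` and `Shape` are hypothetical names, and `None` stands for a dimension that is only known to be "some int".

```python
from math import prod
from typing import Optional, Tuple

Dim = Optional[int]  # None: length not known as a literal
Shape = Tuple[Dim, ...]


def cumsum_shape(shape: Shape, axis: Optional[int] = None) -> Shape:
    """Return the result shape of a cumsum-like call for the given input shape."""
    if axis is not None:
        # Accumulating along one axis leaves the shape unchanged.
        return shape
    # Without an axis, the input is flattened first, so the result is 1-D.
    if any(dim is None for dim in shape):
        # At least one unknown length makes the total length unknown as well.
        return (None,)
    return (prod(dim for dim in shape if dim is not None),)


print(cumsum_shape((2, 3)))          # corresponds to cumsum(l23) -> L6
print(cumsum_shape((2, 3), axis=0))  # corresponds to cumsum(l23, axis=0) -> L23
print(cumsum_shape((None, None)))    # corresponds to cumsum(m) -> V
```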
However, it seems strange to include plugin functions that address NumPy in our Mypy plugin for HydPy.
I see the following options:
- Ignore the problem and accept writing many ignore statements, inaccurate shape type hints, and so on.
- Address the most important NumPy-related features in our HydPy-Mypy plugin.
- Add a slightly more complete NumPy-Mypy plugin into the HydPy package, so that users can activate/deactivate it separately.
- Develop an independent Mypy plugin for NumPy. (But making it reasonably versatile would, of course, take much more work than normal HydPy applications require. Has nobody started on this already?)
- Add such a plugin to NumPy. (Which seems unrealistic, as they have deprecated their old one.)
- Add such a plugin to Mypy. (Also unlikely, because Mypy has no internal plugins besides one for attrs, and this one seems to exist for historical reasons.)