[BUG] Large files compile very slowly in the C compiler #4425

**Describe the bug**

The C compiler takes a very long time on the C code generated from large files like ExprNodes.py and Nodes.py. This shows up as the CI `-all` tests repeatedly timing out (although that is partly because we can't compile them in parallel on Python 2.7).

One possible indicator is the warning from gcc:

> `note: variable tracking size limit exceeded with ‘-fvar-tracking-assignments’, retrying without`

(although that could potentially be toggled independently or the limit increased)
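As a rough sketch of those two options: GCC documents `-fno-var-tracking-assignments` to switch the pass off and the `max-vartrack-size` parameter to raise the limit (0 means unlimited). The helper name below is hypothetical and not part of Cython's build. It only assembles flags that could be appended to `CFLAGS` or `extra_compile_args`.

```python
def var_tracking_flags(disable=True, max_size=None):
    """Return extra gcc flags affecting variable tracking at assignments.

    Hypothetical helper for illustration only.
    """
    flags = []
    if disable:
        # Skip the pass that triggers the "size limit exceeded" note.
        flags.append("-fno-var-tracking-assignments")
    elif max_size is not None:
        # Raise the limit instead of disabling the pass (0 = unlimited).
        flags.extend(["--param", "max-vartrack-size=%d" % max_size])
    return flags
```

Whether either choice changes compile time noticeably would still need to be measured. The note itself says gcc already retries without the pass.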

Obviously it's inevitable that a big Cython file will generate a big C file and that a big C file will take a while to compile, but potentially we could do better.

There are a few gcc flags for profiling compilation time (https://stackoverflow.com/questions/13559818/profiling-the-c-compilation-process), and they suggest that the module init function (where all the module-level user code goes) is the main culprit (unsurprisingly).
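For anyone wanting to reproduce the measurement, here is a minimal sketch using gcc's `-ftime-report`, which prints a per-pass timing table to stderr. The function names, the `-O2 -g` flags and the use of `/dev/null` as output are my assumptions rather than what the CI does, and it needs Python 3 and a Linux-like system.

```python
import subprocess
import sysconfig


def gcc_time_report_command(c_file, extra_flags=()):
    """Build a gcc command that compiles c_file and reports per-pass timings."""
    include_dir = sysconfig.get_paths()["include"]  # directory holding Python.h
    return [
        "gcc", "-c", c_file,
        "-o", "/dev/null",      # discard the object file, only timings matter
        "-I", include_dir,
        "-O2", "-g",            # assumed flags; match your real build if it differs
        "-ftime-report",
    ] + list(extra_flags)


def run_time_report(c_file, extra_flags=()):
    """Run the compile and return gcc's stderr, which contains the timing table."""
    result = subprocess.run(
        gcc_time_report_command(c_file, extra_flags),
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr
```

Calling `run_time_report("ExprNodes.c")` on the generated C file, with and without extra flags, gives two tables to compare.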

**Environment (please complete the following information):**
 - Linux, CI, most obviously in Python 2.7

**Additional context**

I tried a couple of approaches to fix the problem.

1. First I created small sub-scopes within the module init function. https://github.com/da-woods/cython/tree/morelocaltemps. This didn't achieve any speedup or get rid of the warning, but some of the change may be worth using for other reasons (https://github.com/da-woods/cython/commit/c9014165baaf56bf3af0e08971f7e3eda46b428e#commitcomment-57239451).
2. Second, I tried to split each module-level statement into a separate function (https://github.com/cython/cython/pull/4386). This gave appreciable speed-ups for large modules. However, the PR was very much intended as a proof of concept, with little attention to code quality.

I think a variant of the second approach is probably worthwhile. My current thought is that we shouldn't do it on a "per-statement" basis but instead give each class creation a separate function (that's easy to do for `cdef` classes, slightly harder for regular classes). That would likely give the appropriate granularity and keep things grouped in logical units.
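To make the shape of that idea concrete, here is a toy model of the generated code. This is not Cython's real code generator, and the names `__pyx_init_<name>` and `__pyx_module_exec` are invented for the illustration. With `split=False` everything lands in one huge function; with `split=True` each unit (e.g. one class creation) gets its own small static function and the init function becomes a sequence of calls.

```python
def emit_module_init(units, split=True):
    """Emit C source for a module init function from (name, c_lines) units.

    Toy model only; all generated identifiers are hypothetical.
    """
    out = []
    if split:
        # One small static function per logical unit (e.g. per class).
        for name, lines in units:
            out.append("static int __pyx_init_%s(void) {" % name)
            out.extend("    " + line for line in lines)
            out.append("    return 0;")
            out.append("}")
            out.append("")
    out.append("static int __pyx_module_exec(void) {")
    for name, lines in units:
        if split:
            # The init function shrinks to a call plus an error check.
            out.append("    if (__pyx_init_%s() < 0) return -1;" % name)
        else:
            # Baseline: every unit is inlined into one very large function.
            out.extend("    " + line for line in lines)
    out.append("    return 0;")
    out.append("}")
    return "\n".join(out)
```

The hoped-for gain rests on the assumption that the expensive per-function compiler passes scale worse than linearly with function size, so several medium-sized functions should be cheaper to compile than one enormous one.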

**Improvements made to mitigate this in Cython 3.1**

More efficient string constant storage:
* https://github.com/cython/cython/commit/f39526df12fb33db6eed318e37deb8d5dfe2a3ba
* https://github.com/cython/cython/commit/368a750952f97042f3e8728f18c350f80413c84d (fixed in https://github.com/cython/cython/commit/f5f83fbf803d1b3ba88f71d12bb48306815efd3c)
* https://github.com/cython/cython/commit/0d5af7b68d062b766cb59ee1b76ad342d004f8dc

Shorter code generation:
* https://github.com/cython/cython/commit/e6c621a91265d94a6c0dad9975e85b46c493706d
* https://github.com/cython/cython/commit/6526ecf4acfa829c1ec50a5d11db4cef18386e60
* https://github.com/cython/cython/commit/904741890210d681102780dbd9f41bdf1ae561d2
* https://github.com/cython/cython/commit/12241b84055b1226223577a7f32a8a2c0bff47ee

More efficient code object creation that no longer permanently stores tuples in the global module state:
* https://github.com/cython/cython/commit/f5763fa1d69cc550c2e416a387cc84f9c8766beb
