Rustine: mimalloc C → Rust Translation #1212

Open
enum-class wants to merge 1 commit into microsoft:dev3 from enum-class:dev3

Conversation

@enum-class

This PR presents an initial Rust translation of Microsoft's mimalloc (dev3 branch) produced with Rustine, an LLM-powered pipeline for whole-repository C-to-Rust translation, validation, and debugging. It is worth noting that C2Rust fails to translate mimalloc because it struggles with complex atomic operations and strict thread-safety requirements. In contrast, Rustine succeeds by focusing on semantic intent rather than relying on shallow, syntax-level transpilation.

The translation of the entire repository (including application and test code) is 100% compilable. The quality of the translated tests was checked manually, and running the translated tests against the translated code gives a 100% pass rate (test_api, test_api_fill, test_stress).

Please review the translations and let us know if you are interested in additional stats about the Rust code (safety features, raw pointer usage including pointer arithmetic, clippy results, and code quality metrics). We would be happy to incorporate your feedback into the pipeline and improve the translation.
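
For readers unfamiliar with the kind of atomic code the description refers to, here is a minimal sketch of how a C11 `atomic_compare_exchange_weak` retry loop is commonly written in Rust. It is not taken from mimalloc or from this PR; the helper name `atomic_max` is hypothetical and chosen only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Atomically raises `cell` to at least `candidate` and returns the value
/// observed before the update. Hypothetical helper, for illustration only.
pub fn atomic_max(cell: &AtomicUsize, candidate: usize) -> usize {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        if current >= candidate {
            return current; // already large enough, nothing to write
        }
        // The weak variant may fail spuriously, so it is retried in a loop,
        // as with `atomic_compare_exchange_weak` in C11.
        match cell.compare_exchange_weak(
            current,
            candidate,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(previous) => return previous,
            Err(observed) => current = observed, // lost a race; retry
        }
    }
}
```

The standard library already offers `AtomicUsize::fetch_max` for this particular operation; the explicit loop is shown only because it mirrors the shape of hand-written C compare-and-swap code.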

@enum-class
Author

@enum-class please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree [company="University of Illinois Urbana-Champaign"]

@enum-class
Author

@microsoft-github-policy-service agree company="University of Illinois Urbana-Champaign"

@Zoxc

Zoxc commented Apr 17, 2026

The LLM output looks quite bad, presumably since it tries to stay close to C without using idiomatic Rust. The use of references I've seen also looks highly questionable. Compare it to https://github.com/Zoxc/fjall for a more proper Rust port.
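
To make the concern about references concrete: a C `void*` handed out by an allocator is an untyped, possibly null address, whereas a Rust `&mut ()` is a reference to a zero-sized value that must always be valid and exclusive. The sketch below shows one conventional way to model such pointers using only the standard library. The function names are hypothetical and are not part of the PR or of mimalloc.

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::ptr::NonNull;

/// Converts a possibly-null C pointer into `Option<NonNull<c_void>>`.
pub fn from_c_ptr(p: *mut c_void) -> Option<NonNull<c_void>> {
    NonNull::new(p) // `None` exactly when `p` is null
}

/// Converts back to a raw pointer suitable for a C-compatible signature.
pub fn to_c_ptr(p: Option<NonNull<c_void>>) -> *mut c_void {
    p.map_or(std::ptr::null_mut(), NonNull::as_ptr)
}

/// Returns the size of `()` and whether `Option<NonNull<c_void>>` has the
/// same size as a raw pointer.
pub fn layout_facts() -> (usize, bool) {
    (
        size_of::<()>(),
        size_of::<Option<NonNull<c_void>>>() == size_of::<*mut c_void>(),
    )
}
```

`()` is zero-sized, so a `&mut ()` says nothing about the memory block behind it, while `Option<NonNull<c_void>>` keeps the null/non-null distinction at no extra cost and makes no aliasing promise.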

Comment on lines +19 to +23
pub fn mi_expand(p: Option<&mut ()>, newsize: usize) -> Option<&mut ()> {
    // The C function ignores its parameters and returns NULL (0)
    // In Rust, we return None to represent NULL
    None
}

It only took me until here to find a blatantly incorrect translation:

mimalloc/src/alloc.c

Lines 270 to 282 in b196396

void* mi_expand(void* p, size_t newsize) mi_attr_noexcept {
  #if MI_PADDING
  // we do not shrink/expand with padding enabled
  MI_UNUSED(p); MI_UNUSED(newsize);
  return NULL;
  #else
  if (p == NULL) return NULL;
  const mi_page_t* const page = mi_validate_ptr_page(p,"mi_expand");
  const size_t size = _mi_usable_size(p,page);
  if (newsize > size) return NULL;
  return p; // it fits
  #endif
}

Author

@jcotton42 Thank you for the comment, but I believe this translation is correct. This version of mimalloc was intentionally translated with MI_PADDING enabled. In that mode, the original C code explicitly disables shrinking/expanding in mi_expand and always returns NULL, and the Rust translation preserves this behavior by returning None. If you'd like to see the version with padding disabled (where the actual expand logic is active), or an updated translation from our improved pipeline, let me know.
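
The disagreement above comes down to which `#if MI_PADDING` branch the translation reflects. A minimal sketch of how both branches could be kept in one Rust function follows. It is an illustration under stated assumptions, not the PR's code: `usable_size_stub` is a hypothetical stand-in for `_mi_usable_size` (it returns a fixed 64 bytes and does no page validation), and the `PADDING` const parameter stands in for the `MI_PADDING` macro so that both paths can be exercised in one build.

```rust
use std::ffi::c_void;
use std::ptr;

/// Hypothetical stand-in for `_mi_usable_size`; the real function inspects
/// page metadata. Here every block is assumed to be 64 bytes.
fn usable_size_stub(_p: *const c_void) -> usize {
    64
}

/// Sketch of `mi_expand` that keeps both preprocessor branches.
/// `PADDING` plays the role of `MI_PADDING`.
pub fn mi_expand_sketch<const PADDING: bool>(
    p: *mut c_void,
    newsize: usize,
) -> *mut c_void {
    if PADDING {
        // we do not shrink/expand with padding enabled
        return ptr::null_mut();
    }
    if p.is_null() {
        return ptr::null_mut();
    }
    if newsize > usable_size_stub(p) {
        return ptr::null_mut();
    }
    p // it fits
}
```

In a real crate the branch would more likely be selected at compile time with a `#[cfg(...)]` attribute on a Cargo feature; the const generic is used here only so the example is self-contained.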

@enum-class
Author

> The LLM output looks quite bad, presumably since it tries to stay close to C without using idiomatic Rust. The use of references I've seen also looks highly questionable. Compare it to https://github.com/Zoxc/fjall for a more proper Rust port.

@Zoxc Thank you for taking the time to review the PR. This translation was generated about 4 months ago using an early version of the Rustine pipeline powered by DeepSeek-V3, a relatively cheap and lightweight model at the time. The goal of this PR was never to deliver a production-ready, fully idiomatic Rust port of mimalloc, but rather to serve as a research artifact demonstrating that LLMs (even modest ones) can translate a complex, highly concurrent, and performance-critical C codebase like mimalloc into 100% compilable Rust that passes the key tests. In many places the generated code does produce reasonably idiomatic Rust patterns, which was encouraging.

Since then, we have made significant improvements to the Rustine pipeline, and many such issues are resolved. Practitioners can use these compilable LLM translations, which likely preserve functionality because they stay close to the C code, as a first version and then improve idiomaticity. Note that these translations can be ready very quickly, on the order of minutes to a few hours. Even if significant manual fixing is still required afterward, the overall process is dramatically faster than starting from scratch.

If you're interested, I'd be happy to share some updated translations from the improved pipeline so you can see the progress we've made. We're also very open to specific feedback on problematic patterns you noticed; it would help us further refine the approach.
