Update Umans AI Coding Plan + add Umans AI (pay-per-token) provider #2265

Merged
rekram1-node merged 3 commits into anomalyco:dev from jcraftsman:update-umans-ai-provider
Jun 12, 2026
@jcraftsman

Summary

Two changes to the Umans provider listings:

1. Update Umans AI Coding Plan (subscription)

  • Add umans-kimi-k2.7 — Kimi K2.7 Code, Moonshot's newest coding model
  • Add umans-flash-beta — deprecated alias for umans-flash (sunset 2026-06-07)
  • Fix umans-glm-5.1 — add image to input modalities (vision via handoff) and [interleaved] reasoning
  • Fix umans-flash — add [interleaved] reasoning field (Qwen3.6 supports reasoning_content)
  • Fix umans-qwen3.6-35b-a3b — add [interleaved] reasoning field
  • Add reasoning_options to all models, and an explicit name to models using base_model

2. Add Umans AI provider (pay-per-token)

New provider for organization service-account usage with per-token pricing:

| Model | Input / 1M | Output / 1M | Cache Read / 1M |
| --- | --- | --- | --- |
| Kimi K2.6 | $0.95 | $4.00 | $0.20 |
| Kimi K2.7 Code | $0.95 | $4.00 | $0.19 |
| GLM 5.1 | $1.40 | $4.40 | $0.29 |
| Umans Flash | $0.15 | $1.00 | $0.05 |
| Umans Coder | $0.95 | $4.00 | $0.20 |

Same endpoint (api.code.umans.ai), different billing model. Subscription plans are flat-rate (hence $0 costs on the Coding Plan); token billing is available to orgs via service-account keys. Pricing source: https://app.umans.ai/offers/code/docs/orgs
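
As a rough illustration of what per-token billing means in practice, the sketch below estimates the cost of a single request from the rates listed above. The function name `estimate_cost` and the billing rule are assumptions made for illustration: it assumes cache-read tokens are billed at the cache-read rate and are not also counted as regular input tokens, which the pricing page may define differently.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pay-per-token pricing in this PR.
PRICES_PER_MILLION = {
    "Kimi K2.6":      {"input": Decimal("0.95"), "output": Decimal("4.00"), "cache_read": Decimal("0.20")},
    "Kimi K2.7 Code": {"input": Decimal("0.95"), "output": Decimal("4.00"), "cache_read": Decimal("0.19")},
    "GLM 5.1":        {"input": Decimal("1.40"), "output": Decimal("4.40"), "cache_read": Decimal("0.29")},
    "Umans Flash":    {"input": Decimal("0.15"), "output": Decimal("1.00"), "cache_read": Decimal("0.05")},
    "Umans Coder":    {"input": Decimal("0.95"), "output": Decimal("4.00"), "cache_read": Decimal("0.20")},
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cache_read_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: input_tokens excludes any tokens served from cache,
    which are billed separately at the cache-read rate.
    """
    rates = PRICES_PER_MILLION[model]
    total = (
        rates["input"] * input_tokens
        + rates["output"] * output_tokens
        + rates["cache_read"] * cache_read_tokens
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 200k fresh input tokens, 50k output tokens, 100k cached tokens on GLM 5.1.
    print(estimate_cost("GLM 5.1", 200_000, 50_000, 100_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without rounding drift. On the subscription Coding Plan the same arithmetic would yield zero, since its listed costs are $0.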

Notes

  • The base_model fields on all provider models link to the canonical model entries (e.g. moonshotai/kimi-k2.6, zhipuai/glm-5.1, alibaba/qwen3.6-35b-a3b, moonshotai/kimi-k2.7-code), so Umans will now appear as a provider option on those model pages
  • GLM 5.1 includes image in input modalities because Umans serves it with a vision handoff (text generation via GLM 5.1, image preprocessing via Kimi), the same pattern as Kimi For Coding's listing
  • Recommended max output for subscription plans is 32,768 tokens (matching recommended_max_tokens from our /v1/models/info endpoint). The Coding Plan does not enforce a hard output limit, so we omit the output limit override there; the pay-per-token provider sets output = 32_768 as a practical default
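
To make the output-limit note concrete, here is a minimal sketch of how a client might pick a `max_tokens` value depending on whether a provider listing carries an output limit. The helper `resolve_max_tokens` and its fallback behavior are hypothetical, not taken from this repository or from the Umans API; only the 32,768 figure comes from the note above.

```python
from typing import Optional

# Recommended maximum output tokens, as stated in the PR notes.
RECOMMENDED_MAX_TOKENS = 32_768


def resolve_max_tokens(requested: Optional[int],
                       output_limit: Optional[int]) -> int:
    """Pick a max_tokens value for a request (hypothetical client logic).

    requested:    what the caller asked for, or None for "use a default".
    output_limit: the provider listing's output limit, or None when the
                  listing omits it (as the Coding Plan does).
    """
    if requested is None:
        # No explicit request: fall back to the recommended value.
        return RECOMMENDED_MAX_TOKENS
    if output_limit is None:
        # No listed limit: pass the caller's value through unchanged.
        return requested
    # Listed limit: never ask for more than the listing allows.
    return min(requested, output_limit)


if __name__ == "__main__":
    print(resolve_max_tokens(100_000, None))     # Coding Plan style listing
    print(resolve_max_tokens(100_000, 32_768))   # pay-per-token style listing
```

Under these assumptions, a listing without a limit leaves large requests untouched, while the pay-per-token listing caps them at its practical default.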

Umans AI Coding Plan (subscription):
- Add umans-kimi-k2.7 (Kimi K2.7 Code) model
- Add umans-flash-beta (deprecated alias for umans-flash)
- Fix umans-glm-5.1: add vision modality (via handoff) and interleaved reasoning
- Fix umans-flash: add interleaved reasoning field
- Fix umans-qwen3.6-35b-a3b: add interleaved reasoning field
- Add reasoning_options to all models
- Add explicit name field to models using base_model

Umans AI (new provider - pay-per-token for orgs):
- New provider for organization service-account usage
- Per-token pricing from the org billing page:
  - Kimi K2.6: $0.95/$4.00 (input/output), $0.20 cache read
  - Kimi K2.7 Code: $0.95/$4.00, $0.19 cache read
  - GLM 5.1: $1.40/$4.40, $0.29 cache read
  - Umans Flash (Qwen3.6-35B-A3B): $0.15/$1.00, $0.05 cache read
  - Umans Coder: routes to Kimi K2.6 rates
- Same endpoint (api.code.umans.ai), different billing model
@jcraftsman

@aredridel just submitted my first PR here :)

@rekram1-node rekram1-node merged commit 48d1bc2 into anomalyco:dev Jun 12, 2026
1 check passed