Skip to content

Add OAuth2 login to clickhouse-client (--login / --login=device)#1606

Open
BorisTyshkevich wants to merge 15 commits intoantalya-26.2from
feature/client-IdP
Open

Add OAuth2 login to clickhouse-client (--login / --login=device)#1606
BorisTyshkevich wants to merge 15 commits intoantalya-26.2from
feature/client-IdP

Conversation

@BorisTyshkevich
Copy link
Copy Markdown
Collaborator

Summary

Adds --login and --login=device flags to clickhouse-client so users can authenticate via an OpenID Connect provider without manually obtaining a JWT token.

  • --login (bare) — ClickHouse Cloud auto-login path; provider is inferred from the server (existing cloud behaviour, no credentials file needed)
  • --login=browser — Authorization Code + PKCE flow (RFC 7636); opens a browser, captures the callback on a one-shot localhost server, exchanges the code for an ID token
  • --login=device — Device Authorization flow (RFC 8628); prints a URL + short code, polls the token endpoint; works in headless/SSH environments
  • --oauth-credentials <path> — path to a Google-Cloud-Console-format JSON credentials file (default ~/.clickhouse-client/oauth_client.json)

Refresh tokens are cached in ~/.clickhouse-client/oauth_cache.json (mode 0600). Subsequent runs reuse the cached token silently.

Both flows produce an ID token that is passed to the existing --jwt path; no changes to the server or connection layer.

New files

File Purpose
src/Client/OAuthLogin.h OAuthCredentials struct, loadOAuthCredentials, obtainIDToken
src/Client/OAuthLogin.cpp Full OAuth2/PKCE/Device implementation (gated on USE_JWT_CPP && USE_SSL)
src/Client/tests/gtest_oauth_login.cpp 11 unit tests: credential loading, PKCE building blocks

Modified files

File Change
programs/client/Client.cpp --login[=<mode>], --oauth-credentials flags; option validation; cloud vs. OIDC dispatch
src/Access/TokenProcessorsOpaque.cpp Fix picojson long/double specialization gap for Google JWT exp on arm64
docs/en/interfaces/cli.md Document new flags; add OAuth credentials file format section
tests/queries/0_stateless/03749_cloud_endpoint_auth_precedence.sh Tests 9–11: missing credentials file, invalid mode, --jwt+--login conflict

Test plan

  • Unit tests (unit_tests_dbms --gtest_filter='OAuthLogin*') pass
  • 03749_cloud_endpoint_auth_precedence stateless test passes
  • --login=device --oauth-credentials ~/.clickhouse-client/oauth_client-tv.json against a real OIDC provider completes and returns an ID token
  • Second run reuses cached refresh token silently
  • --login=invalidBAD_ARGUMENTS
  • --jwt tok --loginBAD_ARGUMENTS
  • Build without SSL (USE_JWT_CPP=0) → SUPPORT_IS_DISABLED

🤖 Generated with Claude Code

bvt123 and others added 8 commits April 1, 2026 09:18
Adds two new flags to `clickhouse-client`:
- `--login`: Authorization Code + PKCE flow (RFC 7636), opens browser
- `--login-device`: Device Authorization flow (RFC 8628), prints URL+code

Both flows obtain an ID token from any OpenID Connect provider and then
proceed exactly as if `--jwt <token>` had been passed. No changes to the
connection layer.

The user places a Google-format credentials file at
`~/.clickhouse-client/oauth_client.json` (override with
`--oauth-credentials PATH`). Refresh tokens are cached in
`~/.clickhouse-client/oauth_cache.json` (mode 0600, written atomically)
so that subsequent runs proceed silently without re-authenticating.

`OAuthLogin.cpp` is moved to `src/Client/` (compiled into
`clickhouse_client`) so it is unit-testable via `unit_tests_dbms`.
Tests cover `loadOAuthCredentials` for valid/invalid JSON, missing
required fields, unknown top-level keys, and file-not-found error paths.

Backward compatibility: when `--login` is passed without a credentials
file and without `--login-device`, the existing cloud-specific `login()`
path is used as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Google ID tokens return the `exp` claim as a JSON number (floating-point
`double`), not an integer. Calling `getValueByKey<time_t>()` instantiated
`picojson::value::is<long>()` / `picojson::value::get<long>()`, which are
not provided by the `picojson` library and caused a linker exception on
arm64 (where `time_t` is `long`).

Fix: cast through `double` first.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Blocker 1 — cloud auto-login hijack:
Remove the filesystem::exists(oauth_client.json) auto-detection from the
`--login` processing path. The credentials-file flow is now only entered
when `--login-device` or `--oauth-credentials` is explicitly passed.
`--login` alone (including the auto-added case for *.clickhouse.cloud)
always falls through to the existing cloud login() path. This prevents
breaking cloud auth for users who happen to have a Google credentials
file at the default path.

Restore `--login` as `po::bool_switch()` (was changed to a presence flag)
so that existing scripts using `--login=true` continue to work.

XSS in callback HTML:
The OAuth2 auth-code callback handler reflected the error= query-string
parameter directly into the HTML response body. Add a minimal htmlEscape
helper and use it, preventing a potential XSS via a crafted redirect URI.

Remove unused SUPPORT_IS_DISABLED declaration:
That error code is used in Client.cpp (where it belongs), not in
OAuthLogin.cpp. Remove the dead declaration from the anonymous
ErrorCodes block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
postForm — RFC 6749 error response handling:
Removed the early HTTP-status check that threw before JSON parsing.
RFC 6749 §5.2 returns errors (authorization_pending, slow_down, etc.)
as HTTP 400 with a JSON body. postForm now always attempts JSON parsing
and throws only if the body is non-JSON, which fixes the device-flow
polling bug where authorization_pending caused an exception instead of
being handled by the caller's retry loop.

discoverDeviceEndpoint — sub-path issuer support:
The generic heuristic now strips only the last path segment of token_uri
instead of the entire path, preserving realm prefixes like Keycloak's
/realms/<realm>. Accepts an explicit issuer_hint parameter populated
from the new "issuer" field in the credentials JSON, bypassing the
heuristic entirely for providers that need it.

OAuthCredentials — optional issuer field:
Add std::string issuer to the struct and load it from the credentials
JSON. Providers that use non-root issuers (Keycloak, Okta custom
domains) can now specify issuer directly instead of relying on
heuristic path-stripping.

Auth code flow — CSRF state parameter (RFC 6749 §10.12):
Generate a 16-byte random hex state, include it in the authorization
URL, echo it back via the callback handler, and verify it matches
before proceeding to code exchange.

Logging for silently swallowed exceptions:
readCachedRefreshToken prints a notice when the cache file is
unparseable. tryRefreshToken now prints the rejection reason (expired,
revoked, or network error) before falling through to interactive flow.

Tests: add issuer field, PKCE building-block, base64url property tests.
Docs: fix --login prefix, add --login-device and --oauth-credentials
entries with credentials file format and cache location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the separate --login-device flag with a mode parameter on --login:

  --login            auth-code + PKCE flow (browser, default)
  --login=device     Device Authorization flow (prints URL + code)

Uses po::value<std::string>()->implicit_value("browser") so that bare
--login keeps its existing meaning. An invalid mode value throws
BAD_ARGUMENTS immediately. The --login=... prefix is now recognised in
readArguments() as an auth credential indicator.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs from testing:

1. --login device crash (Cannot convert to boolean: device):
   argsToConfig() runs after processOptions() and overwrites config["login"]
   with the raw string value from the command line. The subsequent
   config().getBool("login", false) in main() then threw because "device"
   is not a valid boolean.
   Fix: the cloud-login signal is now stored under the separate key
   "cloud_oauth_pending" which argsToConfig never touches. All three
   references to config["login"] (set-true, set-false, get) are updated.

2. --login=browser (or --login browser) fell through to the cloud path:
   Any explicitly specified mode should use the credentials-file OIDC flow,
   since the user is clearly opting into it. Only bare --login (implicit
   empty value) falls back to the ClickHouse Cloud auto-login path.
   Fix: implicit_value changed from "browser" to ""; use_credentials_file
   now triggers on !login_mode.empty() instead of mode == "device" only.

Routing table after this change:
  --login                → ClickHouse Cloud path (backward compat)
  --login=browser        → OIDC auth-code + PKCE (credentials file)
  --login=device         → OIDC device flow (credentials file)
  --login=<other>        → BAD_ARGUMENTS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Google returns verification_url (legacy) instead of verification_uri (RFC 8628).
getValue on a missing key threw 'Can not convert empty value'.

Fallback order: verification_uri_complete -> verification_uri -> verification_url.
Also guard device_code/user_code with explicit has() checks before access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tests

- Fix `--login[=<mode>]` doc entry: bare `--login` = Cloud auto-login (no
  default mode implied); `--login=browser` / `--login=device` trigger the
  OIDC credentials-file flow
- Update `--oauth-credentials` description and link to new section
- Add `### OAuth credentials file` section: JSON format example, required
  and optional fields table, default path, cache location
- Add Tests 9–11 to `03749_cloud_endpoint_auth_precedence.sh`:
  - `--login=device --oauth-credentials /nonexistent` → file-not-found error
  - `--login=invalid` → `BAD_ARGUMENTS` ("must be 'browser' or 'device'")
  - `--jwt tok --login=browser` → `BAD_ARGUMENTS` ("cannot both be specified")
- Update `.reference` with expected output for new tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 1, 2026

Workflow [PR], commit [c78caad]

bvt123 and others added 3 commits April 1, 2026 10:45
I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: bd0cfb8

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: f8f11a2

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 2c7c933

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 3d03550

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 2584ed9

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 1ce5b15

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 537268d

I, Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 145f276

Signed-off-by: Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>
Must-fix:
- Session token refresh: add OAuthJWTProvider extending JWTProvider so
  Connection::sendQuery can call getJWT() transparently; eliminates the
  1-hour session limit when id_token is obtained only once at startup
- Bind callback server to 127.0.0.1 explicitly (not 0.0.0.0) to prevent
  network-adjacent attackers from racing to deliver a forged callback
- Add offline_access scope to both auth-code and device flows so that
  standard OIDC providers (non-Google) issue refresh tokens

Should-fix:
- Guard postForm against non-object JSON: separate JSON parse error from
  Poco extract<Object::Ptr> BadCastException with a clearer message
- Warn on plain http:// token/auth endpoints in loadOAuthCredentials

Also fix include order: OAuthLogin.h now includes <config.h> first so
the USE_JWT_CPP && USE_SSL guard evaluates correctly regardless of the
order headers are included by callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cess

Google rejects offline_access as an invalid scope (Error 400: invalid_scope).
It uses the access_type=offline query parameter in the authorization URL
instead. Standard OIDC providers still require offline_access in the scope.

Add isGoogleProvider helper (detected via token_uri host) and apply the
correct mechanism for each: Google gets access_type=offline appended to
the authorization URL with openid email profile scope; all other providers
get offline_access in the scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
auto token_info = getObjectFromURI(Poco::URI("https://www.googleapis.com/oauth2/v3/tokeninfo"), token);
if (token_info.contains("exp"))
credentials.setExpiresAt(std::chrono::system_clock::from_time_t((getValueByKey<time_t>(token_info, "exp").value())));
credentials.setExpiresAt(std::chrono::system_clock::from_time_t(static_cast<time_t>(getValueByKey<double>(token_info, "exp").value())));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exp means the number of seconds since Unix epoch. Usually it is integer, and it is expected to be integer, isn\t it?
At least, Azure and Keycloak stick to integer values.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// picojson only has one numeric type: double
typedef double number_type;

So value.is() returns false, value.is<time_t>() returns false, but value.is() returns true — for any JSON number, including 1719500000.

The original code getValueByKey<time_t>(token_info, "exp") would always fail with "Value for key 'exp' has incorrect type" because picojson doesn't have time_t as a native type. The double fix is actually the
correct fix for picojson's type system.

So I retract my earlier comment. Using double is not "accidentally working" — it's the only way to extract a number from picojson. The static_cast<time_t> then truncates to integer, which is fine since the
value was integer to begin with (JSON 1719500000 → picojson double(1719500000.0) → time_t(1719500000)).

The commit message and the fix are correct. If anything, a brief comment explaining picojson's type model would help future readers:

// picojson stores all JSON numbers as double; cast to time_t for epoch seconds.

But that's a style nit, not a bug.

bvt123 and others added 4 commits April 1, 2026 11:15
- Move OAuthJWTProvider class and createOAuthJWTProvider factory into
  OAuthJWTProvider.cpp. The class delegates entirely to obtainIDToken
  (the public API) so it has no anonymous-namespace dependencies and
  can live independently.  OAuthLogin.cpp loses ~60 lines and its
  JWTProvider.h include.
- Fix silent catch(...) in writeCachedRefreshToken: log a warning when
  the existing cache file cannot be parsed (mirrors the read-side
  warning in readCachedRefreshToken).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
**Layer 2 — Keycloak integration tests**

New test suite `tests/integration/test_keycloak_auth/` exercises
Keycloak 26.0 as a real OIDC provider:

- `test_jwt_dynamic_jwks` — token validated via explicit JWKS URI
- `test_openid_discovery` — token validated via OIDC discovery document
- `test_username_claim` — `preferred_username` claim maps to ClickHouse user
- `test_token_refresh` — `refresh_token` grant produces a valid `id_token`
- `test_wrong_issuer_rejected` — tampered `iss` claim is rejected
- `test_expired_token_rejected` — expired `exp` claim is rejected
- `test_device_flow_initiation` — device auth endpoint returns correct fields
- `test_device_flow_round_trip` — full RFC 8628 round-trip: initiate →
  simulate browser approval via HTML form sequence → poll token endpoint →
  authenticate ClickHouse `SELECT 1` with the resulting `id_token`

Infrastructure additions:
- `tests/integration/compose/docker_compose_keycloak.yml` — Keycloak 26.0
  service with `--import-realm` and a health-check on the OIDC discovery URL
- `tests/integration/helpers/cluster.py` — `with_keycloak` flag following
  the existing `with_ldap` pattern: `setup_keycloak_cmd`,
  `wait_keycloak_to_start`, `get_keycloak_url`, docker-compose wiring, and
  `depends_on` entry in instance compose generation
- `tests/integration/test_keycloak_auth/keycloak/realm-export.json` —
  pre-configured realm with client `clickhouse` (direct grants + device
  auth enabled), user `alice`, group `analysts`, and a `groups` claim mapper
- `tests/integration/test_keycloak_auth/configs/validators.xml` —
  `jwt_dynamic_jwks` and `openid` token processors pointing at Keycloak
- `tests/integration/test_keycloak_auth/configs/users.xml` — default user
  with `access_management` enabled

**Fix: `--oauth-credentials` without `--login` silently ignored**

Add an explicit guard before the `if (options.count("login"))` block in
`programs/client/Client.cpp` that throws `BAD_ARGUMENTS` when
`--oauth-credentials` is supplied without `--login=browser` or
`--login=device`.

**Fix: dead CMake and duplicate jwt-cpp linkage in `src/CMakeLists.txt`**

Remove the dead `add_object_library(clickhouse_client_jwt Client/jwt)` block
(the `src/Client/jwt/` directory does not exist) and the duplicate
`target_link_libraries(clickhouse_common_io PUBLIC ch_contrib::jwt-cpp)` at
the Redis block — the identical linkage is already present earlier in the file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- `query_with_token`: replace `node.query(..., headers=...)` (unsupported
  parameter) with `node.http_request("", method="POST", data=query,
  headers=...)`, which uses the HTTP interface and accepts custom headers.

- `get_keycloak_url`: return `http://localhost:{port}` instead of
  `http://keycloak:{port}`.  The `keycloak` hostname is only resolvable
  inside the Docker network; pytest runs on the host and connects via the
  mapped port.  `wait_keycloak_to_start` already used `localhost` correctly.

- `configs/users.xml`: add user `alice` with `<jwt>` authentication so
  that `test_username_claim` (and any test that maps `preferred_username`
  to a ClickHouse session user) can find the user in the user store.

- Remove unused `import copy` from `test_wrong_issuer_rejected` and
  `test_expired_token_rejected`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
postForm and discoverDeviceEndpoint in OAuthLogin.cpp constructed
Poco::Net::HTTPSClientSession with the bare two-argument constructor,
which picks up Poco's built-in default SSL context (system CA bundle,
full verification) and ignores any openSSL.client.* configuration the
user has set, including custom CA certificates (caConfig) and
verificationMode.

In a corporate environment with a TLS inspection proxy, the proxy CA
is trusted only through the client configured CA bundle. With the old
code, every OAuth HTTPS request — OIDC discovery, token endpoint,
postForm — would fail with a certificate verification error, while the
upstream Cloud login path (which uses SSLManager::instance().defaultClientContext())
worked fine.

Fix: pass Poco::Net::SSLManager::instance().defaultClientContext() to
both HTTPSClientSession constructors, matching the pattern already used
in JWTProvider::createHTTPSession. This makes OAuth traffic respect
openSSL.client.caConfig, verificationMode, and every other SSL setting
the user has configured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@BorisTyshkevich
Copy link
Copy Markdown
Collaborator Author

App config for github.demo.altinity.cloud (not a secret!)

$ cat ~/.clickhouse-client/oauth_client-tv.json
{"installed":{"client_id":"925653064731-d1onu9jr6rdbpmeh41ast4t48cppbvvs.apps.googleusercontent.com","project_id":"altinity-mcp-oauth-test","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_secret":"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"}}

 $ cat ~/.clickhouse-client/oauth_client.json
{"installed":{"client_id":"925653064731-fnr56jga567r94470drdnjvvf6lbkf0r.apps.googleusercontent.com","project_id":"altinity-mcp-oauth-test","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_secret":"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","redirect_uris":["http://localhost"]}}

@svb-alt svb-alt added the antalya label Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants