Two correctness wins: Redis now mandatory on enterprise, and Microsoft Graph gets full OpenTelemetry
Two back-to-back tags — beta.27 and beta.28 — close two different gaps in NOBA's Microsoft Graph story. Both came out of a dependency refresh that turned into a full audit of how the enterprise tier talks to Azure AD / Entra ID.
Part 1 — Redis is now required on the enterprise tier (beta.27)
NOBA's Microsoft Graph throttle layer — the part that keeps AD migration jobs from getting rate-limited to a halt by Azure's published quotas — is implemented in graph_throttle.py. It tracks three things per tenant: which requests have been made in the current rolling window, which tenants are currently in a server-imposed pause after a Retry-After header, and how close we are to the pre-429 self-pace signal Microsoft sends in the x-ms-throttle-limit-percentage response header. All three are shared state, keyed by tenant and app.
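As a rough illustration of the pre-429 self-pace idea, the sketch below turns the x-ms-throttle-limit-percentage value into a pause before the next request. It is not the logic in graph_throttle.py: the function name, the reading of the header as a fraction of the limit consumed, the 0.8 starting point, and the linear ramp are all assumptions made for the example.

```python
from typing import Mapping, Optional

HEADER = "x-ms-throttle-limit-percentage"


def self_pace_delay(headers: Mapping[str, str],
                    start_at: float = 0.8,
                    max_delay_s: float = 5.0) -> float:
    """Return seconds to wait before the next request (0.0 means no pause).

    Hypothetical policy: no delay at or below `start_at`, then a linear ramp
    that reaches `max_delay_s` once the reported usage hits 1.0.
    """
    raw: Optional[str] = None
    for name, value in headers.items():  # header names are case-insensitive
        if name.lower() == HEADER:
            raw = value
            break
    if raw is None:
        return 0.0  # header absent: nothing to pace against
    try:
        used = float(raw)
    except ValueError:
        return 0.0  # unparseable value: do not invent a pause
    if used <= start_at:
        return 0.0
    fraction = min((used - start_at) / (1.0 - start_at), 1.0)
    return round(fraction * max_delay_s, 3)
```

A caller would sleep for the returned number of seconds before issuing the next request for that tenant and app; a real policy would likely be tuned per endpoint.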
In a single-worker deployment, that shared state lives happily in memory. In a multi-worker deployment — which is what any customer running NOBA against a real AD tenant will have in production — that state needs to be visible to every worker. Otherwise each worker has its own token bucket, each one thinks it hasn't used any of the published quota yet, and they all race independently into the 429 wall. The effective 429 rate scales linearly with worker count. The symptom is invisible under unit tests. It only shows up under real migration load.
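A toy simulation makes the failure mode concrete. It is not NOBA code: the WindowCounter class, the quota of 100, and the round-robin dispatch are invented for illustration. It compares per-worker counters against one counter shared by all workers within a single window.

```python
class WindowCounter:
    """Counts requests admitted in one window against a fixed quota."""

    def __init__(self, quota: int) -> None:
        self.quota = quota
        self.used = 0

    def try_acquire(self) -> bool:
        if self.used < self.quota:
            self.used += 1
            return True
        return False


def admitted(workers: int, quota: int, attempts: int, shared: bool) -> int:
    """Return how many of `attempts` requests get sent within one window."""
    if shared:
        one = WindowCounter(quota)
        counters = [one] * workers  # every worker sees the same state
    else:
        counters = [WindowCounter(quota) for _ in range(workers)]
    sent = 0
    for i in range(attempts):
        if counters[i % workers].try_acquire():  # round-robin dispatch
            sent += 1
    return sent


QUOTA = 100
print(admitted(workers=4, quota=QUOTA, attempts=1000, shared=False))  # 400
print(admitted(workers=4, quota=QUOTA, attempts=1000, shared=True))   # 100
```

With four isolated counters the workers together send four times the quota; in this toy model, everything past the first 100 requests is traffic the real service would be entitled to reject.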
Before beta.27, the cache backend was opportunistic: if NOBA_REDIS_URL was set and a Redis endpoint was reachable, it was used. If not, NOBA silently fell back to an in-memory cache and logged a warning. The warning was easy to miss. Multi-worker enterprise deployments could run for weeks before anyone noticed the throttle layer was pretending to coordinate state.
beta.27 makes that a hard failure. At startup, NOBA checks the license state:
- plan=enterprise, state=licensed or expired: a paid-or-perpetual production install. NOBA refuses to start if NOBA_REDIS_URL is unset or unreachable. The error message tells operators exactly what to do.
- plan=enterprise, state=trial or grace: evaluation. NOBA logs a warning but starts, because first-boot onboarding and in-trial evaluation need to work without Redis pre-configured.
- plan=free or state=unlicensed: community tier. Nothing changes. The in-memory fallback is fine for single-worker installs.
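The three cases above can be read as a small decision table. The sketch below is one way to express it, not NOBA's startup code: the names RedisPolicy, redis_policy, and startup_check, the error text, and the choice to raise ValueError on an unrecognized plan/state pair are all assumptions.

```python
from enum import Enum


class RedisPolicy(Enum):
    REQUIRE = "require"    # refuse to start without a reachable Redis
    WARN = "warn"          # start, but log a warning
    OPTIONAL = "optional"  # in-memory fallback is acceptable


def redis_policy(plan: str, state: str) -> RedisPolicy:
    """Map license plan and state to the Redis requirement described above."""
    if plan == "free" or state == "unlicensed":
        return RedisPolicy.OPTIONAL
    if plan == "enterprise" and state in {"licensed", "expired"}:
        return RedisPolicy.REQUIRE
    if plan == "enterprise" and state in {"trial", "grace"}:
        return RedisPolicy.WARN
    # Assumption: unknown combinations are treated as a configuration error.
    raise ValueError(f"unrecognized license: plan={plan!r} state={state!r}")


def startup_check(plan: str, state: str, redis_reachable: bool) -> str:
    """Return 'start' or 'start-with-warning', or raise if startup must stop."""
    policy = redis_policy(plan, state)
    if redis_reachable or policy is RedisPolicy.OPTIONAL:
        return "start"
    if policy is RedisPolicy.WARN:
        return "start-with-warning"
    # Hypothetical wording; the real message is not quoted in this post.
    raise RuntimeError("NOBA_REDIS_URL is unset or unreachable; "
                       "configure a reachable Redis endpoint and restart.")
```

Keeping the policy lookup separate from the reachability probe means the table can be unit-tested without a Redis instance.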
The Docker Compose stacks ship a redis:8-alpine service alongside NOBA now, with AOF persistence, a healthcheck, and a named volume; NOBA_REDIS_URL is pre-wired. Zero operator action for the default containerized path. Bare-metal and managed-Redis paths are documented in the configuration page.
One thing worth calling out: on Fedora 43 and RHEL 10 onward, sudo dnf install redis does not install upstream Redis. It installs Valkey — the Linux Foundation fork maintained by AWS, Google, Oracle, Ericsson, and Snap after Redis Inc.'s 2024 license change. Valkey is protocol-compatible with NOBA's redis-py client (same RESP, same commands, same port), so it works transparently. But it has its own CVE series and its own maintainers. If you're running your own cache, identify what's actually installed with redis-cli INFO server | grep server_name and track the right CVE feed. The minimum-safe-version table in the docs now lists both.
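The same check can be scripted. The sketch below parses the text returned by INFO server and reports which server answered. It assumes that Valkey reports a server_name field and that a missing field means upstream Redis; verify both against the versions you actually run. The function name and the sample text are hypothetical.

```python
def identify_server(info_text: str) -> str:
    """Return the server_name from INFO output, defaulting to 'redis'."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    # Assumption: no server_name field means upstream Redis.
    return fields.get("server_name", "redis").lower()


# Illustrative input only, not captured from a real server.
sample = "# Server\r\nredis_version:7.2.4\r\nserver_name:valkey\r\n"
print(identify_server(sample))  # valkey
```

With redis-py, the input could come from the dictionary that the client's info("server") call returns, in which case the parsing step is unnecessary and only the server_name lookup remains.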
The redis>=7.4 Python client floor was chosen after an OSV audit against every historical redis-py CVE. All of them were in the 4.x branch and fixed by 4.5.4, so anything on 7.x is clean. The floor is declared across all six NOBA pip install surfaces (source dist, Docker image, installer script, and three CI workflow venv setups), so a fresh install on any of those paths cannot silently resolve to a vulnerable version pulled in as a transitive dependency.
Part 2 — OpenTelemetry for the hand-rolled Microsoft Graph client (beta.28)
NOBA's Microsoft Graph integration is hand-rolled on httpx. It does not use msal, azure-identity, or msgraph-sdk. That choice is documented as ADR-008 and backed by an evidence pack that cites Microsoft Learn, ENISA NIS2 guidance, OWASP ASVS 5.0, OWASP A06:2021, NIST SP 800-204B, and the relevant CVEs. The short version has three parts. No regulatory framework mandates a specific SDK. Microsoft's own support boundary is the HTTP request, not the SDK wrapper. And NOBA's graph_throttle layer is strictly more capable than what msgraph-sdk's Kiota middleware provides for our workload: pre-429 self-pace on the x-ms-throttle-limit-percentage header, token-bucket accounting against three published Graph limits, and cross-worker coordination.
The cost of that choice is observability. An msgraph-sdk adoption would have come with Kiota's built-in OpenTelemetry auto-instrumentation: spans around every request, metrics on throttle behavior, traces that line up cleanly in a Grafana or Jaeger dashboard. Hand-rolled means we own the observability too.
beta.28 closes that gap. The new server/otel_graph.py module defines a stable contract for Graph telemetry:
- Tracer noba.graph emits three span kinds: graph.oauth.token around OAuth2 client-credentials token acquisition, graph.request around every read/write attempt (one span per attempt, with retry count as an attribute), and graph.au.replica_retry around the Administrative Unit 404-retry loop that absorbs Azure's post-create read/write replica race.
- Histogram noba.graph.request.duration records per-request latency in milliseconds, attributed by HTTP method, response status code, parsed Graph endpoint (users, groups, directory, $batch, ...), parsed operation (get, create, update, delete, delta, ...), attempt number, and a hashed tenant identifier.
- Counters cover the operational axes a Graph dashboard actually wants: noba.graph.throttled for 429s received, noba.graph.tenant_pause for Retry-After sleeps honored, noba.graph.token_cache with a result=hit|miss attribute for the OAuth2 token cache, and noba.graph.au_replica_race with an outcome=resolved|gave_up attribute for the AU 404-retry loop.
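To show what "parsed endpoint" and "parsed operation" could mean in practice, here is a minimal sketch that derives the histogram attributes from a request. Only graph.tenant.hash is a name taken from this post; the other attribute keys, the function name, and the parsing rules (first path segment after the API version, HTTP method mapped to an operation) are assumptions, and the real module may parse differently.

```python
from urllib.parse import urlparse

_API_VERSIONS = {"v1.0", "beta"}
_METHOD_TO_OP = {
    "GET": "get",
    "POST": "create",
    "PATCH": "update",
    "PUT": "update",
    "DELETE": "delete",
}


def graph_attributes(method: str, url: str, status: int,
                     attempt: int, tenant_hash: str) -> dict:
    """Build a flat attribute dict for one Graph request attempt."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if segments and segments[0] in _API_VERSIONS:
        segments = segments[1:]  # drop the version prefix
    endpoint = segments[0] if segments else "unknown"
    if segments and segments[-1] == "delta":
        operation = "delta"  # delta queries are tracked separately
    else:
        operation = _METHOD_TO_OP.get(method.upper(), "other")
    return {
        "http.request.method": method.upper(),
        "http.response.status_code": status,
        "graph.endpoint": endpoint,    # hypothetical key
        "graph.operation": operation,  # hypothetical key
        "graph.attempt": attempt,      # hypothetical key
        "graph.tenant.hash": tenant_hash,
    }
```

Keeping endpoint and operation to a small fixed vocabulary, rather than recording the full URL, bounds metric cardinality and keeps object IDs out of the telemetry.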
Tenant identifiers are a common OpenTelemetry pitfall: spans and metrics get shipped to observability backends, shared with auditors, sometimes copied into support tickets. Raw tenant GUIDs in span attributes are a data-sharing hazard. The graph.tenant.hash attribute is always sha256(tenant_id)[:8] — stable across runs so per-tenant correlation still works, but never the raw GUID. Auditors can trust the traces; ticket-attachment hygiene gets easier.
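In Python, the sha256(tenant_id)[:8] rule might look like the following. The function name is hypothetical, and the sketch assumes the hash is taken over the UTF-8 bytes of the tenant ID and truncated to the first eight hexadecimal characters; the post does not spell out either detail.

```python
import hashlib


def tenant_hash(tenant_id: str) -> str:
    """Return a short, stable, non-reversible label for a tenant ID."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).hexdigest()
    return digest[:8]  # first 8 hex characters (32 bits)


# Placeholder GUID, not a real tenant.
label = tenant_hash("00000000-0000-0000-0000-000000000000")
print(len(label))  # 8
```

Because the input is hashed as-is, the same GUID written in different letter case would produce a different label, so callers would need to pass the tenant ID in one canonical form.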
The module exports a no-op fallback when no OpenTelemetry provider is installed, so community deployments pay zero cost and enterprise deployments running NOBA's existing opentelemetry-exporter-otlp pipeline light up the new instruments immediately.
The other thing beta.28 quietly fixes
The release.yml GitHub Actions workflow builds tarball, RPM, DEB, and Arch packages and publishes them to GitHub Releases. It was also running on NOBA's private Gitea mirror — where the packaging jobs always failed, because they assume GitHub Release API semantics that Gitea doesn't implement identically. Every tag push burned CPU on the Gitea self-hosted runner before failing.
beta.28 gates every job in that workflow with if: github.server_url == 'https://github.com'. On GitHub Actions the workflow runs normally. On Gitea, all seven jobs skip cleanly. The release-gitea.yml workflow handles the Gitea-side release path separately.
Why this matters for a NIS2 story
ADR-008's "keep hand-rolled" decision is defensible against the letter of NIS2 only if NOBA can demonstrate that the hand-rolled stack has the same operational properties a vendor SDK would have. Two of those properties were missing before these tags:
- Shared throttle state — without it, the "respects Microsoft Graph rate limits" claim was false under any multi-worker deployment. beta.27 closes that.
- Auditable telemetry — without it, the "we can prove our integration with your identity provider behaves correctly" claim had no evidence artifact. beta.28 closes that.
The next couple of tags continue the NIS2 hardening pass on adjacent surfaces. The Entra social-login flow in auth_social.py is in the queue.
Upgrading
The Docker Compose path is zero-action: docker compose pull && docker compose up -d picks up the new image and the Redis sibling service. Bare-metal and package upgrades need one line of configuration added — point NOBA_REDIS_URL at a reachable Redis endpoint (or install one locally). The configuration docs have the commands for Fedora, Debian/Ubuntu, and managed-Redis paths.
If you're on the trial or grace period, nothing breaks: NOBA starts, logs a warning, and you have time to decide where to point the cache. If you're on a paid license and you restart without configuring the cache, NOBA will refuse to start and print an actionable error message. That's by design.
Changelog entries
The full entries are on the changelog page. The short version: beta.26 closed four CVEs at declared dependency floors (python-multipart and PyMySQL). beta.27 made Redis mandatory on the enterprise tier, shipped the Docker Compose Redis sibling, and declared the redis>=7.4 client floor across all six install surfaces. beta.28 added OpenTelemetry instrumentation for the hand-rolled Microsoft Graph client and gated the GitHub release workflow to GitHub-only so Gitea Act-runner stops burning cycles on jobs it was always going to fail.