RBAC and access control (Business+)
Role, policy, and token model to limit who can submit, read, and operate the cluster.
In the Community plan, anyone holding the admin token can do everything. That is acceptable for small teams but scales poorly. Starting with the Business plan, the cluster gains a fine-grained role model.
This document explains the model, shows how to assign roles, and closes with practices that work in production.
The model in four concepts
| Concept | What it is |
|---|---|
| User | Human or service identity |
| Token | Bearer credential associated with a user |
| Role | Name aggregating one or more policies |
| Policy | Concrete rules: what is allowed on which resource |
A user has one or more tokens. Each token carries one or more roles. Each role is a set of policies. Each policy authorizes capabilities on resources.
When a call reaches the cluster, the token resolves to its effective set of capabilities. If the requested operation is in that set, the call proceeds. Otherwise it is rejected with 403 Forbidden.
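The resolution chain described above (token, roles, policies, capabilities) can be sketched in a few lines. This is a minimal illustration of the model only, not heroctl's implementation: the class names, the `grants` field, and the `authorize` function are hypothetical, and the sketch assumes capabilities are simply unioned across all roles and policies.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    # Resource type (e.g. "job") -> capabilities allowed on it.
    grants: dict[str, set[str]]


@dataclass
class Role:
    name: str
    policies: list[Policy]


@dataclass
class Token:
    name: str
    roles: list[Role]


def effective_capabilities(token: Token) -> set[tuple[str, str]]:
    """Union of (resource, capability) pairs over every role and policy."""
    allowed: set[tuple[str, str]] = set()
    for role in token.roles:
        for policy in role.policies:
            for resource, capabilities in policy.grants.items():
                allowed.update((resource, cap) for cap in capabilities)
    return allowed


def authorize(token: Token, resource: str, capability: str) -> int:
    """Return an HTTP-style status: 200 if allowed, 403 otherwise."""
    if (resource, capability) in effective_capabilities(token):
        return 200
    return 403


# Hypothetical example data, not one of the built-in roles.
read_jobs = Policy("read-jobs", {"job": {"read", "list"}})
viewer_like = Role("viewer-like", [read_jobs])
token = Token("example", [viewer_like])

print(authorize(token, "job", "read"))    # 200
print(authorize(token, "job", "submit"))  # 403
```

The check is deny-by-default: anything not explicitly granted by some policy ends in 403.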
Built-in roles
For most teams, the four ready-made roles cover what's needed:
| Role | What it does |
|---|---|
| admin | Everything. Includes creating users, changing policies, rekey, snapshot. |
| operator | Submits and manages jobs in any namespace. Does not create users. |
| deployer | Submits jobs in specific namespaces. Does not inspect secrets. |
| viewer | Read-only. Sees jobs, allocations, metrics. Does not see secrets. |
List what exists:
heroctl acl role list
heroctl acl role describe operator
Custom policies
When the built-in roles aren't enough, write a policy in YAML:
# file: deployer-prod.yaml
name: deployer-prod
description: "Submits jobs in the prod namespace, without reading secrets"
rules:
  - resource: job
    namespace: "prod"
    capabilities: ["read", "list", "submit", "stop"]
  - resource: namespace
    name: "prod"
    capabilities: ["read"]
  - resource: alloc
    namespace: "prod"
    capabilities: ["read", "logs"]
  - resource: secret
    capabilities: []  # explicitly denied
Apply:
heroctl acl policy create -f deployer-prod.yaml
# attach the policy to a new role
heroctl acl role create --name deploy-prod --policies deployer-prod
Common capabilities:
- read / list: reading
- submit / update: creation and modification
- stop / delete: termination
- logs: access to allocation stdout/stderr
- exec: open a shell in a running allocation

Warning: exec in production is one of the most dangerous capabilities. Treat it as root access in the container. Restrict it to at most two or three operators.
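To make the rule semantics of a policy such as deployer-prod concrete, the sketch below evaluates the same rules as plain Python data. How heroctl matches rules internally is not described in this document, so the matching logic is an assumption: a rule applies when its resource type and any `namespace` or `name` selector match, and everything else is denied. The function name `is_allowed` is hypothetical.

```python
# Rules mirror the deployer-prod policy; the evaluation model is assumed.
DEPLOYER_PROD_RULES = [
    {"resource": "job", "namespace": "prod",
     "capabilities": ["read", "list", "submit", "stop"]},
    {"resource": "namespace", "name": "prod", "capabilities": ["read"]},
    {"resource": "alloc", "namespace": "prod",
     "capabilities": ["read", "logs"]},
    {"resource": "secret", "capabilities": []},
]


def is_allowed(rules, resource, capability, namespace=None, name=None):
    """Return True if any rule grants the capability on the target."""
    for rule in rules:
        if rule["resource"] != resource:
            continue
        # A rule without a selector applies to every target of that
        # resource type (an assumption of this sketch).
        if "namespace" in rule and rule["namespace"] != namespace:
            continue
        if "name" in rule and rule["name"] != name:
            continue
        if capability in rule["capabilities"]:
            return True
    return False  # nothing matched: deny by default


print(is_allowed(DEPLOYER_PROD_RULES, "job", "submit", namespace="prod"))     # True
print(is_allowed(DEPLOYER_PROD_RULES, "job", "submit", namespace="staging"))  # False
print(is_allowed(DEPLOYER_PROD_RULES, "alloc", "exec", namespace="prod"))     # False
print(is_allowed(DEPLOYER_PROD_RULES, "secret", "read"))                      # False
```

Note that exec is denied here simply because no rule lists it, which is the behavior the warning above recommends for most roles.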
Creating tokens
A token is what goes in the HEROCTL_TOKEN variable or the X-Heroctl-Token header:
# long-lived token for a human operator
heroctl acl token create \
--name "joao-deploy-prod" \
--policies deploy-prod \
--ttl 90d
# short-lived token for CI/CD
heroctl acl token create \
--name "ci-pipeline" \
--policies deployer \
--ttl 1h \
--bound-cidr 10.20.0.0/16
The output shows the token secret only once. If you lose it, generate another; there is no recovery.
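As an illustration of what the --ttl and --bound-cidr options imply at request time, the sketch below checks whether a token would still be usable. It assumes that --bound-cidr restricts the source IP of calls and that TTL strings use only the `h` and `d` suffixes seen above; `parse_ttl` and `token_usable` are hypothetical helpers, not part of heroctl.

```python
import ipaddress
import re
from datetime import datetime, timedelta, timezone

# Only the suffixes used in the examples above are handled.
_UNITS = {"h": "hours", "d": "days"}


def parse_ttl(ttl: str) -> timedelta:
    """Convert strings such as "1h" or "90d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([hd])", ttl)
    if match is None:
        raise ValueError(f"unsupported TTL: {ttl!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def token_usable(created_at, ttl, now, source_ip, bound_cidr=None) -> bool:
    """True if the token has not expired and the caller IP is in range."""
    if now >= created_at + parse_ttl(ttl):
        return False  # expired
    if bound_cidr is not None:
        network = ipaddress.ip_network(bound_cidr)
        return ipaddress.ip_address(source_ip) in network
    return True


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
in_window = created + timedelta(minutes=30)

print(token_usable(created, "1h", in_window, "10.20.3.4", "10.20.0.0/16"))    # True
print(token_usable(created, "1h", in_window, "192.168.1.1", "10.20.0.0/16"))  # False
```

A short TTL combined with a network binding means a leaked CI token is only useful briefly and only from inside the expected range.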
Long-lived vs short-lived
| Use case | Recommended TTL |
|---|---|
| Human operator | 30 to 90 days |
| CI/CD pipeline | 1 to 24 hours (renewed each run) |
| Monitoring integration | 1 year, with viewer scope only |
| Break-glass emergency token | No expiration, kept in physical safe |
Short-lived tokens with automatic renewal are preferable. Their operational cost has dropped significantly since CI/CD platforms gained native OIDC support.
Revocation
At any time:
heroctl acl token list
heroctl acl token revoke <token-id>
Revocation takes effect right away and propagates to all nodes within seconds. Use it when someone leaves the team, when you suspect a compromise, or when rotating tokens.
To revoke all tokens of a user at once:
heroctl acl token revoke --user joao --all
Audit log
Every authenticated call is recorded. This includes calls that failed for permission reasons.
# last 7 days for a specific user
heroctl audit log --user joao --since 7d
# all authorization failures
heroctl audit log --result deny --since 24h
# everything that touched a specific job
heroctl audit log --resource job --name api-pagamentos --since 30d
Each record has: timestamp, identity, source IP, target resource, operation, result. Export for external analysis as JSON:
heroctl audit log --since 30d --format json
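Once exported, the records can be summarized with a short script. The sketch below counts authorization failures per identity. The exact JSON shape is not specified in this document, so it assumes a JSON array of objects with `identity` and `result` keys, where `result` is `"deny"` for failures; adjust the key names to match the real export.

```python
import json
from collections import Counter


def denies_by_identity(audit_json: str) -> list[tuple[str, int]]:
    """Count denied calls per identity, most frequent first."""
    records = json.loads(audit_json)
    counts = Counter(
        record["identity"] for record in records if record["result"] == "deny"
    )
    return counts.most_common()


# Hypothetical sample in the assumed shape.
sample = json.dumps([
    {"identity": "joao", "operation": "submit", "result": "allow"},
    {"identity": "ci-pipeline", "operation": "exec", "result": "deny"},
    {"identity": "ci-pipeline", "operation": "exec", "result": "deny"},
    {"identity": "joao", "operation": "delete", "result": "deny"},
])

print(denies_by_identity(sample))  # [('ci-pipeline', 2), ('joao', 1)]
```

An identity that suddenly accumulates denials is worth a closer look: it may be a misconfigured pipeline or someone probing for access.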
SSO integration (Business)
For larger teams, managing tokens by hand doesn't scale. SSO via SAML or OIDC integrates with your existing identity provider:
sso:
  type: oidc
  issuer: https://idp.empresa.com.br
  client_id: heroctl-prod
  client_secret: ${secret.oidc_client_secret}
  group_to_role:
    "engineering-prod": operator
    "engineering-dev": deployer
    "sre": admin
    "*": viewer
Users authenticate through the web panel and automatically receive the set of roles corresponding to their IdP groups. When they leave the company, they are deprovisioned in the IdP, and access to the cluster ends with it.
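The group_to_role mapping can be read as a lookup from IdP groups to cluster roles. The sketch below shows one plausible interpretation, in which the `"*"` entry acts as a fallback for users who match no listed group. That fallback semantics is an assumption of this sketch (the entry could instead apply to every user), and `roles_for` is a hypothetical helper.

```python
GROUP_TO_ROLE = {
    "engineering-prod": "operator",
    "engineering-dev": "deployer",
    "sre": "admin",
    "*": "viewer",
}


def roles_for(groups, mapping=GROUP_TO_ROLE):
    """Map a user's IdP groups to cluster roles, with "*" as fallback."""
    roles = {mapping[g] for g in groups if g != "*" and g in mapping}
    if not roles and "*" in mapping:
        # No listed group matched: fall back to the wildcard role.
        roles.add(mapping["*"])
    return roles


print(sorted(roles_for(["sre", "engineering-dev"])))  # ['admin', 'deployer']
print(sorted(roles_for(["marketing"])))               # ['viewer']
```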
For CLI, OIDC device flow:
heroctl auth login
# opens the browser, completes authentication at the IdP, returns to the terminal
Best practices
- Least privilege. Start by denying everything, open exactly what's needed. It's easier to loosen later than to tighten.
- No shared tokens. Each person, each service, each pipeline has their own. Audit only works if identity is unique.
- Quarterly rotation. Long-lived tokens cycle on a predictable schedule. Mark it on the calendar.
- Monthly audit review. Thirty minutes a month looking at the log. Strange patterns show up if you look.
- Documented break-glass. Have an emergency admin token, with long expiration, kept out of band (physical safe, password manager with dedicated 2FA). Use it once a year, at most.
- Onboarding and offboarding by checklist. When someone joins: create tokens, assign roles. When they leave: revoke. Don't trust memory.
- CI tokens with no external credentials. Use the CI provider's OIDC directly against the cluster, with no static secret. Reduces leak surface.
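The quarterly rotation practice above is easy to automate against a token inventory. The sketch below flags tokens that are at least 90 days old. The inventory format (token name mapped to creation date) is hypothetical and maintained by you; it is not the output of any heroctl command.

```python
from datetime import date, timedelta

ROTATION_PERIOD = timedelta(days=90)  # roughly one quarter


def due_for_rotation(tokens: dict[str, date], today: date) -> list[str]:
    """Return the names of tokens created at least 90 days before today."""
    return sorted(
        name for name, created in tokens.items()
        if today - created >= ROTATION_PERIOD
    )


# Hypothetical inventory: token name -> creation date.
inventory = {
    "joao-deploy-prod": date(2024, 1, 10),
    "monitoring": date(2024, 3, 20),
}

print(due_for_rotation(inventory, today=date(2024, 4, 15)))  # ['joao-deploy-prod']
```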
Next steps
- Review the secrets configuration, which has its own permission model.
- See metrics and logs to correlate audit events with cluster behavior.