feat(share-ai): cap per-request output tokens and fail closed on Redis loss
Harden the anonymous public-share AI assistant against token-cost abuse before exposing it to the internet:

- Add an env-tunable per-request output ceiling (maxOutputTokens) to the public-share streamText call so one anonymous request cannot run up the provider bill even if the per-IP throttle is evaded. New resolveShareAiMaxOutputTokens() / SHARE_AI_MAX_OUTPUT_TOKENS_DEFAULT (env SHARE_AI_MAX_OUTPUT_TOKENS, default 512), mirroring resolveShareAiWorkspaceMax().
- Flip the per-workspace cost limiter to FAIL CLOSED on Redis failure (was fail-open): if Redis is unavailable we cannot prove the workspace is under its cap, so deny rather than admit an unmetered, billable call.
- Update the limiter spec (fail-open -> fail-closed) and add resolver tests; document both knobs in .env.example.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
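For illustration, the resolver named in the message might look like the minimal sketch below. It assumes the raw value is passed in as a string (the actual function presumably reads SHARE_AI_MAX_OUTPUT_TOKENS from the environment itself) and that missing, non-integer, or non-positive values fall back to the default. Neither detail is confirmed by this commit.

```typescript
// Hypothetical sketch only: the real resolveShareAiMaxOutputTokens() may differ.
const SHARE_AI_MAX_OUTPUT_TOKENS_DEFAULT = 512;

// Assumption: the caller passes the raw env string (e.g. process.env.SHARE_AI_MAX_OUTPUT_TOKENS).
function resolveShareAiMaxOutputTokens(raw: string | undefined): number {
  const parsed = Number(raw);
  // Accept only positive integers; anything else (unset, "", "abc", "0", "-5")
  // falls back to the default so a typo cannot remove the ceiling.
  return Number.isInteger(parsed) && parsed > 0
    ? parsed
    : SHARE_AI_MAX_OUTPUT_TOKENS_DEFAULT;
}
```

The resolved number would then be handed to the streamText call as its maxOutputTokens option.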
@@ -112,7 +112,12 @@ MCP_DOCMOST_PASSWORD=
 #
 # Backstop: a cluster-wide, sliding-window cap per workspace (IP-independent,
 # keyed by the server-resolved workspace id) bounds the owner's bill even if the
-# per-IP limit is fully evaded. It is a COST backstop, not an access control,
-# and FAILS OPEN if Redis is unavailable. Override the hourly cap below
+# per-IP limit is fully evaded. It is a COST backstop, not an access control, and
+# FAILS CLOSED if Redis is unavailable (an optional assistant briefly going
+# offline is safer than an unbounded bill). Override the hourly cap below
 # (default: 300 calls per workspace per rolling hour).
 # SHARE_AI_WORKSPACE_MAX_PER_HOUR=300
+#
+# Per-request output-token ceiling for the anonymous assistant (default: 512).
+# Worst-case output per accepted call = agent steps (5) × this value.
+# SHARE_AI_MAX_OUTPUT_TOKENS=512
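The fail-closed behaviour and the worst-case arithmetic documented in this hunk can be sketched as follows. The WindowCounter interface and the allowShareAiCall and worstCaseOutputTokens names are hypothetical stand-ins for the Redis-backed sliding-window limiter, not the project's actual API.

```typescript
// Hypothetical stand-in for the Redis-backed sliding-window counter.
interface WindowCounter {
  // Records one call and resolves to the number of calls in the current window.
  hit(workspaceId: string): Promise<number>;
}

// Returns true only when the counter proves the workspace is under its cap.
async function allowShareAiCall(
  counter: WindowCounter,
  workspaceId: string,
  maxPerHour: number,
): Promise<boolean> {
  try {
    const used = await counter.hit(workspaceId);
    return used <= maxPerHour;
  } catch {
    // Fail closed: if the store is unreachable the call cannot be metered,
    // so deny it instead of admitting an unbounded, billable request.
    return false;
  }
}

// Worst-case output per accepted call = agent steps x per-request ceiling.
function worstCaseOutputTokens(maxOutputTokens: number, agentSteps = 5): number {
  return agentSteps * maxOutputTokens;
}
```

With the defaults documented above, worstCaseOutputTokens(512) is 2560 tokens per accepted call, so 300 calls per rolling hour bound a workspace at 768,000 output tokens per hour if every call hits the ceiling on every step.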