When agents move money, the threat surface widens. An LLM that can authorize a charge can also be coerced (via prompt injection, jailbreak, or a compromised upstream model) into authorizing a fraudulent one. The toolkit's job is to make those attacks mechanically harder to execute, rather than hoping the model never gets confused.
This page enumerates every threat we have explicitly thought about, with the specific mitigation in code. The structure is inspired by STRIDE and the OWASP LLM Top 10. The four operator rows below (T15-T18) apply to a sociedad-IA, the umbrella nickname for the Sociedad Automatizada (art. 14 of the Anteproyecto de Ley General de Sociedades, not yet law). Each threat carries one of three statuses:
Mitigated by toolkit: code in @ar-agents/* eliminates the attack or substantially raises the bar for it.
Host is responsible: the toolkit gives you the primitives, but you have to wire them correctly (e.g., using HSM/KMS for cert storage).
Out of scope: the attack lives outside the boundary the toolkit can reasonably defend.
T1: Mitigated by toolkit
An LLM agent retries a tool call after a network blip and double-charges the customer.
MITIGATION: Deterministic SHA-256 idempotency keys are derived from the input parameters in the 4 mutating tools (create_payment, create_subscription, create_payment_preference, refund_payment). Same inputs → same key → MP dedupes server-side.
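A minimal sketch of how a deterministic key of this kind can be derived. The helper names (`canonicalize`, `idempotencyKey`) and the choice to mix the tool name into the hash are assumptions for illustration, not the toolkit's actual implementation.

```typescript
import { createHash } from "node:crypto";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serialize with object keys sorted, so logically equal inputs always
// produce the same string regardless of property order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  const fields = Object.entries(value)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
  return `{${fields.join(",")}}`;
}

// Hypothetical helper: the tool name is hashed together with the parameters
// so two different tools called with identical inputs do not share a key.
function idempotencyKey(tool: string, params: Json): string {
  return createHash("sha256").update(`${tool}\n${canonicalize(params)}`).digest("hex");
}
```

A retried call with the same parameters yields the same 64-character hex key, which is what lets the payment provider recognize and drop the duplicate.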
T2: Mitigated by toolkit
A compromised LLM (jailbreak / prompt injection) authorizes a refund, cancellation, or card deletion the user didn't consent to.
MITIGATION: 8 irreversible tools (refund_payment, cancel_subscription, cancel_payment_preference, pause_subscription, delete_customer_card, cancel_qr_dynamic, delete_pos, revoke_marketplace_token) require a `requireConfirmation` callback. Tool execution blocks until the host confirms via UI / Slack / email. This is a programmatic gate, not an LLM instruction.
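The gate can be pictured as a wrapper that awaits the host's decision before the tool body ever runs. This is a sketch under assumptions: the `gated` helper, the `ConfirmationRequest` shape, and the boolean return of the callback are hypothetical, and the real `requireConfirmation` signature may differ.

```typescript
// Hypothetical shapes, chosen for illustration only.
type ConfirmationRequest = { tool: string; input: unknown };
type RequireConfirmation = (req: ConfirmationRequest) => Promise<boolean>;

class ConfirmationDeniedError extends Error {}

// Wraps a tool so that its body runs only after the host approves.
// The check lives in code, so no prompt can talk the model past it.
function gated<I, O>(
  tool: string,
  execute: (input: I) => Promise<O>,
  requireConfirmation: RequireConfirmation,
): (input: I) => Promise<O> {
  return async (input: I): Promise<O> => {
    const approved = await requireConfirmation({ tool, input });
    if (approved !== true) {
      throw new ConfirmationDeniedError(`${tool} was not confirmed by the host`);
    }
    return execute(input);
  };
}
```

Because the callback returns a promise, the host can resolve it from any channel (a UI button, a Slack action, an email link) and the tool call simply stays pending until then.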
T3: Mitigated by toolkit
Webhook spoofing: an attacker crafts fake MP webhooks to mark fake payments as completed.
MITIGATION: verifyWebhookSignature() computes an HMAC-SHA256 over (id, request-id, ts) with the shared secret. Constant-time comparison defeats timing attacks. A 5-minute replay-tolerance window rejects old signed payloads.
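The three checks described above (HMAC, constant-time compare, replay window) can be sketched as follows. This is not the toolkit's `verifyWebhookSignature()`: the function names, the manifest layout, the millisecond timestamp unit, and the hex encoding are all assumptions; consult MP's webhook documentation for the exact format.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const REPLAY_WINDOW_MS = 5 * 60 * 1000; // 5-minute tolerance

interface SignedWebhook {
  id: string;        // resource id carried by the notification
  requestId: string; // value of the request-id header
  ts: number;        // signature timestamp, assumed to be epoch milliseconds
  signature: string; // hex-encoded HMAC supplied by the sender
}

// Assumed manifest layout, for illustration only.
function manifestHmac(secret: string, id: string, requestId: string, ts: number): string {
  const manifest = `id:${id};request-id:${requestId};ts:${ts};`;
  return createHmac("sha256", secret).update(manifest).digest("hex");
}

function verifyWebhookSketch(hook: SignedWebhook, secret: string, now: number = Date.now()): boolean {
  // Reject payloads whose timestamp falls outside the replay window.
  if (Math.abs(now - hook.ts) > REPLAY_WINDOW_MS) return false;
  const expected = Buffer.from(manifestHmac(secret, hook.id, hook.requestId, hook.ts), "hex");
  const received = Buffer.from(hook.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Including `ts` in the signed manifest is what makes the replay window meaningful: an attacker cannot refresh the timestamp without invalidating the signature.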
T4: Mitigated by toolkit
Webhook replay: an attacker replays a legitimately signed webhook to trigger duplicate downstream actions.
MITIGATION: The WebhookDedup helper short-circuits duplicate webhook IDs server-side, with a configurable TTL window (default 24h). State is persisted via the same KV adapter the rest of the toolkit uses.
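To illustrate the dedup idea, here is an in-memory stand-in. The class and method names are hypothetical and this is not the WebhookDedup API; a production version would use the persistent KV adapter with an atomic set-if-absent operation so that concurrent deliveries cannot both pass.

```typescript
// In-memory sketch of TTL-based webhook deduplication.
class InMemoryDedupSketch {
  private readonly seen = new Map<string, number>(); // webhook id -> expiry (epoch ms)
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true only the first time an id is seen inside the TTL window.
  firstSeen(id: string, now: number = Date.now()): boolean {
    const expiry = this.seen.get(id);
    if (expiry !== undefined && expiry > now) return false;
    this.seen.set(id, now + this.ttlMs);
    return true;
  }
}
```

A handler would call `firstSeen(webhookId)` right after signature verification and return early with a success status when it yields `false`, so the sender stops retrying without the downstream action running twice.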
T5: Mitigated by toolkit
Access token leak: MP/AFIP/Meta credentials end up in client-side JS bundles.
MITIGATION: MercadoPagoClient and WsfeClient throw at construction time when instantiated in a browser context (`typeof window !== 'undefined'` check). The README warns 'use Server Components / Route Handlers / Server Actions only'. Server-side use is enforced; the agent loop runs on Edge or Node.
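The construction-time guard amounts to a few lines. The sketch below uses a hypothetical class and helper name; the `globalThis` lookup is equivalent to the `typeof window` check but also compiles when the DOM type library is not loaded.

```typescript
// Throws when a browser-like global `window` exists.
function assertServerSide(what: string): void {
  const maybeWindow = (globalThis as unknown as Record<string, unknown>)["window"];
  if (typeof maybeWindow !== "undefined") {
    throw new Error(`${what} must not be instantiated in a browser; use it from server code only.`);
  }
}

// Hypothetical client, standing in for a credential-holding API client.
class ServerOnlyClientSketch {
  private readonly accessToken: string;

  constructor(accessToken: string) {
    assertServerSide("ServerOnlyClientSketch"); // fail before the token is stored
    this.accessToken = accessToken;
  }

  hasToken(): boolean {
    return this.accessToken.length > 0;
  }
}
```

Failing in the constructor surfaces the mistake in development, the first time the component renders client-side, instead of after the bundle has already shipped with a token in it.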
T6: Host is responsible
AFIP cert exfiltration: a private key in env vars ends up in logs / source maps / serverless cold-start traces.
MITIGATION: Cert + key are passed as PEM strings via env vars (Vercel secrets / AWS Secrets Manager / GCP Secret Manager) and are never written to disk. The toolkit reads them once at boot, holds them in memory, and signs WSAA tokens with Web Crypto. RFC-001 §3.2 mandates HSM/KMS for sociedades-IA in production.
T7: Mitigated by toolkit
Supply-chain attack: malicious code is injected into a published @ar-agents/* tarball.
MITIGATION: Every published tarball ships an SLSA v1 npm provenance attestation tying it to a specific GitHub commit + GitHub Actions runner. It is verifiable via `npm view <pkg> dist.attestations` against the Sigstore transparency log. OpenSSF Scorecard auto-audits 18 supply-chain practices weekly.
T8: Mitigated by toolkit
Dependency confusion: an attacker publishes a typo-squat (`@ar-agent/mercadopago`).
MITIGATION: The scoped npm org `@ar-agents` is registered and locked to one publisher. Every package carries verified metadata (homepage, repository, bugs.url). README badges, the Glama listing, and the MCP Registry listing all cross-link to https://github.com/ar-agents/ar-agents.
T9: Mitigated by toolkit
Hung agent / runaway loop: the agent gets stuck retrying a failed tool call until quotas are exhausted.
MITIGATION: `stopWhen: stepCountIs(N)` caps agent steps. A CircuitBreaker guards every external API client (rolling-window failure threshold). Per-request timeouts are enforced via AbortSignal propagation. MaxRetries defaults to 1 for state mutations and 3 for read-only lookups.
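A rolling-window circuit breaker can be sketched as below. This is an illustrative simplification, not the toolkit's CircuitBreaker: the class name, option names, and the behavior of closing fully after the cooldown (with no half-open probe state) are assumptions.

```typescript
interface BreakerOptions {
  failureThreshold: number; // failures inside the window that trip the breaker
  windowMs: number;         // length of the rolling window
  cooldownMs: number;       // how long the breaker stays open once tripped
}

class CircuitBreakerSketch {
  private failures: number[] = []; // timestamps of recent failures
  private openedAt: number | null = null;
  private readonly opts: BreakerOptions;

  constructor(opts: BreakerOptions) {
    this.opts = opts;
  }

  // Callers check this before each outbound request.
  allow(now: number = Date.now()): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt < this.opts.cooldownMs) return false;
    // Cooldown elapsed: close the circuit and start counting afresh.
    this.openedAt = null;
    this.failures = [];
    return true;
  }

  recordFailure(now: number = Date.now()): void {
    // Keep only failures that are still inside the rolling window.
    this.failures = this.failures.filter((t) => now - t < this.opts.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.opts.failureThreshold) this.openedAt = now;
  }

  recordSuccess(): void {
    this.failures = [];
  }
}
```

While the breaker is open, a stuck agent's retries fail fast locally instead of consuming API quota, which complements the step cap and the per-request timeout.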
T10: Host is responsible
Cross-tenant data leak: a multi-tenant host fetches Tenant A's MP payments and Tenant B sees them.
MITIGATION: Each MercadoPagoClient instance is bound to one accessToken. State adapters are keyed on a host-supplied tenantId. The toolkit doesn't share state across instances; the host wires per-tenant adapters.
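One way a host might wire per-tenant state is to namespace every key by tenantId. The `KV` interface and `tenantScoped` helper below are hypothetical and synchronous for brevity; the toolkit's real adapter interface is not shown in this page.

```typescript
// Hypothetical minimal key-value interface.
interface KV {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Returns a view of the store in which every key is prefixed with the tenant id.
function tenantScoped(kv: KV, tenantId: string): KV {
  // Reject ids containing the separator, so "a:b" cannot alias into tenant "a".
  if (!/^[A-Za-z0-9_-]+$/.test(tenantId)) {
    throw new Error(`invalid tenantId: ${JSON.stringify(tenantId)}`);
  }
  const prefix = `tenant:${tenantId}:`;
  return {
    get: (key) => kv.get(prefix + key),
    set: (key, value) => kv.set(prefix + key, value),
  };
}
```

Handing each request only its tenant-scoped view means a missing `where tenant = ...` in application code cannot leak another tenant's records, because the other tenant's keys are unreachable through that view.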
T11: Mitigated by toolkit
Audit log tampering: an attacker who breached the host modifies past tool-call records to cover their tracks.
MITIGATION: AuditLogger wraps every tool call (input, output, duration, error) with an HMAC-signed timestamp using a separate audit secret. Records go to an append-only sink (Vercel KV, S3 with object lock, Postgres with row-level immutability). RFC-001 §9.2 makes the log legally probative.
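The signing half of this can be sketched as an HMAC over a fixed-order serialization of each record. The record shape and function names below are assumptions for illustration, not AuditLogger's actual schema.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical record shape mirroring the fields listed above.
interface AuditRecord {
  tool: string;
  input: string;
  output: string;
  durationMs: number;
  error: string | null;
  timestamp: string; // ISO 8601
}

interface SignedAuditRecord {
  record: AuditRecord;
  mac: string; // hex HMAC-SHA256 over the record
}

// Fields are serialized in a fixed order so the MAC is reproducible.
function auditMac(r: AuditRecord, auditSecret: string): string {
  const message = JSON.stringify([r.tool, r.input, r.output, r.durationMs, r.error, r.timestamp]);
  return createHmac("sha256", auditSecret).update(message).digest("hex");
}

function signRecord(record: AuditRecord, auditSecret: string): SignedAuditRecord {
  return { record, mac: auditMac(record, auditSecret) };
}

function verifyRecord(signed: SignedAuditRecord, auditSecret: string): boolean {
  const expected = Buffer.from(auditMac(signed.record, auditSecret), "hex");
  const actual = Buffer.from(signed.mac, "hex");
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Keeping the audit secret separate from the API credentials matters here: an attacker who can write to the log store but does not hold the audit secret cannot produce a valid MAC for an altered record.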
T12: Mitigated by toolkit
OAuth token theft: a marketplace seller's MP refresh token is leaked and an attacker drains their account.
MITIGATION: VercelKVOAuthTokenStore (subpath `/vercel-kv`) encrypts at rest, scoped to your platform's Vercel project. Refresh tokens are kept server-side. The toolkit's revoke_marketplace_token tool is gated behind requireConfirmation (T2).
T13: Mitigated by toolkit
Content injection in factura PDF (XSS via item description, or embedded executable).
MITIGATION: Item descriptions are sanitized and length-capped before WSFE submit. AFIP's WSFE rejects malformed payloads server-side. PDF generation uses static templates with parameter binding, leaving no user-supplied HTML/JS injection vector.
T14: Out of scope
Browser-fingerprint MP fraud detection bypass: an attacker scripts the payment flow to look like legitimate browser traffic.
MITIGATION: Out of scope. MP's fraud team runs the detection; the toolkit's job is to surface their verdict via explainPaymentStatus(). Recipe 13 (anti-fraud middleware) layers additional pre-charge heuristics (CUIT validity, payer history, velocity, BCRA cross-check).
T15: Host is responsible
Audit-signing Ed25519 private key compromise. The operator's RFC-005 signing key is exfiltrated; attacker forges historical audit entries that verify against the published public key.
MITIGATION: RFC-001 §3.2 mandates HSM/KMS custody for sociedades-IA in production (AWS KMS / GCP Cloud HSM / Azure Key Vault / on-prem HSM, never raw filesystem). RFC-005 §6 specifies rotation: emit a `signing-key-rotated` audit entry signed by the OLD key naming the NEW keyId, publish the new key in /.well-known/sociedad-ia/keys with overlapping validFrom window. The RFC-006 anchor sub-chain (see T17) makes post-rotation forgeries detectable because any forged entry breaks reconciliation with the global anchor head.
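The rotation step (old key vouches for the new keyId) can be sketched with Node's built-in Ed25519 support. The entry fields other than the `signing-key-rotated` type, and the serialization, are assumptions; RFC-005 defines the real format. In production the private key would live in an HSM/KMS rather than in process memory as it does here.

```typescript
import { sign, verify, KeyObject } from "node:crypto";

// Hypothetical entry shape; only the `type` value comes from the text above.
interface RotationEntry {
  type: "signing-key-rotated";
  oldKeyId: string;
  newKeyId: string;
  validFrom: string; // ISO 8601 start of the new key's validity
}

function encodeEntry(e: RotationEntry): Buffer {
  return Buffer.from(JSON.stringify([e.type, e.oldKeyId, e.newKeyId, e.validFrom]), "utf8");
}

// The OLD private key signs the entry that names the NEW key id.
function signRotation(entry: RotationEntry, oldPrivateKey: KeyObject): string {
  // For Ed25519, Node's sign() takes null as the digest algorithm.
  return sign(null, encodeEntry(entry), oldPrivateKey).toString("base64");
}

function verifyRotation(entry: RotationEntry, signature: string, oldPublicKey: KeyObject): boolean {
  return verify(null, encodeEntry(entry), oldPublicKey, Buffer.from(signature, "base64"));
}
```

A verifier that already trusts the old public key can then extend that trust to the new keyId, which is why the entry must be signed before the old key is retired.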
T16: Host is responsible
Anchor service outage or rollback. The external timestamping target (TSA RFC 3161, OpenTimestamps, NIST randomness beacon, public-registry endpoint) goes offline or reorgs; attacker exploits the unanchored window to backdate or rewrite entries.
MITIGATION: The RFC-006 §6 anchor sub-chain is itself HMAC-chained, so a missing anchor leaves a gap but does NOT break the inner ledger; replay-from-genesis still verifies every entry. The recommended host posture is multi-anchor fallback (concurrent posting to ≥2 independent services). `arg-verify bundle` surfaces any unanchored window explicitly in its output so a regulator sees the gap rather than inheriting a false sense of continuity.
T17: Mitigated by toolkit
Insider operator mutating audit log within the unanchored TTL. The operador designado (or a compromised insider with write access) modifies a recent audit entry before the next anchor seals it, hiding fraud committed inside that window.
MITIGATION: An append-only sink with row-level immutability (Postgres immutable rows / S3 object-lock / Vercel KV version-tag enforcement) blocks in-place mutation at the storage layer. The RFC-006 prev-hash chain makes retroactive modification cascade: any altered entry forces re-chaining of every subsequent entry, visible as monotonic-counter discontinuity to the verifier. Hosts running high-value workloads SHOULD anchor on every transaction batch (not just daily) to compress the unanchored window from hours to seconds.
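The cascade property of a prev-hash chain is easiest to see in code. The sketch below is a generic SHA-256 hash chain with a sequence counter, not the RFC-006 wire format; the entry shape, the all-zero genesis value, and the function names are assumptions.

```typescript
import { createHash } from "node:crypto";

interface ChainEntry {
  seq: number;      // monotonic counter
  payload: string;  // serialized audit entry
  prevHash: string; // hash of the previous entry
  hash: string;     // hash over (seq, payload, prevHash)
}

const GENESIS = "0".repeat(64); // assumed placeholder for the first prevHash

function entryHash(seq: number, payload: string, prevHash: string): string {
  return createHash("sha256").update(JSON.stringify([seq, payload, prevHash])).digest("hex");
}

function appendEntry(chain: ChainEntry[], payload: string): ChainEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = chain.length;
  return [...chain, { seq, payload, prevHash, hash: entryHash(seq, payload, prevHash) }];
}

// Replays the chain from genesis; returns the index of the first entry
// that fails verification, or -1 when the whole chain is intact.
function firstBrokenIndex(chain: ChainEntry[]): number {
  let prevHash = GENESIS;
  return chain.findIndex((e, i) => {
    const ok = e.seq === i && e.prevHash === prevHash && e.hash === entryHash(e.seq, e.payload, e.prevHash);
    prevHash = e.hash;
    return !ok;
  });
}
```

Editing an entry without recomputing its hash fails at that entry; recomputing its hash only moves the failure to the next entry, whose `prevHash` no longer matches. Hiding the change therefore means rewriting everything after it, which the next external anchor would contradict.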
T18: Host is responsible
X.509 / ARCA certificate rotation gap. The production WSAA cert expires without overlap; agent stops issuing factura electrónica mid-day and the audit log shows a gap that looks like compliance failure.
MITIGATION: RFC-001 §6.2 documents the renewal calendar: register the next cert ≥30 days before expiry, run a dual-cert overlap window (T-30 → T+0) where both certs are valid against ARCA, switch traffic to the new keyId, retire the old cert after 7 days of successful WSAA tokens. The rotation emits a `cert-rotated` audit entry signed by BOTH keys, anchoring the transition into the audit chain so an auditor can prove key continuity. Currently documented as procedure; automating the rotation cron is an open task (host-responsibility for v1; tracked for in-toolkit promotion).
If you find a security issue not covered above, please don't open a public GitHub issue. Email naza@naza.ar with details and proof-of-concept. We'll respond within 48 hours and disclose responsibly per SECURITY.md in the repo.
For supply-chain audit: every published package ships SLSA v1 provenance attestations. Verify with `npm view @ar-agents/<name> dist.attestations` and cross-check the Sigstore transparency-log entry.