API Design for Payment Platforms: What SDKs Get Wrong

I've used a lot of payment SDKs. I've also written some. The pattern I see is that the SDK reflects whoever built it more than the people using it. When the team is API-first, the SDK is a thin wrapper around HTTP calls. When the team is integration-engineers-first, the SDK is full of opinionated helpers. When the team is sales-first, the SDK has features that demo well and lacks the ones real merchants need on day three.

The cost of getting this wrong is hard to measure but easy to feel. A merchant integrating your platform spends a day on the happy path, a week on edge cases, and three months on the second processor. By the third processor, they've abandoned your SDK and are calling your REST API directly through their own wrapper. You've lost a customer relationship without ever knowing why.

Payment APIs are not CRUD APIs

Most REST API conventions assume stateless data. GET retrieves, PUT updates, DELETE removes. Idempotency comes for free with GET, PUT, and DELETE, and has to be added explicitly for POST. Errors split cleanly between 4xx and 5xx. Versioning happens with headers.

Payment APIs share almost none of these properties usefully.

A payment is not a resource. It's a process. The lifecycle from authorization through settlement spans days, involves multiple state transitions, and emits events asynchronously. Modeling this as a single endpoint that returns a JSON object is convenient but misleading — the object you got back at authorization time is not the same object that exists at settlement, and treating it as if it were produces bugs.

The SDKs that get this right expose the lifecycle, not just the resource. They have separate types for authorization, capture, refund, and settlement events. They don't pretend a Payment object is a single thing that mutates. They give you the events and let you derive state.

The SDKs that get it wrong return a Payment object with a status field and quietly swallow the lifecycle complexity. Then your code reads payment.status === 'succeeded' and you ship a bug because the status field changes meaning between processors and between webhook events.
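Here is a minimal TypeScript sketch of what "give you the events and let you derive state" could look like. The event names, fields, and the deriveState helper are made up for illustration and are not taken from any real SDK.

```typescript
// Hypothetical lifecycle events. A discriminated union keeps each stage distinct
// instead of folding everything into one mutable Payment object.
type LifecycleEvent =
  | { kind: "authorized"; paymentId: string; amount: number }
  | { kind: "captured"; paymentId: string; amount: number }
  | { kind: "refunded"; paymentId: string; amount: number }
  | { kind: "settled"; paymentId: string; amount: number };

// Running totals per stage, in minor units (cents).
interface PaymentState {
  authorized: number;
  captured: number;
  refunded: number;
  settled: number;
}

// State is derived by folding over the events, never stored as a single status string.
function deriveState(events: LifecycleEvent[]): PaymentState {
  const state: PaymentState = { authorized: 0, captured: 0, refunded: 0, settled: 0 };
  for (const event of events) {
    switch (event.kind) {
      case "authorized":
        state.authorized += event.amount;
        break;
      case "captured":
        state.captured += event.amount;
        break;
      case "refunded":
        state.refunded += event.amount;
        break;
      case "settled":
        state.settled += event.amount;
        break;
      default: {
        // Compile-time exhaustiveness check: adding a new event kind breaks the build here.
        const unreachable: never = event;
        throw new Error(`Unhandled event: ${JSON.stringify(unreachable)}`);
      }
    }
  }
  return state;
}
```

With this shape, "is the payment done?" becomes a question you answer from the totals (for example, captured minus refunded versus settled) rather than from a status string whose meaning shifts between processors.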

Idempotency at the right layer

Every payment processor's API documentation talks about idempotency keys. Most SDKs handle this poorly.

What I want from an SDK: I generate a logical operation ID once, and the SDK handles retries, deduplication, and state recovery from there. If the network drops mid-request, the SDK retries safely. If my process crashes after sending but before recording, the SDK can resume on restart by querying the operation's state.

What I usually get: an idempotencyKey parameter that I have to generate, manage, and pass through. If I forget, I get a duplicate charge. If I generate it wrong (random per call instead of per logical operation), I get a duplicate charge. If I store it badly, I get a duplicate charge after a crash.

The idempotency burden should be on the SDK, not the developer. The developer says "charge this customer $50 for invoice #1234." The SDK figures out the keys, the retries, the recovery. Anything else makes idempotency a foot-gun rather than a guarantee.

// What I want
await sdk.charge({
  customer: 'cust_abc',
  amount: 5000,
  currency: 'USD',
  reference: { type: 'invoice', id: '1234' }  // SDK derives idempotency key
});

// What I usually get
await sdk.charge({
  customerId: 'cust_abc',
  amount: 5000,
  currency: 'USD',
  idempotencyKey: 'must-be-globally-unique-good-luck'
});

The first signature is harder for the SDK author. It requires a stable derivation from the reference, durable storage of the mapping, and recovery logic. It's also the right signature, because it makes the common case safe by default.
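To make the "stable derivation" part concrete, here is a rough sketch under some loud assumptions: deriveIdempotencyKey, createCharger, and the Transport type are hypothetical names, and the Map is an in-memory stand-in for the durable storage a real SDK would need. It shows only the derivation and the dedup check, not crash recovery.

```typescript
// Hypothetical request and result shapes, mirroring the first charge call above.
interface ChargeRequest {
  customer: string;
  amount: number; // minor units (cents)
  currency: string;
  reference: { type: string; id: string };
}

interface ChargeResult {
  chargeId: string;
}

// The same customer and reference always yield the same key, so a retry or a
// restart reuses it instead of minting a new one. JSON.stringify over an array
// avoids collisions that a plain delimiter-joined string could produce.
function deriveIdempotencyKey(req: ChargeRequest): string {
  return JSON.stringify(["charge", req.customer, req.reference.type, req.reference.id]);
}

// Whatever actually sends the request to the processor (assumed, injected).
type Transport = (req: ChargeRequest, idempotencyKey: string) => Promise<ChargeResult>;

// `completed` is an in-memory stand-in; a real SDK would persist this mapping.
function createCharger(send: Transport, completed = new Map<string, ChargeResult>()) {
  return async function charge(req: ChargeRequest): Promise<ChargeResult> {
    const key = deriveIdempotencyKey(req);
    const previous = completed.get(key);
    if (previous !== undefined) {
      return previous; // this logical operation already finished: no second charge
    }
    const result = await send(req, key);
    completed.set(key, result);
    return result;
  };
}
```

Note that the amount is deliberately left out of the key in this sketch: charging the same invoice twice with different amounts is more likely a bug than a new operation, and I'd rather the processor reject it than treat it as fresh.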

Error response design

Error responses in payment APIs have to communicate three things: what happened, whether to retry, and what to tell the merchant. Most APIs communicate one of the three and call it done.

The error responses I want look like this:

{
  "error": {
    "category": "card_declined",
    "decline_reason": "insufficient_funds",
    "retryable": false,
    "retry_after_seconds": null,
    "merchant_message": "Card declined for insufficient funds.",
    "developer_message": "The issuer declined the transaction with code 51 (insufficient funds). The customer should be asked to use a different payment method.",
    "processor_response_code": "51",
    "processor_raw": { "code": "51", "text": "DECLINED" },
    "request_id": "req_abc123"
  }
}

Eight fields beyond the category, and every one of them is doing real work. decline_reason narrows the category down to the specific cause. retryable and retry_after_seconds drive automated retry logic. merchant_message is safe to surface to end users. developer_message helps debugging. processor_response_code and processor_raw preserve fidelity for auditing. request_id is what support tickets reference.

Most error responses I see are a string. "Card declined." That's not an error response, it's a hint that you'll need to call support to find out what went wrong.
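A structured error is what lets the caller automate the boring decisions. As a sketch, assuming the error shape shown above, this is roughly how the fields could drive retry logic. The PaymentError interface, the decide function, and the backoff numbers are my own illustration, not any processor's contract.

```typescript
// Field names match the example error response above.
interface PaymentError {
  category: string;
  decline_reason: string | null;
  retryable: boolean;
  retry_after_seconds: number | null;
  merchant_message: string;
  developer_message: string;
  processor_response_code: string;
  processor_raw: Record<string, unknown>;
  request_id: string;
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "give_up"; showToUser: string; logLine: string };

// `attempt` is 1-based. The default limit of 3 attempts is an arbitrary choice.
function decide(error: PaymentError, attempt: number, maxAttempts = 3): RetryDecision {
  if (error.retryable && attempt < maxAttempts) {
    // Prefer the server's hint; otherwise fall back to exponential backoff from 1s.
    const delayMs =
      error.retry_after_seconds !== null
        ? error.retry_after_seconds * 1000
        : 1000 * 2 ** (attempt - 1);
    return { action: "retry", delayMs };
  }
  return {
    action: "give_up",
    showToUser: error.merchant_message, // the only field meant for end users
    logLine: `[${error.request_id}] ${error.category}/${error.processor_response_code}: ${error.developer_message}`,
  };
}
```

None of this is possible when the error is the string "Card declined."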

The webhook problem the SDK should solve

Webhook handling is hard. I've written about this elsewhere. What's striking is how little SDKs help.

A good payment SDK's webhook handler should:

Verify the signature against the raw body, with the right algorithm and key.
Parse the event into a typed structure.
Provide a deduplication mechanism out of the box.
Distinguish between event types using sum types or discriminated unions, not magic strings.
Surface processing errors in a way that maps to retry policy.

What I usually get: verifyWebhook(body, signature, secret) → boolean and parseEvent(body) → any. Everything else is on me. After implementing the same defenses for the fifth time across the fifth processor's SDK, I stop using the SDK and write my own webhook layer that handles all of them uniformly.
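For comparison, here is a sketch of the handler shape I'd rather get, covering the five points above. Everything in it is hypothetical: the event types, the WebhookDeps interface, and the result statuses. Signature verification is injected as a function because the algorithm and key handling differ per processor, and the Set stands in for a durable dedup store.

```typescript
// Hypothetical typed events: a discriminated union instead of magic strings.
type WebhookEvent =
  | { type: "payment.captured"; id: string; paymentId: string; amount: number }
  | { type: "payment.refunded"; id: string; paymentId: string; amount: number };

// Each outcome maps to a retry policy for the sender.
type WebhookResult =
  | { status: "processed" }
  | { status: "duplicate" } // acknowledge, nothing to do
  | { status: "rejected"; reason: string } // bad input: sender should not retry
  | { status: "failed"; reason: string }; // our fault: sender should retry

interface WebhookDeps {
  verifySignature: (rawBody: string, signature: string) => boolean; // processor-specific, assumed
  seenEventIds: Set<string>; // in-memory stand-in for a durable store
  handle: (event: WebhookEvent) => void;
}

// Validate the untyped payload before trusting it as a WebhookEvent.
function parseEvent(rawBody: string): WebhookEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { id, type, paymentId, amount } = data as Record<string, unknown>;
  if (typeof id !== "string" || typeof paymentId !== "string" || typeof amount !== "number") return null;
  if (type !== "payment.captured" && type !== "payment.refunded") return null;
  return { type, id, paymentId, amount };
}

function handleWebhook(rawBody: string, signature: string, deps: WebhookDeps): WebhookResult {
  // Verify against the raw body, before any parsing.
  if (!deps.verifySignature(rawBody, signature)) {
    return { status: "rejected", reason: "bad signature" };
  }
  const event = parseEvent(rawBody);
  if (event === null) return { status: "rejected", reason: "unrecognized payload" };
  if (deps.seenEventIds.has(event.id)) return { status: "duplicate" };
  try {
    deps.handle(event);
  } catch (err) {
    return { status: "failed", reason: err instanceof Error ? err.message : String(err) };
  }
  // Mark as seen only after success, so a failed delivery is reprocessed on retry.
  deps.seenEventIds.add(event.id);
  return { status: "processed" };
}
```

The caller maps "rejected" to a 4xx and "failed" to a 5xx, and the processor's retry machinery does the rest.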

Pagination is where bad APIs reveal themselves

Payment APIs need pagination. Merchant transaction lists, settlement records, dispute lists, reconciliation outputs — all paginated. The pagination strategy you ship is a defining decision for how usable your API is.

Offset-based pagination (?page=5&limit=100) is wrong. It breaks under inserts (rows shift, results skip or duplicate) and is slow at large offsets (the database has to scan everything before the offset). Don't ship it. Stripe got this wrong early and spent years migrating.

Cursor-based pagination is right. The cursor encodes a stable position, new rows don't disrupt it, and with an index on the sort key each page costs about the same regardless of depth. The cursor should be opaque to the caller, so you can change the encoding without breaking clients.

The SDK should handle cursor management transparently. The developer wants to iterate transactions for a date range; they don't want to manage cursors. Give them an iterator:

for await (const tx of sdk.transactions.iterate({ since: '2025-01-01' })) {
  // The SDK handles fetching pages.
}

Not:

let cursor = null;
do {
  const page = await sdk.transactions.list({ since: '2025-01-01', cursor });
  cursor = page.nextCursor;
  // ...handle each page
} while (cursor);

The first form is what every developer wants. The second form is what most SDKs ship.
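The iterator form is cheap to build on top of the cursor form, which is what makes its absence annoying. A minimal sketch, assuming a page shape with items and nextCursor (both names are assumptions, as is the iterate helper itself):

```typescript
// Hypothetical page shape returned by a cursor-based list call.
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

// Any function that fetches one page given a cursor (null for the first page).
type ListFn<T> = (cursor: string | null) => Promise<Page<T>>;

// Wraps a cursor-based list call in an async iterator. Pages are fetched lazily,
// so breaking out of the loop early stops further requests.
async function* iterate<T>(list: ListFn<T>): AsyncGenerator<T> {
  let cursor: string | null = null;
  do {
    const page: Page<T> = await list(cursor);
    yield* page.items;
    cursor = page.nextCursor;
  } while (cursor !== null);
}
```

An SDK would bind this to its own list call, so that something like sdk.transactions.iterate(filters) is just iterate applied to sdk.transactions.list with the filters closed over.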

Versioning that doesn't break

API versioning in payments is a long-tail problem. Old integrations stay on old versions for years. New features ship on new versions. The same merchant might have one integration on v1 and another on v3.

The versioning strategy that works: resolve the version per request, not per client. The merchant is pinned to a version when they sign up, and every API call defaults to that version. Specific calls can override the version explicitly. The SDK sends the pinned version automatically, so nobody has to remember it.

Header-based versioning (Stripe-Version: 2024-01-01) works well for this. Path-based versioning (/v1/charges) is cumbersome at scale. Query-string versioning is the worst of both worlds.
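In SDK terms, "pinned by default, overridable per call" is a very small amount of code. A sketch, where the Api-Version header name, VersionConfig, and RequestOptions are placeholders I made up:

```typescript
// The version the merchant was pinned to at signup (hypothetical config shape).
interface VersionConfig {
  pinnedVersion: string;
}

// Per-call options; the override is optional.
interface RequestOptions {
  versionOverride?: string;
}

// Resolves the version for a single request: explicit override first, pinned default otherwise.
// "Api-Version" is a placeholder header name, not any specific processor's.
function versionHeaders(config: VersionConfig, options: RequestOptions = {}): Record<string, string> {
  return { "Api-Version": options.versionOverride ?? config.pinnedVersion };
}
```

The point of resolving it per request is that one call can be tested against a newer version while the rest of the integration stays on the pinned one.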

Whatever you choose, do not deprecate aggressively. Payment integrations are not web frontends. Merchants update them on their own schedule, often years after the deprecation notice. A deprecated endpoint that returns 410 breaks production for merchants who ignored the email. Deprecation should mean "stop documenting this and discourage new use." Removal should be measured in years, not months.

SDKs that disagree with their REST API

Here's a subtle failure: the SDK exposes a different abstraction than the underlying REST API.

For example: REST endpoint accepts amount in cents, SDK accepts amount in dollars and converts internally. Looks helpful. Now a developer reading API docs and SDK docs sees different numbers, gets confused, files a support ticket, and you spend an hour explaining a translation that shouldn't exist.

Or: REST returns processor_response_code, SDK silently maps to error_code with different naming. Now logs from the SDK don't match logs from the REST API, debugging becomes a translation exercise, and any third-party tool that consumes both sees inconsistent data.

The SDK should be a thin, faithful representation of the REST API, with the addition of language-idiomatic ergonomics (typed responses, async iterators, error classes). It should not reinterpret, rename, or transform values. If the REST API takes cents, the SDK takes cents. If the REST API returns processor_response_code, the SDK exposes processor_response_code. The naming consistency between API and SDK matters more than language idiom purity.

Authentication that scales

Most payment SDKs handle authentication as a single API key passed at construction time. This is fine for simple integrations and broken for everything else.

What real production systems need:

Multiple keys per merchant (one for backend, one for client-side tokenization, one for restricted operations).
Easy key rotation without downtime.
Per-environment keys (test vs live) with clear separation.
Scoped keys for least-privilege access.
The ability to programmatically revoke a key.

SDKs that hardcode the "single key" assumption end up with shims like setApiKey(newKey) mid-execution, race conditions when keys rotate, and a security posture where every call uses the same key with full permissions.

Better: design the SDK to take a credential provider, not a credential. The provider is responsible for returning a valid key for each request, can implement rotation, can scope keys to operations, and can handle expiration. Most SDKs ship without this and have to bolt it on later, awkwardly.
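A sketch of the difference, with invented names throughout (CredentialProvider, RotatingProvider, PaymentClient, and the Bearer header are all assumptions for illustration). The client asks the provider for a key on every request instead of capturing one at construction time.

```typescript
// Hypothetical operation scopes.
type Operation = "charge" | "refund" | "tokenize";

// The SDK depends on this interface, not on a raw key string.
interface CredentialProvider {
  getKey(operation: Operation): Promise<string>;
}

// A simple provider with one key per operation, rotatable at runtime.
// A real one might fetch from a secrets manager and handle expiry.
class RotatingProvider implements CredentialProvider {
  private keys: Record<Operation, string>;

  constructor(initial: Record<Operation, string>) {
    this.keys = { ...initial };
  }

  rotate(operation: Operation, newKey: string): void {
    this.keys[operation] = newKey;
  }

  async getKey(operation: Operation): Promise<string> {
    return this.keys[operation];
  }
}

class PaymentClient {
  private readonly credentials: CredentialProvider;

  constructor(credentials: CredentialProvider) {
    this.credentials = credentials;
  }

  // The key is resolved per request, so a rotation takes effect on the next call
  // without reconstructing the client or mutating shared state mid-request.
  async authHeader(operation: Operation): Promise<Record<string, string>> {
    const key = await this.credentials.getKey(operation);
    return { Authorization: `Bearer ${key}` };
  }
}
```

Passing a static key still works: wrap it in a provider that always returns the same string. The single-key case becomes the trivial implementation of the interface rather than the assumption baked into the client.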

What good SDKs make boring

The SDK I want makes the following things boring:

Sending a charge: one method call, type-safe inputs, type-safe outputs.
Handling errors: explicit error types, exhaustive matching, retry policy attached.
Receiving webhooks: framework-agnostic handler, signature verification, dedup, typed events.
Iterating large lists: async iterator, cursor handled transparently.
Idempotency: derived from operation reference, no manual key management.
Reconciliation: query for events in a time range, get a stream.

The SDK I want makes the following things easy:

Switching processors: a different adapter, same call sites.
Test mode: explicit test credentials, no risk of accidentally hitting production.
Logging: structured logs out of the box, opt-in to detail.
Observability: hooks for metrics and tracing.

Most SDKs make at least three of these things hard. The team writing them is solving the easy 80% well and leaving the operationally-critical 20% as an exercise for every integrator. After ten integrators have re-solved the same 20%, the SDK has effectively trained the entire ecosystem to work around it.

The harder truth

Payment SDKs are infrastructure for the relationship between a processor and its customers. They are not a feature checkbox. The team that builds the SDK is doing developer experience work whether they realize it or not, and the quality of that work shapes whether the processor's customers stay or migrate.

The processors that have the strongest developer relationships — Stripe, Adyen for some segments, a handful of regional players — got there by treating the SDK as a product. The ones with weaker relationships shipped SDKs as documentation artifacts and watched their customers route around them.

If you're integrating a payment processor and the SDK is fighting you, that's information. The SDK reflects how the company thinks about you. If the SDK is sloppy, the rest of the integration is going to be sloppy too, and you should price that into your decision before signing the contract.

Build the SDK you'd want to use at three in the morning during an incident. That's the standard. Anything else is technical debt with a brand on it.

This is part of a series on payment systems architecture. See also "the real cost of payment integration nobody talks about" and "building payment webhooks that don't lose events".