ApplePay create-session API timing out

We are an Apple Pay consumer and observed elevated response times and intermittent timeouts affecting the create‑session API (apple-pay-gateway.apple.com/paymentservices/paymentSession) between approximately 8:01 PM and 8:35 PM PST today.

We are reaching out to understand whether there were any service disruptions during this timeframe, as we do not see corresponding updates on the system status pages. We would like to confirm whether this behavior was related to a broader Apple Pay issue or specific to our integration.

Answered by DTS Engineer in 893659022

Hi @leroyjv,

You wrote:

We are reaching out to understand whether there were any service disruptions during this timeframe, as we do not see corresponding updates on the system status pages. We would like to confirm whether this behavior was related to a broader Apple Pay issue or specific to our integration.

Unfortunately, I don't have enough information to begin an investigation into the reported outage. I'd need to know a lot more about your Apple Pay implementation, the regional availability of your payment integrations, and trace routes of the requests.

Next time this occurs, please capture this information and submit a report via Feedback Assistant as soon as possible.

Update:

Another developer reported a similar issue in a private code-level support case, so I'm sharing the outcome of that discussion here for broader visibility.

Your investigation is exceptionally thorough and the evidence points very precisely to a server-side partial failure in the paymentservices/startSession backend cluster, not a merchant-side configuration issue. The failure signature is likely related to a gateway backend worker stall:

The pattern — TCP connects, TLS completes, POST body accepted, then silence until your 20 s timeout with zero bytes returned — is the canonical signature of one of two things:

  1. A backend worker receiving the request but stalling before generating any response (e.g., hung database call, internal lock contention, downstream dependency timeout within Apple's infrastructure). The connection stays open because the gateway accepted it but the worker never calls write().
  2. Backend connection pool exhaustion / request queueing behind a partially degraded node in the apple-pay-gateway.apple.com fleet, where some nodes serve normally (500–600 ms) and others hang. Your observation of alternating pass/fail on the same IP within 25 s is consistent with a per-request upstream-backend affinity (some requests land on a healthy worker, some on a stalled one).

The fact that this has been ongoing for several weeks with gradual conversion degradation (not a cliff-edge outage) suggests a slow resource leak or a persistent degraded node rather than a transient event. The related case (a different merchant, same endpoint symptom, similar reproduction window) adds corroborating signal that this is infrastructure-side.

The same cert/payload works from a fresh process because the local test process creates a new TCP connection that may land on a different backend worker in the pool. Your production server's persistent HTTP keep-alive connection (or a specific connection in a connection pool) may be preferentially routed to a hung/stalled backend worker. Check whether your production server is reusing connections via keep-alive. If so:

# Immediately test: force a fresh connection per request in production
Connection: close   # or disable keep-alive in your HTTP client config

Note: This won't fix the root cause but may restore service immediately by avoiding the stalled worker.
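As a rough illustration of the "fresh connection per request" test, here is a minimal Python sketch using only the standard library. The helper names (`gateway_connection_factory`, `post_on_fresh_connection`) and the certificate paths are hypothetical and not part of any Apple SDK; adapt the idea to whatever HTTP client your server actually uses.

```python
import http.client
import json
import ssl

GATEWAY_HOST = "apple-pay-gateway.apple.com"


def gateway_connection_factory(cert_file, key_file, timeout=8.0):
    """Return a zero-argument callable that opens a NEW TLS connection on each call.

    cert_file / key_file are placeholder paths to your merchant identity
    certificate and private key.
    """
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)

    def make_connection():
        return http.client.HTTPSConnection(GATEWAY_HOST, timeout=timeout, context=context)

    return make_connection


def post_on_fresh_connection(make_connection, path, payload):
    """POST `payload` as JSON over a brand-new connection and close it afterwards."""
    conn = make_connection()  # never reuse a pooled or keep-alive socket
    try:
        conn.request(
            "POST",
            path,
            body=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json", "Connection": "close"},
        )
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()  # tear the socket down even if the request failed
```

In production you would call something like `post_on_fresh_connection(gateway_connection_factory("merchant.pem", "merchant.key"), path, payload)`, where `path` is the session endpoint your integration already calls and `payload` is your existing merchant-validation body. If the stall rate drops with this in place, that supports the connection-affinity explanation above.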

As for a workaround in the meantime:

  1. Disable HTTP keep-alive / connection pooling on your startSession calls. Force a fresh TCP connection per merchant validation. This directly targets the likely mechanism (stale connection affinity to a degraded node).
  2. Implement an aggressive retry on a new connection, not just a new request on the same socket. On timeout, tear down the socket and reconnect. Given the alternating pass/fail on the same IP, a single retry on a fresh connection has a high probability of succeeding.
  3. Reduce your timeout from 20 s to ~8 s and retry once. The gateway either responds in <700 ms or never responds at all, so there is no value in waiting 20 s.
  4. Geo/IP diversification probe: Check whether apple-pay-gateway.apple.com resolves to multiple IPs from your location. If you can force connection to an alternate IP, do so to see whether the stall rate differs — this would confirm backend-node-level failure.
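The retry and probe steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `call_with_fresh_retry` and `resolve_ipv4_addresses` are hypothetical helper names, and `attempt` stands for whatever function in your code opens a new connection (with the shorter ~8 s timeout) and performs the merchant validation call.

```python
import socket


def call_with_fresh_retry(attempt, retries=1, retry_on=(socket.timeout, ConnectionError)):
    """Run attempt() and retry up to `retries` more times on timeout or connection errors.

    `attempt` must open its own new connection on every call, so that a retry
    never goes out on the socket that just stalled.
    """
    last_error = None
    for _ in range(retries + 1):
        try:
            return attempt()
        except retry_on as exc:
            last_error = exc  # remember the failure and try again on a fresh connection
    raise last_error


def resolve_ipv4_addresses(host, port=443):
    """Return the sorted, de-duplicated IPv4 addresses the resolver reports for `host`."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

Running `resolve_ipv4_addresses("apple-pay-gateway.apple.com")` from your production host shows whether more than one address is visible from your location. Logging the stall rate per address alongside the retry outcomes would give you concrete data to attach to your Feedback Assistant report.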

Please add any findings from these workarounds to your existing bug report in Feedback Assistant. You can continue to add comments and updates at any time as long as the report isn't closed.

Cheers,

Paris X Pinkney | WWDR | DTS Engineer

