Open
Labels
- needs-more-info: Waiting for a reply/more info from the author
- question: Question about using the SDK
Description
Hi team,
We are testing prompt caching behaviour with the Agents SDK and noticed an unexpected drop in the cache hit rate (to 0%) when introducing file inputs (base64-encoded).
Setup
- Model: gpt-5.4-mini
- SDK: Python (custom script)
- File input: PDF via base64 (input_file)
- Conversation store enabled
- Prompt cache key added
- Response chaining enabled using previous_response_id
- Prompt:
  - Using Chat Prompts; passing prompt_id as instructions
  - Large static prefix (~13k tokens)
  - Same system instructions
  - Same tool definitions
  - Only the user message varies
Results
Round 1: text_only_first cache_hit=96.7%
Round 2: text_only_repeat cache_hit=96.7%
Round 3: file_input_first cache_hit=0.0% <-- unexpected (a cache-hit drop was expected due to the increased input tokens, but not to zero)
Round 4: text_only_new cache_hit=96.7%
Round 5: file_input_repeat cache_hit=90.3%
Round 6: text_only_new cache_hit=96.7%
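The percentages above were computed from each response's usage block. A sketch of the calculation with made-up token counts (`cache_hit_rate` is our own helper; the attribute path in the comment is the usage object the Python SDK returns):

```python
def cache_hit_rate(cached_tokens: int, input_tokens: int) -> float:
    """Cached-token share of the input, as a percentage."""
    if input_tokens == 0:
        return 0.0
    return 100.0 * cached_tokens / input_tokens


# With the Python SDK the counts come from the response, e.g.:
#   usage = response.usage
#   cache_hit_rate(usage.input_tokens_details.cached_tokens, usage.input_tokens)
# Illustrative numbers only:
print(round(cache_hit_rate(12_900, 13_340), 1))  # -> 96.7
```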
We tried this a few times, and the behaviour above is consistent across the Agents SDK and the Responses API used directly.
Key observations
- Adding a file input causes the cache hit rate to drop to 0%, even though:
  - The prompt prefix is unchanged
  - Tool definitions are unchanged
- Repeating the same file input restores caching (~90%)
- We tried different options, including a prompt_cache_key, to improve caching rates, but observed the same issue; it had no impact.
Expected behaviour
- The cache hit rate should drop due to the additional tokens from the file input
- But it should not drop to 0%, since the shared prefix, instructions, and tools are still the same
Questions
- Are there any known issues with file inputs and cache hits? Are file inputs treated as part of the cacheable prefix in a way that invalidates the prior cache?
- Does base64 encoding prevent prefix matching across requests?
- Would using file_id instead of base64 improve cache reuse?
- Is this expected behaviour or a potential issue?
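For the file_id question, the alternative we have in mind would upload the PDF once via the Files API and then reference it by id in every request, instead of inlining base64 data. A sketch (`file_input_item` is a hypothetical helper; the upload call shown in the comment is the SDK's `client.files.create`):

```python
def file_input_item(file_id: str, user_text: str) -> dict:
    """User message that references an uploaded file by id rather than
    inlining base64 data (hypothetical helper for illustration)."""
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": user_text},
            {"type": "input_file", "file_id": file_id},
        ],
    }


# Upload once (requires an API key and client), then reuse the id:
#   uploaded = client.files.create(file=open("report.pdf", "rb"),
#                                  purpose="user_data")
#   item = file_input_item(uploaded.id, "Summarise the findings.")
item = file_input_item("file-abc123", "Summarise the findings.")  # placeholder id
```

If the stable id keeps the serialised request bytes identical across calls, prefix matching might survive where a re-encoded base64 blob does not; we have not verified this.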
Hypothesis
It seems that introducing file inputs:
- Changes the effective prompt prefix completely
- Causes cache routing to treat the request as a new prefix
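A toy model of what we suspect: if serialising the file input changes tokens early in the request, the longest shared prefix with previous requests collapses and the cacheable portion goes to zero. This is purely illustrative and not the actual caching implementation:

```python
def common_prefix_len(a: list, b: list) -> int:
    """Length of the longest common prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


prefix = list(range(13_000))  # stand-in for the ~13k-token static prefix
text_req = prefix + ["user: question"]
other_req = prefix + ["user: other question"]
file_req = ["file-bytes"] + prefix + ["user: question"]  # file tokens land early

print(common_prefix_len(text_req, other_req))  # -> 13000 (prefix reused)
print(common_prefix_len(text_req, file_req))   # -> 0 (prefix no longer matches)
```

This would explain why round 5 (repeating the same file) recovers ~90%: the file-bearing requests share a prefix with each other, just not with the text-only ones.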
Additional notes
- All prompts exceed the 1024 token threshold
- Cache works consistently for text-only flows
- Issue appears only when mixing text-only and file-input requests
Would appreciate any clarification or best practices for maintaining cache efficiency in mixed input scenarios.
Happy to provide a minimal repro script if needed.