Describe the issue
We found potential issues when using Spring Cloud Stream Kafka Binder with batch consumers + DLQ + manual acknowledgments.
According to the docs, in batch mode with DLQ enabled, all records from the previous poll are sent to DLQ. That matches current behavior.
However, in manual-ack scenarios we observed:
- A likely bug with partial commits using `acknowledge(index)` + DLQ.
- A general improvement opportunity: DLQ handling appears to ignore already acknowledged/committed progress and republishes records that were already processed successfully.
Repro project (public): https://github.com/ferblaca/demoSCSKafka/tree/batch-consumer-dlq
To Reproduce
- Clone and open the repro project:

```
git clone -b batch-consumer-dlq https://github.com/ferblaca/demoSCSKafka.git
cd demoSCSKafka
```

- Start Kafka/Zookeeper/Kafka UI using the provided compose file:

```
docker compose -f src/main/resources/compose/docker-compose.yml up -d
```

- In `src/main/resources/application.yml`, set one function definition at a time: `batchConsumerAckManual` (scenario A) or `batchConsumerNackManual` (scenario B).
- Run the application.
- Produce 100 records to `input-topic-batch` (the project includes the producer binding `foo-out-0` to that topic, per config).
- In the batch consumer logic, force an exception at index 20.
- Inspect topic/DLQ behavior in Kafka UI: http://localhost:9090
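For reference, the binder setup for this scenario looks roughly like the following sketch. The property names (`batch-mode`, `enableDlq`, `dlqName`, `ackMode`) are standard Spring Cloud Stream Kafka binder consumer properties; the binding name follows the `batchConsumerAckManual` function above, but the group and DLQ topic names are assumptions, not copied from the repro project:

```yaml
spring:
  cloud:
    function:
      definition: batchConsumerAckManual   # or batchConsumerNackManual for scenario B
    stream:
      bindings:
        batchConsumerAckManual-in-0:
          destination: input-topic-batch
          group: batch-dlq-demo            # assumed group name
          consumer:
            batch-mode: true               # deliver the whole poll as one List
      kafka:
        bindings:
          batchConsumerAckManual-in-0:
            consumer:
              enableDlq: true
              dlqName: input-topic-batch-dlq   # assumed DLQ topic name
              ackMode: MANUAL                  # hands an Acknowledgment to the listener
```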
Scenario A: acknowledge(index) (partial commit) + DLQ
Observed:
- After calling `acknowledgment.acknowledge(19)`, the internal partial state is set (partial=19).
- During DLQ handling for the failed batch, acknowledgments continue over the full polled batch.
- The failure starts when processing approximately record 80/100, because the effective index reaches 100 (80 + partial(19) + 1 = 100), which exceeds the batch bounds.
- It then throws `IllegalArgumentException: index (100) is out of range (0-99)`.
- The batch is retried multiple times.
- The final DLQ count becomes much larger than the input (observed: 810 DLQ records for 100 input records, i.e. 10 retries x 81 messages).
Consumer behavior:
- Partial ack every 20 records: `acknowledgment.acknowledge(i)`
- Forced exception at `i == 20`
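The consumer behavior above can be modeled with a self-contained sketch. The `Acknowledgment` interface here is a stub standing in for Spring Kafka's `Acknowledgment` (this is an assumption about shape, not the repro project's actual code); it only reproduces the index-based partial-ack pattern described in this issue:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchAckManualDemo {

    // Minimal stand-in for org.springframework.kafka.support.Acknowledgment
    interface Acknowledgment {
        void acknowledge(int index);
    }

    // Acks every 20th record (indices 19, 39, ...) and forces a failure at
    // i == 20, mirroring the repro description above.
    static void consume(List<String> batch, Acknowledgment ack) {
        for (int i = 0; i < batch.size(); i++) {
            if (i == 20) {
                throw new RuntimeException("forced failure at index " + i);
            }
            if ((i + 1) % 20 == 0) {
                ack.acknowledge(i); // partial commit up to and including index i
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> acked = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 100; i++) batch.add("record-" + i);
        try {
            consume(batch, acked::add);
        } catch (RuntimeException e) {
            // only acknowledge(19) ran before the forced failure
            System.out.println("acked=" + acked + ", failure: " + e.getMessage());
        }
    }
}
```

So by the time the DLQ flow takes over, exactly one partial commit (`acknowledge(19)`) has been recorded, which is the state the out-of-range failure below depends on.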
This looks like a bug in the interaction between partial batch ack state and DLQ processing:

```
Caused by: java.lang.IllegalArgumentException: index (100) is out of range (0-99)
```
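The arithmetic implied by the stack trace can be made explicit. This is a simplified model inferred from the numbers in this issue, not the binder's actual source: it assumes the DLQ flow offsets each remaining record's index by the last partial ack plus one, which overruns the batch exactly where the failure was observed:

```java
public class PartialAckIndexDemo {

    // Assumed effective-index computation: i-th remaining record shifted past
    // the partially acked prefix (partial + 1 records already committed).
    static int effectiveIndex(int relativeIndex, int partial) {
        return relativeIndex + partial + 1;
    }

    public static void main(String[] args) {
        int partial = 19;   // acknowledge(19) was called before the failure
        int lastIndex = 99; // 100-record batch, valid indices 0-99

        int idx = effectiveIndex(80, partial);
        System.out.println(idx);             // 100
        System.out.println(idx > lastIndex); // true: "index (100) is out of range (0-99)"
    }
}
```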
Scenario B: nack(index, Duration) + DLQ
Same batch/DLQ/manual setup, but calling `acknowledgment.nack(i, Duration.ofMillis(100))` on failure at index 20.

Observed:
- No index out-of-range exception in this path.
- But the DLQ still receives records in repeated waves as re-seek/retry progresses.
- The final DLQ count is also amplified (observed: 280 DLQ records for 100 input records).
Expected behavior
- Clarify/document whether manual ack (`acknowledge`/`nack`) is officially supported with DLQ in batch mode.
- If supported, fix the apparent bug for `acknowledge(index)` + DLQ (index out-of-range).
- Improve DLQ error handling so that records already acknowledged/committed are not resent to the DLQ (or provide a configurable strategy), reducing duplicate DLQ traffic and unnecessary reprocessing.
Additional context
- We understand Kafka processing is at-least-once and consumers must handle duplicates.
- Even so, current behavior appears to produce avoidable DLQ amplification in manual-ack batch use cases.
- We also ask for guidance on recommended patterns when using `acknowledge(index)`, `nack(index, Duration)`, and `BatchListenerFailedException(index)` together with DLQ in batch mode.
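For context on the third pattern: Spring Kafka's `BatchListenerFailedException(String, int)` lets a batch listener report exactly which record failed, so the error handler can commit everything before that index and retry/DLQ only from the failure point. The sketch below is a plain-Java stand-in for that pattern (the real class is `org.springframework.kafka.listener.BatchListenerFailedException`; the exception and listener here are simplified models, not Spring code):

```java
import java.util.List;

public class BatchFailureDemo {

    // Stand-in for BatchListenerFailedException: carries the failing index.
    static class BatchRecordFailedException extends RuntimeException {
        final int index;
        BatchRecordFailedException(String msg, int index) {
            super(msg);
            this.index = index;
        }
    }

    static void consume(List<String> batch) {
        for (int i = 0; i < batch.size(); i++) {
            if (batch.get(i).contains("poison")) {
                // report exactly which record in the polled batch failed
                throw new BatchRecordFailedException("bad record", i);
            }
        }
    }

    public static void main(String[] args) {
        try {
            consume(List.of("ok-0", "ok-1", "poison-2", "ok-3"));
        } catch (BatchRecordFailedException e) {
            // an error handler could commit offsets for indices [0, e.index)
            // and send only the records from e.index onward to the DLQ
            System.out.println("failed at index " + e.index); // failed at index 2
        }
    }
}
```

If this index-aware pattern is the intended route for batch + DLQ, documenting it as the alternative to `acknowledge(index)`/`nack(index, Duration)` would resolve much of the ambiguity described above.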
Version of the framework
- Spring Cloud: 2025.1.1
- Spring Boot: 4.0.2