examples/llama: CoreML/MPS/QNN export still uses deprecated to_edge() + to_backend() split


### Problem

`_to_edge_and_lower_llama_xnnpack` already uses `to_edge_transform_and_lower()`. The generic `_to_edge_and_lower_llama` (CoreML/MPS/QNN/Vulkan) still uses the deprecated `export_to_edge() + to_backend()` split, and `CoreMLPartitioner` emits a deprecation warning about this on every invocation.

For LFM2.5 hybrid models, the split path desynchronises subgraph output-node names from the parent program's `buffers_to_mutate` map. The short-conv update `self.conv_state.copy_(...)` decomposes to `slice_copy + index_put`, and the partitioner records only one of the two as the mutation source. The verifier then raises:

```
torch._export.verifier.SpecViolationError: Mutation node aten_index_put_default_N is neither a buffer nor a user input.
```
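
To make the failure mode concrete, here is a minimal pure-Python sketch of the kind of consistency check that trips. It is a simplified model, not the actual `torch._export.verifier` code: `SpecViolation`, `check_mutation_outputs`, and the node name `stale_node_name` are hypothetical names chosen for illustration. The assumption is that each mutation output of the graph must appear as a key in `buffers_to_mutate` (or in the user-input equivalent), so a key that still refers to a different node makes the lookup fail.

```python
class SpecViolation(Exception):
    """Hypothetical stand-in for torch._export.verifier.SpecViolationError."""


def check_mutation_outputs(output_names, buffers_to_mutate, user_inputs_to_mutate, buffers):
    """Raise if a mutation output is not mapped to a known buffer or user input.

    output_names: names of the graph outputs that carry mutated state.
    buffers_to_mutate: dict of output node name -> buffer name.
    user_inputs_to_mutate: dict of output node name -> user input name.
    buffers: set of buffer names known to the program.
    """
    for name in output_names:
        if name in buffers_to_mutate:
            target = buffers_to_mutate[name]
            if target not in buffers:
                raise SpecViolation(f"{name} maps to unknown buffer {target}")
        elif name not in user_inputs_to_mutate:
            raise SpecViolation(
                f"Mutation node {name} is neither a buffer nor a user input."
            )


buffers = {"conv_state"}

# Consistent case: the map key matches the node that is the graph output.
in_sync = {"aten_index_put_default": "conv_state"}
check_mutation_outputs(["aten_index_put_default"], in_sync, {}, buffers)

# Desynchronised case (illustrative): the map key refers to a different node
# than the one that ends up as the graph output, so the lookup misses.
out_of_sync = {"stale_node_name": "conv_state"}
try:
    check_mutation_outputs(["aten_index_put_default"], out_of_sync, {}, buffers)
    message = None
except SpecViolation as err:
    message = str(err)

print(message)
```

On a real exported program the corresponding map can be inspected via `exported_program.graph_signature.buffers_to_mutate`.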

### Reproduce

```bash
git clone https://github.com/pytorch/executorch && cd executorch
./install_executorch.sh
source .venv/bin/activate
pip install coremltools

cat > examples/models/lfm2/config/lfm2_coreml.yaml <<'EOF'
base:
  metadata: '{"get_bos_id": 1, "get_eos_ids":[7]}'
model:
  use_kv_cache: True
  enable_dynamic_shape: False
  dtype_override: fp32
backend:
  coreml:
    enabled: True
    ios: 18
    enable_state: True
    preserve_sdpa: True
    compute_units: cpu_and_ne
EOF

python -m extension.llm.export.export_llm \
  --config examples/models/lfm2/config/lfm2_coreml.yaml \
  +base.model_class=lfm2_5_1_2b \
  +base.params=examples/models/lfm2/config/lfm2_5_1_2b_config.json \
  +export.max_seq_length=2048 \
  +export.max_context_length=2048 \
  +export.output_name=lfm2_coreml.pte
```

### Suggested fix

Add a CoreML helper analogous to `_to_edge_and_lower_llama_xnnpack`, or short-circuit `_to_edge_and_lower_llama` when `coreml=True`:

```python
if coreml:
    # Build the CoreML partitioner.
    coreml_partitioner = get_coreml_partitioner(
        coreml_ios, embedding_quantize, pt2e_quantize,
        coreml_quantize, coreml_compute_units,
    )
    # Quantize, then convert to edge and lower in a single step instead of
    # the export_to_edge() + to_backend() split.
    builder = builder_exported.pt2e_quantize(quantizers).to_edge_transform_and_lower(
        [coreml_partitioner]
    )
    return builder.to_executorch(passes=additional_passes)
```

The same migration likely applies to MPS, QNN, and Vulkan branches; only CoreML has been exercised here.


cc @kimishpatel @YifanShenSZ @cymbalrush @metascroy
