examples/llama: CoreML/MPS/QNN export still uses deprecated to_edge() + to_backend() split #19634

@msluszniak

Problem

_to_edge_and_lower_llama_xnnpack uses to_edge_transform_and_lower(). The generic _to_edge_and_lower_llama (CoreML/MPS/QNN/Vulkan) uses the deprecated export_to_edge() + to_backend() split. CoreMLPartitioner emits a deprecation warning about this on every invocation.
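To confirm which calls emit the warning, something like the following could wrap the export step. This is a minimal sketch that assumes the partitioner reports the deprecation through Python's `warnings` module (not verified here; if it logs instead, nothing will be captured). `fake_lowering` and its message text are hypothetical stand-ins for the real export call.

```python
import warnings


def collect_deprecation_messages(fn, *args, **kwargs):
    """Run fn and return the text of every warning that mentions deprecation."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" keeps repeated warnings from being collapsed by default filters.
        warnings.simplefilter("always")
        fn(*args, **kwargs)
    return [str(w.message) for w in caught if "deprecat" in str(w.message).lower()]


def fake_lowering():
    # Hypothetical stand-in for the export call; the message text is made up.
    warnings.warn("to_backend() split is deprecated", DeprecationWarning)


messages = collect_deprecation_messages(fake_lowering)
print(messages)
```

Passing the real lowering callable in place of `fake_lowering` would list one entry per emitted warning.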

For LFM2.5 hybrid models, the split path desynchronises subgraph output-node names from the parent program's buffers_to_mutate map. The short-conv update self.conv_state.copy_(...) decomposes to slice_copy + index_put, and the partitioner records only one of the two as the mutation source. The verifier then raises:

torch._export.verifier.SpecViolationError: Mutation node aten_index_put_default_N is neither a buffer nor a user input.
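The failure mode can be illustrated with a toy model of the name lookup. This is a sketch only, not the real `torch._export.verifier` code: `SpecViolationError` is a local stand-in class, `check_mutation_outputs` is a hypothetical helper, and the node names (including which of the two decomposed nodes the signature records) are assumptions chosen for illustration.

```python
class SpecViolationError(Exception):
    """Local stand-in for torch._export.verifier.SpecViolationError."""


def check_mutation_outputs(output_names, buffers_to_mutate, user_inputs_to_mutate):
    """Require every mutation output name to be a key in one of the two maps."""
    for name in output_names:
        if name in buffers_to_mutate or name in user_inputs_to_mutate:
            continue
        raise SpecViolationError(
            f"Mutation node {name} is neither a buffer nor a user input."
        )


# Hypothetical: the signature records one decomposed node as the source
# of the conv_state mutation.
buffers_to_mutate = {"aten_slice_copy_default_1": "conv_state"}

# A graph whose mutation output carries the recorded name passes.
check_mutation_outputs(["aten_slice_copy_default_1"], buffers_to_mutate, {})

# If the graph's mutation output is the other decomposed node, the lookup
# misses and the check fails with the message from the report.
try:
    check_mutation_outputs(["aten_index_put_default_1"], buffers_to_mutate, {})
except SpecViolationError as err:
    message = str(err)
    print(message)
```

The point of the sketch is that the check is a pure name match, so any step that renames or swaps the output node without updating the map is enough to trigger the error.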

Reproduce

git clone https://github.com/pytorch/executorch && cd executorch
./install_executorch.sh
source .venv/bin/activate
pip install coremltools

cat > examples/models/lfm2/config/lfm2_coreml.yaml <<'EOF'
base:
  metadata: '{"get_bos_id": 1, "get_eos_ids":[7]}'
model:
  use_kv_cache: True
  enable_dynamic_shape: False
  dtype_override: fp32
backend:
  coreml:
    enabled: True
    ios: 18
    enable_state: True
    preserve_sdpa: True
    compute_units: cpu_and_ne
EOF

python -m extension.llm.export.export_llm \
  --config examples/models/lfm2/config/lfm2_coreml.yaml \
  +base.model_class=lfm2_5_1_2b \
  +base.params=examples/models/lfm2/config/lfm2_5_1_2b_config.json \
  +export.max_seq_length=2048 \
  +export.max_context_length=2048 \
  +export.output_name=lfm2_coreml.pte

Suggested fix

Add a CoreML helper analogous to _to_edge_and_lower_llama_xnnpack, or short-circuit _to_edge_and_lower_llama when coreml=True:

if coreml:
    coreml_partitioner = get_coreml_partitioner(
        coreml_ios, embedding_quantize, pt2e_quantize,
        coreml_quantize, coreml_compute_units,
    )
    builder = builder_exported.pt2e_quantize(quantizers).to_edge_transform_and_lower(
        [coreml_partitioner]
    )
    return builder.to_executorch(passes=additional_passes)

The same migration likely applies to MPS, QNN, and Vulkan branches; only CoreML has been exercised here.

cc @kimishpatel @YifanShenSZ @cymbalrush @metascroy

Labels

module: coreml (Issues related to Apple's Core ML delegation and code under backends/apple/coreml/)
