Add mlx-serve to Large Language Model#233
Conversation
There was a problem hiding this comment.
Code Review
This pull request adds the ddalcu/mlx-serve project, a native LLM inference server for Apple Silicon, to the README.md file. There are no review comments, and I have no feedback to provide.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Pull request overview
Adds ddalcu/mlx-serve to the repository’s curated list under Data & Science → Large Language Model, expanding the set of Zig-based LLM tooling with an Apple Silicon-focused inference server.
Changes:
- Added a new README entry for
ddalcu/mlx-servein the Large Language Model section. - Positioned the entry in alphabetical order within that section.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Adds ddalcu/mlx-serve to Data & Science → Large Language Model.
A native Zig LLM inference server for Apple Silicon: runs MLX-format models and GGUF (embedded llama.cpp), exposes OpenAI- and Anthropic-compatible HTTP APIs (works with Claude Code), with speculative decoding and KV-cache quantization. Ships MLX Core, a macOS menu-bar app. MIT-licensed.
Entry placed in alphabetical order.