Skip to content

Add blog post: expose llama.cpp over Inlets Cloud#51

Open
welteki wants to merge 1 commit into
inlets:masterfrom
welteki:inlets-cloud-llama
Open

Add blog post: expose llama.cpp over Inlets Cloud#51
welteki wants to merge 1 commit into
inlets:masterfrom
welteki:inlets-cloud-llama

Conversation

@welteki

@welteki welteki commented Jun 30, 2026

Copy link
Copy Markdown
Member

Description

Add a blog post demonstrating how to expose llama.cpp over Inlets Cloud with bearer-token authentication, including setup instructions for connecting coding agents like opencode.

Motivation and context

Showcases Inlets Cloud as a practical way to securely share local model APIs

How has this been tested?

  • Ran locally with Docker Compose and verified the blog post renders correctly on the site.
  • Instructions and example commands were tested end-to-end against a live llama-server instance with an Inlets Cloud tunnel.

@welteki welteki force-pushed the inlets-cloud-llama branch from fbdec3e to 3ab389c Compare June 30, 2026 16:31
@reviewfn

This comment has been minimized.

@reviewfn

This comment has been minimized.

@welteki welteki force-pushed the inlets-cloud-llama branch from 3ab389c to 510f83b Compare June 30, 2026 16:40
@reviewfn

This comment has been minimized.

@welteki welteki force-pushed the inlets-cloud-llama branch from 510f83b to b821b04 Compare July 1, 2026 09:22
@reviewfn

This comment has been minimized.

@welteki welteki force-pushed the inlets-cloud-llama branch from b821b04 to 17a2327 Compare July 1, 2026 09:37
@reviewfn

This comment has been minimized.

Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
@welteki welteki force-pushed the inlets-cloud-llama branch from 17a2327 to 93d953c Compare July 1, 2026 09:44
@reviewfn

reviewfn Bot commented Jul 1, 2026

Copy link
Copy Markdown

AI Pull Request Overview

Disclaimer: This review was generated by automated AI and may contain errors. Do not trust its outputs without human verification.

Summary

  • Adds a tutorial for exposing a local llama.cpp server through Inlets Cloud with bearer-token authentication.
  • The Inlets Cloud tunnel flow is generally reproducible and gives readers concrete commands.
  • The OpenCode configuration is aligned with the OpenAI-compatible /v1 endpoint pattern.
  • The Claude Code section appears to direct an Anthropic client at a plain OpenAI-compatible llama-server, which is likely to fail for readers.
  • The post image front matter is commented out, so the rollup card and social metadata will not use a post-specific image.
  • The title and description match the tutorial scope, but the Claude Code claim should be corrected before publication.

Approval rating (1-10)

6/10. Useful tutorial, but the Claude Code instructions need correction because they likely do not work against llama-server directly.

Summary per file

Summary per file
File path Summary
blog/_posts/2026-07-01-expose-llama-cpp-with-inlets-cloud.md New tutorial for tunneling and authenticating a local llama.cpp endpoint.
images/2026-07-inlets-cloud-llama-cpp/create-access-token.png Screenshot showing Inlets Cloud access token creation.

Overall Assessment

The article has a clear reader goal and most of the Inlets Cloud setup is concrete enough to follow. I would not publish it as-is because the Claude Code section appears to assume Claude Code can talk directly to llama-server's OpenAI-compatible API by setting ANTHROPIC_BASE_URL; that is a reproducibility issue for a major promised outcome of the post. The missing image metadata is lower severity, but it will affect the blog listing and social preview quality for a rollup: true post.

Detailed Review

Detailed Review

Content review

Findings

Severity File Lines Issue
High blog/_posts/2026-07-01-expose-llama-cpp-with-inlets-cloud.md 217-231 The Claude Code instructions point ANTHROPIC_BASE_URL at the Inlets tunnel for llama-server, but the rest of the article sets up a plain OpenAI-compatible /v1 API. Claude Code's Anthropic environment variables are for Anthropic-compatible endpoints, not OpenAI-compatible llama-server endpoints, so this example is likely to fail when Claude Code sends Anthropic Messages API requests. Either remove this section, add the required Anthropic-compatible proxy/router layer, or change the tutorial to use a client that supports OpenAI-compatible providers directly.
Low blog/_posts/2026-07-01-expose-llama-cpp-with-inlets-cloud.md 8-11 The post is marked rollup: true, but the only post image metadata is commented out. Existing rollup cards render post.image when present, and meta.html uses page.image for Twitter/OpenGraph metadata, so this post will publish without a card thumbnail and with the generic social image. Add a suitable image: /images/2026-07-inlets-cloud-llama-cpp/... value, or remove the commented placeholder if the generic preview is intentional.

Additional observations

The title and description fit the main tutorial: the post does explain how to expose llama.cpp through Inlets Cloud with bearer-token authentication.

The opening establishes the value quickly, but the bullet "Accessing a model of your choice remotely from anywhere with unlimited tokens" overstates the outcome. The later examples still have context, output, hardware, and tunnel availability constraints; wording such as "without per-token hosted API billing" would be more precise.

The Inlets Cloud setup is structured well for reproducibility: prerequisites, token creation, tunnel creation, auth, and endpoint testing are presented in a usable order.

The image asset matches the access-token section and renders as a relevant screenshot, but it is not configured as the post's feature image.

AI agent details.

Agent processing time: 2m47.004s
Environment preparation time: 4.007s
Total time from webhook: 2m56.696s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant