OpenAI-compatible HTTP server + docker image by mudler · Pull Request #8 · mudler/parakeet.cpp

mudler · 2026-06-03T14:09:53Z

A small, drop-in OpenAI-compatible HTTP server for transcription, built on parakeet.cpp.
Point any OpenAI client's base_url at it and call POST /v1/audio/transcriptions.
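
As a rough illustration of what such a call looks like on the wire, here is a minimal Python sketch that builds the multipart request with the standard library only. The function names, the `model` placeholder value, and the `http://localhost:8080/v1` base URL are assumptions for illustration; the port comes from the docker image description, and how the server treats the `model` field is not stated in this PR.

```python
import urllib.request
import uuid


def build_transcription_request(base_url, wav_bytes, filename="audio.wav",
                                response_format="json", model="placeholder-model"):
    """Build (without sending) a multipart POST for /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields. OpenAI's API requires "model"; whether this server
    # uses or ignores it is an assumption in this sketch.
    for name, value in (("model", model), ("response_format", response_format)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The audio upload itself (the PR notes WAV-only uploads).
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    )
    parts.append(file_header.encode())
    parts.append(wav_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())

    # base_url is expected to end in /v1, as with OpenAI clients.
    url = base_url.rstrip("/") + "/audio/transcriptions"
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=b"".join(parts), headers=headers, method="POST")


def send(request):
    """Send the request and return the response body as text (needs a running server)."""
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")
```

With a server running, `send(build_transcription_request("http://localhost:8080/v1", open("sample.wav", "rb").read()))` would return the response body; `sample.wav` is a hypothetical file name.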

Rebased onto current master.

What's here

- examples/server: httplib-based server exposing POST /v1/audio/transcriptions
  (json / text / verbose_json, plus timestamp_granularities[]=word) and /health.
- Model resolver/fetcher: accepts a local .gguf, an http(s) URL, a <name>.gguf
  in mudler/parakeet-cpp-gguf, or a friendly alias (downloaded and cached once).
- Unit tests (response formatting, model resolution) plus an opt-in e2e test that
  drives the real binary.
- A dedicated docker image published to ghcr.io/<owner>/parakeet.cpp-server
  (CPU + CUDA, multi-arch), built from the same Dockerfile as the cli image via a
  second runtime target so ggml compiles once per build job. Binds 0.0.0.0,
  exposes 8080, ships curl for alias fetch.
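
The four kinds of model argument the resolver accepts can be pictured with a small classification sketch. This is not the PR's C++ implementation: the function name, the precedence order between the cases, and the alias set are assumptions, and the only alias taken from this PR is `tdt_ctc-110m-q4_k` (the one the e2e test fetches).

```python
import os
from urllib.parse import urlparse

# Assumed alias set for illustration; only this one alias is named in the PR text.
KNOWN_ALIASES = {"tdt_ctc-110m-q4_k"}


def classify_model_spec(spec, known_aliases=KNOWN_ALIASES, exists=os.path.isfile):
    """Classify a model argument as 'url', 'local', 'alias', or 'repo_file'.

    The precedence used here (URL, then local file, then alias, then bare
    <name>.gguf) is a guess, not the order the real resolver uses.
    """
    scheme = urlparse(spec).scheme.lower()
    if scheme in ("http", "https"):
        return "url"        # fetched over http(s)
    if exists(spec):
        return "local"      # an existing file on disk
    if spec in known_aliases:
        return "alias"      # friendly name, downloaded and cached once
    if spec.endswith(".gguf") and "/" not in spec and "\\" not in spec:
        return "repo_file"  # <name>.gguf looked up in mudler/parakeet-cpp-gguf
    raise ValueError(f"cannot resolve model spec: {spec!r}")
```

Passing `exists` as a parameter keeps the sketch testable without touching the filesystem.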

See examples/server/README.md for usage and the known simplifications
(WAV-only uploads, single segment, serialized inference).
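
To make the json / text / verbose_json formats and the single-segment simplification concrete, here is a hedged Python sketch of a response formatter. The field names follow the general shape of OpenAI's transcription responses (`text`, `task`, `duration`, `segments`, `words`); the exact set of fields the server emits is not stated in this PR, and the function name and signature are hypothetical.

```python
import json


def format_transcription(text, duration, words=(), response_format="json",
                         word_timestamps=False):
    """Return (content_type, body) for one transcription result.

    `words` is a sequence of (word, start_seconds, end_seconds) tuples.
    """
    if response_format == "text":
        return "text/plain", text
    if response_format == "json":
        return "application/json", json.dumps({"text": text})
    if response_format == "verbose_json":
        payload = {
            "task": "transcribe",
            "duration": duration,
            "text": text,
            # Single segment spanning the whole clip, mirroring the
            # "single segment" simplification noted above.
            "segments": [{"id": 0, "start": 0.0, "end": duration, "text": text}],
        }
        if word_timestamps:
            # Included only when the client asks for word granularity.
            payload["words"] = [
                {"word": w, "start": start, "end": end} for w, start, end in words
            ]
        return "application/json", json.dumps(payload)
    # A server would typically turn this into an HTTP 400 response.
    raise ValueError(f"unsupported response_format: {response_format!r}")
```

Raising on an unknown format keeps the formatter free of HTTP concerns; the caller decides which status code to return.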

jwinpbe · 2026-06-07T20:02:46Z

hello!

just curious if you want any help testing this, or what state it's in. i'm using some odd hardware that can only do vulkan and i'd rather be using parakeet than qwen3 ASR. let me know what you'd like tested.

thanks!

-- jwin

Squashed rebase of the worktree-openai-server branch (PR #8) onto current master. Adds examples/server (httplib-based POST /v1/audio/transcriptions), model resolver/fetcher, OpenAI response formatter, and unit + e2e tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Split the Dockerfile into a shared build stage plus two runtime targets: runtime (cli, default, unchanged) and runtime-server (entrypoint parakeet-server --host 0.0.0.0, EXPOSE 8080, curl for alias fetch). The docker workflow now builds and publishes both ghcr images, cli and server, for each (variant, arch); the server build reuses the cli build stage so ggml compiles once per job.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…for production

Add an OpenAI-compatible server section to the main README (build, curl and OpenAI-client usage) and extend the Docker section to cover the new parakeet.cpp-server image alongside the cli. Note LocalAI as the production path in both the main README and examples/server/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Builds parakeet-server and runs tests/server_e2e.sh (PARAKEET_SERVER_E2E=1): fetches the ~125 MB tdt_ctc-110m-q4_k model by alias, starts the server, and hits POST /v1/audio/transcriptions with a real WAV in json/text/verbose_json (plus word timestamps), checking the transcription and the 400 paths. Runs on pull_request and workflow_dispatch, like closed-loop; no NeMo/Python venv.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mudler changed the title from "Worktree OpenAI server" to "OpenAI server" on Jun 3, 2026

mudler changed the title from "OpenAI server" to "OpenAI-compatible http-server" on Jun 3, 2026

mudler marked this pull request as ready for review June 7, 2026 21:15

iswaryaalex mentioned this pull request Jun 15, 2026

Feature request: OpenAI-compatible server example (POST /v1/audio/transcriptions) #23

Closed

mudler and others added 2 commits June 17, 2026 17:13

mudler force-pushed the worktree-openai-server branch from 23bd49c to d655ede Compare June 17, 2026 17:26

localai-bot changed the title from "OpenAI-compatible http-server" to "OpenAI-compatible HTTP server + docker image" on Jun 17, 2026

mudler and others added 2 commits June 17, 2026 17:30

mudler merged commit 1055fb6 into master Jun 17, 2026
10 checks passed

mudler merged 4 commits into master from worktree-openai-server
