ext-whisper
Local speech-to-text for PHP, in-process. ext-whisper is a PHP
8.3+ extension that loads a whisper.cpp
model and transcribes audio inside the PHP process — no Python
sidecar, no remote API, no audio leaving the box.
use Displace\Whisper\Model;
$model = Model::load('models/ggml-tiny.en.bin');
$result = $model->transcribe('meeting.wav');
echo $result->text();
foreach ($result->segments() as $segment) {
printf("[%6.2fs → %6.2fs] %s\n", $segment['start'], $segment['end'], $segment['text']);
}
Written in Rust on top of ext-php-rs
and the whisper-rs bindings.
Part of a stack
ext-whisper is the ingest stage of the Displace local-first AI stack: transcribe with ext-whisper, embed with ext-infer, index and search with ext-turbovec — a complete audio-archive semantic-search pipeline with zero services.
Deliberately out of scope (v0.1)
- Audio decoding — input is 16kHz mono 16-bit PCM WAV, full stop. See Preparing audio for the one-line ffmpeg conversion. Decoding (mp3/m4a/ogg) is a candidate for v0.2.
- Streaming / realtime transcription — file in, transcription out.
- Speaker diarization, word-level timestamps, GPU-default builds — later, if the scope test passes.
- Windows — out of scope platform-wide until someone funds it.