Registry
Overview
Section titled “Overview”vinput-registry is a separate repository that hosts all downloadable resources for Vinput: local ASR models, cloud ASR provider scripts, and LLM adapter scripts. When you click Download or Install in Vinput GUI, or run vinput model add / vinput provider add / vinput adapter add, the resources are fetched from this registry.
vinput-registry/├── registry/│ ├── models.json # Local ASR model index│ ├── providers.json # Cloud ASR provider index│ └── adapters.json # LLM adapter index├── i18n/│ ├── en_US.json # English display text│ └── zh_CN.json # Chinese display text└── resources/ ├── providers/<folder>/<variant>/ │ ├── entry.py │ └── README.md └── adapters/<folder>/<name>/ ├── entry.py └── README.mdCore rules
Section titled “Core rules”- All scripts must be self-contained — only use the standard library of your chosen runtime. No third-party dependencies, no
pip install, nonpm install. This ensures scripts work on any machine out of the box. - Scripts can be written in any language. Python 3 is the recommended choice because it is available on virtually all Linux distributions. Node.js, Bash, or other runtimes are also acceptable as long as the stdlib-only rule is met.
- The
commandfield in registry JSON works like a VS Codetasks.jsonentry: it specifies the interpreter, andargsspecifies the script path. - Download URLs use array form for fallback.
- Each script resource must include an entry script and
README.md.
ID conventions
Section titled “ID conventions”| Type | Pattern | Example |
|---|---|---|
| Model | model.sherpa-onnx.<model-name> | model.sherpa-onnx.sense-voice-zh-en-ja-ko-yue-int8 |
| Provider | provider.<folder>.<variant> | provider.bailian.streaming |
| Adapter | adapter.<folder>.<name> | adapter.mtranserver.proxy |
idis a stable machine identifier — never rename once publishedshort_idis for human display in CLI/GUI/logs only- i18n display text is stored as
<id>.titleand<id>.descriptionini18n/*.json
ASR provider scripts
Section titled “ASR provider scripts”Non-streaming (batch)
Section titled “Non-streaming (batch)”A non-streaming provider receives a complete audio recording and returns the transcript.
Runtime contract:
| Item | Spec |
|---|---|
| Command | python3 |
| Input | Raw PCM S16_LE, mono, 16000 Hz via stdin (binary) |
| Output | Final transcript text via stdout |
| Errors | Human-readable messages to stderr |
| Exit codes | 0 success, 1 runtime error, 2 usage error |
| Dependencies | Standard library only (no third-party packages) |
File structure:
resources/providers/<folder>/batch/├── entry.py└── README.mdStreaming
Section titled “Streaming”A streaming provider recognizes audio in real time as it arrives.
Runtime contract:
| Item | Spec |
|---|---|
| Command | python3 |
| Input | JSONL via stdin |
| Output | JSONL via stdout |
| Errors | stderr only |
| Exit codes | 0 success, 1 runtime error, 2 usage error |
| Dependencies | Standard library only (no third-party packages) |
Input protocol (stdin):
{"type":"audio","audio_base64":"<base64 PCM S16_LE mono 16kHz>","commit":false}{"type":"audio","audio_base64":"...","commit":true}{"type":"finish"}{"type":"cancel"}Output protocol (stdout):
{"type":"session_started","session_id":"..."}{"type":"partial","text":"current partial transcript"}{"type":"final","text":"confirmed transcript","segment_final":true}{"type":"error","message":"..."}{"type":"closed"}Transcript semantics:
partial.text— full user-visible transcript at the current momentfinal.text— full confirmed transcript at the current moment- The script is responsible for accumulating and deduplicating segments
File structure:
resources/providers/<folder>/streaming/├── entry.py└── README.mdEnvironment variables
Section titled “Environment variables”Provider env names use the VINPUT_ASR_* namespace. Shared variables:
| Variable | Purpose |
|---|---|
VINPUT_ASR_API_KEY | Bearer-style API credential |
VINPUT_ASR_APP_ID | App identifier |
VINPUT_ASR_ACCESS_TOKEN | Token credential (non-API-key style) |
VINPUT_ASR_URL | Full endpoint override |
VINPUT_ASR_BASE_URL | Base endpoint (script constructs final path) |
VINPUT_ASR_MODEL | Remote model identifier |
VINPUT_ASR_LANGUAGE | Transcription language hint |
VINPUT_ASR_PROMPT | Recognition prompt or bias text |
VINPUT_ASR_TIMEOUT | Network timeout (seconds) |
VINPUT_ASR_FINISH_GRACE_SECS | Extra wait after finish before closing |
VINPUT_ASR_ENABLE_VAD | Enable server-side VAD |
VINPUT_ASR_VAD_THRESHOLD | VAD sensitivity |
VINPUT_ASR_VAD_PREFIX_PADDING_MS | Audio padding before speech |
VINPUT_ASR_VAD_SILENCE_DURATION_MS | Silence duration to close turn |
Provider-specific variables are allowed when upstream features don’t map to shared ones, but must still use the VINPUT_ASR_* prefix.
LLM adapter scripts
Section titled “LLM adapter scripts”An adapter is a local process that exposes a non-standard LLM as an OpenAI-compatible API.
Runtime contract:
| Item | Spec |
|---|---|
| Command | Interpreter (e.g. python3) |
| Provides | Local HTTP server |
| Required endpoints | GET /v1/models, POST /v1/chat/completions |
| Dependencies | Standard library only (no third-party packages) |
Response format
Section titled “Response format”For POST /v1/chat/completions, Vinput expects:
- Non-streaming only:
"stream": false - Standard OpenAI chat completion JSON envelope
choices[0].message.contentmust be a string- That string itself must be JSON in this shape:
{ "candidates": [ "candidate 1", "candidate 2" ]}The outer response is OpenAI-compatible, while the inner content string is the structured payload consumed by Vinput.
GET /v1/models only needs to return the standard OpenAI list format:
{ "object": "list", "data": [ { "id": "my-model", "object": "model", "owned_by": "my-adapter" } ]}File structure:
resources/adapters/<folder>/<name>/├── entry.py└── README.mdAdapter env names use a provider-specific prefix (e.g. MTRAN_*), not VINPUT_ASR_*.
Local ASR models
Section titled “Local ASR models”Model entries in models.json describe downloadable sherpa-onnx model archives.
Required fields:
| Field | Description |
|---|---|
id | Stable ID, e.g. model.sherpa-onnx.<name> |
short_id | Human-readable short ID |
urls | Array of download URLs (fallback order) |
sha256 | Archive checksum |
size_bytes | Archive size |
language | zh, en, multilingual, etc. |
vinput_model | Runtime metadata (see below) |
vinput_model structure:
| Field | Description |
|---|---|
backend | sherpa-offline or sherpa-streaming |
runtime | offline or online |
family | sherpa-onnx C API family: dolphin, sense_voice, paraformer, transducer, qwen3_asr |
language | Language code |
size_bytes | Model size |
supports_hotwords | Whether hotword boosting is supported |
recognizer | sherpa-onnx recognizer config (sample rate, decoding method, etc.) |
model | sherpa-onnx model config (tokens file, family-specific model paths) |
Field naming follows sherpa-onnx C API conventions.
Contributing a resource
Section titled “Contributing a resource”Adding a provider
Section titled “Adding a provider”- Create
resources/providers/<folder>/<batch|streaming>/entry.py+README.md - Add entry to
registry/providers.json - Add i18n entries to
i18n/en_US.jsonandi18n/zh_CN.json
Adding an adapter
Section titled “Adding an adapter”- Create
resources/adapters/<folder>/<name>/entry.py+README.md - Add entry to
registry/adapters.json - Add i18n entries
Adding a model
Section titled “Adding a model”- Add entry to
registry/models.jsonwith completevinput_modelmetadata - Add i18n entries
i18n entry format
Section titled “i18n entry format”{ "<resource-id>.title": "Display Name", "<resource-id>.description": "One-line description."}Both en_US.json and zh_CN.json must be updated.