Skip to content

Registry

vinput-registry is a separate repository that hosts all downloadable resources for Vinput: local ASR models, cloud ASR provider scripts, and LLM adapter scripts. When you click Download or Install in Vinput GUI, or run vinput model add / vinput provider add / vinput adapter add, the resources are fetched from this registry.

vinput-registry/
├── registry/
│ ├── models.json # Local ASR model index
│ ├── providers.json # Cloud ASR provider index
│ └── adapters.json # LLM adapter index
├── i18n/
│ ├── en_US.json # English display text
│ └── zh_CN.json # Chinese display text
└── resources/
├── providers/<folder>/<variant>/
│ ├── entry.py
│ └── README.md
└── adapters/<folder>/<name>/
├── entry.py
└── README.md
  • All scripts must be self-contained — only use the standard library of your chosen runtime. No third-party dependencies, no pip install, no npm install. This ensures scripts work on any machine out of the box.
  • Scripts can be written in any language. Python 3 is the recommended choice because it is available on virtually all Linux distributions. Node.js, Bash, or other runtimes are also acceptable as long as the stdlib-only rule is met.
  • The command field in registry JSON works like a VS Code tasks.json entry: it specifies the interpreter, and args specifies the script path.
  • Download URLs use array form for fallback.
  • Each script resource must include an entry script and README.md.
TypePatternExample
Modelmodel.sherpa-onnx.<model-name>model.sherpa-onnx.sense-voice-zh-en-ja-ko-yue-int8
Providerprovider.<folder>.<variant>provider.bailian.streaming
Adapteradapter.<folder>.<name>adapter.mtranserver.proxy
  • id is a stable machine identifier — never rename once published
  • short_id is for human display in CLI/GUI/logs only
  • i18n display text is stored as <id>.title and <id>.description in i18n/*.json

A non-streaming provider receives a complete audio recording and returns the transcript.

Runtime contract:

ItemSpec
Commandpython3
InputRaw PCM S16_LE, mono, 16000 Hz via stdin (binary)
OutputFinal transcript text via stdout
ErrorsHuman-readable messages to stderr
Exit codes0 success, 1 runtime error, 2 usage error
DependenciesStandard library only (no third-party packages)

File structure:

resources/providers/<folder>/batch/
├── entry.py
└── README.md

A streaming provider recognizes audio in real time as it arrives.

Runtime contract:

ItemSpec
Commandpython3
InputJSONL via stdin
OutputJSONL via stdout
Errorsstderr only
Exit codes0 success, 1 runtime error, 2 usage error
DependenciesStandard library only (no third-party packages)

Input protocol (stdin):

{"type":"audio","audio_base64":"<base64 PCM S16_LE mono 16kHz>","commit":false}
{"type":"audio","audio_base64":"...","commit":true}
{"type":"finish"}
{"type":"cancel"}

Output protocol (stdout):

{"type":"session_started","session_id":"..."}
{"type":"partial","text":"current partial transcript"}
{"type":"final","text":"confirmed transcript","segment_final":true}
{"type":"error","message":"..."}
{"type":"closed"}

Transcript semantics:

  • partial.text — full user-visible transcript at the current moment
  • final.text — full confirmed transcript at the current moment
  • The script is responsible for accumulating and deduplicating segments

File structure:

resources/providers/<folder>/streaming/
├── entry.py
└── README.md

Provider env names use the VINPUT_ASR_* namespace. Shared variables:

VariablePurpose
VINPUT_ASR_API_KEYBearer-style API credential
VINPUT_ASR_APP_IDApp identifier
VINPUT_ASR_ACCESS_TOKENToken credential (non-API-key style)
VINPUT_ASR_URLFull endpoint override
VINPUT_ASR_BASE_URLBase endpoint (script constructs final path)
VINPUT_ASR_MODELRemote model identifier
VINPUT_ASR_LANGUAGETranscription language hint
VINPUT_ASR_PROMPTRecognition prompt or bias text
VINPUT_ASR_TIMEOUTNetwork timeout (seconds)
VINPUT_ASR_FINISH_GRACE_SECSExtra wait after finish before closing
VINPUT_ASR_ENABLE_VADEnable server-side VAD
VINPUT_ASR_VAD_THRESHOLDVAD sensitivity
VINPUT_ASR_VAD_PREFIX_PADDING_MSAudio padding before speech
VINPUT_ASR_VAD_SILENCE_DURATION_MSSilence duration to close turn

Provider-specific variables are allowed when upstream features don’t map to shared ones, but must still use the VINPUT_ASR_* prefix.

An adapter is a local process that exposes a non-standard LLM as an OpenAI-compatible API.

Runtime contract:

ItemSpec
CommandInterpreter (e.g. python3)
ProvidesLocal HTTP server
Required endpointsGET /v1/models, POST /v1/chat/completions
DependenciesStandard library only (no third-party packages)

For POST /v1/chat/completions, Vinput expects:

  • Non-streaming only: "stream": false
  • Standard OpenAI chat completion JSON envelope
  • choices[0].message.content must be a string
  • That string itself must be JSON in this shape:
{
"candidates": [
"candidate 1",
"candidate 2"
]
}

The outer response is OpenAI-compatible, while the inner content string is the structured payload consumed by Vinput.

GET /v1/models only needs to return the standard OpenAI list format:

{
"object": "list",
"data": [
{
"id": "my-model",
"object": "model",
"owned_by": "my-adapter"
}
]
}

File structure:

resources/adapters/<folder>/<name>/
├── entry.py
└── README.md

Adapter env names use a provider-specific prefix (e.g. MTRAN_*), not VINPUT_ASR_*.

Model entries in models.json describe downloadable sherpa-onnx model archives.

Required fields:

FieldDescription
idStable ID, e.g. model.sherpa-onnx.<name>
short_idHuman-readable short ID
urlsArray of download URLs (fallback order)
sha256Archive checksum
size_bytesArchive size
languagezh, en, multilingual, etc.
vinput_modelRuntime metadata (see below)

vinput_model structure:

FieldDescription
backendsherpa-offline or sherpa-streaming
runtimeoffline or online
familysherpa-onnx C API family: dolphin, sense_voice, paraformer, transducer, qwen3_asr
languageLanguage code
size_bytesModel size
supports_hotwordsWhether hotword boosting is supported
recognizersherpa-onnx recognizer config (sample rate, decoding method, etc.)
modelsherpa-onnx model config (tokens file, family-specific model paths)

Field naming follows sherpa-onnx C API conventions.

  1. Create resources/providers/<folder>/<batch|streaming>/entry.py + README.md
  2. Add entry to registry/providers.json
  3. Add i18n entries to i18n/en_US.json and i18n/zh_CN.json
  1. Create resources/adapters/<folder>/<name>/entry.py + README.md
  2. Add entry to registry/adapters.json
  3. Add i18n entries
  1. Add entry to registry/models.json with complete vinput_model metadata
  2. Add i18n entries
{
"<resource-id>.title": "Display Name",
"<resource-id>.description": "One-line description."
}

Both en_US.json and zh_CN.json must be updated.