Skip to content

Scenes & LLM

Raw text from ASR may contain filler words, typos, or formatting issues. Scenes define a prompt that tells an LLM how to rewrite the text.

ASR raw text → [Scene prompt + LLM] → Rewritten text

Core relationships:

  • A scene is a configuration: prompt (instruction) + bound LLM provider + model
  • An LLM provider is an OpenAI-compatible API endpoint (e.g. Ollama, Groq, OpenAI)
  • An LLM adapter is an optional local bridge process that wraps a non-standard local model into an OpenAI-compatible API for a provider to point to

In short: the scene decides “how to rewrite”, the LLM provider decides “who rewrites”, the adapter solves “incompatible API”.

When LLM rewriting is not needed, use the built-in __raw__ scene — recognition results go directly to input.

Each scene contains:

FieldDescription
idUnique identifier
labelDisplay name in the menu
promptSystem prompt sent to the LLM
provider_idBound LLM provider
modelModel name to use
candidate_countNumber of candidates to return

LLM is only invoked when provider_id + model + prompt are all configured.

Built-in scenes:

  • __raw__ — Bypasses LLM, outputs raw recognition text
  • __command__ — Used by command mode (see below)

Switch scenes at runtime with Shift_R.

Corresponding config:

{
"scenes": {
"active_scene": "__raw__",
"definitions": [
{
"id": "__raw__",
"candidate_count": 0
},
{
"id": "default",
"label": "Default",
"prompt": "Correct ASR errors, keep original meaning...",
"provider_id": "groq",
"model": "openai/gpt-oss-120b",
"candidate_count": 5
}
]
}
}

In Vinput GUI, manage scenes in the LLM tab: add, edit prompt, bind provider and model.

Terminal window
vinput scene list # List all scenes
vinput scene add --id <id> # Add a scene
vinput scene edit <id> # Edit a scene
vinput scene use <id> # Switch active scene
vinput scene remove <id> # Remove a scene

When adding a scene with LLM, --provider, --model, and --prompt must all be provided.

An LLM provider is an OpenAI-compatible API endpoint. Scenes reference it via provider_id.

You can configure multiple providers — different scenes can bind to different providers.

Corresponding config:

{
"llm": {
"providers": [
{
"id": "groq",
"base_url": "https://api.groq.com/openai/v1",
"api_key": "your-api-key"
}
]
}
}

Using local Ollama:

Terminal window
vinput llm add ollama --base-url http://127.0.0.1:11434/v1
vinput scene add --id polish \
--label "Polish" \
--provider ollama \
--model qwen2.5:7b \
--prompt "Rewrite the recognized text into polished Chinese."
vinput scene use polish
Terminal window
vinput llm list # List configured providers
vinput llm add <id> --base-url <url> # Add
vinput llm edit <id> --base-url <url> # Edit
vinput llm remove <id> # Remove

If your local model does not provide an OpenAI-compatible API directly, install an LLM adapter. An adapter is a local process that wraps the model into an OpenAI-compatible API, then you point your LLM provider’s base_url to it.

Corresponding config:

{
"llm": {
"adapters": []
}
}

In Vinput GUI, go to Resources → LLM Adapters to browse and install.

Terminal window
vinput adapter list # List installed adapters
vinput adapter list -a # List available remote adapters
vinput adapter add <id> # Install
vinput adapter start <id> # Start
vinput adapter stop <id> # Stop

Command mode is a special scene usage: select existing text, then use a voice instruction to have LLM rewrite it.

Flow: select text → hold Control_R → speak your instruction → release → done.

Under the hood it uses the built-in __command__ scene, whose prompt template is concatenated with your spoken instruction at runtime.

Examples:

  • Select Chinese text → say “translate to English” → replaced with translation
  • Select code → say “add comments” → replaced with commented version

If there is no surrounding-text selection, it falls back to primary-selection clipboard content.

Command mode requires an LLM provider to be configured, with provider_id and model bound in the __command__ scene.