Interface LangModelInferConfig

Configuration parameters that control the behavior of language model inference.

Fields

document_polyfill: Configuration describing how retrieved documents are embedded into the model input. If None, no polyfill is performed and documents are ignored.
think_effort: Controls the model's reasoning intensity. Possible values: disable, enable, low, medium, high. For local models, low, medium, and high are ignored. For API models, the effect depends on the provider's API; see API parameters.
temperature: Sampling temperature controlling the randomness of the output. Lower values make output more deterministic; higher values increase diversity.
top_p: Nucleus sampling parameter (probability mass cutoff). Limits token sampling to the most probable tokens whose cumulative probability is ≤ top_p.
max_tokens: Maximum number of tokens to generate in a single inference.
grammar: Optional grammar constraint that restricts the valid forms of the output. Supported types: Plain (unconstrained text), JSON (ensures valid JSON output), JSONSchema { schema } (validates JSON against the given schema), Regex { regex } (constrains generation with a regular expression), CFG { cfg } (uses a context-free grammar definition).
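
To make the temperature and top_p descriptions above concrete, here is a minimal sketch of how these two parameters commonly act on a token distribution. It is not this library's sampler: the helper names `softmaxWithTemperature` and `nucleusFilter` are hypothetical, and the cutoff convention (keep the smallest set of top tokens whose cumulative probability reaches the threshold) is an assumption that may differ from the actual implementation.

```typescript
// Illustrative only: hypothetical helpers, not part of the documented API.

// Converts logits to probabilities after dividing by the temperature.
// Assumes temperature > 0; lower values sharpen the distribution.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Returns the indices of the tokens kept by nucleus filtering: tokens are
// taken in descending probability order until their cumulative probability
// reaches topP (assumed convention).
function nucleusFilter(probs: number[], topP: number): number[] {
  const order = probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);
  const kept: number[] = [];
  let cumulative = 0;
  for (const i of order) {
    kept.push(i);
    cumulative += probs[i];
    if (cumulative >= topP) break;
  }
  return kept;
}
```

In this sketch, sampling would then draw only from the kept indices after renormalizing their probabilities.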

interface LangModelInferConfig {
    documentPolyfill?: DocumentPolyfill;
    grammar?: Grammar;
    maxTokens?: number;
    temperature?: number;
    thinkEffort?: ThinkEffort;
    topP?: number;
}
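
As a rough usage sketch, a configuration object might be built as shown below. The shapes of `ThinkEffort`, `Grammar`, and `DocumentPolyfill` here are local stand-ins assumed for illustration (in particular the `type` discriminator and the string-typed `schema`, `regex`, and `cfg` fields); the real definitions are documented on their own pages and may differ.

```typescript
// Hypothetical stand-in types, assumed for this example only.
type ThinkEffort = "disable" | "enable" | "low" | "medium" | "high";
type Grammar =
  | { type: "plain" }
  | { type: "json" }
  | { type: "jsonschema"; schema: string }
  | { type: "regex"; regex: string }
  | { type: "cfg"; cfg: string };
type DocumentPolyfill = Record<string, unknown>;

// Mirrors the interface declared above; every field is optional.
interface LangModelInferConfig {
  documentPolyfill?: DocumentPolyfill;
  grammar?: Grammar;
  maxTokens?: number;
  temperature?: number;
  thinkEffort?: ThinkEffort;
  topP?: number;
}

// A conservative configuration for a short yes/no answer.
const config: LangModelInferConfig = {
  temperature: 0.2, // low randomness
  topP: 0.9, // nucleus cutoff
  maxTokens: 256, // cap on generated tokens
  thinkEffort: "disable",
  grammar: { type: "regex", regex: "^(yes|no)$" },
  // documentPolyfill is omitted, so retrieved documents are not embedded.
};
```

Because all fields are optional, an empty object is also a valid `LangModelInferConfig`; which defaults then apply is not stated on this page.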

Properties

`Optional` documentPolyfill

documentPolyfill?: DocumentPolyfill

`Optional` grammar

grammar?: Grammar

`Optional` maxTokens

maxTokens?: number

`Optional` temperature

temperature?: number

`Optional` thinkEffort

thinkEffort?: ThinkEffort

`Optional` topP

topP?: number