ailoy-web

    Interface LangModelInferConfig

    Configuration parameters that control the behavior of language model inference.

    Fields

    • document_polyfill: Configuration describing how retrieved documents are embedded into the model input. If None, no polyfill is performed and documents are ignored.

    • think_effort: Controls the model's reasoning intensity. Possible values: disable, enable, low, medium, high. In local models, the low, medium, and high levels are ignored. In API models, the behavior is up to the provider's API. See API parameters.

    • temperature: Sampling temperature controlling randomness of output. Lower values make output more deterministic; higher values increase diversity.

    • top_p: Nucleus sampling parameter (probability mass cutoff). Limits token sampling to a cumulative probability ≤ top_p.

    • max_tokens: Maximum number of tokens to generate for a single inference.

    • grammar: Optional grammar constraint that restricts valid output forms. Supported types include: Plain (unconstrained text), JSON (ensures valid JSON output), JSONSchema { schema } (validates JSON against the given schema), Regex { regex } (constrains generation by a regular expression), CFG { cfg } (uses a context-free grammar definition).
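To make the temperature and top_p descriptions above concrete, here is a minimal sketch of how the two parameters are conventionally applied to a token distribution. It is illustrative only and is not taken from ailoy's sampler; the function names `softmaxWithTemperature` and `topPFilter` are hypothetical, and the convention for the boundary token (keeping the token that crosses the cutoff) is an assumption that real implementations may handle differently.

```typescript
// Illustrative sketch only: NOT ailoy's implementation. Both helper names are hypothetical.

// Convert raw logits to probabilities after dividing by the temperature.
// Temperatures below 1 sharpen the distribution; above 1 they flatten it.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be > 0 in this sketch");
  }
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Nucleus (top-p) filtering: keep the most probable tokens until their
// cumulative probability reaches topP, zero out the rest, then renormalize.
// Assumption: the token that crosses the cutoff is kept.
function topPFilter(probs: number[], topP: number): number[] {
  const order = probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);
  const kept = new Array<number>(probs.length).fill(0);
  let cumulative = 0;
  for (const i of order) {
    kept[i] = probs[i];
    cumulative += probs[i];
    if (cumulative >= topP) break;
  }
  const total = kept.reduce((a, b) => a + b, 0);
  return kept.map((p) => p / total);
}

const probs = softmaxWithTemperature([2, 1, 0], 0.7);
console.log(topPFilter(probs, 0.9));
```

Lowering the temperature concentrates probability on the top token, so fewer tokens survive a given top_p cutoff; the two parameters interact rather than acting independently.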

    interface LangModelInferConfig {
        documentPolyfill?: DocumentPolyfill;
        grammar?: Grammar;
        maxTokens?: number;
        temperature?: number;
        thinkEffort?: ThinkEffort;
        topP?: number;
    }
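As a rough usage sketch, the optional numeric fields can be filled in and sanity-checked before being passed to a model. The types below are local stand-ins that mirror only the scalar fields of the interface; `ThinkEffort` is assumed to be a string union of the documented values, `documentPolyfill` and `grammar` are omitted because their shapes are not shown on this page, and `checkInferConfig` with its value ranges is a hypothetical helper, not part of ailoy-web.

```typescript
// Local stand-in types (assumptions): in real code these would come from ailoy-web.
type ThinkEffort = "disable" | "enable" | "low" | "medium" | "high";

interface InferConfigSketch {
  maxTokens?: number;
  temperature?: number;
  thinkEffort?: ThinkEffort;
  topP?: number;
}

// Hypothetical helper: returns a list of problems found in the config.
// The ranges are conventional choices, not limits documented by the library.
function checkInferConfig(config: InferConfigSketch): string[] {
  const problems: string[] = [];
  if (config.temperature !== undefined && config.temperature < 0) {
    problems.push("temperature must be >= 0");
  }
  if (config.topP !== undefined && (config.topP <= 0 || config.topP > 1)) {
    problems.push("topP must be in (0, 1]");
  }
  if (
    config.maxTokens !== undefined &&
    (!Number.isInteger(config.maxTokens) || config.maxTokens <= 0)
  ) {
    problems.push("maxTokens must be a positive integer");
  }
  return problems;
}

// Every field is optional, so a config may set only what it needs.
const config: InferConfigSketch = {
  temperature: 0.2,
  topP: 0.9,
  maxTokens: 512,
  thinkEffort: "disable",
};

console.log(checkInferConfig(config));
```

Because all properties are optional, leaving a field out leaves that setting to whatever default the library or the provider applies.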
    Index

    Properties

    documentPolyfill?: DocumentPolyfill
    grammar?: Grammar
    maxTokens?: number
    temperature?: number
    thinkEffort?: ThinkEffort
    topP?: number