✍️ Constrained Grammars
Overview
The chat endpoint supports the grammar parameter, which allows users to specify a grammar in Backus-Naur Form (BNF). This feature enables the Large Language Model (LLM) to generate outputs adhering to a user-defined schema, such as JSON, YAML, or any other format that can be defined using BNF. For more details about BNF, see Backus-Naur Form on Wikipedia.
Note
Compatibility Notice: Grammar and structured output support is available for the following backends:
- llama.cpp — supports the
grammarparameter (GBNF syntax) andresponse_formatwithjson_schema/json_object - vLLM — supports the
grammarparameter (via xgrammar),response_formatwithjson_schema(native JSON schema enforcement), andjson_object
For a complete list of compatible models, refer to the Model Compatibility page.
Setup
To use this feature, follow the installation and setup instructions on the LocalAI Functions page. Ensure that your local setup meets all the prerequisites specified for the llama.cpp backend.
💡 Usage Example
The following example demonstrates how to use the grammar parameter to constrain the model’s output to either “yes” or “no”. This can be particularly useful in scenarios where the response format needs to be strictly controlled.
Example: Binary Response Constraint
In this example, the grammar parameter is set to a simple choice between “yes” and “no”, ensuring that the model’s response adheres strictly to one of these options regardless of the context.
Example: JSON Output Constraint
You can also use grammars to enforce JSON output format:
Example: YAML Output Constraint
Similarly, you can enforce YAML format:
Advanced Usage
For more complex grammars, you can define multi-line BNF rules. The grammar parser supports:
- Alternation (
|) - Repetition (
*,+) - Optional elements (
?) - Character classes (
[a-z]) - String literals (
"text")
vLLM Backend
The vLLM backend supports structured output via three methods:
JSON Schema (recommended)
Use the OpenAI-compatible response_format parameter with json_schema to enforce a specific JSON structure:
JSON Object
Force the model to output valid JSON (without a specific schema):
Grammar
The grammar parameter also works with vLLM via xgrammar:
Related Features
- OpenAI Functions - Function calling with structured outputs
- Text Generation - General text generation capabilities