Make LLMs speak the language of every application.
Made with ❤ by the team at .txt.
pip install outlines
First time here? Go to our setup guide
- Multiple model integrations: OpenAI, transformers, llama.cpp, exllama2, mamba
- Simple and powerful prompting primitives based on the Jinja templating engine
- Multiple choices, type constraints and dynamic stopping
- Fast regex-structured generation
- Fast JSON generation following a JSON schema or a Pydantic model
- Grammar-structured generation
- Interleave completions with loops, conditionals, and custom Python functions
- Caching of generations
- Batch inference
- Sample with the greedy, multinomial and beam search algorithms (and more to come!)
- Serve with vLLM, with official Docker image, outlinesdev/outlines!
Outlines has new releases and features coming every week. Make sure to star and watch this repository, and follow @dottxtai to stay up to date!
Structured generation has several benefits:
- It doesn't add any overhead during inference (cost-free)
- It allows Open Source models to beat closed source models (Mistral, GPT-4)
- It speeds up inference
- It improves the performance of base models (GSM8K)
- It improves the performance of finetuned models (CoNLL)
- It improves model efficiency (fewer examples needed)
We started a company to keep pushing the boundaries of structured generation. Learn more about .txt, and give our .json API a try if you need a hosted solution.
The first step towards reliability of systems that include large language models is to ensure that there is a well-defined interface between their output and user-defined code. Outlines provides ways to control the generation of language models to make their output more predictable.
The following methods of structured generation are supported:
- Multiple choices
- Type constraints
- Efficient regex-structured generation
- Efficient JSON generation following a Pydantic model
- Using context-free grammars to guide generation
- Open functions
Outlines does not manage chat templating tokens when using instruct models. You must apply the chat template tokens to the prompt yourself. Chat template tokens are not needed for base models.
Please see the documentation on chat templating for more.
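As a minimal sketch of what "applying the chat template yourself" can look like, the helper below wraps a system and a user message in ChatML-style tokens, the layout used by the example prompts in this README. The function name `build_chatml_prompt` is hypothetical and not part of Outlines, and other model families use different tokens, so check the model card before reusing it. When a Hugging Face tokenizer is available, its `apply_chat_template` method is usually the safer route.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system and a user message in ChatML-style tokens.

    Hypothetical helper for illustration; the token names are an
    assumption that only holds for models trained on this layout.
    """
    return (
        "<|im_start|>system\n"
        f"{system}\n"
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{user}\n"
        "<|im_end|>\n"
        # Leave the assistant turn open so the model writes the answer.
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    system="You extract information from text.",
    user="What food does the following text describe?\n\nText: I really really really want pizza.",
)
print(prompt)
```

The resulting string can be passed to a generator in place of a hand-written prompt.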
You can reduce the completion to a choice between multiple possibilities:
import outlines
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = outlines.models.transformers(model_name)
# You must apply the chat template tokens to the prompt!
# See below for an example.
prompt = """
<|im_start|>system
You extract information from text.
<|im_end|>
<|im_start|>user
What food does the following text describe?
Text: I really really really want pizza.
<|im_end|>
<|im_start|>assistant
"""
generator = outlines.generate.choice(model, ["Pizza", "Pasta", "Salad", "Dessert"])
answer = generator(prompt)
# Likely answer: Pizza
You can also pass in choices with an Enum:
from enum import Enum
class Food(str, Enum):
    pizza = "Pizza"
    pasta = "Pasta"
    salad = "Salad"
    dessert = "Dessert"
generator = outlines.generate.choice(model, Food)
answer = generator(prompt)
# Likely answer: Pizza
You can instruct the model to only return integers or floats:
import outlines
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
prompt = "<s>result of 9 + 9 = 18</s><s>result of 1 + 2 = "
answer = outlines.generate.format(model, int)(prompt)
print(answer)
# 3
prompt = "sqrt(2)="
generator = outlines.generate.format(model, float)
answer = generator(prompt, max_tokens=10)
print(answer)
# 1.41421356
Outlines also comes with fast regex-structured generation. In fact, the choice and format functions above all use regex-structured generation under the hood:
import outlines
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
prompt = """
<|im_start|>system You are a helpful assistant.
<|im_end|>
<|im_start|>user
What is an IP address of the Google DNS servers?
<|im_end|>
<|im_start|>assistant
The IP address of a Google DNS server is
"""
generator = outlines.generate.text(model)
unstructured = generator(prompt, max_tokens=30)
generator = outlines.generate.regex(
    model,
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
    sampler=outlines.samplers.greedy(),
)
structured = generator(prompt, max_tokens=30)
print(unstructured)
# 8.8.8.8
#
# <|im_end|>
print(structured)
# 8.8.8.8
Unlike other libraries, regex-structured generation in Outlines is almost as fast as non-structured generation.
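To make the idea that multiple choices and type constraints reduce to regular expressions more concrete, here is a small standard-library sketch. The helper `choices_to_regex` and the two patterns are assumptions written for illustration; the patterns Outlines builds internally may well differ.

```python
import re


def choices_to_regex(choices):
    """Build an alternation that matches exactly one of the given strings."""
    return "(" + "|".join(re.escape(choice) for choice in choices) + ")"


# Illustrative patterns only, not the ones Outlines uses internally.
INTEGER_PATTERN = r"[+-]?(0|[1-9][0-9]*)"
FLOAT_PATTERN = r"[+-]?(0|[1-9][0-9]*)\.[0-9]+"

food_pattern = choices_to_regex(["Pizza", "Pasta", "Salad", "Dessert"])

print(food_pattern)
# (Pizza|Pasta|Salad|Dessert)
print(bool(re.fullmatch(food_pattern, "Pizza")))
# True
print(bool(re.fullmatch(food_pattern, "Burger")))
# False
print(bool(re.fullmatch(INTEGER_PATTERN, "3")))
# True
print(bool(re.fullmatch(FLOAT_PATTERN, "1.41421356")))
# True
```

A pattern built this way could in principle be handed to outlines.generate.regex, which is why a single regex engine is enough to cover all three features.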
Outlines users can guide the generation process so the output is guaranteed to follow a JSON schema or Pydantic model:
from enum import Enum
from pydantic import BaseModel, constr
import outlines
class Weapon(str, Enum):
    sword = "sword"
    axe = "axe"
    mace = "mace"
    spear = "spear"
    bow = "bow"
    crossbow = "crossbow"

class Armor(str, Enum):
    leather = "leather"
    chainmail = "chainmail"
    plate = "plate"

class Character(BaseModel):
    name: constr(max_length=10)
    age: int
    armor: Armor
    weapon: Weapon
    strength: int
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
# Construct structured sequence generator
generator = outlines.generate.json(model, Character)
# Draw a sample
seed = 789001
character = generator("Give me a character description", seed=seed)
print(repr(character))
# Character(name='Anderson', age=28, armor=<Armor.chainmail: 'chainmail'>, weapon=<Weapon.sword: 'sword'>, strength=8)
character = generator("Give me an interesting character description")
print(repr(character))
# Character(name='Vivian Thr', age=44, armor=<Armor.plate: 'plate'>, weapon=<Weapon.crossbow: 'crossbow'>, strength=125)
The method works with union types, optional types, arrays, nested schemas, etc. Some field constraints are not supported yet, but everything else should work.
Sometimes you just want to be able to pass a JSON Schema instead of a Pydantic model. We've got you covered:
import outlines
schema = '''{
    "title": "Character",
    "type": "object",
    "properties": {
        "name": {
            "title": "Name",
            "maxLength": 10,
            "type": "string"
        },
        "age": {
            "title": "Age",
            "type": "integer"
        },
        "armor": {"$ref": "#/definitions/Armor"},
        "weapon": {"$ref": "#/definitions/Weapon"},
        "strength": {
            "title": "Strength",
            "type": "integer"
        }
    },
    "required": ["name", "age", "armor", "weapon", "strength"],
    "definitions": {
        "Armor": {
            "title": "Armor",
            "description": "An enumeration.",
            "enum": ["leather", "chainmail", "plate"],
            "type": "string"
        },
        "Weapon": {
            "title": "Weapon",
            "description": "An enumeration.",
            "enum": ["sword", "axe", "mace", "spear", "bow", "crossbow"],
            "type": "string"
        }
    }
}'''
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, schema)
character = generator("Give me a character description")
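To see which constraints the schema above actually encodes, the following sketch checks a Python dictionary against the handful of JSON Schema keywords it uses (`required`, `type`, `enum`, `maxLength` and local `$ref`). The function `find_violations` is a hypothetical helper, not an Outlines API and not a full JSON Schema validator, and the schema is a trimmed copy with titles and descriptions removed.

```python
import json

# Trimmed copy of the schema above (titles and descriptions dropped).
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "name": {"maxLength": 10, "type": "string"},
    "age": {"type": "integer"},
    "armor": {"$ref": "#/definitions/Armor"},
    "weapon": {"$ref": "#/definitions/Weapon"},
    "strength": {"type": "integer"}
  },
  "required": ["name", "age", "armor", "weapon", "strength"],
  "definitions": {
    "Armor": {"enum": ["leather", "chainmail", "plate"], "type": "string"},
    "Weapon": {"enum": ["sword", "axe", "mace", "spear", "bow", "crossbow"], "type": "string"}
  }
}
""")

PYTHON_TYPES = {"string": str, "integer": int}


def find_violations(instance: dict, schema: dict) -> list:
    """Check an object against the small subset of JSON Schema used above."""
    violations = []
    for key in schema.get("required", []):
        if key not in instance:
            violations.append(f"missing required field: {key}")
    for key, spec in schema["properties"].items():
        if key not in instance:
            continue
        # Resolve local references such as "#/definitions/Armor".
        if "$ref" in spec:
            spec = schema["definitions"][spec["$ref"].rsplit("/", 1)[-1]]
        value = instance[key]
        expected = PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            violations.append(f"{key}: expected {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            violations.append(f"{key}: {value!r} not in {spec['enum']}")
        if "maxLength" in spec and len(value) > spec["maxLength"]:
            violations.append(f"{key}: longer than {spec['maxLength']} characters")
    return violations


good = {"name": "Anderson", "age": 28, "armor": "chainmail", "weapon": "sword", "strength": 8}
bad = {"name": "Vivian Thrace", "age": 44, "armor": "cloth", "weapon": "crossbow"}

print(find_violations(good, schema))
# []
print(find_violations(bad, schema))
```

With structured generation these constraints are enforced while the text is being produced, so a check like this should not be needed afterwards; it is shown only to spell out what the schema demands.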
Formal grammars rule the world, and Outlines makes them rule LLMs too. You can pass any context-free grammar in the EBNF format and Outlines will generate an output that is valid under this grammar:
import outlines
arithmetic_grammar = """
    ?start: expression

    ?expression: term (("+" | "-") term)*

    ?term: factor (("*" | "/") factor)*

    ?factor: NUMBER
           | "-" factor
           | "(" expression ")"

    %import common.NUMBER
"""
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.cfg(model, arithmetic_grammar)
sequence = generator("Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:")
print(sequence)
# (8-2)
This was a very simple grammar, and you can use outlines.generate.cfg to generate syntactically valid Python, SQL, and much more than this. Any kind of structured text, really. All you have to do is search for "X EBNF grammar" on the web, and take a look at the Outlines grammars module.
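To illustrate what "valid under the grammar" means for the arithmetic example, here is a small recursive-descent recognizer written with the standard library only. It is a sketch, not how Outlines processes grammars: the name `is_valid_expression` is hypothetical, NUMBER is simplified to unsigned integers and decimals, and whitespace is rejected on the assumption that the grammar above does not ignore it.

```python
import re

# NUMBER is simplified here to unsigned integers or decimals.
TOKEN_PATTERN = re.compile(r"\d+(?:\.\d+)?|[-+*/()]")


def tokenize(text: str) -> list:
    """Split text into numbers, operators and parentheses."""
    tokens, position = [], 0
    while position < len(text):
        match = TOKEN_PATTERN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected character at position {position}")
        tokens.append(match.group(0))
        position = match.end()
    return tokens


def is_valid_expression(text: str) -> bool:
    """Return True when text can be derived from the arithmetic grammar."""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def expression():  # expression: term (("+" | "-") term)*
        nonlocal position
        term()
        while peek() in ("+", "-"):
            position += 1
            term()

    def term():  # term: factor (("*" | "/") factor)*
        nonlocal position
        factor()
        while peek() in ("*", "/"):
            position += 1
            factor()

    def factor():  # factor: NUMBER | "-" factor | "(" expression ")"
        nonlocal position
        token = peek()
        if token is None:
            raise ValueError("unexpected end of input")
        if token == "-":
            position += 1
            factor()
        elif token == "(":
            position += 1
            expression()
            if peek() != ")":
                raise ValueError("expected ')'")
            position += 1
        elif token[0].isdigit():
            position += 1
        else:
            raise ValueError(f"unexpected token {token!r}")

    try:
        expression()
    except ValueError:
        return False
    # Every token must have been consumed for the text to be valid.
    return position == len(tokens)


print(is_valid_expression("(8-2)"))
# True
print(is_valid_expression("(8-2"))
# False
```

Grammar-structured generation aims to guarantee the first outcome by construction, so the model can never emit a string like the second one.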
Outlines can infer the structure of the output from the signature of a function. The result is a dictionary, and can be passed directly to the function using the usual dictionary expansion syntax **:
import outlines
def add(a: int, b: int):
    return a + b
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.json(model, add)
result = generator("Return json with two integers named a and b respectively. a is odd and b even.")
print(add(**result))
# 3
A great advantage of passing functions directly to specify the structure is that the structure of the LLM's output will change with the function's definition. No need to change the code in several places!
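The sketch below shows, with the standard library's `inspect` module, how a function signature can be read as a description of the expected fields. The helper `signature_to_fields` and the function `add_three` are hypothetical and only illustrate the principle; the way Outlines itself converts a signature into a schema is not shown here.

```python
import inspect


def add(a: int, b: int):
    return a + b


def add_three(a: int, b: int, c: float):
    return a + b + c


def signature_to_fields(function) -> dict:
    """Map each parameter name to the name of its annotated type."""
    parameters = inspect.signature(function).parameters
    return {name: parameter.annotation.__name__ for name, parameter in parameters.items()}


print(signature_to_fields(add))
# {'a': 'int', 'b': 'int'}
print(signature_to_fields(add_three))
# {'a': 'int', 'b': 'int', 'c': 'float'}
```

Adding a parameter to the function changes the derived field list automatically, which is the property the paragraph above relies on.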
You can also embed various functions into an enum to generate params:
from enum import Enum
from functools import partial
import outlines
def add(a: int, b: int) -> int:
    return a + b

def mul(c: float, d: float) -> float:
    return c * d

class Operation(Enum):
    add = partial(add)
    mul = partial(mul)
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.json(model, Operation)
result = generator("Return json with two floats named c and d respectively. c is negative and d greater than 1.0.")
print(result)
# {'c': -3.14, 'd': 1.5}
Building prompts can get messy. Outlines makes it easier to write and manage prompts by encapsulating templates inside "template functions".
These functions make it possible to neatly separate the prompt logic from the general program logic; they can be imported from other modules and libraries.
Template functions require no superfluous abstraction; they use the Jinja2 templating engine to help build complex prompts in a concise manner:
import outlines
examples = [
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended", "Positive"),
    ("The waiter was rude", "Negative")
]

@outlines.prompt
def labelling(to_label, examples):
    """You are a sentiment-labelling assistant.
    {% for example in examples %}
    {{ example[0] }} // {{ example[1] }}
    {% endfor %}
    {{ to_label }} //
    """
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
prompt = labelling("Just awesome", examples)
answer = outlines.generate.text(model)(prompt, max_tokens=100)
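For readers who want to see roughly what text the template above expands to, here is a plain-Python approximation that builds the same few-shot prompt without Jinja2. The function `render_labelling_prompt` is hypothetical, and the exact whitespace produced by the real template may differ from this sketch.

```python
examples = [
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended", "Positive"),
    ("The waiter was rude", "Negative"),
]


def render_labelling_prompt(to_label, examples):
    """Approximate the few-shot prompt with plain string formatting."""
    lines = ["You are a sentiment-labelling assistant.", ""]
    # One labelled example per line, in the "text // label" layout.
    lines += [f"{text} // {label}" for text, label in examples]
    # The final line is left open for the model to complete.
    lines.append(f"{to_label} //")
    return "\n".join(lines)


prompt = render_labelling_prompt("Just awesome", examples)
print(prompt)
```

Keeping this logic in one function, whether templated or not, is what lets the prompt be imported and reused across modules.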
- Have an idea? Come chat with us on Discord
- Want to contribute? Consult our contribution guide.
- Found a bug? Open an issue
@article{willard2023efficient,
  title={Efficient Guided Generation for LLMs},
  author={Willard, Brandon T and Louf, R{\'e}mi},
  journal={arXiv preprint arXiv:2307.09702},
  year={2023}
}