Managing prompts with Dotprompt

Firebase Genkit provides the Dotprompt plugin and text format to help you write and organize your generative AI prompts.

Dotprompt is designed around the premise that prompts are code. You write and maintain your prompts in specially-formatted files called dotprompt files, track changes to them using the same version control system that you use for your code, and you deploy them along with the code that calls your generative AI models.

To use Dotprompt, first create a prompts directory in your project root and then create a .prompt file in that directory. Here's a simple example you might call greeting.prompt:

---
model: vertexai/gemini-1.5-flash
config:
  temperature: 0.9
input:
  schema:
    location: string
    style?: string
    name?: string
  default:
    location: a restaurant
---

You are the world's most welcoming AI assistant and are currently working at {{location}}.

Greet a guest{{#if name}} named {{name}}{{/if}}{{#if style}} in the style of {{style}}{{/if}}.

To use this prompt, install the dotprompt plugin:

go get github.com/firebase/genkit/go/plugins/dotprompt

Then, load the prompt using Open:

import "github.com/firebase/genkit/go/plugins/dotprompt"
dotprompt.SetDirectory("prompts")
prompt, err := dotprompt.Open("greeting")

You can call the prompt's Generate method to render the template and pass it to the model API in one step:

ctx := context.Background()

// Default to the project in GCLOUD_PROJECT and the location "us-central1".
vertexai.Init(ctx, nil)

// The .prompt file specifies vertexai/gemini-1.5-flash, which is
// automatically defined by Init(). However, if it specified a model that
// isn't automatically loaded (such as a specific version), you would need
// to define it here:
// vertexai.DefineModel("gemini-1.0-pro-002", &ai.ModelCapabilities{
// 	Multiturn:  true,
// 	Tools:      true,
// 	SystemRole: true,
// 	Media:      false,
// })

type GreetingPromptInput struct {
	Location string `json:"location"`
	Style    string `json:"style"`
	Name     string `json:"name"`
}
response, err := prompt.Generate(
	ctx,
	dotprompt.WithInput(GreetingPromptInput{
		Location: "the beach",
		Style:    "a fancy pirate",
		Name:     "Ed",
	}),
	nil,
)
if err != nil {
	return err
}

fmt.Println(response.Text())

Or just render the template to a string:

renderedPrompt, err := prompt.RenderText(map[string]any{
	"location": "a restaurant",
	"style":    "a pirate",
})

Dotprompt's syntax is based on the Handlebars templating language. You can use the if, unless, and each helpers to add conditional portions to your prompt or iterate through structured content. The file format utilizes YAML frontmatter to provide metadata for a prompt inline with the template.

Defining Input/Output Schemas with Picoschema

Dotprompt includes a compact, YAML-based schema definition format called Picoschema to make it easy to define the most important attributs of a schema for LLM usage. Here's an example of a schema for an article:

schema:
  title: string # string, number, and boolean types are defined like this
  subtitle?: string # optional fields are marked with a `?`
  draft?: boolean, true when in draft state
  status?(enum, approval status): [PENDING, APPROVED]
  date: string, the date of publication e.g. '2024-04-09' # descriptions follow a comma
  tags(array, relevant tags for article): string # arrays are denoted via parentheses
  authors(array):
    name: string
    email?: string
  metadata?(object): # objects are also denoted via parentheses
    updatedAt?: string, ISO timestamp of last update
    approvedBy?: integer, id of approver
  extra?: any, arbitrary extra data
  (*): string, wildcard field

The above schema is equivalent to the following JSON schema:

{
  "properties": {
    "metadata": {
      "properties": {
        "updatedAt": {
          "type": "string",
          "description": "ISO timestamp of last update"
        },
        "approvedBy": {
          "type": "integer",
          "description": "id of approver"
        }
      },
      "type": "object"
    },
    "title": {
      "type": "string"
    },
    "subtitle": {
      "type": "string"
    },
    "draft": {
      "type": "boolean",
      "description": "true when in draft state"
    },
    "date": {
      "type": "string",
      "description": "the date of publication e.g. '2024-04-09'"
    },
    "tags": {
      "items": {
        "type": "string"
      },
      "type": "array",
      "description": "relevant tags for article"
    },
    "authors": {
      "items": {
        "properties": {
          "name": {
            "type": "string"
          },
          "email": {
            "type": "string"
          }
        },
        "type": "object",
        "required": ["name"]
      },
      "type": "array"
    }
  },
  "type": "object",
  "required": ["title", "date", "tags", "authors"]
}

Picoschema supports scalar types string, integer, number, boolean, and any. For objects, arrays, and enums they are denoted by a parenthetical after the field name.

Objects defined by Picoschema have all properties as required unless denoted optional by ?, and do not allow additional properties. When a property is marked as optional, it is also made nullable to provide more leniency for LLMs to return null instead of omitting a field.

In an object definition, the special key (*) can be used to declare a "wildcard" field definition. This will match any additional properties not supplied by an explicit key.

Picoschema does not support many of the capabilities of full JSON schema. If you require more robust schemas, you may supply a JSON Schema instead:

output:
  schema:
    type: object
    properties:
      field1:
        type: number
        minimum: 20

Overriding Prompt Metadata

While .prompt files allow you to embed metadata such as model configuration in the file itself, you can also override these values on a per-call basis:

// Make sure you set up the model you're using.
vertexai.DefineModel("gemini-1.5-flash", nil)

response, err := prompt.Generate(
	context.Background(),
	dotprompt.WithInput(GreetingPromptInput{
		Location: "the beach",
		Style:    "a fancy pirate",
		Name:     "Ed",
	}),
	dotprompt.WithModelName("vertexai/gemini-1.5-flash"),
	dotprompt.WithConfig(&ai.GenerationCommonConfig{
		Temperature: 1.0,
	}),
	nil,
)

Multi-message prompts

By default, Dotprompt constructs a single message with a "user" role. Some prompts are best expressed as a combination of multiple messages, such as a system prompt.

The {{role}} helper provides a simple way to construct multi-message prompts:

---
model: vertexai/gemini-1.5-flash
input:
  schema:
    userQuestion: string
---

{{role "system"}}
You are a helpful AI assistant that really loves to talk about food. Try to work
food items into all of your conversations.
{{role "user"}}
{{userQuestion}}

Multi-modal prompts

For models that support multimodal input such as images alongside text, you can use the {{media}} helper:

---
model: vertexai/gemini-1.5-flash
input:
  schema:
    photoUrl: string
---

Describe this image in a detailed paragraph:

{{media url=photoUrl}}

The URL can be https:// or base64-encoded data: URIs for "inline" image usage. In code, this would be:

dotprompt.SetDirectory("prompts")
describeImagePrompt, err := dotprompt.Open("describe_image")
if err != nil {
	return err
}

imageBytes, err := os.ReadFile("img.jpg")
if err != nil {
	return err
}
encodedImage := base64.StdEncoding.EncodeToString(imageBytes)
dataURI := "data:image/jpeg;base64," + encodedImage

type DescribeImagePromptInput struct {
	PhotoUrl string `json:"photo_url"`
}
response, err := describeImagePrompt.Generate(
	context.Background(),
	dotprompt.WithInput(DescribeImagePromptInput{
		PhotoUrl: dataURI,
	}),
	nil,
)

Prompt Variants

Because prompt files are just text, you can (and should!) commit them to your version control system, allowing you to compare changes over time easily. Often times, tweaked versions of prompts can only be fully tested in a production environment side-by-side with existing versions. Dotprompt supports this through its variants feature.

To create a variant, create a [name].[variant].prompt file. For instance, if you were using Gemini 1.5 Flash in your prompt but wanted to see if Gemini 1.5 Pro would perform better, you might create two files:

  • my_prompt.prompt: the "baseline" prompt
  • my_prompt.geminipro.prompt: a variant named "geminipro"

To use a prompt variant, specify the variant when loading:

describeImagePrompt, err := dotprompt.OpenVariant("describe_image", "geminipro")

The prompt loader will attempt to load the variant of that name, and fall back to the baseline if none exists. This means you can use conditional loading based on whatever criteria makes sense for your application:

var myPrompt *dotprompt.Prompt
var err error
if isBetaTester(user) {
	myPrompt, err = dotprompt.OpenVariant("describe_image", "geminipro")
} else {
	myPrompt, err = dotprompt.Open("describe_image")
}

The name of the variant is included in the metadata of generation traces, so you can compare and contrast actual performance between variants in the Genkit trace inspector.