使用 Gemini API 生成结构化输出(例如 JSON 和枚举)

默认情况下,Gemini API 会以非结构化文本的形式返回响应。不过,某些用例需要结构化文本,例如 JSON。例如,您可能需要将响应用于需要已建立的数据架构的其他下游任务。

为确保模型生成的输出始终遵循特定架构,您可以定义响应架构,该架构类似于模型响应的蓝图。然后,您可以直接从模型的输出中提取数据,而无需进行太多的后期处理。

下面是一些示例:

  • 确保模型的响应生成有效的 JSON 并符合您提供的架构。
    例如,该模型可以为食谱生成结构化条目,其中始终包含食谱名称、成分列表和步骤。这样,您就可以更轻松地解析这些信息,并在应用界面中显示这些信息。

  • 限制模型在分类任务期间的响应方式。
    例如,您可以让模型使用一组特定的标签(例如一组特定的枚举,如 positivenegative)为文本添加注释,而不是使用模型生成的标签(这些标签可能具有一定程度的可变性,例如 goodpositivenegativebad)。

本指南介绍了如何在调用 generateContent 时提供 responseSchema 以生成 JSON 输出。它专注于纯文本输入,但 Gemini 还可以针对包含图片、视频和音频作为输入的多模态请求生成结构化回答。

本页面底部提供了更多示例,例如如何生成枚举值作为输出。如需查看有关如何生成结构化输出的更多示例,请参阅 Google Cloud 文档中的示例架构和模型响应列表。

准备工作

如果您尚未完成入门指南,请先完成该指南。该指南介绍了如何设置 Firebase 项目、将应用连接到 Firebase、添加 SDK、初始化 Vertex AI 服务以及创建 GenerativeModel 实例。

第 1 步:定义响应架构

通过定义响应架构指定模型输出的结构、字段名称以及每个字段的预期数据类型。

模型生成响应时,会使用提示中的字段名称和上下文。为确保您的 intent 清晰明了,我们建议您使用清晰的结构、明确的字段名称,甚至根据需要添加说明。

响应架构注意事项

编写回答架构时,请注意以下事项:

  • 响应架构的大小会占用输入词元限额。

  • 响应架构功能支持以下响应 MIME 类型:

    • application/json:输出响应架构中定义的 JSON(适用于结构化输出要求)

    • text/x.enum:输出回答架构中定义的枚举值(对分类任务很有用)

  • 响应架构功能支持以下架构字段:

    enum
    items
    maxItems
    nullable
    properties
    required

    如果您使用的是不受支持的字段,模型仍可以处理您的请求,但会忽略该字段。请注意,上述列表是 OpenAPI 3.0 架构对象的子集(请参阅 Vertex AI 架构参考文档)。

  • 默认情况下,对于 Vertex AI in Firebase SDK,除非您在 optionalProperties 数组中将其指定为可选,否则所有字段都被视为必填字段。对于这些可选字段,模型可以填充这些字段或跳过这些字段。

    请注意,这与 Vertex AI Gemini API 的默认行为相反。

第 2 步:发送包含响应架构的问题以生成 JSON

以下示例展示了如何生成结构化 JSON 输出。

创建 GenerativeModel 实例时,请指定适当的 responseMimeType(在此示例中为 application/json)以及您希望模型使用的 responseSchema

Swift

import FirebaseVertexAI

// Provide a JSON schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
let jsonSchema = Schema.object(
  properties: [
    "characters": Schema.array(
      items: .object(
        properties: [
          "name": .string(),
          "age": .integer(),
          "species": .string(),
          "accessory": .enumeration(values: ["hat", "belt", "shoes"]),
        ],
        optionalProperties: ["accessory"]
      )
    ),
  ]
)

// Initialize the Vertex AI service and the generative model.
let model = VertexAI.vertexAI().generativeModel(
  modelName: "gemini-2.0-flash",
  // In the generation config, set the `responseMimeType` to `application/json`
  // and pass the JSON schema object into `responseSchema`.
  generationConfig: GenerationConfig(
    responseMIMEType: "application/json",
    responseSchema: jsonSchema
  )
)

let prompt = "For use in a children's card game, generate 10 animal-based characters."

let response = try await model.generateContent(prompt)
print(response.text ?? "No text in response.")

Kotlin

对于 Kotlin,此 SDK 中的方法是挂起函数,需要从协程作用域调用。
// Provide a JSON schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
val jsonSchema = Schema.obj(
    mapOf("characters" to Schema.array(
        Schema.obj(
            mapOf(
                "name" to Schema.string(),
                "age" to Schema.integer(),
                "species" to Schema.string(),
                "accessory" to Schema.enumeration(listOf("hat", "belt", "shoes")),
            ),
            optionalProperties = listOf("accessory")
        )
    ))
)

// Initialize the Vertex AI service and the generative model.
val generativeModel = Firebase.vertexAI.generativeModel(
    modelName = "gemini-2.0-flash",
    // In the generation config, set the `responseMimeType` to `application/json`
    // and pass the JSON schema object into `responseSchema`.
    generationConfig = generationConfig {
        responseMimeType = "application/json"
        responseSchema = jsonSchema
    })

val prompt = "For use in a children's card game, generate 10 animal-based characters."
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

对于 Java,此 SDK 中的流式传输方法会从 Reactive Streams 库返回 Publisher 类型。
// Provide a JSON schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
Schema jsonSchema = Schema.obj(
        /* properties */
        Map.of(
                "characters", Schema.array(
                        /* items */ Schema.obj(
                                /* properties */
                                Map.of("name", Schema.str(),
                                        "age", Schema.numInt(),
                                        "species", Schema.str(),
                                        "accessory",
                                        Schema.enumeration(
                                                List.of("hat", "belt", "shoes")))
                        ))),
        List.of("accessory"));

// In the generation config, set the `responseMimeType` to `application/json`
// and pass the JSON schema object into `responseSchema`.
GenerationConfig.Builder configBuilder = new GenerationConfig.Builder();
configBuilder.responseMimeType = "application/json";
configBuilder.responseSchema = jsonSchema;

GenerationConfig generationConfig = configBuilder.build();

// Initialize the Vertex AI service and the generative model.
GenerativeModel gm = FirebaseVertexAI.getInstance().generativeModel(
  /* modelName */ "gemini-2.0-flash",
  /* generationConfig */ generationConfig);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Content content = new Content.Builder()
    .addText("For use in a children's card game, generate 10 animal-based characters.")
    .build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(
    response,
    new FutureCallback<GenerateContentResponse>() {
      @Override
      public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel, Schema } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration.
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service.
const vertexAI = getVertexAI(firebaseApp);

// Provide a JSON schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
const jsonSchema = Schema.object({
 properties: {
    characters: Schema.array({
      items: Schema.object({
        properties: {
          name: Schema.string(),
          accessory: Schema.string(),
          age: Schema.number(),
          species: Schema.string(),
        },
        optionalProperties: ["accessory"],
      }),
    }),
  }
});

// Initialize the generative model.
const model = getGenerativeModel(vertexAI, {
  model: "gemini-2.0-flash",
  // In the generation config, set the `responseMimeType` to `application/json`
  // and pass the JSON schema object into `responseSchema`.
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: jsonSchema
  },
});


let prompt = "For use in a children's card game, generate 10 animal-based characters.";

let result = await model.generateContent(prompt)
console.log(result.response.text());

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';

// Provide a JSON schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
final jsonSchema = Schema.object(
        properties: {
          'characters': Schema.array(
            items: Schema.object(
              properties: {
                'name': Schema.string(),
                'age': Schema.integer(),
                'species': Schema.string(),
                'accessory':
                    Schema.enumString(enumValues: ['hat', 'belt', 'shoes']),
              },
            ),
          ),
        },
        optionalProperties: ['accessory'],
      );

await Firebase.initializeApp();
// Initialize the Vertex AI service and the generative model.
final model =
      FirebaseVertexAI.instance.generativeModel(
        model: 'gemini-2.0-flash',
        // In the generation config, set the `responseMimeType` to `application/json`
        // and pass the JSON schema object into `responseSchema`.
        generationConfig: GenerationConfig(
            responseMimeType: 'application/json', responseSchema: jsonSchema));

final prompt = "For use in a children's card game, generate 10 animal-based characters.";
final response = await model.generateContent([Content.text(prompt)]);
print(response.text);

了解如何选择适合您的应用场景和应用的模型和(可选)位置

更多示例

如需查看有关如何使用和生成结构化输出的更多示例,请参阅 Google Cloud 文档中的示例架构和模型响应列表。

生成枚举值作为输出

以下示例展示了如何为分类任务使用响应架构。要求模型根据电影的说明来识别其类型。输出是模型从在所提供的响应架构中定义的列表值中选择的一个纯文本枚举值。

如需执行此结构化分类任务,您需要在模型初始化期间指定适当的 responseMimeType(在此示例中为 text/x.enum)以及您希望模型使用的 responseSchema

Swift

import FirebaseVertexAI

// Provide an enum schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
let enumSchema = Schema.enumeration(values: ["drama", "comedy", "documentary"])

// Initialize the Vertex AI service and the generative model.
let model = VertexAI.vertexAI().generativeModel(
  modelName: "gemini-2.0-flash",
  // In the generation config, set the `responseMimeType` to `text/x.enum`
  // and pass the enum schema object into `responseSchema`.
  generationConfig: GenerationConfig(
    responseMIMEType: "text/x.enum",
    responseSchema: enumSchema
  )
)

let prompt = """
The film aims to educate and inform viewers about real-life subjects, events, or people.
It offers a factual record of a particular topic by combining interviews, historical footage,
and narration. The primary purpose of a film is to present information and provide insights
into various aspects of reality.
"""

let response = try await model.generateContent(prompt)
print(response.text ?? "No text in response.")

Kotlin

对于 Kotlin,此 SDK 中的方法是挂起函数,需要从协程作用域调用。
// Provide an enum schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
val enumSchema = Schema.enumeration(listOf("drama", "comedy", "documentary"))

// Initialize the Vertex AI service and the generative model.
val generativeModel = Firebase.vertexAI.generativeModel(
    modelName = "gemini-2.0-flash",
    // In the generation config, set the `responseMimeType` to `text/x.enum`
    // and pass the enum schema object into `responseSchema`.
    generationConfig = generationConfig {
        responseMimeType = "text/x.enum"
        responseSchema = enumSchema
    })

val prompt = """
    The film aims to educate and inform viewers about real-life subjects, events, or people.
    It offers a factual record of a particular topic by combining interviews, historical footage, 
    and narration. The primary purpose of a film is to present information and provide insights 
    into various aspects of reality.
    """
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

对于 Java,此 SDK 中的流式传输方法会从 Reactive Streams 库返回 Publisher 类型。
// Provide an enum schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
Schema enumSchema = Schema.enumeration(List.of("drama", "comedy", "documentary"));

// In the generation config, set the `responseMimeType` to `text/x.enum`
// and pass the enum schema object into `responseSchema`.
GenerationConfig.Builder configBuilder = new GenerationConfig.Builder();
configBuilder.responseMimeType = "text/x.enum";
configBuilder.responseSchema = enumSchema;

GenerationConfig generationConfig = configBuilder.build();

// Initialize the Vertex AI service and the generative model.
GenerativeModel gm = FirebaseVertexAI.getInstance().generativeModel(
  /* modelName */ "gemini-2.0-flash",
  /* generationConfig */ generationConfig);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

String prompt = "The film aims to educate and inform viewers about real-life subjects," +
                " events, or people. It offers a factual record of a particular topic by" +
                " combining interviews, historical footage, and narration. The primary purpose" +
                " of a film is to present information and provide insights into various aspects" +
                " of reality.";

Content content = new Content.Builder().addText(prompt).build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(
    response,
    new FutureCallback<GenerateContentResponse>() {
      @Override
      public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel, Schema } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration.
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
 // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service.
const vertexAI = getVertexAI(firebaseApp);

// Provide an enum schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
const enumSchema = Schema.enumString({
  enum: ["drama", "comedy", "documentary"],
});

// Initialize the generative model.
const model = getGenerativeModel(vertexAI, {
  model: "gemini-2.0-flash",
  // In the generation config, set the `responseMimeType` to `text/x.enum`
  // and pass the JSON schema object into `responseSchema`.
  generationConfig: {
    responseMimeType: "text/x.enum",
    responseSchema: enumSchema,
  },
});

let prompt = `The film aims to educate and inform viewers about real-life
subjects, events, or people. It offers a factual record of a particular topic
by combining interviews, historical footage, and narration. The primary purpose
of a film is to present information and provide insights into various aspects
of reality.`;

let result = await model.generateContent(prompt);
console.log(result.response.text());

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';

// Provide an enum schema object using a standard format.
// Later, pass this schema object into `responseSchema` in the generation config.
final enumSchema = Schema.enumString(enumValues: ['drama', 'comedy', 'documentary']);

await Firebase.initializeApp();
// Initialize the Vertex AI service and the generative model.
final model =
      FirebaseVertexAI.instance.generativeModel(
        model: 'gemini-2.0-flash',
        // In the generation config, set the `responseMimeType` to `text/x.enum`
        // and pass the enum schema object into `responseSchema`.
        generationConfig: GenerationConfig(
            responseMimeType: 'text/x.enum', responseSchema: enumSchema));

final prompt = """
      The film aims to educate and inform viewers about real-life subjects, events, or people.
      It offers a factual record of a particular topic by combining interviews, historical footage, 
      and narration. The primary purpose of a film is to present information and provide insights
      into various aspects of reality.
      """;
final response = await model.generateContent([Content.text(prompt)]);
print(response.text);

了解如何选择适合您的应用场景和应用的模型和(可选)位置

用于控制内容生成的其他选项

  • 详细了解问题设计,以便影响模型生成符合您需求的输出。
  • 配置模型参数以控制模型如何生成回答。对于 Gemini 模型,这些参数包括输出 token 数上限、温度、topK 和 topP。 对于 Imagen 模型,这些包括宽高比、生成人物、添加水印等。
  • 您可以使用安全设置来调整系统提供可能被视为有害的回答(包括仇恨言论和露骨色情内容)的可能性。
  • 设置系统指令,以引导模型的行为。此功能类似于您在模型接触到来自最终用户的任何进一步指令之前添加的“序言”。


就您使用 Vertex AI in Firebase 的体验提供反馈