Cette page a été traduite par l'API Cloud Translation.

Streaming bidirectionnel à l'aide de l'API Gemini Live

Gemini Live API permet des interactions textuelles et vocales bidirectionnelles à faible latence avec Gemini. Vous pouvez l'utiliser pour proposer aux utilisateurs finaux des conversations vocales naturelles et leur permettre d'interrompre les réponses du modèle à l'aide de commandes textuelles ou vocales.Live API Le modèle peut traiter les entrées texte et audio (la vidéo sera bientôt disponible) et fournir des sorties texte et audio.

Vous pouvez prototyper avec des requêtes et Live API dans Vertex AI Studio.

Live API est une API avec état qui crée une connexion WebSocket pour établir une session entre le client et le serveur Gemini. Pour en savoir plus, consultez la documentation de référence de Live API.

Avant de commencer

Disponible uniquement lorsque vous utilisez Vertex AI Gemini API comme fournisseur d'API.

Si ce n'est pas déjà fait, suivez le guide de démarrage, qui décrit comment configurer votre projet Firebase, connecter votre application à Firebase, ajouter le SDK, initialiser le service de backend pour Vertex AI Gemini API et créer une instance LiveModel.

Modèles compatibles avec cette fonctionnalité

Live API est compatible uniquement avec gemini-2.0-flash-live-preview-04-09 (et non avec gemini-2.0-flash).

Notez également que gemini-2.0-flash-live-preview-04-09 n'est compatible qu'avec l'emplacement us-central1.

Utiliser les fonctionnalités standards de Live API

Cette section explique comment utiliser les fonctionnalités standards de Live API, en particulier pour diffuser différents types d'entrées et de sorties :

Envoyer et recevoir des messages
Envoyer et recevoir de l'audio
Envoyer de l'audio et recevoir du texte
Envoyer du texte et recevoir de l'audio

Générer du texte en streaming à partir d'une entrée de texte en streaming

Avant d'essayer cet exemple, suivez la section Avant de commencer de ce guide pour configurer votre projet et votre application.
Dans cette section, vous devez également cliquer sur un bouton pour le fournisseur Gemini API de votre choix afin d'afficher le contenu spécifique à ce fournisseur sur cette page.

Vous pouvez envoyer des entrées de texte en streaming et recevoir des sorties de texte en streaming. Veillez à créer une instance liveModel et à définir la modalité de réponse sur Text.

Swift

La Live API n'est pas encore compatible avec les applications de la plate-forme Apple, mais revenez bientôt sur cette page.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT 
   }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

var outputText = ""
session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
       // Handle the response from the server.
	System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	  LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

La Live API n'est pas encore compatible avec les applications Web, mais revenez bientôt sur cette page.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  config: LiveGenerationConfig(responseModalities: [ResponseModality.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
   // Process the received message 
}

Unity

using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}

Générer un flux audio à partir d'un flux audio en entrée

Vous pouvez envoyer des entrées audio en streaming et recevoir des sorties audio en streaming. Veillez à créer une instance LiveModel et à définir la modalité de réponse sur Audio.

Découvrez comment configurer et personnaliser la voix de réponse (plus loin sur cette page).

Swift

La Live API n'est pas encore compatible avec les applications de la plate-forme Apple, mais revenez bientôt sur cette page.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO 
   }
)

val session = model.connect()

// This is the recommended way.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

La Live API n'est pas encore compatible avec les applications Web, mais revenez bientôt sur cette page.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
   // Configure the model to respond with audio
   config: LiveGenerationConfig(responseModalities: [ResponseModality.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
   // Process the received message 
}

Unity

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Générer du texte en streaming à partir d'une entrée audio en streaming

Vous pouvez envoyer des entrées audio en streaming et recevoir des sorties de texte en streaming. Veillez à créer une instance LiveModel et à définir la modalité de réponse sur Text.

Swift

La fonctionnalité Live API n'est pas encore disponible pour les applications sur les plates-formes Apple. Revenez bientôt !

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT 
   }
)

val session = model.connect()

// Provide a text prompt
val audioContent = content("user") { audioData }

session.send(audioContent)

var outputText = ""
session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
	System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Send Audio data
	 session.send(new Content.Builder().addInlineData(audioData, "audio/pcm").build());

        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

La Live API n'est pas encore compatible avec les applications Web, mais revenez bientôt sur cette page.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';
import 'dart:async';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  config: LiveGenerationConfig(responseModality: ResponseModality.text),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});

await _session.startMediaStream(mediaChunkStream);

final responseStream = _session.receive();

return responseStream.asyncMap((response) async {
  if (response.parts.isNotEmpty && response.parts.first.text != null) {
    return response.parts.first.text!;
  } else {
    throw Exception('Text response not found.');
  }
});

Future main() async {
  try {
    final textStream = await audioToText();

    await for (final text in textStream) {
      print('Received text: $text');
      // Handle the text response
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using Firebase;
using Firebase.AI;

async Task SendAudioReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }

  StopCoroutine(recordingCoroutine);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Générer un contenu audio en streaming à partir d'une entrée de texte en streaming

Vous pouvez envoyer des entrées de texte en streaming et recevoir des sorties audio en streaming. Veillez à créer une instance LiveModel et à définir la modalité de réponse sur Audio.

Découvrez comment configurer et personnaliser la voix de réponse (plus loin sur cette page).

Swift

La Live API n'est pas encore compatible avec les applications de la plate-forme Apple, mais revenez bientôt sur cette page.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
   }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    // Handle 16bit pcm audio data at 24khz
    playAudio(it.data)
}

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16bit pcm audio data at 24khz
	liveContentResponse.getData();
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

La Live API n'est pas encore compatible avec les applications Web, mais revenez bientôt sur cette page.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();

  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with audio
    config: LiveGenerationConfig(responseModality: ResponseModality.audio),
  );

  _session = await model.connect();

  final prompt = Content.text(textPrompt);

  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
       for (final part in response.modelTurn!.parts) {
         if (part is InlineDataPart) {
           return part.bytes;
         }
       }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');

    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Créez des expériences plus attrayantes et interactives

Cette section explique comment créer et gérer des fonctionnalités plus attrayantes ou interactives de Live API.

Changer la voix de la réponse

Live API utilise Chirp 3 pour prendre en charge les réponses vocales synthétisées. Lorsque vous utilisez Firebase AI Logic, vous pouvez envoyer de l'audio dans différentes langues et avec des voix HD. Pour obtenir la liste complète et des démos de chaque voix, consultez Chirp 3 : voix HD.

Pour spécifier une voix, définissez son nom dans l'objet speechConfig dans la configuration du modèle. Si vous ne spécifiez pas de voix, la valeur par défaut est Puck.

Swift

La Live API n'est pas encore compatible avec les applications de la plate-forme Apple, mais revenez bientôt sur cette page.

Kotlin

// ...

val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voices.FENRIR)
    }
)

// ...

Java

// ...

LiveModel model = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
    "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModalities(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(Voices.FENRIR))
        .build()
);

// ...

Web

La Live API n'est pas encore compatible avec les applications Web, mais revenez bientôt sur cette page.

Dart

// ...

final model = FirebaseAI.vertexAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  config: LiveGenerationConfig(
    responseModality: ResponseModality.audio,
    speechConfig: SpeechConfig(voiceName: 'Fenrir'),
  ),
);

// ...

Unity

var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("Fenrir"),
);

Pour obtenir les meilleurs résultats lorsque vous demandez au modèle de répondre dans une langue autre que l'anglais, incluez les éléments suivants dans vos instructions système :

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.

Conserver le contexte entre les sessions et les requêtes

Vous pouvez utiliser une structure de chat pour conserver le contexte entre les sessions et les requêtes. Notez que cela ne fonctionne que pour les entrées et sorties de texte.

Cette approche est idéale pour les contextes courts. Vous pouvez envoyer des interactions au fur et à mesure pour représenter la séquence exacte des événements. Pour les contextes plus longs, nous vous recommandons de fournir un seul résumé des messages afin de libérer la fenêtre de contexte pour les interactions ultérieures.

Gérer les interruptions

Firebase AI Logic n'est pas encore compatible avec la gestion des interruptions. Veuillez réessayer ultérieurement.

Utiliser l'appel de fonction (outils)

Vous pouvez définir des outils, comme des fonctions disponibles, à utiliser avec l'API Live, tout comme vous le feriez avec les méthodes de génération de contenu standards. Cette section décrit certaines nuances à prendre en compte lorsque vous utilisez l'API Live avec l'appel de fonction. Pour obtenir une description complète et des exemples d'appel de fonction, consultez le guide sur l'appel de fonction.

À partir d'une seule requête, le modèle peut générer plusieurs appels de fonction et le code nécessaire pour chaîner leurs sorties. Ce code s'exécute dans un environnement de bac à sable, générant par la suite des messages BidiGenerateContentToolCall. L'exécution est interrompue jusqu'à ce que les résultats de chaque appel de fonction soient disponibles, ce qui assure un traitement séquentiel.

De plus, l'utilisation de l'API Live avec l'appel de fonction est particulièrement efficace, car le modèle peut demander à l'utilisateur des informations complémentaires ou des précisions. Par exemple, si le modèle ne dispose pas de suffisamment d'informations pour fournir une valeur de paramètre à une fonction qu'il souhaite appeler, il peut demander à l'utilisateur de fournir des informations supplémentaires ou plus claires.

Le client doit répondre avec BidiGenerateContentToolResponse.

Limites et exigences

Tenez compte des limites et des exigences suivantes concernant Live API.

Transcription

Firebase AI Logic n'est pas encore compatible avec les transcriptions. Veuillez réessayer ultérieurement.

Langues

Langues d'entrée : consultez la liste complète des langues d'entrée acceptées pour les modèles Gemini.
Langues de sortie : consultez la liste complète des langues de sortie disponibles dans Chirp 3 : voix HD.

Formats audio

L'outil Live API accepte les formats audio suivants :

Format audio d'entrée : audio PCM 16 bits brut à 16 kHz little-endian
Format audio de sortie : audio PCM 16 bits brut à 24 kHz little-endian

Limites de débit

Les limites de débit suivantes s'appliquent :

10 sessions simultanées par projet Firebase
4 millions de jetons par minute

Durée des sessions

La durée par défaut d'une session est de 30 minutes. Lorsque la durée de la session dépasse la limite, la connexion est interrompue.

Le modèle est également limité par la taille du contexte. L'envoi de grandes quantités de données d'entrée peut entraîner une fin de session prématurée.

Détection de l'activité vocale (VAD)

Le modèle détecte automatiquement l'activité vocale (VAD) sur un flux d'entrée audio continu. La VAD est activée par défaut.

Comptage des jetons

Vous ne pouvez pas utiliser l'API CountTokens avec Live API.

Envoyer des commentaires sur votre expérience avec Firebase AI Logic

Streaming bidirectionnel à l'aide de l'API Gemini Live Restez organisé à l'aide des collections Enregistrez et classez les contenus selon vos préférences.

Avant de commencer

Modèles compatibles avec cette fonctionnalité

Utiliser les fonctionnalités standards de Live API

Générer du texte en streaming à partir d'une entrée de texte en streaming

Swift

Kotlin

Java

Web

Dart

Unity

Générer un flux audio à partir d'un flux audio en entrée

Swift

Kotlin

Java

Web

Dart

Unity

Générer du texte en streaming à partir d'une entrée audio en streaming

Swift

Kotlin

Java

Web

Dart

Unity

Générer un contenu audio en streaming à partir d'une entrée de texte en streaming

Swift

Kotlin

Java

Web

Dart

Unity

Créez des expériences plus attrayantes et interactives

Changer la voix de la réponse

Swift

Kotlin

Java

Web

Dart

Unity

Conserver le contexte entre les sessions et les requêtes

Gérer les interruptions

Utiliser l'appel de fonction (outils)

Limites et exigences

Transcription

Langues

Formats audio

Limites de débit

Durée des sessions

Détection de l'activité vocale (VAD)

Comptage des jetons

Streaming bidirectionnel à l'aide de l'API Gemini Live