All Gemini 1.0 and Gemini 1.5 models are now retired.
To avoid service disruption, update to a newer model (for example, gemini-2.5-flash-lite). Learn more.

Bu sayfa, Cloud Translation API ile çevrilmiştir.

Gemini Live API'yi kullanarak iki yönlü yayın

Gemini Live API, Gemini ile düşük gecikmeli çift yönlü metin ve ses etkileşimlerine olanak tanır. Live API kullanarak son kullanıcılara, metin veya sesli komutlarla modelin yanıtlarını kesme olanağı sunan, doğal ve insan benzeri sesli sohbet deneyimi sağlayabilirsiniz. Model, metin ve ses girişini işleyebilir (video desteği yakında!) ve metin ve ses çıkışı sağlayabilir.

Google AI Studio veya Vertex AI Studio içinde istemler ve Live API ile prototip oluşturabilirsiniz.

Live API, istemci ile Gemini sunucusu arasında oturum oluşturmak için WebSocket bağlantısı oluşturan durum bilgisi olan bir API'dir. Ayrıntılar için Live API referans belgelerine (Gemini Developer API | Vertex AI Gemini API) bakın.

Başlamadan önce

Bu sayfada sağlayıcıya özel içerikleri ve kodu görüntülemek için Gemini API sağlayıcınızı tıklayın.

Henüz yapmadıysanız başlangıç kılavuzunu tamamlayın. Bu kılavuzda Firebase projenizi ayarlama, uygulamanızı Firebase'e bağlama, SDK'yı ekleme, seçtiğiniz Gemini API sağlayıcısı için arka uç hizmetini başlatma ve LiveModel örneği oluşturma hakkında bilgiler yer almaktadır.

Bu özelliği destekleyen modeller

Live API özelliğini destekleyen modeller, seçtiğiniz Gemini API sağlayıcıya bağlıdır.

Gemini Developer API
- gemini-live-2.5-flash (özel GA^*)
- gemini-live-2.5-flash-preview
- gemini-2.0-flash-live-001
- gemini-2.0-flash-live-preview-04-09
Vertex AI Gemini API
- gemini-live-2.5-flash (özel GA^*)
- gemini-2.0-flash-live-preview-04-09 (yalnızca us-central1 üzerinden erişilebilir)

Live API için 2,5 model adında live segmentinin hemen gemini segmentinden sonra geldiğini unutmayın.

^{* Erişim isteğinde bulunmak için Google Cloud hesap ekibinizdeki temsilciyle iletişime geçin.}

Live API uygulamasının standart özelliklerini kullanma

Bu bölümde, Live API'nın standart özelliklerinin nasıl kullanılacağı açıklanmaktadır. Özellikle çeşitli giriş ve çıkış türlerini yayınlamak için:

Kısa mesaj gönderme ve alma
Ses gönderme ve ses alma
Ses gönderme ve metin alma
Metin gönderme ve ses alma

Yayınlanan metin girişinden yayınlanan metin oluşturma

Bu örneği denemeden önce projenizi ve uygulamanızı ayarlamak için bu kılavuzun Başlamadan önce bölümünü tamamlayın.Bu bölümde, seçtiğiniz Gemini API sağlayıcı için bir düğmeyi de tıklayarak bu sayfada sağlayıcıya özel içerikleri görebilirsiniz.

Yayınlanan metin girişini gönderebilir ve yayınlanan metin çıkışını alabilirsiniz. liveModel örneği oluşturduğunuzdan ve yanıt biçimini Text olarak ayarladığınızdan emin olun.

Swift


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: LiveGenerationConfig(
    responseModalities: [.text]
  )
)

do {
  let session = try await model.connect()

  // Provide a text prompt
  let text = "tell a short story"

  await session.sendTextRealtime(text)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? TextPart {
          outputText += part.text
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }

  // Output received from the server.
  print(outputText)
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT 
   }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

var outputText = ""
session.receive().collect {
    if(it.turnComplete) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java


ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
       // Handle the response from the server.
	System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	  LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: {
    responseModalities: [ResponseModality.TEXT],
  },
});

const session = await model.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Collect text from model's turn
let text = "";
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        console.log(text);
      } else {
        const parts = message.modelTurn?.parts;
        if (parts) {
          text += parts.map((part) => part.text).join("");
        }
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
   // Process the received message
}

Unity


using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}

Akış girişinden akışlı ses oluşturma

Yayınlanan ses girişini gönderebilir ve yayınlanan ses çıkışını alabilirsiniz. LiveModel örneği oluşturduğunuzdan ve yanıt biçimini Audio olarak ayarladığınızdan emin olun.

Yanıt sesini yapılandırma ve özelleştirme hakkında bilgi edinmek için bu sayfanın ilerleyen bölümlerini inceleyin.

Swift


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await model.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16bit pcm audio data at 24khz
          playAudio(part.data)
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO 
   }
)

val session = model.connect()

// This is the recommended way.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java


ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await model.connect();

// Start the audio conversation
const audioConversationController = await startAudioConversation(session);

// ... Later, to stop the audio conversation
// await audioConversationController.stop()

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
   // Configure the model to respond with audio
   liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
   // Process the received message
}

Unity


using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Akış ses girişinden akış metni oluşturma

Yayınlanan ses girişi gönderebilir ve yayınlanan metin çıkışı alabilirsiniz. LiveModel örneği oluşturduğunuzdan ve yanıt biçimini Text olarak ayarladığınızdan emin olun.

Swift


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: LiveGenerationConfig(
    responseModalities: [.text]
  )
)

do {
  let session = try await model.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? TextPart {
          outputText += part.text
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }

  // Output received from the server.
  print(outputText)
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
   }
)

val session = model.connect()

// Provide a text prompt
val audioContent = content("user") { audioData }

session.send(audioContent)

var outputText = ""
session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java


ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
	System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Send Audio data
	 session.send(new Content.Builder().addInlineData(audioData, "audio/pcm").build());

        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: {
    responseModalities: [ResponseModality.TEXT],
  },
});

const session = await model.connect();

// TODO(developer): Collect audio data (16-bit 16kHz PCM)
// const audioData = ...

// Send audio
const audioPart = {
  inlineData: { data: audioData, mimeType: "audio/pcm" },
};
session.send([audioPart]);

// Collect text from model's turn
let text = "";
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        console.log(text);
      } else {
        const parts = message.modelTurn?.parts;
        if (parts) {
          text += parts.map((part) => part.text).join("");
        }
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';
import 'dart:async';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  liveGenerationConfig: LiveGenerationConfig(responseModalities: ResponseModalities.text),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});

await _session.startMediaStream(mediaChunkStream);

final responseStream = _session.receive();

return responseStream.asyncMap((response) async {
  if (response.parts.isNotEmpty && response.parts.first.text != null) {
    return response.parts.first.text!;
  } else {
    throw Exception('Text response not found.');
  }
});

Future main() async {
  try {
    final textStream = await audioToText();

    await for (final text in textStream) {
      print('Received text: $text');
      // Handle the text response
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity


using Firebase;
using Firebase.AI;

async Task SendAudioReceiveText() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }

  StopCoroutine(recordingCoroutine);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Akış metni girişinden akış sesi oluşturma

Yayınlanan metin girişi gönderebilir ve yayınlanan ses çıkışı alabilirsiniz. LiveModel örneği oluşturduğunuzdan ve yanıt biçimini Audio olarak ayarladığınızdan emin olun.

Yanıt sesini yapılandırma ve özelleştirme hakkında bilgi edinmek için bu sayfanın ilerleyen bölümlerini inceleyin.

Swift


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await model.connect()

  // Provide a text prompt
  let text = "tell a short story"

  await session.sendTextRealtime(text)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16bit pcm audio data at 24khz
          playAudio(part.data)
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}

Kotlin


// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
   }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

session.receive().collect {
    if(it.turnComplete) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    // Handle 16bit pcm audio data at 24khz
    playAudio(it.data)
}

Java


ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16bit pcm audio data at 24khz
	liveContentResponse.getData();
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
	 LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await model.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Handle the model's audio output
const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        // TODO(developer): Handle turn completion
      } else if (message.interrupted) {
        // TODO(developer): Handle the interruption
        break;
      } else if (message.modelTurn) {
        const parts = message.modelTurn?.parts;
        parts?.forEach((part) => {
          if (part.inlineData) {
            // TODO(developer): Play the audio chunk
          }
        });
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}

Dart


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();

  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.googleAI().liveGenerativeModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with audio
    liveGenerationConfig: LiveGenerationConfig(responseModalities: ResponseModalities.audio),
  );

  _session = await model.connect();

  final prompt = Content.text(textPrompt);

  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
       for (final part in response.modelTurn!.parts) {
         if (part is InlineDataPart) {
           return part.bytes;
         }
       }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');

    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity


using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Daha ilgi çekici ve etkileşimli deneyimler oluşturma

Bu bölümde, Live API'nın daha ilgi çekici veya etkileşimli özelliklerinin nasıl oluşturulacağı ve yönetileceği açıklanmaktadır.

Yanıt sesini değiştirme

Live API, sentezlenmiş konuşma yanıtlarını desteklemek için Chirp 3'ü kullanır. Firebase AI Logic kullanırken çeşitli HD seslerde ses gönderebilirsiniz. Her bir sesin nasıl duyulduğunun tam listesi ve demoları için Chirp 3: HD sesler başlıklı makaleyi inceleyin.

Bir ses belirtmek için speechConfig nesnesindeki ses adını model yapılandırmasının bir parçası olarak ayarlayın. Ses belirtmezseniz varsayılan olarak Puck kullanılır.

Swift


import FirebaseAILogic

// ...

let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...

Kotlin


// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...

Java


// ...

LiveModel model = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
    "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModalities(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
        .build()
);

// ...

Web


// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});

Dart


// ...

final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: ResponseModalities.audio,
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...

Unity


var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME"))
);

Modeli istem verirken ve İngilizce olmayan bir dilde yanıt vermeye zorlarken en iyi sonuçları elde etmek için aşağıdakileri sistem talimatlarınıza ekleyin:

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.

Oturumlar ve istekler arasında bağlamı koruma

Oturumlar ve istekler arasında bağlamı korumak için sohbet yapısı kullanabilirsiniz. Bu özelliğin yalnızca metin girişi ve metin çıkışı için çalıştığını unutmayın.

Bu yaklaşım, kısa bağlamlar için en iyisidir. Etkinliklerin tam sırasını temsil etmek için adım adım etkileşimler gönderebilirsiniz. Daha uzun bağlamlar için, sonraki etkileşimlerde bağlam penceresini boşaltmak amacıyla tek bir mesaj özeti sağlamanızı öneririz.

Kesintileri ele alma

Firebase AI Logic henüz kesintileri işlemeyi desteklemiyor. Bir süre sonra tekrar kontrol edin.

İşlev çağrısını (araçlar) kullanma

Live API ile kullanmak üzere, standart içerik oluşturma yöntemlerinde olduğu gibi mevcut işlevler gibi araçlar tanımlayabilirsiniz. Bu bölümde, Live API'yi işlev çağrısıyla kullanırken dikkat edilmesi gereken bazı noktalar açıklanmaktadır. İşlev çağırma ile ilgili eksiksiz bir açıklama ve örnekler için işlev çağırma kılavuzuna bakın.

Model, tek bir istemden birden fazla işlev çağrısı ve bunların çıkışlarını zincirlemek için gereken kodu oluşturabilir. Bu kod, sonraki BidiGenerateContentToolCall mesajlarını oluşturarak bir korumalı alan ortamında yürütülür. Yürütme, her işlev çağrısının sonuçları kullanılabilir olana kadar duraklatılır. Bu, sıralı işlemeyi sağlar.

Ayrıca, Live API'nin işlev çağrısıyla birlikte kullanılması özellikle güçlüdür. Çünkü model, kullanıcıdan takip veya açıklama bilgisi isteyebilir. Örneğin, modelin çağırmak istediği bir işleve parametre değeri sağlamak için yeterli bilgisi yoksa model, kullanıcıdan daha fazla veya açıklayıcı bilgi vermesini isteyebilir.

Müşteri, BidiGenerateContentToolResponse ile yanıt vermelidir.

Sınırlamalar ve şartlar

Live API ile ilgili aşağıdaki sınırlamaları ve şartları göz önünde bulundurun.

Çeviri yazı

Firebase AI Logic henüz transkriptleri desteklemiyor. Bir süre sonra tekrar kontrol edin.

Diller

Giriş dilleri: Gemini modelleri için desteklenen giriş dillerinin tam listesini inceleyin.
Çıkış dilleri: Kullanılabilir çıkış dillerinin tam listesini Chirp 3: HD sesler bölümünde bulabilirsiniz.

Ses biçimleri

Live API aşağıdaki ses biçimlerini destekler:

Giriş ses biçimi: 16 kHz little-endian ham 16 bit PCM ses
Çıkış ses biçimi: 24 kHz little-endian ham 16 bit PCM ses

Hız sınırları

Live API, Firebase projesi başına eşzamanlı oturumlar ve dakikadaki jeton sayısı (TPM) için hız sınırlarına sahiptir.

Gemini Developer API:
- Sınırlar, projenizin Gemini Developer API"kullanım katmanına" göre değişir (hız sınırı belgelerine bakın).
Vertex AI Gemini API:
- Firebase projesi başına 5.000 eşzamanlı oturum
- Dakikada 4 milyon jeton

Oturum süresi

Varsayılan oturum uzunluğu 10 dakikadır. Oturum süresi sınırı aştığında bağlantı sonlandırılır.

Model, bağlam boyutuyla da sınırlıdır. Büyük giriş parçaları göndermek, oturumun daha erken sonlandırılmasına neden olabilir.

Ses etkinliği algılama (VAD)

Model, sürekli bir ses girişi akışında otomatik olarak ses etkinliği algılama (VAD) gerçekleştirir. VAD varsayılan olarak etkindir.

Jeton sayma

CountTokens API'yi Live API ile kullanamazsınız.

Firebase AI Logic ile ilgili deneyiminiz hakkında geri bildirim verme

Gemini Live API'yi kullanarak iki yönlü yayın Koleksiyonlar ile düzeninizi koruyun İçeriği tercihlerinize göre kaydedin ve kategorilere ayırın.

Başlamadan önce

Bu özelliği destekleyen modeller

Live API uygulamasının standart özelliklerini kullanma

Yayınlanan metin girişinden yayınlanan metin oluşturma

Swift

Kotlin

Java

Web

Dart

Unity

Akış girişinden akışlı ses oluşturma

Swift

Kotlin

Java

Web

Dart

Unity

Akış ses girişinden akış metni oluşturma

Swift

Kotlin

Java

Web

Dart

Unity

Akış metni girişinden akış sesi oluşturma

Swift

Kotlin

Java

Web

Dart

Unity

Daha ilgi çekici ve etkileşimli deneyimler oluşturma

Yanıt sesini değiştirme

Swift

Kotlin

Java

Web

Dart

Unity

Oturumlar ve istekler arasında bağlamı koruma

Kesintileri ele alma

İşlev çağrısını (araçlar) kullanma

Sınırlamalar ve şartlar

Çeviri yazı

Diller

Ses biçimleri

Hız sınırları

Oturum süresi

Ses etkinliği algılama (VAD)

Jeton sayma

Gemini Live API'yi kullanarak iki yönlü yayın