
Bidirectional streaming using the Gemini Live API

The Gemini Live API enables low-latency, bidirectional text and voice interactions with Gemini. Using the Live API, you can give end users natural, human-like voice conversations, with the ability to interrupt the model's responses using text or voice commands. The model can process text and audio input (video is coming soon), and it can provide text and audio output.

You can prototype with prompts and the Live API in Vertex AI Studio.

The Live API is a stateful API that creates a WebSocket connection to establish a session between the client and the Gemini server. For details, see the Live API reference documentation.

Before you begin

Only available when using the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for the Vertex AI Gemini API, and create a LiveModel instance.

Models that support this capability

The Live API is supported only by gemini-2.0-flash-live-preview-04-09 (not gemini-2.0-flash).

Use the standard features of the Live API

This section describes how to use the standard features of the Live API, specifically to stream various types of inputs and outputs:

Send text and receive text
Send audio and receive audio
Send audio and receive text
Send text and receive audio

Generate streamed text from streamed text input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed text input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

var outputText = ""
session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving()
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for Web apps, but check back soon.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
final model = FirebaseAI.vertexAI().liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  config: LiveGenerationConfig(responseModalities: [ResponseModality.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
  // Process the received message
}

Unity

using System.Threading.Tasks;
using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}

Learn how to choose a model appropriate for your use case and app.

Generate streamed audio from streamed audio input

You can send streamed audio input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift

The Live API is not yet supported for Apple platform apps, but check back soon.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// This is the recommended way.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for Web apps, but check back soon.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
final model = FirebaseAI.vertexAI().liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with audio
  config: LiveGenerationConfig(responseModalities: [ResponseModality.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
  // Process the received message
}

Unity

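using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendAudioReceiveAudio() {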
  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Learn how to choose a model appropriate for your use case and app.

Generate streamed text from streamed audio input

You can send streamed audio input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide the audio data as content
val audioContent = content("user") { audioData }

session.send(audioContent)

var outputText = ""
session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving()
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Send the audio data
        session.send(new Content.Builder().addInlineData(audioData, "audio/pcm").build());

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for Web apps, but check back soon.

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';
import 'dart:async';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

Future<Stream<String>> audioToText() async {
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  final model = FirebaseAI.vertexAI().liveModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with text
    config: LiveGenerationConfig(responseModalities: [ResponseModality.text]),
  );

  _session = await model.connect();

  final audioRecordStream = _audioRecorder.startRecordingStream();
  // Map the Uint8List stream to InlineDataPart stream
  final mediaChunkStream = audioRecordStream.map((data) {
    return InlineDataPart('audio/pcm', data);
  });

  await _session.startMediaStream(mediaChunkStream);

  final responseStream = _session.receive();

  // Emit the text of each response as it arrives
  return responseStream.asyncMap((response) async {
    if (response.parts.isNotEmpty && response.parts.first.text != null) {
      return response.parts.first.text!;
    } else {
      throw Exception('Text response not found.');
    }
  });
}

Future<void> main() async {
  try {
    final textStream = await audioToText();

    await for (final text in textStream) {
      print('Received text: $text');
      // Handle the text response
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using System.Collections;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendAudioReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }

  StopCoroutine(recordingCoroutine);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Learn how to choose a model appropriate for your use case and app.

Generate streamed audio from streamed text input

You can send streamed text input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift

The Live API is not yet supported for Apple platform apps, but check back soon.

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

session.receive().collect {
    if(it.status == Status.TURN_COMPLETE) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving()
    }
    // Handle 16bit pcm audio data at 24khz
    playAudio(it.data)
}

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Create a `LiveModel` instance with the model that supports the Live API
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16bit pcm audio data at 24khz
        liveContentResponse.getData();
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for Web apps, but check back soon.

Dart

import 'dart:async';
import 'dart:typed_data';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/widgets.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();

  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  final model = FirebaseAI.vertexAI().liveModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with audio
    config: LiveGenerationConfig(responseModalities: [ResponseModality.audio]),
  );

  _session = await model.connect();

  final prompt = Content.text(textPrompt);

  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
       for (final part in response.modelTurn!.parts) {
         if (part is InlineDataPart) {
           return part.bytes;
         }
       }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');

    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using System.Collections.Generic;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendTextReceiveAudio() {
  // Initialize the Vertex AI Gemini API backend service
  // Create a `LiveModel` instance with the model that supports the Live API
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Learn how to choose a model appropriate for your use case and app.

Create more engaging and interactive experiences

This section describes how to create and manage more engaging or interactive features of the Live API.

Change the response voice

The Live API uses Chirp 3 to support synthesized speech responses. When using Firebase AI Logic, you can send audio in a variety of HD voices and languages. For a full list and demos of what each voice sounds like, see Chirp 3: HD voices.

To specify a voice, set the voice name within the speechConfig object as part of the model configuration. If you don't specify a voice, the default is Puck.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon.

Kotlin

// ...

val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voices.FENRIR)
    }
)

// ...

Java

// ...

LiveGenerativeModel model = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
    "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModalities(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(Voices.FENRIR))
        .build()
);

// ...

Web

The Live API is not yet supported for Web apps, but check back soon.

Dart

// ...

final model = FirebaseAI.vertexAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  config: LiveGenerationConfig(
    responseModalities: [ResponseModality.audio],
    speechConfig: SpeechConfig(voiceName: 'Fenrir'),
  ),
);

// ...

Unity

var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("Fenrir"))
);

For the best results when prompting and requiring the model to respond in a non-English language, include the following as part of your system instructions:

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
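
As a minimal sketch of filling in that template, the hypothetical helper below substitutes a language name for the LANGUAGE placeholder. The class and method names are illustrative assumptions and are not part of any Firebase SDK; the resulting string is what you would supply as the system instruction when creating the model.

```java
import java.util.Locale;

// Hypothetical helper (not part of any Firebase SDK): fills the LANGUAGE
// placeholder in the recommended system instruction text.
final class LanguageInstruction {
    private static final String TEMPLATE =
            "RESPOND IN %1$s. YOU MUST RESPOND UNMISTAKABLY IN %1$s.";

    static String forLanguage(String language) {
        if (language == null || language.isBlank()) {
            throw new IllegalArgumentException("language must not be blank");
        }
        // Upper-case the name so it matches the style of the template.
        return String.format(TEMPLATE, language.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(forLanguage("Spanish"));
    }
}
```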

Maintain context across sessions and requests

You can use a chat structure to maintain context across sessions and requests. Note that this only works for text input and text output.

This approach is best for short contexts; you can send turn-by-turn interactions to represent the exact sequence of events. For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
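
One way to picture this trade-off is a client-side history that replays turns one by one while the context is short and collapses them into a single summary message once it grows. The sketch below is an assumption-laden illustration: ContextHistory, Turn, the maxTurns threshold, and the summarizer callback are all hypothetical names, the summary is assumed to be replayed as a "user" message, and producing the summary text is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical client-side history; none of these names come from a Firebase SDK.
final class ContextHistory {
    // One turn of the conversation: who spoke and what was said.
    static final class Turn {
        final String role;
        final String text;

        Turn(String role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    private final List<Turn> turns = new ArrayList<>();
    private final int maxTurns;
    private final Function<List<Turn>, String> summarizer;

    ContextHistory(int maxTurns, Function<List<Turn>, String> summarizer) {
        this.maxTurns = maxTurns;
        this.summarizer = summarizer;
    }

    // Records a turn; once the history exceeds maxTurns, it is replaced by
    // a single summary message.
    void add(String role, String text) {
        turns.add(new Turn(role, text));
        if (turns.size() > maxTurns) {
            String summary = summarizer.apply(List.copyOf(turns));
            turns.clear();
            // Assumption: the summary is replayed as one "user" message.
            turns.add(new Turn("user", summary));
        }
    }

    // Returns the turns to replay, in order, when a new session starts.
    List<Turn> turns() {
        return List.copyOf(turns);
    }

    public static void main(String[] args) {
        ContextHistory history =
                new ContextHistory(2, all -> "Summary of " + all.size() + " earlier turns");
        history.add("user", "Tell a short story");
        history.add("model", "Once upon a time...");
        history.add("user", "Make it shorter");
        for (Turn t : history.turns()) {
            System.out.println(t.role + ": " + t.text);
        }
    }
}
```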

Handle interruptions

Firebase AI Logic does not yet support handling interruptions. Check back soon.

Use function calling (tools)

You can define tools, like available functions, to use with the Live API just like you can with the standard content generation methods. This section describes some nuances when using the Live API with function calling. For a complete description and examples of function calling, see the function calling guide.

From a single prompt, the model can generate multiple function calls and the code necessary to chain their outputs. This code executes in a sandbox environment, generating subsequent BidiGenerateContentToolCall messages. The execution pauses until the results of each function call are available, which ensures sequential processing.

Additionally, using the Live API with function calling is particularly powerful because the model can request follow-up or clarifying information from the user. For example, if the model doesn't have enough information to provide a parameter value to a function it wants to call, then the model can ask the user to provide more or clarifying information.

The client should respond with BidiGenerateContentToolResponse.
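
To make the request and response pairing concrete, here is a rough client-side sketch that handles a batch of function calls in order and produces one response per call. The Call and Response classes only stand in for the contents of BidiGenerateContentToolCall and BidiGenerateContentToolResponse; their fields, the ToolDispatcher class, and the error payload are assumptions made for illustration, not SDK types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher; the types below are stand-ins, not SDK classes.
final class ToolDispatcher {
    // Stand-in for one function call requested by the model.
    static final class Call {
        final String id;
        final String name;
        final Map<String, Object> args;

        Call(String id, String name, Map<String, Object> args) {
            this.id = id;
            this.name = name;
            this.args = args;
        }
    }

    // Stand-in for the result the client sends back for one call.
    static final class Response {
        final String id;
        final String name;
        final Object result;

        Response(String id, String name, Object result) {
            this.id = id;
            this.name = name;
            this.result = result;
        }
    }

    private final Map<String, Function<Map<String, Object>, Object>> handlers =
            new LinkedHashMap<>();

    void register(String name, Function<Map<String, Object>, Object> handler) {
        handlers.put(name, handler);
    }

    // Runs the calls one at a time, in order, and returns a response for each.
    List<Response> handle(List<Call> calls) {
        List<Response> responses = new ArrayList<>();
        for (Call call : calls) {
            Function<Map<String, Object>, Object> handler = handlers.get(call.name);
            Object result = (handler == null)
                    ? Map.of("error", "unknown function: " + call.name)
                    : handler.apply(call.args);
            responses.add(new Response(call.id, call.name, result));
        }
        return responses;
    }

    public static void main(String[] args) {
        ToolDispatcher dispatcher = new ToolDispatcher();
        dispatcher.register("add", a -> (Integer) a.get("x") + (Integer) a.get("y"));
        List<Response> out = dispatcher.handle(
                List.of(new Call("call-1", "add", Map.of("x", 2, "y", 3))));
        System.out.println(out.get(0).name + " -> " + out.get(0).result);
    }
}
```

Keeping each response tied to the id of its call is the design choice that matters here: it lets the results be matched back to the requests that produced them.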

Limitations and requirements

Keep in mind the following limitations and requirements of the Live API.

Transcription

Firebase AI Logic does not yet support transcriptions. Check back soon.

Languages

Input languages: See the full list of supported input languages for Gemini models.
Output languages: See the full list of available output languages in Chirp 3: HD voices.

Audio formats

The Live API supports the following audio formats:

Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian
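
If your recorder produces floating-point samples in the range -1.0 to 1.0, they need to be converted to this raw format before sending. The following is a minimal sketch of that conversion using only the Java standard library. PcmEncoder and its methods are hypothetical names, and the byte-rate calculation assumes single-channel (mono) audio, which the format description above does not state.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: converts float samples to raw 16-bit little-endian PCM.
final class PcmEncoder {
    static final int INPUT_SAMPLE_RATE_HZ = 16_000;
    static final int BYTES_PER_SAMPLE = 2; // 16 bits

    static byte[] toPcm16LittleEndian(float[] samples) {
        ByteBuffer buffer = ByteBuffer
                .allocate(samples.length * BYTES_PER_SAMPLE)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float sample : samples) {
            // Clamp to [-1, 1] so out-of-range input cannot overflow a short.
            float clamped = Math.max(-1.0f, Math.min(1.0f, sample));
            buffer.putShort((short) Math.round(clamped * Short.MAX_VALUE));
        }
        return buffer.array();
    }

    // Bytes needed for one second of audio, assuming a single channel.
    static int bytesPerSecond(int sampleRateHz) {
        return sampleRateHz * BYTES_PER_SAMPLE;
    }

    public static void main(String[] args) {
        byte[] pcm = toPcm16LittleEndian(new float[] {0.0f, 1.0f, -1.0f});
        System.out.println(pcm.length + " bytes encoded");
        System.out.println(bytesPerSecond(INPUT_SAMPLE_RATE_HZ) + " bytes per second of input");
    }
}
```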

Rate limits

The following rate limits apply:

10 concurrent sessions per Firebase project
4 million tokens per minute

Session length

The default length for a session is 30 minutes. When the session duration exceeds the limit, the connection is terminated.

The model is also limited by the context size. Sending large chunks of input may result in earlier session termination.
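
Because the connection is closed once the limit is reached, a client may want to track how much session time remains and start a new session shortly before the cutoff. The sketch below shows one possible way to do that bookkeeping; SessionClock, its methods, and the safety margin are hypothetical, and it cannot account for sessions that end early because of context size.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: tracks elapsed session time against the session limit.
final class SessionClock {
    // Default session length stated in the limits above.
    static final Duration DEFAULT_SESSION_LENGTH = Duration.ofMinutes(30);

    private final Instant startedAt;
    private final Duration limit;

    SessionClock(Instant startedAt, Duration limit) {
        this.startedAt = startedAt;
        this.limit = limit;
    }

    // Time left before the limit, never negative.
    Duration remaining(Instant now) {
        Duration left = limit.minus(Duration.between(startedAt, now));
        return left.isNegative() ? Duration.ZERO : left;
    }

    // True once the remaining time is within the given safety margin.
    boolean shouldReconnect(Instant now, Duration safetyMargin) {
        return remaining(now).compareTo(safetyMargin) <= 0;
    }

    public static void main(String[] args) {
        SessionClock clock = new SessionClock(Instant.now(), DEFAULT_SESSION_LENGTH);
        Instant later = Instant.now().plus(Duration.ofMinutes(29));
        System.out.println("Reconnect soon: "
                + clock.shouldReconnect(later, Duration.ofMinutes(2)));
    }
}
```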

Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.

Token counting

You can't use the CountTokens API with the Live API.
