Bidirectional streaming using the Gemini Live API

Preview: Using the Firebase AI Logic SDKs with the Gemini Live API is a feature that's in Preview, which means that it isn't subject to any SLA or deprecation policy and could change in backwards-incompatible ways.

This initial release supports only Android, Flutter, and Unity apps when using the Vertex AI Gemini API. Support for Apple platform and web apps, as well as support for the Gemini Developer API on all platforms, is coming soon!

The Gemini Live API enables low-latency bidirectional text and voice interactions with Gemini. Using the Live API, you can give end users natural, human-like voice conversations, with the ability to interrupt the model's responses using text or voice commands. The model can process text and audio input (video coming soon!), and it can provide text and audio output.

You can prototype with prompts and the Live API in Vertex AI Studio.

The Live API is a stateful API that creates a WebSocket connection to establish a session between the client and the Gemini server. For details, see the Live API reference documentation.

Before you begin

Only available when using the Vertex AI Gemini API as your API provider.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for the Vertex AI Gemini API, and create a LiveModel instance.

Models that support this capability

The Live API is supported only by gemini-2.0-flash-live-preview-04-09 (not gemini-2.0-flash).

Also note that gemini-2.0-flash-live-preview-04-09 is supported only in the us-central1 location.

Use the standard features of the Live API

This section describes how to use the standard features of the Live API, specifically to stream various types of inputs and outputs:

Send text and receive text
Send audio and receive audio
Send audio and receive text
Send text and receive audio

Generate streamed text from streamed text input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can send streamed text input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon!

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

var outputText = ""
session.receive().collect {
    if (it.status == Status.TURN_COMPLETE) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    // Skip messages that carry no text
    outputText += it.text ?: ""
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for web apps, but check back soon!

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  config: LiveGenerationConfig(responseModalities: [ResponseModality.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
   // Process the received message 
}

Unity

using System.Threading.Tasks;
using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}

Generate streamed audio from streamed audio input

You can send streamed audio input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift

The Live API is not yet supported for Apple platform apps, but check back soon!

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// This is the recommended way.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for web apps, but check back soon!

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with audio
  config: LiveGenerationConfig(responseModalities: [ResponseModality.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();
// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
   // Process the received message 
}

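Unity

using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendAudioReceiveAudio() {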
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Generate streamed text from streamed audio input

You can send streamed audio input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon!

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT 
   }
)

val session = model.connect()

// Provide audio input (`audioData` is assumed to be a ByteArray of raw 16-bit PCM audio)
val audioContent = content("user") { inlineData(audioData, "audio/pcm") }

session.send(audioContent)

var outputText = ""
session.receive().collect {
    if (it.status == Status.TURN_COMPLETE) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    // Skip messages that carry no text
    outputText += it.text ?: ""
}

// Output received from the server.
println(outputText)

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Send audio data (`audioData` is assumed to be a byte[] of raw 16-bit PCM audio)
        session.send(new Content.Builder().addInlineData(audioData, "audio/pcm").build());

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for web apps, but check back soon!

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';
import 'dart:async';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

Future<Stream<String>> audioToText() async {
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with text
    config: LiveGenerationConfig(responseModalities: [ResponseModality.text]),
  );

  _session = await model.connect();

  final audioRecordStream = _audioRecorder.startRecordingStream();
  // Map the Uint8List stream to an InlineDataPart stream
  final mediaChunkStream = audioRecordStream.map((data) {
    return InlineDataPart('audio/pcm', data);
  });

  await _session.startMediaStream(mediaChunkStream);

  final responseStream = _session.receive();

  return responseStream.asyncMap((response) async {
    if (response.parts.isNotEmpty && response.parts.first.text != null) {
      return response.parts.first.text!;
    } else {
      throw Exception('Text response not found.');
    }
  });
}

Future<void> main() async {
  try {
    final textStream = await audioToText();

    await for (final text in textStream) {
      print('Received text: $text');
      // Handle the text response
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using System.Collections;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendAudioReceiveText() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }

  StopCoroutine(recordingCoroutine);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Generate streamed audio from streamed text input

You can send streamed text input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.

Learn how to configure and customize the response voice (later on this page).

Swift

The Live API is not yet supported for Apple platform apps, but check back soon!

Kotlin

// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1")).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"

session.send(text)

session.receive().collect {
    if (it.status == Status.TURN_COMPLETE) {
        // Optional: stop receiving if you don't need to send more requests.
        session.stopReceiving()
    }
    // Handle 16-bit PCM audio data at 24kHz
    playAudio(it.data)
}

Java

ExecutorService executor = Executors.newFixedThreadPool(1);
// Initialize the Vertex AI Gemini API backend service
// Set the location to `us-central1` (the flash-live model is only supported in that location)
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1")).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture =  model.connect();
class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }
    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle 16-bit PCM audio data at 24kHz
        liveContentResponse.getData();
    }
    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }
    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}
Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);
        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Web

The Live API is not yet supported for web apps, but check back soon!

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/widgets.dart';
import 'firebase_options.dart';
import 'dart:async';
import 'dart:typed_data';

late LiveModelSession _session;

Future<Stream<Uint8List>> textToAudio(String textPrompt) async {
  WidgetsFlutterBinding.ensureInitialized();

  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );

  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  final model = FirebaseAI.vertexAI(location: 'us-central1').liveModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    // Configure the model to respond with audio
    config: LiveGenerationConfig(responseModalities: [ResponseModality.audio]),
  );

  _session = await model.connect();

  final prompt = Content.text(textPrompt);

  await _session.send(input: prompt);

  return _session.receive().asyncMap((response) async {
    if (response is LiveServerContent && response.modelTurn?.parts != null) {
       for (final part in response.modelTurn!.parts) {
         if (part is InlineDataPart) {
           return part.bytes;
         }
       }
    }
    throw Exception('Audio data not found');
  });
}

Future<void> main() async {
  try {
    final audioStream = await textToAudio('Convert this text to audio.');

    await for (final audioData in audioStream) {
      // Process the audio data (e.g., play it using an audio player package)
      print('Received audio data: ${audioData.length} bytes');
      // Example using flutter_sound (replace with your chosen package):
      // await _flutterSoundPlayer.startPlayer(fromDataBuffer: audioData);
    }
  } catch (e) {
    print('Error: $e');
  }
}

Unity

using System.Collections.Generic;
using System.Threading.Tasks;
using Firebase;
using Firebase.AI;
using UnityEngine;

async Task SendTextReceiveAudio() {
  // Initialize the Vertex AI Gemini API backend service
  // Set the location to `us-central1` (the flash-live model is only supported in that location)
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI(location: "us-central1")).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("Convert this text to audio.");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Start receiving the response
  await ReceiveAudio(session);
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession session) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock(audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}

Build more engaging and interactive experiences

This section describes how to create and manage more engaging or interactive features of the Live API.

Change the response voice

The Live API uses Chirp 3 to support synthesized speech responses. When using Firebase AI Logic, you can have the model respond with audio in a variety of HD voices and languages. For a full list and demos of what each voice sounds like, see Chirp 3: HD voices.

To specify a voice, set the voice name within the speechConfig object as part of the model configuration. If you don't specify a voice, the default is Puck.

Swift

The Live API is not yet supported for Apple platform apps, but check back soon!

Kotlin

// ...

val model = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voices.FENRIR)
    }
)

// ...

Java

// ...

LiveGenerativeModel model = FirebaseAI.getInstance(GenerativeBackend.vertexAI()).liveModel(
    "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    new LiveGenerationConfig.Builder()
        .setResponseModalities(ResponseModality.AUDIO)
        .setSpeechConfig(new SpeechConfig(Voices.FENRIR))
        .build()
);

// ...

Web

The Live API is not yet supported for web apps, but check back soon!

Dart

// ...

final model = FirebaseAI.vertexAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  config: LiveGenerationConfig(
    responseModalities: [ResponseModality.audio],
    speechConfig: SpeechConfig(voiceName: 'Fenrir'),
  ),
);

// ...

Unity

var model = FirebaseAI.GetInstance(FirebaseAI.Backend.VertexAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("Fenrir"))
);

For the best results when prompting and requiring the model to respond in a non-English language, include the following as part of your system instructions:

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
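
In the template above, LANGUAGE is a placeholder for the target language. As a minimal sketch, the substitution could look like the following. The class `LanguageInstruction` and its method are hypothetical helpers for illustration and are not part of any Firebase SDK; writing the language name in upper case is a stylistic assumption that mirrors the template.

```java
import java.util.Locale;

/** Hypothetical helper (not an SDK class) that fills in the LANGUAGE placeholder. */
final class LanguageInstruction {
    // Template taken from the guidance above; LANGUAGE is the placeholder.
    private static final String TEMPLATE =
            "RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.";

    /** Returns the system-instruction sentence for the given language name. */
    static String forLanguage(String language) {
        if (language == null || language.trim().isEmpty()) {
            throw new IllegalArgumentException("language must not be blank");
        }
        // Assumption: upper-case the name so it matches the style of the template.
        String name = language.trim().toUpperCase(Locale.ROOT);
        return TEMPLATE.replace("LANGUAGE", name);
    }
}
```

The resulting string would then be included in the system instructions that you pass when configuring the model.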

Maintain context across sessions and requests

You can use a chat structure to maintain context across sessions and requests. Note that this only works for text input and text output.

This approach is best for short contexts; you can send turn-by-turn interactions to represent the exact sequence of events. For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
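
The choice between replaying turns and sending one summary can be sketched as follows. This is an illustrative outline only: `ContextReplay`, `Turn`, the character budget, and the `summarizer` callback are all assumptions introduced here, not SDK types, and how the summary text is produced is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper (not an SDK class) that decides what context to resend. */
final class ContextReplay {
    /** One earlier turn: who spoke ("user" or "model") and what was said. */
    static final class Turn {
        final String role;
        final String text;

        Turn(String role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    /**
     * Returns the messages to send at the start of a new session.
     * Short histories are replayed turn by turn, in their original order;
     * longer ones are collapsed into a single summary message.
     */
    static List<Turn> messagesToSend(List<Turn> history, int maxChars,
                                     Function<List<Turn>, String> summarizer) {
        int totalChars = 0;
        for (Turn turn : history) {
            totalChars += turn.text.length();
        }
        if (totalChars <= maxChars) {
            return new ArrayList<>(history);
        }
        // Assumption: the summary is sent as a single user message.
        List<Turn> single = new ArrayList<>();
        single.add(new Turn("user", summarizer.apply(history)));
        return single;
    }
}
```

A character count is only a rough stand-in for context size; a token-based budget would be closer to what the model actually enforces.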

Handle interruptions

Firebase AI Logic does not yet support handling interruptions. Check back soon!

Use function calling (tools)

You can define tools, like available functions, to use with the Live API just like you can with the standard content generation methods. This section describes some nuances when using the Live API with function calling. For a complete description and examples for function calling, see the function calling guide.

From a single prompt, the model can generate multiple function calls and the code necessary to chain their outputs. This code executes in a sandbox environment, generating subsequent BidiGenerateContentToolCall messages. The execution pauses until the results of each function call are available, which ensures sequential processing.

Additionally, using the Live API with function calling is particularly powerful because the model can request follow-up or clarifying information from the user. For example, if the model doesn't have enough information to provide a parameter value to a function it wants to call, then the model can ask the user to provide more or clarifying information.

The client should respond with BidiGenerateContentToolResponse.
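
The request/response pairing described above can be sketched on the client side as a small dispatcher. Everything in this block is a stand-in: `ToolDispatcher`, `ToolCall`, and `ToolResponse` are hypothetical classes that only model the idea of answering each function call with a response carrying the same id, and they are not the SDK's or the wire protocol's actual types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical dispatcher (not an SDK class) that answers function calls in order. */
final class ToolDispatcher {
    /** Stand-in for one function call requested by the model. */
    static final class ToolCall {
        final String id;
        final String name;
        final Map<String, Object> args;

        ToolCall(String id, String name, Map<String, Object> args) {
            this.id = id;
            this.name = name;
            this.args = args;
        }
    }

    /** Stand-in for the client's answer to one function call. */
    static final class ToolResponse {
        final String id;
        final String name;
        final Object result;

        ToolResponse(String id, String name, Object result) {
            this.id = id;
            this.name = name;
            this.result = result;
        }
    }

    private final Map<String, Function<Map<String, Object>, Object>> handlers = new HashMap<>();

    /** Registers the local implementation of a declared function. */
    void register(String name, Function<Map<String, Object>, Object> handler) {
        handlers.put(name, handler);
    }

    /** Runs each call sequentially and returns one response per call, ids preserved. */
    List<ToolResponse> handle(List<ToolCall> calls) {
        List<ToolResponse> responses = new ArrayList<>();
        for (ToolCall call : calls) {
            Function<Map<String, Object>, Object> handler = handlers.get(call.name);
            Object result = (handler == null)
                    ? Collections.singletonMap("error", "unknown function: " + call.name)
                    : handler.apply(call.args);
            responses.add(new ToolResponse(call.id, call.name, result));
        }
        return responses;
    }
}
```

Returning an error payload for an unknown function, rather than throwing, is a design choice in this sketch so that every call still receives a response.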

Limitations and requirements

Keep in mind the following limitations and requirements of the Live API.

Transcription

Firebase AI Logic does not yet support transcriptions. Check back soon!

Languages

Input languages: See the full list of supported input languages for Gemini models
Output languages: See the full list of available output languages in Chirp 3: HD voices

Audio formats

The Live API supports the following audio formats:

Input audio format: Raw 16-bit PCM audio at 16kHz little-endian
Output audio format: Raw 16-bit PCM audio at 24kHz little-endian
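
If your recorder produces floating-point samples, they need to be converted to the raw format listed above before sending. The following is a minimal sketch of that conversion; the class `Pcm16` is a hypothetical helper, and it assumes mono audio with samples in the range -1.0 to 1.0, neither of which is stated in the format list.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not an SDK class) for raw 16-bit little-endian PCM. */
final class Pcm16 {
    static final int INPUT_SAMPLE_RATE_HZ = 16_000;   // input format listed above
    static final int OUTPUT_SAMPLE_RATE_HZ = 24_000;  // output format listed above
    static final int BYTES_PER_SAMPLE = 2;            // 16 bits per sample

    /** Converts float samples in [-1, 1] to raw 16-bit little-endian PCM bytes. */
    static byte[] encode(float[] samples) {
        ByteBuffer buffer = ByteBuffer.allocate(samples.length * BYTES_PER_SAMPLE)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float sample : samples) {
            // Clamp first so out-of-range input cannot overflow the short.
            float clamped = Math.max(-1f, Math.min(1f, sample));
            buffer.putShort((short) Math.round(clamped * Short.MAX_VALUE));
        }
        return buffer.array();
    }

    /** Number of bytes occupied by mono audio of the given duration. */
    static int bytesForMillis(int sampleRateHz, int millis) {
        return (int) ((long) sampleRateHz * millis / 1000) * BYTES_PER_SAMPLE;
    }
}
```

Under these assumptions, 100 ms of input audio at 16kHz is 1,600 samples, or 3,200 bytes, which can help when choosing a chunk size for streaming.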

Rate limits

The following rate limits apply:

10 concurrent sessions per Firebase project
4M tokens per minute

Session length

The default length for a session is 30 minutes. When the session duration exceeds the limit, the connection is terminated.
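
Because the connection is closed once the limit is reached, a client may want to track how much session time remains and reconnect shortly before the cutoff. The following is one possible sketch; `SessionClock` and the reconnect margin are assumptions introduced for illustration, and only the 30-minute default comes from the text above.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not an SDK class) that tracks the remaining session time. */
final class SessionClock {
    // Default session length stated above.
    static final Duration DEFAULT_SESSION_LENGTH = Duration.ofMinutes(30);

    private final Instant startedAt;
    private final Duration reconnectMargin;

    SessionClock(Instant startedAt, Duration reconnectMargin) {
        this.startedAt = startedAt;
        this.reconnectMargin = reconnectMargin;
    }

    /** Time left before the default limit, never negative. */
    Duration remaining(Instant now) {
        Duration left = DEFAULT_SESSION_LENGTH.minus(Duration.between(startedAt, now));
        return left.isNegative() ? Duration.ZERO : left;
    }

    /** True once the remaining time is within the chosen safety margin. */
    boolean shouldReconnect(Instant now) {
        return remaining(now).compareTo(reconnectMargin) <= 0;
    }
}
```

Passing the current time in as a parameter, instead of reading the system clock inside the class, keeps the logic easy to test.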

The model is limited by the context size. Sending large chunks of input may result in earlier session termination.

Voice activity detection (VAD)

The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.

Token counting

You cannot use the CountTokens API with the Live API.
