Monitor costs, usage, and other metrics

Monitoring the costs, usage, and other metrics of your AI features is an important part of running a production app. You need to know what normal usage patterns look like for your app and make sure you're staying within thresholds that matter to you.

This page describes some recommended options to monitor your costs, usage, and other metrics in both the Firebase console and the Google Cloud console.

Monitor costs

In the Usage and Billing dashboard of the Firebase console, you can view your project's costs for calling the Vertex AI Gemini API and the Gemini Developer API (when you're on the Blaze pricing plan).

The costs displayed on the dashboard are not necessarily specific to calls made with the Firebase AI Logic client SDKs. They cover any calls to those "Gemini APIs", whether made with the Firebase AI Logic client SDKs, the Google GenAI server SDKs, Genkit, the Firebase Extensions for the Gemini API, REST calls, one of the AI Studios, or other API clients.

Learn more about pricing for the products associated with your use of Firebase AI Logic.

Set up alerting

To avoid surprise bills, make sure that you set up budget alerts when you're on the Blaze pricing plan.

Note that budget alerts are not budget caps. An alert notifies you when you're approaching or have surpassed your configured threshold so that you can take action in your app or project. It does not stop usage or charges.
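To make the difference between an alert and a cap concrete, here is a minimal sketch in Python. The function name `triggered_thresholds` and the threshold fractions are hypothetical and only model the behavior described above. The sketch does not call any Google Cloud billing API.

```python
def triggered_thresholds(budget, spend, thresholds=(0.5, 0.9, 1.0)):
    """Return the threshold fractions of `budget` that `spend` has reached.

    Hypothetical model: an alert only notifies; it never blocks spending.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in sorted(thresholds) if spend >= t * budget]


# Spend can exceed the budget: every alert fires, but nothing is capped.
print(triggered_thresholds(budget=100.0, spend=120.0))
# Below the 90% threshold, only the 50% alert has fired.
print(triggered_thresholds(budget=100.0, spend=60.0))
```

In this model, spend of 120 against a budget of 100 is still a valid state, which is why you need to act on the alert yourself.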

Observe usage of your AI features in the Firebase console

Only available when using the Vertex AI Gemini API as your API provider.

You can enable AI monitoring in the Firebase AI Logic page of the Firebase console so that you can observe various app-level metrics and usage to gain comprehensive visibility into your requests from the Firebase AI Logic client SDKs. These dashboards are more in-depth than the basic token counts you get from a call to the Count Tokens API.

Key capabilities of AI monitoring in the Firebase console include:

Viewing quantitative metrics like request volume, latency, errors, and per-modality token usage for each of your apps.
Inspecting traces to see your requests' attributes, inputs, and outputs, which can help with debugging and quality improvement.
Slicing data by dimensions like request status, minimum latency, model name, and more.

All of these features are built using Google Cloud Observability Suite (see detailed product information below).

Enable AI monitoring

Here are the ways that you can enable AI monitoring in the Firebase console:

When you go through the initial guided setup workflow from the Firebase AI Logic page
At any time in the Firebase AI Logic Settings tab

Requirements for enabling and using AI monitoring:

You must be a project Owner, Editor, or Firebase Vertex AI Admin.
Your Firebase project must be on the pay-as-you-go Blaze pricing plan (see detailed product information below).
You must be using the Vertex AI Gemini API as your API provider (support for the Gemini Developer API is coming soon!).
Your app must use at minimum these Firebase library versions:
iOS+: v11.13.0+ | Android: v16.0.0+ (BoM: v33.14.0+) | Web: v11.8.0+ | Flutter: v2.0.0+ (BoM: v3.11.0+) | Unity: v12.9.0+
Your app must have opt-in data collection enabled (this is enabled by default).
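As a rough illustration of the version requirement, the following Python sketch compares an app's Firebase library version against the minimums listed above. The helper names (`parse_version`, `meets_minimum`) and the platform keys are hypothetical. The version numbers are the library versions from the list, not the BoM versions.

```python
# Minimum library versions copied from the requirements list above.
MIN_VERSIONS = {
    "ios": "11.13.0",
    "android": "16.0.0",
    "web": "11.8.0",
    "flutter": "2.0.0",
    "unity": "12.9.0",
}


def parse_version(version):
    """Turn '11.13.0' into (11, 13, 0) so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(platform, version):
    """Return True if `version` is at least the minimum for `platform`."""
    return parse_version(version) >= parse_version(MIN_VERSIONS[platform])


print(meets_minimum("web", "11.10.0"))  # 11.10.0 is newer than 11.8.0
print(meets_minimum("ios", "11.9.0"))   # 11.9.0 is older than 11.13.0
```

Comparing the parts as integers matters here: a plain string comparison would rank "11.9.0" above "11.13.0".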

After your app meets these requirements and you enable AI monitoring in the console, you don't need to do anything else in your app or the console to start seeing data populate the dashboards in the Firebase AI Logic AI monitoring tab. There might be a slight delay (sometimes up to 5 minutes) before telemetry from a request is available in the Firebase console.

Advanced usage

This section describes the sampling rate configuration, as well as different options for viewing and working with your data.

Sampling rate

If you're making a large number of requests, we recommend taking advantage of the sampling rate configuration. The sampling rate indicates the proportion of requests for which trace details are actually collected.

In the Firebase AI Logic Settings tab of the Firebase console, you can configure the sampling rate for your project to a value from 1% to 100%, where 100% means AI monitoring collects traces from all of your traffic. The default is 100%. Collecting fewer traces reduces your costs, but it also reduces the number of traces you can inspect. Note that regardless of your sampling rate, the graphs shown in the monitoring dashboard always reflect the true volume of traffic.
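As a back-of-the-envelope aid for picking a rate, this Python sketch estimates how many traces you would expect to be collected for a given request volume. The function name `expected_traces` is hypothetical, and the estimate assumes each request is equally likely to be sampled, which the description above does not spell out.

```python
def expected_traces(total_requests, sampling_rate_percent):
    """Estimate the number of traces collected at a given sampling rate.

    Assumes uniform sampling across requests; actual counts may vary.
    """
    if not 1 <= sampling_rate_percent <= 100:
        raise ValueError("sampling rate must be between 1 and 100")
    return total_requests * sampling_rate_percent / 100


# At a 25% sampling rate, about a quarter of 200,000 requests yield traces.
print(expected_traces(200_000, 25))  # 50000.0
```

The dashboard graphs would still report all 200,000 requests in this scenario. Only the number of inspectable traces changes.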

Additional options outside of the Firebase console

In addition to the AI monitoring available in the Firebase console, consider these options:

Explore Vertex AI Model Garden.
The Model Garden dashboards provide further trend insights into latency and throughput for the managed models, complementing your insights from AI monitoring in the Firebase console.
Explore and use your data with Google Cloud Observability Suite
Since telemetry data for AI monitoring is stored in Google Cloud Observability Suite associated with your project, you can explore your data in its dashboards, including Trace Explorer and Logs Explorer, which are linked to when you inspect your individual traces in the Firebase console. You can also use your data to build custom dashboards, set up alerts, and more.

Detailed information about products used for AI monitoring

AI monitoring stores your telemetry data in various products available in Google Cloud Observability Suite, including Cloud Monitoring, Cloud Trace, and Cloud Logging.

Cloud Monitoring: Stores metrics, including number of requests, success rate, and request latency.
Cloud Trace: Stores traces for each of your requests so that you can view details individually, instead of in aggregate. A trace is typically associated with logs so that you can examine the content and timing of each interaction.
Cloud Logging: Captures input, output, and configuration metadata to provide rich detail about each part of your AI request.

Since your telemetry data is stored in these products, you can specify your retention and access settings directly within each product (learn more in the documentation for Cloud Monitoring, Cloud Trace, and Cloud Logging). Note that the actual prompts and generated output from each sampled request are stored along with the metrics.

Pricing

Google Cloud Observability Suite is a paid service, so your Firebase project must be on the pay-as-you-go Blaze pricing plan. However, each product has generous no-cost tiers. Learn more in the Google Cloud Observability Suite pricing documentation.

View project-level API metrics in the Google Cloud console

For each API, you can view project-level metrics, like usage, in the Google Cloud console.

Note that the Google Cloud console pages described in this section do not include information like request and response content and token count. To monitor that type of information, consider using AI monitoring in the Firebase console (see previous section).

In the Google Cloud console, go to the Metrics page of the API you want to view:
- Vertex AI API: View the usage associated with any request to the Vertex AI Gemini API.
  - Includes requests using Firebase AI Logic client SDKs, the Google GenAI server SDKs, Genkit, the Firebase Extensions for the Gemini API, REST API, Vertex AI Studio, etc.
- Gemini Developer API: View the usage associated with any request to the Gemini Developer API.
  - Includes requests using the Firebase AI Logic client SDKs, the Google GenAI server SDKs, Genkit, the Firebase Extensions for the Gemini API, REST API, Google AI Studio, etc.
  - The display name of this API in the Google Cloud console is "Generative Language API".
If you find yourself on an "overview page" for the API, click Manage, and then click the Metrics tab.

Note: In the Google Cloud console, you can also view project-level metrics for the Firebase AI Logic API, which is the proxy service for Firebase AI Logic. These metrics will reflect requests only from the Firebase AI Logic client SDKs.
Use the drop-down menus to view the metrics of interest, like traffic by response code, errors by API method, overall latency, and latency by API method.
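If you prefer to chart the same project-level traffic in Cloud Monitoring (for example, in Metrics Explorer or a custom dashboard), a filter along these lines may be a starting point. This Python sketch only builds the filter string. The service names in `SERVICES` and the helper `request_count_filter` are assumptions for illustration, not taken from this page, so verify them against the APIs enabled in your project.

```python
# Assumed service names for the APIs discussed above; verify in your project.
SERVICES = {
    "Vertex AI API": "aiplatform.googleapis.com",
    "Gemini Developer API": "generativelanguage.googleapis.com",
    "Firebase AI Logic API": "firebasevertexai.googleapis.com",
}


def request_count_filter(api_name):
    """Build a Cloud Monitoring filter for an API's request count metric."""
    service = SERVICES[api_name]
    return (
        'metric.type = "serviceruntime.googleapis.com/api/request_count" '
        'AND resource.type = "consumed_api" '
        f'AND resource.labels.service = "{service}"'
    )


print(request_count_filter("Gemini Developer API"))
```

Like the console pages described in this section, a filter of this kind covers requests from every client of the API, not only the Firebase AI Logic client SDKs.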