Monitor your costs, usage, and other metrics of the Gemini API

Monitoring the costs, usage, and other metrics of the Gemini API is an important part of running a production app. You need to know what normal usage patterns look like for your app and confirm that you're staying within the thresholds that matter to you.

Monitor costs

In the Usage and Billing dashboard of the Firebase console, you can view your project's costs for calling the Vertex AI Gemini API.

The costs displayed on the dashboard are not necessarily specific to calls made through the Vertex AI in Firebase client SDKs. They cover any call to the Vertex AI Gemini API, whether it comes from the Vertex AI in Firebase client SDKs, the Vertex AI server SDKs, Firebase Genkit, the Firebase Extensions for the Gemini API, REST calls, Vertex AI Studio, or other API clients.

You can also get an estimate of the token size and billable characters of your requests using the Count Tokens API. Learn more about token limits per model and pricing per model.
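As a rough illustration of how a token count turns into a cost estimate, the following Python sketch multiplies token counts by per-million-token prices. The function name `estimate_cost_usd` and the prices are hypothetical placeholders, not real Gemini pricing; in practice the token counts would come from the Count Tokens API and the rates from the pricing page for your model.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_million, output_price_per_million):
    """Estimate the cost of one request from its token counts.

    Prices are expressed per one million tokens. The caller supplies them;
    nothing here reflects actual Gemini pricing.
    """
    if min(input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Placeholder token counts and prices, for illustration only.
estimate = estimate_cost_usd(
    input_tokens=2_000,
    output_tokens=500,
    input_price_per_million=1.00,
    output_price_per_million=4.00,
)
print(f"Estimated cost: ${estimate:.4f}")
```

If your model is billed by characters rather than tokens, the same arithmetic applies with billable characters and a per-character rate.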

Set up alerting

To avoid surprise bills, make sure that you set up budget alerts.

Note that budget alerts aren't budget caps. An alert notifies you when you're approaching or have surpassed your configured threshold so that you can take action in your app or project. It does not stop usage or billing by itself.
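The difference between an alert and a cap can be sketched in a few lines of Python. This is only an illustration of the concept, not how Cloud Billing implements budgets; the function name `crossed_thresholds` and the 50%, 90%, and 100% thresholds are assumptions chosen for the example.

```python
def crossed_thresholds(spend, budget, thresholds=(0.5, 0.9, 1.0)):
    """Return the alert thresholds (fractions of the budget) that spend has reached.

    Hypothetical helper: it only reports which alerts would fire.
    It never blocks or limits spend, just as a budget alert is not a cap.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in sorted(thresholds) if spend >= t * budget]


# With a budget of 100, spend keeps growing even after every alert has fired.
for spend in (40, 95, 130):
    print(spend, crossed_thresholds(spend, budget=100))
```

At a spend of 130 every threshold has been reached, yet nothing in the logic prevents further spend; acting on the alert is up to you.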

Monitor usage and other metrics

You can view your project's metrics for each API, like its usage, in the Google Cloud console.

  1. In the Google Cloud console, go to each API page: Vertex AI API and Vertex AI in Firebase API.

    • Vertex AI API page: This is the usage associated with any call to the Vertex AI Gemini API, whether it comes from the Vertex AI in Firebase client SDKs, the Vertex AI server SDKs, Firebase Genkit, the Firebase Extensions for the Gemini API, REST calls, Vertex AI Studio, or other API clients.

    • Vertex AI in Firebase API page: This is the usage specifically for calls coming from the Vertex AI in Firebase SDKs.

  2. Click Manage.

  3. Click the Metrics tab.

  4. Use the drop-down menus to view the metrics of interest, like traffic by response code, errors by API method, overall latency, and latency by API method.
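To make the metric names above concrete, here is a minimal Python sketch that computes traffic by response code, errors by API method, and mean latency by API method from a handful of request records. The records, method labels, and field layout are made up for illustration; they are not an export format of the Google Cloud console.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample records: (api_method, response_code, latency_ms).
records = [
    ("GenerateContent", 200, 820),
    ("GenerateContent", 200, 760),
    ("GenerateContent", 429, 40),
    ("CountTokens", 200, 95),
    ("CountTokens", 500, 120),
]

# Traffic by response code: number of requests per status code.
traffic_by_code = Counter(code for _, code, _ in records)

# Errors by API method: requests with a 4xx or 5xx response, per method.
errors_by_method = Counter(method for method, code, _ in records if code >= 400)

# Latency by API method: mean latency in milliseconds, per method.
latencies = defaultdict(list)
for method, _, latency_ms in records:
    latencies[method].append(latency_ms)
mean_latency_by_method = {m: mean(v) for m, v in latencies.items()}

print(dict(traffic_by_code))
print(dict(errors_by_method))
print(mean_latency_by_method)
```

Knowing what these numbers look like on a normal day makes it easier to spot an unusual spike in errors or latency on the Metrics tab.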