Generative AI models (like the Gemini models) break down data into units called tokens for processing. Each Gemini model has a maximum number of tokens that it can handle in a prompt and response.
This page shows you how to get an estimate of token count and the number of billable characters for a request.
What information is provided in the count?
Note the following about counting tokens and billable characters:
Counting the total tokens
This count is helpful for making sure that your requests stay within the model's context window.
The token count reflects the size of all files (for example, images) that are provided as part of the request input. It doesn't count the number of images or the number of seconds in a video.
For all Gemini models, a token is equivalent to about 4 characters, so 100 tokens correspond to roughly 60-80 English words.
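The heuristic above can be turned into a quick client-side estimate. This is only a sketch: `estimate_tokens` is a hypothetical helper based on the ~4-characters-per-token rule, not part of any SDK, and the authoritative count always comes from the CountTokens API.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule.

    This is a local approximation only; use the CountTokens API for
    the exact count that the model will see.
    """
    return max(1, round(len(text) / 4))

prompt = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(prompt))  # 44 characters -> about 11 tokens
```

An estimate like this is useful for pre-flight checks (for example, trimming a prompt before sending it), but billing and context-window enforcement use the server-side count.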
Counting the total billable characters
This count is helpful for understanding and controlling your costs, because for Vertex AI the number of characters is part of the pricing calculation.
The billable character count reflects the number of characters in the text that's provided as part of the request input.
For Vertex AI, tokens are not part of the pricing calculation. Learn more about token limits per model and pricing per model.
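Because Vertex AI prices text input by characters rather than tokens, a cost estimate can be derived directly from the billable character count. The sketch below uses a made-up placeholder rate and assumes every input character is billable; check the Vertex AI pricing page for the actual per-model rates.

```python
# Hypothetical rate in USD per 1,000 input characters -- a placeholder,
# not a real Vertex AI price. Look up the current rate for your model.
PRICE_PER_1K_INPUT_CHARS = 0.000125

def estimate_input_cost(text: str,
                        price_per_1k: float = PRICE_PER_1K_INPUT_CHARS) -> float:
    """Estimate the input cost of a request from its character count.

    Assumption: every character in `text` is billable. The exact
    billable count is returned by the CountTokens API.
    """
    billable_chars = len(text)
    return billable_chars / 1000 * price_per_1k

cost = estimate_input_cost("Tell me a story about a magic backpack.")
print(f"Estimated input cost: ${cost:.8f}")
```

In practice you would take the billable character count from the API response rather than `len(text)`, since the server-side count is what billing actually uses.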
Pricing and quota for counting tokens and billable characters
There's no charge for using the CountTokens API. The maximum quota for the CountTokens API is 3,000 requests per minute.