# Authentication

> Your FlexInference API key and your BYOK provider keys.

You work with two kinds of key: your FlexInference API key, and one or more **provider keys** (OpenAI, Gemini, Anthropic) that you bring yourself. Keep them straight:

| Key                       | Looks like    | Who it identifies     | Where it lives                                                    |
| ------------------------- | ------------- | --------------------- | ----------------------------------------------------------------- |
| **FlexInference API key** | `flex_live_…` | You, to FlexInference | Sent as the `Authorization` header on every request               |
| **OpenAI key (BYOK)**     | `sk-…`        | You, to OpenAI        | Stored encrypted in the dashboard; never sent by you on a request |
| **Gemini key (BYOK)**     | `AIza…`       | You, to Google        | Stored encrypted in the dashboard; never sent by you on a request |
| **Anthropic key (BYOK)**  | `sk-ant-…`    | You, to Anthropic     | Stored encrypted in the dashboard; never sent by you on a request |

## FlexInference API key

Create a key in the [dashboard](https://www.flexinference.com/dashboard). We show it once when you create it, so store it like a password. Send it as a bearer token:

```bash
curl https://api.flexinference.com/v1/responses \
  -H "Authorization: Bearer flex_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-5-nano", "input": "ping", "start_within": "default" }'
```

The example also sets `start_within`, the field FlexInference requires on every request. It says how long you are willing to wait for the request to start. This request uses `default`, so it runs on the standard tier. When you send a duration instead, FlexInference tries a cheaper flex tier first and falls back to your standard tier if flex cannot start in time. See [Deadline routing](/deadline-routing) for the full list of values.

A malformed or unknown key returns `401 invalid_api_key`. Every FlexInference key carries a checksum, so we catch an obvious typo and reject it before it ever reaches a provider. If you hit this error, check that you copied the whole key, that it starts with `flex_live_`, and that you have not revoked it in the dashboard. Create a fresh key if you are not sure it is still good.
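A client can catch the most common copy-paste mistakes before it sends anything. The sketch below checks only what this page states (the `flex_live_` prefix) plus stray whitespace. It does not try to verify the checksum, whose format is not documented here, and the function name `check_flex_key` is hypothetical.

```bash
# Hypothetical helper: cheap local sanity check for a FlexInference API key.
# It does NOT validate the checksum; only the server can confirm a key is valid
# and has not been revoked.
check_flex_key() {
  case "$1" in
    "")
      echo "key is empty" >&2
      return 1 ;;
    *[[:space:]]*)
      echo "key contains whitespace (copy-paste error?)" >&2
      return 1 ;;
    flex_live_?*)
      return 0 ;;
    *)
      echo "key must start with flex_live_ followed by the key body" >&2
      return 1 ;;
  esac
}
```

A key that passes this check can still come back as `401 invalid_api_key`, for example when it has been revoked, so treat the check as a typo filter rather than as validation.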

<Tip>
  Treat `flex_live_` keys as secrets. Don't commit them or ship them in client-side code. If a key leaks, revoke it in the dashboard and create a new one.
</Tip>
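One common way to keep the key out of source control is to read it from the environment at call time. The sketch below assumes the key is exported as `FLEXINFERENCE_API_KEY`; that variable name and the `flex_ping` function are illustrative, not official names.

```bash
# Assumption: the key is exported as FLEXINFERENCE_API_KEY by your shell
# session or secret manager. The variable name is illustrative, not official.
flex_ping() {
  if [ -z "${FLEXINFERENCE_API_KEY:-}" ]; then
    echo "FLEXINFERENCE_API_KEY is not set" >&2
    return 1
  fi
  # Same request as the curl example earlier on this page, with the key read
  # from the environment instead of being written into the command.
  curl https://api.flexinference.com/v1/responses \
    -H "Authorization: Bearer ${FLEXINFERENCE_API_KEY}" \
    -H "Content-Type: application/json" \
    -d '{ "model": "gpt-5-nano", "input": "ping", "start_within": "default" }'
}
```

Failing early when the variable is missing avoids sending a request with an empty bearer token.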

## Your provider keys (BYOK)

FlexInference does not resell inference. You add your own provider key once in the dashboard, and we use it to call that provider for you. You can add an OpenAI key (`sk-...`) for GPT models, a Gemini key (`AIza...`) for Gemini models, an Anthropic key (`sk-ant-...`) for Claude models, or any mix of the three. The provider bills your account directly.

We pick the key from the model name:

- A `gemini-*` model uses your Gemini key.
- A `claude-*` model uses your Anthropic key.
- Everything else uses your OpenAI key.
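The routing rule above can be mirrored client-side, for example to check that the right provider key is configured before you call a model. This is a minimal sketch of the rule as described on this page, not FlexInference's own code; the function name `provider_for_model` is hypothetical.

```bash
# Illustrative mirror of the model-name routing rule described above.
# Prints which provider key a request for the given model would use.
provider_for_model() {
  case "$1" in
    gemini-*) echo "gemini" ;;
    claude-*) echo "anthropic" ;;
    *)        echo "openai" ;;   # everything else uses the OpenAI key
  esac
}
```

For example, `provider_for_model gpt-5-nano` prints `openai`, because the name matches neither the `gemini-*` nor the `claude-*` pattern.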

Each key is encrypted at rest with a per-organization binding, so it can only ever be used for your organization. We never log it, and we never return it.

If the key a request needs isn't configured, you get `400 no_byok_key` (OpenAI), `400 no_gemini_key` (Gemini), or `400 no_anthropic_key` (Anthropic). Add the key in the dashboard and try again.
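A client can turn these codes into an actionable message. The sketch below takes the error code as a plain string, because the shape of the error response body is not documented on this page; how you extract the code from the response is left to the caller. The function name `missing_key_hint` is hypothetical.

```bash
# Hypothetical helper: map a missing-key error code to the key you need to add.
# Returns non-zero for any code that is not one of the three missing-key errors.
missing_key_hint() {
  case "$1" in
    no_byok_key)      echo "Add an OpenAI key in the dashboard" ;;
    no_gemini_key)    echo "Add a Gemini key in the dashboard" ;;
    no_anthropic_key) echo "Add an Anthropic key in the dashboard" ;;
    *)                return 1 ;;   # not a missing-key error
  esac
}
```

Note that the OpenAI code is `no_byok_key`, not `no_openai_key`, so matching on a `no_<provider>_key` pattern would miss it.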

<Info>
  Because billing runs on your own provider account, your existing rate limits, quotas, and spend controls all still apply. This holds whether the request goes to OpenAI, Gemini, or Anthropic.
</Info>
