Image Generation

aisdk now treats image generation as a first-class model family, separate from language models.

That separation matters because image-generation models return image artifacts, not chat completions.

Core APIs

The main entry points are:

  • generate_image()
  • edit_image()

Both APIs resolve an ImageModelV1 object and return a GenerateImageResult.

Creating an image model

Use a provider’s image_model() constructor:

library(aisdk)

provider <- create_gemini()
model <- provider$image_model("gemini-3.1-flash-image-preview")

OpenAI image models use the same pattern:

library(aisdk)

provider <- create_openai()
model <- provider$image_model("gpt-image-1")

Other supported provider patterns:

create_volcengine()$image_model("doubao-seedream-5-0")
create_xai()$image_model("grok-2-image")
create_stepfun()$image_model("step-1x-medium")
create_openrouter()$image_model("openai/gpt-image-1")
create_aihubmix()$image_model("gpt-image-1")

Provider support matrix

The current aisdk image-model support looks like this:

  • Gemini (gemini-3.1-flash-image-preview): generate_image() and edit_image(); prompt-based edits, mask not yet exposed
  • OpenAI (gpt-image-1): generate_image() and edit_image(); mask supported, local file path or data URI required for edits
  • Volcengine (doubao-seedream-5-0): generate_image() and edit_image(); image-to-image reuses the generation endpoint with image input
  • xAI (grok-2-image): generate_image() and edit_image(); JSON image inputs for the generation and editing workflow
  • Stepfun (step-1x-medium / step-1x-edit): generate_image() and edit_image(); editing currently requires step-1x-edit
  • OpenRouter (openai/gpt-image-1): generate_image() and edit_image(); reuses the OpenAI image-model path through the router
  • AiHubMix (gpt-image-1): generate_image() and edit_image(); reuses the OpenAI image-model path through AiHubMix

In practice, the easiest rule is:

  • use a provider-native image model when one exists
  • use OpenRouter or AiHubMix when you want routing flexibility over OpenAI-style image APIs
  • use provider-specific docs if you need model naming or parameter hints

Text-to-image generation

library(aisdk)

result <- generate_image(
  model = create_gemini()$image_model("gemini-3.1-flash-image-preview"),
  prompt = "A studio product photo of a matte white ceramic mug on linen",
  output_dir = tempdir()
)

result$images[[1]]$path

OpenAI works the same way:

library(aisdk)

result <- generate_image(
  model = create_openai()$image_model("gpt-image-1"),
  prompt = "A minimalist editorial photo of a cobalt blue mug on a white plinth",
  output_dir = tempdir()
)

result$images[[1]]$path

Volcengine example:

library(aisdk)

result <- generate_image(
  model = create_volcengine()$image_model("doubao-seedream-5-0"),
  prompt = "A sleek editorial photo of a cobalt blue ceramic mug",
  output_dir = tempdir()
)

result$images[[1]]$path

xAI example:

library(aisdk)

result <- generate_image(
  model = create_xai()$image_model("grok-2-image"),
  prompt = "A premium product shot of a blue mug on white marble",
  output_dir = tempdir()
)

result$images[[1]]$path

Stepfun example:

library(aisdk)

result <- generate_image(
  model = create_stepfun()$image_model("step-1x-medium"),
  prompt = "A ceramic mug photographed in soft studio light",
  output_dir = tempdir()
)

result$images[[1]]$path

Generated images are materialized to disk automatically. By default, files are written to tempdir(), which keeps package examples and scripts from cluttering the user's working directory.
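If you want the files somewhere more permanent, point output_dir at a directory of your choosing. A minimal sketch, using the same generate_image() signature shown above (the directory name here is just an example):

```r
library(aisdk)

# Write generated images to a project-local directory instead of tempdir().
out_dir <- file.path("output", "images")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

result <- generate_image(
  model = create_openai()$image_model("gpt-image-1"),
  prompt = "A matte white ceramic mug on linen",
  output_dir = out_dir
)

result$images[[1]]$path
```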

Image editing

Gemini image models can also perform image-to-image edits.

library(aisdk)

result <- edit_image(
  model = create_gemini()$image_model("gemini-3.1-flash-image-preview"),
  image = "inst/extdata/product.png",
  prompt = "Change the mug color from white to cobalt blue.",
  output_dir = tempdir()
)

result$images[[1]]$path

In the current aisdk implementation:

  • image is required
  • prompt is optional but strongly recommended
  • mask is not yet implemented for Gemini

OpenAI image models also support edit_image():

library(aisdk)

result <- edit_image(
  model = create_openai()$image_model("gpt-image-1"),
  image = "inst/extdata/product.png",
  prompt = "Change the mug color from white to cobalt blue.",
  output_dir = tempdir()
)

result$images[[1]]$path

In the current aisdk implementation, OpenAI image editing expects a local file path or data URI for the source image. mask is also supported when you want explicit localized edits.
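When you want a localized edit, you can pass a mask. A minimal sketch, assuming mask takes a local PNG path whose marked region indicates the area to change (the mask file name below is hypothetical; check the edit_image() documentation for the exact mask semantics):

```r
library(aisdk)

# Masked edit: only the region indicated by the mask PNG is changed.
result <- edit_image(
  model = create_openai()$image_model("gpt-image-1"),
  image = "inst/extdata/product.png",
  mask = "inst/extdata/product_mask.png",  # hypothetical mask file
  prompt = "Replace the masked area with a cobalt blue glaze.",
  output_dir = tempdir()
)

result$images[[1]]$path
```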

Volcengine image editing example:

library(aisdk)

result <- edit_image(
  model = create_volcengine()$image_model("doubao-seedream-5-0"),
  image = "inst/extdata/product.png",
  prompt = "Turn this product photo into a watercolor illustration.",
  output_dir = tempdir()
)

result$images[[1]]$path

xAI image editing example:

library(aisdk)

result <- edit_image(
  model = create_xai()$image_model("grok-2-image"),
  image = "https://example.com/source.png",
  prompt = "Make this image look like a watercolor painting.",
  output_dir = tempdir()
)

result$images[[1]]$path

Stepfun image editing example:

library(aisdk)

result <- edit_image(
  model = create_stepfun()$image_model("step-1x-edit"),
  image = "inst/extdata/product.png",
  prompt = "Change the mug color to cobalt blue.",
  output_dir = tempdir()
)

result$images[[1]]$path

Current provider-specific caveats:

  • Gemini: no mask support in aisdk yet
  • OpenAI: source image for editing must be a local file path or data URI
  • Volcengine: mask not yet exposed
  • xAI: image editing currently uses JSON image inputs
  • Stepfun: editing currently requires step-1x-edit

Returned image artifacts

Each item in result$images is a list with fields such as:

  • path
  • media_type
  • bytes

This makes it easy to either keep images on disk or continue processing them in memory.
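For instance, you can inspect the media type and re-save the raw bytes with base R. This sketch assumes bytes is a raw vector, which is the natural R representation for binary data:

```r
img <- result$images[[1]]

img$media_type  # e.g. "image/png"

# Re-save the in-memory bytes alongside the original file.
# writeBin() is base R; img$bytes is assumed to be a raw vector.
copy_path <- file.path(tempdir(), paste0("copy-", basename(img$path)))
writeBin(img$bytes, copy_path)
```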

Choosing a provider

Use this rough decision guide:

  • Gemini when you want a clean provider-native path for both image understanding and image generation in the same SDK
  • OpenAI when you want the most standard OpenAI image workflow, including explicit edit and mask support
  • Volcengine when you want Doubao Seedream models hosted on Ark
  • xAI when you want Grok image APIs
  • Stepfun when you want Stepfun’s dedicated image generation and edit models
  • OpenRouter / AiHubMix when you want OpenAI-style image APIs behind a routing layer

Relationship to multimodal language models

Use the right API for the job:

  • use analyze_image() or generate_text() when you want text output from an image
  • use generate_image() or edit_image() when you want image output

This split keeps the SDK architecture clean and makes it easier to add new providers with image-generation support later.
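As a sketch, the two directions look like this. The analyze_image() argument names below are assumptions, not confirmed by this page; only generate_image() is documented above:

```r
library(aisdk)

# Image in, text out (argument names assumed, not confirmed):
description <- analyze_image(
  image = "inst/extdata/product.png",
  prompt = "Describe this product photo in one sentence."
)

# Text in, image out (documented above):
result <- generate_image(
  model = create_gemini()$image_model("gemini-3.1-flash-image-preview"),
  prompt = "A studio product photo of a matte white ceramic mug on linen",
  output_dir = tempdir()
)
```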

Provider roadmap

All currently supported providers (Gemini, OpenAI, Volcengine, xAI, Stepfun, OpenRouter, and AiHubMix) expose dedicated image_model() workflows in aisdk.

The same abstraction is designed to support future provider-specific image models without overloading LanguageModelV1.