gpt-image-2 for Instagram Post Images: One Anchor Image, Every Format
Turn one gpt-image-2 reference image into every Instagram surface — 1:1 and 4:5 feed, 9:16 Stories/Reels cover, and a consistent carousel — via outpainting, reference anchoring, and mask edits.
One product shot has to become at least four things before it's usable on Instagram: a 1:1 square for the main feed, a 4:5 portrait with more screen space, a 9:16 cover for Stories or Reels, and — for a carousel — five to eight more slides that need to look like the same shoot. Most workflows handle this badly: re-shooting per surface lets the product's look drift, and cropping into other ratios loses the framing that made the original work.
This piece runs on three gpt-image-2 capabilities: reference-image fidelity (a real photo stays recognizable across every generation that reuses it), outpainting (extend a composition into a new ratio instead of re-rendering it), and native mask editing (change one region — a background, a stray object — while everything else stays pixel-stable). Together: generate once, adapt everywhere.
A note on scope first: this is a technique piece, not a posting-strategy piece. It assumes you already have a shot — a product photo, a founder headshot, a mascot — or already know what a carousel should say, and just need the pixels made. See our overview of Instagram for marketing teams for the platform context.
TL;DR
- Four surfaces, four specs: feed square 1:1 = 1080×1080; feed portrait 4:5 = 1080×1350 (the higher-real-estate default); Stories/Reels cover 9:16 = 1080×1920; carousel = up to 10 images/videos per post, one ratio throughout.
- One anchor, not four generations. Generate the hero once, then outpaint it into every other ratio — never crop-and-hope, never re-prompt from scratch.
- Carousel consistency comes from reference-image anchoring, not from repeating a prompt — resupply the same anchor image as the reference for every slide.
- Reach for a mask edit before a full regeneration. Background swaps and object removal preserve what was already right; a fresh generation risks a different-looking product on every re-roll.
- Not the model-choice piece. For jobs gpt-image-2 doesn't own — 4+ people in a scene, on-image Japanese/CJK captions, high-volume low-cost batching — see the multi-model strategy post for when Nano Banana 2 is the better tool.
How This Differs From Our Carousel Best-Practices Guide
Worth separating cleanly, since the two pieces sit next to each other in search. Our Instagram carousel best-practices guide is a content-strategy piece — what each slide should say: hook, value, and CTA roles, why saves matter, captioning, posting frequency. It doesn't touch how the images get made.
This piece is the inverse: no posting-cadence advice, no hashtag guidance, no "engage with your audience" — that ground is already covered across this site's Instagram marketing and industry guides. What follows is a single gpt-image-2-specific production workflow: reference-anchored generation, outpainting across every Instagram ratio, carousel consistency via reference anchoring, and mask edits for cleanup. Read the carousel piece to decide what each slide says; read this one to generate it once you know.
Instagram's Four Formats, One Source Image
These are Meta's published dimensions rather than figures we measured ourselves. Instagram shares the same specs with Facebook, since both run on the same infrastructure.
| Surface | Aspect ratio | Pixel size | Notes |
|---|---|---|---|
| Feed square | 1:1 | 1080×1080 | Safe default, crops cleanly in most feed contexts |
| Feed portrait | 4:5 | 1080×1350 | Maximum vertical real estate — the default worth building toward |
| Stories / Reels cover | 9:16 | 1080×1920 | Full-screen vertical; see Recipe 3 for safe-zone guidance |
| Carousel | 1:1 or 4:5 | 1080×1080 or 1080×1350 | Up to 10 images/videos per post; every slide shares one ratio |
A carousel can't mix 1:1 and 4:5 — pick one ratio for the whole post. The 10-slide figure is a technical ceiling, not a recommendation; how many to actually use belongs to the carousel best-practices guide.
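As a minimal sketch, the table can be kept as a small lookup so every export is checked against a known surface before upload. The names `INSTAGRAM_SPECS`, `aspect_ratio`, and `matching_surface` are illustrative and not part of any Instagram or OpenAI API.

```python
from fractions import Fraction
from typing import Optional

# Pixel sizes copied from the table above: (width, height).
INSTAGRAM_SPECS = {
    "feed_square": (1080, 1080),
    "feed_portrait": (1080, 1350),
    "story_reels_cover": (1080, 1920),
}


def aspect_ratio(width: int, height: int) -> str:
    """Reduce a pixel size to its simplest width:height ratio."""
    ratio = Fraction(width, height)
    return f"{ratio.numerator}:{ratio.denominator}"


def matching_surface(width: int, height: int) -> Optional[str]:
    """Return the surface whose spec equals this export size, or None."""
    for name, size in INSTAGRAM_SPECS.items():
        if size == (width, height):
            return name
    return None
```

A file that comes back as `None` from `matching_surface` is a cue to re-export at a native size rather than let Instagram crop or rescale it.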
Why Start From One Reference Image
gpt-image-2 processes an uploaded reference image at high fidelity automatically — a documented behavior, not a new claim — which is why a product photo, a founder's face, or a brand mascot stays recognizable across every generation that reuses it. Labels keep their color, proportions stay put, the shape you uploaded is the shape you get back.
That matters on Instagram because the feed is a grid: viewers scroll past a dozen of your tiles in seconds, and a product that looks subtly different in every post reads as generic even when each image is well-made individually. Nano Banana 2 leans on prompt-described subject preservation instead, and drifts more on exactly this job — the published Job 5 finding in our head-to-head comparison covers it, and the multi-model strategy piece covers how Adpicto routes around it.
Recipe 1: Generate the Anchor (1:1, 1080×1080)
Everything else in this piece works from one anchor image. Get this right and every later recipe inherits it.
Running example: a small home-fragrance brand — a hand-poured soy candle in an amber glass jar with a kraft-paper label, warm terracotta-and-cream palette.
Workflow:
- Upload a real photo of the candle as the reference image — not a mood board, the actual product.
- Write the prompt so the reference is explicitly the subject to preserve, not just inspiration.
- Generate at 1:1, 1080×1080.
- Save this as the anchor. Every later recipe reuses it — don't regenerate a "similar" version later.
[Reference-anchored subject] on [surface], soft [light direction] light, [1–2 supporting props], [brand palette], editorial product-photography style, subject filling the central ~70% of frame, 1:1 aspect ratio (1080×1080). No text, no logos. Must match the reference exactly on [color / label / shape].
Filled example:
The amber glass candle jar with the kraft-paper label shown in the reference image, on a pale oak wood surface, soft window light from the left, a small dried eucalyptus sprig and a folded linen napkin as supporting props, warm terracotta and cream palette, editorial product-photography style, subject filling the central ~70% of frame, 1:1 aspect ratio (1080×1080). No text, no logos. Must match the reference exactly on jar color, label design, and cap shape.
Service businesses without a physical product can anchor on a founder headshot or brand mascot instead — the anchor just has to be the one visual element every later generation must match.
Why this works: naming the upload as the anchor, not a mood reference, is what keeps the product pixel-recognizable through every recipe that follows.
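If you fill this template often, a small helper can catch an empty slot before a generation is spent on it. This is a sketch only: the slot names and `build_anchor_prompt` are hypothetical, and the template string is this article's wording rather than anything the model requires.

```python
from string import Formatter

# The Recipe 1 template with named slots in place of the bracketed ones.
ANCHOR_TEMPLATE = (
    "{subject}, on {surface}, {light}, {props}, {palette} palette, "
    "editorial product-photography style, "
    "subject filling the central ~70% of frame, "
    "1:1 aspect ratio (1080×1080). No text, no logos. "
    "Must match the reference exactly on {must_match}."
)

# Slot names are read from the template so the two cannot drift apart.
REQUIRED_SLOTS = [
    field for _, field, _, _ in Formatter().parse(ANCHOR_TEMPLATE) if field
]


def build_anchor_prompt(**slots: str) -> str:
    """Fill the template, refusing to run with a missing or blank slot."""
    missing = [name for name in REQUIRED_SLOTS if not slots.get(name, "").strip()]
    if missing:
        raise ValueError(f"unfilled slots: {', '.join(missing)}")
    return ANCHOR_TEMPLATE.format(**slots)


candle_prompt = build_anchor_prompt(
    subject="The amber glass candle jar with the kraft-paper label "
            "shown in the reference image",
    surface="a pale oak wood surface",
    light="soft window light from the left",
    props="a small dried eucalyptus sprig and a folded linen napkin "
          "as supporting props",
    palette="warm terracotta and cream",
    must_match="jar color, label design, and cap shape",
)
```

The helper only builds the text. Uploading the reference photo alongside it is still what does the anchoring.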
Recipe 2: Outpaint the Anchor to 4:5 for Feed Portrait
For the general mechanics of outpainting — canvas prep, the transparent-border technique — see our gpt-image-2 image editing workflow. Here's the Instagram-specific version.
Paste the 1080×1080 anchor centered inside a 1080×1350 transparent canvas — 270px of new canvas total, split roughly 135px top and bottom, or weighted to one side if your composition has a natural direction.
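The padding arithmetic can be sketched as a helper that returns how much transparent canvas goes on each side of the anchor. `outpaint_padding` and its `top_share` parameter are illustrative names, and the helper only computes offsets. Compositing the actual canvas is left to whatever image tool you already use.

```python
from typing import Tuple


def outpaint_padding(
    anchor: Tuple[int, int],
    canvas: Tuple[int, int],
    top_share: float = 0.5,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) transparent padding in pixels.

    top_share is the fraction of the new vertical canvas placed above the
    anchor: 0.5 centers it, 0.0 puts all new canvas below it.
    """
    anchor_w, anchor_h = anchor
    canvas_w, canvas_h = canvas
    if canvas_w < anchor_w or canvas_h < anchor_h:
        raise ValueError("canvas must be at least as large as the anchor")
    if not 0.0 <= top_share <= 1.0:
        raise ValueError("top_share must be between 0 and 1")

    extra_w = canvas_w - anchor_w
    extra_h = canvas_h - anchor_h
    left = extra_w // 2  # horizontal padding is always centered here
    top = round(extra_h * top_share)
    return left, top, extra_w - left, extra_h - top
```

With the defaults, `outpaint_padding((1080, 1080), (1080, 1350))` gives 135px above and below the anchor. Passing `top_share=0.0` puts all 270px below it, which suits a composition that naturally extends downward.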
Template:
Extend the [surface] and [light direction] light [upward / downward / both] to fill a 1080×1350 4:5 canvas, continuing the same [palette] and editorial product-photography style. Add only environmental continuation — no new props, no second focal point. Keep [subject] in its current position and scale.
Filled example:
Extend the pale oak wood surface and soft left-window light upward and downward to fill a 1080×1350 4:5 canvas, continuing the same warm terracotta and cream palette and editorial product-photography style. Add only environmental continuation (a little more linen surface and a soft shadow gradient), with no new props and no second focal point. Keep the candle jar in its current position and scale.
This replaces two defaults: generating a second image from scratch (loses the Recipe 1 reference lock the moment you re-prompt) and center-cropping a 4:5 out of something else (loses the ~70%-of-frame framing that made the anchor work). Outpainting keeps both.
Recipe 3: Outpaint to 9:16 for a Stories/Reels Cover, With Instagram's Own Safe Zones
Instagram's 9:16 isn't one safe zone — it's two, and they don't line up.
Stories overlays the top of the frame with the profile picture, username, and timestamp, and overlays the bottom with the reply field and sticker tray. Reels stacks a different set of overlays on the same 9:16 canvas: a slimmer top strip, but a heavier bottom stack — caption, audio title, follow button — plus a right-side icon rail for like, comment, share, and save that Stories doesn't carry at all.
Neither Meta nor Instagram publishes an exact pixel or percentage spec for these overlays on organic posts, and they shift slightly by device and app version anyway — treat any specific percentage as a guess dressed up as data. The reliable move is qualitative: keep your subject, and any on-image text, well clear of all four edges and weighted toward the vertical center. If the same cover needs to survive both Stories and Reels, lean toward extra margin rather than less — Reels' bottom stack reaches noticeably higher up the frame than Stories' does.
Template:
Extend the scene vertically upward and downward from the existing composition to fill a 1080×1920 9:16 canvas, continuing the same [surface], [light direction] light, and [palette]. Keep [subject] centered in the frame with generous empty margin at the very top and bottom — no critical detail near either edge. Add only environmental continuation in the new top and bottom areas — no new subjects, no text.
Filled example:
Extend the scene vertically upward and downward from the existing composition to fill a 1080×1920 9:16 canvas, continuing the same pale oak wood surface, soft left-window light, and warm terracotta-and-cream palette. Keep the candle jar centered in the frame with generous empty margin at the top and bottom — no critical detail near either edge. Extend upward into a softly out-of-focus shelf and window edge, downward into more linen surface. No new subjects, no text.
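Because no overlay spec is published, the "well clear of all four edges" rule can only be checked against margins you choose yourself. The sketch below estimates where the subject lands when the 1080×1080 anchor is centered on the 9:16 canvas, then tests it against caller-supplied pixel margins. The margin values in any call are placeholders to tune by eye on a real device, not Instagram figures, and `Box`, `centered_subject_box`, and `clears_margins` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def centered_subject_box(
    anchor: Tuple[int, int],
    canvas: Tuple[int, int],
    fill: float = 0.70,
) -> Box:
    """Estimate the subject's box when the anchor sits centered on the canvas.

    fill is the share of the anchor the subject occupies (Recipe 1 asks
    for roughly the central 70%).
    """
    anchor_w, anchor_h = anchor
    canvas_w, canvas_h = canvas
    off_x = (canvas_w - anchor_w) // 2
    off_y = (canvas_h - anchor_h) // 2
    inset_x = round(anchor_w * (1 - fill) / 2)
    inset_y = round(anchor_h * (1 - fill) / 2)
    return Box(
        off_x + inset_x,
        off_y + inset_y,
        off_x + anchor_w - inset_x,
        off_y + anchor_h - inset_y,
    )


def clears_margins(
    box: Box, canvas: Tuple[int, int], margin_x: int, margin_y: int
) -> bool:
    """True if the box stays at least margin_x / margin_y px from every edge."""
    canvas_w, canvas_h = canvas
    return (
        box.left >= margin_x
        and box.right <= canvas_w - margin_x
        and box.top >= margin_y
        and box.bottom <= canvas_h - margin_y
    )
```

For the candle example, the centered anchor puts the jar roughly between y=582 and y=1338 on the 1920px canvas, which leaves several hundred pixels of pure environment above and below it.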
Recipe 4: Carousel Slide Consistency via Reference-Image Anchoring
This recipe is the core of the piece: making a multi-slide carousel look like one designed asset instead of ten separate renders.
- Generate or pick the anchor slide — usually the cover from Recipe 1, or its 4:5 outpaint from Recipe 2.
- Resupply that same anchor image as the reference for every subsequent slide. Hold the prompt skeleton constant; vary only the subject or prop slot — a different scent, SKU, or step number.
- Respect Instagram's real ceiling: up to 10 slides, every slide at the anchor's ratio — no switching between 1:1 and 4:5 mid-carousel.
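The three rules above can be sketched as a planning step that runs before any generation: one anchor path, one ratio, one prompt skeleton, and only the variant slot changing. `plan_carousel` and the dictionary keys are hypothetical, `MAX_SLIDES` is simply the ceiling this article cites, and the output is a plan to feed into your own generation call rather than an API request.

```python
from typing import Dict, List

MAX_SLIDES = 10  # carousel ceiling as stated in this article
CAROUSEL_SIZES = {"1:1": (1080, 1080), "4:5": (1080, 1350)}


def plan_carousel(
    skeleton: str,
    variants: List[str],
    ratio: str,
    anchor_path: str,
) -> List[Dict[str, object]]:
    """Build one job per slide, all sharing the same anchor and size.

    skeleton must contain a single {variant} slot, which is the only
    thing allowed to change from slide to slide.
    """
    if ratio not in CAROUSEL_SIZES:
        raise ValueError(f"carousel ratio must be one of {sorted(CAROUSEL_SIZES)}")
    if not 1 <= len(variants) <= MAX_SLIDES:
        raise ValueError(f"a carousel needs 1 to {MAX_SLIDES} slides")

    size = CAROUSEL_SIZES[ratio]
    return [
        {
            "slide": index,
            "reference_image": anchor_path,  # same anchor on every slide
            "size": size,                    # same ratio on every slide
            "prompt": skeleton.format(variant=variant),
        }
        for index, variant in enumerate(variants, start=1)
    ]
```

For the candle brand, `variants` might be `["fig and cedar", "vanilla amber", "sea salt driftwood"]` with a skeleton that names the surface, light, and palette once and leaves only the scent label open.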
What this recipe deliberately skips — how many slides to use, what each says, how to sequence hook, value, and CTA — belongs entirely to the carousel best-practices guide. Read it first to decide the slide roles; come back here once you know what each slide needs to be.
Recipe 5: Mask Edit — Background Swap for Grid Cohesion
Scope note: this recipe and the next describe OpenAI Images API mask-editing mechanics — Adpicto's own in-app edit flow is prompt-based, not a mask-upload UI (the linked editing workflow guide covers that distinction).
The Instagram-specific case: a product photo often gets shot somewhere off-brand, such as a cafe counter, a stockroom shelf, or a car trunk on delivery day. The product's fine; the background clashes with the rest of your grid. A mask edit pulls it onto your brand backdrop without a reshoot, and without the risk a full regeneration carries of quietly changing the product itself. For the mechanics, see our image editing workflow and inpainting guide.
Mask: the candle jar and label opaque; everything else transparent.
Template:
Replace the masked background with [target backdrop description]. Match the existing subject's lighting — [light direction] light, [warm/cool] temperature. Add a soft contact shadow beneath [subject] consistent with that light direction. Do not alter [subject] itself.
Filled example:
Replace the masked background with a pale oak wood surface and soft cream linen backdrop, matching this candle brand's usual feed look. Match the existing subject's lighting — soft window light from the left, warm temperature. Add a soft contact shadow beneath the jar consistent with that light direction. Do not alter the jar, label, or cap.
Why this works: matching the light direction and asking for a contact shadow are what stop the swapped background from reading as a cutout.
Recipe 6: Mask Edit — Object/Element Removal Before It Hits the Grid
Same tool, different job: a stray charging cable at the frame's edge, a discontinued product's box still on a shelf, a passerby's reflection in a window. None of these are worth a reshoot or a full regeneration.
Mask: the unwanted element transparent; everything else, including the product, opaque.
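The two mask conventions, Recipe 5's and this one's, are easy to invert by accident, so here is a minimal sketch of the alpha channel for each. It assumes the convention both recipes describe (transparent means "edit here", opaque means "keep") and simplifies the region to a rectangle. A real mask would follow the product's outline and be saved as a PNG with an alpha channel. `build_alpha_mask` is a hypothetical helper, not an API call.

```python
from typing import List, Tuple

TRANSPARENT = 0   # alpha 0: the model may repaint this pixel
OPAQUE = 255      # alpha 255: this pixel is kept as-is


def build_alpha_mask(
    width: int,
    height: int,
    box: Tuple[int, int, int, int],
    edit_inside: bool,
) -> List[bytearray]:
    """Return one bytearray of alpha values per row.

    box is (left, top, right, bottom), right and bottom exclusive.
    edit_inside=True  -> box transparent, rest opaque (object removal).
    edit_inside=False -> box opaque, rest transparent (background swap).
    """
    left, top, right, bottom = box
    # Clamp the box to the canvas so slices never run out of bounds.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)

    inside, outside = (TRANSPARENT, OPAQUE) if edit_inside else (OPAQUE, TRANSPARENT)
    rows = []
    for y in range(height):
        row = bytearray([outside]) * width
        if top <= y < bottom and right > left:
            row[left:right] = bytes([inside]) * (right - left)
        rows.append(row)
    return rows
```

Calling it with `edit_inside=True` around the stray cable gives the removal mask, and calling it with `edit_inside=False` around the jar gives the background-swap mask from Recipe 5.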
Template:
Fill the masked area with a natural continuation of the surrounding [surface/texture], matching [light direction] light and existing color and grain. No new objects, no text, no shadow inconsistent with the scene's existing light.
Filled example:
Fill the masked area with a natural continuation of the oak wood counter surface, matching the grain direction, color, and the existing soft window light from the left. No new objects, no text, no shadow inconsistent with the scene's existing light.
Both mask recipes make the same point: fixing the one wrong element preserves everything that was already right — a full regeneration re-rolls the product too, undoing the pixel-recognizability Recipe 1 was built to lock in.
Which Pipeline for Which Instagram Surface
- Starting a new post from a brand or product asset? → Recipe 1.
- Same shot, need a second feed ratio? → Recipe 2 — outpaint, don't crop or re-generate.
- Need a Stories or Reels cover? → Recipe 3 — leave generous margin at top and bottom if the same cover might serve either surface.
- Building a carousel? → Recipe 4 for slide consistency; the carousel best-practices guide for what each slide should say.
- An existing photo is basically right, just one thing off? → Recipe 5 or 6, not a fresh regeneration.
- 4+ people in one scene, on-image Japanese/CJK text, or high-volume low-cost batching? → Not gpt-image-2's strengths. Route to Nano Banana 2 — see the multi-model strategy post and head-to-head comparison.
Common Mistakes
Cropping down to 4:5 or 9:16 instead of outpainting. You lose the composition decisions that made the original work; outpainting adds canvas instead of losing framing.
Treating Stories and Reels safe zones as identical, or the same as TikTok's. The reply-field-and-sticker-tray band on Stories sits differently than the caption-and-audio-title stack on Reels, and neither matches TikTok's own overlay geometry. Design for Instagram's actual overlays.
Regenerating each carousel slide from a fresh prompt. Without reference-anchoring off the cover you get ten technically fine images that don't look like they belong together. Anchor every slide to the same reference image instead.
Reaching for a full regeneration when a mask edit would do. Every re-roll risks a slightly different-looking product. Don't gamble the 90% that's already right to fix the 10% that's wrong.
Masking too tight, with no feather at the product edge. A tight mask leaves the model no room to blend, producing a visible seam — the inpainting guide covers how much feather to leave.
Letting gpt-image-2 render the carousel's value-slide copy. Multi-line text is its weakest spot. Generate the slide clean with negative space reserved and typeset the real copy after — our text and layout prompt recipes cover the wording.
Generating the next post in isolation from the grid it's about to join. A tile can look great alone and still clash with posts already live next to it. Check it against the current grid before shipping — the fix is reusing the anchor reference or a background-match mask edit, not a posting-cadence change.
One Afternoon: From Anchor to a Full Instagram Batch
Here's the candle brand's workflow in a single afternoon. Start with a real photo of the jar and run Recipe 1 for the 1:1 anchor, then outpaint it to 4:5 for feed portrait (Recipe 2) and to 9:16 for a Stories/Reels cover with generous margin clear of both apps' UI overlays (Recipe 3). One generation now covers three surfaces.
Build a five- or six-slide carousel by resupplying the cover as the reference across the remaining slides, swapping only the scent and label per prompt — fig and cedar, vanilla amber, sea salt driftwood — while surface, light, and palette stay locked (Recipe 4).
Separately, pull two real photos shot somewhere imperfect: one on a cluttered stockroom shelf, one with a stray cable in frame. A background-swap edit fixes the first (Recipe 5), an object-removal edit fixes the second (Recipe 6) — both join the grid without a reshoot.
Export everything at native dimensions (1080×1080, 1080×1350, 1080×1920) and the afternoon has produced one feed post in two ratios, one Stories/Reels cover, a five- or six-slide carousel, and two salvaged photos, all recognizably the same product.
Scope note: the 9:16 cover from Recipe 3 is the still image Instagram shows before or during Reels playback, not the Reel's motion content — for the video itself, our Sora 2 for Instagram Reels guide covers that pipeline.
The Short Version
Three moves cover almost everything here. One reference image, adapted rather than reshot, across every Instagram surface. One anchor, reused rather than re-prompted from scratch, across every carousel slide. And a mask, not a full regeneration, whenever only one element or the background is actually wrong.
Where to go next: the carousel best-practices guide for the slide-by-slide content strategy this piece skips; the image editing workflow and inpainting deep-dive for the general mechanics behind outpainting and masks; the multi-model strategy post and head-to-head comparison for when to route away from gpt-image-2 entirely. You already know what you want to post — this is how you generate it once and adapt it everywhere it needs to go.