The Problem
A client e-commerce site I work with started losing image-search traffic in mid-April. Google Search Console's Page Indexing report showed a new error: "Indexed, though blocked by robots.txt" for around 12,000 URLs. All of them were /_next/image?url=... endpoints. Image-search impressions dropped 40% in two weeks before anyone spotted the notification.
The site is a Next.js 16 app on Vercel using the default next/image component. No custom robots.txt had been deployed. The platform-default robots.txt was blocking the very URL that Googlebot needs to fetch optimized images, and the regression had landed silently.
If your Next.js site lives on Vercel and product or content images stopped showing up in Google Image search, check /robots.txt first. There is a good chance the default is silently blocking /_next/image.
Why It Happens
When you use <Image src="..." /> from next/image, the rendered HTML does not point to your source image. It points to a Next.js image optimization endpoint:
<img
srcset="
/_next/image?url=%2Fproduct-1.jpg&w=384&q=75 1x,
/_next/image?url=%2Fproduct-1.jpg&w=750&q=75 2x
"
src="/_next/image?url=%2Fproduct-1.jpg&w=750&q=75"
alt="Product 1"
/>
That /_next/image path is what Googlebot crawls to fetch the actual image bytes. If /robots.txt disallows it, Google logs the URL as discovered, schedules it for fetching, hits the disallow, and marks it "Indexed, though blocked by robots.txt." Worse, because the canonical image URL in the rendered HTML is the blocked endpoint, the source /product-1.jpg never gets indexed on its own either.
The reason this regressed on Vercel specifically: starting around the platform update in February 2026, the default robots.txt served when your app does not define one was tightened. It now includes:
User-agent: *
Disallow: /_next/
Disallow: /api/
The intent was to prevent crawlers from hammering hot paths. The side effect is that /_next/image got swept up under the blanket /_next/ rule. Most teams never noticed because they were not actively tracking image-search impressions.
There are two more triggers that catch teams who already had a custom robots.txt:
- A robots.ts copied from an older starter. Many Next.js starters define a robots.ts that disallows /_next/. If you initialized a new app from one of those templates this year, you inherit the regression on day one and the /_next/image endpoint is blocked from the first deploy.
- A CDN-level robots.txt overriding the app. If Cloudflare or another CDN sits in front of Vercel and serves its own robots.txt at the edge, the app-level file never reaches the crawler. I have seen this on three migrations this year and it is invisible unless you curl your production URL.
The Fix
Step 1: Check what robots.txt is actually being served. Do not look at your repo; look at the response from production:
curl -sI https://example.com/robots.txt
curl -s https://example.com/robots.txt
If you see Disallow: /_next/ with no matching allow, that is the bug. If /robots.txt serves rules you never wrote and you are on Vercel, that is the platform default, and you need to define your own.
Step 2: Ship an explicit app/robots.ts that allows /_next/image. This is the canonical fix for the App Router:
// app/robots.ts
import type { MetadataRoute } from 'next';
export default function robots(): MetadataRoute.Robots {
return {
rules: [
{
userAgent: '*',
allow: ['/', '/_next/image'],
disallow: ['/_next/static/', '/api/', '/admin/'],
},
{
userAgent: 'Googlebot-Image',
allow: '/',
},
],
sitemap: 'https://example.com/sitemap.xml',
};
}
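For reference, this is roughly what Next.js serves at /robots.txt from that file (exact casing and rule ordering can differ slightly between Next.js versions, so curl your deployment to confirm):
User-Agent: *
Allow: /
Allow: /_next/image
Disallow: /_next/static/
Disallow: /api/
Disallow: /admin/
User-Agent: Googlebot-Image
Allow: /
Sitemap: https://example.com/sitemap.xml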
Two non-obvious bits:
- The allow for /_next/image must be listed before any broad /_next/ disallow. Some crawlers (Bing especially) resolve conflicts by the order rules appear rather than strictly by longest-matching prefix, so listing the allow first signals priority. Google's parser handles either order (see the sketch after this list), but other crawlers do not.
- Disallow /_next/static/ specifically, not all of /_next/. Static JS and CSS chunks do not need to be in the index. Image endpoints absolutely do.
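To make the precedence point concrete, here is a toy model of Google's longest-match rule in TypeScript. It is a sketch, not a real parser (Google's actual matcher is open-sourced as google/robotstxt): it ignores wildcards, $ anchors, and user-agent groups, and every name in it is made up for illustration.
// robots-precedence.ts: toy model of longest-match precedence.
// Not a real parser: ignores wildcards, $ anchors, and user-agent groups.
type Rule = { type: 'allow' | 'disallow'; path: string };

function isCrawlAllowed(url: string, rules: Rule[]): boolean {
  const { pathname, search } = new URL(url);
  const path = pathname + search;
  // Only rules whose path is a literal prefix of the URL can apply.
  const matches = rules.filter((r) => path.startsWith(r.path));
  if (matches.length === 0) return true; // no matching rule means crawling is allowed
  // Longest matching rule wins; on a length tie, allow beats disallow.
  matches.sort(
    (a, b) => b.path.length - a.path.length || (a.type === 'allow' ? -1 : 1),
  );
  return matches[0].type === 'allow';
}

// Allow: /_next/image (12 chars) outranks Disallow: /_next/ (7 chars).
console.log(
  isCrawlAllowed('https://example.com/_next/image?url=%2Fproduct-1.jpg&w=750', [
    { type: 'allow', path: '/_next/image' },
    { type: 'disallow', path: '/_next/' },
  ]),
); // true: the optimizer endpoint stays crawlable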
Step 3: Verify with GSC's URL Inspection tool. Take a sample /_next/image?url=... URL straight from your rendered HTML, run it through the Search Console Live Test, and check "Crawl allowed." It should now show "Yes" with the new robots.txt rule highlighted. Google's robots.txt documentation covers the live test flow and the exact directive precedence rules.
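The Live Test itself is UI-only, but if you want to script spot checks, Search Console's URL Inspection API returns the robots.txt verdict from Google's last inspection of a URL. A minimal sketch, assuming an OAuth access token with the webmasters scope sits in a GSC_TOKEN environment variable (the variable name is my placeholder):
// inspect-robots.ts: ask the URL Inspection API for the robots.txt verdict.
const token = process.env.GSC_TOKEN; // OAuth 2.0 access token, placeholder name

async function inspect(inspectionUrl: string, siteUrl: string): Promise<void> {
  const res = await fetch(
    'https://searchconsole.googleapis.com/v1/urlInspection/index:inspect',
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      // siteUrl is the GSC property; domain properties use the sc-domain: prefix
      body: JSON.stringify({ inspectionUrl, siteUrl }),
    },
  );
  if (!res.ok) throw new Error(`Inspection failed: HTTP ${res.status}`);
  const data = await res.json();
  // robotsTxtState comes back as ALLOWED or DISALLOWED once Google has processed the URL
  console.log(data.inspectionResult?.indexStatusResult?.robotsTxtState);
}

inspect(
  'https://example.com/_next/image?url=%2Fproduct-1.jpg&w=750&q=75',
  'https://example.com/',
);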
Step 4: Submit a recrawl request for affected images. GSC's "Validate Fix" button on the Page Indexing report queues the affected URLs for recrawl. For high-priority product pages, also reference the images explicitly in your sitemap so the recrawl prioritizes them:
// app/sitemap.ts
import type { MetadataRoute } from 'next';
import { getAllProducts } from '@/lib/products';
export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
const products = await getAllProducts();
return products.map((p) => ({
url: `https://example.com/products/${p.slug}`,
lastModified: p.updatedAt,
changeFrequency: 'weekly',
priority: 0.8,
images: [`https://example.com${p.image}`],
}));
}
The images field on a sitemap entry tells Google the page is image-relevant. Without it, recovery takes weeks. With it, I have seen full reindex inside seven days on a 5k-product catalog.
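Each generated entry should come out roughly like this, using the sitemap-image extension (slug, date, and image path are placeholders; fetch your deployed /sitemap.xml to see the exact markup your Next.js version emits):
<url>
  <loc>https://example.com/products/blue-widget</loc>
  <lastmod>2026-04-01T00:00:00.000Z</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
  <image:image>
    <image:loc>https://example.com/images/blue-widget.jpg</image:loc>
  </image:image>
</url>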
Step 5: If you are behind a CDN, audit the edge separately. Run curl against your CDN edge and against your origin and compare:
curl -s https://example.com/robots.txt # CDN edge
curl -s https://origin.example.com/robots.txt # origin (skip CDN)
If the two responses differ, the CDN is overriding the app. Most Cloudflare setups serve a static robots.txt at the edge by default — turn that off in the dashboard, or sync the edge file with what app/robots.ts returns. There is no point shipping a perfect robots.ts if the edge replaces it.
Step 6: Monitor it. Add a smoke test to your CI that fetches /robots.txt from the deployed URL and fails if /_next/image is not allowed. It takes ten lines and prevents the same regression from sneaking back in during a refactor.
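A minimal sketch of that check, assuming a Node 18+ CI runner with global fetch and a DEPLOY_URL environment variable pointing at the deployment under test (both names are placeholders for whatever your pipeline provides):
// scripts/check-robots.ts: fail CI if the deployed robots.txt blocks /_next/image.
// Naive line matching, not a full robots.txt parser; good enough as a tripwire.
const base = process.env.DEPLOY_URL ?? 'https://example.com';

async function main(): Promise<void> {
  const res = await fetch(`${base}/robots.txt`);
  if (!res.ok) throw new Error(`robots.txt returned HTTP ${res.status}`);
  const body = await res.text();
  const allowsImage = /^allow:\s*\/_next\/image/im.test(body);
  const blanketDisallow = /^disallow:\s*\/_next\/\s*$/im.test(body);
  if (blanketDisallow && !allowsImage) {
    throw new Error('robots.txt disallows /_next/ without allowing /_next/image');
  }
  console.log('robots.txt OK: /_next/image is crawlable');
}

main().catch((err) => {
  console.error(err instanceof Error ? err.message : String(err));
  process.exit(1);
});
Run it as a post-deploy step (for example, npx tsx scripts/check-robots.ts with DEPLOY_URL set to the preview URL) so a bad robots.ts fails the pipeline before Googlebot ever sees it.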
The Lesson
Image-search indexing on Next.js depends on Googlebot being allowed to hit /_next/image. The Vercel platform default and most older starter templates disallow /_next/ as a blanket rule, which blocks the optimization endpoint and silently kills image-search traffic. The fix is an explicit app/robots.ts that allows /_next/image and only disallows /_next/static/. Then verify with GSC's Live Test and add a CI check so it never regresses.
If your site's GSC indexing dropped after a Next.js upgrade or migration and you need it forensically audited, that is the kind of work I do for clients — see my services. For a related GSC issue I covered, see Crawled, Currently Not Indexed Next.js fix.