How does GPTBot crawl?

OpenAI's GPTBot crawls publicly accessible HTML pages. It honors robots.txt, so it must not be disallowed there, and it works best with static HTML rather than content rendered by JavaScript.

When does ChatGPT know my product?

With web search: 2 to 14 days after Google indexing. With training data: in OpenAI's future training runs.

What do I need to do to be visible in ChatGPT?

Allow GPTBot in robots.txt, add Schema.org Product markup, and provide a static, crawlable URL for each product.

ChatGPT visibility: how GPTBot finds your product

How does ChatGPT search for products?

ChatGPT uses two different paths to find product information:

Training data: Pages that GPTBot crawled before a training run can become part of ChatGPT's base knowledge. That knowledge is only updated by new training runs.
Real-time web search (ChatGPT with Bing): Newer ChatGPT versions can search the web live, and they favor current, indexed pages.

Important

For immediate visibility, real-time search is decisive. For long-term presence, training data matters. Both paths need the same foundation: a crawlable, structured product page.
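As an illustration of what "structured" can mean here, the following minimal sketch builds a Schema.org Product block in JSON-LD and wraps it in a script tag that can be placed in the static HTML of a product page. The function name `product_jsonld_tag`, the example product, and the choice of properties (`name`, `url`, `offers`) are assumptions for this example; a real page would usually carry more properties.

```python
import json


def product_jsonld_tag(name: str, url: str, price: float, currency: str) -> str:
    """Return a <script> tag containing Schema.org Product markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",  # price as a string with two decimals
            "priceCurrency": currency,
        },
    }
    # Escape "</" so a value can never close the surrounding script element.
    payload = json.dumps(data).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


if __name__ == "__main__":
    # Hypothetical product, used only to show the output shape.
    print(product_jsonld_tag(
        "Trail Shoe", "https://example.com/products/trail-shoe", 89.9, "EUR"
    ))
```

Because the tag is generated on the server and embedded in the HTML response, it is available to a crawler without any JavaScript execution.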

GPTBot: OpenAI's crawler

GPTBot is OpenAI's official web crawler. It identifies itself as GPTBot/1.1 in the user agent and follows robots.txt rules.
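To see whether GPTBot is visiting a site, one option is to look for its token in the User-Agent header of server log entries. The sketch below assumes that matching the `GPTBot` token is sufficient for log analysis; the helper name `gptbot_version` and the sample header are illustrative, and since a User-Agent header can be spoofed, this should not be treated as proof of origin.

```python
import re
from typing import Optional

# Matches the "GPTBot" token, optionally followed by "/<version>".
GPTBOT_TOKEN = re.compile(r"\bGPTBot(?:/(\d+(?:\.\d+)*))?", re.IGNORECASE)


def gptbot_version(user_agent: str) -> Optional[str]:
    """Return the GPTBot version found in a User-Agent header.

    Returns "unknown" if the token has no version, and None if the
    header does not mention GPTBot at all.
    """
    match = GPTBOT_TOKEN.search(user_agent)
    if match is None:
        return None
    return match.group(1) or "unknown"


if __name__ == "__main__":
    # Illustrative header value, not copied from a real log.
    sample = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1"
    print(gptbot_version(sample))
```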

What GPTBot crawls:

Public static HTML
Schema.org markup in JSON-LD format
Pages listed in the sitemap

What GPTBot doesn't crawl:

JavaScript-only rendered content
Pages behind login walls
Pages that disallow GPTBot in robots.txt
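A quick way to approximate what a crawler without JavaScript sees is to parse the raw HTML response and check whether the Product markup is already there. The sketch below uses only the Python standard library. The names `JsonLdExtractor` and `product_names_in_static_html` are hypothetical, and the check is deliberately simple: it only looks at top-level JSON-LD objects whose `@type` is exactly `Product`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # ignore malformed blocks in this sketch


def product_names_in_static_html(html: str) -> list:
    """Return the names of Product entries found in the raw, unrendered HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    names = []
    for item in parser.items:
        nodes = item if isinstance(item, list) else [item]
        for node in nodes:
            if isinstance(node, dict) and node.get("@type") == "Product":
                names.append(node.get("name"))
    return names
```

If the function returns an empty list for a page whose product data only appears after client-side rendering, that page falls into the JavaScript-only case listed above.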

Allow GPTBot in robots.txt

Many sites block GPTBot by accident. Check your robots.txt and make sure no rule disallows GPTBot. Explicit Allow entries for GPTBot and other AI crawlers make the intent unambiguous:

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /
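To verify rules like these before deploying them, the robots.txt text can be evaluated with Python's standard `urllib.robotparser` module. This is a minimal sketch: the helper name `allowed_agents`, the sample rules, and the example URL are assumptions, and the standard-library parser may interpret edge cases differently from a given crawler.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, url: str, agents: list) -> dict:
    """Map each user agent to whether the given robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Hypothetical robots.txt: everything is blocked except for GPTBot.
    sample_rules = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /
"""
    result = allowed_agents(
        sample_rules,
        "https://example.com/products/trail-shoe",
        ["GPTBot", "ClaudeBot"],
    )
    for agent, allowed in result.items():
        print(agent, "allowed" if allowed else "blocked")
```

In this sample a blanket `Disallow: /` for all agents would block GPTBot as well if the dedicated GPTBot group were missing, which is one way accidental blocking can happen.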

Frequently asked questions

How does GPTBot crawl?

GPTBot crawls publicly accessible HTML pages, follows robots.txt (so it must not be disallowed there), and works best with static HTML.

When does ChatGPT know my product?

With real-time search: 2 to 14 days after Google indexing. With training data: during future training runs by OpenAI.

What do I need to do?

Allow GPTBot in robots.txt, add Schema.org, and provide a static crawlable URL per product. Feed-AI handles this automatically.