How does ChatGPT search for products?
ChatGPT uses two different paths to find product information:
- Training data: Content that GPTBot crawled before a training run becomes part of ChatGPT's base knowledge. That knowledge is only refreshed when OpenAI trains a new model.
- Real-time web search (ChatGPT with Bing): Newer ChatGPT versions can search the web live, and they favor current, indexed pages.
For immediate visibility, real-time search is decisive. For long-term presence, training data matters. Both paths need the same foundation: a crawlable, structured product page.
GPTBot: OpenAI's crawler
GPTBot is OpenAI's official web crawler. It identifies itself as GPTBot/1.1 in the user agent and follows robots.txt rules.
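To see whether GPTBot is already visiting a site, the user agent strings in the server logs can be checked for the GPTBot token. The following is a minimal sketch, not an official detection method: the function name gptbot_version and the sample user agent string are illustrative assumptions, and since a user agent can be spoofed, a match only shows what the client claims to be.

```python
import re

# Matches the "GPTBot/<version>" token anywhere in a user agent string.
GPTBOT_TOKEN = re.compile(r"\bGPTBot/(\d+(?:\.\d+)*)")


def gptbot_version(user_agent):
    """Return the claimed GPTBot version (e.g. "1.1"), or None if absent."""
    match = GPTBOT_TOKEN.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample string for illustration only; real log entries may differ.
    sample = (
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
        "compatible; GPTBot/1.1; +https://openai.com/gptbot)"
    )
    print(gptbot_version(sample))
```

Running the function over each user agent field in an access log gives a rough picture of how often the crawler comes by and which pages it requests.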
What GPTBot crawls:
- Public static HTML
- Schema.org markup in JSON-LD format
- Pages listed in the sitemap
What GPTBot does not crawl:
- Content rendered only by JavaScript
- Pages behind login walls
- Pages that disallow GPTBot in robots.txt
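The Schema.org markup in JSON-LD format mentioned in the list above might look like the output of the following sketch. It builds a basic Product object and wraps it in a script tag that can be placed in the static HTML of a product page. The helper names product_jsonld and to_script_tag, the chosen fields, and the shop.example URL are assumptions for illustration; a real product page will usually carry more properties.

```python
import json


def product_jsonld(name, description, url, price, currency, in_stock=True):
    """Build a basic Schema.org Product object as a JSON-LD dictionary."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",  # price as a string with two decimals
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


def to_script_tag(data):
    """Wrap a JSON-LD dictionary in a script tag for static HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical product used only as sample input.
    product = product_jsonld(
        "Trail Shoe",
        "Lightweight trail running shoe.",
        "https://shop.example/products/trail-shoe",
        89.9,
        "EUR",
    )
    print(to_script_tag(product))
```

Because the tag is emitted as plain text, it can be written into the page at build time, so the markup is present in the HTML without any JavaScript running.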
Allow GPTBot in robots.txt
Many sites block GPTBot by accident, for example through a blanket Disallow rule. Check your robots.txt and make sure GPTBot is explicitly allowed. The rules below also allow several other AI crawlers:
User-agent: GPTBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
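Whether a given robots.txt actually lets these crawlers in can be checked with Python's standard urllib.robotparser module. The sketch below parses the robots.txt text directly, so it runs without network access. The function name allowed_crawlers and the sample rules are illustrative assumptions, and the standard-library parser may resolve edge cases differently from the crawlers themselves.

```python
from urllib.robotparser import RobotFileParser

# The user agent tokens from the robots.txt rules shown above.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def allowed_crawlers(robots_txt, url="/", agents=AI_CRAWLERS):
    """Map each crawler token to True if robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Sample rules that block GPTBot by accident while allowing everyone else.
    sample = "\n".join([
        "User-agent: GPTBot",
        "Disallow: /",
        "",
        "User-agent: *",
        "Allow: /",
    ])
    for agent, allowed in allowed_crawlers(sample, "/products/42").items():
        print(agent, "allowed" if allowed else "blocked")
```

To test a live site, the contents of its robots.txt file can be downloaded and passed in as the first argument.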
Frequently asked questions
How does GPTBot crawl?
GPTBot crawls publicly accessible HTML pages, follows the rules in robots.txt, and reads static HTML most reliably.
When does ChatGPT know my product?
With real-time search: typically 2 to 14 days after the page has been indexed by a search engine. With training data: only after a future training run by OpenAI.
What do I need to do?
Allow GPTBot in robots.txt, add Schema.org, and provide a static crawlable URL per product. Feed-AI handles this automatically.