The short version
Tracking your brand in ChatGPT means doing one of two things. Either you open ChatGPT yourself, in incognito, and run the prompts your customer would actually ask, three times each, logging what you see. Or you pay a tool to do it continuously across 25 to 50 prompts and report changes over time. The free method works for a single audit. The paid method is what you graduate to once you want a number to watch.
The free method, step by step
You don't need a tool to start. A clean ChatGPT session and 15 minutes are enough for a first read on where you stand.
1. Open ChatGPT in incognito, with memory off
Open a private window (Chrome incognito, Safari private). Go to ChatGPT. If you're logged in, sign out, or go to Settings, Personalization, and turn off Memory. The reason this matters: any prior chat you've had about your category contaminates the result. ChatGPT will pull a recommendation that's biased toward whatever brands you've discussed before, and you'll get a falsely flattering read of your visibility.
2. Ask your customer's actual question
Use the natural, unbranded question a real buyer would ask. Some examples that work for different categories:
- "What's the best medical alert system for an elderly parent in 2026?"
- "Which CRM is best for a 5 person sales team?"
- "What project management tool should a marketing agency use?"
Don't lead with your brand name. The question you want answered is "does ChatGPT bring me up unprompted," not "will ChatGPT confirm I exist if I name myself."
3. Repeat in three fresh chats
Open a new chat (not a new turn in the same chat) and ask the same question again. Then once more. Language models are stochastic; the same prompt returns different brand picks on different runs. One run is anecdote. Three to five runs is a starting measurement.
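To see why a single run tells you so little, here is a rough back of the envelope sketch. It assumes each fresh chat is independent and that a brand has a fixed chance `p` of being named per run; both are simplifications, and the function name is ours, not something from any tool.

```python
def chance_of_missing(p: float, runs: int) -> float:
    """Probability that a brand named with per-run probability p shows up in none of the runs.

    Assumes runs are independent with a constant p (a simplification).
    """
    return (1 - p) ** runs


# A brand that is named about one time in three:
for runs in (1, 3, 5):
    print(runs, round(chance_of_missing(1 / 3, runs), 3))
# 1 0.667
# 3 0.296
# 5 0.132
```

Under those assumptions, a brand that really does appear one time in three is still absent from all three runs almost 30% of the time, which is why three runs is a starting measurement and not a verdict.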
4. Log what you see
A spreadsheet with five columns is enough for now: prompt, run number, brands named (in order), your brand's position (or 0 if absent), and a one line description of how each brand was characterized. After three runs, patterns emerge. Some brands are named every time. Some are named one time in three. Some never appear.
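If you'd rather keep the log as a CSV file than a spreadsheet, a minimal sketch might look like the one below. The column names, the helper functions, and the brand names (AcmeCRM, ExampleCRM, SampleCRM) are all placeholders we made up for illustration.

```python
import csv
import io

# The five columns described above (names are our own choice).
FIELDS = ["prompt", "run", "brands_named", "brand_position", "characterization"]


def make_row(prompt, run, brands_named, my_brand, characterization):
    """Build one log row. Position is 1-based, or 0 if the brand is absent."""
    names = [b.lower() for b in brands_named]
    position = names.index(my_brand.lower()) + 1 if my_brand.lower() in names else 0
    return {
        "prompt": prompt,
        "run": run,
        "brands_named": "; ".join(brands_named),  # keep the order ChatGPT used
        "brand_position": position,
        "characterization": characterization,
    }


def write_log(rows, handle):
    """Write the rows as CSV, header first, to any file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


question = "Which CRM is best for a 5 person sales team?"
rows = [
    make_row(question, 1, ["AcmeCRM", "ExampleCRM", "SampleCRM"], "ExampleCRM",
             "AcmeCRM: best overall; ExampleCRM: budget pick; SampleCRM: enterprise"),
    make_row(question, 2, ["AcmeCRM", "SampleCRM"], "ExampleCRM",
             "AcmeCRM: best overall; SampleCRM: enterprise"),
]

buffer = io.StringIO()  # swap for open("chatgpt_log.csv", "w", newline="") to save to disk
write_log(rows, buffer)
print(buffer.getvalue())
```

Recording the position as 0 when the brand is absent keeps the column numeric, so it sorts and filters cleanly later.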
5. Repeat in Gemini and Perplexity
Now do the same exercise in Gemini and Perplexity. Don't assume ChatGPT is representative: only about 11% of cited domains overlap across engines. A brand can dominate Perplexity (which leans heavily on Reddit) while being invisible in ChatGPT (which leans on Wikipedia and trained knowledge), or the reverse. The per engine view is what you actually want to optimize against.
What most guides get wrong about ChatGPT
A lot of "how to track ChatGPT" articles assume ChatGPT is a search engine that browses the web on every query and cites everything. It isn't, and it doesn't.
Only about 18% of ChatGPT conversations trigger at least one web search. The other ~82% are answered from training data with no citation block. If you're optimizing for citations alone, you're optimizing for a minority of answers.
Browsing fires for a specific set of conditions: time sensitive queries (news, prices, recent events), explicit "what are your sources" prompts, factual verification of a specific claim, and queries about specific named companies or products. Everything else is answered from the model's trained knowledge, which is typically 12 to 18 months stale and weighted heavily toward Wikipedia, Reddit, and a long tail of well linked editorial content.
Three implications for how you measure:
- Don't only test the browse mode. A lot of paid tools default to browsing prompts and miss the training mode answers that most users actually see.
- Turn 1 matters disproportionately. The first turn of a conversation is about 2.5x more likely to trigger a citation block than turn 10. Test fresh conversations, not deep ones.
- Free and paid tiers behave differently. Free ChatGPT has live web access in 2026 but with hard daily caps. Plus and Pro browse more aggressively. If your buyer is on the free tier (most are), test there.
One more thing. ChatGPT's browse step routes through Bing, but the page that gets fetched is fetched by OpenAI's own crawler, OAI-SearchBot. If your robots.txt blocks it, you're excluded from the citation block even when your site ranks well on Bing. Many SEO programs from 2023 still block AI crawlers by default; this is a quiet own goal.
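A quick way to check for this own goal is to run your robots.txt through Python's standard `urllib.robotparser`. The sketch below is illustrative: the two robots.txt bodies and the example.com URL are made up, and in practice you would paste in the contents of your own file.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file that singles out OpenAI's crawler.
BLOCKING = """\
User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical file that only hides an admin area from everyone.
OPEN = """\
User-agent: *
Disallow: /admin/
"""

PAGE = "https://www.example.com/pricing"
print(crawler_allowed(BLOCKING, "OAI-SearchBot", PAGE))  # False
print(crawler_allowed(OPEN, "OAI-SearchBot", PAGE))      # True
```

This only tells you what the file says. It does not confirm how any particular crawler interprets edge cases, so treat a `False` here as a prompt to review the file by hand.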
Building a prompt set worth tracking
One prompt is anecdote. A prompt set becomes a measurement somewhere between 25 and 50 prompts, grouped into topics. Practitioner research from Averi and Obsero has converged on this range as the point where week over week variance stabilizes at around ±3.7 points. Topics (groups of related prompts) deliver a more stable signal than individual prompts, which is why the leading tools track at the topic level.
A good prompt set covers four patterns:
- Category prompts: "what's the best X for Y" type questions, unbranded. Where you most want to be named.
- Comparison prompts: "X vs Y" and "is X better than Y" matchups between you and your top competitors.
- Use case prompts: "what should I use to do Z" questions framed around the job, not the category.
- Skeptical / objection prompts: "is X a scam," "problems with X," "alternatives to X." If sentiment is going to turn against you, this is where you'll catch it first.
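One way to build a set that covers all four patterns is to generate it from templates. The sketch below is only an illustration: the templates, the function name, and every brand, audience, and job in the example are placeholders to swap for your own.

```python
# Hypothetical templates, one group per pattern in the list above.
TEMPLATES = {
    "category": ["What's the best {category} for {audience}?"],
    "comparison": ["{brand} vs {competitor}", "Is {brand} better than {competitor}?"],
    "use_case": ["What should I use to {job}?"],
    "objection": ["Is {brand} a scam?", "Problems with {brand}", "Alternatives to {brand}"],
}


def build_prompt_set(brand, category, audiences, competitors, jobs):
    """Return (pattern, prompt) pairs covering all four patterns."""
    # Each pattern varies over one list; objection prompts only need the brand.
    variations = {
        "category": [{"audience": a} for a in audiences],
        "comparison": [{"competitor": c} for c in competitors],
        "use_case": [{"job": j} for j in jobs],
        "objection": [{}],
    }
    prompts = []
    for pattern, templates in TEMPLATES.items():
        for values in variations[pattern]:
            for template in templates:
                # str.format ignores keyword arguments a template does not use.
                prompts.append(
                    (pattern, template.format(brand=brand, category=category, **values))
                )
    return prompts


prompts = build_prompt_set(
    brand="ExampleCRM",
    category="CRM",
    audiences=["a 5 person sales team", "a marketing agency", "a solo consultant",
               "a real estate brokerage", "a nonprofit", "a B2B startup"],
    competitors=["AcmeCRM", "SampleCRM", "DemoCRM", "TestCRM", "PlaceholderCRM"],
    jobs=["track sales leads", "automate follow up emails", "forecast revenue",
          "manage a sales pipeline", "log customer calls", "score inbound leads"],
)
in_range = 25 <= len(prompts) <= 50
print(len(prompts), in_range)  # 25 True
```

Generated prompts read more stiffly than real buyer questions, so it is worth rewording them by hand before you start tracking. Keeping the pattern label next to each prompt makes it easy to report results per pattern later.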
What to log per run
Five fields per run is enough to do real analysis without drowning in data:
- Prompt
- Engine and tier (ChatGPT free, ChatGPT Plus, Gemini, Perplexity, etc.)
- Whether your brand was named, and if so, in what position
- One line of how your brand was characterized ("recommended for X," "noted as a budget alternative," "compared favorably to Y")
- Competitors named alongside or instead of you
From those five fields you can compute everything: mention rate, average position, share of voice, sentiment, and the competitor set that's beating you in this category.
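As a rough sketch of that arithmetic, the snippet below computes mention rate, average position, and share of voice from a list of logged runs. The field names and brands are placeholders, and share of voice is defined here as your mentions divided by all brand mentions across the runs, which is one common definition and an assumption on our part. Sentiment is left out because it depends on how you label the characterization line.

```python
from collections import Counter


def summarize(runs, my_brand):
    """Summarize logged runs. Each run has 'position' (0 if absent) and 'competitors'."""
    total = len(runs)
    positions = [r["position"] for r in runs if r["position"] > 0]

    # Count how many runs each brand appeared in.
    mentions = Counter()
    for r in runs:
        if r["position"] > 0:
            mentions[my_brand] += 1
        mentions.update(set(r["competitors"]))

    all_mentions = sum(mentions.values())
    return {
        "mention_rate": len(positions) / total if total else 0.0,
        # Averaged only over runs where the brand was named.
        "average_position": sum(positions) / len(positions) if positions else None,
        "share_of_voice": mentions[my_brand] / all_mentions if all_mentions else 0.0,
        "top_competitors": [n for n, _ in mentions.most_common() if n != my_brand][:3],
    }


# Three hypothetical runs of the same prompt on one engine.
runs = [
    {"position": 2, "competitors": ["AcmeCRM", "SampleCRM"]},
    {"position": 0, "competitors": ["AcmeCRM", "SampleCRM"]},
    {"position": 1, "competitors": ["AcmeCRM"]},
]
summary = summarize(runs, "ExampleCRM")
print(summary)
```

In this example the brand is named in two of three runs, at an average position of 1.5, with a share of voice of 2 out of 7 total brand mentions.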
Tools worth paying for, once the free method gets old
The free method works for the first audit, and for a quarterly spot check. The moment you want to monitor 30 prompts continuously, across 5 engines, with alerts when something changes, you're going to want a tool. Here's what the 2026 landscape looks like.
| Tool | Entry price | Engines | Notable |
|---|---|---|---|
| Answer Socrates | Free (ChatGPT + Gemini) | ChatGPT, Gemini free. Claude, Perplexity, Grok, DeepSeek at $15/mo. | Cheapest entry. Good for the first audit. |
| Ahrefs AI Visibility Checker | Free | Multiple AI assistants | Strong dataset (Ahrefs has 260M+ monthly prompts in Brand Radar). |
| TrackAIMentions | Credits based | ChatGPT, Perplexity, Gemini | 60 second report. White label PDF on Agency Pro. |
| Otterly.AI | $29/mo Lite (15 prompts) | ChatGPT, Google AIO, Perplexity, Copilot. Gemini extra. | Lowest paid entry. Semrush partnership. |
| Peec AI | €85 to €505/mo | 5 engines, 115+ languages | Looker Studio connector on every plan. |
| AthenaHQ | $270/mo Lite (annual) | 8 engines, credit based | Action Center: generates actual tasks, not just dashboards. |
| Profound | $499/mo Lite (50 prompts) | 5 engines | Enterprise leader. Conversation Explorer, prompt volume data. |
| KodoUs | Free during private beta | 8 engines (ChatGPT, Gemini, Claude, Grok, Perplexity, Copilot, Meta AI, DeepSeek) | Real time prompt alerts. Transparent GEO score. Developer first API. |
Pricing and positioning as of May 2026 from each vendor's public site.
From tracking to ranking
Tracking is the easy part. The hard part is moving the number. Three things consistently move a brand's ChatGPT visibility in 2026:
- Earn presence on the sources ChatGPT cites most: Wikipedia, Wikidata, Reddit, YouTube, and LinkedIn. A Wikidata Q number for your brand is the single highest leverage action (faster to get than a Wikipedia article, and it grounds the entity in the knowledge graph that powers ChatGPT's reasoning).
- Publish content structured for extraction. Question as H2 headings, direct answers in the first 40 to 60 words of each section, comparison tables (LLMs cite tables disproportionately), and named statistics with named sources. Aggarwal et al. (KDD 2024) showed statistics additions and quotation additions produced the biggest visibility lift in the study, up to 40%.
- Earn branded mentions across the web. Ahrefs' 2026 data shows a 0.664 correlation between branded web mentions and AI Overview visibility, and 0.737 for YouTube mentions. Those are the strongest measurable citation signals in the dataset.
For the full playbook, see our piece on Generative Engine Optimization, which covers the research foundation and the tactic library in depth. If you'd rather just have Kodo measure and improve this for you, we're onboarding 25 brands per week during the private beta.
Frequently asked questions
The questions we hear most from teams just starting to track their brand in ChatGPT.
How do I see if my brand is mentioned in ChatGPT?
Open ChatGPT in incognito with chat memory turned off. Ask the question your customer would ask, the natural one ("what's the best medical alert system for my mom"), not a branded one. Repeat the same prompt in 3 fresh chats because ChatGPT is non deterministic. Log whether your brand was named, in what position, with what description, and which competitors were named instead. Then repeat the exercise in Gemini and Perplexity to triangulate.
Does ChatGPT remember previous chats when I test it?
Yes, and that's the most common reason early audits give misleading results. ChatGPT's memory feature and chat history both influence what the model says about brands you've discussed before. Always test in incognito with memory disabled. If you forget, your past conversations bias the answer toward your own brand and you'll think you have AI visibility you don't have.
Why does ChatGPT give different answers each time I ask the same question?
Language models are non deterministic. The same prompt can return different brand recommendations across runs because the model samples from a probability distribution. That's why one test isn't a measurement. Run each prompt at least 3 times. Practitioner research has settled on a sample of 25 to 50 prompts per category as the point where week over week variance becomes stable (around ±3.7 points).
Can I track ChatGPT brand mentions for free?
Yes, for a one shot audit. Use Answer Socrates' free tier (covers ChatGPT and Gemini), or Ahrefs' free AI Visibility Checker, or just open ChatGPT in incognito and run your prompts manually. The free approach breaks down once you want continuous monitoring across 20 to 50 prompts per week. At that point a paid tool starts to pay for itself.
How often should I check ChatGPT for brand mentions?
Daily monitoring is overkill for most categories. Weekly is the practical floor for a brand actively running GEO work. Monthly is fine if you're just keeping an eye on things. The cadence matters less than the consistency: run the same prompt set on the same day of the week so week over week comparisons are meaningful.
Why is my brand ranking on Google but invisible in ChatGPT?
Because the two systems use different signals. Recent Ahrefs research found only about 17% of AI Overview citations also rank in Google's top 10. ChatGPT pulls heavily from Wikipedia, Reddit, YouTube, and LinkedIn, and from training data that may be 12 to 18 months stale. A brand can dominate Google search and still be missing from ChatGPT's category answers if its content footprint on those four sources is thin.
Does ChatGPT browse the web for every query?
No, and this is what most guides get wrong. Roughly 18% of ChatGPT conversations trigger a web search. Browsing fires for time sensitive queries (news, prices), explicit "what are your sources" prompts, factual verification of specific claims, and queries about specific named companies. Everything else gets answered from training data with no citation block. Turn 1 is about 2.5x more likely to trigger citations than turn 10 of the same conversation.
Which crawler does ChatGPT use to fetch my page?
When ChatGPT browses, it routes through Bing for the search step and then fetches the actual page contents using its own crawler called OAI-SearchBot. If your robots.txt blocks OAI-SearchBot, your site is excluded from the citation block even if it ranks well on Bing. That's a common own goal: brands optimize for AI visibility, then block the very crawler that fetches their pages.
How do I improve my brand's visibility in ChatGPT after measuring it?
Three things move the needle: earn presence on the sources ChatGPT cites most (Wikipedia/Wikidata, Reddit, YouTube, LinkedIn), publish content structured for extraction (question as H2 headings, direct answers in the first 40 to 60 words of each section, comparison tables, named statistics with sources), and earn branded web mentions. Ahrefs found a 0.664 correlation between branded mentions and AI Overview visibility, the strongest measurable signal in the 2026 dataset.