Written by Lawrence Hitches | AI SEO Consultant | May 30, 2026 | 14 min read

Across 100 ecommerce brands, ChatGPT referral sessions grew 19x year-on-year in 2025-2026. Those sessions generated $690,000 in tracked revenue from 340,000 sessions. Ninety-one of those 100 brands now receive ChatGPT referral traffic in GA4.

The revenue is real. The challenge is that most tools built to track it are still catching up to the scale of what is happening.

This is my honest review of the AI visibility tools available right now, written from the perspective of an SEO practitioner managing AI search optimisation across dozens of brands. I will cover what each category of tool actually does, which ones work at agency scale, what the free foundation stack looks like before you pay for anything, and where every tool on the market still falls short.

What "optimising for ChatGPT and Claude" actually means in 2026

Optimising for ChatGPT and Claude means two distinct things: increasing the likelihood your brand is cited in AI-generated answers, and tracking when that citation happens and what it drives. The first is a content and entity problem. The second is a measurement problem. Most tools on the market solve one of these. Very few solve both.

ChatGPT Search retrieves from the web via Bing-powered crawling. Claude uses its own web retrieval layer. Both weight content structure, entity clarity, and brand authority over raw keyword matching. The old "rank for the keyword" model is necessary but not sufficient anymore.

What moves the needle for AI citation is different from what moves Google rankings. An Ahrefs study across 75,000 brands found that unlinked brand mentions correlate with AI citation frequency at 0.664, three times stronger than backlinks at 0.218. YouTube mentions correlate at 0.737, the single strongest signal measured. These are signals traditional SEO tools were not built to track.

Before evaluating any paid tool, you need to understand which problem you are solving: citation acquisition, citation tracking, or both.

The free foundation stack, before you pay for anything

The free foundation stack for AI visibility has four components: llms.txt for content discoverability, UTM source tracking for traffic attribution, Cloudflare AI bot logs for crawler intelligence, and KAI Footprint for baseline citation monitoring. Set these up first. They answer questions no paid tool currently answers well, and they cost nothing.

Every paid tool in this category has a blind spot. The free stack patches those blind spots at zero marginal cost.

1. llms.txt

The llms.txt standard (proposed by Jeremy Howard of Answer.AI) gives AI crawlers a curated, structured summary of your site content. Think of it as a robots.txt for language models: a Markdown file at your domain root that tells AI systems what your site is, what it covers, and which pages matter most. Writesonic offers a free llms.txt generator. It takes about 20 minutes to implement. Claude's web retrieval layer explicitly supports it.
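If you would rather generate the file yourself than use a generator, the layout is simple enough to script. The sketch below assumes the commonly described llms.txt layout (an H1 site name, a blockquote summary, then H2 sections of Markdown links). The site name, URLs, and the `Page` and `build_llms_txt` names are hypothetical, for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    title: str
    url: str
    note: str = ""  # optional one-line description of the page


def build_llms_txt(site_name: str, summary: str, sections: Dict[str, List[Page]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.note:
                entry += f": {page.note}"
            lines.append(entry)
        lines.append("")
    # Collapse trailing blank lines to a single newline at end of file.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and pages, for illustration only.
example = build_llms_txt(
    "Example Store",
    "Independent retailer of running shoes, with buying guides and size charts.",
    {
        "Guides": [
            Page("Running shoe size guide", "https://example.com/size-guide", "Sizing by brand"),
        ],
        "Policies": [Page("Returns", "https://example.com/returns")],
    },
)
print(example)
```

Save the output as llms.txt at the domain root and keep the page list short: the point is curation, not a second sitemap.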

2. UTM source=chatgpt tracking

ChatGPT Search sends traffic with a referrer that GA4 can capture. GA4 does not have Universal Analytics-style views, so the cleanest attribution method is building a GA4 exploration or segment filtered to utm_source=chatgpt (use a "contains chatgpt" match so that variants of the value are included) and monitoring it weekly. This is how the $690K revenue figure above was tracked: direct session-to-conversion attribution in GA4, not inferred from dashboard data. No paid tool currently gives you better revenue attribution than your own GA4 setup.
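The same filter can be applied outside GA4, for example on exported landing-page URLs. This is a minimal sketch, not the GA4 implementation: it assumes you have the landing URL and referrer for each session, and it treats any utm_source containing "chatgpt" or a referrer on chatgpt.com or chat.openai.com as a ChatGPT session. The function name and sample rows are hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Referrer hosts treated as ChatGPT in this sketch (an assumption, adjust as needed).
CHATGPT_HOSTS = ("chatgpt.com", "chat.openai.com")


def is_chatgpt_session(landing_url: str, referrer: str = "") -> bool:
    """Return True if the UTM source or the referrer host points at ChatGPT."""
    query = parse_qs(urlparse(landing_url).query)
    utm_sources = [value.lower() for value in query.get("utm_source", [])]
    if any("chatgpt" in value for value in utm_sources):
        return True
    host = (urlparse(referrer).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in CHATGPT_HOSTS)


# Hypothetical exported sessions: (landing URL, referrer).
sessions = [
    ("https://example.com/shoes?utm_source=chatgpt.com", ""),
    ("https://example.com/shoes?utm_source=newsletter", ""),
    ("https://example.com/returns", "https://chatgpt.com/"),
    ("https://example.com/", "https://www.google.com/"),
]
chatgpt_count = sum(is_chatgpt_session(url, ref) for url, ref in sessions)
print(f"ChatGPT sessions: {chatgpt_count} of {len(sessions)}")
```

Checking the referrer as a fallback matters because UTM parameters can be stripped by redirects before analytics sees them.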

3. Cloudflare AI bot logs

If your site runs on Cloudflare, the GraphQL analytics API lets you filter requests by verified bot category, including "AI Assistant", "AI Crawler", and "AI Search". This shows which AI crawlers are hitting your site, at what volume, and which pages they are requesting. This is crawler-level intelligence, not citation-level, but it tells you whether your content is being ingested before citation tracking dashboards see it.
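If you are not on Cloudflare, a rougher version of the same read is available from ordinary access logs. The sketch below does not call the Cloudflare API. It assumes you already have (user agent, path) pairs from your own logs and matches them against a short list of published crawler tokens. User agents can be spoofed, so treat the counts as directional rather than verified. The function names and sample requests are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Lowercased user-agent tokens mapped to display labels (a partial list).
AI_BOT_TOKENS = {
    "gptbot": "GPTBot",
    "oai-searchbot": "OAI-SearchBot",
    "chatgpt-user": "ChatGPT-User",
    "claudebot": "ClaudeBot",
    "perplexitybot": "PerplexityBot",
    "bingbot": "bingbot",
    "bingpreview": "BingPreview",
}


def classify_bot(user_agent: str) -> Optional[str]:
    """Return the bot label if a known token appears in the user agent."""
    ua = user_agent.lower()
    for token, label in AI_BOT_TOKENS.items():
        if token in ua:
            return label
    return None


def count_bot_hits(requests: Iterable[Tuple[str, str]]) -> Counter:
    """Count requests per (bot label, path), ignoring unrecognised agents."""
    hits: Counter = Counter()
    for user_agent, path in requests:
        label = classify_bot(user_agent)
        if label is not None:
            hits[(label, path)] += 1
    return hits


# Hypothetical log extract: (user agent, requested path).
sample_requests = [
    ("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)", "/guide"),
    ("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)", "/guide"),
    ("Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)", "/returns"),
    ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0 Safari/537.36", "/"),
]
hits = count_bot_hits(sample_requests)
for (bot, path), count in hits.most_common():
    print(bot, path, count)
```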

4. KAI Footprint

KAI Footprint is a free citation monitoring tool that tracks brand mentions across ChatGPT, Claude, Perplexity, and Gemini. It runs a set of queries you define and shows whether your brand appears in the answers. Limitations: query volume is restricted on the free tier, and it does not show traffic or revenue correlation. But as a starting baseline, it costs nothing and gives you a directional read before you commit to a paid subscription.

Run all four of these before evaluating any paid tool. If you cannot answer "how many sessions from ChatGPT did we receive last month" from your own GA4, no dashboard will fix that for you.

AI visibility tracking tools: the core paid category

The core paid AI visibility tracking tools are Profound, Peec AI, Otterly AI, AthenaHQ, and WorkDuo. All track citation share across ChatGPT, Claude, and Perplexity. They differ materially in query volume, agency capability, and pricing. Profound is the strongest for enterprise data depth. Otterly is the best entry point at $29/month. Peec AI has the best multi-LLM coverage. None of them connect citations to revenue.

Profound

Profound tracks AI share of voice across ChatGPT, Claude, Perplexity, and Gemini. It runs a configurable set of queries against each platform, captures which brands appear in the answers, and reports share of voice over time. The data depth is genuinely enterprise-grade: it can run thousands of queries per month and segment by query intent, topic cluster, and brand. Pricing is custom and typically starts in the $1,000/month range for meaningful query volumes. Best for: enterprise brands or agencies with budget to run comprehensive share-of-voice studies.
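To make "share of voice" concrete, here is a minimal sketch of the calculation, not Profound's actual method. It assumes share of voice means the fraction of captured answers that mention a brand, using a simple case-insensitive text match. The brand names and answers are made up.

```python
from typing import Dict, List


def share_of_voice(answers: List[str], brands: List[str]) -> Dict[str, float]:
    """Fraction of answers mentioning each brand (case-insensitive substring match)."""
    if not answers:
        return {brand: 0.0 for brand in brands}
    lowered = [answer.lower() for answer in answers]
    return {
        brand: sum(brand.lower() in answer for answer in lowered) / len(lowered)
        for brand in brands
    }


# Hypothetical captured answers for one query set.
answers = [
    "Popular options include Acme and Globex.",
    "Acme is often recommended for beginners.",
    "Many reviewers suggest Initech.",
    "For trail running, acme has a wide range.",
]
print(share_of_voice(answers, ["Acme", "Globex", "Initech"]))
# {'Acme': 0.75, 'Globex': 0.25, 'Initech': 0.25}
```

Substring matching is crude (it will miss misspellings and catch brand names embedded in other words), which is one reason the paid tools are worth having once query volume grows.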

Peec AI

Peec AI is a multi-LLM tracker built specifically for the brand mention monitoring use case. It queries ChatGPT, Claude, Perplexity, and Gemini in parallel and tracks whether your brand appears, with what sentiment, and in what context. The competitor tracking capability is strong: you can monitor up to five competitor brands alongside your own. Pricing sits in the mid-market range. Best for: brands wanting genuine multi-LLM coverage with competitor benchmarking.

Otterly AI

Otterly is the most accessible entry point in this category at $29/month. It tracks AI citation share across the major platforms, provides keyword-level reporting, and has a clean interface that non-technical users can navigate. The query volume on lower tiers is limited, which constrains how broad a monitoring programme you can run. Best for: solo consultants, small in-house teams, or anyone wanting to validate the category before committing to enterprise pricing.

AthenaHQ

AthenaHQ leans into the content optimisation side of AI visibility, not just tracking. It analyses your content against what AI systems are actually citing and gives content recommendations to improve citation likelihood. The platform sits at the intersection of tracking tool and content tool, which makes it more complex to evaluate but more useful for teams that want to act on the data, not just report it. Best for: teams with both an SEO and content function who want one tool bridging both.

WorkDuo

WorkDuo's strength is workflow integration: it connects AI citation tracking to broader marketing operations, with reporting built for teams rather than individual analysts. The agency and team features are more developed than most competitors. The platform tracks multiple LLMs and allows query segmentation by buying stage. Best for: marketing operations teams or agencies wanting citation tracking integrated into broader reporting stacks.

Copilot and Bing-specific tracking: the gap most tools miss

Microsoft Copilot is embedded in Bing, Edge, Windows, and Microsoft 365. It draws from Bing's index, making it a distinct tracking challenge from ChatGPT or Claude. Most AI visibility tools track ChatGPT and Claude well. Copilot tracking is underserved. Bing Webmaster Tools is currently the strongest free data source for Copilot-specific visibility, and Microsoft previewed a Citation Share metric at SEO Week in April 2026 that will give publishers direct citation data for the first time.

This gap matters more for Australian and APAC markets than most tools acknowledge. Bing has higher market share in certain APAC enterprise verticals than its global average suggests. If you are running AI visibility tracking for Australian clients and only measuring ChatGPT, you are likely missing 30-40% of the AI search picture.

The practical approach for Copilot-specific tracking right now:

  • Bing Webmaster Tools: Submit your sitemap, monitor Bing organic rankings monthly, and watch for the Citation Share metric when it rolls out broadly (it is in preview as of May 2026).
  • Cloudflare AI bot logs: Filter for BingPreview and bingbot to see Copilot-adjacent crawler activity on your site.
  • Manual query testing: Ask Copilot questions in your client's topic area and document whether their brand appears. Tedious, but it is currently the most reliable way to validate Copilot citation for specific queries.
  • GA4 referrer segmentation: Copilot-driven traffic arrives with Bing referral data. Segment GA4 by source/medium to separate Bing organic from Bing-via-Copilot (look for session patterns that suggest conversational rather than navigational behaviour).

No paid tool currently offers reliable, at-scale Copilot-specific citation tracking. Optimising for Copilot starts with Bing SEO fundamentals, which means this is a tracking gap you address with Bing Webmaster Tools until the paid tools catch up.

Tools for agencies managing multiple brands

Agency-scale AI visibility tracking requires tools with multi-client workspaces, bulk query management, and exportable reporting. WorkDuo has the most developed agency workspace. Ahrefs Brand Radar supports multi-brand monitoring under one account. Semrush's agency toolkit layers AI visibility data onto existing client dashboards. None of these were purpose-built for AI citation tracking at agency scale, and it shows in the workflow friction.

The operational reality of running AI visibility tracking across 20+ client brands is different from what any vendor demo shows. In practice, agencies face three problems that none of the current tools fully solve.

First: query management at scale. Defining a meaningful query set for one brand takes time. Doing it across 30 clients is a project. Tools that give you a template query library and allow bulk import save significant setup time. WorkDuo and Profound both offer this. Otterly does not at lower tiers.

Second: reporting consolidation. Clients want a single number: "how visible are we in AI search compared to last month?" Extracting that from a per-query citation log requires either custom reporting or a tool that surfaces an aggregate score. AthenaHQ's scoring model is designed for this. Profound's share-of-voice percentage works well in client presentations.

Third: white-label capability. Most tools do not offer white-labelling at the price points agencies can pass through to clients. WorkDuo has agency pricing options. Profound can be white-labelled at enterprise tiers. For most agencies, the current solution is exporting data and building custom Looker Studio dashboards on top.

Until a tool purpose-builds for agency workflow, the most practical approach is: Profound or Peec AI for data, Looker Studio for client-facing reporting, and GA4 UTM tracking for revenue attribution that the dashboards cannot provide.
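For the reporting consolidation problem, the "single number" clients ask for can be derived from a per-query citation log before it reaches Looker Studio. A minimal sketch, assuming each log row is (month, query, whether the brand was cited) and that visibility is simply cited queries divided by tracked queries. The row layout and function name are hypothetical, not any vendor's export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def visibility_by_month(log: Iterable[Tuple[str, str, bool]]) -> Dict[str, float]:
    """Share of tracked queries where the brand was cited, keyed by month."""
    totals: Dict[str, int] = defaultdict(int)
    cited: Dict[str, int] = defaultdict(int)
    for month, _query, was_cited in log:
        totals[month] += 1
        cited[month] += int(was_cited)
    return {month: cited[month] / totals[month] for month in sorted(totals)}


# Hypothetical citation log for one client brand.
citation_log = [
    ("2026-04", "best running shoes", True),
    ("2026-04", "where to buy running shoes", False),
    ("2026-04", "running shoe free returns", False),
    ("2026-04", "trail shoes for beginners", False),
    ("2026-05", "best running shoes", True),
    ("2026-05", "where to buy running shoes", True),
    ("2026-05", "running shoe free returns", False),
    ("2026-05", "trail shoes for beginners", False),
]
scores = visibility_by_month(citation_log)
change = scores["2026-05"] - scores["2026-04"]
print(scores, f"change: {change:+.0%}")
```

Keep the query set fixed between months, otherwise the month-on-month change reflects the queries you added rather than any real movement.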

Content optimisation tools vs tracking tools: do not confuse them

AI visibility tools split into two categories that are often conflated: tracking tools (which monitor whether your brand appears in AI answers) and content optimisation tools (which help you create content more likely to be cited). AthenaHQ and Writesonic's GEO features sit in the second category. Semrush's AI content tools also land here. These tools are complementary, not interchangeable.

The distinction matters because they answer different questions. A tracking tool tells you your current citation share. A content optimisation tool helps you improve it. You need both, but buying a content tool when you need a tracking tool, or vice versa, is a common mistake in the early stages of building an AI visibility programme.

Content optimisation tools worth knowing:

  • AthenaHQ: Analyses what AI systems are citing in your topic area and gives content gap recommendations based on that analysis. The only tool I have seen that connects AI citation patterns to content action items directly.
  • Writesonic's GEO mode: Positions content for AI answer engine retrieval. Includes a free llms.txt generator. Useful as a content creation layer, not as a tracking layer.
  • Semrush AI content tools: Semrush has layered AI content optimisation onto its existing toolset. If you are already a Semrush user, these features are worth activating. They do not replace purpose-built AI visibility trackers but they are useful consolidation for teams that live in Semrush already.
  • InLinks: Entity-based content optimisation that maps your content to knowledge graph entities. This is the technical GEO layer, building the entity architecture that makes citation trackers find something worth measuring. Often overlooked because it is not a dashboard.

What 100 ecommerce brands taught us about AI search tools

Across 100 ecommerce brands tracked at StudioHawk through the 2025-2026 AI search surge, ChatGPT referral sessions grew 19x year-on-year, generating $690,000 in tracked revenue from 340,000 sessions. Ninety-one of 100 brands now receive ChatGPT referral traffic in GA4. The key finding: none of the paid AI visibility tools gave us the revenue attribution that mattered. GA4 UTM tracking gave us that. The paid tools gave us the citation context to understand why certain brands outperformed others.

The $690K revenue number comes from direct GA4 attribution. We set up UTM source tracking across all client properties, filtered sessions by utm_source=chatgpt, and connected those sessions to Shopify and WooCommerce transaction data. This is not a modelled number. It is clicks from ChatGPT that converted to purchases, tracked the same way any other channel is tracked.
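For scale, the headline figures work out to roughly $2.03 per session ($690,000 / 340,000). The join behind that kind of number can be sketched as follows. This is an illustration under assumptions, not our production pipeline: it assumes sessions and orders share a session identifier, and the field names and sample rows are hypothetical.

```python
from typing import Dict, List


def revenue_for_source(sessions: List[dict], orders: List[dict], source_contains: str) -> Dict[str, float]:
    """Sum order revenue for sessions whose source contains the given text."""
    needle = source_contains.lower()
    matching_ids = {s["session_id"] for s in sessions if needle in s["source"].lower()}
    revenue = sum(o["revenue"] for o in orders if o["session_id"] in matching_ids)
    count = len(matching_ids)
    return {
        "sessions": count,
        "revenue": revenue,
        "revenue_per_session": revenue / count if count else 0.0,
    }


# Hypothetical analytics sessions and store orders.
sessions = [
    {"session_id": "s1", "source": "chatgpt.com"},
    {"session_id": "s2", "source": "google"},
    {"session_id": "s3", "source": "chatgpt.com"},
    {"session_id": "s4", "source": "chatgpt.com"},
]
orders = [
    {"session_id": "s1", "revenue": 120.0},
    {"session_id": "s2", "revenue": 80.0},
    {"session_id": "s4", "revenue": 60.0},
]
summary = revenue_for_source(sessions, orders, "chatgpt")
print(summary)
# {'sessions': 3, 'revenue': 180.0, 'revenue_per_session': 60.0}
```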

What the paid AI visibility tools told us was different, and complementary. The brands with the highest citation share in Profound and Peec AI were not always the brands with the highest revenue from ChatGPT sessions. The disconnect revealed something the tools cannot tell you: citation volume and conversion quality are not the same metric.

A brand cited frequently in "what is the best X" queries captures awareness traffic. A brand cited in "where to buy X in Melbourne" or "which X brand has free returns" captures purchase-intent traffic. The tools do not segment by intent with enough granularity to surface this distinction automatically. You need to layer in query-level analysis manually.
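A first pass at that manual layer can be rule-based. The sketch below tags queries as purchase-intent or awareness using a handful of phrase markers. The marker lists are my own assumptions drawn from the examples above, not a validated taxonomy, so expect to tune them per vertical.

```python
# Phrase markers for each intent bucket (illustrative assumptions, not exhaustive).
PURCHASE_MARKERS = ("where to buy", "free returns", "free shipping", "price", "in stock", "near me")
AWARENESS_MARKERS = ("best", "what is", "what are", "top ")


def classify_intent(query: str) -> str:
    """Return 'purchase', 'awareness', or 'other' for a tracked query."""
    q = query.lower()
    # Purchase markers are checked first so "best price on X" counts as purchase intent.
    if any(marker in q for marker in PURCHASE_MARKERS):
        return "purchase"
    if any(marker in q for marker in AWARENESS_MARKERS):
        return "awareness"
    return "other"


# Hypothetical tracked queries.
tracked_queries = [
    "what is the best running shoe",
    "where to buy running shoes in Melbourne",
    "which running shoe brand has free returns",
    "how to clean running shoes",
]
for query in tracked_queries:
    print(f"{classify_intent(query):<10} {query}")
```

Once queries carry an intent label, citation share can be reported per bucket, which is where the gap between awareness citations and purchase-intent citations becomes visible.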

The most important signal from this dataset was about brand mentions, not rankings. Brands that had invested in PR, podcast appearances, YouTube coverage, and review site presence before the AI search surge happened to be the brands with the highest organic citation rates when AI search scaled. Unlinked brand mentions correlate with AI citation frequency at 0.664, per Ahrefs' 75,000-brand study. We saw this pattern clearly in our own data: brands with strong pre-existing mention profiles did not need to build citation from scratch. The AI systems already knew them.

For brands without that pre-existing presence, the tools helped us identify which query clusters we were being cited in versus which we were absent from, so we could prioritise content investment. That is the legitimate use case for paid tracking tools: gap identification, not revenue attribution.

How to choose the right tool for your situation

The right AI visibility tool depends on whether you are a solo consultant, in-house team, agency, or enterprise. Solo consultants: start with the free stack plus Otterly at $29/month. In-house teams: Peec AI or AthenaHQ for citation tracking plus content action items. Agencies managing 10+ clients: Profound or WorkDuo for data depth, with custom Looker Studio reporting on top. Enterprise: Profound or BrightEdge for share-of-voice at scale, integrated with existing analytics infrastructure.

Situation | Recommended stack | Approx. monthly cost
Solo consultant starting out | Free stack (KAI Footprint + UTM tracking + Cloudflare logs + llms.txt) + Otterly AI | $29
In-house SEO team (1-5 people) | Peec AI or AthenaHQ + GA4 UTM tracking | $150-$400
Agency managing 10-30 clients | Profound or WorkDuo + Looker Studio dashboards + GA4 revenue tracking | $500-$1,500
Enterprise brand (multiple markets) | Profound or BrightEdge + Semrush agency toolkit + GA4 | Custom
Copilot/Bing-focused (APAC) | Bing Webmaster Tools (free) + any of the above + manual query testing | Tool cost + time

One principle that applies regardless of tier: do not start with a paid tool until you have UTM tracking and GA4 set up correctly. The paid tools measure citation share. Your own analytics measure revenue. You need both to make a business case for AI search investment.

What these tools still cannot measure (yet)

No competitor review I have read is honest about this, so here is the full list of what the current generation of tools cannot do:

  • Revenue attribution from AI citations: No tool connects a citation event to a downstream purchase. GA4 UTM tracking is still the only reliable method, and it only captures sessions where ChatGPT sent a visitor directly. Assisted revenue from AI citations (someone sees your brand cited, Googles you later, converts) is invisible to everything.
  • Model-level segmentation: ChatGPT has multiple models (GPT-4o, GPT-4o mini, o3). Claude has Sonnet, Opus, and Haiku. Citation behaviour differs across these models. No tool currently segments citation data by which model generated the response.
  • Real-time citation correlation: You publish a new piece of content. When does it start appearing in AI answers? How quickly does citation rate change after a content update? No tool currently gives you this feedback loop with a latency under 72 hours.
  • Dark mentions: AI systems often draw on training data as well as live web retrieval. Citations that come from training data rather than live retrieval are invisible to tracker tools entirely. This is especially relevant for Claude, which has a more recent training cutoff and uses retrieval and training in combination.
  • Query sampling bias: Every paid tool in this category runs a query set you define against AI platforms and monitors what comes back. The query set you define determines what you can see. If you do not know to track a query, you will not see that you are being cited (or not cited) for it. There is no comprehensive query coverage in any current product.
  • Bing Copilot citation data at scale: As noted above, Copilot tracking is the underserved use case in this category. The Citation Share preview from Microsoft is the most direct solution on the horizon, but it is not broadly available yet.

These gaps are not criticisms of specific tools. They reflect where the technology is right now. The tools that exist are genuinely useful for what they do: tracking citation share across a defined query set across the major LLM platforms. That is worth measuring. Just do not mistake it for a complete picture of AI search performance.

FAQ

What is the best free tool to track ChatGPT citations?

KAI Footprint is the strongest free option for citation monitoring. Pair it with GA4 UTM source tracking for traffic attribution and Cloudflare AI bot logs for crawler intelligence. This free stack answers the baseline questions before you commit to any paid subscription.

Do AI visibility tools track Copilot as well as ChatGPT and Claude?

Most paid tools track ChatGPT, Claude, and Perplexity reliably. Copilot tracking is significantly weaker across the category. Bing Webmaster Tools is currently the best free data source for Copilot-specific visibility, with Microsoft's Citation Share metric in preview as of May 2026 being the most significant upcoming development in this space.

Can AI visibility tools tell me how much revenue comes from ChatGPT?

No current paid tool does this reliably. GA4 UTM source tracking is the only method that connects ChatGPT-referred sessions to transactions with any accuracy. Set up a GA4 exploration or segment filtered to utm_source=chatgpt, connect it to your ecommerce conversion data, and you have a direct revenue attribution line that paid tools cannot replicate.

How many queries do I need to track to get meaningful AI citation data?

At minimum, track 20-30 queries covering your core topic clusters, branded queries, and your three to five highest-intent buying queries. At agency scale across multiple clients, query management becomes the biggest operational challenge: budget time for query set design, not just tool setup. The quality of your query set determines the quality of your citation data.
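One way to reach 20-30 queries without hand-writing each one is to cross a small template library with your topic clusters, then append branded queries. A minimal sketch, assuming a plain-text bulk import is available in your tool. The templates, topics, brand name, and function name are hypothetical.

```python
from itertools import product
from typing import Iterable, List, Sequence


def build_query_set(templates: Sequence[str], topics: Sequence[str], brand_queries: Iterable[str] = ()) -> List[str]:
    """Expand templates across topics, append branded queries, drop duplicates."""
    candidates = [template.format(topic=topic) for template, topic in product(templates, topics)]
    candidates.extend(brand_queries)
    seen = set()
    queries = []
    for candidate in candidates:
        cleaned = candidate.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:  # case-insensitive de-duplication
            seen.add(key)
            queries.append(cleaned)
    return queries


# Hypothetical template library and topic clusters.
templates = [
    "what are the best {topic}",
    "where to buy {topic} in Melbourne",
    "which {topic} brand has free returns",
]
topics = ["running shoes", "trail shoes"]
branded = ["is Example Store legit", "Example Store returns policy"]

query_set = build_query_set(templates, topics, branded)
print(len(query_set), "queries")
for query in query_set:
    print(query)
```

Review the generated list by hand before importing it: templates produce volume, but the three to five highest-intent buying queries usually need to be written deliberately.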

Is Semrush good enough for AI visibility tracking, or do I need a specialist tool?

Semrush's AI visibility features are useful if you are already in the platform and want to avoid adding another subscription. They are not as deep as Profound or Peec AI for citation share tracking specifically. If AI search is a significant channel for your clients or business, a specialist tool is worth the additional cost. If AI visibility is one of ten things on your dashboard, Semrush consolidation makes sense.

Lawrence Hitches, AI SEO Consultant, Melbourne

Chief of Staff at StudioHawk, Australia's largest dedicated SEO agency. Specialising in AI search visibility, technical SEO, and organic growth strategy. Leading a team of 120+ across Melbourne, Sydney, London, and the US.