ChatGPT Has Hidden Its Fan-Out Queries. Here’s How to Get Them Back with n8n

Dan George is a former Group Marketing Director turned consultant and fractional marketing lead. He helps growing B2B businesses find clarity, generate leads, and build marketing that actually performs. He writes about marketing strategy, SEO, and the realities of doing more with less.

When ChatGPT searches the web to answer a prompt, it does not search for what the user typed. It rewrites the prompt into a set of more specific search queries, runs each one against a search index, and synthesises the results into a single answer. This is called query fan-out. Published research puts the typical range at 5 to 15 sub-queries per prompt, though in my own API testing simpler commercial prompts often produced two or three – the count scales with how complex and comparison-heavy the question is.

For anyone working in SEO, these queries are like gold dust. They are the real ranking context for AI visibility. You might sit on page one of Google for your target search term, but if your content does not appear for the sub-queries ChatGPT actually sends, you will miss out on being mentioned in the answer your prospect reads.

The fan-outs also reveal exactly which pages the model leans on when it builds an answer – sometimes listicles and directories, sometimes competitors’ own case study pages – which makes them a very sharp prospecting list for links, coverage and content gaps.

For over a year there was a free trick to see them: open Chrome DevTools, filter the network traffic for your conversation ID, and search the raw JSON for a field called search_model_queries. That method no longer works.

What changed

OpenAI has progressively removed fan-out data from the browser. When GPT-5.3 became the default model, the search_model_queries field disappeared from the conversation that the interface loads. With the release of GPT-5.4, the removal was complete – dig through the conversation files in DevTools and there is nothing to find. The Chrome extensions and bookmarklets that scraped this data have stopped working with it.

Credit where credit’s due: Chris Long at Nectiv flagged the change publicly, and Jerome Salomon confirmed the important part – the data has not been deleted, only hidden from the interface. It is still fully accessible through OpenAI’s API.

Chris published a Python script to pull it. That works fine if you are comfortable in a terminal, but most marketers are not, and a one-off script does not scale into a repeatable process. So here is the same approach rebuilt in n8n: a visual workflow you can run on a schedule, feed from a spreadsheet of prompts, and extend into a full backlink prospecting pipeline.

What you need

  • An n8n instance (cloud or self-hosted – I have been running self-hosted for a few months and can’t see myself going back)
  • An OpenAI API key with access to the Responses API and web search
  • Optionally, for the analysis stage: a DataForSEO account or an Anthropic API key (more on this below)
  • Somewhere to store results, such as a Google Sheet

Cost-wise, each prompt is a single API call. Web-search-enabled calls cost more than standard ones, but running 20 or 30 target prompts costs pennies, not pounds.

The core workflow: three nodes

Node 1: your prompts

Start simple with a Manual Trigger and a Set node containing one prompt, or point a Google Sheets node at a list of prompts your audience would realistically ask. The same rules apply as they did with the browser method: evaluative and comparative prompts (“best X in Leicester”, “top Y for manufacturers”, anything with a year) are far more likely to trigger web searches and generate rich fan-outs than simple factual questions.

Node 2: HTTP Request to the OpenAI Responses API

n8n’s OpenAI node does not expose everything you need here, so use a plain HTTP Request node:

json

{
  "model": "gpt-5.4",
  "tools": [{
    "type": "web_search",
    "user_location": {
      "type": "approximate",
      "country": "GB",
      "city": "Leicester"
    }
  }],
  "tool_choice": "auto",
  "input": "{{ $json.prompt }}"
}

Do not skip the user_location block. ChatGPT’s retrieval is location-aware, and if you leave it out the API defaults to a US user – the raw response will show country: “US” in the tools section. For anything with a geographic flavour, that means you are extracting the fan-outs an American would trigger, not the ones your actual prospect would. Set the country (and city if relevant) to match your audience.

Set the timeout generously – these calls can take 10 to 30 seconds because the model is genuinely going out and searching.
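
For anyone who wants to test the call outside n8n first, here is a minimal sketch of the same request as a plain object. The endpoint and Bearer-token header are OpenAI's standard ones; the function name buildFanoutRequest, the 60-second timeout and the OPENAI_API_KEY environment variable in the usage comment are my own illustrative assumptions, not part of the workflow above.

```javascript
// Sketch: assemble the HTTP request described above as a plain object.
// buildFanoutRequest and the 60 s timeout are illustrative assumptions.
function buildFanoutRequest(prompt, apiKey, location = { country: 'GB', city: 'Leicester' }) {
  return {
    url: 'https://api.openai.com/v1/responses',
    timeoutMs: 60000, // assumed value, comfortably above the 10 to 30 s these calls can take
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      // JSON.stringify escapes any quotes inside the prompt
      body: JSON.stringify({
        model: 'gpt-5.4',
        tools: [{ type: 'web_search', user_location: { type: 'approximate', ...location } }],
        tool_choice: 'auto',
        input: prompt,
      }),
    },
  };
}

// Usage outside n8n (Node 18+, which ships fetch and AbortSignal.timeout):
// const req = buildFanoutRequest('best builders leicester', process.env.OPENAI_API_KEY);
// const res = await fetch(req.url, { ...req.options, signal: AbortSignal.timeout(req.timeoutMs) });
// const data = await res.json();
```

Building the body with JSON.stringify rather than by string templating means a prompt containing double quotes still produces valid JSON, which is worth checking in your n8n expression too if your prompt list includes quoted phrases.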

The response is a large JSON object. Buried inside its output array are items describing every web search the model ran, plus the citations it used in its answer. That is everything the old DevTools method gave you, and more.

Node 3: a Code node to parse the response

Add a Code node after the HTTP Request with something like this:

javascript

const response = $input.first().json;
const fanoutQueries = [];
const citations = [];

for (const item of response.output || []) {
  // Web search calls contain the fan-out queries.
  // The full set lives in action.queries (an array); action.query
  // holds only the first one, so check the array first.
  if (item.type === 'web_search_call' && item.action) {
    if (Array.isArray(item.action.queries)) {
      fanoutQueries.push(...item.action.queries);
    } else if (item.action.query) {
      fanoutQueries.push(item.action.query);
    }
  }

  // The final message contains url_citation annotations
  if (item.type === 'message') {
    for (const block of item.content || []) {
      for (const a of block.annotations || []) {
        if (a.type === 'url_citation') {
          citations.push({ title: a.title, url: a.url });
        }
      }
    }
  }
}

// The model often cites the same URL several times in one answer,
// so dedupe before counting
const seen = new Set();
const uniqueCitations = citations.filter(c => {
  if (seen.has(c.url)) return false;
  seen.add(c.url);
  return true;
});

return [{
  json: {
    prompt: $('Set').first().json.prompt,
    fanoutCount: fanoutQueries.length,
    fanoutQueries,
    citationCount: uniqueCitations.length,
    citations: uniqueCitations
  }
}];

A detail that will save you an undercount: the fan-out queries sit in an array called action.queries, but the payload also includes a singular action.query field holding just the first one. Parse the singular field alone and a three-query fan-out reports as one. The code above checks the array first for exactly this reason.

That aside, OpenAI adjusts its response payloads over time, so before trusting any parser, run the workflow once and scan down the raw JSON output of the HTTP Request node. If the field names shift again, the queries will still be in there – search the raw output for your topic keywords and adjust the paths.

From here, pipe the output into a Google Sheets node (one row per prompt, fan-outs and citations in columns) and you have a permanent, growing record of what ChatGPT searches for in your niche.
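
The "one row per prompt" shape can be sketched as a small flattening step between the parser and the Google Sheets node. This is a minimal sketch, not a required part of the workflow: the function name toSheetRow and the column names are assumptions, so rename them to match your own sheet headers.

```javascript
// Sketch: flatten one parsed result into a single spreadsheet row.
// toSheetRow and the column names are illustrative assumptions.
function toSheetRow(result, runDate = new Date()) {
  return {
    run_date: runDate.toISOString().slice(0, 10), // YYYY-MM-DD (UTC)
    prompt: result.prompt,
    fanout_count: result.fanoutCount,
    fanout_queries: result.fanoutQueries.join('\n'), // one query per line in the cell
    citation_count: result.citationCount,
    citation_urls: result.citations.map(c => c.url).join('\n'),
  };
}

// In an n8n Code node placed after the parser, this could be used as:
// return $input.all().map(item => ({ json: toSheetRow(item.json) }));
```

Stamping each row with a run date is what makes later comparisons between runs possible.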

What real runs look like

#1 Here is the output for the prompt “best HR consultancies for manufacturing companies UK”. ChatGPT expanded it into three searches:

  1. best HR consultancies for manufacturing companies UK HR consultancy manufacturing B2B industrial
  2. HR consultancy UK manufacturing case studies employee relations industrial
  3. HR consultancy manufacturing UK case studies official site

Two things jump out. First, the queries are long, keyword-stacked strings, not what most people would type – which is why standard keyword tools won’t always spot them. Second, look at what the model went hunting for: two of the three queries contain “case studies”, and one asks for an “official site”.

The citations ChatGPT returned confirmed the pattern. The response cited six URLs, and every one was a consultancy’s own website – dedicated manufacturing-sector HR service pages and named client case studies with results in the title. No directories, listicles or “top 10 consultancies” roundups. The most-cited firm had a sector-specific service page; another earned an extra citation purely because it published pricing, which the model lifted directly into its answer.

#2 Here is the output for the prompt “best builders leicester”. ChatGPT expanded it into four searches:

  1. best builders Leicester UK reviews recommendations 2026
  2. Leicester builders federation master builders Leicester
  3. Checkatrade builders Leicester reviews
  4. TrustATrader builders Leicester

Again, two things stand out. First, the model appends the current year and asks for reviews and recommendations – it wants fresh, third-party validation. Second, and more striking, it names its sources before it has even searched. The Federation of Master Builders, Checkatrade and TrustATrader appear in the queries themselves. The model is not discovering directories in the results; it goes looking for specific platforms by name.

The implication for a local trade business is obvious. For a query like this, ChatGPT has decided in advance where the answer lives: accreditation bodies and review platforms. A builder with a polished website but no Checkatrade profile, no FMB membership and thin reviews won’t appear in any of the four searches – it is out of the running before a single result loads. The visibility strategy here is not blog content or backlinks – it is profiles, memberships and review volume on the exact platforms the model names. You could not get a clearer target list if the model emailed it to you.

One more detail from the raw output: every cited URL carried a ?utm_source=openai parameter. That tag is how ChatGPT marks its outbound links, and it is what you filter on in GA4 to measure the referral traffic these citations actually send you.
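
The tag is handy in GA4, but it gets in the way when you compare cited URLs against other lists, because the same page with and without the parameter looks like two different URLs. A minimal sketch of a clean-up helper follows; the name stripOpenAiTag is my own, and it assumes the standard URL global is available in your runtime (it is in Node.js).

```javascript
// Sketch: detect and remove the utm_source=openai tag from a cited URL.
// stripOpenAiTag is an illustrative name, not part of any API.
function stripOpenAiTag(rawUrl) {
  const url = new URL(rawUrl);
  const tagged = url.searchParams.get('utm_source') === 'openai';
  // Only remove the parameter when it is the OpenAI tag; leave other UTM values alone
  if (tagged) url.searchParams.delete('utm_source');
  return { tagged, url: url.toString() };
}

// Example: clean the citations array produced by the parser
// const cleaned = citations.map(c => ({ ...c, ...stripOpenAiTag(c.url) }));
```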

The usual caveat applies: each of these is one run of one prompt, so treat them as data points rather than laws. But they are a useful corrective to the assumption that AI answers are built entirely from listicles.

The analysis stage: turning queries into targets

Extracting the queries is half the job. The value comes from seeing what those queries return, because the pages ranking for fan-out queries are the pages ChatGPT reads and cites.

What those pages are depends on the query type, and the fan-outs tell you which game you are playing. When the model hunts comparison and “best of” angles, the winners tend to be listicles, directories and roundups – third-party pages you get onto through outreach and digital PR. When it hunts proof, as in the HR consultancy example above, the winners are first-party service pages and case studies – pages you build yourself. Often a prompt list produces a mix of both, which is exactly why running your own extraction beats relying on someone else’s generalisation.
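
One rough way to see which game a prompt list is playing is to tag each fan-out query with the signals it contains. The sketch below is a crude heuristic under my own assumptions: the classifyQuery name and the two term lists are hypothetical, drawn only from the two example runs above, and you would want to extend them for your niche.

```javascript
// Sketch: tag each fan-out query with the signals it contains.
// SIGNALS is a hypothetical starting list based on the two example runs.
const SIGNALS = {
  proof: ['case studies', 'case study', 'official site'],
  thirdParty: ['best', 'top', 'reviews', 'recommendations'],
};

function classifyQuery(query) {
  const signals = Object.entries(SIGNALS)
    // Whole-word, case-insensitive match so "top" does not fire on "laptop"
    .filter(([, terms]) => terms.some(t => new RegExp(`\\b${t}\\b`, 'i').test(query)))
    .map(([label]) => label);
  return { query, signals };
}

// Example: const tagged = fanoutQueries.map(classifyQuery);
```

A query can carry both labels or neither. Platform names such as Checkatrade will not match anything here unless you add them to the list.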

For the third-party targets, you then have two options for pulling the SERPs, and this is where the workflow forks depending on budget and precision needs.

Option A: DataForSEO (paid). Loop your fan-out queries through DataForSEO’s SERP API and pull the top 10 organic results for each. This gives you the actual Google or Bing SERPs, which is the closest match to what ChatGPT’s retrieval layer sees. Aggregate the results across all fan-outs, count how often each domain appears, and the pages showing up repeatedly are your priority outreach targets. n8n has no native DataForSEO node, but it is another straightforward HTTP Request.

Option B: an Anthropic API call with web search (cheaper). Send each fan-out query to Claude via the Anthropic API with the web search tool enabled, and ask it to return the top sources it finds. You will not get a pixel-perfect SERP, but you get an LLM’s-eye view of what surfaces for each query, including titles and URLs, at lower cost and with the summarisation built in. For identifying which third-party pages dominate a niche, this is usually enough.

To be clear on what each API is doing, because it is easy to muddle: the OpenAI call is mandatory and does the extraction, since only OpenAI can tell you what ChatGPT searches. The DataForSEO or Anthropic call is optional and does the analysis, telling you what those searches return. Either works for the second job. Neither works for the first.

Whichever route you take, the end output is a two-part action list: third-party pages that repeatedly surface across your fan-outs, where you want your brand added through outreach or PR, and first-party page types the model rewards, where the fix is publishing the service pages, case studies and pricing your own site is missing.
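
The aggregation step, counting how often each domain surfaces across your fan-outs, can be sketched as follows. The input shape (each query mapped to an array of result URLs) and the name countDomains are assumptions; adapt them to whatever your SERP source actually returns.

```javascript
// Sketch: count how many fan-out queries each domain appears for.
// countDomains and the input shape are illustrative assumptions.
function countDomains(resultsByQuery) {
  const counts = new Map();
  for (const urls of Object.values(resultsByQuery)) {
    // Count each domain once per query so one query with many hits does not dominate
    const domains = new Set(urls.map(u => new URL(u).hostname.replace(/^www\./, '')));
    for (const d of domains) counts.set(d, (counts.get(d) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([domain, queries]) => ({ domain, queries }))
    .sort((a, b) => b.queries - a.queries); // most widespread domains first
}
```

Counting per query rather than per URL is a deliberate choice here: a domain that appears for four different fan-outs is usually a stronger target than one that fills four slots for a single query.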

One limitation to be honest about

The API responses do not perfectly match what the ChatGPT web interface produces for the same prompt. Different system prompts apply in each environment, and they influence how the model searches. Independent testing has confirmed the fan-out queries can differ between the two.

For pattern analysis, keyword discovery and backlink prospecting, this does not matter much – the shape of the fan-outs (the angles, modifiers and query styles) is consistent. But if you need to know precisely what a ChatGPT web user’s session searched for, treat the API data as a close approximation rather than a transcript.

Why this beats the old method anyway

Losing the DevTools trick stings, but the API route is genuinely better for anyone doing this seriously:

The browser method was one conversation at a time, done by hand. The n8n version runs a whole prompt list unattended and can re-run monthly, which matters because fan-out patterns shift as models update. What ChatGPT searched for in spring may not be what it searches for now, and a scheduled workflow catches that drift automatically.

It also produces structured data instead of screenshots. Queries and citations land in a sheet where you can dedupe, count domain frequency and track changes over time – the raw material for an actual process rather than a party trick.
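
Tracking that drift can be as simple as comparing this month's fan-outs for a prompt against last month's. A minimal sketch, with diffFanouts as a hypothetical helper name:

```javascript
// Sketch: compare two runs of the same prompt and report what changed.
// diffFanouts is an illustrative name; matching is exact after trimming and lowercasing.
function diffFanouts(previous, current) {
  const norm = q => q.trim().toLowerCase();
  const prev = new Set(previous.map(norm));
  const curr = new Set(current.map(norm));
  return {
    added: current.filter(q => !prev.has(norm(q))),
    dropped: previous.filter(q => !curr.has(norm(q))),
  };
}
```

Because the match is on whole query strings, small wording changes between runs will show up as one query dropped and another added, so read the output for changes in angle rather than treating every difference as meaningful.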

Set it up once, point it at the 20 or 30 prompts that matter most to your business, and check the sheet monthly. Most of your competitors probably don’t know this data exists.
