Similar AI's New Pages Agent user guide

How the New Pages Agent turns your product feed into high-performing category pages - from label generation and topic creation through to filtering and publishing.

What the New Pages Agent does

The New Pages Agent takes a product feed and connects to Google Search Console (GSC) to incorporate current ranking keywords into the process. It generates structured attributes from your products, combines those into topics, then checks whether each topic is worth creating a page for.

After filtering out the duds, the remaining topics sit waiting for your approval. Once approved, they can typically be published straight to the site via the integration, complete with h1, meta-title and meta-description, plus the relevant products.

The pages can have internal links generated automatically, and products and links can typically stay up to date without manual intervention. For more detail on the publishing workflow, see the page creation platform page.

Key concepts

These terms appear throughout the New Pages Agent interface and documentation. Understanding them will make the rest of this guide straightforward.

Label

A meaningful part of a phrase. It can be one or more words. Examples: red, emerald green, audi, dresses.

Label group

What kind of thing a label is. For instance, red is a color, as is emerald green. audi is a brand. dresses is a category.

Topic

A set of labels. For instance, red Audi or emerald green dresses are both topics, each composed of two labels.

Grammar

A set of label groups. For instance, color + brand and color + category are both grammars. Grammars define which combinations of label groups the agent uses to build topics.
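
The relationship between labels, label groups, topics, and grammars can be sketched as a small data model. This is an illustrative sketch only: the class names `Label` and `Topic`, the `Grammar` alias, and the methods are assumptions made for this example, not the agent's actual internals.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Label:
    text: str   # one or more words, e.g. "emerald green"
    group: str  # the label group, e.g. "color"


# Assumption: a grammar is modeled as an ordered tuple of label group names.
Grammar = Tuple[str, ...]


@dataclass(frozen=True)
class Topic:
    labels: Tuple[Label, ...]

    def phrase(self) -> str:
        """Join the label texts into the topic phrase."""
        return " ".join(label.text for label in self.labels)

    def matches(self, grammar: Grammar) -> bool:
        """True if the topic's label groups equal the grammar, in order."""
        return tuple(label.group for label in self.labels) == grammar


emerald = Label("emerald green", "color")
dresses = Label("dresses", "category")
topic = Topic((emerald, dresses))

print(topic.phrase())                        # emerald green dresses
print(topic.matches(("color", "category")))  # True
```

Note that "emerald green dresses" contains three words but only two labels, so under this model it is a two-label topic.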

Positive label groups

The label groups that generated topics must draw from. A topic is only created if it contains at least one label from one of these groups.
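
As a rough illustration of this rule, the check below accepts a topic only if at least one of its labels belongs to a positive label group. The function name `has_positive_label` and the dictionary representation (label text mapped to label group) are hypothetical and chosen only for this sketch.

```python
from typing import Dict, Set


def has_positive_label(topic_labels: Dict[str, str], positive_groups: Set[str]) -> bool:
    """Return True if any label in the topic belongs to a positive label group.

    topic_labels maps label text to its label group (an assumed representation).
    """
    return any(group in positive_groups for group in topic_labels.values())


positive_groups = {"category"}

# "red dresses" includes a category label, so it qualifies.
print(has_positive_label({"red": "color", "dresses": "category"}, positive_groups))  # True

# "red audi" has no category label, so it would not be created.
print(has_positive_label({"red": "color", "audi": "brand"}, positive_groups))  # False
```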

Ngrams

Bigram (2-gram) or trigram (3-gram) means topics with two or three labels, respectively. A bigram topic like red dresses combines two labels; a trigram like red cotton dresses combines three.

Product count

How many relevant products or listings match a topic. The agent uses this to decide whether there is enough inventory to justify a dedicated category page.

Maximum Best Position

A configurable Google rank threshold for existing pages. If a page on your site already ranks better than this position for a keyword similar to the topic, the new topic is rejected to avoid duplication.
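
Because "ranks better" means a numerically lower position, the comparison is easy to get backwards. The sketch below assumes the similar-keyword rankings are available as hypothetical `(keyword, position)` rows and treats the threshold as exclusive; both the row shape and the exclusive comparison are assumptions, not documented behavior.

```python
from typing import Iterable, Optional, Tuple


def best_position(rows: Iterable[Tuple[str, float]]) -> Optional[float]:
    """Return the lowest (best) position across the rows, or None if there are none."""
    positions = [position for _, position in rows]
    return min(positions) if positions else None


def already_ranks(rows: Iterable[Tuple[str, float]], maximum_best_position: float) -> bool:
    """True if an existing page ranks better (lower) than the threshold."""
    best = best_position(rows)
    return best is not None and best < maximum_best_position


rows = [("red dresses", 3.2), ("red dress", 18.0)]
print(already_ranks(rows, maximum_best_position=10))                       # True
print(already_ranks([("red dresses", 14.0)], maximum_best_position=10))    # False
```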

Minimum number of products

The required count of relevant products a topic must match for the agent to consider creating a new category page.

Minimum demand

The required monthly search volume a topic must have in the local market. Topics below this threshold are rejected as low demand.

Topic Sieve

The automated set of checks that filters topics into rejection reasons. The exact number and labels of rejection reasons may vary, but they can include categories such as duplicated, already ranking, insufficient products, irrelevant products, or low demand.

How it works

The New Pages Agent follows a pipeline from raw product data to published pages. Here is each step in order.

1. Product feed ingestion

The agent reads your product feed and extracts structured data - titles, descriptions, categories, attributes, and prices. It also connects to GSC to incorporate your current ranking keywords into the content generation process.

2. Label generation

From the product data, the agent extracts structured attributes and adds semantic labels such as searchable brands, materials, and styles. For example, it might identify color labels (red, blue, emerald green), brand labels (Nike, Adidas), and category labels (trainers, jackets).
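
To make the idea concrete, here is a deliberately simple stand-in for label generation: matching product titles against a fixed vocabulary. This is a swapped-in technique for illustration, not how the agent extracts labels; the `VOCABULARY` mapping and `extract_labels` function are hypothetical, and the sketch ignores punctuation and spelling variants.

```python
from typing import Dict, List, Tuple

# Hypothetical vocabulary mapping label text to label group.
VOCABULARY: Dict[str, str] = {
    "emerald green": "color",
    "red": "color",
    "blue": "color",
    "nike": "brand",
    "adidas": "brand",
    "trainers": "category",
    "jackets": "category",
}


def extract_labels(title: str) -> List[Tuple[str, str]]:
    """Return (label, group) pairs found in a title, trying longer labels first."""
    # Pad with spaces so labels only match on whole-word boundaries.
    remaining = f" {title.lower()} "
    found: List[Tuple[str, str]] = []
    for label in sorted(VOCABULARY, key=len, reverse=True):
        needle = f" {label} "
        if needle in remaining:
            found.append((label, VOCABULARY[label]))
            # Remove the matched words so shorter labels cannot reuse them.
            remaining = remaining.replace(needle, " ")
    return found


print(extract_labels("Nike Emerald Green Trainers"))
```

Trying longer labels first keeps multi-word labels such as emerald green intact instead of splitting them into separate words.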

3. Topic creation

The Cleanup Agent combines keywords into topics and checks whether the site structure matches, flagging any mismatches. A grammar like color + category produces bigram topics such as red trainers or blue jackets. Trigram grammars like color + brand + category produce topics such as red Nike trainers.
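
Expanding a grammar into topics amounts to taking one label from each label group in the grammar. A minimal sketch, assuming labels are already grouped in a dictionary; the function name `build_topics` and the data layout are illustrative, not part of the product.

```python
from itertools import product
from typing import Dict, List, Tuple


def build_topics(labels_by_group: Dict[str, List[str]], grammar: Tuple[str, ...]) -> List[str]:
    """Combine one label from each label group in the grammar, in grammar order."""
    # A group with no labels yields an empty pool, so no topics are produced.
    pools = [labels_by_group.get(group, []) for group in grammar]
    return [" ".join(combo) for combo in product(*pools)]


labels_by_group = {
    "color": ["red", "blue"],
    "brand": ["Nike"],
    "category": ["trainers", "jackets"],
}

bigrams = build_topics(labels_by_group, ("color", "category"))
trigrams = build_topics(labels_by_group, ("color", "brand", "category"))

print(bigrams)   # ['red trainers', 'red jackets', 'blue trainers', 'blue jackets']
print(trigrams)  # ['red Nike trainers', 'red Nike jackets', 'blue Nike trainers', 'blue Nike jackets']
```

The number of candidate topics grows multiplicatively with the size of each label group, which is why the filtering step that follows matters.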

4. Topic Sieve filtering

Every topic passes through the Topic Sieve, a series of automated checks. A topic can be rejected if it:

  • Overlaps semantically with an existing page on the site, even when wording differs (caught by cannibalization detection)
  • Matches keywords the site already ranks for, or similar ones (already ranks)
  • Has too few matching products (insufficient products)
  • Has matching products that are not relevant to the topic (irrelevant products)
  • Has too little monthly search volume in the local market (low demand)
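
The checks above can be sketched as a function that returns the first rejection reason it finds. Everything here is an assumption made for illustration: the `TopicSignals` and `Thresholds` field names, the default threshold values, the relevance-share cutoff, and the order in which checks run are not taken from the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TopicSignals:
    """Hypothetical per-topic inputs gathered from the feed and GSC."""
    overlaps_existing_page: bool
    best_existing_position: Optional[float]  # None if no similar keyword ranks
    product_count: int
    relevant_share: float  # fraction of matched products judged relevant (0 to 1)
    monthly_searches: int


@dataclass
class Thresholds:
    """Illustrative defaults only; real values are configured per site."""
    maximum_best_position: float = 10.0
    minimum_products: int = 5
    minimum_relevant_share: float = 0.8  # assumption: not named in this guide
    minimum_demand: int = 50


def sieve(s: TopicSignals, t: Thresholds) -> Optional[str]:
    """Return the first rejection reason, or None if the topic passes every check."""
    if s.overlaps_existing_page:
        return "duplicated"
    if s.best_existing_position is not None and s.best_existing_position < t.maximum_best_position:
        return "already ranking"
    if s.product_count < t.minimum_products:
        return "insufficient products"
    if s.relevant_share < t.minimum_relevant_share:
        return "irrelevant products"
    if s.monthly_searches < t.minimum_demand:
        return "low demand"
    return None


thresholds = Thresholds()
print(sieve(TopicSignals(False, None, 12, 0.9, 200), thresholds))  # None
print(sieve(TopicSignals(False, 25.0, 2, 0.9, 200), thresholds))   # insufficient products
```

Returning a single reason keeps the output easy to review, at the cost of hiding any further checks a rejected topic would also have failed.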

5. Approval and publishing

Topics that survive the Topic Sieve are presented for approval. Once you approve a topic, the agent can typically publish a full category page via the integration (for example, through the Shopify API). Each page typically includes elements such as an h1, category blurb, meta-title, meta-description, and the relevant products. Internal links can be generated automatically, and both products and links can typically stay up to date without manual work.
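
The elements of a published page can be pictured as a simple record. The sketch below is only a template-style illustration of those fields: the `CategoryPage` class, the `draft_page` helper, and the placeholder copy are hypothetical, and it does not represent the agent's content generation or any storefront API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CategoryPage:
    h1: str
    blurb: str
    meta_title: str
    meta_description: str
    product_ids: List[str] = field(default_factory=list)


def draft_page(topic: str, product_ids: List[str], site_name: str) -> CategoryPage:
    """Fill the page fields from a topic phrase using fixed placeholder templates."""
    # Capitalize only the first character so brand casing such as "Nike" is kept.
    heading = topic[:1].upper() + topic[1:]
    return CategoryPage(
        h1=heading,
        blurb=f"Browse our range of {topic}.",
        meta_title=f"{heading} | {site_name}",
        meta_description=f"Shop {len(product_ids)} {topic} at {site_name}.",
        product_ids=list(product_ids),
    )


page = draft_page("red Nike trainers", ["p1", "p2"], "Example Store")
print(page.h1)          # Red Nike trainers
print(page.meta_title)  # Red Nike trainers | Example Store
```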

Pipeline summary

Product feed
  -> Label generation
    -> Topic creation (grammars + positive label groups)
      -> Topic Sieve (duplicate / already ranks / insufficient / irrelevant / low demand)
        -> Approval queue
          -> Published pages (h1, blurb, meta, products, internal links)

Frequently asked questions

What is the New Pages Agent and what does it do?

The New Pages Agent is Similar AI's tool that creates optimized category pages with schema markup, internal links, and auto-matched products from your product feed. It handles the entire pipeline from ingesting your product data and identifying candidate topics to filtering low-value ideas and publishing approved pages to your storefront.

How does the New Pages Agent decide which pages to create?

After ingesting your product feed, the agent identifies candidate topics by cross-referencing search demand with the product catalog and classifying shopping intent, then passes each candidate through the Topic Sieve, a sub-agent that runs five checks covering search demand, product sufficiency, existing traffic, page competition, and product match. Only topics that pass all checks move forward to content creation and review.

What is the Topic Sieve and why does it matter?

The Topic Sieve is a sub-agent that filters candidate topics before any content is written, saving time and protecting your site from thin or duplicate pages. It evaluates each topic against five checks (search demand, product sufficiency, existing traffic, page competition, and product match), so your crawl budget is spent on the pages most likely to drive organic revenue.

Can I review and approve pages before they go live?

Yes - the New Pages Agent supports a human-in-the-loop approval workflow where generated pages are queued for your review before publishing. You can review recommended and rejected topics, adjust thresholds, and override decisions, giving your team full editorial control while the agent handles the heavy lifting of topic discovery, optimization, and content generation.

What product feed formats does the New Pages Agent support?

The New Pages Agent works with your product feed to create optimized category pages with auto-matched products. For Shopify and BigCommerce stores, Similar AI can pull publicly available product data directly via API integrations, a setup that typically takes about 10 minutes. For other platforms, you connect a feed or an API token. During onboarding, Similar AI's team handles most of the setup so the agent can begin identifying topics and creating pages without extensive technical effort on your side.
