AI-powered platform for personalized product recommendation

Conversational model for smarter product discovery & purchase.

Project highlights

Industry: eCommerce, Retail
Client services: AI Product Development
Started in 2024
Location: United States
Team size: 4 members 
Duration: 6 months

About the client

A U.S.-based eCommerce platform in the outdoor/sporting-goods category.

The company helps consumers discover and purchase gear that fits their needs without spending hours scrolling through product catalogs.

Business challenges 

The client’s AI-powered catalog and conversation experience limited product discovery and conversion.

Pain points included:

  • Duplicate SKUs from color/size variants inflating catalog “noise”
  • Inconsistent vectorization across sources reducing semantic match accuracy
  • Weak categorization/labeling that constrained facet filtering and recommendations
  • Unclear scraping scope creating unstable upstream data
  • Limited dialog (few clarifying questions, flat result lists) that increased abandonment
  • Low explainability/determinism that eroded user trust in RAG-style answers

 

Goals set for Achievion

The engagement’s goal was to deliver explainable, high-relevance product recommendations through a conversational assistant that reasons over a clean, consistent catalog.

Success criteria:

  1. Consolidate variants to reduce duplication and sharpen embeddings
  2. Standardize attributes and categories to enable meaningful filters and comparisons
  3. Improve retrieval quality and dialog flow so users get to a confident choice faster
  4. Lay a scalable foundation that can support growth and potential white-label deployment without re-architecture

The north star was a shopping experience that feels guided, transparent, and predictable – where users understand why an item is recommended and can easily refine the path to purchase.

Solution 

We designed a normalization pipeline that consolidates product variants (e.g., color/size) under a canonical item, reducing duplication and clarifying attribution. This improves embedding quality by focusing signals on the product, not fragmenting them across near-identical entries. A consistent attribute schema (categories → subcategories → features) increases the precision of both retrieval and filtering.
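The consolidation step can be illustrated with a minimal sketch. It assumes each scraped record carries `brand`, `model`, `sku`, `color`, and `size` fields, and that brand plus model is enough to identify a product. These names and the grouping rule are hypothetical, chosen for illustration, and are not taken from the production pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class CanonicalItem:
    """One product, with its color/size variants attached."""
    title: str
    variants: list = field(default_factory=list)


# Assumption: these fields vary per SKU and do not define the product itself.
VARIANT_FIELDS = ("color", "size")


def canonical_key(record):
    """Group on normalized brand + model, ignoring variant-level fields."""
    return (record["brand"].strip().lower(), record["model"].strip().lower())


def consolidate(records):
    """Collapse variant-level records into canonical items, keeping input order."""
    items = {}
    for rec in records:
        key = canonical_key(rec)
        if key not in items:
            title = f"{rec['brand'].strip()} {rec['model'].strip()}"
            items[key] = CanonicalItem(title=title)
        variant = {"sku": rec["sku"]}
        for name in VARIANT_FIELDS:
            variant[name] = rec.get(name)
        items[key].variants.append(variant)
    return list(items.values())


# Hypothetical sample data: two variants of one pack, plus a second product.
records = [
    {"sku": "A1", "brand": "Acme", "model": "Trail 30", "color": "red", "size": "M"},
    {"sku": "A2", "brand": "acme ", "model": "trail 30", "color": "blue", "size": "L"},
    {"sku": "B1", "brand": "Acme", "model": "Summit 60", "color": "red", "size": "M"},
]
canonical_items = consolidate(records)
for item in canonical_items:
    print(item.title, len(item.variants))
```

With one canonical item per product, an embedding can be computed once per item rather than once per near-identical SKU.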

To raise recommendation confidence, we combined accurate vector search with a deterministic query-builder path for scenarios that require traceability (e.g., strict attribute filters, exclusions, inventory constraints). The assistant can favor semantic retrieval for discovery and pivot to rule/SQL filtering when users need precise control – maintaining performance and transparency in both modes.
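A rough sketch of the hybrid idea follows. It applies deterministic filters first and then ranks the surviving items by cosine similarity. The in-memory catalog, the field names (`weight_kg`, `brand`, `in_stock`), and the filter parameters are assumptions made for illustration; a real deployment would delegate these steps to a vector database and a SQL or rules engine.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def passes_filters(item, max_weight_kg=None, exclude_brands=(), in_stock_only=False):
    """Deterministic path: every rule is explicit, so each exclusion is traceable."""
    if max_weight_kg is not None and item["weight_kg"] > max_weight_kg:
        return False
    if item["brand"] in exclude_brands:
        return False
    if in_stock_only and not item["in_stock"]:
        return False
    return True


def hybrid_search(query_vec, catalog, top_k=3, **filters):
    """Filter with hard rules, then rank the remainder semantically."""
    candidates = [it for it in catalog if passes_filters(it, **filters)]
    ranked = sorted(
        candidates,
        key=lambda it: cosine(query_vec, it["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical catalog with toy two-dimensional embeddings.
catalog = [
    {"name": "Daypack", "embedding": [1.0, 0.0], "weight_kg": 0.8,
     "brand": "Acme", "in_stock": True},
    {"name": "Expedition", "embedding": [0.9, 0.1], "weight_kg": 2.5,
     "brand": "Acme", "in_stock": True},
    {"name": "Sling", "embedding": [0.0, 1.0], "weight_kg": 0.4,
     "brand": "Other", "in_stock": False},
]

discovery = hybrid_search([1.0, 0.0], catalog)
strict = hybrid_search([1.0, 0.0], catalog, max_weight_kg=1.0, in_stock_only=True)
print([it["name"] for it in discovery])
print([it["name"] for it in strict])
```

Calling `hybrid_search` with no filters behaves as pure semantic discovery, while passing strict constraints narrows the candidate set before any similarity scoring takes place.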

We also reworked the conversational flow to elicit clarifying preferences (intended use, conditions, fit, budget) early, then present results in cluster-aware groupings (e.g., “lightweight daypacks,” “ultra-durable expedition packs”) with short rationales (“recommended for X terrain; weight ≤ Y; fits torso length Z”). This turns a flat list into explainable sets, helping users understand trade-offs and refine quickly.
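The grouped presentation might look like the sketch below. It assumes each result already carries a `cluster` label plus the attributes cited in the rationale; how clusters are assigned is outside this sketch, and every field name here is hypothetical.

```python
def rationale(item):
    """Build a short plain-language explanation from visible attributes only."""
    return (
        f"recommended for {item['terrain']} terrain; "
        f"weight {item['weight_kg']} kg; "
        f"fits torso length {item['torso_cm']} cm"
    )


def group_with_rationales(items):
    """Turn a flat result list into labeled groups, each entry with a reason."""
    groups = {}
    for item in items:
        entry = {"name": item["name"], "why": rationale(item)}
        groups.setdefault(item["cluster"], []).append(entry)
    return groups


# Hypothetical search results with precomputed cluster labels.
results = [
    {"name": "Trail 30", "cluster": "lightweight daypacks",
     "terrain": "forest", "weight_kg": 0.8, "torso_cm": 45},
    {"name": "Summit 60", "cluster": "ultra-durable expedition packs",
     "terrain": "alpine", "weight_kg": 2.4, "torso_cm": 50},
    {"name": "Ridge 25", "cluster": "lightweight daypacks",
     "terrain": "desert", "weight_kg": 0.7, "torso_cm": 43},
]

grouped = group_with_rationales(results)
for label, entries in grouped.items():
    print(label)
    for entry in entries:
        print("  -", entry["name"], "|", entry["why"])
```

Because each rationale is assembled only from attributes the shopper can see, the explanation can be checked against the product listing.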

Trace logs, retrieval/ranking telemetry, and conversation analytics enable continuous tuning of prompts, filters, and vectors. Explanations are generated in plain language and tied to visible attributes so recommendations are auditable. The system’s componentized services (catalog integration, scraping, dialog manager, retrieval/rules engine) support iterative changes without destabilizing the whole stack.

Business outcome

The upgraded assistant delivers a cleaner, more trustworthy discovery experience, translating into faster decisions and higher session completion. Users see duplicate-free, well-explained recommendations that match their intent, reducing choice overload and improving confidence at the moment of selection. Clarifying prompts and cluster-based presentation help shoppers converge on the right product without back-and-forth searching, shortening the path from browse to buy.

Operationally, a structured, de-duplicated catalog lowers maintenance overhead and stabilizes upstream data flow. Normalized attributes and consistent categories improve search and filtering, reduce manual corrections, and make it easier to onboard new inventory or vendors without rework. The hybrid retrieval approach (semantic + deterministic) provides predictability when precision matters (e.g., strict specs, exclusions), while preserving the flexibility needed for discovery.

From a product and analytics standpoint, observability is built in. Telemetry on retrieval quality, ranking behavior, and conversation flow supports continuous tuning of prompts, filters, and embeddings. Explanations are generated in plain language and tied to visible attributes, improving auditability and trust for both users and internal stakeholders. The system now supports A/B experimentation on dialog and ranking strategies, enabling data-driven iteration rather than one-off changes.

Commercially, the platform is positioned for scalable growth and new monetization paths. The componentized architecture supports white-label distribution and partner integrations without full re-architecture, and the improved explainability opens opportunities for sponsored placements that remain transparent and user-centric. Together, these outcomes align user experience, catalog integrity, and engineering scalability – supporting measurable gains in engagement and conversion while keeping long-term cost and complexity in check.

Timeline 

Q4 2024
Discovery & Foundations
  • Stakeholder workshops to define success measures (relevance, conversion lift, time-to-choice) and guardrails (explainability, determinism, privacy)
  • Current-state audit of catalog sources, scraping coverage, duplication patterns, and attribute gaps; draft the canonical product/variant model
  • Technical spikes for vector store, rules/SQL path, and dialog manager; decide on service boundaries and integration points

Deliverables: Discovery report, canonical data model (v1), success metrics baseline, architecture blueprint & PoC plan

Q1 2025
Data Normalization & Catalog Integrity
  • Build normalization pipeline to consolidate variants (color/size) into canonical items; implement attribute schema and category hierarchy
  • Stand up ingestion & enrichment services; establish quality gates (missing attributes, invalid ranges, dedupe thresholds)
  • Backfill embeddings with improved signals; measure retrieval quality vs. pre-normalization baseline

Deliverables: Clean catalog (pilot categories), dedupe metrics, attribute completeness dashboard, embedding refresh (v1)

Q2 2025
Retrieval & Assistant MVP
  • Implement hybrid retrieval (semantic + deterministic filters) with clear fallbacks for strict/spec-driven queries
  • Design conversational flows to capture intent (use, conditions, fit, budget) and present cluster-based results with short rationales
  • Add explanation UX (plain-language “why recommended” tied to visible attributes) and quick refiners

Deliverables: Assistant MVP in staging, top-journey flows, explanation components, initial KPIs (relevance, abandonment, time-to-choice)

Team

Product Manager
AI Solutions Architect
LLM/NLP Engineer
Data Engineer

Tech Stack

Technologies:

LLM orchestration
Vector database for semantic retrieval
Rule/SQL filters for deterministic querying

Integrations:

Product catalog
Scraping pipeline
Dialog manager
Observability/Trace logs

Deployment:

Service-oriented components
