Spelunker: A Hybrid Approach to Smarter Item Similarity Search
In an age where digital content and e-commerce platforms continue to expand at an unprecedented pace, effective item discovery has become a critical challenge. Traditional keyword-based search methods often fall short when users express their needs in more nuanced or subjective ways. A query such as “a high-quality but affordable French red wine suitable for a special occasion,” for example, cannot easily be captured through rigid keyword matching.
To address this gap, we developed Spelunker, a proof-of-concept system designed to make item similarity search more natural, accurate, and transparent by combining the strengths of Large Language Models (LLMs) with a custom K-Nearest Neighbors (KNN) algorithm.
What Spelunker Does
Spelunker allows users to describe what they are looking for in their own words. Instead of requiring exact keywords, the system interprets free-form text queries, extracts key features, and identifies the most relevant items from a dataset.
The innovation lies in the two-stage process:
- Understanding intent: An LLM (Gemini 2.0 Flash) transforms the unstructured text into structured data by identifying attributes such as country, variety, or price.
- Retrieving results: A custom KNN algorithm, using a distance measure suited to each variable type (Euclidean distance for numerical values, cosine distance between embeddings for categorical variables, and match/no-match for booleans), retrieves the closest matches from the dataset.
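The per-type distance logic of the second stage can be sketched as follows. This is a minimal illustration, not the paper's implementation: the attribute names, weights, price scale, and toy embedding table are all made up for the example, and the structured query dict stands in for the LLM's output.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def item_distance(query, item, embeddings, weights, scales):
    """Mixed-metric distance: one term per query attribute, summed with weights.

    Booleans -> match/no-match, numbers -> scaled absolute difference
    (Euclidean in one dimension), strings -> cosine distance between
    precomputed category embeddings.
    """
    total = 0.0
    for attr, qval in query.items():
        ival = item[attr]
        w = weights.get(attr, 1.0)
        if isinstance(qval, bool):
            total += w * (0.0 if qval == ival else 1.0)
        elif isinstance(qval, (int, float)):
            total += w * abs(qval - ival) / scales.get(attr, 1.0)
        else:
            total += w * cosine_distance(embeddings[qval], embeddings[ival])
    return total

def knn(query, items, embeddings, k, weights=None, scales=None):
    """Return the k items closest to the structured query."""
    weights, scales = weights or {}, scales or {}
    return sorted(
        items,
        key=lambda it: item_distance(query, it, embeddings, weights, scales),
    )[:k]
```

The `scales` dict normalizes each numeric attribute (e.g. by its range) so that a price gap and an embedding distance contribute on comparable scales before weighting.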
This hybrid approach offers two advantages over purely vector-based retrieval:
- Interpretability: The system can explain why a particular item was recommended (e.g., “Recommended because it is from France and scored 95 points”).
- Data flexibility: By applying different distance metrics to different variable types, the system preserves the inherent structure of heterogeneous data instead of reducing all attributes into a single dense vector.
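Because the distance is computed attribute by attribute, a justification can be assembled from whichever attributes matched. The template below is an illustrative stand-in for this idea, not the paper's actual explanation logic; the tolerance rule and wording are assumptions.

```python
def explain_match(query, item, tol=0.1):
    """Build a human-readable justification from attribute-level matches.

    Numeric attributes count as a match when within a relative tolerance;
    all other attributes require exact equality. Illustrative only.
    """
    reasons = []
    for attr, qval in query.items():
        ival = item.get(attr)
        if isinstance(qval, (int, float)) and not isinstance(qval, bool):
            if ival is not None and abs(qval - ival) <= tol * max(abs(qval), 1):
                reasons.append(f"its {attr} ({ival}) is close to the requested {qval}")
        elif qval == ival:
            reasons.append(f"its {attr} is {ival}")
    if not reasons:
        return "No strong attribute matches."
    return "Recommended because " + " and ".join(reasons) + "."
```

Per-attribute terms make this kind of explanation possible; a single dense vector, by contrast, offers no attribute-level signal to cite.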
Testing the System
To evaluate the system, we used a publicly available dataset of 500 wine reviews.
The LLM interpreted queries with very high accuracy, achieving an F1-score of 0.9779 and strong orthographic fidelity (a Jaro similarity of 0.9321).
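Both metrics follow standard definitions and can be sketched directly: F1 over the predicted versus gold attribute-value pairs, and Jaro similarity for character-level fidelity of extracted strings. The sample data in the test is invented; the paper's exact evaluation protocol may differ.

```python
def extraction_f1(pred, gold):
    """F1 over attribute-value pairs of predicted vs. gold structured queries."""
    pred_pairs, gold_pairs = set(pred.items()), set(gold.items())
    tp = len(pred_pairs & gold_pairs)
    precision = tp / len(pred_pairs) if pred_pairs else 0.0
    recall = tp / len(gold_pairs) if gold_pairs else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def jaro(s1, s2):
    """Jaro similarity: matching characters within a window, minus transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    m1, m2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):  # count characters matching within the window
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not m2[j] and s2[j] == c:
                m1[i] = m2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    t, k = 0, 0  # count transpositions among matched characters
    for i in range(len1):
        if m1[i]:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3
```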
When combined with the KNN algorithm, the system showed a statistically significant improvement in recall while maintaining precision, meaning it was able to retrieve more relevant items without introducing irrelevant ones.
A user interface was developed to allow queries to be entered naturally without any need for technical syntax.
Key Findings
The results confirm that Spelunker can effectively bridge the gap between unstructured human intent and structured machine-readable queries. The system demonstrates that LLMs can serve as reliable query interpreters, producing structured representations from free text with high accuracy.
The addition of LLM-based re-ranking significantly improves retrieval recall, showing that language models can refine and enrich results beyond what traditional algorithms capture. Most importantly, the ability to provide explicit, human-readable justifications for results enhances transparency and user trust, which is often missing in black-box AI systems.
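The re-ranking step reduces to re-sorting the KNN candidates by a model-assigned relevance score. In the sketch below, a simple keyword-overlap function stands in for the LLM scoring call (the paper uses Gemini 2.0 Flash); the function names and candidate fields are assumptions for illustration.

```python
def rerank(query_text, candidates, score_fn, top_k=None):
    """Re-order retrieved candidates by relevance score, highest first."""
    ranked = sorted(candidates, key=lambda c: score_fn(query_text, c), reverse=True)
    return ranked[:top_k] if top_k else ranked

def keyword_overlap_score(query_text, item):
    """Placeholder for an LLM relevance judgment: count shared query words."""
    query_words = set(query_text.lower().split())
    item_words = set(item["description"].lower().split())
    return len(query_words & item_words)
```

In a real deployment the placeholder would be replaced by a call that asks the LLM to score or order the shortlist, which is where the recall gains reported above come from.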
Limitations
Like any proof-of-concept, Spelunker has limitations that must be addressed before deployment in real-world scenarios:
- Latency: Current average query time (~18 seconds) is too slow for real-time applications.
- Scalability: The system was tested on 500 items, whereas commercial use cases often involve millions.
- Evaluation scope: Test sets were small and manually curated, limiting the ability to generalize performance claims.
- Static configuration: The system does not yet dynamically adjust weights or adapt to new data types.
Conclusion
Spelunker demonstrates the potential of combining semantic understanding from LLMs with structured retrieval through custom algorithms. The system shows that it is possible to move beyond rigid keyword searches and opaque dense retrieval models toward solutions that are both powerful and interpretable.
By enabling natural language queries, accurate retrieval, and transparent explanations, Spelunker takes a step toward making search systems more aligned with how humans think and express their needs.
📖 Read the full article here: https://arxiv.org/abs/2509.21323