Lucene/Solr Revolution 2017 has ended
Friday, September 15 • 11:20am - 12:00pm
Search LIKE %SQL%
Sometimes customers ask for substring matching, as in LIKE %SQL%, completely ignoring the idea of keyword search.

The same challenge comes up in chemical corpora and in bioinformatics.

Searching LIKE %SQL% is surprisingly hard in search engines. During the session, we'll look at the data structures behind the Lucene index and discuss what makes such a search so heavy. Then we'll describe common but inefficient techniques such as edge N-gramming and term reversal. Finally, we'll look at how to address the problem with built-in algorithms, keeping customization to a minimum.
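
As a rough illustration of why the leading wildcard is the expensive part, here is a minimal Python sketch. It is a toy model, not Lucene code: a sorted Python list stands in for a sorted term dictionary, and the names edge_ngrams and prefix_range are hypothetical helpers invented for this example. It shows that a prefix query can seek directly into sorted terms, that reversing terms turns a suffix query into a prefix query, and that an infix query like %sql% gets no help from either trick and falls back to scanning every term.

```python
from bisect import bisect_left


def edge_ngrams(term, min_len=1):
    """Return every prefix of `term` that is at least `min_len` characters long."""
    return [term[:i] for i in range(min_len, len(term) + 1)]


def prefix_range(sorted_terms, prefix):
    """Return all terms starting with `prefix`.

    Binary search finds the first candidate; we then walk forward
    only while terms still share the prefix.
    """
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


# Toy "term dictionary" (an assumption: a plain sorted list).
terms = sorted(["mysql", "nosql", "sqlite", "solr", "lucene"])

# Prefix query sql*: one seek plus a short walk.
prefix_matches = prefix_range(terms, "sql")

# Suffix query *sql: index reversed terms, then run a prefix query
# on the reversed pattern and flip the results back.
reversed_terms = sorted(t[::-1] for t in terms)
suffix_matches = [t[::-1] for t in prefix_range(reversed_terms, "sql"[::-1])]

# Infix query *sql*: neither ordering helps, so every term is inspected.
infix_matches = [t for t in terms if "sql" in t]

# Edge n-gramming trades index size for speed: each term is stored
# under all of its prefixes, so a prefix query becomes an exact lookup.
sqlite_grams = edge_ngrams("sqlite", min_len=2)

print(prefix_matches)
print(suffix_matches)
print(infix_matches)
print(sqlite_grams)
```

In this sketch the infix case touches all five terms; on a real index with millions of unique terms, that full scan is the cost the abstract refers to. Edge n-grams and reversal each cover only one anchored end of the pattern, which is why they are workarounds rather than a general answer.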

Note: this talk is not an introduction to suffix arrays, and it has nothing to do with the recent Solr SQL functionality.

Mikhail Khludnev

Chief Engineer in Search, EPAM
Mikhail is a chief software engineer in the EPAM Search Competency Center, where he helps customers with indices of a terabyte and larger. He worked in eCommerce search for many years, mostly focusing on handling relations in indices with joins and aiming for great relevancy with concept...

Friday September 15, 2017 11:20am - 12:00pm
South Seas D