Lucene/Solr Revolution 2017 has ended
Friday, September 15 • 2:50pm - 3:30pm
An Intelligent, Personalized Information Retrieval Environment
Current enterprise search engines return prioritized results to users based on the engines’ internal ranking algorithms. The unique attributes of the user and documents are not taken into account. To improve users’ information retrieval experience, we are creating an environment that allows users to retrieve information that best matches their personal, time-sensitive needs.
To achieve this level of personalization, we analyze documents published by members of the workforce using clustering and classification algorithms. By combining users' past information retrieval behavior with document metadata generated by our analytic techniques, we can build predictive models. These models predict the information needed by specific groups of users and recommend appropriate content. We use state-of-the-art technologies, including Spark machine learning in a Hadoop environment and convolutional neural networks, a deep learning architecture, to extract useful features from a large corpus of unstructured data. In addition, we developed and improved several machine learning algorithms for clustering, classification, and auto-labeling.
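
As a rough illustration of combining past click behavior with document metadata, the sketch below counts which topics each user group has clicked on and uses those counts to reorder an engine's result list. It is a minimal sketch under stated assumptions, not the method presented in the talk: the click log, the topic labels (standing in for clustering or classification output), and the function names `build_group_profiles` and `rerank` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (user_group, document_id) pairs.
clicks = [
    ("engineering", "doc1"),
    ("engineering", "doc2"),
    ("engineering", "doc1"),
    ("hr", "doc3"),
]

# Hypothetical topic labels, standing in for clustering/classification output.
doc_topics = {
    "doc1": "materials",
    "doc2": "materials",
    "doc3": "benefits",
    "doc4": "simulation",
}


def build_group_profiles(clicks, doc_topics):
    """Count, per user group, how often each topic was clicked."""
    profiles = defaultdict(Counter)
    for group, doc_id in clicks:
        topic = doc_topics.get(doc_id)
        if topic is not None:  # skip clicks on unlabeled documents
            profiles[group][topic] += 1
    return profiles


def rerank(results, group, profiles, doc_topics):
    """Reorder results so topics the group clicks most often come first.

    sorted() is stable, so documents with equal preference keep the
    order the search engine originally returned.
    """
    prefs = profiles.get(group, Counter())
    return sorted(results, key=lambda doc_id: -prefs[doc_topics.get(doc_id)])


profiles = build_group_profiles(clicks, doc_topics)
engine_results = ["doc3", "doc4", "doc1"]
print(rerank(engine_results, "engineering", profiles, doc_topics))
```

A production system would replace the raw counts with a trained predictive model; the counts here only show how behavior and metadata can meet in one ranking signal.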

Building profiles of users' information retrieval activity required the development of an extensive query and click tracking facility. To achieve a highly integrated information retrieval environment, we are replacing our homegrown query and click tracking database with Lucidworks Fusion's signal capabilities. We will discuss how we approached the task of migrating from an internally developed logging system to the Fusion platform.


John Herzer

Enterprise Search Project Manager, Sandia National Laboratories
John Herzer leads the enterprise search analytics effort at Sandia National Laboratories. He was instrumental in achieving the migration to open source search technology at Sandia and is guiding the effort to incorporate machine learning techniques to improve search results. Before...

Pengchu Zhang

Computer Science Researcher & Developer, Sandia National Laboratories
Pengchu Zhang has more than ten years of experience developing methods to improve enterprise information findability, mostly for unstructured data, using various machine learning technologies. Recently, he has focused on creating an information retrieval environment in organizations...

Friday September 15, 2017 2:50pm - 3:30pm
Banyan AB