Lucene/Solr Revolution 2017 has ended
Friday, September 15 • 11:20am - 12:00pm
Faster Data Analytics with Apache Spark using Apache Solr
Apache Spark is a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. Spark SQL allows users to execute relational queries in Spark with distributed in-memory computations. Though Spark gives us faster in-memory computations, Solr is blazing fast for some analytic queries. In this talk, we will take a deep dive into how to optimize the SQL queries from Spark to Solr by plugging into the Spark Logical Planner using push-down strategies. The key takeaways from the talk will be:

How to perform Spark SQL queries with Apache Solr?
What happens inside a Spark SQL query?
How to plug into the Spark Logical Planner?
What type of push-down strategies are optimal with Solr?
Examples of push-down strategies

Kiran Chitturi

Data Engineer, Lucidworks
Kiran Chitturi is a software developer at Lucidworks. He works on Lucidworks' enterprise product, Fusion, and currently leads the development of spark-solr (https://github.com/LucidWorks/spark-solr). He is part of the Smart Data team at Lucidworks, working on Analytics features for...

Friday September 15, 2017 11:20am - 12:00pm
South Seas A