Lucene/Solr Revolution 2017
Friday, September 15 • 11:20am - 12:00pm
Indexing Videos in Solr
FindLectures.com is a discovery engine for tech talks, historic speeches, and academic lectures. The site rates audio and video content for quality, showing different recommended talks each day on a variety of topics.

FindLectures.com crawls conference sites to get talk metadata, such as speaker names and bios, descriptions, and the date a video was recorded. Often these attributes are sparsely populated or scattered across multiple websites. Additional attributes are inferred from audio and video content, but require more sophisticated data extraction to be useful in a text-oriented search engine like Solr.

This talk will discuss lessons learned from crawling historical videos, demonstrate information extraction with machine learning, and show how to map real-world problems to search engine functionality.

Gary Sieling

Software Architect, Wingspan Technology
Gary Sieling is a Software Architect at Wingspan Technology in Blue Bell, PA, with interests in database technologies and software engineering practices. He is involved in curating talks for a company lunch-and-learn program and on the organizing committee for a Philadelphia area...

Friday September 15, 2017 11:20am - 12:00pm PDT
South Seas C