Saturday, January 16, 2016

Search in Django with Haystack using Solr or Elastic Search

Let's say you want to provide search in your Django application, whether that is search over a specific model or file search over media and data files uploaded by users.

Here are the technical solutions for it -

Haystack - Modular search for Django
It provides a single API for querying on top of any of the following search engines - Solr, ElasticSearch, Xapian, Whoosh.
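
As a rough sketch of what "modular" means in practice: Haystack picks its backend from the HAYSTACK_CONNECTIONS setting in settings.py, so switching engines is mostly a configuration change. The engine paths below are the ones documented by django-haystack (Xapian needs the separate xapian-haystack package); the helper function, URLs and index name are my own placeholders for illustration, not part of Haystack.

```python
# Engine class paths as documented by django-haystack.
HAYSTACK_ENGINES = {
    "solr": "haystack.backends.solr_backend.SolrEngine",
    "elasticsearch": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
    "whoosh": "haystack.backends.whoosh_backend.WhooshEngine",
    "xapian": "xapian_backend.XapianEngine",  # provided by xapian-haystack
}


def haystack_connections(backend, **options):
    """Hypothetical helper: build a HAYSTACK_CONNECTIONS dict for settings.py."""
    if backend not in HAYSTACK_ENGINES:
        raise ValueError("Unknown search backend: %s" % backend)
    connection = {"ENGINE": HAYSTACK_ENGINES[backend]}
    # Backend-specific keys, e.g. URL and INDEX_NAME for ElasticSearch,
    # URL for Solr, PATH for Whoosh and Xapian.
    connection.update(options)
    return {"default": connection}


# In settings.py (placeholder URL and index name, adjust to your setup):
HAYSTACK_CONNECTIONS = haystack_connections(
    "elasticsearch",
    URL="http://127.0.0.1:9200/",
    INDEX_NAME="haystack",
)
```

Moving to Solr would then be a matter of calling haystack_connections("solr", URL="http://127.0.0.1:8983/solr") and rebuilding the index, assuming your search indexes do not rely on engine-specific features.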

Solr and ElasticSearch are both built on top of Apache Lucene, a powerful search library. Both are free and released under the Apache License 2.0.

Interesting presentation on Solr vs ElasticSearch

ElasticSearch is distributed by design, whereas some functions in Solr do not allow distributed execution. ES has easy cloud support through third parties and is easy to scale by adding or removing nodes. ES is also real-time.

Solr and ElasticSearch both provide an admin page; in ES it is called ElasticSearch-Head. ES also provides the concept of a Gateway, which allows the index to be recovered if the system crashes.

Use ES if - your index is big and needs real-time updates, you have several indices, you have multi-tenancy requirements, or you want to save administrative effort and cost.
Don’t use ES if - your company is relatively new and already using Solr, you don't need real-time search indexing, or your indices are relatively small.

Another utility besides ElasticSearch-Head is ElasticSearch-bigdesk, which provides analytics and charts.

Solr - There are some concerns when real-time index updates and search queries are performed together. For plain vanilla search, Solr outperforms ES.

You can find more comparisons here.

Solr is older than ElasticSearch, so it has a bigger community and more help available online. At the same time, ElasticSearch was built to overcome the scaling limitations of Solr. ES is stable, though Solr is more mature. In terms of scalability, ElasticSearch is easier to scale than Solr, but according to the documentation that limitation goes away with Solr 4.0.

Sematext provides support for both Solr and ElasticSearch; you can find a good overview and comparisons across various categories in their series of blog posts.

The competition has now been joined by Amazon CloudSearch, which seems to be widely used by applications hosted on AWS. Here is a comparison between CloudSearch and Solr. There is no clear winner! Make your choice based on the requirements of your environment, and try to keep it simple unless more is really required.
