Saturday, January 16, 2016

Search in Django with Haystack using Solr or Elastic Search

Let's say you want to provide search in your Django application, either over a specific model or over the media and data files uploaded by users.

Here are the technical options for it -

Haystack - Modular search for Django
It provides a single query API on top of any of the following search engines - Solr, ElasticSearch, Xapian, Whoosh.
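As a rough sketch of what switching engines looks like, Haystack reads its backend from the HAYSTACK_CONNECTIONS setting in settings.py. The helper name haystack_connections, the URLs and the index name below are assumptions for a local setup, not values from this post; the engine paths are the ones documented for Haystack 2.x.

```python
# Sketch for settings.py: pick the Haystack backend in one place.
# The URLs and INDEX_NAME are placeholders for a local install (assumption).

def haystack_connections(backend):
    """Return a HAYSTACK_CONNECTIONS dict for 'elasticsearch' or 'solr'."""
    if backend == "elasticsearch":
        return {
            "default": {
                "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
                "URL": "http://127.0.0.1:9200/",
                "INDEX_NAME": "haystack",
            }
        }
    if backend == "solr":
        return {
            "default": {
                "ENGINE": "haystack.backends.solr_backend.SolrEngine",
                "URL": "http://127.0.0.1:8983/solr",
            }
        }
    raise ValueError("unsupported backend: %r" % (backend,))


# Swapping the search engine is then a one-word change.
HAYSTACK_CONNECTIONS = haystack_connections("elasticsearch")
```

The rest of the application code (search indexes, queries) should not need to change when the backend does, which is the main reason to go through Haystack.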

Solr and ElasticSearch are both built on top of Apache Lucene, a powerful search library. Both are free and released under the Apache License 2.0.

Interesting presentation on Solr vs ElasticSearch

ElasticSearch is distributed by design, whereas some functions in Solr do not allow distributed execution. ES has easy cloud support through third parties and is easy to scale by adding or removing nodes. ES is both real time and distributed.

Solr and ElasticSearch both provide an admin page; in ES it is called ElasticSearch-Head. ES also provides the concept of a Gateway, which allows the index to be recovered if the system crashes.

Use ES if - the index is big and real time, you have several indices, you have a multi-tenancy requirement, or you want to save administrative effort and cost.
Don't use ES if - your company is relatively new and already using Solr, no real-time search indexing is required, or the indices are relatively small.

Besides ElasticSearch-Head, there is another utility, ElasticSearch-bigdesk, which provides analytics and charts.

With Solr, there are some concerns when real-time index updates and search queries are performed at the same time. For plain vanilla search, Solr outperforms ES.

You can find more comparisons here.

Solr is older than ElasticSearch, so it has a bigger community and more help available online. At the same time, ElasticSearch was built to overcome the scaling limitations of Solr. ES is stable, though Solr is more mature. In terms of scalability, ElasticSearch is easier to scale than Solr, but according to the documentation that limitation goes away with Solr 4.0.

Sematext provides support for both Solr and ElasticSearch; you can find a good overview and comparisons across various categories in their series of blog posts.

The competition has now been joined by Amazon CloudSearch, and applications hosted on AWS seem to use it widely. Here is a comparison between CloudSearch and Solr. There is no clear winner! Make the choice based on the requirements of your environment. Try to keep it simple unless more is really required.

Wednesday, January 13, 2016

Python 32-bit or 64-bit?

Recently I moved my application from CentOS 5 to CentOS 7, which had 64-bit Python installed. It ended up crashing my Django application because some of the packages I was using had been compiled for 32-bit Python and weren't compatible.

The first thing to check is whether the Python you are running is 32-bit or 64-bit. Here is a simple way to check it -

$ python
Python 2.7.5 (default, Nov 20 2015, 02:00:19) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> print struct.calcsize("P") * 8
64

That means it's 64-bit!
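The same check can be kept as a small script that runs on both Python 2 and Python 3, for example in a deployment check. This is only a sketch; the function names python_bits and describe_interpreter are made up for illustration.

```python
import platform
import struct
import sys


def python_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_interpreter():
    """One-line summary that is handy to log when a deployment starts."""
    return "%d-bit Python %s on %s" % (
        python_bits(),
        platform.python_version(),
        platform.machine(),
    )


if __name__ == "__main__":
    print(describe_interpreter())
    # Cross-check: sys.maxsize only exceeds 2**32 on a 64-bit build.
    print("64-bit build: %s" % (sys.maxsize > 2 ** 32))
```

Note that this reports the bitness of the interpreter, not of the operating system; a 32-bit Python can run on a 64-bit machine.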

Friday, January 8, 2016

Redis server installation on a WebFaction shared server

I came across Redis while using Sentry. With the latest version, 7.7.4, you need Redis to run Sentry.

Here are the steps to install a Redis server on a WebFaction shared server -

First download the Redis release into the home directory of your account.
$ wget "http://download.redis.io/releases/redis-3.0.6.tar.gz"

Extract it and remove the version number from the directory name.
$ tar -xzf redis-3.0.6.tar.gz
$ mv redis-3.0.6 redis
(to keep things clean, remove redis-3.0.6.tar.gz)

Run the build from inside the extracted directory
$ cd ~/redis
$ make
Run the tests to verify that the build is correct
$ make test

Now go to the WebFaction control panel and create a custom app, so that we get a port number to use in various places in the configuration.

It will give you the new port and create a directory, named after the app, inside the webapps directory of your account.

In this example, the app name is redis_server and its custom port is 19957.

Now copy the Redis binaries and redis.conf from the extracted directory to the webapp directory.

$ cd ~/webapps/redis_server/
$ cp ~/redis/src/redis-server .
$ cp ~/redis/src/redis-cli .
$ cp ~/redis/redis.conf .

Update three items in the redis.conf file.

daemonize no -> daemonize yes
pidfile /var/run/redis.pid -> pidfile /home//webapps/redis_server/redis.pid
port 6379 -> port 19957
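
If you would rather not edit the file by hand, the three changes can be scripted. The following is a minimal sketch, assuming the stock redis.conf layout of one "key value" directive per line; the function name update_redis_conf is hypothetical, and the pidfile path you pass in should be the one for your own account.

```python
def update_redis_conf(text, port, pidfile):
    """Return redis.conf text with daemonize, pidfile and port overridden.

    Commented-out lines (starting with '#') and all other directives
    are left untouched.
    """
    overrides = {
        "daemonize": "yes",
        "pidfile": pidfile,
        "port": str(port),
    }
    updated = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        key = parts[0] if parts else ""
        if key in overrides:
            updated.append("%s %s" % (key, overrides[key]))
        else:
            updated.append(line)
    return "\n".join(updated) + "\n"
```

To apply it, read redis.conf, pass its contents through update_redis_conf together with 19957 and your pidfile path, and write the result back to the same file.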
Now run it manually once to verify (with daemonize yes it will run in the background).
./redis-server redis.conf

Once it's running, you can test whether it is working or not from the CLI
./redis-cli -p 19957

It should prompt -
127.0.0.1:19957>

Otherwise it will say-
not connected >

You can get out of the CLI with Ctrl + d
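
The same check can be done from Python without installing a Redis client, which is handy in a monitoring script. This is a minimal sketch, assuming the server listens on 127.0.0.1:19957 as configured above and has no password set; the function name redis_ping is made up for illustration. It sends PING as a plain inline command over a socket and looks for the +PONG reply.

```python
import socket


def redis_ping(host="127.0.0.1", port=19957, timeout=2.0):
    """Return True if something on host:port answers PING with +PONG."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Nothing listening, or the connection attempt timed out.
        return False
    try:
        sock.sendall(b"PING\r\n")  # inline command form accepted by Redis
        reply = sock.recv(64)
    except (socket.error, socket.timeout):
        return False
    finally:
        sock.close()
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("redis is up" if redis_ping() else "redis is not reachable")
```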

You can automate these commands by creating a Makefile

vi Makefile
client cli:
	./redis-cli -p 19957
start restart:
	./redis-server redis.conf
stop:
	cat redis.pid | xargs kill

In order to start now, you can use
make start

to stop
make stop

You can also find the Redis process manually
ps -u $USER -o pid,command | grep redis

and kill it manually

Would still prefer the clean way of

Wednesday, January 6, 2016

Sentry - That page number is less than 1 [error]

While running Sentry, clicking on your project throws an Internal Server Error with some random code. If you look at the log, it shows something like the following. The only thing that stands out is "That page number is less than 1".

  File "/home/User/.virtualenvs/sentry/lib/python2.7/site-packages/sentry-6.4.4-py2.7.egg/sentry/templatetags/sentry_helpers.py", line 217, in paginator
    result = paginate_func(request, queryset_or_list, per_page, endless=True)
  File "/home/User/.virtualenvs/sentry/lib/python2.7/site-packages/paging/helpers.py", line 24, in paginate
    'paginator': paginator.get_context(page),
  File "/home/User/.virtualenvs/sentry/lib/python2.7/site-packages/paging/paginators.py", line 96, in get_context
    'previous_page': paginator.previous_page_number(),
  File "/home/User/.virtualenvs/sentry/lib/python2.7/site-packages/Django-1.5.8-py2.7.egg/django/core/paginator.py", line 143, in previous_page_number
    return self.paginator.validate_number(self.number - 1)
  File "/home/User/.virtualenvs/sentry/lib/python2.7/site-packages/Django-1.5.8-py2.7.egg/django/core/paginator.py", line 30, in validate_number
    raise EmptyPage('That page number is less than 1')

EmptyPage: That page number is less than 1

This error is caused by the django-paging version: you probably have a version lower than 0.2.5, and you need >=0.2.5.
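
To confirm which version is installed inside the Sentry virtualenv, something like the following can be used. It is only a sketch: the helper names are made up, it assumes purely numeric version strings such as 0.2.4, and it relies on pkg_resources from setuptools being available in that environment.

```python
def version_tuple(version):
    """Turn '0.2.5' into (0, 2, 5); assumes purely numeric parts."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, minimum="0.2.5"):
    """True if the installed version is older than the required minimum."""
    return version_tuple(installed) < version_tuple(minimum)


def installed_version(package="django-paging"):
    """Installed version of the package, or None if it cannot be determined."""
    try:
        import pkg_resources  # ships with setuptools (assumed available)
        return pkg_resources.get_distribution(package).version
    except Exception:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("django-paging is not installed in this environment")
    elif needs_upgrade(found):
        print("django-paging %s is too old, upgrade to 0.2.5" % found)
    else:
        print("django-paging %s is fine" % found)
```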

Run the following command -

pip install django-paging==0.2.5

And restart your sentry server (web).