Showing posts with label python. Show all posts

Wednesday, January 13, 2016

Python 32bit or 64bit?

Recently I moved my application from CentOS 5 to CentOS 7, which had 64bit Python installed. It ended up crashing my Django application because some of the packages I was using had been compiled against 32bit Python and weren't compatible.

The first thing to check is whether the Python you are running is 32 bit or 64 bit. Here is a simple way to check it -

$ python
Python 2.7.5 (default, Nov 20 2015, 02:00:19) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> print struct.calcsize("P") * 8
64

That means it's 64bit!
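If you want the same check inside a script, here is a minimal sketch. The function name python_bits is my own, and the sys.maxsize comparison is a second, independent check that I am adding here; neither comes from the session above.

```python
import struct
import sys


def python_bits():
    """Return the pointer size of the running interpreter in bits."""
    # "P" is the struct format code for a C pointer (void *)
    return struct.calcsize("P") * 8


bits = python_bits()
# sys.maxsize only exceeds 2**32 on a 64bit build
is_64bit = sys.maxsize > 2**32
print("%dbit Python" % bits)
```

Both checks describe the interpreter, not the operating system, so a 32bit Python on a 64bit machine still reports 32.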

Sunday, December 28, 2014

Convert .dta file to .csv file

.dta is the Stata data file format. Often you want to view the contents with more widely known tools like Google Sheets or Excel, or in an open data format, so you need to convert the file to CSV. Follow the steps below for a quick and easy conversion.

Install pandas if you don't already have it.


pip install pandas

Navigate to the folder where you have stored the .dta file and run the following to get a CSV out of it -

>>> import pandas as pd
>>> data = pd.io.stata.read_stata('sample.dta')
>>> data.to_csv('changed_to_csv.csv')

And you will get a quick CSV conversion of the .dta file.

Tuesday, June 17, 2014

Setup New Relic with Webfaction Django App (Python setup)


New Relic is an awesome application performance management tool. You can set up health checks for your application in a few easy steps -

Create a free account with New Relic. Below are the steps to set up your app's performance monitoring on the New Relic dashboard:

- Get the license key from New Relic


- Install the package on your server - pip install newrelic


- Generate the config file, passing the license key from the first step - newrelic-admin generate-config YOUR_LICENSE_KEY newrelic.ini
(It should generate the newrelic.ini file)


- Add the following lines to the .wsgi file (provide the full path of newrelic.ini)
import newrelic.agent
newrelic.agent.initialize('/path/newrelic.ini')

- Restart the application

Within a few minutes you should be able to see the dashboard with different metrics. Also set up the web URL of your application for the ping checker; if anything goes wrong with it, you will get a real-time notification.


There are other tools, like Datadog, that are also used by many companies. Both allow health checks to be set up for different hosts and apps. I am also planning to set up Celery and Solr in the New Relic dashboard. I'll add the setup steps for those when it's done.

Monday, October 14, 2013

install ipython notebook on mac

First, why you need IPython to get started -

- It provides an interactive shell with a rich architecture
- Browser-based notebook support
- Visualization
- The ability to save your exercises
- A high-performance tool for parallel computing and statistical analysis of datasets.

How to install ipython notebook on mac -

There are various ways to install it, and you can find many resources online. Here are the steps which I followed -

$ pip install readline ipython
$ sudo pip install pyzmq
$ sudo pip install jinja2
$ sudo pip install tornado

Once all of those are installed, check that it works by typing ipython at the command prompt. You should see something like this -

Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53)
Type "copyright", "credits" or "license" for more information.

IPython 1.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]:
.....

Once it comes up fine, start the notebook -

$ ipython notebook --pylab=inline

and it should open the notebook dashboard in your browser.

You can click on New Notebook and try out a few things to check it's working fine, like 1+1 or 5*5.

All Set!

Another clean, recommended way to install is with a proper virtualenv and a requirement.txt file -

Create a separate virtualenv and use the following requirement.txt

numpy==1.7.1
pandas==0.11.0
python-dateutil==2.1
pytz==2013b
pyzmq==2.1.11
six==1.3.0
wsgiref==0.1.2

Then install with pip install -r requirement.txt

The above instructions are for mac, where the install is comparatively easy compared to other environments. Feel free to post comments or issues you run into while installing.

Thursday, September 12, 2013

iter() returned non-iterator of type '_timelex'

If you are running into the error iter() returned non-iterator of type '_timelex', here is the solution -

You have python-dateutil 2.0 with Python 2.7, which are not compatible. What you need is an older version of python-dateutil, so downgrade to python-dateutil==1.5


For details on how to downgrade the version, see here.

Wednesday, November 7, 2012

python url parse

Python has an awesome module called urlparse. When you want different values from a URL, you might think of slicing substrings, which is risky and bad practice because it breaks as soon as the URL shape changes.

Here is the simple and easy way to do it in Python -


>>> url = "http://demo.myapp.com/api/v1/items/26/"
>>> from urlparse import urlparse
>>> o = urlparse(url)
>>> o
ParseResult(scheme='http', netloc='demo.myapp.com', path='/api/v1/items/26/', params='', query='', fragment='')

>>> o.path
'/api/v1/items/26/'


That was easy!

You can do much more using this module; please check out more details at this Reference link.
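To go one step further than o.path, here is a small sketch that pulls the item id out of the same URL. The import fallback is there because urlparse moved to urllib.parse in Python 3; the variable names segments and item_id are my own, and it assumes the id is always the last path segment.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

url = "http://demo.myapp.com/api/v1/items/26/"
parts = urlparse(url)

# Split the path on "/" and drop the empty pieces left by the
# leading and trailing slashes
segments = [segment for segment in parts.path.split("/") if segment]

# Assumption: the id is the last path segment
item_id = int(segments[-1])
print(item_id)
```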

Sunday, November 4, 2012

Null check in different languages


Python - Null Check

15 or "default"       # returns 15
0 or "default"        # returns "default"
None or "default"     # returns "default"
False or "default"    # returns "default"
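Note that the or idiom replaces every falsy value, not just None. If 0, False or an empty string are valid values for you, an explicit None check is safer. A minimal sketch (the helper name default_if_none is my own, borrowed from the Django filter name):

```python
def default_if_none(value, default):
    """Return default only when value is None; keep 0, False and ''."""
    return default if value is None else value


print(default_if_none(0, "default"))
print(default_if_none(None, "default"))
```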

Django Template - None Check


{{obj.item_value|default_if_none:"smile"}}

C# - Null Check

var data = val ?? "default value";

Java - Null check

Foo f = new Foo();
DummyObj obj = new Obj().getSelection();
String str = obj != null ? f.format(obj) : "";

JavaScript - Null/Undefined check

if (!param) param = "abc";
// Another way: param == null is true only for null and undefined,
// while !param is also true for 0, "", NaN and false
if (param == null) param = "abc";


jQuery - Null/existence check

if ( $('#myDivObj').length ) {}

Feel free to add comments with null check mechanisms (inline or explicit) for other languages.

Friday, October 5, 2012

Know your environment: check versions

Often we need to check which version of a particular framework, language, or module we are running.

Here I have listed the ones I have come across during my work -

Go to the Python shell and follow the instructions below for each one.

Django


>>> import django
>>> print django.VERSION
(1, 3, 1, 'final', 0)

Python


>>> import sys
>>> print sys.version
2.7.1 (r271:86832, Jul 31 2011, 19:30:53) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)]

NLTK


>>> import nltk
>>> nltk.__version__
'2.0.1rc4'

dateutil

>>> import dateutil
>>> dateutil.__version__
'1.5'

- An important thing to note here is that python-dateutil 2.0 is not compatible with Python 2.7; it only works with Python 3. For 2.7 please use 1.5

- If you have 2.0, first uninstall it and then install the specific version


$ pip uninstall python-dateutil


$ pip install python-dateutil==1.5
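If you check versions often, the lookups above can be wrapped in one helper. This is only a sketch: module_version is a name I made up, and it relies on the convention (not a guarantee) that a package exposes either __version__ or VERSION.

```python
import importlib


def module_version(name):
    """Return a module's version, or None if it is missing or unversioned."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Most packages expose __version__; Django exposes a VERSION tuple
    return getattr(module, "__version__", getattr(module, "VERSION", None))


for name in ("django", "nltk", "dateutil"):
    print("%s %s" % (name, module_version(name)))
```

A module that is not installed simply prints None instead of raising an error.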

Thursday, October 4, 2012

Sympy - Superb python library

Recently we came across a requirement where we needed to evaluate/compare/validate algebraic equations.

Different ways to do it -

- Use Python's eval() function
>>> x = 2

>>> eval("x+2")
4


- Use Regex - Not a practical approach

- Use Sympy - works for the simple cases we are interested in, though we are probably using less than 10% of it.
>>> from sympy import *
>>> x = Symbol('x')
>>> y = Symbol('y')

>>> simplify(2*x+y)
2*x + y
>>> simplify(y+2*x)
2*x + y

p.s. both 2x+y and y+2x are the same thing.


>>> simplify(2*(x+7)) == simplify(2*x + 14)
True
>>> simplify(2*(x+7)) == simplify(2*x + 13)
False
>>> simplify(2*(y+7)) == simplify(2*x + 14)
False
>>> simplify(2*(y+7)) == simplify(2*y + 14)
True


Here is the interesting one though -
>>> simplify((1/2) * 7*x) == simplify(7*x / 2)
False
>>> simplify(0.5 * 7*x) == simplify(7*x / 2)
True
>>> simplify((1.0/2.0) * 7*x) == simplify(7*x / 2)
True

Because in Python 2, 1/2 evaluates to 0: both operands are integers, so the division is truncated and the whole left-hand side becomes 0. To keep the fractional part you need to use 1.0/2.0
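Here is the division behaviour on its own, without Sympy. This sketch assumes Python 3, where / is always true division and // is the truncating form that Python 2 applied to two integers. Fraction comes from the standard library and is my addition; Sympy's own Rational(1, 2) plays the same role there.

```python
from fractions import Fraction

floor_half = 1 // 2          # floor division, what Python 2 did for 1/2
true_half = 1 / 2            # true division (Python 3 behaviour)
exact_half = Fraction(1, 2)  # exact rational, no float rounding

print(floor_half, true_half, exact_half)
```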

It can expand the equation for you too -
>>> expand((x+y)**6)

x**6 + 6*x**5*y + 15*x**4*y**2 + 20*x**3*y**3 + 15*x**2*y**4 + 6*x*y**5 + y**6


For more information you can check it out here.

Monday, September 17, 2012

RSS pubDate to python date

In RSS feeds one of the fields is pubDate, which is common across feeds. This date is required to be in the RFC 822 format (Standard for ARPA Internet Text Messages).

Sample - Mon, 17 Sep 2012 00:00:01 GMT

If you want to convert it to a Python datetime, use the rfc822 module (email.utils provides the same parsedate_tz function)
e.g.

>>> import rfc822
>>> rfc822.parsedate_tz('Thu, 26 Jul 2012 13:30:52 EDT')
(2012, 7, 26, 13, 30, 52, 0, 1, 0, -14400)

The above gives a tuple though, which you will need to convert to a datetime.

If you don't care about the timezone (e.g. EDT, GMT etc.), use the below -
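One way to do that conversion, as a sketch: email.utils.mktime_tz turns the parsed tuple into a UTC timestamp, which can then be added to the epoch. The function name pubdate_to_utc is my own, and the result is a naive datetime expressed in UTC.

```python
from datetime import datetime, timedelta
from email.utils import mktime_tz, parsedate_tz


def pubdate_to_utc(pub_date):
    """Convert an RFC 822 date string to a naive datetime in UTC."""
    parsed = parsedate_tz(pub_date)
    if parsed is None:
        raise ValueError("not an RFC 822 date: %r" % pub_date)
    # mktime_tz applies the timezone offset and returns seconds since the epoch
    return datetime(1970, 1, 1) + timedelta(seconds=mktime_tz(parsed))


print(pubdate_to_utc("Thu, 26 Jul 2012 13:30:52 EDT"))
```

Since EDT is four hours behind UTC, 13:30:52 EDT comes out as 17:30:52.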

>>> from datetime import datetime
>>> datetime.strptime('Thu, 26 Jul 2012 13:30:52 EDT'[:-4], '%a, %d %b %Y %H:%M:%S')
datetime.datetime(2012, 7, 26, 13, 30, 52)

I am sure there are many other ways to do this; suggest one if you come across any good one :)

Sunday, September 16, 2012

Boilerpipe integration in python



Boilerpipe is a library for boilerplate removal and full text extraction from HTML. In most scenarios it works amazingly well; you can try it out here.

We wanted to use it with Python, so we tried out the Python wrapper for Boilerpipe.

It requires JPype as a prerequisite, which is available here.

Once it's downloaded, run 'sudo python setup.py install'

During installation I ran into different errors; the following are the steps that got me through -

1. Install JPype.

The gcc command failed with this error -

error: command 'gcc-4.2' failed with exit status 1

I followed the explanation here to get it installed. Basically you need to update the javaHome path in setup.py for your machine (change the appropriate method for your platform: Mac, Windows, or Linux). The next step is to find the Java path on your machine and add it to .bash_profile if it's not already set.


I made the following changes in my setup.py -

def setupMacOSX(self):
        self.javaHome = '/Developer/SDKs/MacOSX10.7.sdk/System/Library/Frameworks/JavaVM.framework'
        self.jdkInclude = ""
        self.libraries = ["dl"]
        self.libraryDir = [self.javaHome+"/Libraries"]
        self.macros = [('MACOSX',1)]


def setupInclusion(self):
        self.includeDirs = [
            self.javaHome+"/Headers",
            self.javaHome+"/Headers/"+self.jdkInclude,
            "src/native/common/include", 
            "src/native/python/include",
        ]

That should do the trick, and JPype should install fine.

2. Run the installation of the boilerpipe wrapper; it will also add the boilerpipe jars and chardet (universal encoding detector) to your environment, as chardet is one of the packages required by boilerpipe.

Run 'sudo python setup.py install' on https://github.com/misja/python-boilerpipe to get the Java boilerpipe wrapper installed on your machine.

3. Start running your app as instructed in the documentation.

I got the following error upon running the app -

java.lang.Exception: Class de.l3s.boilerpipe.sax.HTMLHighlighter not found

It's because $JAVA_HOME is not set up correctly. Another part of the reason was that the boilerpipe jars were missing from the path (the python-boilerpipe install probably didn't bring the boilerpipe Java jars, so I brought them in manually). After getting those it works fine.

Please suggest any other good tools for extracting the main content out of a web page.

Sunday, July 15, 2012

Django Application Image issues while running on mac

If you are using Django and get a jpg adapter error on image upload, or while the app is rendering a JPEG image, follow the instructions below. The error is most likely caused by missing JPEG support in the environment.


You have to install support for JPEG. For example, on mac use Homebrew and follow the instructions below -

1. Uninstall PIL
    sudo pip uninstall PIL

2. Install JPEG
    brew install jpeg

3. Install PIL again
    sudo pip install PIL

Ubuntu users can use the command below -
sudo apt-get install libjpeg62-dev

Sunday, April 1, 2012

Python XML node parsing


Parse value between tags, i.e. XML, HTML

e.g. <title xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">New York Knicks' Jeremy Lin needs knee surgery, likely done for season</title>

Let's say you want to extract the value from the title tag above, i.e. the headline text.
There are multiple ways; one of the simplest is listed below -

>>> import xml.dom.minidom
>>> a = '''<title xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">New York Knicks' Jeremy Lin needs knee surgery, likely done for season</title>'''
>>> x = xml.dom.minidom.parseString(a)
>>> x.firstChild.firstChild.toxml()
u"New York Knicks' Jeremy Lin needs knee surgery, likely done for season"

There are other ways too, but I found this one simplest!
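One of those other ways, as a sketch: xml.etree.ElementTree from the standard library, which is a different parser from the minidom used above. The variable names snippet and title are my own.

```python
import xml.etree.ElementTree as ET

# Adjacent string literals are joined, which avoids escaping the
# double quotes in the attributes and the apostrophe in the text
snippet = (
    '<title xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xmlns:atom="http://www.w3.org/2005/Atom">'
    "New York Knicks' Jeremy Lin needs knee surgery, likely done for season"
    "</title>"
)

# fromstring returns the root element; .text is the text directly inside it
title = ET.fromstring(snippet).text
print(title)
```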

Monday, March 12, 2012

GCD


Here is another simple program, an implementation of the greatest common divisor (GCD) -

Euclid's Algorithm:

while m is greater than zero:
   If n is greater than m, swap m and n.
   Subtract n from m.
n is the GCD

Simple python code :)

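import sys

# Read two positive integers from one line of standard input
line = sys.stdin.readline()
n, m = line.split()
n = int(n)
m = int(m)

def gcd(n, m):
    # Subtraction form of Euclid's algorithm; assumes n and m are positive
    while m > 0:
        if n > m:
            n, m = m, n  # keep n as the smaller value
        m = m - n
    return n

print(gcd(n, m))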

Sample Input/Output

$ python gcd.py
6 15
3
$ python gcd.py
17 8
1
$ python gcd.py
27 9
9
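The subtraction version can take many iterations when one number is much larger than the other. A common variant, sketched here under the name gcd_mod (my own), replaces the repeated subtraction with a remainder. This is a different form of Euclid's algorithm from the one coded above, not a copy of it.

```python
def gcd_mod(n, m):
    """Greatest common divisor using the remainder form of Euclid's algorithm."""
    while m:
        # gcd(n, m) equals gcd(m, n % m); stop when the remainder is 0
        n, m = m, n % m
    return n


for pair in ((6, 15), (17, 8), (27, 9)):
    print(gcd_mod(*pair))
```

It gives the same answers as the sample runs above, and it also terminates when one of the inputs is 0.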

Wednesday, March 7, 2012

K Diff


Given N numbers [N<=10^5], we need to count the total pairs of numbers that have a difference of K [K>0 and K<1e9].


In the code below, k is the difference and num is a list/set of numbers
#Code -
def kdiff(k, num):
    print k
    print num
    # Shift every number by k; a pair exists wherever a shifted value is also in num
    a = []
    for i in range(len(num)):
        a.append(num[i] + k)

    print a
    result = set(num).intersection(set(a))
    print result
    print len(result)

#Output -
>>> f = kdiff.kdiff(2, [1,5,3,4,2])
2
[1, 5, 3, 4, 2]
[3, 7, 5, 6, 4]
set([3, 4, 5])
3
>>> f = kdiff.kdiff(1, [363374326, 364147530, 61825163, 1073065718, 1281246024, 1399469912, 428047635, 491595254, 879792181, 1069262793 ])
1
[363374326, 364147530, 61825163, 1073065718, 1281246024, 1399469912, 428047635, 491595254, 879792181, 1069262793]
[363374327, 364147531, 61825164, 1073065719, 1281246025, 1399469913, 428047636, 491595255, 879792182, 1069262794]
set([])
0
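The same idea can be written to return the count instead of printing it. This is a sketch with a function name of my own, count_k_diff_pairs, and it assumes the input numbers are distinct, since the set drops duplicates.

```python
def count_k_diff_pairs(k, nums):
    """Count pairs (a, b) in nums with b - a == k, assuming distinct numbers."""
    seen = set(nums)
    # Each value whose partner value + k is also present forms exactly one pair
    return sum(1 for value in seen if value + k in seen)


print(count_k_diff_pairs(2, [1, 5, 3, 4, 2]))
```

Set lookups are constant time on average, so this is roughly linear in N, which fits the N<=10^5 limit.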

Saturday, February 11, 2012

Python color syntax highlighting

If you open a Python file for the first time in your terminal, it will be plain single-color text. To enable color highlighting follow the steps below -

  • open the file - vi myProg.py
  • hit esc (escape), then type - :syntax on

It will enable color highlighting, which is helpful for spotting syntax errors.

To enable the highlighting permanently follow the steps below -

  • open the file - vi ~/.vimrc
  • hit i - it will enable insert mode
  • type - syntax on
  • hit esc (escape), then type - :wq

Open the Python file in vi and check the syntax highlighting.