Thursday 3 July 2014

Strip Punctuation from String in Python along with Time Efficiency Analysis

You might have googled how to remove punctuation characters in Python and come across several posts on StackOverflow about the different ways you can do it. But do you know which one is the best, or which one is the most time-efficient? When you are scaling your applications to large databases, you need to think from these perspectives in order to save computation time.

This blog post will discuss 3 different methods of stripping punctuation from a string, along with a comparison of their computation time.

The 3 different methods are:
  1. Built-in "String Translate" function: This method is the most time-efficient. However, it also strips punctuation that is part of a word.
    e.g.: It will convert 'didn't' to 'didnt', which is no longer a proper word.
  2. Splitting into words and then using 'String Punctuation': It splits the whole string on whitespace and then strips the leading and trailing punctuation from each word.
  3. Using Regular Expression Substitution: It replaces every match of a punctuation pattern with an empty string.
Example:

review_string = "I'd give Ardor a TWO THUMBS UP!\nLove the food.. esp the Indian and total value for money!\nThe drinks are well made and the food is to die for. \nI've been there about 5 times - mix of at night and for lunch.\nDuring the night, its like the new watering hole for the young crowd, Love the energy it has. \nThe ambiance is great.\nFor those of you who have not been there yet, what are you waiting for?!\n\nI'll totally recommend this place to anyone!'give'"

1. Using String.Translate

Code (Python 2, requires import string): review_string.translate(None, string.punctuation)
Time: 1000000 loops, best of 3: 1.53 µs per loop
Result:
'Id give Ardor a TWO THUMBS UP\nLove the food esp the Indian and total value for money\nThe drinks are well made and the food is to die for \nIve been there about 5 times mix of at night and for lunch\nDuring the night its like the new watering hole for the young crowd Love the energy it has \nThe ambiance is great\nFor those of you who have not been there yet what are you waiting for\n\nIll totally recommend this place to anyonegive'
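The two-argument call above only works on Python 2 byte strings. On Python 3 the same idea needs a translation table built with str.maketrans. The following is a minimal sketch of that equivalent, not the code that was timed here; the names PUNCT_TABLE and strip_punctuation are made up for illustration.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Built once so repeated calls do not pay for it again.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text):
    """Return text with all characters in string.punctuation removed."""
    return text.translate(PUNCT_TABLE)


print(strip_punctuation("I'd give Ardor a TWO THUMBS UP!"))
# prints: Id give Ardor a TWO THUMBS UP
```

As with the Python 2 version, the apostrophe in "I'd" is removed along with everything else.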




2. Using String Punctuation and Word Splitting

Code: ' '.join(word.strip(string.punctuation) for word in review_string.split())
Time: 10000 loops, best of 3: 43.8 µs per loop
Result:
"I'd give Ardor a TWO THUMBS UP Love the food esp the Indian and total value for money The drinks are well made and the food is to die for I've been there about 5 times  mix of at night and for lunch During the night its like the new watering hole for the young crowd Love the energy it has The ambiance is great For those of you who have not been there yet what are you waiting for I'll totally recommend this place to anyone!'give"




3. Using Regular Expression Substitution

Code: 
import re
p = re.compile(r'(\n)|(\r)|(\t)|(\')|(\u00A9)|([!"#$%&()*+,-./:;<=>?@\[\\\]^_`{|}~])', re.IGNORECASE)
re.sub(p, '', review_string)

Time: 10000 loops, best of 3: 101 µs per loop
Result:
"I'd give Ardor a TWO THUMBS UP!\nLove the food.. esp the Indian and total value for money!\nThe drinks are well made and the food is to die for. \nI've been there about 5 times - mix of at night and for lunch.\nDuring the night, its like the new watering hole for the young crowd, Love the energy it has. \nThe ambiance is great.\nFor those of you who have not been there yet, what are you waiting for?!\n\nI'll totally recommend this place to anyone!'give'"
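If the goal is to keep apostrophes inside words such as 'didn't', a narrower pattern is one option. The following is only a sketch for Python 3 and is not the pattern that was timed above; the names PATTERN and strip_keep_apostrophes are made up for illustration.

```python
import re

# Matches, in order: an apostrophe not preceded by a word character,
# an apostrophe not followed by a word character, an underscore, or any
# character that is not a word character, whitespace, or an apostrophe.
PATTERN = re.compile(r"(?<!\w)'|'(?!\w)|_|[^\w\s']")


def strip_keep_apostrophes(text):
    """Remove punctuation but keep apostrophes that sit inside a word."""
    return PATTERN.sub("", text)


print(strip_keep_apostrophes("I didn't stop, 'quoted' words!"))
# prints: I didn't stop quoted words
```

Whitespace, including newlines, is left untouched by this pattern, so it can be normalised separately if needed.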


Conclusion:
String.Translate is quite fast since it is implemented in C, but it removes every listed character wherever it occurs, so it cannot keep an apostrophe inside a word. If that matters, you can go for word stripping or, if you need very specific stripping rules, for regular expression substitution.
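The timings above look like IPython %timeit output. To repeat the comparison in a plain Python 3 script, something like the sketch below could be used. It is an assumption-laden illustration: the function names, the shortened SAMPLE text and the simplified regex (built from string.punctuation rather than the pattern shown earlier) are all made up here, and the numbers it prints will differ from those in this post.

```python
import re
import string
import timeit

# Shortened sample text (hypothetical; use your own data for real timings).
SAMPLE = "I'd give Ardor a TWO THUMBS UP!\nLove the food.. esp the Indian!"

# Built once so the timings measure only the stripping itself.
TABLE = str.maketrans("", "", string.punctuation)
PATTERN = re.compile("[" + re.escape(string.punctuation) + "]")


def strip_translate(text):
    return text.translate(TABLE)


def strip_words(text):
    return " ".join(word.strip(string.punctuation) for word in text.split())


def strip_regex(text):
    return PATTERN.sub("", text)


def time_methods(text, number=1000, repeat=3):
    """Return {function name: best time per call in seconds}."""
    results = {}
    for func in (strip_translate, strip_words, strip_regex):
        runs = timeit.repeat(lambda: func(text), number=number, repeat=repeat)
        results[func.__name__] = min(runs) / number
    return results


if __name__ == "__main__":
    for name, seconds in time_methods(SAMPLE).items():
        print("%-16s %.2f us per call" % (name, seconds * 1e6))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports.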

Wednesday 11 June 2014

No module named scikits.crab

----> 1 from scikits.crab import datasets
ImportError: No module named scikits.crab

If you are facing this issue, then either scikits.learn is not installed on your system or you might have installed crab in one of the following ways:

pip install crab 
or
easy_install crab


If scikits.learn is not installed on your system, use the following command:
sudo pip install scikits.learn
For Python 2.6 you have to use the following command:
sudo python2.6 pip install scikits.learn

If scikits.learn is installed and you are still facing this error, then you have to install crab by compiling it directly from its source code.
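To find out which of the two cases applies, you can check which modules the interpreter is able to import. This is a minimal sketch; the helper name is_importable is made up for illustration, and it assumes both packages are imported under the names that appear in the error message above.

```python
import importlib


def is_importable(module_name):
    """Return True if module_name can be imported in this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


for name in ("scikits.learn", "scikits.crab"):
    status = "OK" if is_importable(name) else "missing"
    print("%s: %s" % (name, status))
```

Run it with the same Python interpreter that raised the ImportError, since each interpreter has its own set of installed packages.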


You are ready to go after this :)

Tuesday 29 April 2014

How to install Genetic Algorithm Utility Library (GAUL)?

 
The Genetic Algorithm Utility Library (GAUL for short) is a C programming library used to develop, or aid the development of, applications that use Genetic Algorithms or Evolutionary Algorithms.

 Instructions for installing the Genetic Algorithm Utility Library on Ubuntu (Linux):
  • Download the GAUL library from http://gaul.sourceforge.net/downloads.html.
  • Extract the GAUL files into your home folder, then go into the extracted folder:
    $ cd gaul-devel-0.1850-0
  • Compilation without S-Lang is done by:
    $ ./configure --enable-slang=no && make
  • Finally type:
    $ sudo make install
  • To compile a program, link it against the GAUL libraries in the following way:
    $ gcc test_utils.c -lgaul -lgaul_util -lm
  • To write the output of the compiled program to a file, use:
    $ ./a.out > test.txt
  • For more detailed instructions, please refer to: http://varuagdiary.blogspot.in/2011/05/setting-up-gaulgenetic-algorithm.html . That post is more descriptive and also covers the installation of S-Lang along with a simple GAUL installation.

Friday 18 April 2014

Digha Beach


Hey everyone,

Being an engineering student at IIT Kharagpur as well as a globetrotter, I constantly feel the need to take a break from the monotonous routine at college and enjoy a day or two of 'peace life' (as we call it at kgp) with my friends. We considered all of our options and finally set our minds on visiting the popular tourist spot Digha Beach. From this trip, I have gathered a lot of information which can come in pretty handy for people who are planning a visit to Digha.

However, planning the trip beforehand was tedious. There is hardly any blog or site on the web which clearly mentions everything relevant or useful about Digha, and this lack of information can lead to quite a frustrating trip, to be very honest. Due to this, I decided to consolidate what I learned into a blog post and put it up on the internet, so that it does not happen to other tourists.


Why Digha and not Mandormoni or Shankarpur?

It’s easy. There are no trains to Mandormoni or Shankarpur.

Getting from Kharagpur:

Trains: There are no direct trains available from Kharagpur. You can take a train from Kharagpur to Panskura or Mecheda and from there to Digha. There is a local train available from Panskura to Digha on all 7 days.

Buses: There are local buses available from Kharagpur Bus Stand. Buses run from 3:15 am to 8 pm and are very frequent, leaving at intervals of 15 minutes. The average fare is around Rs 50. However, there are no AC or Volvo buses available from Kharagpur to Digha.

Cab or Taxi: They will charge you around Rs 1600 for a one-way journey.

What we did: We took a local train from Kharagpur to Panskura in the morning and then another from Panskura to Digha. It cost only Rs 50 per person for the whole train journey.

Getting from Kolkata:

Trains: Direct trains are available from Kolkata or Santaragachi to Digha.

Buses:  A.C. or Volvo buses are available on this route.


Old Digha or New Digha?

New Digha

Pros:
  • The place is very calm.
  • The beach is not too crowded.
  • You can enjoy adventure sports.
  • It is near the railway station.
Cons:
  • There is literally nothing apart from hotels and the beach on the New Digha side.
  • There are no food joints or restaurants in New Digha.
  • There are no ATMs. Make sure you have sufficient cash with you; otherwise you will have to go to Old Digha to withdraw it.
Old Digha

Old Digha is crowded and resembles a small, densely packed town. There are plenty of shopping spots as well as a lot of local restaurants. However, I have heard that people are not allowed to enter the water at the beaches near Old Digha.

What we did: We stayed in New Digha. Our whole stay was pleasant, with a little chaos at the end since we were out of money and had to go to Old Digha in the afternoon. Also, we weren't able to find any North Indian restaurant in New Digha.

Good Hotels:

  • Hotel Dolphin
  • Hotel Seagull
  • Hotel Amantran ( Family resort)
  • Hotel Shantiniketan
  • Hotel Seabird
  • Avoid Hotel Larica Inn. It’s neither in Old Digha nor in New Digha. It’s in the middle of the two and you will have to take a rickshaw to get anywhere.
Recommendations:
  • Please avoid any advance booking of Hotels or Groupon Deals.
  • It is safe for couples to stay. Carry your PAN Card.
  • Optimum time for a stay at Digha is 2-3 days.
  • It is generally hot in summer. You can stay in an A.C. room.
  • You can carry an electric kettle and have maggi or green tea or pasta if you want.
  • You can carry a mat to sit on the beach.
Overall, this is a nice place to visit. It is a good break from the daily hectic schedule and it is cheap compared to other beaches.

If you loved it, then please share it with your friends.