tag:blogger.com,1999:blog-90832217087796975972024-03-18T21:07:49.247-07:00Programmer's notebookAshish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.comBlogger34125tag:blogger.com,1999:blog-9083221708779697597.post-53381601429054326842014-09-17T22:08:00.001-07:002014-09-17T22:12:53.150-07:00Building and packaging a python application for distribution<div dir="ltr" style="text-align: left;" trbidi="on">
Building a Python application package that is easy to distribute has always felt messy to me, though Python tooling has come a long way.
<br>
"<b>pip/wheel</b>" helps you install, manage and distribute individual packages. "<b>virtualenv</b>" provides an approachable way to create isolated environments and avoid polluting the main distribution. "<b>zc.buildout</b>" lets you assemble, reproduce and deploy a Python application through configuration. Together, they provide a powerful framework for building and distributing Python applications.
However, it is not as simple as the build-once, distribute-everywhere model of executable or jar distribution.
In all likelihood, you will be creating an isolated virtual environment, then installing the dependencies and the application into it.
</div><a href="http://pyfunc.blogspot.com/2014/09/building-and-packaging-python.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com100tag:blogger.com,1999:blog-9083221708779697597.post-31579347442643700272013-08-14T01:13:00.002-07:002013-08-14T10:13:19.499-07:00Centralized logging for distributed applications with pyzmqSimpler distributed applications can take advantage of centralized logging. PyZMQ, the Python bindings for ØMQ, provides log handlers for the Python logging module and can easily be used for this purpose. The log handler utilizes the ØMQ Pub/Sub pattern and broadcasts log messages through a PUB socket. It is quite easy to construct a message collector that writes the messages to a central location.
<pre>
+-------------+
|Machine1:App1+---------------------+
+-------------+                     |
                                    v
+-------------+             +---------------+
|Machine1:App2+------------>|Machine3:Logger|
+-------------+             +---------------+
                                    ^
+-------------+                     |
|Machine2:App1+---------------------+
+-------------+
</pre>
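The flow in the diagram can be sketched with the standard library alone. In the real setup, PyZMQ's PUBHandler and a SUB-socket collector play these roles; here a plain queue.Queue stands in for the PUB/SUB sockets so the sketch runs anywhere (logger names like "Machine1.App1" are illustrative):

```python
# Stand-in for the ØMQ PUB/SUB log pipeline: each app attaches a handler that
# publishes records into a shared channel, and one collector drains them to a
# central location. A queue.Queue replaces the PUB/SUB sockets in this sketch.
import logging
import logging.handlers
import queue

log_queue = queue.Queue()

# Each "app" gets a logger whose handler publishes into the shared channel.
app1 = logging.getLogger("Machine1.App1")
app1.addHandler(logging.handlers.QueueHandler(log_queue))
app1.setLevel(logging.INFO)

app1.info("job started")

# The central collector: read records and write them to one place.
collected = []
while not log_queue.empty():
    record = log_queue.get()
    collected.append("%s: %s" % (record.name, record.getMessage()))

print(collected)  # ['Machine1.App1: job started']
```

With PyZMQ, swapping QueueHandler for PUBHandler and running the drain loop behind a SUB socket on the logger machine gives the same shape over the network.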
<a href="http://pyfunc.blogspot.com/2013/08/centralized-logging-for-distributed.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com196tag:blogger.com,1999:blog-9083221708779697597.post-65971651841580678562012-06-07T23:12:00.001-07:002012-06-12T11:06:08.907-07:00Ingest data from database into Hadoop with Sqoop (2)<div dir="ltr" style="text-align: left;" trbidi="on">
Here, I explore a few other variations for importing data from a database into HDFS.
This is a continuation of the <a href="http://pyfunc.blogspot.com/2012/06/ingest-data-from-database-into-hdfs-for.html">previous article</a>.<br><br>
The sqoop commands listed previously were good for a one-time fetch, when you want to import all the current data for a table in the database. <br><br>
A more practical workflow is to fetch data regularly and incrementally into HDFS for analysis, without re-importing any previously imported data.
For this, you have to mark a column for incremental import and also provide an initial value. This column usually happens to be a time-stamp.
<pre>
sqoop import \
  --connect jdbc:oracle:thin:@//HOST:PORT/DB \
  --username DBA_USER \
  -P \
  --table TABLENAME \
  --columns "column1,column2,column3,.." \
  --as-textfile \
  --target-dir /target/directory/in/hdfs \
  -m 1 \
  --check-column COLUMN3 \
  --incremental lastmodified \
  --last-value "LAST VALUE"
</pre><br>
</div><a href="http://pyfunc.blogspot.com/2012/06/ingest-data-from-database-into-hdfs-for_07.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com5tag:blogger.com,1999:blog-9083221708779697597.post-64366560292729131342012-06-07T21:15:00.002-07:002012-06-12T11:07:01.967-07:00Ingest data from database into Hadoop with Sqoop (1)<div dir="ltr" style="text-align: left;" trbidi="on">
Sqoop is an easy tool for importing data from databases into HDFS, and for exporting data from Hadoop/Hive tables back to databases.
Databases have been the de-facto standard for storing structured data, but running complex queries on large data sets can be detrimental to their performance.<br>
It is sometimes useful to import the data into Hadoop for ad hoc analysis. Tools like Hive and raw map-reduce can provide tremendous flexibility in performing various kinds of analysis.<br>
This becomes particularly useful when the database has been used mostly as a storage device (e.g., storing XML or unstructured string data as CLOB data).
<br><br>
Sqoop is very simple on its face. Internally, it uses map-reduce to import data from the database in parallel, over a JDBC connection.
<br><br>
I am jumping straight into using sqoop with an Oracle database and will leave installation for another post.
<br><br>
Sqoop commands are executed from the command line using the following structure:
<pre>sqoop COMMAND [ARGS]</pre>
All available sqoop commands can be listed with: sqoop help<br><br>
This article focuses on importing from a database, specifically Oracle DB.
<br></div><a href="http://pyfunc.blogspot.com/2012/06/ingest-data-from-database-into-hdfs-for.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com26tag:blogger.com,1999:blog-9083221708779697597.post-24769602391467610242012-05-16T20:47:00.001-07:002012-05-16T20:59:36.777-07:00Hadoop Map-Reduce with mrjobWith Hadoop, you have the most flexibility in accessing files and running map-reduce jobs when using Java. All other languages need to go through Hadoop streaming, which can feel like second-class citizenship in Hadoop programming.<br><br>
For those who like to write map-reduce programs in python, there are good toolkits available, such as <a href="http://packages.python.org/mrjob/index.html">mrjob</a> and <a href="https://github.com/klbostee/dumbo/">dumbo</a>.<br>
Internally, they still use Hadoop streaming to submit map-reduce jobs, but they simplify the submission process considerably.
My own experience with mrjob has been good so far; installing and using it is easy.
<br><br>
<b>Installing mrjob</b>
<br><br>
First, install a newer version of Python than the default that ships with the Linux distribution (2.4.x, which yum depends on). Make sure you do not replace the existing Python installation, as that breaks "yum".
<br><br>
Install mrjob on one of the machines in your Hadoop cluster. It is nicer to use virtualenv to create an isolated environment.
<pre class="shell">
wget -O virtualenv.py http://bit.ly/virtualenv
/usr/bin/python26 virtualenv.py hadoopenv
hadoopenv/bin/easy_install pip
hadoopenv/bin/pip install mrjob
</pre><br>
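To see the model mrjob wraps, here is a plain-Python sketch of a streaming-style word count: a mapper emits (word, 1) pairs and a reducer sums the counts per key. This is an illustration of the map/shuffle/reduce flow, not mrjob's actual API:

```python
# Word count in the Hadoop-streaming style that mrjob builds on: map each
# input line to (word, 1) pairs, sort/group by key, then reduce each group.
from itertools import groupby
from operator import itemgetter

def mapper(line):
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    yield word, sum(counts)

def run(lines):
    # The shuffle/sort step: group mapper output by key, as Hadoop would.
    pairs = sorted(kv for line in lines for kv in mapper(line))
    result = {}
    for word, group in groupby(pairs, key=itemgetter(0)):
        for key, total in reducer(word, (c for _, c in group)):
            result[key] = total
    return result

print(run(["hello world", "hello hadoop"]))  # {'hadoop': 1, 'hello': 2, 'world': 1}
```

With mrjob, the mapper and reducer become methods on a job class and the framework handles the shuffle, whether running locally or on the cluster.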
<a href="http://pyfunc.blogspot.com/2012/05/hadoop-map-reduce-with-mrjob.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com140tag:blogger.com,1999:blog-9083221708779697597.post-44956811923078137462012-05-15T19:21:00.001-07:002012-05-15T21:59:49.482-07:00HBase pseudo-cluster installation<div dir="ltr" style="text-align: left;" trbidi="on">
I have been preparing a VM with HBase installed in pseudo-cluster mode for experimental purposes.
There are quite a few useful blogs on installing HBase. I settled on the following minimal installation procedure.<br>
<br>
I am blogging it for future reference. Hopefully it will help others too.<br>
<br>
Before proceeding to install Hbase in pseudo cluster mode, you can check out the procedures for installing <a href="http://pyfunc.blogspot.com/2012/05/hadoop-pseudo-cluster-installation.html" target="_blank">Hadoop in pseudo-cluster mode</a>.<br>
<br>
A few tweaks are required in OS configuration. Add the following to <u>/etc/security/limits.conf</u>:
<br>
<ul>
<li>hdfs - nofile 32768</li>
<li>hbase - nofile 32768</li>
</ul>
<br>
A few changes are also required to the hadoop configuration that I described earlier.
Add the following to <u>hdfs-site.xml</u>:
<pre><property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
</pre>
<br>
</div><a href="http://pyfunc.blogspot.com/2012/05/hbase-pseudo-cluster-installation.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com4tag:blogger.com,1999:blog-9083221708779697597.post-48770162237331264272012-05-14T16:55:00.003-07:002012-05-14T17:03:27.249-07:00Hadoop pseudo-cluster installation<div dir="ltr" style="text-align: left;" trbidi="on">
Install Java and the Cloudera yum repo
<br>
<pre>yum install java-1.6.0-openjdk.x86_64
curl -O http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo
mv cloudera-cdh3.repo /etc/yum.repos.d/
</pre>
<br>
Ensure that you have hostname and localhost entries in /etc/hosts
<br>
<pre>comment out the ipv6 entry in /etc/hosts</pre><br>
Create hadoop user and group manually
<br>
<pre>Create "hdfs" and "mapred" user with group "hadoop"
groupadd hadoop
useradd -G hadoop hdfs
useradd -G hadoop mapred
passwd hdfs
passwd mapred
</pre>
<br>
</div><a href="http://pyfunc.blogspot.com/2012/05/hadoop-pseudo-cluster-installation.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com73tag:blogger.com,1999:blog-9083221708779697597.post-76988826513173838042012-05-08T09:25:00.000-07:002012-05-08T15:40:38.658-07:00Few things to take care while building vagrant boxes<div dir="ltr" style="text-align: left;" trbidi="on">
Following are some of the tricks that were useful to me while creating an Oracle Enterprise Linux vagrant box.
<br>
Create the VM using the VDI format for easy handling.
<br>
Make sure you have removed all the extraneous packages from the installed vm.<br>
You can check out package descriptions at <a href="http://pkgs.org/search/?keyword=util-linux" rel="nofollow" target="_blank">pkgs.org</a>. <br>
<pre class="shell">yum remove X11
yum list installed | grep gnome</pre>
<br>
Also ensure that yum installs only the relevant language support.
<br>
<pre>Edit /etc/rpm/macros.lang and include
%_install_langs en:fr
</pre>
<br>
</div><a href="http://pyfunc.blogspot.com/2012/05/few-things-to-take-care-while-building.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com5tag:blogger.com,1999:blog-9083221708779697597.post-52881806346633534352012-03-15T10:12:00.001-07:002012-03-15T15:20:23.279-07:00External tables in Hive are handy<div dir="ltr" style="text-align: left;" trbidi="on">
Usually, when you create tables in Hive from raw data in HDFS, Hive moves the data to a different location: "/user/hive/warehouse".
If you create a simple table, its data will be located inside the warehouse. The following Hive command creates a table whose data location is "/user/hive/warehouse/user".
<br>
<pre>hive> CREATE TABLE user(id INT, name STRING) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;
</pre>
<br>
Consider that the raw data is located at "/home/admin/userdata/data1.txt". If you issue the following hive command, the data is moved to a new location, "/user/hive/warehouse/user/data1.txt".
<br>
<pre>hive> LOAD DATA INPATH '/home/admin/userdata/data1.txt' INTO TABLE user;
</pre>
<br>
If all we want is to run Hive queries, that is fine. But when you drop the table, the raw data is lost, because the directory corresponding to the table in the warehouse is deleted.
<br>
You may also not want to delete the raw data, since someone else might use it in map-reduce programs outside of Hive. It is far more convenient to retain the data at its original location via "EXTERNAL" tables. <br>
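For illustration, an external table over the same columns might be declared like this (the table name and path are illustrative; LOCATION must point at an HDFS directory holding the raw files):

```sql
-- The data stays at its original location; DROP TABLE removes only the
-- table metadata, not the underlying files.
hive> CREATE EXTERNAL TABLE user_ext(id INT, name STRING) ROW FORMAT
      DELIMITED FIELDS TERMINATED BY ','
      LINES TERMINATED BY '\n' STORED AS TEXTFILE
      LOCATION '/home/admin/userdata';
```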
</div><a href="http://pyfunc.blogspot.com/2012/03/external-tables-in-hive-are-handy.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com24tag:blogger.com,1999:blog-9083221708779697597.post-57089457982989394292012-03-13T11:57:00.001-07:002012-03-13T14:59:22.599-07:00Introducing ØMQ and pyzmq through examples<div dir="ltr" style="text-align: left;" trbidi="on">
ØMQ is a messaging library that has the capability to revolutionize distributed software development. <br />
<br />
Unlike full-fledged messaging systems, it provides the right set of abstractions to incorporate various messaging patterns. It also provides the concept of devices, which allows the creation of complex network topologies. <br />
<br />
To get a quick overview, you can read the <a href="http://nichol.as/zeromq-an-introduction" target="_blank">introduction to ØMQ</a> by Nicholas Piël.<br />
<br />
ØMQ sockets are a light abstraction on top of native sockets.<br />
This allows ØMQ to remove certain constraints and add new ones, which makes writing messaging infrastructure a breeze.<br />
<ul style="text-align: left;">
<li>ØMQ sockets adhere to predefined messaging patterns; the pattern has to be chosen at socket creation time.</li>
<li>An ØMQ socket can connect to many other ØMQ sockets, unlike native sockets.</li>
<li>There are constraints on which types of ØMQ sockets can connect to each other.</li>
</ul>
<div style="text-align: left;">
<br />
ØMQ has bindings for many languages, including Python (pyzmq), which makes it very interesting. <br />
<br />
It has been fun learning the basics, and I hope soon to create some real-world examples to deepen my knowledge of ØMQ. Till then, I hope this mini tutorial on ØMQ and pyzmq will serve as a good introduction to its capabilities. <br />
<br />
Check out: <a href="http://readthedocs.org/docs/learning-0mq-with-pyzmq/en/latest/index.html">http://readthedocs.org/docs/learning-0mq-with-pyzmq/en/latest/index.html</a><br />
<br />
It is quite easy to get started. Use virtualenv and pip.<br />
<pre>pip install pyzmq-static
pip install tornado</pre>
<br />
Check out the code from <a href="https://github.com/ashishrv/pyzmqnotes">https://github.com/ashishrv/pyzmqnotes</a><br />
Follow some of the annotated examples to see the awesomeness of ØMQ.<br />
<br />
Do post your feedback on the mini tutorial here as comments. </div>
</div>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com1tag:blogger.com,1999:blog-9083221708779697597.post-22211208051096900372011-12-01T23:52:00.001-08:002011-12-13T10:21:17.875-08:00My experiences with Fabric based deployment automation<div dir="ltr" style="text-align: left;" trbidi="on">
Many good tools are available for configuration management and application deployment.<br>
<a href="http://puppetlabs.com/" target="_blank">Puppet</a> and <a href="http://www.opscode.com/chef/" target="_blank">Chef</a> have attained cult status among dev-ops teams. There are good tools available in Python too; <a href="http://saltstack.org/" target="_blank">Salt</a> may soon become a viable alternative and definitely looks promising to me.
<a href="http://agiletesting.blogspot.com/2010/03/automated-deployment-systems-push-vs.html" target="_blank">Push vs. pull</a> is a distinction commonly used to classify the various tools in this ecosystem.<br>
<br>
<a href="http://docs.fabfile.org/" target="_blank">Fabric</a> is an excellent tool that lets you weave together operations locally and remotely on a cluster of machines: deploying applications, starting and stopping services, and performing other tasks across the cluster.
There are a few good tutorials to help you get familiar with Fabric. If you haven't read them already, you should:<br>
<ol style="text-align: left;">
<li><a href="http://yuji.wordpress.com/2011/04/09/django-python-fabric-deployment-script-and-example/">An example on deploying django using Fabric</a></li>
<li><a href="https://docs.google.com/present/view?id=0AcvwZqy5XUWkZGN6eGp4ZHFfMjZmYmg2cjNjdw&hl=en_US&pli=1">A presentation on using Fabric</a></li>
<li><a href="http://blog.bixly.com/post/908893709/this-week-ryan-guides-use-through-fabric-a-python" target="_blank">A video on Fabric usage</a></li>
</ol>
I have used Fabric to automate deployment of a <a href="http://hadoop.apache.org/" target="_blank">Hadoop</a> / <a href="http://hive.apache.org/" target="_blank">Hive</a> application and of <a href="http://www.nagios.org/" target="_blank">Nagios</a>, on clusters of machines on <a href="http://aws.amazon.com/" target="_blank">EC2</a>, on a private cloud based on <a href="http://cloudstack.com/" target="_blank">Cloudstack</a>, and on commodity machines.
<br>
<br>
The code grew from nifty little commands and functions, such as setting the fully qualified domain hostname (<a href="http://en.wikipedia.org/wiki/Fully_qualified_domain_name" target="_blank">FQDN</a>), creating users and groups on Linux, and installing yum packages, into a complete system of commands that installs and brings up a <a href="http://web.mit.edu/kerberos/" target="_blank">Kerberos</a>-enabled secure hadoop cluster using <a href="http://www.cloudera.com/" target="_blank">Cloudera</a> hadoop packages.<br>
<br>
The code soon became unwieldy.
<br>
<br>
There are a few practices that help contain the complexity that grows when you use Fabric enthusiastically.<br>
<br>
</div><a href="http://pyfunc.blogspot.com/2011/12/my-experiences-with-fabric-based.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com3tag:blogger.com,1999:blog-9083221708779697597.post-87932644707913432552011-12-01T11:30:00.001-08:002011-12-13T10:21:36.000-08:00Installing funkload on Mac<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="http://funkload.nuxeo.org/" target="_blank">Funkload</a> is a useful tool for understanding the characteristics of an application server under stress and load conditions.<br>
<br>
Installing Funkload is very straightforward using <a href="http://www.virtualenv.org/en/latest/index.html" target="_blank">virtualenv</a> and <a href="http://www.macports.org/" target="_blank">macports</a> on the Mac.<br>
If you aren't using them already, you should think about checking them out.<br>
<br>
Create an isolated environment for installing Funkload.<br>
<br>
<pre>virtualenv --no-site-packages loadtest
source loadtest/bin/activate
pip install yolk</pre>
<br>
</div><a href="http://pyfunc.blogspot.com/2011/12/installing-funkload-on-mac.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com4tag:blogger.com,1999:blog-9083221708779697597.post-46136052263417284042011-11-29T10:32:00.001-08:002012-04-23T21:36:14.666-07:00Creating base box from scratch for Vagrant<div dir="ltr" style="text-align: left;" trbidi="on">
<br>
At <a href="http://www.vagrantbox.es/" target="_blank">vagrantbox.es</a>, you can find boxes for many flavours like CentOS, Ubuntu, Debian etc.<br>
<br>
However, you might require an OS flavour that is not already packaged for you.<br>
In such a case, you might want to package it for use with Vagrant yourself.<br>
I needed an Oracle Enterprise Linux box.<br>
<br>
Following is a step by step approach to create a base box for Oracle Enterprise Linux 5.7 64 bit version.<br>
<b><br></b><br>
<b>Creating a VM on VirtualBox</b>
<br>
<br>
<span class="Apple-style-span" style="color: blue;">Step 1</span>: Get the ISO file from which we will install the Oracle Enterprise Linux.<br>
<br>
<span class="Apple-style-span" style="color: blue;">Step 2</span>: Create your virtual machine on VirtualBox.<br>
<br>
<pre> Create a new Virtual Machine
Type: VMDK
Name : oel57
Base memory size: 512 MB, Memory Space Maximum 40 GB
Enable Host I/O cache
</pre>
<br>
</div><a href="http://pyfunc.blogspot.com/2011/11/creating-base-box-from-scratch-for.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com21tag:blogger.com,1999:blog-9083221708779697597.post-30284959999969042232011-11-28T13:50:00.001-08:002011-11-28T15:22:17.092-08:00Using Vagrant<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="http://vagrantup.com/" target="_blank">Vagrant</a> is a great tool for creating a VM on a whim and tearing it down so that you can start all over again. It helps to start from a clean state when you are testing deployments and setups. Vagrant requires <a href="https://www.virtualbox.org/" target="_blank">VirtualBox</a> and is written in Ruby.<br>
<br>
Following is a step-by-step walk-through of how to set up and use Vagrant on a Mac.<br>
<br>
</div><a href="http://pyfunc.blogspot.com/2011/11/using-vagrant.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com6tag:blogger.com,1999:blog-9083221708779697597.post-54900276342718092392010-11-02T11:58:00.000-07:002011-12-13T10:22:11.364-08:00Learning Twisted (part 8) - Anatomy of deferreds in TwistedThere are numerous posts and documents giving a conceptual explanation of one of the central concepts in the Twisted framework: the deferred. <br>
<br>
The book on Twisted network programming offers an analogy: deferreds are like the buzzers a restaurant owner hands to waiting visitors. The buzzer notifies the visitor that the table is ready; he can set aside whatever he has been doing and come over to occupy the table meant for him.<br>
<br>
Others describe a deferred as a placeholder for a promise that is yet to be fulfilled. We can attach actions that should follow when the promise is fulfilled or breached; these actions form callback chains that are triggered when the deferred fires.<br>
<br>
Deferreds thus let you register follow-up actions for something that will take time to complete. This frees Twisted to attend to other tasks and come back to execute the follow-up actions once the result is available. <br>
<br>
I will keep myself to code commentary and the current behavior of deferreds.<br>
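To keep the mental model concrete before diving into Twisted's code, here is a toy stand-in (not Twisted's Deferred, which is far richer, with errbacks, chained deferreds and more) showing how a result flows through the callback chain when the deferred fires:

```python
# Toy illustration of a deferred's callback chain.
class MiniDeferred:
    def __init__(self):
        self.callbacks = []
        self.fired = False
        self.result = None

    def addCallback(self, fn):
        if self.fired:
            # Registered after firing: run immediately against the result.
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        # Fire the deferred: each callback receives the previous one's result.
        self.fired = True
        self.result = result
        for fn in self.callbacks:
            self.result = fn(self.result)

d = MiniDeferred()
d.addCallback(lambda r: r + 1)
d.addCallback(lambda r: r * 10)
d.callback(4)
print(d.result)  # 50
```

Note how a callback added after the deferred has fired still runs, against the stored result; Twisted's deferreds behave the same way.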
<a href="http://pyfunc.blogspot.com/2010/11/learning-twisted-part-8-anatomy-of.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com11tag:blogger.com,1999:blog-9083221708779697597.post-75351019197828828542010-10-29T15:48:00.000-07:002011-12-13T10:23:43.811-08:00Tracing call flows in PythonPython decorators come in handy when you want to intercept a piece of a call flow and profiling feels too verbose.<br>
I use this technique quite often to analyze a Python program and understand it better. <br>
<br>
Consider the following contrived Python code to illustrate this approach to tracing call flows.<br>
<br>
<pre class="brush: python">def f():
    f1('some value')

def f1(result):
    print result
    f2("f1 result")

def f2(result):
    print result
    f3("f2 result")
    fe("f2 result")
    return "f2 result"

def f3(result):
    print result
    return "f3 result"

def fe(result):
    print result

f()
</pre><br>
Output: <br>
<pre>some value
f1 result
f2 result
f2 result
</pre><br>
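The decorator itself can be as small as this (a sketch in modern Python; the version developed later in the post may differ): it wraps a function so that every call logs the entry, the arguments and the return value.

```python
# A minimal tracing decorator: prints entry, arguments and return value.
import functools

def trace(fn):
    @functools.wraps(fn)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print("-> %s%r" % (fn.__name__, args))
        result = fn(*args, **kwargs)
        print("<- %s returned %r" % (fn.__name__, result))
        return result
    return wrapper

@trace
def f3(result):
    return "f3 result"

f3("f2 result")
```

Decorating each function in the snippet above with @trace prints the whole call flow without touching the function bodies.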
<a href="http://pyfunc.blogspot.com/2010/10/tracing-callflows-in-python.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com4tag:blogger.com,1999:blog-9083221708779697597.post-4476523942473034372010-10-21T15:00:00.000-07:002010-10-22T17:29:18.292-07:00Before taking a dip into haskellI have been itching to start learning another language, and have been perusing rather voluminous opinions on the net about which language to learn.<br>
Too many opinions can freeze you from doing anything. In any case, I have taken the plunge and will start learning haskell, keeping a commentary on it here.<br>
<br>
Before I do that, I really wanted to have <a href="http://github.com/mrueegg/haskell_syntax_highlighter">Haskell syntax highlighting</a> support in blogger.<br>
<br>
I am yet to test it though, so here is a snippet that should come out highlighted. Of course, this code is not mine; it just serves to confirm that the highlighting works.<br>
<br>
<pre class="brush: hs" name="code">module Main where
main = putStrLn "Hello, World!"
</pre><a href="http://pyfunc.blogspot.com/2010/10/before-taking-dip-into-haskell.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com2tag:blogger.com,1999:blog-9083221708779697597.post-69908210231351272112010-10-20T00:26:00.000-07:002011-12-13T10:23:43.806-08:00Buildbot - Issue with svn pollerThe SVN poller may miss a check-in, depending on the poll interval. <br>
<br>
The current behavior of the poller is <br>
<br>
The poller polls the version control system and stores the last change (version number). Subsequent changes are noticed as log entries. These log entries are marked with the timestamp at which the changes were noticed, and are used to create change objects that are then passed to the scheduler to trigger builds. The scheduler sees several change objects with the same timestamp and picks only the latest one to trigger a build.<br>
<br>
<b>The issue with this model is that if there are multiple changes within a single polling interval, this poller will result in triggering build only for the last one.</b><br>
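The collapse can be illustrated with a simplified model (hypothetical sketch, not buildbot code): several commits noticed in one polling pass all share a "seen" timestamp, and a scheduler that keys on timestamps keeps only one of them.

```python
# Three commits are noticed in the same polling pass, so they all get the
# same "seen" timestamp; a scheduler that picks the latest change per
# timestamp triggers only one build.
polled_changes = [
    {"revision": 101, "seen_at": 1000},
    {"revision": 102, "seen_at": 1000},
    {"revision": 103, "seen_at": 1000},
]

def builds_triggered(changes):
    # Keep only the latest change object per timestamp, as described above.
    latest = {}
    for change in changes:
        latest[change["seen_at"]] = change
    return [c["revision"] for c in latest.values()]

print(builds_triggered(polled_changes))  # [103] - revisions 101 and 102 never build
```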
<a href="http://pyfunc.blogspot.com/2010/10/buildbot-issue-with-svn-poller.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com2tag:blogger.com,1999:blog-9083221708779697597.post-37423165509531611012010-10-13T23:38:00.001-07:002010-10-21T15:45:12.012-07:00Python wisdom from stackoverflow #1I started participating on "stack overflow" hoping to improve my knowledge of topics of interest. What could be better than answering questions, working on problems posted by users and looking at the answers provided by various folks from the community? <br>
<br>
In many posts, I found very elegant ways of attacking a problem that I had never thought of. It was clear that there are nuggets of wisdom buried in "stack overflow", and that it would be difficult to go back and find them later. So I started collecting weekly wisdom on my topic of interest, which is usually Python programming. The good thing is that these will be unrelated snippets; the bad thing is that there isn't any central theme to these posts. <br>
<br>
Starting with this post, I will try to pull some neat solutions provided there for reference and later perusal.<br>
<br>
<b>#1 : rounding numbers to two decimal places</b><br>
<br>
<pre class="brush: python">anFloat = 1234.55555
print repr(round(anFloat, 2))
# Output : 1234.5599999999999 (binary floats cannot represent 1234.56 exactly)
rounded = "%.2f" % round(anFloat, 2)
print rounded
# Output: 1234.56
</pre><a href="http://pyfunc.blogspot.com/2010/10/python-wisdom-from-stackoverflow-1.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com1tag:blogger.com,1999:blog-9083221708779697597.post-9138654174394338792010-10-13T10:31:00.000-07:002011-12-13T10:23:43.825-08:00Setting up buildbot - customizing configuration fileThe crux of BuildBot involves a master and multiple build slaves that can be distributed across many computers.<br>
<br>
Each Builder is configured with a list of BuildSlaves that it will use for its builds. Within a single BuildSlave, each Builder creates its own SlaveBuilder instance.<br>
Once a SlaveBuilder is available, the Builder pulls one or more BuildRequests off its incoming queue. These requests are merged into a single Build instance, which includes the SourceStamp describing the exact version of the source code to be used for the build. The Build is then randomly assigned to a free SlaveBuilder and the build begins.<br>
<br>
All this is configured via a single file called master.cfg, a dictionary of various keys that configures the buildbot when it starts up.<br>
Open up the sample "master.cfg" that comes with the buildbot distribution, drop it into the master directory that you created, and start hacking on it.<br>
<br>
I have listed a few important configuration keys that should get you started.<br>
Below is the dictionary instance that is populated in the configuration file:<br>
<pre class="brush: python">c = BuildmasterConfig = {}
</pre><a href="http://pyfunc.blogspot.com/2010/10/setting-up-buildbot-customizing.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com1tag:blogger.com,1999:blog-9083221708779697597.post-17285667231674050842010-10-06T16:15:00.000-07:002011-12-13T10:22:11.344-08:00Learning Twisted (part 7) : Understanding protocol class implementationIn my last post, I focused on the protocol factory class, the various methods it needs to provide and the code flow within which those methods get invoked.<br>
Here we will look into the structure of the protocol class, the various methods it needs to provide and the context in which they are called.<br>
<br>
There are two ways to look this up and learn it:<br>
<br>
<ul><li>Look at the interface definition: IProtocol(Interface) in interfaces.py</li>
<li>As in my previous post, supply a protocol class with no methods and look at the traceback to understand the code flow</li>
</ul><br>
So, the usual imports for writing a custom protocol:<br>
<br>
<pre class="brush: python">from twisted.web import proxy
from twisted.internet import reactor
from twisted.internet import protocol
from twisted.python import log
import sys
log.startLogging(sys.stdout)
</pre><br>
It is much better to derive from protocol.Protocol to build a custom protocol; it does a few things for you.<br>
<blockquote class="left">Any intricate logic should be built using the connect, disconnect and data-received event handlers, plus the methods that write data onto the connection.</blockquote>The <b>makeConnection</b> method sets the transport attribute and also calls the <b>connectionMade</b> method; you can use this to start communicating once the connection has been established.<br>
The <b>dataReceived</b> method is called when there is data to be read off the connection. <b>connectionLost</b> is called when the transport connection is lost for some reason. To write data on the connection, you use the transport method <b>self.transport.write</b>; this adds the data to a buffer which will be sent across the connection. To make twisted send the buffer immediately, you can call <b>self.transport.doWrite</b>.<br>
<br>
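Putting those methods together, here is a minimal echo protocol written against that interface. To keep it runnable without a reactor, a stub transport (FakeTransport, my own stand-in) replaces the one Twisted would supply; the method names follow the interface above, but this is an illustrative sketch, not Twisted code:

```python
# Echo protocol sketch following the Protocol lifecycle described above.
class FakeTransport:
    """Stand-in for the transport Twisted passes to makeConnection."""
    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)

class Echo:
    def makeConnection(self, transport):
        # Twisted's base class does this for you, then calls connectionMade.
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        print("connection established")

    def dataReceived(self, data):
        # Echo whatever arrives back over the connection.
        self.transport.write(data)

    def connectionLost(self, reason=None):
        print("connection lost")

proto = Echo()
transport = FakeTransport()
proto.makeConnection(transport)
proto.dataReceived(b"hello")
print(transport.written)  # [b'hello']
```

With Twisted itself, you would subclass protocol.Protocol, hand the class to a factory, and let the reactor drive these calls.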
<a href="http://pyfunc.blogspot.com/2010/10/learning-twisted-part-7-understanding.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com2tag:blogger.com,1999:blog-9083221708779697597.post-40929800501086460172010-09-30T11:38:00.000-07:002010-10-14T00:39:47.165-07:00Tools that I find useful with macHere is my list of useful tools on the Mac:<br>
<br>
<a href="http://notational.net/">Notational Velocity</a> is a cool way to keep textual notes.<br>
<br>
I always had the chore of manually deleting archives after extraction; <a href="http://wakaba.c3.cx/s/apps/unarchiver">The Unarchiver</a> helps with that.<br>
<br>
Want to turn your favorite websites into Mac desktop applications? Use <a href="http://fluidapp.com/">Fluid</a>.<br>
<br>
<a href="http://pyfunc.blogspot.com/2010/09/tools-that-i-find-useful-with-mac.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com3tag:blogger.com,1999:blog-9083221708779697597.post-37716292143961462242010-09-27T16:25:00.000-07:002011-12-13T10:23:43.794-08:00Using buildbot for continuous integration developmentContinuous integration, in its simplest form, embodies certain agile tenets: frequent integration of code, and automated verification of the integrated code to give the team continuous feedback on development and reduce the heartburn of large integrations. It also prevents broken builds from silently creeping into the code repository. At the heart of this process is a tool, integrated with the code check-in workflow, that triggers automated testing of frequently checked-in development artifacts.<br>
<br>
This gives the developer immediate feedback and assurance that things are moving in a positive direction.<br>
<blockquote class="left">Buildbot is a "continuous integration" tool.</blockquote>BuildBot can automate the compile/test cycle required by most software projects to validate code changes.<br>
<br>
I had a chance to set it up some time back. What follows is a snippet of that experience: getting it up and running quickly.<br>
<br>
<a href="http://pyfunc.blogspot.com/2010/09/using-buildbot-for-continuos.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com0tag:blogger.com,1999:blog-9083221708779697597.post-22851779187400935712010-09-23T13:39:00.000-07:002010-10-14T00:39:16.996-07:00Ubantu on Mac OSX using VirtualBox<span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"><span class="Apple-style-span" style="font-size: 11px;"></span></span><br>
<span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"><span class="Apple-style-span" style="font-size: 11px;"><div>I installed Ubuntu on Mac OS X using VirtualBox some time back. The installation went fairly easily, except that I had to figure out how to increase the resolution from the default 800X600.</div><div><br>
</div><div>Here is a step-by-step approach to installing and using VirtualBox.</div><div></div></span></span><a href="http://pyfunc.blogspot.com/2010/09/ubantu-on-mac-osx-using-virtualbox.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com2tag:blogger.com,1999:blog-9083221708779697597.post-45817136266170313862010-09-22T14:48:00.000-07:002011-12-13T10:23:43.802-08:00Python and binary data - Part 3The file operations we normally use are line-oriented:<br>
<pre class="brush: python">FILE = open(filename,"w")
FILE.writelines(linelist)
FILE.close()

FILE = open(filename,"r")
for line in FILE.readlines(): print line
FILE.close()
</pre><br>
We can also use byte-oriented I/O operations on these files.<br>
<pre class="brush: python">FILE = open(filename,"r")
FILE.read(numBytes) # This reads up to numBytes bytes from the file.
</pre>But if the file contains non-textual data, the contents read this way may not be meaningful.<br>
<br>
It is much better to open such a file in binary mode:<br>
<pre class="brush: python">FILE = open(filename,"rb")
FILE.read(numBytes)
</pre><br>
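Once a file is open in binary mode, the struct module (a sketch here, not covered above) converts between Python values and the raw bytes you read or write:

```python
# Write two integers and a float as packed binary data, then read them back.
import os
import struct
import tempfile

record = struct.pack("<iif", 7, 42, 2.5)  # little-endian: int, int, float

path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:      # binary mode, as recommended above
    f.write(record)

with open(path, "rb") as f:
    raw = f.read(struct.calcsize("<iif"))  # read exactly one record

print(struct.unpack("<iif", raw))  # (7, 42, 2.5)
```

The format string pins down the byte order and field sizes, so the same record can be read back on any platform.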
<a href="http://pyfunc.blogspot.com/2010/09/python-and-binary-data-part-3.html#more">Read more »</a>Ashish R Vidyarthihttp://www.blogger.com/profile/14497850886027578534noreply@blogger.com55