Tuesday, November 2, 2010

Learning Twisted (part 8) - Anatomy of deferreds in Twisted

There are numerous posts and documents that explain, at a conceptual level, one of the central concepts in the Twisted framework: the deferred.

The book on Twisted network programming offers an analogy: a deferred is like the buzzer a restaurant owner hands to a visitor. The buzzer notifies the visitor that the table is ready, so he can set aside whatever he has been doing and come over to occupy the table meant for him.

Others describe a deferred as a placeholder, or a promise that is yet to be fulfilled. We can attach actions that should follow when the promise is fulfilled or broken. These actions form callback chains that are triggered when the deferred fires.

Deferreds allow you to create follow-up actions for something that will take some time to complete. This in turn frees Twisted to attend to other tasks and come back to execute the follow-up actions once the result is available.
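
To make the callback-chain idea concrete, here is a minimal toy sketch in plain Python. ToyDeferred, add_callbacks and fire are names made up for this illustration; this is not Twisted's Deferred class, and it leaves out much of what the real one does (for example, running callbacks that are added after the result has arrived).

```python
class ToyDeferred:
    """Toy placeholder for a future result; NOT twisted.internet.defer.Deferred."""

    def __init__(self):
        self._callbacks = []  # list of (on_success, on_failure) pairs

    def add_callbacks(self, on_success, on_failure=None):
        self._callbacks.append((on_success, on_failure))
        return self  # allow chaining

    def fire(self, result):
        # Walk the chain: each handler receives the previous handler's result.
        for on_success, on_failure in self._callbacks:
            handler = on_failure if isinstance(result, Exception) else on_success
            if handler is None:
                continue  # no handler for this case, pass the result along
            try:
                result = handler(result)
            except Exception as exc:
                result = exc  # switch to the failure side of the chain
        return result


# The "buzzer" goes off with a table number.
d = ToyDeferred()
d.add_callbacks(lambda table: "seated at table %d" % table)
d.add_callbacks(str.upper, lambda failure: "no table: %s" % failure)
outcome = d.fire(7)

# The promise is broken instead: only the failure handler runs.
e = ToyDeferred()
e.add_callbacks(lambda table: table, lambda failure: "no table: %s" % failure)
failed = e.fire(RuntimeError("kitchen closed"))

print(outcome)
print(failed)
```

The point of the sketch is only the shape: handlers are registered up front, and they run in order when the result (or the failure) finally shows up.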

I will confine myself to code commentary and the current behavior of deferreds.

Friday, October 29, 2010

Tracing call flows in Python

Python decorators come in handy when you want to intercept a piece of the call flow and a profiler seems just too verbose.
I use this quite often to analyze a Python program and understand it better.

Consider the following contrived piece of Python code, used to illustrate this approach to tracing call flows.

def f():
    f1('some value')


def f1(result): 
    print result
    f2("f1 result")
    

def f2(result): 
    print result
    f3("f2 result")
    fe("f2 result")
    return "f2 result"

def f3(result): 
    print result
    return "f3 result"

def fe(result): 
    print result

f()

Output:
some value
f1 result
f2 result
f2 result

Thursday, October 21, 2010

Before taking a dip into Haskell

I have been itching to start learning another language. I have been perusing the rather voluminous opinions on the net about which language to learn.
Too many opinions can freeze you into doing nothing. In any case, I have taken the plunge and will start learning Haskell, keeping a commentary on it here.

Before I do that, I really wanted to have Haskell syntax highlighting support in Blogger.

I have yet to test it, though, so here is a snippet that should be highlighted. Of course, this code is not mine; it just serves to confirm that highlighting works.

module Main where

main = putStrLn "Hello, World!"
         

Wednesday, October 20, 2010

Buildbot - Issue with svn poller

The SVN poller may miss a check-in, depending on the poll interval.

The current behavior of the poller is as follows.

The poller polls the version control system and stores the last change (version number). Subsequent changes are reported as log entries. These log entries are marked with the timestamp at which the changes are noticed, and are used to create change objects that are then passed to the scheduler to trigger builds. The scheduler sees these change objects with the same timestamp and picks the latest change object to trigger the build.

The issue with this model is that if there are multiple changes within a single polling interval, the poller triggers a build only for the last one.
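
The effect can be reproduced with a small simulation. This is only a sketch of the behavior described above and uses no Buildbot code; poll, pick_for_build and the revision numbers are made up for the illustration.

```python
def poll(new_revisions, noticed_at):
    """Stamp every revision found in one poll with the time of that poll."""
    return [{"revision": rev, "when": noticed_at} for rev in new_revisions]


def pick_for_build(changes):
    """Keep only the newest change among those sharing a timestamp."""
    latest = {}
    for change in changes:
        latest[change["when"]] = change  # later entries overwrite earlier ones
    return [latest[when] for when in sorted(latest)]


# Three check-ins land before the first poll, one more before the second.
changes = poll([101, 102, 103], noticed_at=1000) + poll([104], noticed_at=1600)
built = [change["revision"] for change in pick_for_build(changes)]
print(built)
```

Revisions 101 and 102 never get a build of their own, because they share a timestamp with 103.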

Wednesday, October 13, 2010

Python wisdom from Stack Overflow #1

I started participating on Stack Overflow hoping to improve my knowledge of topics that interest me. What could be better than answering and working on problems posted by users, and also looking at the answers provided by other folks in the community?

In many posts I found very elegant ways of attacking a problem that I had never thought of. It was clear that there are nuggets of wisdom buried in Stack Overflow, and that it would mostly be difficult to go back and find them. So I started collecting weekly wisdom on my topic of interest, which is usually Python programming. The good thing is that these will be unrelated snippets; the bad thing is that there isn't any central theme to these posts.

Starting with this post, I will try to pull out some neat solutions provided there for reference and later perusal.

#1 : Round numbers to two decimal places

anFloat = 1234.55555
print(round(anFloat, 2))
# The result is still a float; older interpreters may display it as 1234.5599999999999
rounded = "%.2f" % round(anFloat, 2)
print(rounded)
# Output: 1234.56

Setting up buildbot - customizing configuration file

The crux of BuildBot is a master and multiple build slaves that can be distributed across many computers.

Each Builder is configured with a list of BuildSlaves that it will use for its builds. Within a single BuildSlave, each Builder creates its own SlaveBuilder instance.
Once a SlaveBuilder is available, the Builder pulls one or more BuildRequests off its incoming queue. These requests are merged into a single Build instance, which includes the SourceStamp that describes exactly which version of the source code should be used for the build. The Build is then randomly assigned to a free SlaveBuilder and the build begins.

All this is configured via a single file called master.cfg, which populates a dictionary whose keys are used to configure the buildbot when it starts up.
Open up the sample "master.cfg" that comes with the buildbot distribution, drop it in the master directory that you have created, and start hacking on it.

I have listed a few important configuration settings that should get you started.
Below is the dictionary instance that is populated in the configuration file:
c = BuildmasterConfig = {}
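
As a rough orientation, a skeleton of that dictionary might look like the sketch below. The key names are the ones I remember from the sample master.cfg of this era of Buildbot, so treat them as an assumption and check them against the sample shipped with your version; all values are placeholders.

```python
# Skeleton only: every value here is a placeholder, not a working setup.
c = BuildmasterConfig = {}

c['slaves'] = []          # BuildSlave objects (name and password) go here
c['slavePortnum'] = 9989  # TCP port the build slaves connect to (example value)
c['change_source'] = []   # where changes come from, e.g. an SVN poller
c['schedulers'] = []      # decide when a change should trigger a build
c['builders'] = []        # what a build does and which slaves may run it
c['status'] = []          # status targets such as the web page

c['projectName'] = "example-project"  # hypothetical project name
```

The buildmaster reads the BuildmasterConfig name, which is why the sample file binds both c and BuildmasterConfig to the same dictionary.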

Wednesday, October 6, 2010

Learning Twisted (part 7) : Understanding protocol class implementation

In my last post, I focused on the protocol factory class, the methods it needs to provide, and the code flow within which these methods get called.
Here we will look into the structure of the protocol class, the methods it needs to provide, and the context in which they are called.

There are two ways to look this up and learn it:

  • Look at the interface definition: IProtocol(Interface) in interfaces.py
  • As I did in my previous post, supply a protocol class with no methods and look at the traceback to understand the code flow

So, the usual imports for writing a custom protocol:

from twisted.web import proxy
from twisted.internet import reactor
from twisted.internet import protocol
from twisted.python import log
import sys
log.startLogging(sys.stdout)

It is much better to derive from protocol.Protocol when building a custom protocol. It does a few things for you.
Any intricate logic should be built using the connection-made, connection-lost and data-received event handlers, plus the methods that write data onto the connection.
The makeConnection method sets the transport attribute and also calls the connectionMade method. You can use connectionMade to start communicating once the connection has been set up.
The dataReceived method is called when there is data to be read off the connection. connectionLost is called when the transport connection is lost for some reason. To write data on the connection, you use the transport method self.transport.write. This adds the data to a buffer which will be sent across the connection. To make Twisted send the buffered data immediately, you can call self.transport.doWrite.
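
The order in which these hooks fire can be mimicked without Twisted at all. The sketch below uses stand-in classes of my own (RecordingTransport and GreeterProtocol); they only borrow the method names discussed above and are not the real protocol.Protocol or a real transport.

```python
class RecordingTransport:
    """Stand-in for a transport: it only collects what is written to it."""

    def __init__(self):
        self.buffer = []

    def write(self, data):
        self.buffer.append(data)  # buffered, nothing is actually sent


class GreeterProtocol:
    """Mimics the hooks described above; does not inherit from Twisted."""

    def makeConnection(self, transport):
        self.transport = transport  # set the transport attribute first ...
        self.connectionMade()       # ... then announce the connection

    def connectionMade(self):
        self.transport.write(b"hello\n")

    def dataReceived(self, data):
        self.transport.write(b"echo: " + data)

    def connectionLost(self, reason):
        self.closed_because = reason


# Drive the protocol by hand, in the order the framework would.
transport = RecordingTransport()
proto = GreeterProtocol()
proto.makeConnection(transport)
proto.dataReceived(b"ping")
proto.connectionLost("connection closed")
print(transport.buffer)
```

With the real framework you would subclass protocol.Protocol and let the factory and reactor make these calls for you.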

Thursday, September 30, 2010

Tools that I find useful on the Mac

Here is my list of useful tools on the Mac:

Notational Velocity is a cool way to keep textual notes.

I always had the chore of manually deleting the archive after extraction; Unarchiver helps with that.

Want to turn your favorite websites into Mac desktop applications? Use fluidinfo.

Monday, September 27, 2010

Using buildbot for continuous integration development

Continuous integration, in its simplest form, embodies certain agile tenets: frequent integration of code, and automated verification of the integrated code to give the team continuous feedback on development and to reduce heartburn during large integrations. It also keeps broken builds from silently creeping into the code repository. At the heart of this process is a tool that can be integrated with the code check-in workflow to trigger automated testing of frequently checked-in development artifacts.

This gives the developer immediate feedback and the assurance that things are moving in a positive direction.
Buildbot is a "continuous integration" tool.
BuildBot can automate the compile/test cycle required by most software projects to validate code changes.

I had a chance to set it up some time back. What follows is a snippet of that experience of getting it up and running quickly.

Thursday, September 23, 2010

Ubuntu on Mac OS X using VirtualBox


I installed Ubuntu on Mac OS X using VirtualBox some time back. Installation went fairly smoothly, except that I had to find out how to increase the resolution from the default 800x600.

Here is a step-by-step approach to installing and using VirtualBox.

Wednesday, September 22, 2010

Python and binary data - Part 3

Normal file operations that we use are line oriented:
filename = "example.txt"                      # example value
linelist = ["first line\n", "second line\n"]  # example value
FILE = open(filename, "w")
FILE.writelines(linelist)
FILE.close()
FILE = open(filename, "r")
for line in FILE.readlines(): print(line)
FILE.close()

We can also use byte-oriented I/O operations on these files:
numBytes = 4                                  # example value
FILE = open(filename, "r")
FILE.read(numBytes)  # Reads up to numBytes bytes (characters, in Python 3 text mode).
FILE.close()
But if the file does not contain textual data, the contents may not be meaningful.

It is much better to open such a file in binary mode:
FILE = open(filename, "rb")
FILE.read(numBytes)
FILE.close()
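
Here is a small self-contained sketch of a binary round trip. The payload bytes are arbitrary values chosen for the illustration, and the file is a temporary one so nothing is left behind.

```python
import os
import tempfile

# Arbitrary bytes, including values that are not printable text.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:  # "wb": write raw bytes, no text translation
        f.write(payload)
    with open(path, "rb") as f:  # "rb": read raw bytes back
        head = f.read(4)         # up to 4 bytes
        rest = f.read()          # whatever remains
finally:
    os.remove(path)

print(head, rest)
```

In binary mode the bytes come back exactly as they were written, which is what you want before handing them to something that knows how to interpret them.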

Python and binary data - Part 2

In the previous post, I discussed how numbers (floating-point, signed and whole) are represented on a computer. In C, the number of bits used for a representation is limited. This means there is an inherent limit on what can be represented. It also means there is a danger of overflow when the result of an operation exceeds the bit limit and cannot be held in the bits available.

How about Python? How does it represent these numbers?

Even though Python's underlying implementation uses C or C++ types, Python integers are not like typical C integers.
Python integers have arbitrary precision.
This creates a higher level of abstraction for number representation.
Python represents integers by allocating the memory that is required to hold the number's value. Initially, the size is set to 32 or 64 bits, and it is increased when required. They can store a very large value, even an astronomical figure. Arbitrary-precision operations on these integers can be much slower than fixed-width arithmetic.
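
A quick way to see the difference is to push an integer past 64 bits. The sketch below also emulates C-style unsigned wraparound with a mask, purely for comparison; the variable names are mine.

```python
import sys

largest_u64 = 2 ** 64 - 1          # largest value a 64-bit unsigned C integer holds

grown = largest_u64 + 1            # Python simply grows the integer
wrapped = grown & (2 ** 64 - 1)    # what a 64-bit unsigned register would keep

big = 2 ** 200
print(grown, wrapped)
print(big.bit_length())                              # bits needed to hold the value
print(sys.getsizeof(largest_u64), sys.getsizeof(big))  # memory grows with the value
```

The exact byte counts reported by sys.getsizeof depend on the interpreter build, but the larger value always needs more memory.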

Tuesday, September 21, 2010

Algorithms in Python - Smallest free number

I have recently started reading the book "Pearls of Functional Algorithm Design" by Richard Bird. The book details various problems and their functional algorithmic solutions.
I thought it would be a good exercise to solve them in Python, and to attempt them in a functional programming language in the future.
Task: find the smallest natural number not in a finite set of numbers, X.
The first problem illustrates a programming task in which objects are named by numbers and X represents the set of objects currently in use. The task is to find the object with the smallest name not in use, i.e. not in X.

Of course, there are multiple ways to solve this problem, and I am not even trying to find the most elegant solution.
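
One straightforward way to do it in Python is sketched below. It is not the book's solution; smallest_free is my own name, and I am assuming that natural numbers start at 0.

```python
def smallest_free(xs):
    """Return the smallest natural number (counting from 0) not present in xs."""
    used = set(xs)
    # With n distinct numbers in use, at least one of 0..n must be free,
    # so the loop below always returns.
    for candidate in range(len(used) + 1):
        if candidate not in used:
            return candidate


print(smallest_free([0, 1, 2, 4, 5]))
print(smallest_free([3, 1, 2]))
print(smallest_free([]))
```

Building the set costs extra memory, but it makes each membership test cheap, so the whole thing runs in roughly linear time.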

Python and binary data - Part 1

All data is represented by ones and zeroes. However, a stream of binary data (ones and zeroes) can represent anything. Practically anything can be represented by multiple ones and zeroes strung together, along with a means of interpretation. The most common interpretation is textual, or ASCII, data.
If the representation format is not known, we simply refer to it as binary data.
This interpretation process is called decoding, and the reverse transformation to binary data is called encoding.
If the binary data is not an ASCII representation, you can't manipulate it in a text editor.
Python has a specific module called 'binascii' for transforming binary data to an ASCII representation and back.
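
A short round trip through binascii shows the idea. The raw bytes are arbitrary values picked for the illustration.

```python
import binascii

raw = bytes([0x00, 0xFF, 0x10, 0x68, 0x69])  # arbitrary binary data

hex_text = binascii.hexlify(raw)     # two ASCII hex digits per byte
b64_text = binascii.b2a_base64(raw)  # base64 text, ends with a newline

# Both representations are plain ASCII and convert back to the same bytes.
from_hex = binascii.unhexlify(hex_text)
from_b64 = binascii.a2b_base64(b64_text)

print(hex_text)
print(b64_text)
```

The ASCII forms are safe to paste into an editor or an e-mail, which the raw bytes are not.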

Friday, September 17, 2010

Learning Twisted (part 6) : Understanding protocol factory

Since most of us will be reusing the transport implementations that Twisted already provides, our focus will be on creating protocols and a protocol factory that ties a transport to a protocol instance.

from twisted.web import proxy
from twisted.internet import reactor
from twisted.python import log
import sys
log.startLogging(sys.stdout)

class myProtocolFactory():
    protocol = proxy.Proxy

reactor.connectTCP('localhost', 80, myProtocolFactory())
reactor.run()

This barebones version throws a whole lot of tracebacks, which makes the code flow a little easier to understand. You can keep supplying the missing methods and rerunning to see all the required methods of a protocol factory and how the code flows and is structured.

Friday, August 27, 2010

Learning Twisted (part 5) : Low-Level Networking

So far we have explored the basic pieces of the Twisted core: reactors, event handlers, and scheduled and looping calls. Twisted also provides support for cooperative multitasking, or loop parallelism.
In this post, I will explore the framework that implements low-level protocol and networking support in Twisted. This one, for me, was really twisted. No wonder it is named as such!

In retrospect, the generality of Twisted as a networking framework is paid for by the underlying complexity of its implementation. However, tackling this complexity does pay off in the awesome power this tool provides for building networking applications.

I am a picture guy. I can't hold all this dense information without some sort of picture in place. So I guess I will put the picture first and then delve into the details.

Sorry folks, no code in this post.

+----------------+       connectSSL       +-----------+
|PosixReactorBase|.......listenTCP .......|IReactorTCP|
+----------------+       connectTCP       +-----------+
                             |
                             |
    +----------+             |
    |IConnector|________connector obj <.......You pass a factory
    +----------+        obj -> connect        This is a protocol
     |   |                                    factory.
     |   |                                 ,-'
     |  stopDisconnecting               ,-'
     |  getDestination                ,'
     |   ..............> factory.doStart()
  connect..............> _makeTransport()
                               |
        +-------------+    connection obj
        |ISystemHandle|________|   |__calls factory.buildProtocol(addr)
        |             |                               |
        |ITCPTransport|.......>+----------+           |
        +-------------+        |ITransport|    +-------------+
                               +----------+    |IProtocol obj|
                                    |          +-------------+
                       .............:................
                       |                            |
                     Client..___                __Server
                                `'`---..----''''
                               These are initialized differently

Tuesday, August 24, 2010

Learning Twisted (part 4) : Cooperative multitasking

The event loop dispatches events to be handled by event handlers. If an event handler misbehaves and does not return immediately, the event (or reactor) loop cannot service other events. This means that all our event handlers need to return quickly for this model to work.
That is, they should be functions that have a substantial waiting period and are non-blocking. Typically, network or file I/O can be made non-blocking.

Twisted provides a facility to interleave the execution of various functions, so that the waiting period of one can be used to do something useful for other functions that are waiting to be serviced.

+--------------------------++--------------------------+
|task1 | wait period | comp||task2 | wait period | comp|
+--------------------------++--------------------------+
+--------------------------+
|task1 | wait period | comp|
+--------------------------+
       +--------------------------+
       |task2 | wait period | comp|
       +--------------------------+
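
The second picture can be imitated with plain Python generators. This is a toy round-robin scheduler of my own, not the facility Twisted provides; each yield marks a point where a task would otherwise be waiting.

```python
from collections import deque


def task(name, steps, log):
    """A task that does a little work, then hands control back."""
    for step in range(steps):
        log.append((name, step))
        yield  # "wait period": let the other tasks run


def run_round_robin(tasks):
    """Give each unfinished task one turn at a time until all are done."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue           # task finished, drop it
        queue.append(current)  # not finished, back of the line


log = []
run_round_robin([task("task1", 2, log), task("task2", 3, log)])
print(log)
```

Neither task blocks the other: their steps come out interleaved, as in the lower half of the diagram.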


Learning Twisted (part 3) : Scheduling with reactor

In previous installments, we found that Twisted provides methods to schedule a call at some later time, after the reactor loop starts running. It also provides a method for registering a call that will be made repeatedly after a delay.

This facility is provided in twisted.internet.task.

from twisted.internet import reactor
from twisted.python import log
import sys, time
from twisted.internet.task import LoopingCall, deferLater

log.startLogging(sys.stdout) 

Friday, August 20, 2010

Learning Twisted (Part 2): Async IO

  +---------------+
  |ILoggingContext|......... logPrefix
  +---------------+
  +---------------+
  |IFileDescriptor|'''''''''''  fileno
  +-------+-------+'''''''''''  connectionLost
        | |  +---------------+.......doRead
        | +--|IReadDescriptor|.....__init__ : open the descriptor or
        |    +---------------+                  
       +----------------+..|.......__init__ : open a connection
       |IWriteDescriptor|..|.........doWrite
       +--------.-------+  |
                |          |
         addWriter        addReader
               +-------------+
               |IReactorFDSet|
               +-------------+


Well, this will become clear soon.

We know that the core of Twisted is the reactor loop, which listens for events and dispatches those events to "event handlers" (the event-based computing paradigm). We also learnt how to generate custom events and how to register methods as "event handlers" for custom events and some special events.

Various frameworks are available within Twisted that provide facilities to register events and to add and remove the producers and consumers of these events. These frameworks usually follow certain well-known paradigms. We will follow through some of those implementations in a short while as we continue exploring the default reactor, "SelectReactor".

Non-blocking synchronization shows better performance than blocking synchronization in certain applications. Select-based non-blocking I/O allows us to implement this paradigm and is supported by SelectReactor.
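
The building block underneath is the select call from the standard library. The sketch below uses it directly on a connected socket pair; it shows only the idea that a reactor like this is built on, not SelectReactor itself, and the variable names are mine.

```python
import select
import socket

reader, writer = socket.socketpair()  # two connected sockets in one process
reader.setblocking(False)
writer.setblocking(False)

# Nothing has been written yet, so a zero-timeout select reports nothing readable.
readable, _, _ = select.select([reader], [], [], 0)
before = list(readable)

writer.send(b"event")

# Now select reports the reader as ready, and recv returns without blocking.
readable, _, _ = select.select([reader], [], [], 1.0)
data = reader.recv(1024) if reader in readable else b""

reader.close()
writer.close()
print(before, data)
```

A reactor loop repeats this select step over all registered descriptors and calls the matching handler for each one that is ready.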

Thursday, August 19, 2010

Learning Twisted (part 1) : reactor basics

Twisted is an event-based networking framework in Python. This description suffices as an initial impression of the framework.

This and subsequent posts will serve as my notes for learning Twisted.

At the core of event-based programming is the reactor loop, which runs endlessly unless asked to stop.
While in the loop, the reactor listens for events and dispatches them to event handlers for processing. The example below is the simplest way of starting up the reactor.

print("To stop this example, press ctrl-c")
from twisted.internet import reactor
reactor.run()

reactor.run() should be the last line of code that you execute directly. After this, control is handed to the reactor loop, which listens for events endlessly. Some of the methods of this class also allow you to register events and event handlers to process something real.

There are many reactors available, and they need to be installed before use. The default reactor is the select-based reactor, "SelectReactor".