Kamaelia: The future of Python Frameworks looks promising.

May 10th, 2008 by gloriajw

Kamaelia is a general purpose Python framework developed by BBC Research.
http://kamaelia.sourceforge.net/Home

It is what I would consider a second-generation framework. It has mature feature and plug-in support and a naturally extensible model, two things that most frameworks either lack or make too difficult to be practical. Audio and video plug-in support is amazingly clean in the Kamaelia examples. The Kamaelia kernel, named Axon, is the core component for module execution. Module intercommunication happens via pipes, so the context-switch slowness that affects threaded apps is not an issue here.

Kamaelia also supports OS-level threading, but this support seems to be buried in the TCP protocol module. Nevertheless, the snap-together pipe model is tempting for any app, since it’s quite easy to grok. The code examples are exactly what Python enthusiasts have come to expect: clean, short, and almost intuitive.
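
To make the snap-together idea concrete, here is a minimal sketch in plain Python 3. It does not use Kamaelia or Axon at all: the Component, Producer, Upper, Collector and run_pipeline names are made up for illustration, and the assumption is simply that each component is a generator with an inbox and an outbox, stepped in turn by a tiny scheduler in a single thread.

```python
from collections import deque


class Component:
    """Illustrative component with one inbox and one outbox (not Kamaelia's class)."""

    def __init__(self):
        self.inbox = deque()
        self.outbox = deque()

    def main(self):
        """Generator: do a little work, then yield control back to the scheduler."""
        raise NotImplementedError


class Producer(Component):
    """Emits one item per scheduler step, then finishes."""

    def __init__(self, items):
        super().__init__()
        self.items = list(items)

    def main(self):
        for item in self.items:
            self.outbox.append(item)
            yield


class Upper(Component):
    """Upper-cases everything that arrives in its inbox."""

    def main(self):
        while True:
            while self.inbox:
                self.outbox.append(self.inbox.popleft().upper())
            yield


class Collector(Component):
    """Stores everything it receives so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def main(self):
        while True:
            while self.inbox:
                self.seen.append(self.inbox.popleft())
            yield


def run_pipeline(components, rounds=10):
    """Step each component in turn, then move each outbox into the next inbox."""
    tasks = [c.main() for c in components]
    for _ in range(rounds):
        for i, task in enumerate(tasks):
            if task is None:
                continue
            try:
                next(task)
            except StopIteration:
                tasks[i] = None  # this component has finished
        for upstream, downstream in zip(components, components[1:]):
            while upstream.outbox:
                downstream.inbox.append(upstream.outbox.popleft())


source = Producer(["hello", "world"])
sink = Collector()
run_pipeline([source, Upper(), sink])
print(sink.seen)  # ['HELLO', 'WORLD']
```

Everything runs in one thread and each component gives up control voluntarily at its yield, which is the property that sidesteps thread context switches in this toy version. How Axon actually schedules and links components is best checked against the Kamaelia documentation.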

Like many great tools, it comfortably straddles the line between tool and toy, with clean integration of pygame, as well as support for many common protocols (HTTP, Torrent, etc.) and audio and video codecs (Vorbis, Wav, etc.), supplemented by a solid engine.

The component list for Kamaelia is called the Component Toybox, setting the tone for this project. We are encouraged to play, and novice contributors are encouraged to join and contribute. This is what I love the most about this project. It is unpretentious, approachable by anyone who wants to learn and contribute, and it is well organized and well designed. The documentation flows smoothly, although I’d love to see more detail on helping newbies find, install and configure all of the dependencies, and the dependencies of those dependencies, for each architecture. Pygame, for example, needs the SDL development libs to run the provided examples. As it stands, newbies are stuck calling friends like me to explain this to them and help them past the installation hump. This is a difficult problem to resolve in any toolset that depends on many external toolsets, each with its own development path and practices. Maybe the Python buildout tool, plus some additional scripting, could resolve this issue?

This framework is exciting. It opens up the possibility of faster web- and app-based integration of tools and tricks. It makes you ponder the infinite possibilities of nested protocol support, not just encapsulation of protocols within HTTP. Python development just keeps getting better and more fun, and this is most certainly a project to watch for ideas and possibilities of things to come.

Gloria


I Love Python: BBC Language web scrape and encode to disk in 54 lines.

April 17th, 2008 by gloriajw

This module scrapes the BBC language web site (http://www.bbc.co.uk/worldservice/languages/) for sample text from all 35 languages offered. It encodes the text snippets, writes each one to its own file, then test-reads one sample file.

The encoding requirements took some digging through obscure docs, but the rest wasn’t so bad. If you want to know how to write Unicode text in many languages to files in Python, this is for you.

# Python 2 code (urllib2, print statements), using BeautifulSoup 3.
import urllib2
import codecs
import BeautifulSoup
import re
import os

class GetBBC:
	def __init__(self):
		print "In constructor"
		self.language_links = []
		self.dir = 'BBC_Language_pages'
		# Create the output directory; ignore the error if it already exists.
		try:
			os.makedirs(self.dir)
		except OSError:
			pass

	def getLanguageChoices(self):
		lang_page = urllib2.urlopen("http://www.bbc.co.uk/worldservice/languages/").read()
		self.soup = BeautifulSoup.BeautifulSoup(lang_page)
		# Match every class that starts with "langtext", so langtexttop matches too.
		links = self.soup.findAll(attrs={'class':re.compile('^langtext')})
		for x in links:
			self.language_links.append(x)
			print "Appending %s with link %s " % (x.a.string,x.a['href'])

		print "There are %d language choices for the BBC news page!" % len(self.language_links)

	def archiveLanguagePages(self):
		# Save one file per language, named after the link text.
		os.chdir(self.dir)
		for x in self.language_links:
			lang_page = urllib2.urlopen('http://www.bbc.co.uk' + x.a['href']).read()
			clean_page = BeautifulSoup.BeautifulSoup(lang_page).prettify()
			rawfile = codecs.open(x.a.string,'wb+','ISO8859-1')
			rawfile.write(unicode(clean_page,'ISO8859-1'))
			rawfile.close()
			print "Saved the %s page." % x.a.string
		os.chdir('..')

	def readLanguagePage(self,language):
		os.chdir(self.dir)
		rawfile = codecs.open(language,'rb','ISO8859-1')
		contents = rawfile.read()
		rawfile.close()
		os.chdir('..')
		# Return the decoded text, not the (now closed) file object.
		return contents

if __name__ == "__main__":
	x=GetBBC()
	x.getLanguageChoices()
	x.archiveLanguagePages()
	y = x.readLanguagePage('Portuguese')
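
The link-matching step in getLanguageChoices can also be tried offline. The following is a rough Python 3 sketch that uses only the standard library's html.parser in place of BeautifulSoup. The LangLinkParser and find_language_links names and the sample markup are invented for illustration; only the "langtext" class prefix comes from the code above, and the real BBC page may be structured differently.

```python
import re
from html.parser import HTMLParser


class LangLinkParser(HTMLParser):
    """Collect (label, href) for links inside elements whose class matches a pattern."""

    def __init__(self, class_pattern=r"^langtext"):
        super().__init__()
        self.class_re = re.compile(class_pattern)
        self.links = []        # (label, href) pairs found so far
        self._open_tag = None  # tag name of the matching element we are inside
        self._depth = 0        # nesting depth of that tag name
        self._href = None      # href of the <a> currently being read
        self._label = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._open_tag is None:
            if self.class_re.search(attrs.get("class") or ""):
                self._open_tag, self._depth = tag, 1
        elif tag == self._open_tag:
            self._depth += 1
        if self._open_tag and tag == "a" and self._href is None and attrs.get("href"):
            self._href, self._label = attrs["href"], []

    def handle_data(self, data):
        if self._href is not None:
            self._label.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._label).strip(), self._href))
            self._href = None
        if tag == self._open_tag:
            self._depth -= 1
            if self._depth == 0:
                self._open_tag = None


def find_language_links(html):
    """Return every (label, href) pair found under a matching element."""
    parser = LangLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical markup, standing in for the real page.
SAMPLE = """
<table>
<tr><td class="langtexttop"><a href="/portuguese/">Portuguese</a></td></tr>
<tr><td class="langtext"><a href="/russian/">Russian</a></td></tr>
<tr><td class="other"><a href="/help/">Help</a></td></tr>
</table>
"""

for label, href in find_language_links(SAMPLE):
    print("Appending %s with link %s" % (label, href))
```

Unlike x.a in the BeautifulSoup version, which takes only the first link under each match, this sketch keeps every link it finds inside a matching element.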

There are languages for which ISO8859-1 encoding may not work, so you may need to experiment with other codecs, particularly for languages not supported by the BBC.
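
One way to run that experiment is to try each candidate codec on the text before writing it out. Here is a small Python 3 sketch of the idea. The pick_encoding and save_and_reload helpers, the candidate list and the sample headlines are my own illustrations rather than part of the original module, and it uses the built-in open() with an encoding argument where the Python 2 code uses codecs.open.

```python
import os
import tempfile


def pick_encoding(text, candidates=("iso8859-1", "utf-8")):
    """Return the first codec in candidates that can encode text without error."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this codec cannot represent some character; try the next
        return name
    raise ValueError("no candidate codec can encode this text")


def save_and_reload(text, directory, filename):
    """Write text with a codec that fits it, then read it back with the same codec."""
    encoding = pick_encoding(text)
    path = os.path.join(directory, filename)
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(text)
    with open(path, "r", encoding=encoding, newline="") as handle:
        return encoding, handle.read()


workdir = tempfile.mkdtemp()
for language, sample in [("Portuguese", "Notícias"), ("Russian", "Новости")]:
    encoding, text = save_and_reload(sample, workdir, language)
    print("%s saved as %s, round trip ok: %s" % (language, encoding, text == sample))
```

The Portuguese sample fits in ISO8859-1, while the Cyrillic one does not and falls through to UTF-8. The reader has to use the same codec the writer chose, which is why save_and_reload returns it.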

I wrote this in May 2007, as a language support test for GrrlCamp, which is an online Open Source development group for women. We will be recruiting again in late June. If you are female, interested in volunteering development effort in exchange for learning, and have at least 6 hours free each week to do cutting edge fun Python design and development in a supportive and great online community, please post your email address and we will get back to you.

Gloria

