I Love Python: Twitter Feed/Content Auth and Scrape in One HTTP Request

February 21st, 2009 by gloriajw

This is some lingering code from a GrrlCamp project, written by Gaba. It is such a handy, great little nugget of code. It logs into a Twitter account and scrapes either the content or the RSS feed (your choice), in one fell swoop, using Base64 encoding of the login and password in the HTTP request header. Very clever:

import urllib2, base64
import sys
import feedparser
#import configuration

class Page:

    def __init__(self):
        self.data = {}
        self.data["url"] = 'http://www.twitter.com/gloriajw'
        self.data["username"] = 'gloriajw'
        self.data["password"] = 'XXXXXXXXXX'

        self.data["urlrss"] = 'http://twitter.com/statuses/user_timeline/18107956.rss'

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self,key, value):
        self.data[key] = value

    def getContent(self):
        base64string = base64.encodestring('%s:%s' % (self.data['username'], self.data['password']))[:-1]
        authheader =  "Basic %s" % base64string

        req = urllib2.Request(self.data["url"])
        req.add_header("Authorization", authheader)
        try:
            handle = urllib2.urlopen(req)
        except IOError, e:                  # here we shouldn't fail if the username/password is right
            print "It looks like the username or password is wrong."
            sys.exit(1)

        return handle.read()

    def getRSS(self):
        base64string = base64.encodestring('%s:%s' % (self.data['username'], self.data['password']))[:-1]
        authheader =  "Basic %s" % base64string

        req = urllib2.Request(self.data["urlrss"])
        req.add_header("Authorization", authheader)
        try:
            handle = urllib2.urlopen(req)
        except IOError, e:                  # here we shouldn't fail if the username/password is right
            print "It looks like the username or password is wrong."
            sys.exit(1)

        return handle.read()

    def getData(self):
        """auth = urllib2.HTTPBasicAuthHandler()
        auth.add_password('BasicTest', 'twitter.com', self.data['username'], self.data['password'])

        return feedparser.parse('http://www.twitter.com/statuses/user_timeline/18107956.rss', handlers=[auth])
        """

        return feedparser.parse('http://%s:%s@twitter.com/statuses/user_timeline/18107956.rss' % (self.data['username'], self.data['password']))

To invoke it:

class Data:
    def __init__(self, entries):
        self.entries = entries

    def save(self):
        pass

    def parse(self):
        pass

    def imprimir(self):
        for item in self.entries:
            print item.title

And this:

def main():
    page = Page()

    statuses = page.getData().entries

    data = Data(statuses)
    data.save()

    data.imprimir()

if __name__ == "__main__":
    main()

This code will also be attached, in case of copy/paste mangling.

In both getContent() and getRSS(), Gaba constructs the HTTP request header so that the Base64-encoded username and password are passed in the Authorization field; getData() instead puts the credentials in the feed URL and lets feedparser handle the authentication. This is easier than making two requests and maintaining session cookies. Very nice indeed. Bear in mind that Base64 is an encoding, not encryption, so over plain HTTP the credentials can be read by anyone watching the connection. This can be used to sign into any web site which accepts HTTP Basic authentication headers (there are different types of HTTP authentication: BASIC, DIGEST, FORM, and CLIENT-CERT).
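For readers on Python 3, where urllib2 and base64.encodestring no longer exist, here is a minimal sketch of the same header construction. The function names basic_auth_header and build_request, the example.com URL and the credentials are placeholders for illustration, not part of Gaba's code, and the request is only built, never sent.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    # The credentials are joined with a colon and then Base64-encoded.
    # Base64 is reversible, so this hides nothing from an eavesdropper.
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(url, username, password):
    """Build (but do not send) a request that carries the credentials."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", basic_auth_header(username, password))
    return req


if __name__ == "__main__":
    # Placeholder URL and credentials, for illustration only.
    req = build_request("http://example.com/feed.rss", "someuser", "secret")
    print(req.get_header("Authorization"))
```

Passing the returned request to urllib.request.urlopen() would perform the authenticated fetch in a single round trip, which is the point of the original code.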

It is left as an exercise for you to get the content (not the feed) and use BeautifulSoup to extract the data portions. If you want to try this, and need help, post questions here.

Enjoy,
Gloria


Don't go splurging at the widget store

February 15th, 2009 by sarah g

It is easy for clients, I have noticed, to mistakenly conflate adding widgets, effects and acronyms (Sliders, Sorters, Expanding-Menus, Oh My!) with implementing an idea. Clients talk excitedly, rattling off a Rube Goldberg chain of widget-to-widget interactions, their voices rising, each and every widget in the chain perceived as critical to the achievement of the Internet Holy Grail: Angel Investment. Or at least, a really slick site.

Don’t get me wrong. I think everyone is in favor of a well-placed widget.

They can be so smooth and beautiful that you gasp. They can glow yellow for just the correct duration before fading to white (“where has that beautiful apparition gone?”, you wonder, before drunkenly clicking again. And again. And again.). They can add an item to a list almost magically: never was it so fun to have so many things To Do. They can save you clicks, keep you in one place, slide items into carts with almost illicit ease.

In short, they can make things so simple that a tear comes to your eye, and you rush off, hat in hand, in the quest of The Holy Spinner to deliver your payload.

The Holy Spinner

But stop.

What are you looking for? Forget the elevator pitch, as it can be intoxicating: the sound of your voice, people nodding enthusiastically, the doors shut blocking their escape (especially if you are stuck between floors). Instead, do the quiet room test. You alone. Your idea. Naked. A convergence of souls.

“What do you need, Idea”, you ask, “in order to fully manifest your glorious Idea-ness?”

If your idea is quiet, do not rush to speak for it. If your idea speaks but is simple, do not scoff. Do not dress your idea up in Widget Drag, so it looks like a teenager searching for their identity at the Web 2.0 Store. If your idea does not need a Yellow Fade or slider, that is OK. If you remove the slider and yellow fade and find there is no idea underneath, that’s OK, too. Go for a walk. Another idea will come.

Think about building a UI like listening to the ones that you love. You observe them. You listen to their likes and dislikes so your gifts will please them, not reflect your tastes. It’s not about the shiny present: it’s about the connection, the need anticipated and met, a little bit of the edge taken off. Brush cleared, the path made simpler.

If you’re tempted to drive up to your date in the red Corvette of ideas (or wow your user with the accordion navigation ’cause it, like, opens and closes!), remember that you might be saying more about yourself than anything else.

And then ask yourself: Do you need that rainbow-colored slider on your site?


Good programmers aren't lazy

February 8th, 2009 by sarah g

It goes something like this. A really good programmer, who obviously does not have a lazy cell in his or her brain (the body is another matter entirely!) declares, “I program because I’m too lazy to do the same thing again and again!”.

Variations abound, all coming back to the idea that “laziness” == an aversion to repetition.

Really?

It’s a fallacy, like the skinny girl standing in front of you declaring, “I’m so fat!”. It’s a cry for attention that points out the obvious opposite, and generates a protest, spoken or not. “No, you’re not! You’re (skinny|smart|efficient)!”

Some of the worst code I’ve ever written has been when I’m in genuine lazy mode. Laziness makes you not mind if you copy and paste a function 40 times without abstracting it. Laziness lets you avoid figuring out how to do something better, because you really don’t want to think about it. Laziness lets you put off finding a solution, copy code without understanding it, hard-code values in your applications, or stuff inline CSS in your HTML tags ‘just for now’.

Laziness is the ability to tolerate boredom and inefficiency. Good programmers lack that tolerance.

For programmers who still hide behind that phrase, it’s time to drop it. Try something else, like “I am unable to be bored”, “I cannot tolerate repetition”, or “my skin itches and I start squawking when I see Dreamweaver-generated JavaScript”.

Simplicity and elegance are hard. Striving for them is many things, but not lazy.


Using YUI DataTable with Rails

February 7th, 2009 by sarah g

I am currently working on a Rails app that integrates the YUI dataTable, and in going through the tutorials I noticed they all assume a PHP back-end. I also saw a number of people asking how to get this to work with a Rails controller, so I thought I’d write up my experience in the hopes that it helps someone else. For basic info about setting up the dataTable, refer to the YUI site. I’m also going to try to clarify a few things that I found a bit obscure.

To create a dataTable you have to define a few basic ingredients:

  1. A dataSource. This defines where the info in the table comes from, and what format it is returned in. I’m using JSON.
  2. A schema that you define. This is a part of the dataSource and is essentially a map to your data. The schema tells the table where to find the field values.
  3. An array of column definitions. You supply the name and “key” of each column and any additional information, such as whether it is editable (and if so, which editor to use) and how to format the data inside the column if you don’t want to just spit it out directly.

Let’s start with a dataSource. In this example, we’re making a table of tasks, so I want to hit my tasks controller to return the data. Standard Rails controller action. In the code snippet below,

  1. In line 1, I supply a URL (note that the URL ends in .json, and in my controller I have a respond_to block which constructs my JSON response).
  2. In line 2, I’m telling the dataSource: hey, you’re gonna get some JSON.
  3. Line 3 is where we define the responseSchema, which has two critical parts: the resultsList and the fields.

  var dataSource = new YAHOO.util.XHRDataSource("/projects/13/elements/21/tasks.json");
  dataSource.responseType = YAHOO.util.XHRDataSource.TYPE_JSON;
  dataSource.responseSchema = {
    resultsList: "Resources.data",
    fields: [
      {key:"task.id"},
      {key:"task.status"},
      {key:"task.percent_complete"},
      {key:"task.description"},
      {key:"task.due_date", parser:"date"} // assumes YUI's built-in "date" parser was intended
    ]
  };

Let me be clear since I got hung up on this a bit. You as the application developer are responsible for two things. One, constructing your response: you can build your JSON response any way you want, with any keys, values and hierarchical structure that make sense (though it will help you to think it through and standardize it; Satyam has some good info on that). And two, telling your dataSource how to navigate your response: where does it find the data you’re looking for (the needle) in the response (the haystack of JSON)? This is the schema, the map of your data.

Controller code that creates my custom JSON:

# tasks_controller.rb
# Assumes @tasks and permission are set up elsewhere (for example in a before_filter).
def index
  respond_to do |format|
    format.json {
      tasks = []
      @tasks.each do |task|
        task_container = task
        task_container['editable_by_user'] = permission.edit? # some metadata I use
        task_container['deletable_by_user'] = permission.delete?
        task_container['resource_name'] = "task"
        tasks << task_container
      end
      data = {"Resources" => {"data" => tasks}}
      render :json => data.to_json
    }
  end
end

And just for fun, here’s a look at the JSON my controller gives me back when I hit this url ‘/projects/13/elements/21/tasks.json’.
The JSON:

{"Resources": {"data": [
  {"task": {"status": "not_started", "started_on": null, "updated_at": "2009-02-07T22:03:18Z",
    "project_id": 13, "percent_complete": null, "high_priority": null, "element_id": 32,
    "deletable_by_user": true, "completed_on": null, "editable_by_user": true,
    "element_title": "looking good", "id": 50, "created_by_id": 7, "resource_name": "task",
    "description": "Pass the stimulus bill", "assignments": [], "due_date": "", "users": [],
    "resource_url": "/projects/13/elements/32/tasks/50", "due": null,
    "created_at": "2009-02-07T22:03:18Z"}},
  {"task": {"status": "not_started", "started_on": null, "updated_at": "2009-02-07T22:03:39Z",
    "project_id": 13, "percent_complete": null, "high_priority": null, "element_id": 32,
    "deletable_by_user": true, "completed_on": null, "editable_by_user": true,
    "element_title": "looking good", "id": 51, "created_by_id": 7, "resource_name": "task",
    "description": "Negotiate without pre-conditions", "assignments": [], "due_date": "", "users": [],
    "resource_url": "/projects/13/elements/32/tasks/51", "due": null,
    "created_at": "2009-02-07T22:03:39Z"}}
]}}

So that’s it! I’ve defined this JSON array and built it, then returned it from the controller when I get a .json request.

Now that you’ve seen the response, take another look at the responseSchema (you created above) and the two properties that you set:

  1. resultsList. Notice it is set to “Resources.data”, which are the JSON keys I used. It uses dot syntax to point to the array of tasks in my JSON.
  2. fields. Again using dot syntax, and having the resultsList array to navigate through, it can pull the specific values it wants from each item in the list; so ‘task.status’, ‘task.description’, etc., will retrieve those values from the response.

Make sense?
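If the dot syntax still feels abstract, the lookup can be imitated in a few lines of Python. This is only an illustration of the idea, not how YUI implements it; the helper names resolve and extract_rows are made up for this sketch, and the sample response is a trimmed copy of the JSON shown earlier.

```python
import json


def resolve(obj, path):
    """Follow a dot-separated path such as "Resources.data" through nested dicts."""
    for part in path.split("."):
        obj = obj[part]
    return obj


def extract_rows(response_text, results_list, fields):
    """Apply a schema (a results-list path plus field keys) to a JSON response."""
    response = json.loads(response_text)
    rows = resolve(response, results_list)  # the array of tasks
    # For every item in the array, pull out each declared field by its dotted key.
    return [{key: resolve(row, key) for key in fields} for row in rows]


if __name__ == "__main__":
    # Trimmed-down version of the controller response.
    sample = """{"Resources": {"data": [
        {"task": {"id": 50, "status": "not_started", "description": "Pass the stimulus bill"}},
        {"task": {"id": 51, "status": "not_started", "description": "Negotiate without pre-conditions"}}
    ]}}"""
    for row in extract_rows(sample, "Resources.data", ["task.id", "task.description"]):
        print(row)
```

The first path finds the haystack's one interesting array; the field keys are then resolved relative to each element of that array, which is why they start with "task." rather than "Resources.".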

You are now one step away from being able to see your table. So, create your column definitions, an array of data about each column in the table. This is also where you can specify formatting and editing information (not covered in this article).

var columnDefinitions = [
  {key:"task.status", formatter:"formatPriority", sortable:true},
  {key:"task.percent_complete", label:"Percent Complete", sortable:true},
  {key:"task.description", label:"Description"},
  {key:"task.due_date", label:"Due Date", editable:true, sortable:true,
   formatter:YAHOO.widget.DataTable.formatDate,
   editor: new YAHOO.widget.DateCellEditor({resource:'task', updateParams:"task[due]"})}
];

Notice that each column definition has a key: this key must be accessible in your JSON, AND it must be defined as a field.
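A mismatch between column keys and schema fields is an easy mistake to make, so it can be worth checking before the table is built. The sketch below shows the check in Python on plain dictionaries that mirror the definitions above; missing_field_keys is a hypothetical helper, not a YUI function, and the "task.owner" column is invented to show a failure.

```python
def missing_field_keys(column_definitions, schema_fields):
    """Return the column keys that have no matching entry in the schema fields."""
    field_keys = {field["key"] for field in schema_fields}
    return [col["key"] for col in column_definitions if col["key"] not in field_keys]


if __name__ == "__main__":
    fields = [
        {"key": "task.id"},
        {"key": "task.status"},
        {"key": "task.percent_complete"},
        {"key": "task.description"},
        {"key": "task.due_date"},
    ]
    columns = [
        {"key": "task.status"},
        {"key": "task.description"},
        {"key": "task.owner"},  # invented column with no matching field
    ]
    print(missing_field_keys(columns, fields))
```

An empty result means every column has a field behind it; anything listed needs to be added to the responseSchema fields.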

And then you create your table. The first argument is the id of a div on the page to which the table will be attached; the last is an optional configuration hash (pagination, anyone?).

var dataTable = new YAHOO.widget.DataTable("project_tasks", columnDefinitions, dataSource, optionalConfigurationHash);

I hope this is helpful to get you started with the YUI dataTable on Rails. I may do further posts on XHR editing of individual cells in a dataTable using Rails.
