My new blog is here

Tom Coates has slides up from his talk. You really need to hear it too to get the full flavour.

Now, we enter a new era! Hoorah!

I love the fact that the government think that more people applying to university provides "a complete vindication" of their fee policy. No, it doesn't! People still need a university degree, even though most jobs won't actually need the skills or training that university tuition provides - or they need skills which could be acquired in a much easier way (through, say, employer training or self-training). If the government think that they are ushering in some kind of new Enlightenment by sending lots of people to have a distinctly average education, they need to think again.

Some new survey results have come out saying that 54% of Americans would not vote for a well-qualified atheist for president. Did anyone tell them that Karl Rove is an agnostic? Oh, wait... perhaps not the best example of secular morality and piety.

BarCamp London 2: Semantic Web and microformats

I have put up the slides of my talk on RDF, microformats and the Semantic Web. Ian Forrester shot some video, which I'll post as soon as it's available.

In the meantime, read it here - it's in S5, but it's liberally laced with lots and lots of hyperlinks to interesting stuff.


BeautifulSouping Twitter

I'm here with Aral Balkan and we're working on scraping Twitter to provide functions that the Twitter API doesn't currently support. Aral just released TwitAPI, a PHP screen scraper based on regular expressions.

Aral's written some regular expressions to pull the data out of the direct messages. I'm doing it with Python's BeautifulSoup.

Here are the BeautifulSoup recipes ('n' is the BeautifulSoup instance; x is the index to loop over).

User URL: n.findAll(True, {"class": "status_actions"})[x].parent.contents[5].contents[1].contents[0]['href']

User Name: n.findAll(True, {"class": "status_actions"})[x].parent.contents[5].contents[1].contents[0].contents

Comment: n.findAll(True, {"class": "status_actions"})[x].parent.contents[5].contents[2].string.strip()

Fucked-up Twitter timecode: n.findAll(True, {"class": "status_actions"})[x].parent.contents[5].contents[3].contents[1].string.strip()
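For anyone who wants to see the looping written out, here is a rough sketch of the same idea. It makes several assumptions: it uses the current bs4 package (where findAll is spelled find_all), the SAMPLE markup is made up for illustration and is not Twitter's real HTML, and extract_messages is a hypothetical helper name. It also looks elements up by tag name rather than by the positional contents[...] indices in the recipes above, because those indices only make sense against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT Twitter's real HTML: just enough structure
# to show one "status_actions" cell per message row.
SAMPLE = (
    '<table>'
    '<tr><td class="body"><a href="http://twitter.com/alice">alice</a>'
    '<span class="text"> hello world </span></td>'
    '<td class="status_actions"></td></tr>'
    '<tr><td class="body"><a href="http://twitter.com/bob">bob</a>'
    '<span class="text"> second message </span></td>'
    '<td class="status_actions"></td></tr>'
    '</table>'
)


def extract_messages(html):
    """Return one dict per element carrying the status_actions class."""
    n = BeautifulSoup(html, "html.parser")
    messages = []
    # Same anchor as the recipes: every tag with class "status_actions".
    for actions in n.find_all(True, {"class": "status_actions"}):
        row = actions.parent  # the recipes also start from .parent
        link = row.find("a")
        text = row.find("span", {"class": "text"})
        messages.append({
            "user_url": link["href"],
            "user_name": link.get_text(),
            "comment": text.get_text().strip(),
        })
    return messages


messages = extract_messages(SAMPLE)
for message in messages:
    print(message["user_name"], message["user_url"], message["comment"])
```

Looking things up by tag name is less brittle than counting positions, since positional indices shift whenever whitespace or an extra element appears in the page.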

Once I've figured out how to do HTTP Basic authentication using urllib2, the Twitter parser can be released unto the world!
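As a starting point, here is a minimal sketch of one way to attach HTTP Basic credentials to a request. It assumes Python 3, where urllib2's request machinery lives in urllib.request. The URL, username, password and the helper name basic_auth_request are all placeholders invented for the example, not anything from Twitter.

```python
import base64
import urllib.request


def basic_auth_request(url, username, password):
    """Build a Request that carries an HTTP Basic Authorization header."""
    # Basic auth is just "username:password", base64-encoded.
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder URL and credentials, for illustration only.
req = basic_auth_request("https://example.com/direct_messages", "user", "secret")

# urllib.request.urlopen(req) would actually send it; that call is left out
# here so the sketch makes no network request.
```

Setting the header directly sends the credentials on the first request. The standard library also offers HTTPBasicAuthHandler, which instead waits for the server to answer with a 401 challenge before supplying them.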



Last modified: Sunday, August 26, 2007 at 11:32 AM.

