12.26.07

Realizing AtomPub entries from XML using ROXML

Posted in Ruby, blog, rails at 11:00 am by Robert Horvick

I have done some digging and there do not seem to be any plugins or gems (or even active projects on RubyForge) that provide support for an AtomPub server. I did find a sample based on Camping, but it was not really what I was looking for. It supports a subset of AtomPub, and not in a way that is really friendly for what I need.

So I set out on my own.

An AtomPub entry can be as simple as this:

<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title>ATOM Post Test</atom:title>
  <atom:id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</atom:id>
  <atom:content>Some text.</atom:content>
</atom:entry>

Or quite a bit more complex (lifted from http://tools.ietf.org/html/rfc4287):

<entry>
  <title>Atom draft-07 snapshot</title>
  <link rel="alternate" type="text/html"
    href="http://example.org/2005/04/02/atom"/>
  <link rel="enclosure" type="audio/mpeg" length="1337"
    href="http://example.org/audio/ph34r_my_podcast.mp3"/>
  <id>tag:example.org,2003:3.2397</id>
  <updated>2005-07-31T12:29:29Z</updated>
  <published>2003-12-13T08:29:29-04:00</published>
  <author>
    <name>Mark Pilgrim</name>
    <uri>http://example.org/</uri>
    <email>f8dy@example.com</email>
  </author>
  <contributor>
    <name>Sam Ruby</name>
  </contributor>
  <contributor>
    <name>Joe Gregorio</name>
  </contributor>
  <content type="xhtml" xml:lang="en"
    xml:base="http://diveintomark.org/">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p><i>[Update: The Atom draft is finished.]</i></p>
      </div>
  </content>
</entry>

I started with just caring about title and content. Everything else I could infer from authentication (author, anyway).

With this thought, I created some simple xpath queries to find the title and content. That was working out. I could get the atom response created and the post serialized to the database. So with just a few lines of code (beyond the scaffolding) I was able to use Live Writer to post via atompub to my blog engine.
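For illustration, here is a minimal sketch of that kind of XPath lookup. It assumes REXML from the Ruby standard library (this post does not say which library those first queries used), and the helper name extract_title_and_content is hypothetical. It only pulls the title and content text and ignores everything else in the entry.

```ruby
require 'rexml/document'

# Prefix-to-URI mapping used only inside the XPath expressions below.
ATOM_NS = { 'atom' => 'http://www.w3.org/2005/Atom' }

# Hypothetical helper: pull just the title and content text from an Atom entry.
def extract_title_and_content(xml)
  doc     = REXML::Document.new(xml)
  title   = REXML::XPath.first(doc, '/atom:entry/atom:title', ATOM_NS)
  content = REXML::XPath.first(doc, '/atom:entry/atom:content', ATOM_NS)
  # Either element may be missing, so guard before asking for its text.
  { title: title && title.text, content: content && content.text }
end

sample = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title>ATOM Post Test</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <content>Some text.</content>
  </entry>
XML

post = extract_title_and_content(sample)
puts post[:title]
puts post[:content]
```

The namespace hash matters: Atom elements live in the http://www.w3.org/2005/Atom namespace, so the queries bind a prefix to that URI instead of matching on bare element names.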

But I wasn’t happy with just winking away the atompub protocol in favor of getting done quickly.

So I sat down and really started reading the Atom publishing spec (and the related Atom RFC). I have a lot of work to do.

I started using XmlSimple to parse the XML. That was ok, except some things don’t map well to hashes and it did not get me closer to serializing atom entries to xml (for the feed later on).

So I moved down to REXML and that was working ok. But I still wasn’t happy. I felt like I was writing too much monkey code and not making progress on my real goal.
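To give a feel for the hand-written REXML code this refers to, here is a rough sketch of serializing a minimal entry element by element. The helper name build_entry and the choice of fields are assumptions for illustration, not the code from the blog engine.

```ruby
require 'rexml/document'

# Hypothetical helper: build a minimal Atom entry by hand with REXML.
def build_entry(title, id, content)
  entry = REXML::Element.new('entry')
  # With a single argument, add_namespace sets the default namespace (xmlns="...").
  entry.add_namespace('http://www.w3.org/2005/Atom')
  entry.add_element('title').text = title
  entry.add_element('id').text = id
  entry.add_element('content', 'type' => 'text').text = content
  entry
end

entry = build_entry('ATOM Post Test',
                    'urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a',
                    'Some text.')
puts entry.to_s
```

Every field needs its own line here, and reading an entry back needs a matching set of lookups, which is the repetitive part a declarative mapping avoids.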

Finally I found ROXML - and now I’m happy. I’ll cut to the chase and just show the code:

require 'rubygems'
require 'roxml'

class AtomBase
  include ROXML

  def to_s
    self.to_xml
  end
end

class AtomAuthor < AtomBase
  xml_name "author"
  xml_text :name
  xml_text :uri
  xml_text :email
end

class AtomContributor < AtomBase
  xml_name "contributor"
  xml_text :name
  xml_text :uri
  xml_text :email
end

class AtomLink < AtomBase
  xml_name "link"
  xml_attribute :rel
  xml_attribute :type
  xml_attribute :href
  xml_attribute :length
end

class AtomContent < AtomBase
  xml_name "content"
  xml_attribute :type
  xml_attribute :lang
  xml_attribute :base
  xml_text :text, nil, ROXML::TEXT_CONTENT
end

class AtomSummary < AtomBase
  xml_name "summary"
  xml_attribute :type
  xml_attribute :lang
  xml_attribute :base
  xml_text :text, nil, ROXML::TEXT_CONTENT
end

class AtomEntry < AtomBase
  xml_name "entry"
  xml_text :title
  xml_object :link, AtomLink, ROXML::TAG_ARRAY
  xml_text :id
  xml_object :summary, AtomSummary
  xml_text :updated
  xml_text :published
  xml_object :author, AtomAuthor
  xml_object :content, AtomContent
  xml_object :contributor, AtomContributor, ROXML::TAG_ARRAY
end

The major gap is that this does not support xhtml content and summary fields. Those fields contain embedded XML and I have not figured out how to get ROXML to stop navigating the tree and make the object property return the inner xml.

I did try hacking ROXML to support an XML_CONTENT tag. It solved about 70% of the problem but it was not working exactly as I wanted after about a half hour and I felt like I was shaving a yak. I can come back to that later on. For now the atom bits are neatly wrapped up behind a class with a known bug. I’m ok with that.
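For reference, the behaviour I am after for xhtml content looks roughly like the sketch below. It uses plain REXML rather than ROXML, so it is not a fix for the classes above, and the helper name inner_xml is hypothetical. The idea is to stop at the content element and hand back its children as a markup string.

```ruby
require 'rexml/document'

# Hypothetical helper: serialize an element's children instead of walking into them.
def inner_xml(element)
  element.children.map(&:to_s).join.strip
end

sample = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title>Atom draft-07 snapshot</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml"><p><i>[Update: The Atom draft is finished.]</i></p></div>
    </content>
  </entry>
XML

doc     = REXML::Document.new(sample)
atom_ns = { 'atom' => 'http://www.w3.org/2005/Atom' }
content = REXML::XPath.first(doc, '//atom:content', atom_ns)

markup = inner_xml(content)
puts markup
```

The returned string keeps the embedded div and its children as markup, which is what a content property would need to return for type="xhtml" entries.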

I haven’t reviewed the atom syndication spec to make sure I’m doing everything right but it’s working with all the samples I’ve thrown at it (including the samples from the spec) and Live Writer seems to be having a field day with it. So that’s good.

Anyway - that’s where I’m heading.

I have the feeling that I’m reinventing the wheel. I hope not poorly.


12.25.07

Just what the world doesn’t need … another blog engine.

Posted in Ruby, blog, rails at 9:36 pm by Robert Horvick

Nuby on Rails recently asserted that “every beginning Rails developer should write their own blog software. It’s a great learning experience and you can try things that aren’t possible with just an app running on localhost.”

I can buy into this.  It’s the canonical Rails sample app.  I’m new to Ruby and Rails.  Maybe I should do this.  And I don’t mean a little toy that I do in 15 minutes and then throw away.  I mean the blogging platform I use for my primary blog.

So I sat down and tried to list out what I would want my blog to look like …

  1. I don’t want to spend time writing code for a text editor to write blog posts
  2. I want people to be able to subscribe to my content
  3. I want caching (not that I plan to get dugg - but it’s good learning)
  4. I want to be able to easily customize the look/feel
  5. I want to support tagging and providing post views based on those tags
  6. I want to support uploading media files (images, etc)
  7. I want to be able to import this blog into it.

So I spent some time researching those issues and this is what I’ve come up with:

  1. I will post via atom publishing - there are already blog tools that support this (Microsoft Live Writer, for example - I’m sure there are others).
  2. I will provide atom feeds.
  3. Rails has lots of caching examples.  I’ll start with file system based caching and move upwards from there.
  4. I won’t provide full theme support but I’ll have a generic post format and include user defined stylesheets (and possibly javascript).
  5. acts_as_taggable until shown why not.
  6. atompub can do this.
  7. I think I can use atom feeds to do this.

Also …

As mentioned I’m hosting with SliceHost.com on a 256 slice so I need to keep it trim.  I may go to production on Sqlite3 (if I do the caching right this shouldn’t be a problem - and that would keep 4-10% of my limited RAM free).

Also my goal is to write as little new code as possible.  Plugins and gems whenever possible.

I figure it will take about a month to have something I can show to the world and not worry about hiding my face.

Over the next few weeks I’ll be blogging about what I learn that seems interesting.  And plenty that isn’t, I’m sure.
