Hello! ☞ This is an archived version of Al Shaw's personal website. The current site is http://shaw.al.


Introducing Homer: The Blogware-agnostic Feed-based Homepage Creator for the 'News Blob'

03/08/10
Future of News, Homer, Ruby, Sinatra, code

Adrian Holovaty, in his post “A fundamental way newspaper sites need to change,” writes (emphasis mine):

So much of what local journalists collect day-to-day is structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers. Yet the information gets distilled into a big blob of text — a newspaper story — that has no chance of being repurposed.

He has a point, and goes on to explain why we need a CMS that can hold semantic structured data because journalism on the web should be more than just blobs of text.

A Git “blob” object is nothing but a chunk of binary data.

This is totally correct, and the rapid proliferation of news apps and data-driven news organizations is an awesome thing, especially when it lets you do stuff like put in a zip code and see if the water you’ve been drinking is toxic, or how many people have been mugged in the last week on your block. But it also negates (or rather maybe tries to actively obsolesce) something very fundamental to journalism: the story. Holovaty says in the article: “Newspapers need to stop the story-centric world-view.” My question to this is, why?

Git Blob

There will always be a “news blob,” the fundamental block of data that every piece of news has and every blogging system can handle: a title (headline), a body (story) and a permalink. Bylines and photos are important too, but they can, in a sense, be subsumed into these three primary types. We’ve been so blinded by data that we’ve lost sight of the humble blob. Why is Twitter so popular? Because they’ve figured out how to distill these elements down to just one field, and limit the amount you can put into it.

The idea of a blob is so great because it can be anything. In a version control system like Git, a blob is just a chunk of data. It doesn’t care what you’re writing or how you organize it. It doesn’t even have awareness of files, but it can handle whatever you throw at it. When you’re on deadline, you need software that thinks like Git, that can handle the blob you’re throwing at it.

Why Write a CMS?

A big skill you grapple with as a newbie software developer is how to recognize solved problems. If you squint, there is usually an opportunity for a new app in the cracks between them. A while back there was a hilarious response on StackOverflow to a user who was asking for help parsing HTML with regular expressions. While the verbose answer is incredible, the simple answer is you don’t— you use a library to consume HTML. It’s a solved problem.

From a news point of view, getting story data from a writer and keeping it in a database is a solved problem. There is excellent free software out there that does this much better than I could ever hope to. What blogware does not do well is arrange stories hierarchically. Where Holovaty is right is that these blobs (and 99% of news on the web now is in blob-form) need to be repurposed, because blogware isn’t designed with news judgement in mind. This is why I wrote Homer. Homer is for news homepages.

How it works

Homer works through XML feeds, which I, yes, use a library to parse. Stories from those feeds can then be assigned to slots which live at arbitrary places on your homepage that you place where you want and label semantically. Once there, stories can be moved between slots or unassigned into oblivion. In a sense, Homer is a blog disaggregator. It allows you to add as many feeds as you want, and cherry-pick stories from them to live in your homepage until you bump them for other stories. It is transient, where blogware plans for permanence. It is serendipitous, where blogware plans for order. It is hand-curated, where blogware rewards predictability and automation. It is hierarchical, where blogware is chronological. Homer is to WordPress as TPUTH is to TechCrunch. In short, they work well together. To steal the tagline from Underscore.js, Homer is the tie to (Movable Type|WordPress|Tumblr|ExpressionEngine)’s tux.

While you can use Homer to “curate the web” because all you need to add a feed is its URL, its newsroom use case is to serve as your homepage and hand-arranged topic pages, sitting atop a sea of template-generated back pages.
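To make the feed-to-blob step concrete, here is a minimal sketch of turning an RSS feed into title/body/permalink blobs. Homer itself leans on the feedme gem for parsing; this sketch swaps in Ruby's bundled REXML library so it runs without extra gems. The names Blob, blobs_from_rss and SAMPLE_FEED are made up for illustration and are not part of Homer.

```ruby
require 'rexml/document'

# The three-field "news blob": headline, story text and permalink.
# (Hypothetical name, not Homer's own class.)
Blob = Struct.new(:title, :body, :permalink)

# Parse an RSS 2.0 document and return its first `limit` items as blobs.
# RSS feeds conventionally list the newest items first.
def blobs_from_rss(xml, limit = 5)
  doc = REXML::Document.new(xml)
  blobs = []
  doc.elements.each('rss/channel/item') do |item|
    blobs << Blob.new(
      item.elements['title']&.text,
      item.elements['description']&.text,
      item.elements['link']&.text
    )
  end
  blobs.first(limit)
end

# A tiny made-up feed so the example is self-contained.
SAMPLE_FEED = <<~XML
  <?xml version="1.0"?>
  <rss version="2.0">
    <channel>
      <title>Example Blog</title>
      <item>
        <title>City council passes budget</title>
        <description>The vote was 7 to 2.</description>
        <link>http://example.com/budget</link>
      </item>
      <item>
        <title>Water main breaks downtown</title>
        <description>Crews are on the scene.</description>
        <link>http://example.com/water-main</link>
      </item>
    </channel>
  </rss>
XML

blobs_from_rss(SAMPLE_FEED).each do |blob|
  puts "#{blob.title} -> #{blob.permalink}"
end
```

Everything downstream (slots, templates, publishing) only ever needs those three fields, which is why any blogware that emits a feed will do.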

Here’s how you use it:

Starting Page

The home page of Homer shows lists of the two basic objects in the app, homepages and feeds. Incidentally, these two objects have a has_and_belongs_to_many relationship in the database, so they can be reused. In other words, a feed can be used on as many homepages as you’d like after entering its title and URL once. The home page shows you the raw material you’ll use to construct your pages. The edit/delete nubbins allow you to change the file paths and titles of your homepages and titles/urls of feeds.
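For readers who haven't met has_and_belongs_to_many, the shape of the relationship can be sketched in plain Ruby. This is not Homer's ActiveRecord code; Feed, Homepage and Assignments are hypothetical stand-ins, with a set of pairs playing the role of the join table.

```ruby
require 'set'

# Hypothetical stand-ins for Homer's two basic records.
Feed = Struct.new(:title, :url)
Homepage = Struct.new(:title, :path)

# Stand-in for the join table behind has_and_belongs_to_many:
# each row pairs one homepage with one feed.
class Assignments
  def initialize
    @rows = Set.new
  end

  # Adding the same pair twice is harmless because rows live in a Set.
  def assign(homepage, feed)
    @rows << [homepage, feed]
  end

  def remove(homepage, feed)
    @rows.delete([homepage, feed])
  end

  def feeds_for(homepage)
    @rows.select { |hp, _| hp == homepage }.map { |_, feed| feed }
  end

  def homepages_for(feed)
    @rows.select { |_, f| f == feed }.map { |hp, _| hp }
  end
end

politics = Feed.new('Politics', 'http://example.com/politics.xml')
front    = Homepage.new('Front page', '/var/www/index.html')
topic    = Homepage.new('Election topic page', '/var/www/election.html')

pool = Assignments.new
pool.assign(front, politics)
pool.assign(topic, politics) # same feed, entered once, used on two pages

puts pool.homepages_for(politics).map(&:title).join(', ')
```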

Manage Homepage

The Manage page is where all the action happens in Homer. This is where you define slots for your homepage, assign feeds to your homepage, assign stories to slots, and rearrange stories within slots. Let’s look at each of these features in order:

Slots: When you create a new homepage, you’ll first want to create your slots. There is no inherent style to default Homer homepages, so you can be free to devise your own hierarchy. A feature might be the top story. Then a subfeature. Or maybe define them by location: left, middle, right. When you get to templating, Homer will auto-generate code to surface slots based on their labels. At this point, slot labels can’t be edited.

Slot Creation

Feeds: The feed menu allows you to assign/remove feeds from the general pool to this homepage. Once they have been assigned, they’ll show up in the left menu to be “refreshed” whenever you want a selection of stories to put into slots.

Feed Chooser

Assignment: There are two kinds of assignment in Homer: Assignment from feed stories to slots, and rearranging stories within slots. When you refresh a feed, its latest stories will show up in a yellow area on the right side of the screen. You can refresh multiple feeds at the same time, and they will all be mixed into that area. From there you can edit the titles, bodies and permalinks (click the down arrow next to the title to reveal the other fields) and assign them to slots via the dropdown menus. Once a story has been assigned to a slot, it will show up in a grey area at the top of the page with the dropdown already filled in for the slot it is currently assigned to. If you assign a story to an already-filled slot, it will “bump” that story in favor of your new story. Once your story is assigned, you can move it around by changing the slots in the dropdowns, and hitting save. Assigning a story to the “blank” option will unassign it.
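The assign, bump, move and unassign rules above can be sketched as a small in-memory model. This is an illustration under my own assumptions, not Homer's actual implementation; Story and SlotBoard are hypothetical names, and passing nil stands in for the “blank” dropdown option.

```ruby
# A story is the usual three-field blob.
Story = Struct.new(:title, :body, :permalink)

# Hypothetical in-memory model of one homepage's slots.
class SlotBoard
  def initialize(*labels)
    @slots = {}
    labels.each { |label| @slots[label] = nil }
  end

  # Put `story` in `slot` and return whichever story it bumped (or nil).
  # Passing nil as the slot unassigns the story.
  def assign(story, slot)
    unless slot.nil? || @slots.key?(slot)
      raise ArgumentError, "no such slot: #{slot}"
    end

    # A story lives in at most one slot, so clear wherever it was before.
    @slots.each_key { |label| @slots[label] = nil if @slots[label] == story }
    return nil if slot.nil?

    bumped = @slots[slot]
    @slots[slot] = story
    bumped
  end

  def [](slot)
    @slots[slot]
  end
end

board  = SlotBoard.new('feature', 'subfeature')
budget = Story.new('City council passes budget', 'The vote was 7 to 2.', 'http://example.com/budget')
water  = Story.new('Water main breaks downtown', 'Crews are on the scene.', 'http://example.com/water-main')

board.assign(budget, 'feature')
bumped = board.assign(water, 'feature') # the budget story gets bumped
puts "Bumped: #{bumped.title}"
board.assign(water, 'subfeature')       # move between slots
board.assign(water, nil)                # the "blank" option: unassign
```

Note that the bumped story is simply returned and forgotten; nothing holds on to it, which matches the transience described next.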

The assignment system is built with transience in mind. It is built for the present, the fierce urgency of the now. Call it opinionated software. Only the last 5 entries in a feed will be available when refreshed, and once a story has been unassigned, it is all but gone from the system. But don’t be afraid of this: your blogware of choice is good at keeping archives, and hopefully it will give you an automatically generated list of your posts somewhere. Homer is for bumping stories at will. A good workflow may be refreshing your feed and assigning a battery of stories to your slots in the morning, then cycling stories down the slots while putting new stories in the top slot until they fall off the page, a la Techmeme-style aggregators. If you must get a story back, try refreshing the feed from which it came and reassigning it. Homer doesn’t support dredging up bumped stories for reassignment because no one likes old news.
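The “cycle stories down the slots” workflow is done by hand in Homer's dropdowns, but the idea is easy to express in code. A minimal sketch, assuming slots are kept as an ordered array with the top of the page first; cycle_down is a hypothetical helper, not something Homer provides.

```ruby
# Push `new_story` into the top slot and shift every other story down one.
# Returns [new_slots, fallen], where `fallen` is the story that dropped
# off the bottom of the page (nil if the last slot was empty).
def cycle_down(slots, new_story)
  # With no slots at all, the new story has nowhere to go.
  return [[], new_story] if slots.empty?

  fallen = slots.last
  [[new_story] + slots[0...-1], fallen]
end

slots = ['Budget vote', 'Water main break', 'School board recap']
slots, fallen = cycle_down(slots, 'Election results')

puts "Top slot: #{slots.first}"
puts "Fell off: #{fallen}"
```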


Templating: While Homer won’t style your homepages for you, it will provide you with boilerplate code to start you off. After you’ve created all your desired slots, click on the Template button at the top of the page. You should see a bunch of code already in your editor. You’ll get a stock HTML wrapper, and for every slot, you’ll get one of these, assuming your slot is called feature:

<div id="feature">
 <% @feature = Ho.new(@homepage,'feature') %>
 <h2><a href="<%= @feature.url %>"><%= @feature.title %></a></h2>
 <p><%= @feature.body %></p>
</div>

The wrapper is a standard html div, and inside are slot-specific variables. Ho is the templating class for Homer, and line 2 uses it to instantiate a new slot on the page. Once you have done that, you can use the new variable, in this case @feature to output that slot’s attributes wherever you want. Homer uses ERB (Embedded Ruby) to output variables, so anything within <% and %> will be parsed as Ruby code. In the boilerplate, @feature.url is wrapped around @feature.title, followed by @feature.body in a paragraph block. You can, however, edit the code to output these variables wherever you want, as long as you instantiate the slot first. If you’d like, you can move all of the slot inits to the top of the page and out of the markup. Also, as of now, unless you have access to your server’s filesystem, you’ll need to write all your style inline much like you do with Tumblr templates.
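If you want to see how a template like this evaluates without running Homer, here is a minimal sketch using Ruby's standard ERB library. The Ho class below is a hypothetical stand-in that only mimics the interface described above (title, url, body); the real one reads from Homer's database. The Page class and the hash-based homepage are also assumptions for illustration. The slot init has been moved to the top of the template, out of the markup.

```ruby
require 'erb'

# Hypothetical stand-in for Homer's Ho class. Here a "homepage" is just
# a hash of slot label => story attributes.
class Ho
  attr_reader :title, :url, :body

  def initialize(homepage, slot)
    story = homepage.fetch(slot)
    @title = story[:title]
    @url = story[:url]
    @body = story[:body]
  end
end

# The slot is instantiated once at the top, then used inside the markup.
TEMPLATE = <<~HTML
  <% @feature = Ho.new(@homepage, 'feature') %>
  <div id="feature">
    <h2><a href="<%= @feature.url %>"><%= @feature.title %></a></h2>
    <p><%= @feature.body %></p>
  </div>
HTML

# Templates see the instance variables of whatever object supplies the binding.
class Page
  def initialize(homepage)
    @homepage = homepage
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

homepage = {
  'feature' => {
    title: 'City council passes budget',
    url: 'http://example.com/budget',
    body: 'The vote was 7 to 2.'
  }
}

puts Page.new(homepage).render(TEMPLATE)
```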

Aside— A few thoughts (caveats) on templating and Ruby DSLs: This is really only a first stab at a good templating language for Homer. I had a few more syntax ideas, and am still mulling the best way of doing it. Admittedly my current implementation is crufty. Ruby is really good for creating Domain Specific Languages (Sinatra itself is a good example), and I need to read up on best practices. This will be a major focus for the next version of Homer.

Once you have the code the way you like it, hit Save. This will also create a file in the filesystem at homer/templates/your_homepage.erb (the path to this template is also shown at the top of the template editor to encourage you to edit in a text editor) because storing templates in a database sucks. Once you save (or save from your text editor), you can preview your new homepage design at any time and publish it to the world.

Previewing and Publishing:

Finally, the fun part where all your hard work pays off. Previewing your homepage will give you a new window where you can see exactly what your page will look like to the outside world. Publishing will write your homepage out to the place in the filesystem you defined when creating it. This should be some place that serves static files, like an Apache Document Root. When you publish, the homepage you direct your audience to is 100% static, so load is basically a nonissue. Homer will do its best to resolve permissions issues when you set up the path, but it may not be able to publish if it doesn’t have access to the directory, so plan accordingly.
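Publishing boils down to “render once, write a file.” A rough sketch of that step, assuming a hypothetical publish helper (not Homer's own code) that refuses to write when the target directory isn't writable:

```ruby
require 'erb'
require 'tmpdir'

# Render an ERB template with the given local variables and write the
# result to `path` as a static file. Raises if the directory is missing
# or not writable, since that is the likeliest way publishing fails.
def publish(template, locals, path)
  dir = File.dirname(path)
  unless File.directory?(dir) && File.writable?(dir)
    raise "cannot publish to #{dir}: missing or not writable"
  end

  html = ERB.new(template).result_with_hash(locals)
  File.write(path, html)
  path
end

# A temp directory stands in for your web server's document root.
Dir.mktmpdir do |docroot|
  target = File.join(docroot, 'index.html')
  publish('<h1><%= headline %></h1>', { headline: 'City council passes budget' }, target)
  puts File.read(target)
end
```

Once the file is written, the web server serves it like any other static asset; the Ruby process is only involved when an editor hits publish.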

That’s basically it. It’s a fun news app for writing static homepages from feeds.

Under the Hood, and Philosophy

Homer is written in Ruby with Sinatra and ActiveRecord. Since starting the TPM PollTracker, I’ve been smitten with Rails-style development. ActiveRecord, for database interaction, might be one of my favorite pieces of software ever. One thing I hate about Rails, though, is how it does routing. Sinatra connects controller actions directly to URLs, which is much easier for me to wrap my head around. It also doesn’t impose a filesystem structure for your app. In fact, the only thing it assumes is that you are writing something in Ruby that uses URLs. That’s just the right amount of opinionation for me. Maybe Rails 3 will allow me to pick and choose how much framework I want, but for now Sinatra + AR gets me 90% of the way there.

With Homer, I’m also putting my stake in the ground for a few principles and (what I believe to be) best practices in CMS design:

First, you host it yourself. Movable Type and WordPress have blazed the trail for casual users to self-install server-based web apps, and the fact that this has been catching on is a Good Thing for a number of reasons: less reliance on third parties to be good stewards of your data, demystification of how web-based software works, and more agency given to the end user are a few of them. Mint and Fever are good examples of how this is becoming a viable distribution method with PHP and MySQL apps. My hope is that the Ruby stack will continue to evolve, so Ruby-based software will become as easy to install as WordPress.

Second, it’s open source. Homer is built on open source software, and the Ruby community lives on free open source software. I’m a big believer in it.

Above: László Moholy-Nagy’s Kompozicija Z VIII (1924). Moholy-Nagy is the patron saint of Homer.

Third, it generates static pages. Caching sucks. I’ve spent an inordinate amount of time on PollTracker working on caching, and even moving hosts over it. While most people agree that static pages are the best way to serve content (no one can disagree that it is the fastest), there are different approaches on how best to generate them (and how to expire them). For sites with a large number of pages, I like the Rails page caching method: expire data when it is changed, and give the brunt of caching each page to its first visitor. This makes deployment easy, and allows you to roll out a global change instantly. Movable Type does the opposite: it generates static pages at time of publish. This puts the brunt of publishing wait on the author, and also makes it very difficult to roll out template changes across an entire site. For one page, though (one that is assumed to be constantly changing), I prefer this method. It also means that the outward-facing server doesn’t have to run the slow Ruby stack at all, and no caching has to be done within the app itself.

Fourth, the templates are files. They aren’t stored in the database. This is one area where WP (and most web frameworks) have a leg up on MT and EE. The database should be used to populate template variables, and store as little markup as possible. Storing templates in the database makes them tough to republish, and makes it really difficult to edit them in a text editor, unless you want to do a lot of copy and pasting.


Where did I get this idea? A couple different things gelled into me needing to write this app. The first was a lunch I had with Scott Klein of ProPublica a few months ago, where we talked about CMS design and basically agreed it is a mistake to develop against your CMS. ProPublica uses ExpressionEngine for their stories, and spins up EC2 instances for their Rails apps. They integrate the two mostly through style. Homer works along these lines, and Scott got me thinking down that path.

Secondly, I was inspired by two Movable Type plugins from Six Apart. The first, SqueezePlay (which unfortunately has not been open-sourced), is a homepage manager that uses the story assignment/slot metaphor, and is what we use at TPM to handle our front page. The other is Reblog, an awesome MT plugin that parses RSS feeds into MT entries. Homer is kind of SqueezePlay + Reblog, but outside the CMS.

Thirdly, a group of my friends from the University of Chicago are starting a new (as yet unreleased) online magazine, albeit a “deconstructed” magazine which will collect content from a wide variety of sources and contributors and mash it together. This provided an initial use case that spurred development. Since I was developing Homer for free, I decided to open-source it as a community journocoding project. I hope it will prove useful to others!

Get it!

To install Homer, first make sure you have a newish version of Ruby: 1.8.6 or higher is required for the current version of Rubygems (1.3.6). Then install Rubygems, and gem install sinatra activerecord sqlite3-ruby feedme. Now you’re ready to install Homer.

If you have git, just

git clone git://github.com/ashaw/homer.git
cd homer/bin
./homer init # sets up the SQLite db and templates dir
./homer run

then open a browser to http://localhost:4567

or grab the tarball or zipball, uncompress and start at step 2 above.

To install on an Apache web server, the best way is to install the Phusion Passenger gem, follow the instructions to set it up, and make a VirtualHost for Homer, like so (I recommend setting up a password in htpasswd for Homer since the app doesn’t support authentication):

<VirtualHost *:80>
     ServerName homer.yourdomain.com
     DocumentRoot /var/www/homer/public
     <Directory /var/www/homer/public>
        AllowOverride all
        Options -MultiViews
        AuthType Basic                  
        AuthName "Restricted"
        AuthUserFile /var/www/yourpasswords
        Require user youruser
     </Directory>
</VirtualHost>

To start or restart the app, just touch tmp/restart.txt from your Homer root directory. Don’t use homer run; Passenger will take care of the rest.

Obviously, this is still very much alpha (perhaps pre-alpha) software, so act accordingly. I’d love to get your feedback, bug reports, comments, etc. And you should follow me on twitter here.