Bitter, a template-based syntax highlighter

Way back in 2008 I started writing a syntax highlighter with the goal of making it both easy to use and easy to develop, which is not an easy task by any measure.

About Bitter

Over the past few years I've made many attempts at writing experimental syntax highlighters. One used XML files to define syntax; another was basically a bunch of PHP classes and some crazy looping constructs.

None of them really satisfied me. They were all too hard to maintain and too hard to actually make use of, so I gave up, for a little while at least.

I started thinking about the subject again when it became clear that Symphony needed a new syntax highlighter, one that I would probably be maintaining.

The more I thought about it, the more a functional language seemed to be the solution: something like XSLT, only instead of matching XML nodes with XPath, it would match syntax with regular expressions.

So that's what I did. On and off from December 2008 until early June 2009, I took what I could from my previous projects and hacked together something better. Since then, until a few days ago (about the time I rebuilt this website), the project has basically been stagnant.

So here's a quick introduction to what Bitter is all about:

Languages

Languages are the 'functional programming' bit I was talking about earlier. Here is a very simple language definition that highlights any acronyms in a piece of text:

<?php
	
	Bitter::rule(
		Bitter::id('acronyms'),
		Bitter::tag('context-text acronyms'),
		Bitter::match('.*'),
		
		Bitter::rule(
			Bitter::id('acronyms-found'),
			Bitter::match('\b[A-Z]+\b'),
			Bitter::tag('acronym')
		)
	);
	
?>

So basically you define an all-encompassing rule (acronyms) that matches any text (.*) and then applies a second rule (acronyms-found), which matches any sequence of one or more uppercase letters that stands alone as a word.

A 'tag' in this instance is a space-separated list of class names to use when outputting the highlighted source.

So if you gave it this:

TODO: Find some amusing witticism that contain acronyms. Oh darn.

It would output this:

<span class="context-text acronyms"><span class="acronym">TODO</span>: Find some amusing witticism that contain acronyms. Oh darn.</span>
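
If the nested rule idea still feels abstract, here is a rough sketch of it in plain JavaScript. This is not how Bitter is implemented, and none of these function names exist in Bitter; I made them up purely to illustrate an outer rule that wraps everything and an inner rule that wraps each run of uppercase letters:

```javascript
// Hypothetical helper names; an illustration only, not Bitter's API.

// Escape the characters that are unsafe inside HTML text.
function escapeHtml(text) {
	return text
		.replace(/&/g, '&amp;')
		.replace(/</g, '&lt;')
		.replace(/>/g, '&gt;');
}

// Apply one "rule": wrap each match of `pattern` (which must have the
// global flag) in a span with class `tag`, escaping everything else.
function applyRule(text, pattern, tag) {
	let output = '';
	let last = 0;

	for (const match of text.matchAll(pattern)) {
		output += escapeHtml(text.slice(last, match.index));
		output += '<span class="' + tag + '">' + escapeHtml(match[0]) + '</span>';
		last = match.index + match[0].length;
	}

	return output + escapeHtml(text.slice(last));
}

// The outer rule matches the whole text; the inner rule runs on its contents.
function highlightAcronyms(text) {
	const inner = applyRule(text, /\b[A-Z]+\b/g, 'acronym');

	return '<span class="context-text acronyms">' + inner + '</span>';
}

console.log(highlightAcronyms('TODO: Find some amusing witticism that contain acronyms. Oh darn.'));
```

Note that a pattern this simple also treats a lone capital such as "I" as an acronym, so a real language definition would probably want something stricter.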

At some point I'll write a full tutorial on how to create new language files for Bitter. For now you'll have to content yourself with this short example and the languages that I've already written as part of the project.

Formats

Bitter also has a post-highlighting formatter stage, which exists so that you can change the output to suit your needs. One example of this is the Symphony debug page, which splits the output into lines with line numbers.

Another would be a simple conversion from tabs into spaces:

<?php
	
	require_once BITTER_FORMAT_PATH . '/default.php';
	
	class BitterFormatTabsizeTwo extends BitterFormatDefault {
		protected $tabsize = 2;
		
		public function process($source) {
			$this->output = $source;
			
			$this->processTabs();
			$this->processLines();
			
			return $this->output;
		}
	}
	
	return new BitterFormatTabsizeTwo();
	
?>

As you can see, tab conversion is built in, as is wrapping of lines with spans. You can of course override the defaults at any time.
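
For anyone wondering what those two steps amount to, here is a rough JavaScript sketch. It is not the code behind processTabs() or processLines(), whose internals aren't shown in this post; the function names and the 'line' class are made up for illustration. It also assumes plain text input, whereas a real formatter has to step around the highlighter's span markup when counting columns:

```javascript
// Hypothetical stand-ins for the two formatter steps; not Bitter's code.

// Replace each tab with enough spaces to reach the next tab stop.
function expandTabs(line, tabsize) {
	let output = '';

	for (const char of line) {
		if (char === '\t') {
			output += ' '.repeat(tabsize - (output.length % tabsize));
		} else {
			output += char;
		}
	}

	return output;
}

// Expand tabs on every line, then wrap each line in its own span.
function formatLines(source, tabsize) {
	return source
		.split('\n')
		.map(line => '<span class="line">' + expandTabs(line, tabsize) + '</span>')
		.join('\n');
}

console.log(formatLines('a\tb\n\tc', 2));
```

Padding to the next tab stop, rather than swapping every tab for a fixed number of spaces, keeps columns aligned when a tab follows other text on the line.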

Usage

You can use Bitter directly in PHP by creating an instance of the Bitter class. This does have some disadvantages, however: because highlighting is quite a CPU-expensive process, you may end up delaying your page from being sent by several seconds.

Because of this, you're better off using the jQuery plugin, which sends AJAX requests to a small PHP script.

Put this in your site's HTML, after you include the jQuery library:

<script type="text/javascript" src="assets/jquery.bitter.js"></script>

Then in your site's JavaScript:

$(document).ready(function() {
	// Point it at the HTTP handler:
	$.fn.bitter.defaults.handler = 'assets/jquery.bitter.php';
	
	// Use the 'tabsize-4' format:
	$.fn.bitter.defaults.format = 'tabsize-4';
	
	// Tell it what to highlight:
	$('pre.language-css').bitter({
		language:	'css'
	});
	$('pre.language-js').bitter({
		language:	'js'
	});
	
	// ...
});

After doing that you'll need to style the output, but I'll leave that to your imagination. You can find some more complete examples in the Bitter repository.

Thoughts

That concludes this quick introduction to Bitter, which is currently used to highlight snippets of itself on this blog. Oh, and it handles the Symphony debug experience too.

I hope you're game to give it a try. Any suggestions, criticisms or questions are more than welcome.

Share your thoughts...

Wilhelm Murdoch wrote on :

Nice! REAL nice! :D

craig zheng wrote on :

Brilliant. Been hoping for this sort of intro from you. Thanks.