“Obsessed with putting ink on paper”

What is behind LilyPond?

LilyPond is not unique in making music notation: there are a lot of programs that print music, and nowadays most newly printed music is made with computers. Unfortunately, it shows. Ask any musician who plays classical music: new scores do not look as nice as old ones.

What is the difference between hand-work and machine work, and what has caused it? How can we improve the situation? This essay explains problems in music notation (software), and our approach to solving them.


What's wrong with computer music notation?

We like to call LilyPond an "automated engraving system." It will format music notation beautifully without requiring typographical expertise of its users.

LilyPond is not unique in making music notation: there are a lot of programs that print music, and nowadays most newly printed music is made with computers. Unfortunately, it shows. Ask any musician who plays classical music: new scores do not look as nice as old ones (from before, say, 1970). The new ones have a bland, mechanical look. They are not at all pleasurable to play from.

To illustrate this, take a look at the following examples. Both are editions of the 1st Cello Suite by J.S.Bach. The one on the left is a very beautifully hand-engraved edition from 1950, the one on the right is a typical contemporary computer product. Take a few seconds to let the looks of both pages sink in. Which one do you like better, and why?

Bärenreiter (BA 350, (c) 1950) Henle (nr. 666 (c) 2000)

The left picture looks nice: it has flowing lines and movement. It's music, and it's alive. Now, the picture on the right shows the same music, and it was written by Bach. His music surely has liveliness and flowing lines.... Except, the score doesn't show it: it looks rigid and mechanical. To understand better why that is, let's blow up a fragment of both pieces:

Hand-made


Computer-made

The location of the bar lines is a giveaway. In the new edition, both bar lines are at exactly the same horizontal position, as are the note heads. When you look back at the whole page, you can easily verify that almost all bar lines are in the same position, as are most of the note heads. The entire thing is spaced as if it were laid out on a big grid, which is what causes the mechanical impression.

This is not the only error in this example, and more importantly, this piece is not the only one with typographical errors. Sadly, almost all music printed nowadays is full of basic typographical mistakes.

Musicians are usually more absorbed with performing the music than with studying its looks, so this nitpicking about typographical details may seem academic. It is not. This piece has a monotonous rhythm. If all lines look the same, they become like a labyrinth. If the musician looks away once or has a lapse in concentration, he will be lost on the page.

In general, this is a common characteristic of typography. Layout should be pretty, not only for its own sake, but especially because it helps the reader in his task. For performance material like sheet music, this is doubly important: musicians have a limited amount of attention. The less attention they need for reading, the more they can focus on playing itself. In other words, better typography translates to better performances.


What's wrong with music notation software

Computers have made music printing accessible to the masses, but they tend to deliver mediocre typography. Apparently, programmers have been doing a shoddy job on notation programs. To illustrate that, we had an amateur user set a piece of music in one of the most popular ‘professional’ notation programs sold today, Finale 2003. It was made with all of the default settings. The music is from the Sarabande of the 2nd Cello Suite by J. S. Bach.

(Finale is a registered trademark of MakeMusic! Inc.)

This example far surpasses the previous one when it comes to formatting errors: there are serious errors in literally every measure. The errors come in all sizes: a big one is the oddly s p a c e d   o u t last line. A smaller one is the flat in measure 13, which is covered by the note preceding it. Here is a magnification of that measure:

The errors go down to the teensy details: below is a blowup of the beam in that measure. Of course, in proper typography the beam should not stick out to the right of the stem, and the ripples provide a telling glimpse into Coda Music Technology programmers' aptness (or lack thereof) with the underlying PostScript technology.

Now, one could object that Finale has a graphical interface that lets you easily move elements around to correct errors, or use plug-ins to do so. This is certainly true: in fact, good professional engravers who use Finale typically spend the majority of their time correcting the errors that Finale routinely makes. But do you want to spend your time correcting all those glaring errors? For the spaced-out line it is doable, but imagine having to correct each and every beam that sticks out of its stem... by hand.

There is a less obvious reason why correcting things by hand is a bad idea. Consider again measure 13 reproduced above. The misplaced flat is pretty obvious, but did you notice that repeat bar? Its lines are too far apart. Did you notice that the eighth rest is too far down? Did it occur to you that the stem of the last eighth note is too long?

Unless you are an expert, typographical errors will irk you without being obvious. Many of them will go uncorrected and will still be in the final print.

This example may seem contrived, but in fact, it's not. All major producers of notation software claim to follow engraving standards, but we have not seen any that gets the basics right; all of them make systematic mistakes. If you want to assess the output of your favorite program, then buy a decent hand-made score from a respectable publisher, and try to reproduce one page of it. Then compare the two.


Designing notation software: how not to do it

It would be nice if notation software didn't need any babysitting to produce acceptable output. Our goal with LilyPond was to write such a system: a program that will produce beautiful music ("engraving") automatically.

At first sight, music notation follows a straightforward hierarchical pattern. Consider the example below, with two staves containing two measures.

Isn't writing software all about finding hierarchies and modeling the real world in terms of trees? In the view of a naive programmer, the above fragment of notation is easily abstracted to a nested set of boxes:

It's easy to follow this model when writing software. It's obvious how to store this data in memory, and the same structure is easily mirrored when writing it to disk. In an XML file you could write something like
  <score>
    <staff>
      <measure id="1">
        <chord length="1/2">
          <pitch name="c"/>
        </chord>
        <chord>
          ....
        </chord>
      </measure>
    </staff>
  </score>

In short, this model is obvious, simple and neat. It's the format used by a lot of software. Unfortunately, it's also wrong. The hierarchical representation works for a lot of simpler music, but it falls apart for advanced use. Consider the following example:

In this example, several assumptions of the previous model are violated: staves start and stop at will, voices jump around between staves, and sometimes span two staves.

Music notation is really different from music itself. Notation is an intricate symbolic diagramming language for visualizing an often much simpler musical concept. Hence, software should reflect that separation.


Plan de campagne

Since the content and the form of a score are separate, the design of the software has to match that. Hence, the basic blueprint of our program should follow this scheme:

  1. form    2. translation    3. content: { c'4 d'8 }

In effect, we are conquering the problem by dividing it into subproblems:
  1. Typography: where to put symbols
  2. Notation: what symbols to produce
  3. Representation: how to encode music
Finally, whenever you subdivide a problem, a new problem is created:
  4. Architecture: glue everything together


Music notation

Common music notation encompasses some 500 years of music. Its applications range from monophonic melodies to monstrous counterpoint for large orchestras. How can we get a grip on such a many-headed beast? Our solution is to make a strict distinction between notation (what symbols to use) and engraving (where to put them). For tackling notation, we have broken up the problem into digestible (and programmable) chunks: every type of symbol is handled by a separate plugin. All plugins cooperate through the LilyPond architecture. They are completely modular and independent, so each can be developed and improved separately.

Polyphonic notation

The system shown in the last section works well for monophonic music, but what about polyphony? In polyphonic notation, many voices can share a staff:

In this situation, the accidentals and staff are shared, but the stems, slurs, beams, etc. are private to each voice. Hence, the plugins (which LilyPond calls engravers) should be grouped. The engravers for note heads, stems, slurs, etc. go into a group called "Voice context," while the engravers for key, accidental, bar, etc. go into a group called "Staff context." In the case of polyphony, a single Staff context contains more than one Voice context. Similarly, more Staff contexts can be put into a single Score context:
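
The grouping described above can be sketched as a small tree of contexts. This is an illustrative model only, not LilyPond's actual implementation: the `Context` class is hypothetical, and engravers are represented by plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A group of engravers plus nested child contexts (hypothetical model)."""
    kind: str
    engravers: list
    children: list = field(default_factory=list)

    def count(self, kind):
        """Count the contexts of the given kind in this subtree."""
        own = 1 if self.kind == kind else 0
        return own + sum(child.count(kind) for child in self.children)


def voice():
    # Engravers whose output is private to each voice.
    return Context("Voice", ["note head", "stem", "slur", "beam"])


def staff(n_voices):
    # Engravers whose output is shared by all voices on the staff.
    shared = ["key", "accidental", "bar"]
    return Context("Staff", shared, [voice() for _ in range(n_voices)])


# A score with one polyphonic staff (two voices) and one monophonic staff.
score = Context("Score", [], [staff(2), staff(1)])

print(score.count("Staff"), score.count("Voice"))
```

Because every voice gets its own group of engravers, adding a voice to a staff never duplicates the shared key, accidental and bar handling.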


Music engraving

When we know what symbols to print, we have to decide where to put them so that the result looks pleasing. This art is called music engraving. The term derives from the traditional process of music printing. Only a few decades ago, sheet music was made by cutting and stamping the music into zinc or pewter plates in mirror image. The plate would be inked, and the depressions caused by the cutting and stamping would hold ink. An image was formed by pressing paper to the plate. The stamping and cutting were done completely by hand. Making corrections was cumbersome, so engraving had to be done correctly in one go. Of course, this was a highly specialized skill.


Implementing typography

How do we go about implementing typography? Answering the "music notation" problem left us with a bunch of graphic objects representing note heads, the staff, stems, etc.

If craftsmen need over ten years to become true masters, how could we simple hackers ever write a program to take over their jobs?

The answer is: we cannot! Since typography relies on human judgement of appearance, people cannot be replaced. However, much of their dull work can be automated: if LilyPond solves most of the common situations correctly, then this will be a huge improvement over existing software. The remaining cases can be tuned by hand. Over the course of years, the software can be refined to do more and more automatically, so manual overrides are necessary less and less.

How do we go about building such a system? When we started, we wrote the program in C++. Essentially, this means that the program functionality is set in stone by us developers. That proved to be unsatisfactory.

Clearly, there is a need for a flexible architecture. The architecture should encompass formatting rules, typographical style and individual formatting decisions.


A flexible formatting architecture

Remember the music notation problem? Its solution left us with a bunch of objects. The formatting architecture is built on these objects. Each object carries variables.
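
As a rough illustration of objects carrying variables, the sketch below models a graphical object as a name plus a table of properties, with style defaults that an individual object can override. The class name `LayoutObject`, the property names and the values are hypothetical and are not taken from LilyPond's source. Letting a variable hold a function is likewise an assumption of this sketch, shown as one possible way to express a formatting rule in the same table.

```python
class LayoutObject:
    """A graphical object whose formatting is driven by its variables."""

    def __init__(self, name, style, **overrides):
        self.name = name
        # Start from the style defaults for this type of object ...
        self.properties = dict(style.get(name, {}))
        # ... then apply individual formatting decisions on top.
        self.properties.update(overrides)

    def get(self, key):
        value = self.properties[key]
        # A variable may hold a function; it is then computed on demand.
        return value(self) if callable(value) else value


# Hypothetical typographical style: defaults per object type.
style = {
    "Stem": {"thickness": 1.3, "direction": lambda obj: -1},
    "Slur": {"thickness": 1.2},
}

default_stem = LayoutObject("Stem", style)
tweaked_stem = LayoutObject("Stem", style, direction=1)

print(default_stem.get("direction"), tweaked_stem.get("direction"))
```

The override on `tweaked_stem` changes only that one object; the shared style table is copied, not modified.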


Beautiful numbers

How do we actually make formatting decisions? In other words, which of the three configurations should we choose for the following slur?

There are a few books on the art of music engraving available. Unfortunately, they contain only simple rules of thumb and some examples. Such rules can be instructive, but they are a far cry from an algorithm that we could readily implement in a computer. Following the instructions from the literature leads to algorithms with lots of hand-coded exceptions. Doing all this case analysis is a lot of work, and often not all cases are covered completely.

Formatting rules defined by example. Image from Ted Ross' The Art of Music Engraving

We have developed a much easier and more robust method of determining the best formatting solution: score-based formatting. The principle is the same as that of a beauty contest: for each possible configuration, we compute an ugliness score. Then we choose the least ugly configuration.

For example, in the first configuration, the slur nicely connects the starting and ending note of the figure, a desirable trait. However, it also grazes one note head closely, while staying away from the others. Therefore, for this configuration, we deduct a 'variance' score of 15.39.

In the second configuration, the slur keeps a uniform distance from the heads, but we have to deduct some points because the slur doesn't start and end on the note heads. For the left edge, we deduct 1.71, and for the right edge (which is further from the head) we deduct 9.37 points. Furthermore, the slur goes up, while the melody goes down. This incurs a penalty of 2.00 points.

Finally, in the third configuration, only the end of the slur is far away from the ending note head, at a score of 10.04 ugliness points.

Adding up all scores (15.39, 13.08 and 10.04 points respectively), we notice that the third option is the least ugly, or most beautiful, version. Hence we select that one.
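
The selection step can be written down in a few lines. The sketch below uses the penalty values quoted in the text. The dictionary layout and the penalty labels other than "variance" are only naming choices made for this example, and it says nothing about how LilyPond computes the individual penalties.

```python
# Penalties quoted in the text for each of the three slur configurations.
penalties = {
    "first": {"variance": 15.39},
    "second": {"left edge": 1.71, "right edge": 9.37, "direction": 2.00},
    "third": {"end distance": 10.04},
}

# The ugliness of a configuration is the sum of its penalties.
totals = {
    name: round(sum(parts.values()), 2) for name, parts in penalties.items()
}

# The winner of the contest is the least ugly configuration.
best = min(totals, key=totals.get)

print(totals)
print(best)
```

A new typographical consideration then only means adding one more penalty term, instead of another hand-coded exception.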

This is a general technique, and it is used in a lot of situations.

This technique evaluates a lot of possibilities, which takes some time to compute. However, that is a worthwhile expense, because the end result is much better, and because it makes our lives easy.


Notation benchmarking

A flexible architecture is necessary for good formatting. Unfortunately, it is not sufficient. Only a careful emulation of printed matter will give a good result. We suggested in the introduction to compare program output with existing hand-engraved scores. It is exactly this technique that we use to perfect LilyPond output. In a way, this is a benchmarking technique: the performance of the program, in terms of quality, is measured in relation to a known quantity.

Here you see parts of a benchmark piece: at the top, the reference edition (Bärenreiter BA 350); at the bottom, the output from LilyPond 1.4:

Bärenreiter

LilyPond 1.4

The LilyPond output is certainly readable, and for many people it would be acceptable. However, close comparison with a hand-engraved score showed a lot of errors in the formatting details:

(And there were missing notes in the original version for LilyPond)

By addressing the relevant algorithms, settings, and font designs, we were able to improve the output. The output for LilyPond 1.8 is shown below. Although it is not a clone of the reference edition, this output is very close to publication quality.

LilyPond 1.8

Bärenreiter

Another example of benchmarking is our project for the 2.1 series, a Schubert song.


Font design

A large factor that makes LilyPond output look traditional lies in the blackness of the page. Heavy staff lines, and a font design to match, make the overall impression much stronger. This is also very clear from the following blowups:

Henle (2000) Bärenreiter (1950) LilyPond (2003)

Another typical aspect of hand-engraved scores is the general look of the symbols. They almost never have sharp corners. This is because sharp corners of the punching dies are fragile and quickly wear out when stamping in metal. The general rounded shape of music symbols is also present in all glyphs of our "Feta" font.

Spacing

One of the problems that the Bach piece above inspired us to attack is the spacing engine. One of its features is optical spacing. It is demonstrated in the fragment below.

This fragment only uses quarter notes: notes that are played in a constant rhythm. The spacing should reflect that. Unfortunately, the eye deceives us a little: not only does it notice the distance between note heads, it also takes into account the distance between consecutive stems. As a result, the notes of an up-stem/down-stem combination should be put farther apart, and the notes of a down-stem/up-stem combination should be put closer together, all depending on the combined vertical positions of the notes. The top fragment is printed with this correction, the bottom one without. In the latter case, the down-stem/up-stem combinations form clumps of notes.
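
The direction of the correction can be shown with a toy calculation. This is a deliberately simplified sketch and not LilyPond's spacing engine: the function names, the base distance and the correction amount are made up, and the dependence on the vertical positions of the notes is left out entirely.

```python
UP, DOWN = 1, -1


def optical_correction(first_stem, second_stem, amount=0.2):
    """Extra distance between two consecutive notes of equal duration.

    `amount` is a made-up constant, used only for illustration.
    """
    if first_stem == UP and second_stem == DOWN:
        # The two stems end up close together, so move the notes apart.
        return amount
    if first_stem == DOWN and second_stem == UP:
        # The two stems end up far apart, so move the notes together.
        return -amount
    # Same stem direction: no correction is needed.
    return 0.0


def positions(stems, base=2.0):
    """Horizontal positions for a run of notes with the given stem directions."""
    x = 0.0
    result = [x]
    for first, second in zip(stems, stems[1:]):
        x += base + optical_correction(first, second)
        result.append(x)
    return result


print(positions([UP, DOWN, UP, UP]))
```

With `amount=0.0` the same function reproduces the uncorrected, purely head-to-head spacing, which makes it easy to compare the two layouts.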

Ledger lines

Ledger lines are typographically difficult. They can easily blot together with other signs, such as other ledger lines or accidentals. Other software prevents these collisions by spacing the lines wider (thus taking up more space), or by shortening ledger lines (which hampers readability).

Henle (2000) Bärenreiter (1950) LilyPond (2004)

Traditional engravers would adjust the size of a ledger line depending on what symbols were in the neighborhood. LilyPond does the same. Ledger lines are shortened so that they never collide with neighboring lines, and they are also shortened when there is an accidental.


Input format

As discussed earlier, the ideal input format for a music engraving system is the content: the music itself. This poses a formidable problem: how can we define what music really is? Our way out of this problem is to reverse it. Instead of defining what music is, our program serves as a definition: we write a program capable of producing sheet music, and adjust the format to be as lean as possible. When the format can no longer be trimmed down, by definition we are left with content itself.

The syntax is also the user interface for LilyPond, so it is designed to be easy to type. For example,

  c'4 d'8

is a quarter note C1 and an eighth note D1, as in this example:

On a microscopic scale, such syntax is easy to use. On a larger scale, syntax also needs structure. How else can you enter complex pieces like symphonies and operas? The structure is formed by the concept of music expressions: by combining small fragments of music into larger ones, more complex music can be expressed. For example, start with a single note:

  c4

Combine this simultaneously with two other notes by enclosing them in << and >>:

  <<c4 d4 e4>>

This expression is put in sequence with another note by enclosing both in braces:

  { <<c4 d4 e4>> f4 }

The above is another expression, and therefore it may be combined again with a simultaneous expression (in this case, a half note):

  << { <<c4 d4 e4>> f4 } g2 >>
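
The recursive nature of music expressions can be mirrored in a small data model. The classes below (`Note`, `Sequential`, `Simultaneous`) are hypothetical names chosen for this sketch and say nothing about LilyPond's internal representation. The sketch only shows that an expression built from expressions can be rendered and measured by the same recursion.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Note:
    pitch: str
    denominator: int  # 4 means a quarter note, 2 a half note

    def length(self):
        return Fraction(1, self.denominator)

    def render(self):
        return f"{self.pitch}{self.denominator}"


@dataclass
class Sequential:
    parts: list

    def length(self):
        # Parts follow one another, so their lengths add up.
        return sum((p.length() for p in self.parts), Fraction(0))

    def render(self):
        return "{ " + " ".join(p.render() for p in self.parts) + " }"


@dataclass
class Simultaneous:
    parts: list

    def length(self):
        # Parts sound together, so the longest one determines the length.
        return max(p.length() for p in self.parts)

    def render(self):
        return "<< " + " ".join(p.render() for p in self.parts) + " >>"


chord = Simultaneous([Note("c", 4), Note("d", 4), Note("e", 4)])
music = Simultaneous([Sequential([chord, Note("f", 4)]), Note("g", 2)])

print(music.render())
print(music.length())
```

The chord followed by a quarter note lasts as long as the half note it is combined with, so the whole expression has the length of a half note.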

Such recursive structures can be specified neatly and formally in a context-free grammar. The parsing code is also generated from this grammar. In other words, the syntax of LilyPond is clearly and unambiguously defined.

User interfaces and syntax are what people see and deal with most. They are partly a matter of taste, and also the subject of much discussion. Although discussions on taste do have their merit, they are not very productive. In the larger picture of LilyPond, the importance of input syntax is small: inventing neat syntax is easy, writing decent formatting code is much harder. This is also illustrated by the line counts for the respective components: parsing and representation take up less than 10% of the code.

  Parsing + representation    Total
  6000 lines C++              61500 lines C++


Conclusion

We've shown you what engraved music should look like, and how we built our software to emulate that look. We have put a lot of effort into building it. Thanks to all that hard work, you can use the program to print nice music too.
