Code – God of the Machine
Aug 02 2006
 

This blog used to run on Greymatter, a collection of Perl scripts that its creator stopped supporting about ten minutes after I chose it. Greymatter served me well for years, but modern features that I need, like RSS feeds and comment spam control, are alien to it, and my Perl isn’t good enough to add them. Greymatter also stores posts in flat files, which is OK if you just want to display them, but not so good if you want to search or perform any other batch operations. I program computers by trade and therefore report these matters with some embarrassment.

New software was called for, and a cursory survey of the alternatives led me, on a Wisdom-of-Crowds basis, to WordPress, a collection of PHP scripts. The WordPress slogan is “Code Is Poetry.” I agree; but if code is poetry, WordPress code is doggerel. Here’s a sample from their Greymatter import script:

if ($i<10000000) {
    $entryfile .= "0";
    if ($i<1000000) {
        $entryfile .= "0";
        if ($i<100000) {
            $entryfile .= "0";
            if ($i<10000) {
                $entryfile .= "0";
                if ($i<1000) {
                    $entryfile .= "0";
                    if ($i<100) {
                        $entryfile .= "0";
                        if ($i<10) {
                            $entryfile .= "0";
}}}}}}}

A non-programmer, being told that Greymatter file names are eight characters long and begin with as many zeros as are required to fill out the number of the blog post (e.g., “00000579.cgi”), can probably figure out what this code snippet does. If you said that it prefixes the appropriate number of zeros to the blog post number, you win. If it occurs to you that there are more concise ways to do this than with seven nested if-statements, you may have a future in the burgeoning software industry.
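For the curious, one of those more concise ways might look like the sketch below. It is in Java rather than PHP, the class and method names are invented for illustration, and none of it comes from WordPress; it only assumes the eight-digit, zero-padded naming scheme described above.

```java
// Hypothetical helper: the names here are made up, not taken from WordPress or Greymatter.
final class EntryFileName {
    // Pads the post number with leading zeros to eight digits and appends ".cgi".
    static String forPost(int postNumber) {
        return String.format("%08d.cgi", postNumber);
    }

    public static void main(String[] args) {
        System.out.println(forPost(579)); // prints 00000579.cgi
    }
}
```

One format string does the work of the seven nested ifs, and numbers that already have eight digits pass through unpadded, as they do in the original.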

On the other hand, this code, like WordPress itself, has the great merit of actually working. WordPress has a pretty easy-to-use API as well, so people can, and do, write skins that you can borrow and modify to your taste, and other extensions to its functionality. Its administrative interface is excellent. The documentation is extensive and the forums are helpful. You could do worse.

Several flavors of syndication are now available, at the bottom right. Old posts can be left open for comments thanks to proper spam control. Comments are numbered. Posts are (beginning to be) categorized.

The mini-blog is a transparent effort to raise my marks in Deportment, by allowing me to pretend I’m interested in what the rest of you are writing while I’m off Thinking Big Thoughts. It will also improve my discipline: at one short sentence every other day, I’ll have a finished book after forty years.

A few matters remain. The baseball search engine is busted, thanks to an ill-advised upgrade. A new one will be forthcoming, with better-looking results, more search criteria, and updated statistics. So will a reorganization of The Gee Chronicles, which can still be found at their old location, in their old format, for the time being. Links to old posts still work, but point to their old versions. URLs will eventually be rewritten to point to the new ones, but the old links will always work because permalinks should be permanent.

The banner and layout are mostly the work of my girlfriend. All complaints should be directed to her.

Apr 08 2004
 

They’re advertising Colby Cosh’s blog on ESPN now. “What’s Cosh writing about?” “Hockey.” “What’s he writing about next?” “Hockey.” I’m getting the same way about programming. We will release the honest-to-god production version of our software Monday, and the experience has been, shall we say, instructive. The lessons include:

  • The trouble with most programmers isn’t that they’re too lazy, it’s that they aren’t lazy enough. Never write anything until you absolutely have no other choice. (Regular readers will note how well I have absorbed this.)
  • More classes = less code.
  • “Marketing” tends not to attract the sharpest tools in the shed.
  • If you refuse to test your own code righteously enough, eventually you can con somebody else into doing it for you.
  • You cannot write good business application software unless you understand the business better than the people who make their living at it. This is mostly why business application software sucks, especially when it’s written by people in the business.
  • In a client-server application, given the choice between working on the client or the server, always take the server. This is a corollary of the more general principle that at any job you should strive to stay as far away from the customer as possible.
  • You never discover the right design until you have written an enormous amount of code based on the wrong design.
  • A mediocre programmer is useless. His time would be better spent washing dishes or picking up trash.
  • The better two programmers work together, the likelier they are to end up despising each other.
  • In programming, especially, is George Pólya’s dictum true that it is often easier to solve the general case. In fact solving the special case is usually a waste of time.
  • I’d rather be blogging. Honest.
Apr 01 2004
 

Immersed, by necessity, in technical matters lately, I began to wonder what my vocation, software, and my avocation, poetry, have in common. (Meanwhile my readers, if any remain, began to wonder if I was ever going to post again.) The literary lawyers go on about the intimacy between poetry and the law and compile an immense anthology devoted to attorney-poets. Who better to speak for the programmers than I? And I do have some company in these two interests: Richard Gabriel, the well-known Stanford computer scientist, is a poet, and among the poet-bloggers Mike Snider and Ron Silliman, two poets as different as you’re likely to find, both write software for a living. Less illustrious, perhaps, than Wallace Stevens and James Russell Lowell and Archibald MacLeish, but computer science is an infant profession while the lawyers have been with us forever.

What the programmer shares with the poet is parsimony, and here we leave law far behind. Programmers, like poets, often labor under near-impossible conditions for practice, and for fun; Donald Knuth, responsible for TeX, the world’s best typesetting program, says that his favorite program is “a compiler I once wrote for a primitive minicomputer that had only 4096 words of memory, 16 bits per word. It makes a person feel like a real virtuoso to achieve something under such severe restrictions.” A popular game in computer science is to try to write the shortest possible program, in a given language, whose source code is identical with its output. Is this any different from writing poems in elaborately complex forms, like sestinas or villanelles, or playing bouts-rimé? In a sense, it’s the constraints that make the poetry.

Successive versions of the same program shrink, even as they improve. In Version 1.0 the developers usually lack, like Pascal in his letters, the time to make it shorter. In 2.0 excess code is pruned, methods and interfaces are merged that at first appeared to have nothing in common, more is done with less. Successive drafts of the same poem shrink the same way, for the same reason. The Waste Land was supposed to have been cut by Ezra Pound from five times its present length. (Pound claimed that he “just cut out all the adjectives.”) Whether it wound up any good is a topic for another day; that it wound up better than it started no one can reasonably doubt.

Good programming requires taste. Certain constructs — long switch or if/else blocks, methods with a dozen arguments or more, gotos, labels, multiple return statements, just about anything that looks ugly on the page — these must make you queasy, your fingers must rebel against typing them. Some programmatic and poetic strategies look eerily alike. The classic way to avoid switch and if/else statements in code is polymorphism, which closely resembles ambiguity in poetry.
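A minimal sketch of what that looks like in practice, in Java. The poetic forms and every name below are invented for illustration; the point is only that the caller never asks what kind of thing it is holding.

```java
// Hypothetical types, made up for this example.
interface Form {
    int lineCount();
}

final class Sonnet implements Form {
    public int lineCount() { return 14; }
}

final class Villanelle implements Form {
    public int lineCount() { return 19; }
}

final class Sestina implements Form {
    // Six six-line stanzas plus a three-line envoi.
    public int lineCount() { return 6 * 6 + 3; }
}

final class Anthology {
    // No switch on a type code: each form answers for itself.
    static int totalLines(Form... forms) {
        int total = 0;
        for (Form form : forms) {
            total += form.lineCount();
        }
        return total;
    }
}
```

Adding a new form means adding a class, not hunting down every switch statement that needs another case.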

Donald Knuth maintains a complete list of errata for all his books, and pays $2.56 (2⁸ cents) for every new error you find. In most human endeavor the perfect is the enemy of the good, and many people who have never written a program or a poem might regard Knuth’s quest for perfection as insane. Randall Jarrell once defined a novel as “a long stretch of prose with something wrong with it,” which is amusing but overbroad. A poem is a stretch of verse with something wrong with it; a program is a stretch of code with something wrong with it. A novel is a stretch of prose with something hopelessly wrong with it. For poets and for programmers, perfection seems always a few revisions away. This may be an illusion, but it’s an illusion that the novelist, the civil engineer, certainly the lawyer, cannot share.

In truth, however, yesterday’s code had more in common with poetry than today’s. The great lyric code poems, the brilliantly compressed algorithms, have mostly been written, and live on in the native libraries that all modern programmers use but few read. They are anthologized in Knuth’s three-volume opus, The Art of Computer Programming, one volume each for fundamental algorithms, seminumerical algorithms, and sorting and searching. Where yesterday’s tiny assembly programs were lyric, today’s n-tier behemoths are epic, and epic programs, like epic poems, never fail to have something hopelessly wrong with them. Nonetheless, in programming we have entered the age of the epic, and there’s no going back. Once, in a bout of insanity, I interviewed for a programming job at a big bank, and encountered a C programmer who liked to work close to the metal. He asked me to write a program that would take a string of characters and reverse it. I asked if I could use Java and he said sure. My program was a one-liner:

String reverse( String pStr ) { return new StringBuffer( pStr ).reverse().toString(); }

The point being that Java has a built-in method to reverse a string, called, remarkably, reverse(). Now I knew exactly what he wanted. He wanted me to use one of the classic algorithms, which have been around since at least the 1960s and are described in Jon Bentley’s excellent book on programming in the small, Programming Pearls, among other places. He wanted a nostalgia tour. But these algorithms are great poems that have already been written. Any decent function library, like Java’s, includes them, and it makes no sense to reinvent them, priding yourself on your cleverness. I didn’t get the job.
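For anyone who wants the nostalgia tour anyway, a sketch of the sort of answer the interviewer was presumably after: swap characters from both ends toward the middle. This is a guess at what he had in mind, not a transcript, and the class name is invented.

```java
// Hypothetical class name; the two-ended swap itself is the textbook technique.
final class ClassicReverse {
    // Reverses the array in place, swapping from both ends toward the middle.
    static void reverse(char[] chars) {
        for (int lo = 0, hi = chars.length - 1; lo < hi; lo++, hi--) {
            char tmp = chars[lo];
            chars[lo] = chars[hi];
            chars[hi] = tmp;
        }
    }

    // Convenience wrapper, since Java strings are immutable.
    static String reverse(String s) {
        char[] chars = s.toCharArray();
        reverse(chars);
        return new String(chars);
    }
}
```

It works on raw char values, so it will scramble surrogate pairs, which is one more reason to let the library do it.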

(Update: Rick Coencas comments. mallarme comments. Ron Silliman points out in the comments that he’s not a software developer after all; I apologize for the error.)

Mar 23 2004
 

Dear 66.65.2.105:

d00d. Do not run kiddie scripts that generate buffer overflows against my web server. They will avail you nothing but a 414. This is especially unwise if you happen to subscribe to the same cable company, in the same area, as I do, because then I will inform them, and they will find out who you are. And when Mommy hears from Time-Warner, as she shortly will, she will ground you for a month and take your Internet away.

Sincerely,

Mar 07 2004
 

You don’t want to know. Or maybe you do. I’ve spent 14 to 16 hours a day programming — rearchitecting, in the argot, a project I’m working on. The application tracks resources, for construction companies, in real time, and there were quite a few things to fix. (Note to Cosh: this is what I do for a living.)

Don’t misunderstand: I’m as lazy as the next man, probably lazier. My exertions were mostly geared toward maximum future leisure. The application is in beta now, and very soon it will go into production. The beta users want new features, and the production users will want more new features. Your choice is, fix the server design now to make these features relatively easy to implement, or do ten times as much work down the road. As the FRAM oil filter guy used to say, you can pay me now, or pay me later.

I’m also one of those guys who will work forever if something interests him and idle for weeks otherwise, which makes me, as you might imagine, a less than satisfactory employee. In this case we decided on more or less the server design that I wanted in the first place, so I was forced to work around the clock to prove that I was right. Which I was. And isn’t that what life is really all about?

Microsoft, about which I rarely have a good word to say, certainly earned its keep this week. It turns out that C#, unlike Java, can transparently proxy objects over machine boundaries. This means you can create complex objects on the server with references to their subobjects, pass in a proxy that knows how to construct the subobjects, and then fetch the subobjects dynamically, without the callers having to know a thing about it. The nasty synchronization issues associated with client-side caching disappear, turning event-handling from a nightmare into a breeze. If you understood this, what a thrill, right? If you didn’t, I’ll write about poetry again soon.

I shall return later tonight with an explanation of why David Lee Roth is the world’s most eminent living sociologist. If by some chance I don’t, see above.

Dec 07 2003
 

This place has gone to seed, in large part, because I’ve been doing some actual work, trying to get a software release out — late, inadequate, but out — and as a consequence have followed Floyd McWilliams’s and Evan Kirchhoff’s theorizing about the future of software with more than academic interest. Evan starts here, Floyd replies, more Evan, more Floyd, and finally Evan again. The question at hand is when all of our jobs shall be outsourced to Softwaristan (India), where they produce high-quality source code for pennies a day, and what we software developers shall be doing for a living when that happens. As Evan puts it, “Floyd says ‘decades,’ I say ‘Thursday.'”

And I say, with due respect to both of these highly intelligent gentlemen, that neither one has the faintest idea what he’s talking about. They are speculating on the state of a science seventeen years in the future, and if they were any good at it they wouldn’t be laboring, like me, in the software mines, but in the far more lucrative business of fortune-telling. I — and I suspect I speak for Floyd and Evan here too — would happily swap W-2s, sight unseen, with Faith Popcorn or John Naisbitt, and they’re always wrong.

Floyd compares the current state of software development to chemistry circa 1700, which is generous; I would choose medicine circa Paracelsus, the Great Age of the Leeches. The two major theoretical innovations in modern software are design patterns and object orientation. Design patterns and object orientation are, depending on how you count, ten and thirty years old respectively, which indicates the blazing pace of innovation in the industry. Design patterns mean that certain problems recur over and over again, and instead of solving them the old-fashioned way, from scratch every time, you write down a recipe, to which you refer next time the problem crops up. Object orientation means that software modules, instead of just encapsulating behavior (“procedural programming”), now encapsulate data and behavior, just like real life! Now doesn’t that just bowl you right over?

Hardware, by contrast, improves so rapidly that there’s a law about it. It is a source of constant reproach to software, which has no laws, only rueful aphorisms: “Adding people to a late software project makes it later,” “right, fast, cheap: choose two,” and the like.

Evan claims, notwithstanding, that “a working American programmer in 2020 will be producing something equivalent to the output of between 10 and 1000 current programmers.” Could be. He points to analogies from other formerly infant industries, like telephones and automobiles. He also cites Paul Graham’s famous manifesto on succinctness as power, without noting that Graham’s language of choice is LISP. LISP is forty years old. If we haven’t got round to powerful languages in the last four decades are we really going to get round to them in the next two?

Floyd counters with an example of an object-relational library that increased his team’s productivity 25-50%, arguing that “as long as development tools are created in the same order of magnitude of effort as is spent using them, they will never cause a 100 or 1000-fold productivity improvement.” Could be. Certainly if, as we baseball geeks say, past performance is the best indicator of future performance, I wouldn’t hold my breath for orders-of-magnitude productivity improvements. On the other hand, bad as software is, enormous sums are poured into it, large segments of the economy depend on it, and the regulators do not even pretend to understand it. This all bodes well for 2020.

Me, I don’t know either, which is the point. Evan works on games, which are as good as software gets; this makes him chipper. Floyd works on enterprise software, which is disgusting; this makes him dolorous. I work on commercial business software, which is in-between; this makes me ambivalent. We all gaze at the future and see only the mote in our own eye.

(Update: Rick Coencas comments. Craig Henry comments.)

Oct 26 2003
 

Computers may or may not be changing the nature of art; I leave this question to the eminent Blowhards. But at the very least they could be the handmaidens of literary scholarship. Shouldn’t the Internet be full of concordances by now?

You remember concordances. Those thick books your English professors had on their shelves, where you could look up how many times Milton uses the word “swain,” or Dryden “wit,” or Dickinson “nature”? Now difficult as this may be for some of you juvenile readers to grasp, in antediluvian times scholars compiled these by hand. They are indispensable for serious literary scholarship, and excellent for settling arguments and jogging memories.

There are a few online concordances for the obvious choices, like Shakespeare and the Bible. There’s even pretty cheap software that will do it for you automatically, which these folks have used to make a desultory stab at a few of the British romantic poets. The University of Georgia English Department has managed to post a complete one for William Blake. It is defective (a search for “rose” yields hits for “prose” and “arose” with no way to ask for the whole word only, or to distinguish the noun from the verb) but far better than nothing. Bartleby offers search on its texts, but they are nearly always single works or selections. No remotely complete online concordance exists for, moving in reverse chronological order and considering only a few poets who interest me, Stevens, Robinson, Hardy, Hopkins, Dickinson, Pope, Dryden, Milton, Jonson, Donne, Greville, Ralegh, Gascoigne, Skelton, and Chaucer. Print concordances exist for every one of these authors.
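To show how little machinery the whole-word part requires, here is a rough sketch in Java. Everything in it is an assumption made for illustration: the class name, the definition of a word as a run of letters and apostrophes, and the choice to index by line number. It does nothing about telling the noun from the verb.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy concordance: maps each whole word to the lines where it occurs.
final class Concordance {
    // Assumption: a "word" is a run of letters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[\\p{L}']+");

    private final Map<String, List<Integer>> index = new TreeMap<>();

    Concordance(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            Matcher m = WORD.matcher(lines.get(i));
            while (m.find()) {
                String word = m.group().toLowerCase(Locale.ROOT);
                // One entry per occurrence, recorded as a 1-based line number.
                index.computeIfAbsent(word, k -> new ArrayList<>()).add(i + 1);
            }
        }
    }

    // Whole-word lookup: "rose" finds neither "prose" nor "arose".
    List<Integer> linesFor(String word) {
        List<Integer> hits = index.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Integer>emptyList() : hits;
    }
}
```

Because the text is split into words before indexing, the lookup is an exact match on the word rather than a substring search through the lines.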

Clearly there’s a shortage of people with the necessary technical skills and literary interests to do the job. A sufficiently interested and modestly competent database programmer could rig this up in a jiffy. Do I know anyone like that? Oh. Right. Never mind then.

Aug 26 2003
 

I have previously discussed my facility with hardware. Yesterday’s outage proves that my UNIX system administration skills are up to the same exacting standard. I upgraded from RedHat 7.1 to the latest, 9.0, because I absolutely had to have a journaled file system, and various catastrophes ensued whose consequences I am still sorting out. Have I mentioned that I write software for a living?

What happens to the vast majority of the computer-using population, who understand nothing of executable file privileges, network interfaces, and firewall rules, when their machines go bad?

Nothing happens. They live with whatever went wrong, and in this lies the great secret of Microsoft’s success. Windows machines work, in a crude way, with minimal user intervention, nearly all the time; and when they don’t, they’re cheap enough that most people can afford to buy a new one. Most users don’t care if their desktop is ugly; they often take special pains to choose wallpaper that makes it uglier. They don’t care if 90% of their software is in barely working order, don’t care that it takes five minutes to reboot, don’t care that it beeps at odd intervals. So long as they can surf the web, read their email, and use the application of their choice — Word, Excel, some game, or, God help us, Powerpoint — they are willing to leave well enough alone. After yesterday, I can’t say I blame them.

Aug 13 2003
 

A null, in computer programming, is not a thing but its absence. Suppose you have a list, a handy object that programmers use all the time. You can do many things to a list object — add an item to it (list.add(item)), count the number of items (list.count()), iterate through the items one by one. Your list might have items in it, it might be empty. Doesn’t matter, these methods will still work. But if your list is null it is not a list at all. If you try to call a method on the null list, or on any null object, your program will except, in every computer language known to man, as it should. Nothing can never do something.
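The difference between an empty list and a null one, sketched in Java (where the counting method is spelled size() rather than count()). The class and method names are invented, and catching the exception is done here only to make the failure visible, not as a recommended habit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demonstration class.
final class NullVersusEmpty {
    // Returns the list's size, or -1 if asking for it blew up because the list was null.
    static int sizeOrMinusOne(List<String> list) {
        try {
            return list.size();
        } catch (NullPointerException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeOrMinusOne(new ArrayList<String>())); // prints 0
        System.out.println(sizeOrMinusOne(null));                    // prints -1
    }
}
```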

You might imagine that all nothings are the same; not so. A variable that has not yet been set to a value (“uninitialized” in the parlance) is nothing, but not null. If you try to use an uninitialized variable your program will not only fail to run, it won’t even compile. Nor are all nulls created equal. Microsoft’s C#, the language in which I’m programming at the moment, has a programming null, called null, and a database null, called System.DBNull.Value. Are they equal? No such luck. Some other languages have three different nulls or more.

Everyone, I trust, has tried to visit a database-driven web site and been greeted by an error page with some gibberish like “SQL Error Column[0] invalid value NullValueException 1003.” This means that a careless programmer has allowed a null into his database and failed to check for it on retrieval. Nulls will not stand such rough treatment.

Nulls sound like a nuisance, and they are. In my experience nearly half of all programming errors can be attributed to their improper use. The layman might wonder what possible use there can be in such a thing as nothing. Here’s an instance. I’m writing a program now that involves constructing on a server long lists of complicated objects, containing many subobjects, and sending them to various clients. Creating all the subobjects takes a long time, so to improve performance and reduce network traffic I have the server “proxy” them, by sending a skeletal object to the client and not filling it out until the client specifically requests the details.

Here nulls prove their mettle. Suppose the subobject is a list, which might be empty. The server will send the parent object back with this list set to null. Now, when the client needs the list, it simply checks if the list is null, and if so, calls the server again to request it. The server returns a non-null, but possibly empty, list, and the client now knows that it has a valid list and need not bother the server for it again. Without nulls the client would not know, if the list was empty, whether to return to the server and ask for it. With nulls the client always knows to fetch the list exactly once. There are other ways to implement this kind of logic, but they are far more complex and prone to error.
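A bare-bones sketch of the fetch-exactly-once logic, in Java rather than C#. The names are hypothetical, and a Supplier stands in for the real call to the server; the sketch assumes, as described above, that the server never returns null.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical client-side object whose sublist is filled in on first use.
final class LazyParent {
    private final Supplier<List<String>> server; // stand-in for the remote call
    private List<String> children;               // null means "not fetched yet"
    private int fetches;

    LazyParent(Supplier<List<String>> server) {
        this.server = server;
    }

    List<String> children() {
        if (children == null) {
            // Assumed contract: the server returns a non-null, possibly empty, list.
            children = server.get();
            fetches++;
        }
        return children;
    }

    // How many times the server was actually asked.
    int fetchCount() {
        return fetches;
    }
}
```

An empty list and an unfetched list are distinguishable here only because null is reserved for the second; if the server ever returned null, the client would keep asking.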

Nothing can sometimes be a very beautiful thing indeed.

Aug 09 2003
 

Cryptographic revolutionary Alan Bruzzi writes:

I was wondering where my reading level program would fit into cryptography. It takes a sentence out of a book, and computes its reading age. For example, John 3:16, spoken by Jesus Christ, would give a reading age of 33, because that’s when He died. Also, my program computes the age of the Virgin Mary, when she got married, to be 14. It’s an incredible program, but I just can’t figure out where it would be classifed under cryptography, because it converts a whole ASCII sentence into a single value, which would be the person’s age. Please help…

Dear Alan:

It is unfortunate that cryptography is already perfect, pending quantum computing, which offers a theoretical attack on RSA via Shor’s Algorithm, for this sounds like a remarkable program indeed. John 3:16 computes to 33, you say. John 3:17 through 21, also spoken by Jesus, also compute to 33, I assume. The entire Sermon on the Mount? Let me guess: 33. The number of words in the Rolling Rock legend? 33. I assume that computing marrying age, as in your example of the Virgin Mary, is a simple matter of a command-line switch.

Your program will shed light on many important historical questions. No one has been certain how old Homer was when he died, or if he lived at all, until now, when we can feed a few lines of The Iliad (Pope’s translation, or Lang’s, or Butler’s, or Fitzgerald’s, I’m sure it doesn’t matter) into the computer, and voilà. If Homer should turn out to be fictional, or an amalgamation of authors, will the program return zero, or an error code?

I have a few questions. Does your program require an English translation, or will other languages, say Greek or Aramaic for John 3:16, work equally well? For living authors, does it return their current age, or the age at which they can be expected to expire? Most important, do you reboot your computer by turning it upside down and shaking it until the screen goes gray?

In my professional opinion your program is unclassifiable. It is unique in the history of cryptography, and I look forward to reviewing the source code.

I remain at your service.