How and Why to dump your Word Processor

The word processor has been a part of computing platforms since the earliest days of the home computer; I’ve used a number of them over the years, including PFS-write on the Apple IIc, WordPerfect (both DOS and Windows versions), Microsoft Word, OpenOffice, and AbiWord.

A couple years ago, though, I got frustrated with the whole word processor concept, and found a way to create text documents that works a lot better for me. This article aims to describe the how and why of that move, for the benefit of others who find they just don’t like working with word processing software.

Wait, what’s wrong with word processors??

I’m sure someone reading this article is completely baffled at this point about what sort of problems I could possibly have with word processors. I’ve met many people for whom a computer without Microsoft Office is practically a paperweight, and for whom the primary purpose of a computer is the creation and modification of Office documents. For such a purpose, the word processor is generally accepted as a proven, highly evolved, and time-tested tool.

Without dwelling too much on the negative, I’ll go through some of the issues I have with word processors that eventually led me to look for an alternative way of creating documents.

They use WYSIWYG editing

WYSIWYG[1] editing has been a standard method of document creation since the early 1990s. Most people appreciate being able to directly work on the “end product” without going through some meta-data middleman or having to imagine what the end product could look like.

I’ve discovered that I don’t like it.

You see, WYSIWYG assumes that you know or care what you’re going to “get” in the end. Very little of the text I write (not even a significant minority of it) is destined for paper. In fact, often it’s not clear what the end format of a document is going to be. It might get rendered to PDF for printing, or HTML for blog or online help content. It might end up as part of a plain text README file bundled with source code, or it might just stay on disk as-is for me to refer to at a later time.

Even when a document is bound for print, my brain doesn’t need the ever-present distraction of “what you get”. I’ve found myself getting derailed from writing for hours looking for the perfect font for level three headers, or tweaking the perfect margin between paragraphs. Taking away this capability means I can focus on content and organization, rather than a final appearance which may not actually be the final appearance.

Aside from the distraction, though, WYSIWYG requires that a program do a whole lot of stuff “behind the scenes” to try to make editing as simple and straightforward as if you were working on the physical printout. Sometimes it doesn’t do what you want, or sticks things in there you didn’t ask for[2]. I do not, in general, like programs that refuse to let me get under the hood, see what’s really going on, and edit the underlying nuts and bolts of my data directly.

They have serious mouse dependency

When I compose a significant amount of text, I like to keep my hands on the keyboard. Most modern word processors have some keyboard shortcuts, but they also seem to assume that you’re perfectly happy grabbing for the mouse when you need to do some moderately advanced editing routine.

It may seem trivial, but for me it’s highly non-ergonomic to constantly bounce between keyboard and mouse, especially when the mouse is a touchpad on a laptop. Having to do it repeatedly is an exercise in frustration.

They have problems with file formats

We finally, as of a couple years ago, have some open standards in the document arena; even so, modern word processing documents are so complex that the standards don’t seem to be able to accomplish what they fundamentally set out to do: enable people to exchange documents between different editors without losing or corrupting any work. That’s still pie in the sky.

I’ve lost no small amount of work to the changing tides of file formats over the years. Perhaps those days are behind us now, but with all the platform shifts going on right now, having your documents stuck in a format that only reliably opens in one program, which only runs on a certain platform, has got to be an inconvenience now and then.

They’re bloated and command-line unfriendly

The older I get, and the more I need to get done, the less tolerant I am of software that wastes my time. When I need to work on a file, I want to go from desktop to blinking cursor in as little time as possible. Not after you’ve displayed your splash screen, read half a gigabyte off the hard drive, and painstakingly styled a bunch of widgets that I won’t even use.

Not only that, but I’ve grown to be an old-school Linux user; I am never more than a keystroke or two from a BASH prompt, and I am just as comfortable with grep and find as you likely are with mouse clicks.

If I need to find a snippet of information in a file, I don’t want to have to load up some massive monolithic application just to get a piece of text. I want to grep for it from a prompt and move on. That’s an example of my workflow; it may be odd, but there you have it. Word processing applications don’t fit into that workflow.

They aren’t Emacs

If you aren’t an Emacs user, you might not understand; but once you start using Emacs, you eventually want to migrate all your computing tasks to it. It’s an amazingly powerful editing environment, and as you begin to learn and use its more advanced editing features, you increasingly feel their absence in other environments.

In addition, a lot of the documents I have to prepare are technical documentation for code that I’m writing. It’s nice to be able to do this in the same editor that I’m writing my actual code in, and not have to bounce between two big bloated programs with different behaviors and keyboard shortcuts. I can use one program for everything I write, be it code, web pages, technical documentation, song lyrics, blog articles, etc.

The alternative

So maybe by now you’re wondering, if I don’t use a word processor, how do I compose documents? In a word: markup.

In a few more words, I use my über-powerful text editor to compose documents in a lightweight markup language, which I can easily render to any end-format that I need.

If that sounds difficult, it’s not. In fact, if you pick the right combination of text editor and markup language, it can be amazingly ergonomic and productive.

So let’s take a look at the choices you have with this approach.

Text editor

Is there really a choice here? You want to use Emacs. Of course you do. OK, well, you don’t have to; but unless your text editor has some special modes for handling your chosen markup language, it’s going to be a bit tedious to do certain things.

I suppose you’ll realize some of the benefits of the markup language approach even using notepad.exe, but what you really want is a text editor geared at programmers: something that will at least give you syntax highlighting and the ability to run arbitrary shell commands on the current file (for quick rendering).

I’m guessing you probably have a favorite editor already; but if not, it’s time to go Emacs, my friend.

Markup Language

Here’s where you have some choices to make. I haven’t tried every markup language under the sun, but I’ve been through a few. Here are my impressions of some of the ones I tried.

HTML/XML

When you hear “markup”, probably the first thing you think (if you actually know what common computing acronyms stand for) is HTML; or, if you’re a bit more modern and savvy, XML. It’s the world’s most popular markup language, after all; and for a while, I attempted to compose documents directly in raw HTML.

Honestly, though, HTML is not what you want to use. It doesn’t matter if you’re a web developer extraordinaire who knows W3C specs inside and out; the fact is that HTML and XML are equally painful both to type and to read in raw form. All those angle brackets, verbose tags, and forward slashes quickly become tedious to type, and the need for matching opening and closing tags only makes it worse.

The text itself is quickly buried in a forest of tags, making it difficult to read un-rendered. Overall, despite its widespread popularity, HTML is just not a good markup for writing in.

Markdown

Markdown is a popular markup language created mainly for blogging, with the express goal of generating HTML from a simple, readable syntax. So for example, if you had a bit of HTML like this:

<h1>My super list of superness</h1>
<p>This list is <em>SUPER</em></p>
<ul>
  <li>First List item</li>
  <li>Second List item
  <ul>
    <li>First Sublist item</li>
    <li>Second Sublist item</li>
  </ul>
  </li>
  <li>Third list item</li>
</ul>

…the equivalent Markdown would be more like this:

# My super list of superness
 
This list is _SUPER_
 
+ First List item
+ Second List item
    + First Sublist item
    + Second Sublist item
+ Third list item

As you can see, the Markdown is much more type-able and readable than the HTML. Run this simple syntax through a Perl script, and you’ve got your HTML. Markdown even allows you to insert raw HTML when its own syntax fails you.
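To give a feel for how little machinery that conversion needs, here is a rough sketch of my own in Python. It is not the actual Markdown script, and the function name toy_markdown_to_html is made up for illustration. It assumes only a tiny subset of the syntax: "# " headings, "+ " list items, _emphasis_, and one-line paragraphs. Indentation is discarded, so nested list items are flattened into one list.

```python
import html
import re


def toy_markdown_to_html(text):
    """Convert a tiny Markdown subset to HTML (illustrative only).

    Handles '# ' headings, '+ ' list items, _emphasis_, and one-line
    paragraphs. Indentation is discarded, so sub-items are flattened.
    """
    out = []
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        is_item = line.startswith("+ ")
        # Close an open list as soon as a non-item line appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue
        # Escape HTML special characters, then turn _word_ into <em>word</em>.
        body = re.sub(r"_(.+?)_", r"<em>\1</em>", html.escape(line))
        if line.startswith("# "):
            out.append("<h1>" + body[2:] + "</h1>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("  <li>" + body[2:] + "</li>")
        else:
            out.append("<p>" + body + "</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Calling toy_markdown_to_html() on a heading, a paragraph, and a short list returns the corresponding h1, p, and ul markup as a single string. A real converter handles far more (nesting, links, code, multi-line paragraphs), which is exactly why you would use an existing tool rather than something like this.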

The downside of Markdown is that it’s really only designed for writing simple HTML. If you want to generate something else, or have a complex document, you may end up writing a significant amount of HTML (for instance, there’s no canonical way of creating tables in Markdown apart from using HTML tables… which I always found odd, since that’s one of the most painful aspects of writing raw HTML).

Markdown is quite popular, though; and for that reason, it was one of the first markup languages I looked at. I didn’t get very far with it, though.

Restructured Text

Restructured Text, or RST, was originally conceived of for the purpose of writing Python documentation; and all of Python’s built-in documentation is now written in it. It shares a lot of syntax with Markdown, but it’s not specifically aimed at generating HTML (though it is often used that way) and it boasts a few more features. Here’s an example of RST, including a simple table:

==============================
The World's best document ever
==============================
 
Chapter 1: Cool table, man
==========================
 
The Characters
--------------
 
=====  ===========================
Name   Information
=====  ===========================
Bob    Crazy wearer of pants
Rob    Bob's brother
Amy    Odd nose-haver
Jed    Guy with old-fashioned name
=====  ===========================
 
The Plot
--------
 
Lots of **stuff** will happen to *people* in this crazy crazy story.
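The simple table syntax is regular enough that it can be generated from data rather than typed by hand. The following is a minimal Python sketch of that idea; the helper name rst_simple_table is hypothetical and not part of any RST tool. It assumes every cell is a single-line string and that no first-column cell is empty (as far as I know, a blank first column marks a continuation line in simple tables).

```python
def rst_simple_table(header, rows):
    """Format a header and rows of strings as an RST simple table."""
    table = [list(header)] + [list(row) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*table)]
    # Borders are runs of '=' separated by two spaces, one run per column.
    border = "  ".join("=" * width for width in widths)

    def fmt(row):
        # Pad every cell to its column width, then trim trailing spaces.
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        return "  ".join(cells).rstrip()

    lines = [border, fmt(table[0]), border]
    lines.extend(fmt(row) for row in table[1:])
    lines.append(border)
    return "\n".join(lines)
```

Passing the "Name" and "Information" header along with the character rows produces a table of the same shape as the one in the example, with column widths derived from the longest cell.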

Being a bit of a Python hacker, and wanting something with some flexibility, I ultimately gravitated to RST and have been using it mostly happily for the last few years. Most recently I was happy to find that GitHub will automatically render my RST README files to HTML on my project sites.

Where RST started to let me down is in rendering. The tools for rendering RST are a set of Python scripts with names like rst2pdf, rst2html, etc. These work decently, but it’s hard to get specific customization of the output without writing your own styles and templates (which is yet another API to learn…). Most of the time I just end up going with the (rather “meh”) defaults.

The other area where RST let me down is tooling. Emacs has an RST mode available (rst.el), but apart from some (incomplete) syntax highlighting and some rendering shortcuts, it doesn’t offer much. Finally, some of the syntax is just a little non-obvious (footnotes and URLs, for example), and I find myself looking it up all the time.

Emacs org-mode

My current favorite, in which I’m writing this very text, is Emacs org-mode.

For the longest time, I thought org-mode was some kind of day-timer facility for Emacs, as that is how it’s often billed. Since I seem to be allergic to any tool which promises to organize and schedule my life, I never quite took to org-mode.

Even so, I kept hearing and reading about all the cool stuff you can do with org-mode, and got curious; I discovered that org-mode, really, is just a markup language. It’s a very specific markup language for creating organized, hierarchical documents, yet it supports a huge range of features, and one can easily insert snippets of other markup languages.
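Because the hierarchy lives in plain text, it is easy to get at from outside Emacs too. Here is a small Python sketch, assuming only the basic Org headline convention (one or more asterisks at the start of a line, then a space, then the title). The function names are my own inventions, and the sketch makes no attempt to strip TODO keywords or tags from titles.

```python
import re

# An Org headline: one or more asterisks at the start of a line,
# followed by a space and the headline text.
HEADLINE = re.compile(r"^(\*+) (.*)$")


def org_outline(text):
    """Return (level, title) pairs for every headline in an Org document."""
    outline = []
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:
            outline.append((len(match.group(1)), match.group(2).strip()))
    return outline


def print_outline(text):
    """Print the headlines, indented by two spaces per nesting level."""
    for level, title in org_outline(text):
        print("  " * (level - 1) + title)
```

Lines that merely begin with bold text, such as "*bold* words", are not treated as headlines because no space follows the asterisk.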

It’s not just a markup language though; it’s also an extensive Emacs mode chock-full of tools to manipulate and use your org-mode documents. It’s got nice features like being able to fold up sections or use includes. It has strong support for meta-data and things like tagging, which is how it came to be used as a scheduler/calendar application (you can date-tag entries in various ways). The Emacs community seems to have given org-mode a de-facto blessing simply on the basis of the number and variety of tools[3] available for it.

Org documents can be exported to (reasonably clean) HTML, or (by way of LaTeX) PDF files. I found the latter to be a little troublesome at first, since the default LaTeX template is ugly like the 1990s; but after applying some tweaks and copy-pasta’ing some alternate styles from various Emacs blogs, I was able to make a default style that I liked.

The cons of org-mode are that some things require the use of metadata calls and other non-obvious syntax; I had to learn a bit more about LaTeX than I originally wanted to so that I could render PDFs the way I wanted. I think RST is a bit cleaner in some situations, such as quoting blocks of code.

Of course, outside of Emacs, I imagine that editing org-mode markup is rather a drag, and probably not supported at all. So if you’re not an Emacser, it’s probably not going to make your short list.

The advantages

Regardless of the chosen markup, the advantages of this method are mostly the same; in no particular order:

  • My documents are readable and editable regardless of what device, OS, or applications are available to me (everything has a text editor, even phones nowadays).
  • I can use the text editor of my choice, as well as any text-manipulation tool available to me (sed, grep, etc.)
  • I can write without a thought or premature commitment to the cosmetics of the end result.
  • It satisfies my programmer sense of separation (content is separate from presentation, mechanism from policy, data from logic).
  • I don’t have to use a mouse.
  • I can work on a document on a remote system across an SSH WAN link without the overhead of forwarding X11 (yes, I’ve done this).
  • My documents can end up as PDF, ODF, HTML, nicely-formatted plain-text, or a number of other formats with just a few keystrokes.
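As a small illustration of the text-manipulation point, here is a Python sketch of a grep-like search across a folder of plain-text documents. The function name search_notes and the default list of file extensions are assumptions chosen for the example; in practice, grep itself does this job from the prompt.

```python
from pathlib import Path


def search_notes(root, needle, suffixes=(".org", ".rst", ".md", ".txt")):
    """Yield (path, line_number, line) for each line containing needle."""
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files that are not plain-text documents.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps the search going past undecodable bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, number, line.rstrip("\n")
```

Nothing here depends on the markup language in use, which is the point: any tool that reads text can work with these documents.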

Most importantly, I enjoy writing now more than I ever did in my word-processor days; and that means more writing. And that, after all, is the point, no?

Footnotes:

[1] “What-you-see-is-what-you-get” – do I really need to spell that out in this day and age? Better safe than sorry, I guess.

[2] I’ve had the worst time with bullet and numbered lists in just about every word processor I’ve tried. There never seems to be an easy way to get out of them.

[3] The one I’m using just now, for instance, is org2blog, a mode for posting directly to WordPress blogs using org-mode.

7 Thoughts on “How and Why to dump your Word Processor”

  1. Tommy Begley says:

    OR

    Use a Typewriter, it’s simple and effective

    1. Alan says:

      Well, whatever works for you, Tommy. 🙂

  2. Scott says:

    Org mode brings back one of the things you say you don’t like about word processors: it pretty much works only in one program.

    The idea of using one markup language that targets several output formats is seductive, but it also brings back something that you don’t like in word processors: attending to the details of appearance while composing text.

    Several times I have almost bought into the idea of markup languages, specifically txt2tags, but they all have much the same problems of word processors. They usually work only with one program, they require diverting attention from the task of writing to apply formatting, etc.

    That formatting need only be applied once as long as it is not too complex. What does one do if they need to write an essay for a college class and must use a specific typeface and point size, set margins and line spacing, and format a bibliography? A markup language turns out to be a poor choice for the job. LaTeX could handle this, but how many people want to learn it in enough depth to create an essay template in time for finals, let alone in time for turning in their next essay?

    This is precisely the job a word processor makes easy. The formatting can be saved as a template while the text can be composed and edited in a text editor. Documents can be archived as a folder containing a pure text file, a word processing document, a PDF of the rendered word processor output, and any images contained in the document as separate files. The whole thing can be compressed with zip or as a gzipped tar file. If desired, markdown or txt2tags versions of the pure text may be created to ease export to additional mediums.

    Word processors hold our words captive and hamstring our ability to make further use of them. They carve those words in stone that may be unreadable in a relatively short time. Its appearance, e. e. cummings notwithstanding, is almost completely irrelevant to its message. Charles Dickens reads no better in modern typeset text than he did in nineteenth century typesetting.

    It is not that word processors are bad hammers; it is that they make all text processing problems look like nails. Word processors are layman’s text formatters that have been misapplied as typewriters and editors. I would like to see a graphical text formatter that lacks the ability to function as a typewriter or text editor, that functions as easily as Word or Writer, and keeps the typing and editing tasks separate from the formatting task. Add to it the ability to target multiple mediums as markdown and txt2tags do with rendered views for those in addition to page views and supply a separate text editor with actually advanced editing functions for creating text. It must keep the text separate from the formatting, though, and avoid proprietary files for storing document formatting. Otherwise, it becomes just another trap. That would advance the state of the word processing art.

    1. Alan says:

      Org-mode may only technically have “full support” in emacs, but if Emacs suddenly vanished from the face of the earth (perish the thought!), my documents would still be perfectly legible plain text documents. Even without an actual syntax spec, one could easily cobble together a script to convert it to another format (HTML, e.g.) from guesswork.

      As for the rest, I’m sharing what works for me — and it still does, long after this post was written. Markup isn’t a perfect solution, and org-mode isn’t the perfect markup; but for someone feeling the same pain with word processors as I’ve felt, it’s been a nice alternative.

    2. Ghost Writer says:

      Some of your arguments are valid, but you are missing a few of the author’s points. He is a technical writer, and he is suggesting that if you are one too, then a markup language may be better for you.

      In my experience, markup languages allow me to be more productive than word processors; you cannot really know until you have tested the power of RST, AsciiDoc, R Markdown, and LaTeX. Why do you say a markup language that targets several outputs brings back the problem of attending to the details of appearance while composing text? Maybe I am misunderstanding you, but I do not see the truth of this argument.

      Org-mode works in Emacs only, but the markup language can be handled by the document conversion tool called pandoc. You can write Org markup in any text editor and use pandoc to go to almost any other markup language; you can even convert to docx or odt word processor files. You will miss some things like the awesome table-making capabilities in Emacs, but one can survive without them.

      Essay? That is peanuts for LaTeX or R Markdown. Just try doing some code blocks in a word processor. Try including a page from a PDF document, creating an animation in a PDF for instructions, including descriptive drawings (TikZ), doing reproducible research, or adding an admonition block in a word processor. Technical writers should avoid word processors if possible.

      You should not criticise markup languages as poor at a task and then leave out the one markup language that is very good at that task; this is dishonest.

      You know LaTeX will make writing essays and doing the associated formatting a piece of cake, so you make the excuse that it is hard to learn. One should not wait until one has an urgent need to finish an assignment to learn the tool needed to finish it.

      There are so many templates and so much help available for LaTeX that there should be no excuses about learning, in a short time, to do anything that these puny word processors can do.

      The big problem is that most LaTeX power users know what word processors are like, but most people who depend on word processors do not know what LaTeX and its cousins are. Maybe a markup language is better for them and they do not really know or care.
