Making the Sentence Structure of Paragraphs Apparent


This post is about a tiny thing that makes a big difference in practice because I spend so much time writing. Usually, people compose paragraphs as monolithic blocks of text. For several years now, I’ve written paragraphs like this:

Integer overflow bugs in C and C++ programs are difficult to track
down and may lead to fatal errors or exploitable vulnerabilities.
%
Although a number of tools for finding these bugs exist, the
situation is complicated because not all overflows are bugs.
%
Better tools need to be constructed---but a thorough understanding
of the issues behind these errors does not yet exist.

For the non-LaTeX users out there, the percent symbol indicates a comment line. When this text is typeset, the sentences will flow together in the usual fashion. Why do I do it this way? The most important reason is that it calls my attention to the individual sentences in a paragraph. Frequent offenders like run-on sentences, paragraphs that lack a topic sentence, groups of sentences with repetitive structure, and paragraphs that contain a single sentence become trivial to spot. I often find that when I take normally formatted text — whether written by me or by a co-author — and split it into sentences like this, hidden problems in the writing become obvious. A secondary benefit comes out only when interacting with a revision control system: because editing individual sentences does not cause an entire paragraph to require re-wrapping, diffs are much easier to read and conflicts become less likely. The cost of writing this way is that I suspect it annoys co-authors sometimes.
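For anyone who wants to try this on existing text, here is a minimal Python sketch of the conversion. It is not the tool I use (I do this by hand); the function names `split_sentences` and `to_latex_source` are made up for illustration, and the sentence splitter is deliberately naive: it breaks after `.`, `!`, or `?` followed by whitespace, so abbreviations such as "e.g. foo" will be split incorrectly.

```python
import re
import textwrap


def split_sentences(paragraph: str) -> list[str]:
    """Naively split a paragraph into sentences (hypothetical helper)."""
    # Collapse existing line breaks so a re-wrapped paragraph splits cleanly.
    text = " ".join(paragraph.split())
    # Break at whitespace that follows sentence-ending punctuation.
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def to_latex_source(paragraph: str, width: int = 70) -> str:
    """Wrap each sentence separately and join sentences with a lone '%' line."""
    wrapped = [
        # Avoid breaking inside "---" or long tokens, which would add spaces
        # in the typeset output.
        textwrap.fill(s, width=width, break_on_hyphens=False,
                      break_long_words=False)
        for s in split_sentences(paragraph)
    ]
    return "\n%\n".join(wrapped) + "\n"


if __name__ == "__main__":
    sample = ("Integer overflow bugs are hard to find. Not all overflows "
              "are bugs. Better tools are needed.")
    print(to_latex_source(sample), end="")
```

Because LaTeX treats a `%` line as a comment and a single newline as a space, the output typesets the same as the original paragraph.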

There must be other tricks like this that people use — I’d be interested to learn about them. As a random example, when using some old word processor (MultiMate, I think — but modern word processors support this as well) I used to use a switch that made certain non-printing characters visible.

14 responses to “Making the Sentence Structure of Paragraphs Apparent”

  1. The trick I’m used to seeing (and use WAY too seldom) is to make each sentence a single line, and use an editor that will wrap things neatly for you.

    (unrelated: my name doesn’t fit your namefield. Do you have any influence on the set length?)

  2. I start every sentence in a new line, so I do the same thing as you but without the % signs. Olin does that too. It definitely helps with the diffs.

  3. In emacs, turning off auto-fill-mode, and turning on visual-line-mode, makes it easy to do the one line per sentence thing and have it still look nice on the screen. But without the separating comment chars, you do need to train yourself out of the “constantly hit M-q” habit!

  4. Michael, the esc-q thing is so ingrained that I think I’d just have to rebind it!

    Murat, looks interesting — though I am pretty traditional when it comes to typesetting issues. People have been typesetting for more than 500 years, but editing on screen for a small fraction of that.

  5. I do something similar, but not identical:

    Traditionally, testbeds for networking and systems research have been
    designed as monolithic facilities: they contain a single root of trust.
    The resources in the facility are assumed to be administered by a single
    entity or a set of mutually-trusting entities.
    All user management, including vouching for users’ identities and taking
    responsibility for their actions, is done using a flat trust structure or a
    simple hierarchy with the facility itself as the root.
    This design is not a good match for testbeds that are composed of multiple
    autonomous facilities, or in which different parts of the testbed operate
    under different trust models.

    I like it because it preserves the relatively dense appearance the paragraph will have when typeset (e.g., no nearly-blank lines in the middle of the source file), while still making the sentences stand out. It also makes it extremely obvious when you’ve started too many sentences in a paragraph with the same word, something I have to watch out for in my own writing.

    IIRC, I picked this style up from John Byers.

  6. Bah, the comment system ate my pre tags. What it’s supposed to look like is that the first line of each sentence starts in column 0, then subsequent lines are indented by a tab.

  7. Mikael, I couldn’t find a way to increase the length of the name field.

    I am far from being a major WP fan, just not sure what else to use and do not feel like hacking something together myself.

  8. If you’re a vimmer, then you can add the following to your .vimrc to turn on those invisible characters:

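    if has('multi_byte')
      " Set the encoding first, then declare this script's own encoding
      set encoding=utf-8
      scriptencoding utf-8
      set fileencodings=ucs-bom,utf-8,latin1
      if v:version >= 700
        " Vim 7+ : use Unicode glyphs for every marker
        set lcs=tab:»\ ,trail:·,eol:¶,extends:→,precedes:←,nbsp:×
      else
        " Older Vim: fall back to ASCII for some markers
        set lcs=tab:»\ ,trail:·,eol:¶,extends:>,precedes:<,nbsp:_
      endif
      set list " This actually turns them on
    endif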

  9. I’m with the no \n and turn on word wrap camp here. (About the only time I use word wrap in a text editor.) As for visible control characters, I won’t use a text editor that doesn’t let me do that.

  10. Especially when using version control with LaTeX documents, I absolutely prefer one sentence per line and no automatic line breaks inserted by the editor.
    It allows version control to do its job, while avoiding strange LaTeX problems when the comment signs are not there or have a space in front of them by accident.
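
The diff benefit mentioned in the post and in several comments can be checked with a small experiment. The Python sketch below is only an illustration under stated assumptions: `changed_lines` is a made-up helper, the 40-column wrap width is arbitrary, and `difflib.ndiff` stands in for whatever diff the revision control system actually uses. It edits one sentence and counts how many lines the diff touches when the paragraph is re-wrapped as a block versus stored one sentence per line.

```python
import difflib
import textwrap


def changed_lines(before: str, after: str) -> int:
    """Count the lines a line-based diff reports as removed or added."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return sum(1 for line in diff if line.startswith(("- ", "+ ")))


sentences = [
    "Integer overflow bugs are difficult to track down.",
    "Not all overflows are bugs.",
    "Better tools need to be constructed.",
]
# Edit only the first sentence.
edited = ["Integer overflow bugs in C and C++ programs are difficult "
          "to track down."] + sentences[1:]

# Style 1: the whole paragraph re-wrapped as one block.
block_before = textwrap.fill(" ".join(sentences), width=40)
block_after = textwrap.fill(" ".join(edited), width=40)

# Style 2: one sentence per line, no re-wrapping.
lines_before = "\n".join(sentences)
lines_after = "\n".join(edited)

print("re-wrapped block:", changed_lines(block_before, block_after))
print("sentence per line:", changed_lines(lines_before, lines_after))
```

With one sentence per line, the edit shows up as a single removed line and a single added line; in the re-wrapped block, the change ripples into the lines that follow it.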