dot-font: Importing Text

The biggest headache for anyone producing a book or any other text-based publication is dealing with imported text. It never arrives in a form that’s easy to work with; it usually requires a great deal of hand work to massage raw files (usually Microsoft Word documents) into typographically well-behaved body text.

When I was talking recently with Olav Martin Kvern (known far and wide as “Ole”) about this perennial problem, his advice was succinct and provocative: “First — give up!” That is, give up hoping to get well-prepared files every time. Ole knew, from many long years in the production trenches, just how unlikely it is that anyone would prepare a Word file so that it imports easily and seamlessly. He reminisced about technical writers who insisted on creating — and using — as many as eight different levels of indented lists, and uncounted levels of subheads. “When I imported those lists,” said Ole, “I just changed them all to a List style, no matter what level the writer used.”

Although sometimes we just have to deal with overly complex manuscripts, as designers we can encourage writers and editors to keep it simple. But even more important is keeping it consistent.

Using Styles
The usual problem is what’s called “local formatting,” which encompasses everything that’s applied by the writer on an ad-hoc basis: italics, bold, indents, changes of font or size. All of those changes — all of them — should be incorporated into Word styles rather than applied by hand. That way, it’s easy to map them to styles in InDesign or QuarkXPress with the same names, and use those to automate all the typographic changes. (I mostly use InDesign as my design and production tool, so that’s what I’ll be talking about here. The principles are the same in QuarkXPress, though the details may vary.)

Using styles seems to be a problem for a lot of writers and editors, but it’s really quite simple, and you can help them get into the habit. In Word, writers probably use the Normal style to format the basic text; all they have to do is make their preferred formatting (font, size, line spacing, etc.) part of the default style. (Microsoft makes it absurdly hard to change the defaults in the global Normal style, but you can modify the Normal style in a particular document fairly easily.) If writers want the manuscript to appear in 12pt Courier on the screen, then all they have to do while writing is format the Normal style to use 12pt Courier, rather than the default Times New Roman. Or they can create a new style, with a new name and whatever font, size, and leading they want, and use that instead of Normal.

Flexible Styles
The style attributes in Word don’t have to reflect the way the text is going to appear in your layout. That’s the beauty of styles; you can give them one definition in Word, based on what the writer finds easy to use, and another, entirely different definition in InDesign or QuarkXPress, based on the page design and typography of the publication.

But if, in Word, someone adds extra formatting that’s not part of the style — extra indents, for instance, or a change of size or font — then that “local formatting” will come in along with the style, when the file is imported into the page-layout program. And that will cause a lot of grief for the production people, and add time to the schedule.

Encourage the writers to think ahead of time about the different elements they might need to use in their manuscripts. Indented quotations? Create a style for them! Then, instead of having to change the indents by hand each time they come to another indented quotation, they can simply apply the “Indented quote” style to those paragraphs. If the indents are part of the style itself, every time writers apply that style to a paragraph, it will automatically become indented.

Since the amount of indent on a manuscript page doesn’t have much to do with the amount of indent on, say, a printed page with three narrow columns of type, you can simply redefine the “Indented quote” style in InDesign to something appropriate. No leftover local formatting you have to delete!

Everyone on the Same Page
Ole suggested exporting a set of styles from InDesign, tweaking them in Word (for instance, by making the font size larger, to be easier on the eyes of the writer), then giving that style sheet to all the writers and editors who will submit files for the project.

I usually do something simpler: I give the writer a set of style names and ask them to use those names when constructing styles in Word. I leave the definition of the styles up to the writer. (I suppose it would make more sense to simply create a set of styles in InDesign with names identical to the most commonly used Word styles, but somehow I never seem to do it that way.)

Consistency is the key. Either of these plans will work only if the writers and editors limit themselves to the styles they’ve been given, or confer with the designer before introducing new ones.

Paragraph Styles and Character Styles
What are styles, in Word or InDesign? They are a way of automating the format. A style is a collection of attributes that can all be applied together to any block of text. Using styles, rather than doing all the text formatting manually, is a great time-saver. That’s why Word comes with so many predefined styles: writers can apply them as they write or edit, without having to invent their own.

In Word and InDesign, there are both “paragraph” and “character” styles. Paragraph styles apply to everything in a paragraph, everything up until the next paragraph return. Character styles work on a smaller level; they apply to any text that’s selected, whether it’s a single character or the whole text of the document. You can use paragraph and character styles together; for instance, Normal, which is always a paragraph style, might define the basic font, type size, leading, and indents of the text, while Strong, a character style that’s another one of Word’s default styles, might give emphasis to a word here and there.

It’s particularly useful, for the sake of production, if you can persuade the writer to use a character style for italics they want to apply to words or titles. Instead of just hitting a key combination, or clicking an icon, to make a word italic, they should choose the “italic” character style instead. (You may have to create such a style and give it to them.)

If the writer doesn’t use a style for italics, you can open the manuscript in Word before you import it into InDesign and search for the Font Style “Italic” (without specifying any font), replacing it with the “italic” character style. Then you’ll know that all the italics in the file you import will be styled, not locally formatted.
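For a whole batch of manuscripts, the same search-and-replace could in principle be scripted. What follows is only a rough sketch, not a tested production tool. It assumes you have already extracted word/document.xml from the .docx file (which is a zip archive), that a character style with the ID "italic" already exists in the document, and that runs which already carry some other character style should be left alone. The function name is made up for this illustration.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"
ET.register_namespace("w", W_NS)  # keep the familiar w: prefix on output


def style_local_italics(document_xml, style_id="italic"):
    """Swap directly applied italics for a reference to a character style.

    style_id is an assumption: it must match a character style that
    already exists in the document's style definitions.
    """
    root = ET.fromstring(document_xml)
    for run in list(root.iter(W + "r")):
        rpr = run.find(W + "rPr")
        if rpr is None:
            continue
        italic = rpr.find(W + "i")
        # <w:i w:val="0"/> or "false" switches italics off, so skip those.
        if italic is None or italic.get(W + "val") in ("0", "false"):
            continue
        # A run that already has a character style is left untouched.
        if rpr.find(W + "rStyle") is not None:
            continue
        rpr.remove(italic)
        # The style reference is placed first among the run properties.
        rpr.insert(0, ET.Element(W + "rStyle", {W + "val": style_id}))
    return ET.tostring(root, encoding="unicode")
```

The result would still have to be written back into the .docx archive, and it is worth opening the file in Word afterward to confirm that the italics really are styled.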

Import Options and Style Mapping
When you’re importing files into InDesign, there are a few options open to you. InDesign offers an “import option” for Word files, where you can choose either to ignore any formatting that has been applied or to preserve that formatting. An important subset of that option is “style mapping,” where you can map any Word style to an equivalent InDesign style. This is handy for situations where the style names were meant to be the same but don’t quite match; InDesign pays attention to capitalization in style names, so “Text Paragraph” and “text paragraph” aren’t the same, for example.
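To see ahead of time which style names will fail to match, you could compare the two lists outside InDesign. The sketch below is purely illustrative: it does not talk to InDesign or Word at all, the function name is hypothetical, and it assumes you have typed or exported both lists of style names yourself. It pairs names that differ only in capitalization or stray spaces and reports the rest.

```python
def build_style_map(word_styles, indesign_styles):
    """Pair Word style names with InDesign style names, ignoring case and extra spaces."""

    def key(name):
        # Collapse runs of whitespace and fold case so near-misses compare equal.
        return " ".join(name.split()).casefold()

    by_key = {key(name): name for name in indesign_styles}
    mapping, unmatched = {}, []
    for name in word_styles:
        target = by_key.get(key(name))
        if target is None:
            unmatched.append(name)  # needs a decision from the designer
        else:
            mapping[name] = target
    return mapping, unmatched


mapping, unmatched = build_style_map(
    ["text paragraph", "Indented  Quote", "Sidebar"],
    ["Text Paragraph", "Indented quote"],
)
```

Here "text paragraph" would be paired with "Text Paragraph", while "Sidebar" lands in the unmatched list, which is the cue to talk to the writer before importing.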

But sometimes the Word files you get are hopelessly muddled, with confused styles haphazardly applied, and local formatting all over the place, so the only thing to do is strip out all the formatting and start from scratch with plain text in InDesign. The most frequent casualty of this technique is the italics. That’s why it’s so important to have an “italic” style applied to all the italics in the Word file; otherwise, you may have to re-apply it by hand in the InDesign file.

Note to software developers: it would be extremely handy to be able to strip out all local formatting except italics. This is the most common problem — losing the italics — and yet it would be so easy to fix. Even more handy and flexible would be an interface where you could choose which characteristics, whether style or local formatting, to import and which to ignore.
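The logic behind such a feature is not complicated. As a thought experiment only, here is a minimal sketch in which a piece of text is modeled as a list of runs, each carrying a dictionary of local formatting. The Run class, the attribute names, and the function are all invented for this illustration; no page-layout program exposes its text this way.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    text: str
    formatting: dict = field(default_factory=dict)  # e.g. {"italic": True, "size": 12}


def strip_formatting(runs, keep=("italic",)):
    """Drop every local formatting attribute except the ones named in keep."""
    cleaned = []
    for run in runs:
        kept = {k: v for k, v in run.formatting.items() if k in keep}
        # Merge with the previous run when the surviving formatting is identical.
        if cleaned and cleaned[-1].formatting == kept:
            cleaned[-1].text += run.text
        else:
            cleaned.append(Run(run.text, kept))
    return cleaned


runs = [
    Run("The ", {"font": "Courier", "size": 12}),
    Run("Odyssey", {"font": "Courier", "italic": True}),
    Run(" is ", {"bold": True}),
    Run("long.", {}),
]
cleaned = strip_formatting(runs)
```

With the default setting, only the italics on "Odyssey" survive; the font, size, and bold are discarded, and the neighboring plain runs are merged. Passing a different keep list is the "choose which characteristics to import" idea in miniature.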

A Few Common Problems
Have you ever imported a Word file with styles that correspond to your InDesign styles and found that all the paragraphs seem to have some sort of extra formatting applied to them, but you can’t tell what it is? Every style name in the Paragraph Styles palette shows up with a plus-sign next to it (which usually indicates that there’s local formatting applied that’s not defined in the style). Ole pointed out one way this can happen when you’re dealing with international publishing. If you receive a file that has been prepared on a system with Japanese-language options turned on at the system level, there may be certain instructions included in the style that are only applicable to text in Japanese; they don’t affect how the text appears in English, but InDesign sees them as added local formatting, and indicates this with the “+” next to the style name. If your system isn’t Japanese-enabled, you may not have any interface for turning those features off.

But styling problems may be more mundane than that. If some of your text shows up looking different — in 12pt Courier, for instance, when the rest of the paragraph is 10pt Sabon — even though there is no “+” next to the style name in the style palette, what’s going on? It may be a Word artifact. In Word, you can use a character style as part of the definition of a paragraph style, but InDesign doesn’t work that way; it considers the paragraph style and the character style independent of each other. So you may end up with an unexpected character style applied to some or all of the text in a paragraph, even though you thought that paragraph had only a paragraph style applied to it. Probably the best way to avoid this kind of trouble is to carefully check all the character-level styles that are being imported, in InDesign’s import options, and either redefine or delete any character styles that you don’t actually intend to use.

Finally, the most common problem of all: Many writers, especially those who grew up using typewriters rather than computers, don’t realize that, in an electronic text file, the definition of a paragraph isn’t what it looks like on their screen or on the page, but whether it ends with a paragraph return. These writers may use tabs, rather than returns, to create a new paragraph — or what they think is a paragraph. If you find that paragraphs in an imported document seem to be behaving oddly, or that local formatting seems to leak from one paragraph to the next, the most likely culprit is misplaced tabs.
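A quick way to hunt for these false paragraphs is to scan a plain-text export of the manuscript before importing it. The following is a rough sketch under stated assumptions: that real paragraphs end with a newline in the export, and that the characters standing in for a return are tabs or vertical-tab characters (a guess at how a manual line break might appear in plain text). Both are parameters, so they can be changed to match what your files actually contain. The function name is hypothetical.

```python
def find_suspect_paragraphs(text, paragraph_return="\n", suspects=("\t", "\x0b")):
    """List (paragraph number, preview) for paragraphs with break-like characters inside them."""
    flagged = []
    for number, paragraph in enumerate(text.split(paragraph_return), start=1):
        # Whitespace at either end is ignored; only characters buried
        # in the middle of a paragraph are treated as suspicious.
        body = paragraph.strip()
        if any(ch in body for ch in suspects):
            flagged.append((number, body[:40]))
    return flagged


sample = "First paragraph.\nSecond one\t\t\tthat is really two.\nThird."
flagged = find_suspect_paragraphs(sample)
```

In this sample only the second paragraph is flagged, because it hides a run of tabs where the writer probably meant to start a new paragraph. Each flagged paragraph still needs a human look before anything is changed.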

Working Smart
After jumping through all these hoops to import a file and massage it typographically, the last thing you want is to have to do it all over again. Once the text has been imported, any corrections, changes, and edits should be made in the InDesign or QuarkXPress file, not in a new Word file.

There isn’t any simple solution to the problem of importing text. Despite all the nifty tools that software developers give us for this purpose, it still takes forethought and close attention on the part of everyone involved: the people who prepare the original text, anyone who works on it in the middle stages, and the person who has to pour it into a page layout and make it typographically functional. But the process can be made easier, if never truly easy. The key is simplicity and styles. And consistency in applying both.


Posted on: February 9, 2007

4 Comments on dot-font: Importing Text

  1. Well done John for addressing this common problem. I used to be surprised when I found people who didn’t know about styles, but now I’m used to it.

    I’ve become good at doing quick demos that make people want to learn styles (I think it’s when they realise they’ve done days of overtime reformatting large documents that they see how useful styles are). Anyone who’s worked on a large document can instantly see the benefits when you demonstrate how easily you can globally reformat it.

    I generally create Word templates for writers/keyboardists, with styles in place. This lets me create styles that automatically switch (back to text after subhead for instance). This really excites people and gets them using the styles. For my own writing I connect Word styles to the function keys on my Mac, then label these with Post-It tape. I don’t do this for other people in case they use the function keys. [Note to Adobe – why not free up the definition of keystrokes for styles in InDesign? I like using my function keys; others like using their numeric keyboard]

    Because Word has a slew of its own styles, which you can’t delete, I start all my styles with an asterisk. This keeps them all grouped together and ensures I (or my co-writers) use the right styles.

    One thing I’d strongly recommend is that you never, ever use the Normal style – even for regular word processing. Because all the other preset styles are dependent on it, a stray “Normal” can drive you crazy. So I normally build my Word document styles from a “*Body” style, which I make the starting default in my templates.

    Word also has an irritating ‘helpful’ feature that automatically assigns styles when someone uses local formatting. Turn this off.

    Using the leading asterisk (or bullet, hyphen, whatever) makes the imported styles easy to spot in InDesign or Quark style lists. They can then be mapped through to the publication’s final styles.

    Hope this is a useful addition to the arsenal!

  2. Mr. Barry mentions “…exporting a set of styles from InDesign, tweaking them in Word…” — I’d love to find out how to do this. Currently we’re using trial & error to make Word styles dovetail into our InDesign templates, and working backwards this way would be excellent. Anybody got advice?

  3. Your suggestion: “Note to software developers: it would be extremely handy to be able to strip out all local formatting except italics. Even more handy and flexible would be an interface where you could choose which characteristics, whether style or local formatting, to import and which to ignore.”
    A solution exists for QuarkXPress in the form of Stylin’, from Vision’s Edge. This $49 XTension lets you apply a new style without losing your favorite attributes that are already applied to the text. You can choose to keep any combination of paragraph or character attributes, such as font styles, tracking, color, space-before and -after, alignment, etc. Once you define a combination of attributes you want to keep, you can save that as a Setting for future use.

  4. I do a PDF newsletter for a technical organization that solicits articles from multiple members. I agree with “Ole” that you’ll never get them to the point that you can simply import their files, even if they KNOW your druthers. Forget matching style lists: they won’t cooperate. My stripping out the formatting works too well; these authors always have many web references and email addresses as part of the story, and these get stripped as well. I was struck by your wish: “…handy and flexible would be an interface where you could choose which characteristics, whether style or local formatting, to import and which to ignore.” Ohh, YESS!
