HTML to InDesign

Method #2: By Way of Word

Upside

• Free (unless you don’t own Word)

• Translates most tags and classes into styles

Downside

• Word can be very fussy about opening certain URLs

You can use Microsoft Word as a bridge between HTML and InDesign through its Open URL feature (File > Open URL). Enter a URL into the field, and Word captures the content from the destination page. Note that Word doesn’t resolve the “friendly URL” scheme (e.g., /name-of-article-all-spelled-out/) used by most sites today. It treats those URLs as directory names, whereas Word needs a filename, preferably one ending in “htm” or “html.” To get around this limitation, go to the web page and use your browser’s Save As command to save the content as “source” (meaning the actual HTML markup and content), and then open that saved “.html” file with Word.
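If you have many pages to convert, the Save As step can be scripted. The following is a minimal Python sketch, not part of the workflow described here: the function names html_filename_for and save_page_source are hypothetical, and it assumes the page is publicly reachable and that its raw HTML is all you need (no images or stylesheets are saved).

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def html_filename_for(url: str) -> str:
    """Derive a filename ending in .htm/.html from a URL, friendly or not."""
    path = urlparse(url).path.rstrip("/")
    # Last path segment, e.g. "name-of-article-all-spelled-out"
    slug = path.rsplit("/", 1)[-1] or "index"
    if slug.lower().endswith((".htm", ".html")):
        return slug
    return slug + ".html"


def save_page_source(url: str, dest_dir: str = ".") -> Path:
    """Download the raw HTML at `url` and save it under a name Word accepts."""
    target = Path(dest_dir) / html_filename_for(url)
    with urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

Calling save_page_source on a friendly URL should leave a local “.html” file that you can then open in Word the same way as a file saved from the browser.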

Don’t expect Word to get the formatting right, however—the page won’t look like it does on the web when its content arrives in a Word document (Figure 5).

Figure 5

That said, Word does an excellent job translating most HTML tags—headings, lists, etc.—into paragraph styles (Figure 6).

Figure 6

In addition, paragraphs in the HTML with class attributes in the <p> tag result in Word paragraph styles named to match those classes. For example, text enclosed with a tag like <p class="author_name"> will produce a Word style of “author_name.” Generic paragraphs get the basic “Normal” (or, sometimes, “Normal (Web)”) style applied. Bolds, italics, and hyperlinks automatically produce “Strong,” “Emphasis,” and “Hyperlink” character styles, respectively.
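To preview which paragraph style names are likely to show up, you can list the classes used on <p> tags in the saved HTML before opening it in Word. This is a rough sketch using Python’s standard html.parser module; the class and function names are hypothetical, and how Word names a style when a paragraph carries several classes is not covered here, so treat the list as a guide only.

```python
from html.parser import HTMLParser


class ParagraphClassCollector(HTMLParser):
    """Collect the class names used on <p> tags, in first-seen order."""

    def __init__(self):
        super().__init__()
        self.class_names = []

    def handle_starttag(self, tag, attrs):
        if tag != "p":
            return
        # A class attribute may be missing or empty; treat both as no classes.
        class_value = dict(attrs).get("class") or ""
        for name in class_value.split():
            if name not in self.class_names:
                self.class_names.append(name)


def paragraph_classes(html_text):
    """Return the distinct class names found on <p> tags in html_text."""
    collector = ParagraphClassCollector()
    collector.feed(html_text)
    return collector.class_names


sample = (
    '<p class="author_name">Jane Doe</p>'
    "<p>Body text</p>"
    '<p class="pull_quote">A quote</p>'
)
print(paragraph_classes(sample))  # ['author_name', 'pull_quote']
```

Paragraphs without a class, like the second one in the sample, are the ones that would receive the “Normal” style.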

However, this method might bring more of the web content into Word than you want. Site navigation options, sidebars, and page footers will be in the document if you’ve saved the entire web page. In that case, simply select and delete those elements from the Word file, and then choose File > Save As. Give the file a name, and save it in either Word format (.docx) or Rich Text Format (.rtf) by choosing the appropriate option from the Format menu in Word’s Save dialog box.
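As an alternative to deleting those elements by hand in Word, you could trim the saved HTML first. The sketch below is an assumption-laden illustration, not a step from this method: it only helps when the site marks up its navigation, sidebars, and footers with the <nav>, <aside>, and <footer> elements, and the names ChromeStripper and strip_page_chrome are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: the page wraps its non-article content in these elements.
SKIP_TAGS = {"nav", "aside", "footer"}


class ChromeStripper(HTMLParser):
    """Re-emit HTML while dropping SKIP_TAGS elements and their contents."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if self.skip_depth == 0 and tag not in SKIP_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        # Preserve declarations such as <!DOCTYPE html>; comments are dropped.
        if self.skip_depth == 0:
            self.out.append(f"<!{decl}>")


def strip_page_chrome(html_text):
    """Return html_text without <nav>, <aside>, and <footer> elements."""
    stripper = ChromeStripper()
    stripper.feed(html_text)
    stripper.close()
    return "".join(stripper.out)
```

Sites that build their navigation and footers from generic <div> elements will pass through unchanged, in which case deleting the extra content in Word remains the practical route.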

When you import the Word or RTF file into InDesign, the styles will come along with it (Figure 7).

Figure 7

Be prepared, however, to find overrides in abundance and to spend time cleaning up and redefining your styles. Word does an okay job, but it will get a lot wrong, especially if the incoming HTML isn’t clean, logically ordered, and standards-compliant. Also, don’t expect to get a perfect representation of HTML content in InDesign. The best you can expect is to greatly reduce the amount of manual reformatting required while maintaining the content hierarchy, most style designations, and your bolds and italics.


This article was last modified on January 8, 2023
