Create index (topics and page references) from a word list

Using the entries in a word list, the script adds topics and page references to all open InDesign documents.

Use

First, rename the word list so that its name contains "topic list", then open it. Next, open all documents that should be indexed and run the script. It has no interface. The script uses each item in the list to create a topic in each document and adds page references.
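
As a rough illustration of the naming convention, the sketch below separates the topic list from the documents to be indexed, given only the names of the open documents. The function name splitOpenDocuments and the exact matching rule (a plain, case-sensitive check for "topic list" in the name) are assumptions made for this example; the script itself may identify the list differently.

```javascript
// Hypothetical helper, not taken from the script.
// Assumption: the topic list is the first open document whose name
// contains the text "topic list"; all other documents get indexed.
function splitOpenDocuments(names) {
  var result = { topicList: null, targets: [] };
  for (var i = 0; i < names.length; i++) {
    if (result.topicList === null && names[i].indexOf("topic list") > -1) {
      result.topicList = names[i];
    } else {
      result.targets.push(names[i]);
    }
  }
  return result;
}

var docs = splitOpenDocuments(["chapter1.indd", "topic list.indd", "chapter2.indd"]);
// docs.topicList is "topic list.indd"; docs.targets holds the two chapters
```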

A log is created in a new document that lists items that could not be found and ambiguous items (see below).

Settings

The script can be set to search case-sensitively, to delete an existing index, and to consider only paragraphs formatted with certain paragraph styles.

Ignore commas

By default the script ignores everything from the first comma in a topic name. Thus, of the entry

Leech, G.

only Leech is used for the search: after all, a text is more likely to contain just an author's surname than their surname followed by their initials or first name. To change this, and have the script look for the whole entry, i.e., including the comma and what follows it, change this line in the script:

var comma_split = true;

to read var comma_split = false;

Ignore parentheses

This setting works like the previous one: by default the script ignores everything from the first opening parenthesis in a topic name. Take this entry:

abomination (see also terribleness)

By default, the script looks for abomination only. To change this, look for this line in the script:

var paren_split = true;

and change it so that it reads var paren_split = false;
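
The combined effect of comma_split and paren_split on a topic name can be sketched as follows. The helper searchTerm is hypothetical and written only for illustration; it is not the script's own code, and it assumes that surrounding spaces are dropped after the cut.

```javascript
// Hypothetical helper, not taken from the script.
// Derives the text to search for from one entry in the topic list.
function searchTerm(entry, comma_split, paren_split) {
  var term = entry;
  if (comma_split && term.indexOf(",") > -1) {
    // Keep only what precedes the first comma
    term = term.substring(0, term.indexOf(","));
  }
  if (paren_split && term.indexOf("(") > -1) {
    // Keep only what precedes the first opening parenthesis
    term = term.substring(0, term.indexOf("("));
  }
  // Assumption: leading and trailing spaces are removed.
  // A regex is used because older ExtendScript engines may lack trim().
  return term.replace(/^\s+|\s+$/g, "");
}

searchTerm("Leech, G.", true, true);                            // "Leech"
searchTerm("abomination (see also terribleness)", true, true);  // "abomination"
searchTerm("Leech, G.", false, true);                           // "Leech, G."
```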

Case-sensitivity

The script searches case-sensitively by default. To make it search case-insensitively, change this line:

var case_sensitive = true;

to read var case_sensitive = false;.

Whole words only

The script searches only for whole words. This cannot be changed.
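
To show what whole-word and case-sensitive matching mean in practice, here is a minimal sketch using a plain JavaScript regular expression. It is an illustration only: the script works through InDesign's own search, and the helper names escapeRegExp and wholeWordPattern are assumptions. Note that \b in JavaScript recognises only ASCII word characters, so this simple pattern is less reliable for terms that begin or end with accented letters or punctuation.

```javascript
// Hypothetical helpers, not taken from the script.
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern that matches the term only as a whole word.
// When case_sensitive is false, the "i" flag makes the match ignore case.
function wholeWordPattern(term, case_sensitive) {
  return new RegExp("\\b" + escapeRegExp(term) + "\\b", case_sensitive ? "" : "i");
}

wholeWordPattern("Leech", true).test("Leech and Short");  // true
wholeWordPattern("Leech", true).test("Leeches");          // false: not a whole word
wholeWordPattern("Leech", true).test("a leech");          // false: case differs
wholeWordPattern("Leech", false).test("a leech");         // true
```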

Replace existing indexes

By default, the script deletes any existing topics and page references. To keep an existing index, change this line:

var replace_index = true;

to read var replace_index = false;.

Certain paragraph styles only

There are probably a number of parts in each document that you want to exclude from the index. Typically, indexes don't refer to such items as bibliographies, quotations, and chapter titles. Some publishers want to exclude tables as well. To allow for this, the script can be set to mark topics only in paragraphs formatted with certain paragraph styles.

By default this feature is off, that is, everything the script finds is marked for the index. To make the script look only in certain paragraphs, list their paragraph styles, using just the first three letters of each name, each prefixed by |, the vertical bar. Look in the script for this line:

var paragraph_styles = "";

This is the default – the script finds all paragraphs. To include only paragraphs that are formatted with the paragraph styles default and sectionA, sectionB, sectionC, etc., change the above line to this:

var paragraph_styles = '|def|sec';

Again, use just the first three letters of the paragraph style (case sensitive), prefixed by |. Be careful not to add any spaces before or after the | symbol.
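
One way such a setting could be interpreted is sketched below. The function styleIsIncluded is hypothetical and the comparison it performs is an assumption about the script's behaviour; it is meant only to show why the three-letter prefixes are case sensitive and why stray spaces prevent a match.

```javascript
// Hypothetical helper, not taken from the script.
// Returns true if a paragraph with the given style should be indexed.
function styleIsIncluded(styleName, paragraph_styles) {
  // An empty setting means the filter is off: every paragraph counts
  if (paragraph_styles === "") {
    return true;
  }
  var prefixes = paragraph_styles.split("|");
  var start = styleName.substring(0, 3);
  for (var i = 0; i < prefixes.length; i++) {
    // Skip the empty item produced by the leading |
    if (prefixes[i] !== "" && prefixes[i] === start) {
      return true;
    }
  }
  return false;
}

styleIsIncluded("sectionB", "|def|sec");      // true
styleIsIncluded("Bibliography", "|def|sec");  // false
styleIsIncluded("Default", "|def|sec");       // false: prefixes are case sensitive
styleIsIncluded("default", "| def|sec");      // false: the space breaks the match
```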

Log

The script lists all items it could not find in a new document.

In the case of ambiguities, both items are listed. For example, if the topic list contains Williams, Frank and Williams, Max, and the script is set to ignore commas (and what follows), the name Williams is indexed without distinguishing the first names and both Williams, Frank and Williams, Max are logged. It's up to you to disambiguate the index.
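
The idea behind the ambiguity check can be sketched as grouping the entries by the text that is actually searched for. The function findAmbiguous is hypothetical, and it assumes the comma setting is on, so that everything from the first comma is ignored; the script's own bookkeeping may differ.

```javascript
// Hypothetical helper, not taken from the script.
// Returns all entries whose search text is shared with another entry.
function findAmbiguous(topics) {
  var byKey = {};
  var i, key;
  for (i = 0; i < topics.length; i++) {
    // Assumption: the search text is everything before the first comma
    key = topics[i].split(",")[0];
    if (!byKey.hasOwnProperty(key)) {
      byKey[key] = [];
    }
    byKey[key].push(topics[i]);
  }
  var ambiguous = [];
  for (key in byKey) {
    if (byKey.hasOwnProperty(key) && byKey[key].length > 1) {
      ambiguous = ambiguous.concat(byKey[key]);
    }
  }
  return ambiguous;
}

var logged = findAmbiguous(["Williams, Frank", "Leech, G.", "Williams, Max"]);
// logged holds "Williams, Frank" and "Williams, Max"; "Leech, G." is unambiguous
```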


Version history

28 Aug. 2024: Some matches needed to be refined. If a search term contains a dash, then the script should target the term with normal and non-breaking hyphens. Similarly, if a search term contains a space, then the script should find the term with a non-breaking space as well. Parentheticals weren't matched correctly. All this is fixed. Also, the paren_split option somehow got lost, now reinstated. Finally, ScriptUI progress bars stopped working in ID2025, so I removed all progress indication.

6 Aug. 2024: When run against more than one document, the script's log contained many false negatives. Fixed.

16 Sept. 2023: Some improvements make the script more efficient. Various clarifications in the text.

28 July 2019: The script now creates a log file that shows which items weren't found and a list of ambiguous entries.

16 Nov. 2011: Fixed bug that ignored some words in the word list.

Around 2007: First posted.


Useful script?

Consider making a donation. To make a donation, please press the button below. This is Paypal's payment system; you don't need a Paypal account to use it: you can use several types/brands of credit and debit card.

Peter Kahrel's paypal account
