Language-aware sorter (light)

The script sorts paragraphs using the alphabetisation rules of the document's default language. It is a quick alternative to the more configurable paragraph sorter (see here), though it cannot deal with formatted lists.

Use

Before you sort a text, you need to set two things:

sort foreign languages

To sort a text, select a text frame and run the script. The script sorts the whole story, not just the selected frame.

Details

Sorting is no easy matter in general, and some lengthy posts suggest that scripting a sorter in JavaScript isn't straightforward at all if you want to sort using the rules of particular languages – see Marc Autret's collator and the page that describes my paragraph sorter. These two approaches started from scratch, implementing sorting routines in JavaScript to alphabetise JavaScript arrays or InDesign text, or a combination of these methods.

But if the problem is to sort using language-specific rules, then maybe we could use one of InDesign's two internal sorters – the index and the table of contents – to sort text. And in fact we can. For reasons that I'll not go into here, the index is the better choice for deployment as our custom sorter. Though InDesign has its own problems when sorting indexes – you'll notice these only in particular circumstances (see below, under Disadvantages) – the language awareness of InDesign's built-in sorter is excellent.

Now, if we wanted to use InDesign's index feature to sort our lines, we'd have to do something along the following lines:

  1. transfer all lines in the text to InDesign's index; they will appear in the Index panel as topics, sorted by InDesign;
  2. remove all text from the document; you now have a document without text but with a list of topics;
  3. write all topics into the document; you now have sorted your list.
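
The three steps can be modelled in plain JavaScript, outside InDesign. This is only a sketch of the idea, not the attached script: `TopicIndex` and `sortViaIndex` are hypothetical names, and the standard `Intl.Collator` stands in for InDesign's own index sorter, whose rules may differ.

```javascript
// Hypothetical stand-in for InDesign's index: it keeps each topic once
// and returns the topics in language-aware order.
class TopicIndex {
  constructor(locale) {
    // Assumption: Intl.Collator approximates a language-aware sort;
    // it is not InDesign's sorter.
    this.collator = new Intl.Collator(locale);
    this.topics = new Set();
  }

  addTopic(name) {
    // Adding a name that is already present leaves the Set unchanged.
    this.topics.add(name);
  }

  sortedTopics() {
    return Array.from(this.topics).sort(this.collator.compare);
  }
}

function sortViaIndex(text, locale) {
  const index = new TopicIndex(locale);

  // Step 1: transfer every non-empty line to the index.
  for (const line of text.split("\n")) {
    if (line !== "") {
      index.addTopic(line);
    }
  }

  // Step 2: the original text is discarded; only the topics remain.
  // Step 3: write the sorted topics back as the new text.
  return index.sortedTopics().join("\n");
}

console.log(sortViaIndex("pear\napple\nfig\napple", "en"));
```

In this model the repeated line `apple` is stored only once, so the result has three lines rather than four.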

This is precisely what the attached script does. The script is short, can be adapted for different purposes, and is very quick. There are some advantages, but some disadvantages as well.

Advantages

The script is unbelievably quick. That's not because of the way the script is written, but it's down to the blazing speed at which InDesign handles its indexes. A 3,500-word list sorts in about 1.5 seconds – and that includes reading and writing the list.

As a side benefit, InDesign removes all duplicate lines. This is because you can't have duplicate topics in an index, therefore when the script adds topics to the index, duplicates are filtered out. No extra effort required.

Disadvantages

The main disadvantage of the script is that it makes a mess of any styling and formatting. For styled lists, use the more elaborate paragraph sorter, linked above.

A potential problem is that InDesign ignores spaces and commas when sorting its indexes. That spaces are ignored is correct for letter-by-letter sorting, but commas should be respected. A list of names, for instance, will be sorted as follows:

  Li, C.
  Liebers, J.
  Li, G.N.
  Li, K.
  Lima, C.J.
  Lim, C.S.

While the correct sort would be this:

  Li, C.
  Li, G.N.
  Li, K.
  Liebers, J.
  Lim, C.S.
  Lima, C.J.
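
The two orderings above can be reproduced with a small plain-JavaScript experiment. It is a sketch under simple assumptions: the function names are made up for illustration, strings are lower-cased and compared by character code, and the "surname, initials" layout of the sample names is taken for granted. It does not reproduce InDesign's actual sorting code.

```javascript
const names = [
  "Lim, C.S.", "Li, K.", "Lima, C.J.", "Li, C.", "Liebers, J.", "Li, G.N."
];

// Plain comparison by character code; enough for this ASCII-only sample.
function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Letter-by-letter key that drops spaces AND commas, the behaviour
// described in the text above.
function keyIgnoringSpacesAndCommas(name) {
  return name.toLowerCase().replace(/[ ,]/g, "");
}

function sortIgnoringSpacesAndCommas(list) {
  return list.slice().sort((a, b) =>
    compareStrings(keyIgnoringSpacesAndCommas(a), keyIgnoringSpacesAndCommas(b))
  );
}

// Respecting the comma: compare the part before the comma first and
// use the remainder only to break ties.
function sortRespectingCommas(list) {
  return list.slice().sort((a, b) => {
    const [surnameA, restA = ""] = a.toLowerCase().split(/,\s*/);
    const [surnameB, restB = ""] = b.toLowerCase().split(/,\s*/);
    return compareStrings(surnameA, surnameB) || compareStrings(restA, restB);
  });
}

console.log(sortIgnoringSpacesAndCommas(names).join("\n"));
console.log(sortRespectingCommas(names).join("\n"));
```

With the commas dropped, `Liebers, J.` becomes `liebersj.` and lands between `lic.` and `lig.n.`, which is how it ends up in the middle of the Li entries.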

By manipulating the sortorder fields, InDesign's sort can be fine-tuned. This is not catered for in the attached script but can be added.


Version history

8 August 2012: posted


Peter Kahrel's paypal account
