Languages differ in the way that they treat accented letters when sorting lists. The script takes these differences into account by using the sort orders defined in a separate, editable file (see below for details). The script sorts paragraphs (not tables; for tables, see here). It can also create retrograde lists (words are sorted from the end of the word rather than from the beginning, which groups rhyming words together, so to speak), sort numerically, and sort by character style.
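To give an idea of what a retrograde list is, here is a minimal Python sketch (not the script's own code, which runs inside InDesign). The function name retrograde_key and the sample words are made up for the illustration; the only idea taken from the text is that words are compared from their last letter backwards.

```python
def retrograde_key(word: str) -> str:
    """Return the word reversed, so comparison starts at its last letter."""
    return word.casefold()[::-1]

# Hypothetical sample words, chosen so that the rhymes end up together.
words = ["nation", "cat", "motion", "hat", "station"]
retrograde = sorted(words, key=retrograde_key)
print(retrograde)  # ['nation', 'station', 'motion', 'cat', 'hat']
```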
The script can be configured to sort paragraphs following the sort-order rules of any language. In addition, it can be set to sort paragraphs entirely accent-neutrally, which is needed, for instance, in English-language texts that can contain many accented characters, such as bibliographies and indexes.
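As a rough illustration of accent-neutral sorting, the Python sketch below strips diacritics before comparing. This is a swapped-in technique based on Unicode decomposition, not the script's method: the script takes its letter variants from the sort-order file. The function name and the sample names are assumptions made for the example.

```python
import unicodedata

def accent_neutral_key(text: str) -> str:
    """Drop combining marks so that accented letters compare as their base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Hypothetical bibliography entries.
names = ["Zola", "Éluard", "Eco", "Ångström", "Adams"]
neutral = sorted(names, key=accent_neutral_key)
print(neutral)  # ['Adams', 'Ångström', 'Eco', 'Éluard', 'Zola']
```

Letters that have no Unicode decomposition (Ł, Ø, Æ, for instance) are not handled by this approach, which is one reason why an explicit list of variants per base letter is more dependable.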
Word by word | Letter by letter
High, J. | High, J.
high (light-hearted) | high (light-hearted)
high chair | highball
high-fliers | highbrow
high heels | high chair
High-Smith, P. | Highclere Castle
high water | high-fliers
High Water (play) | high heels
highball | highlights
highbrow | Highsmith, A.
Highclere Castle | High-Smith, P.
highlights | high water
Highsmith, A. | High Water (play)
highways | highways
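The difference between the two methods can be sketched in a few lines of Python. This is a simplified illustration, not the script's code: the key functions are made up for the example, and they ignore the conventions for commas and parentheses that put High, J. and high (light-hearted) at the top of both columns, so they will not reproduce the full table. They do show the central point: word by word, a space sorts before any letter; letter by letter, spaces and hyphens are ignored.

```python
import re

def word_by_word_key(entry: str) -> list:
    """Compare entries one word at a time; a shorter first word sorts first."""
    return re.findall(r"\w+", entry.casefold())

def letter_by_letter_key(entry: str) -> str:
    """Compare entries as one run of letters, ignoring spaces and hyphens."""
    return "".join(re.findall(r"\w+", entry.casefold()))

entries = ["highball", "high water", "high chair", "highways", "highbrow", "high heels"]

by_word = sorted(entries, key=word_by_word_key)
by_letter = sorted(entries, key=letter_by_letter_key)

print(by_word)
# ['high chair', 'high heels', 'high water', 'highball', 'highbrow', 'highways']
print(by_letter)
# ['highball', 'highbrow', 'high chair', 'high heels', 'high water', 'highways']
```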
Tip: to sort a text completely accent-neutrally, select [No Language]:
Download script (a ZIP file) – View/download sortorders.txt (it's also in the ZIP file)
Languages divide into different types according to how they treat accented letters and digraphs when sorting lists (if you see garbled characters in this text, enable Unicode/UTF-8 – probably in View > Character Encoding or something similar; also select a Unicode font such as Lucida Sans Unicode or Microsoft's Times or Arial in the options section of your browser):
Result | Should be
cote | cote
coté | côte
côte | coté
côté | côté

(Source: SortingAndCollating.pdf.) The script doesn't handle these cases so French lists may need some manual post-ordering. I've no idea about the frequency of such cases – it might not be a real problem. Follow the link for detailed documentation.
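For anyone who wants to do that post-ordering outside InDesign, the French rule can be approximated as follows: words are first compared without their accents, and ties are broken by comparing the accents starting from the end of the word. The Python sketch below is an illustration under that assumption only; french_key is a made-up name, and the weight given to each accent (simply its Unicode code point) is an arbitrary choice that happens to suffice for this example.

```python
import unicodedata

def french_key(word: str):
    """Primary key: base letters. Secondary key: accents, read from the end."""
    base_letters = []
    accents = []
    for ch in unicodedata.normalize("NFD", word.casefold()):
        if unicodedata.combining(ch) and accents:
            accents[-1] += (ord(ch),)   # attach the mark to the preceding letter
        else:
            base_letters.append(ch)
            accents.append(())          # no accent on this letter (so far)
    return ("".join(base_letters), accents[::-1])

words = ["coté", "côté", "cote", "côte"]
french_sorted = sorted(words, key=french_key)
print(french_sorted)  # ['cote', 'côte', 'coté', 'côté']
```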
The sorter handles all these possibilities (except, as mentioned, some French cases). The script looks for a text file "sortorders.txt" which should be located in the script folder. An attempt is made to determine the currently selected language and to show its sort order (if the file can't be found the script defaults to [No Language] and diacritic-insensitive sort order):
The different types of sort order are accounted for by using a different format for each type of letter. All types are displayed in the screen shot, which shows the sort order for Czech (a different language can be picked from the dropdown). The formats are as follows:
It is easy to add a new sort order or to change an existing one. To change a sort order, pick the language and make any changes in the displayed string. Make sure that the Save sort order box is checked. Press OK to save the changes.
To add a new language, pick it from the dropdown and enter the sort order at Sort order:. You can copy and paste a string here. Make sure that the Save sort order box is checked, then press OK to store the new data.
When changing or adding sort orders, bear two things in mind:
To add a language that InDesign doesn't support, pick any language from the list and pretend that it's the unsupported language. Suppose you want to add a sort-order string for Macedonian, which InDesign doesn't support. Select a language that you don't use and set the sort string to that language. You could take Latvian, for instance, or any other language that you don't already use. Then when you want to sort some Macedonian text, select Latvian in the script's dropdown.
Words that are to be ignored at the beginning of paragraphs are listed together with the sort order. They can be entered using two formats. The simplest is just to list the words:
the a an
Separate the words with spaces. To enter an apostrophe, just type the straight apostrophe (or 'straight single quote') on the keyboard. The script changes it into a smart curly quote at runtime.
Another way to list the words is to write a regular expression. Some examples:
^(the|an?d?)\s
^(de[rsmn]|die|das)\s
The first expression matches the, a, an, and and; the second one, der, des, dem, den, die, and das.
The script recognises an item as a regular expression by the circumflex ^. Make sure you enter the expression correctly – the script doesn't check the expression's syntax at all.
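The two formats can be treated along the following lines. This Python sketch is not the script's code: the function names are made up, and case-insensitive matching is an assumption. What it takes from the text is that an entry starting with a circumflex is used as a regular expression as it stands, that a plain word list is split on spaces, and that straight apostrophes in the word list become curly ones.

```python
import re

def build_ignore_pattern(setting: str):
    """Turn a word list or a ready-made regular expression into a compiled pattern."""
    setting = setting.strip()
    if setting.startswith("^"):
        # Already a regular expression: use it unchanged (no syntax check).
        return re.compile(setting, re.IGNORECASE)
    # A plain word list: straight apostrophes become curly ones.
    words = setting.replace("'", "\u2019").split()
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"^(" + alternatives + r")\s", re.IGNORECASE)

def strip_ignored(paragraph: str, pattern) -> str:
    """Remove an ignored word from the start of a paragraph, if there is one."""
    return pattern.sub("", paragraph, count=1)

english = build_ignore_pattern("the a an")
german = build_ignore_pattern(r"^(de[rsmn]|die|das)\s")

print(strip_ignored("The Hague", english))    # Hague
print(strip_ignored("Another day", english))  # Another day
print(strip_ignored("der Mann", german))      # Mann
```

The stripped text would serve only as the sort key; the paragraph itself stays as it is.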
The sort orders associated with certain languages are stored in a file called sortorders.txt, which lives in the same folder as the script. You can edit it in a text editor (note that the file is in UTF-8 format and must stay in that format), e.g. to remove languages or to edit an existing order. You could do that in the script's interface, but editing the file in an editor is a bit easier.
The example here lists some lines from the file, showing how sort orders are encoded (the first entry, [No Language], has been truncated here).
<This file uses UTF-8 encoding>
[No Language]	0123456789 A[ÁÀÂÄÅĀĄĂÆ]BC[ÇĆČĊ]D . . . Z[ŹŻŽ]
Polish	0123456789 AĄBCĆDEĘFGHIJKLŁMNŃOÓPQRSŚTUÚVWXYZŹŻ
Czech	0123456789 A[Á]BCČD[Ď]E[ÉĚ]FGH{CH}I[Í]JKLMN[Ň]O[Ó]PQRŘSŠT[Ť]U[ÚŮ]VWXY[Ý]ZŽ
German: Reformed	0123456789 A[Ä]BCDEFGHIJKLMNO[Ö]PQRS{SS}TU[Ü]VWXYZ
de_DE_2006	0123456789 A[Ä]BCDEFGHIJKLMNO[Ö]PQRS{SS}TU[Ü]VWXYZ
German: Traditional	0123456789 A[Ä]BCDEFGHIJKLMNO[Ö]PQRS{SS}TU[Ü]VWXYZ
Icelandic	0123456789 A[Á]BCDÐE[É]FGHI[Í]JKLMNO[Ó]PQRSTU[Ú]VWXY[Ý]ZÞÆÖ
Each line consists of two parts: the name of the language in InDesign's internal format, followed by a tab, followed by the sort order. Note that even in English versions of InDesign, its internal format is not always the same as the way names are represented in the interface. For instance, German: Reformed corresponds with German: 1996 Reform and de_DE_2006 with German: 2006 Reform.
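To make the file format concrete, the Python sketch below reads such lines and turns a sort-order string into a sort key. It is an illustration, not the script's code, and it rests on two assumptions about the notation that should be checked against the description of the formats: letters in square brackets are taken to be variants that sort with the letter before them, and a group in braces is taken to be a letter combination that sorts as a letter of its own after the letter before it (as CH does in Czech). All function names are made up for the example.

```python
def parse_sort_orders(text: str) -> dict:
    """Map each language name to its sort-order string (name, tab, order)."""
    orders = {}
    for line in text.splitlines():
        if "\t" not in line:          # header comment or blank line
            continue
        name, order = line.split("\t", 1)
        orders[name.strip()] = order.strip()
    return orders

def rank_table(order: str) -> dict:
    """Give every letter or letter group in a sort-order string a rank."""
    ranks, rank, i = {}, 0, 0
    while i < len(order):
        ch = order[i]
        if ch == "[":                 # assumed: variants share the previous rank
            end = order.index("]", i)
            for variant in order[i + 1:end]:
                ranks[variant.casefold()] = rank
            i = end + 1
        elif ch == "{":               # assumed: a group ranked as its own letter
            end = order.index("}", i)
            rank += 1
            ranks[order[i + 1:end].casefold()] = rank
            i = end + 1
        else:
            if not ch.isspace():      # spaces in the string are skipped
                rank += 1
                ranks[ch.casefold()] = rank
            i += 1
    return ranks

def sort_key(word: str, ranks: dict) -> list:
    """Translate a word into a list of ranks, preferring the longest unit."""
    longest = max(len(unit) for unit in ranks)
    word, key, i = word.casefold(), [], 0
    while i < len(word):
        for size in range(longest, 0, -1):
            unit = word[i:i + size]
            if unit in ranks:
                key.append(ranks[unit])
                i += len(unit)
                break
        else:
            key.append(0)             # characters not in the string sort first
            i += 1
    return key

sample = (
    "<This file uses UTF-8 encoding>\n"
    "Czech\t0123456789 A[Á]BCČD[Ď]E[ÉĚ]FGH{CH}I[Í]JKLMN[Ň]O[Ó]PQRŘSŠT[Ť]U[ÚŮ]VWXY[Ý]ZŽ\n"
)
czech = rank_table(parse_sort_orders(sample)["Czech"])
czech_sorted = sorted(["chata", "cena", "hrad", "idea", "čaj", "dům"],
                      key=lambda word: sort_key(word, czech))
print(czech_sorted)  # ['cena', 'čaj', 'dům', 'hrad', 'chata', 'idea']
```

Under these assumptions chata lands after hrad because CH is looked up as a single unit ranked after H, while čaj follows cena because Č is listed as a separate letter rather than as a bracketed variant of C.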
For that reason it is best to add new languages in the script's interface; you can then later edit the file in an editor if necessary.
There's a lot of information on sorting. SortingAndCollating.pdf is a good general overview. For some interesting technicalities, see this NISO report. See also Marc Autret's post on sorting in JavaScript, here, and a discussion in Adobe's scripting forum, here. Finally, Marc's work has culminated in a multi-lingual paragraph-sorting script (in 2020).
With thanks to Igor Freiberger and Jaroslav Průka for comments on the (Brazilian) Portuguese and Czech sort orders.
7 June 2023: Added support for Polytonic Greek. Some upsilons and some iotas with diacritics don't sort correctly; there's still an odd problem with upper-lower case conversion. Also, it's no longer possible to sort a selection. To sort part of a list, move it to a separate text frame, sort it, and move it back.
25 May 2022: Added the option to ignore text from a tab (or right-indent) to the end of the paragraph. This is necessary in e.g. tables of contents.
22 May 2023: language names were changed slightly internally; fixed. Added a checkbox to ignore commas; see the text for details.
20 May 2020: fixed a problem with sorting by character style.
31 Dec. 2019: added some characters that should be ignored. Added a section to the text on how to add sort orders for languages not supported by InDesign.
8 Dec. 2013: fixed a problem with sorting letter combinations after another letter. CH in Czech is sorted as a separate letter after H, while other languages may have letter combinations that are variants of a letter, as in Spanish, which treats LL as L. This required an addition to the sort-string syntax; see the text for details.
18 Oct. 2012: fixed problem that could occur if the sort-order text file could not be found.
4 Oct. 2012: index markers and XML tags interfered with the sort order; fixed.
3 Oct. 2012: (1) deletion of duplicate paragraphs now works (optionally) when sorting whole stories and selections; (2) there was a problem if a selection of paragraphs included the story's last paragraph – fixed.
3 June 2012:
(1) The script's interface is now (finally) language independent. This means that the sortorders.txt file has changed, so if you start using this latest version of the script, you must use the new version of the sort-order file. Note that the changes are in the language names (which now use InDesign's internal localisation-independent names), so if you made many changes you can transfer those changes.
(2) The script now handles numbers correctly: lines that start with numbers, but also lists such as Figure 1.1, Figure 1.2, Figure 2.1, etc.
(3) The option to sort numerically is no longer necessary, and since descending sorts weren't going to make it anyway the interface was rearranged. Minor changes to the behaviour of the interface.
(4) Added some usage notes to the description of the sortorders.txt file.
30 May 2012: fixed a problem with the sorting of some accented characters.
11 Feb. 2012: the rewrite caused some problems with some details of letter-by-letter and word-by-word sorting. Fixed.
16 July 2011: a rewrite from scratch, apart from the interface. The script is now much faster, so much so that I removed the Formatted text option. The script now sorts everything as if it was formatted. The option to sort English and Irish patronyms (Mac, Mc, O') is no longer there: the script now does that by default.
1 Jan. 2011: fixed problem with sorting upper-case letters with diacritics.
Updated January 2010.