URLs: Breaking (Badly)
With the help of GREP styles, you can solve the problem of bad breaks in long URLs.
This article appears in Issue 50 of InDesign Magazine.
Breaking a URL across lines seems simple (never hyphenate a URL), but the next question is where to break a long one. The Chicago Manual of Style, 16th Edition, now recommends breaking before rather than after a slash, but not everyone agrees with or follows this style. Luckily, we can use GREP styles to find URLs and override InDesign’s paragraph composer.
The first task is to prevent hyphenation. John Gruber’s liberal, accurate regex pattern for matching URLs is a great GREP pattern for finding nearly any URL, and then it’s simply a matter of assigning No Language (found in Advanced Character Formats) to that (or any) URL.
But this GREP won’t work for the second task of allowing long URLs to break, since it selects the entire URL (applying No Break here could cause instant overset text). Instead, we need to use a second GREP expression to selectively apply No Break within a URL:
([\w-]+:/{1,3})|(\<[^\s$&+/:;=?@#]+\.[^\s$&+/:;=?@#]+?\>)
Now in English: Group 1 finds protocols like https:// or telnet:// and even x-yojimboitem:// by matching a word that may contain hyphens followed by a colon and one to three forward slashes. And Group 2 finds hostnames like www.adobe.com or indesignmag.com by matching two strings of non-space and non-reserved characters that are separated by a period and omitting any trailing punctuation.
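To see which pieces of a URL the expression picks up, here is a minimal Python sketch of it. It is only an approximation: Python's re module has no \< or \> (InDesign's start-of-word and end-of-word markers), so \b stands in for both. The function name no_break_spans is hypothetical. InDesign's own GREP engine may differ in edge cases.

```python
import re

# Python approximation of the No Break expression. Assumption: \b stands in
# for InDesign's \< (start of word) and \> (end of word), which Python's
# re module does not support.
NO_BREAK = re.compile(
    r"([\w-]+:/{1,3})"                               # Group 1: protocol
    r"|(\b[^\s$&+/:;=?@#]+\.[^\s$&+/:;=?@#]+?\b)"    # Group 2: hostname
)


def no_break_spans(text):
    """Return the substrings that would receive No Break, in order."""
    return [m.group(0) for m in NO_BREAK.finditer(text)]


print(no_break_spans("See https://www.adobe.com."))
# ['https://', 'www.adobe.com']
print(no_break_spans("x-yojimboitem://abc"))
# ['x-yojimboitem://']
```

In the first example the sentence-ending period is left out of the hostname match, so a line can still break between the URL and whatever follows it.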
Chicago-style breaking (before the punctuation rather than after it) is more complex, since InDesign breaks after punctuation, not before. While this can’t be fully automated, it can be accomplished much faster using GREP Find/Change. First, Find:
([$&+/:;=?@#](\<[^\s$&+/:;=?@#]+))(?=[$&+/:;=?@#.])
Group 1 matches any reserved URL character followed by a string of non-space, non-reserved URL characters. The positive lookahead after it requires the next character to be a reserved URL character or a period, without including that character in the found text.
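After confirming that the match really is inside a URL (don’t use Change All; bad things could happen), use Change to insert a Discretionary Line Break ahead of the found text. In the Change To field, enter ~k$0, where ~k is the Discretionary Line Break and $0 is the found text.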
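The Find/Change step can be previewed outside InDesign with a rough Python sketch. The assumptions are the same as before: \b replaces InDesign's \<, a visible "|" marker replaces the Discretionary Line Break, and mark_breaks is a hypothetical helper name. It marks every candidate at once, which is exactly what Change All would do, so it is a way to inspect candidates rather than a recommended workflow.

```python
import re

# Python approximation of the Find expression. Assumption: \b stands in for
# InDesign's \< (start of word).
CANDIDATE = re.compile(r"([$&+/:;=?@#](\b[^\s$&+/:;=?@#]+))(?=[$&+/:;=?@#.])")


def mark_breaks(text, marker="|"):
    """Insert a visible marker where ~k$0 would place a discretionary break."""
    return CANDIDATE.sub(lambda m: marker + m.group(0), text)


print(mark_breaks("indesignmag.com/issues/50"))
# indesignmag.com|/issues/50
print(mark_breaks("https://www.adobe.com/products/indesign.html"))
# https:/|/www.adobe.com|/products|/indesign.html
```

In the second example, under this approximation, the first marker lands inside the double slash after the protocol. That is the kind of candidate you would likely skip when stepping through matches one at a time.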
This article was last modified on December 5, 2025
This article was first published on April 24, 2017
