The perfect GREP pattern for URLs

  • Author
    Posts
    • #55191
      Steven Wild
      Member

      Hi, I recently read Casey D's post “Shortest GREP Pattern to address URLs and e-mail addresses?”. It seemed there was no resolution to creating a perfect search for finding URLs.

      Some issues that came out of Casey's post were (1) avoiding the full-stop if the URL is at the end of a sentence, and (2) including any slashes that appear at the end of a URL.

      I'm new to GREP, and so am struggling with the syntax of the whole thing, and was hoping that someone might have put together a GREP search that covers all of these, and any other, tricky aspects of finding URLs in all circumstances. My biggest problem is understanding the syntax of people's existing searches, in order to make any modifications. It's great that InDesign GREP provides the codes for various metacharacters – it would be even better if it were possible to look up metacharacters from existing strings to work out what they are doing…

      If anyone has a comprehensive search – even if it's a mile long – that works in all circumstances (and you're happy to share it) I'd be most grateful if I could get a copy.

      Thanks for any help forthcoming.

      Cheers,

      Steven

    • #55192

      There is no such beast :P

      Every time I thought I found “the definitive GREP” something else popped up — latest major addition was support for “?” queries, and that opened another can of snakes. Best you can hope for is “something that works 80% of the time”, I think.

      As for Casey's problems,

      “Some issues that came out of Casey's post were (1) avoiding the full-stop if the URL is at the end of a sentence, and (2) including any slashes that appear at the end of a URL.”

      (1) No problem if you only allow a period inside the URL, that is, a period should always be followed by an alphanumeric character.

      (2) Also no problem — all you need to do is end with “/?”

      You can build something up from this, for starters:

      (http|ftp)://[a-zA-Z][a-zA-Z0-9]+\.([a-zA-Z_0-9]+\.)+[a-zA-Z_]+(/[a-zA-Z_0-9.]+)*/?

      – it found all 7 URLs in the document I happened to have on my screen right now.
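
      For anyone trying to read a pattern like this, one option is to spell it out piece by piece. Below is a minimal sketch in Python (not InDesign), assuming the dots between host labels are meant as literal periods (`\.`). The `find_urls` helper and the example.com addresses are made up for illustration. Python's `re.VERBOSE` flag allows whitespace and comments inside the pattern, which makes each part easier to follow.

```python
import re

# The starter pattern, split into commented parts (re.VERBOSE ignores
# whitespace and "#" comments outside character classes).
URL = re.compile(r"""
    (http|ftp)://            # scheme: http or ftp, followed by ://
    [a-zA-Z][a-zA-Z0-9]+     # first host label: a letter, then letters/digits
    \.                       # a literal period
    ([a-zA-Z_0-9]+\.)+       # one or more further labels, each ending in a period
    [a-zA-Z_]+               # last label (e.g. com): letters or underscore
    (/[a-zA-Z_0-9.]+)*       # zero or more /path segments; periods allowed here
    /?                       # optional trailing slash
""", re.VERBOSE)


def find_urls(text):
    """Return the full text of every match of URL in text (hypothetical helper)."""
    return [m.group(0) for m in URL.finditer(text)]


if __name__ == "__main__":
    # Placeholder addresses, not taken from the thread.
    sample = ("Docs at http://www.example.com/guide/ and "
              "ftp://ftp.example.org/pub/file.txt today")
    for url in find_urls(sample):
        print(url)
```

      On made-up strings, this version keeps a trailing slash and leaves off the period after `.com` at the end of a sentence, but it still swallows a period that follows a file name, as in `index.html.`. It also skips `https://` addresses and hosts with only two parts (`http://example.com`), so both would need adding for real-world text.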

      (Ed. Hah. That didn't address Casey's #1: never allow a period at the end. Some shuffling around will solve that, tho'.)
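
      One possible "shuffling" is sketched below in Python, as an assumption rather than the poster's own version: inside a path segment, a period is only accepted when more letters, digits or underscores follow it, so a sentence-ending period is left out. The name `URL_NO_TRAILING_DOT`, the `find_urls` helper and the example.com addresses are hypothetical.

```python
import re

# Same host part as the starter pattern; only the path part changes.
# Each path segment is "word(.word)*", so a period must be followed by
# at least one more letter, digit or underscore to be included.
URL_NO_TRAILING_DOT = re.compile(
    r"(http|ftp)://[a-zA-Z][a-zA-Z0-9]+\.([a-zA-Z_0-9]+\.)+[a-zA-Z_]+"
    r"(/[a-zA-Z_0-9]+(\.[a-zA-Z_0-9]+)*)*"
    r"/?"
)


def find_urls(text):
    """Return the full text of every match (hypothetical helper)."""
    return [m.group(0) for m in URL_NO_TRAILING_DOT.finditer(text)]


if __name__ == "__main__":
    # Placeholder sentences; each URL is followed by a full stop.
    for sentence in ("See http://www.example.com/index.html.",
                     "Go to http://www.example.com/docs/."):
        print(find_urls(sentence))
```

      The trade-off is that the path part is stricter: characters such as hyphens or `?` queries are still not covered, so this remains a starting point and not a complete answer.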

    • #55193
      Anonymous
      Inactive

      Well I've tried this with all the links from my bookmarks and it seems to work :D

      Going on the basis that URLs don't have spaces in them, I simply search for all the text in a sentence string like so

      (https|http|ftp|www)[[:punct:]]+.+?(?=\s)
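
      To see what this "everything up to the next whitespace" idea does in practice, here is a rough Python sketch. It assumes the lookahead is meant to be `(?=\s)`. Python's `re` module has no `[[:punct:]]` class, so an ASCII-only stand-in built from `string.punctuation` is swapped in; `PUNCT`, `URL_LAZY`, `find_urls` and the example addresses are all made up for illustration, and InDesign's GREP engine may behave differently.

```python
import re
import string

# Stand-in for [[:punct:]] (assumption: ASCII punctuation only).
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Prefix, one or more punctuation characters, then as few characters as
# possible until the next character is whitespace.
URL_LAZY = re.compile(r"(https|http|ftp|www)" + PUNCT + r"+.+?(?=\s)")


def find_urls(text):
    """Return the full text of every match (hypothetical helper)."""
    return [m.group(0) for m in URL_LAZY.finditer(text)]


if __name__ == "__main__":
    text = "See https://example.com/docs/ for details. Also www.example.org. Done"
    for url in find_urls(text):
        print(url)
```

      In this sketch the second match keeps the full stop (`www.example.org.`), because the pattern stops only at whitespace. A URL at the very end of the text with no whitespace after it is not matched at all.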

    • #55199
      Steven Wild
      Member

      Thanks for your help guys – really appreciate the response. I'll have a play around and see how I go.

      Cheers,

      Steven

    • #126518
      Arvind Sathe
      Member

      The above GREP did not work as posted. Adding a ‘\’ before the s at the end did the trick:
      (https|http|ftp|www)[[:punct:]]+.+?(?=\s)
      Thanks everyone for your contribution.

  • The forum ‘General InDesign Topics (CLOSED)’ is closed to new topics and replies.