Indesign GREP to find question phrases

    • #89885

      Hello everyone,

      For an art project, I am working through a large amount of text (about 4,000 A4 pages) and want to scrape the questions out of it.
      To explain further, see the example below.

      Here is just any text with a period. This is the question I want to scrape? Here is just any text with a period.

      Now I would like just to scrape the following part: “This is the question I want to scrape?”
      Does anyone have any idea on how this can be achieved?

      I tried \>\? but that only finds the question mark itself.

    • #89887
      David Blatner
      Keymaster

      There is probably a more elegant way to do this but this is a start:
      (\w| |'|,|;|~_|~=)+?\?
      That means look for a string of characters that could be “a letter or number, a space, an apostrophe, a comma, a semicolon, an em dash, or an en dash” and that ends with a question mark. (The vertical bar means “or”)
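A rough way to try this idea outside InDesign is Python's `re` module. This is only a sketch: Python's regex flavor is not InDesign's GREP, so the InDesign codes `~_` (em dash) and `~=` (en dash) are written here as Unicode escapes, and the name `find_questions` is made up for illustration.

```python
import re

# Python stand-in for the InDesign GREP pattern (\w| |'|,|;|~_|~=)+?\?
# Assumption: ~_ and ~= are replaced by the em dash and en dash code points.
QUESTION_CHARS = re.compile(r"(?:\w| |'|,|;|\u2014|\u2013)+?\?")

def find_questions(text):
    """Return each run of allowed characters that ends in a question mark."""
    # The run may begin with the space that follows the previous sentence,
    # so strip surrounding whitespace from every hit.
    return [m.group(0).strip() for m in QUESTION_CHARS.finditer(text)]

sample = ("Here is just any text with a period. "
          "This is the question I want to scrape? "
          "Here is just any text with a period.")
print(find_questions(sample))
```

Because a period is not in the list of allowed characters, the match cannot reach back into the previous sentence. The same restriction means a question containing any other character (a colon or quotation marks, for example) would be cut short or missed unless that character is added to the list.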

    • #89900
      Aaron Troia
      Participant

      David is on the right track. I would do it a little differently, using a positive lookbehind or a Keep (\K) to require (but not include in the match) the punctuation and space before the question, such as:

      (?<=\. ).+?\?
      or
      \. \K.+?\?

      They both do the same thing, just in different ways. Each looks for a period and a space on one end (left out of the match, so it is not touched by your replace), a question mark on the other end, and, of course, everything in between.

      You can then either wrap the .+? in a capture group, like (.+?), or just use the expression as is and put $0 in your replace field, which stands for everything that was found.
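The lookbehind version can be tried in Python as well. This is a sketch under assumptions: Python's standard `re` module has no `\K`, so the second function substitutes a capture group for it, and the function names are hypothetical.

```python
import re

def with_lookbehind(text):
    """(?<=\\. ) requires 'period + space' before the match without including it."""
    return re.findall(r"(?<=\. ).+?\?", text)

def with_capture(text):
    """Stand-in for \\. \\K.+?\\? : match the period and space, keep only the group."""
    return re.findall(r"\. (.+?\?)", text)

sample = ("Here is just any text with a period. "
          "This is the question I want to scrape? "
          "Here is just any text with a period.")
print(with_lookbehind(sample))
print(with_capture(sample))
```

One caveat worth checking on real text: `.` also matches a period, so when two non-question sentences come before the question, the match starts after the first of them. For "One. Two. Three?" both functions return "Two. Three?" rather than just "Three?".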

      Let me know if you have any questions.

      Aaron

    • #89901
      David Blatner
      Keymaster

      Thanks, Aaron! Good to hear from you. Those expressions are great. But that won’t work if the question is the first sentence in the paragraph, right? Or if the previous sentence ends with an exclamation point or some other non-period punctuation.

    • #89902
      Aaron Troia
      Participant

      Hey David! Yes, you’re right, I should’ve taken that into account. I have modified the version using Keep (\K) to account for more instances, and you could add more punctuation to the non-capturing group at the beginning ((?:^|\.|\!|\?)) to cover more cases. There might be a better way, and I know this isn’t perfect, but it should catch most instances.

      (?:^|\.|\!|\?) ?\K.+?\?
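For anyone testing this outside InDesign, here is a minimal Python sketch of the same expression. It rests on two assumptions: a capture group replaces `\K` (which Python's standard `re` module does not support), and `re.MULTILINE` is used so that `^` matches at the start of each line, treating each line as a paragraph. The name `extract_questions` is invented for the example.

```python
import re

# Capture-group stand-in for (?:^|\.|\!|\?) ?\K.+?\?
# re.MULTILINE makes ^ match at the start of every line, not just the text.
QUESTION = re.compile(r"(?:^|\.|!|\?) ?(.+?\?)", re.MULTILINE)

def extract_questions(text):
    """Return the text captured after a line start or sentence-ending mark."""
    return QUESTION.findall(text)

sample = "Is this first? Wow! Is this second?\nAnd third? Plain sentence."
print(extract_questions(sample))
```

On this sample the sketch picks up a question at the start of a paragraph, one after an exclamation point, and one at the start of the second line.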

    • #89913

      Thank you David and Aaron, that helped a lot. I got what I was looking for!
