
GREP to grab consecutive paragraphs of text

    • #33957
      Chris Lane
      Participant

      I have Word files that are coded with angle brackets. Heads may be coded as <h1>; text is coded as <t>. The text code appears only on the first paragraph of text, meaning I may have the code <t>, then 10 paragraphs of text, and then the code changes at the next head, such as <h1>. How can I grab all 10 paragraphs of text until the next code appears? I am thinking I need to use a positive lookahead but can’t seem to get it to work. It would be great if each paragraph of text started with the <t> code, but that is not the case, and I do not want to modify the supplied Word files to add the code.

    • #33962

      I guess it is failing for you because, by default, GREP only works on single paragraphs. You can force it to consider multi-paragraph matches by prepending the flag (?s). This switches on “single line mode”, in which the dot also matches hard returns.
      Try something like this:

      (?s)^<t>((?!\r<).)+

      This should match <t> at the start of a paragraph, followed by any sequence of single characters (including hard returns), and stop just before a hard return that is followed by “<”, the start-of-next-tag marker.

      You can apply your new paragraph styles this way, and in a second loop remove the ‘used’ <t> markers. I don’t think it’s safe to try both at the same time. Then again, from within a script this ought to be fast nevertheless.
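
The behaviour described in the reply above can be tried outside InDesign with a minimal Python sketch. Several things here are assumptions made for illustration: the sample text is invented, "\n" stands in for InDesign's hard return (written \r in GREP), and Python's `re.DOTALL` and `re.MULTILINE` flags play the roles of `(?s)` and the paragraph-start anchor. It is not InDesign's own engine, so treat it as a rough model only.

```python
import re

# Hypothetical sample; "\n" stands in for InDesign's hard return (\r in GREP).
text = (
    "<h1>First head\n"
    "<t>Para one.\n"
    "Para two.\n"
    "Para three.\n"
    "<h1>Second head\n"
    "<t>Para four."
)

# DOTALL plays the role of (?s): "." may also match the paragraph break.
# The negative lookahead refuses any position where a break followed by "<" begins.
block = re.compile(r"^<t>(?:(?!\n<).)+", re.DOTALL | re.MULTILINE)

# Same pattern without DOTALL, to show the single-paragraph behaviour.
single = re.compile(r"^<t>(?:(?!\n<).)+", re.MULTILINE)

multi_matches = [m.group(0) for m in block.finditer(text)]
single_matches = [m.group(0) for m in single.finditer(text)]

# Second pass, as suggested in the reply: remove the used <t> markers.
cleaned = re.sub(r"^<t>", "", text, flags=re.MULTILINE)

print(multi_matches)
print(single_matches)
print(cleaned)
```

With the flag set, the first match runs across all three text paragraphs and stops before the second head; without it, each match ends at the first paragraph break.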

    • #33975
      Chris Lane
      Participant

      Thank you! This works beautifully. I was close in all my efforts, and the single-line mode flag was what I was missing. Scripting? That’s my next learning adventure. Thanks again!

    • #75596
      amyb
      Member

      How would I modify this to end at the next bracketed code, which is at the end of the last paragraph of the string of paras I want selected? Following the OP’s example, it would be </t>.

      When I tried the GREP above, it skipped over that closing code and selected all paras until the next <t>.

      Thanks!
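
One possible way to stop at a closing code is a lazy (shortest) match up to the first </t>. The thread does not confirm this for InDesign, so the Python sketch below is only an illustration under assumptions: the sample text is invented, "\n" stands in for the hard return, and the paragraph after </t> is assumed to be untagged, which would explain why the earlier expression ran past the closing code.

```python
import re

# Hypothetical sample: the run of text ends with </t>, and the next paragraph
# carries no tag, so no "break followed by <" occurs right after </t>.
text = (
    "<t>Para one.\n"
    "Para two.\n"
    "Para three.</t>\n"
    "Untagged para.\n"
    "<t>Para four.</t>"
)

# Earlier approach: stops only before a break followed by "<", so it overshoots.
overshoot = re.compile(r"^<t>(?:(?!\n<).)+", re.DOTALL | re.MULTILINE)

# Lazy alternative: ".+?" takes as little as possible up to the first "</t>".
bounded = re.compile(r"^<t>.+?</t>", re.DOTALL | re.MULTILINE)

overshoot_matches = [m.group(0) for m in overshoot.finditer(text)]
bounded_matches = [m.group(0) for m in bounded.finditer(text)]

print(overshoot_matches)
print(bounded_matches)
```

In this model the lazy pattern ends each match at the first closing code, while the lookahead version continues through the untagged paragraph.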

    • #75598
      David Blatner
      Keymaster