Grep to find all-italic paragraphs

    • #83698
      Anonymous
      Inactive

      Is it possible to write a GREP find that will find a paragraph where every character is italic?
      I’ve been trying to construct such a search, but no success after trying for a couple of days.

      Many thanks
      Jim

    • #83699
      Ari Singer
      Member

      This should work:
      ^.+$
      Find Format: Italic

    • #83700
      Anonymous
      Inactive

      Very many thanks for the quick reply.
      The book is in Italian (I don’t speak Italian :-) ) and that search seems to miss any paragraphs that contain diacritical characters?

      Jim

    • #83702
      Ari Singer
      Member

      This is interesting to me, as the . wildcard should catch any character, no matter the form.

      Does it catch some paragraphs and some not? Or does it not catch any paragraph at all?

    • #83703
      David Blatner
      Keymaster

      My guess is that it is not the diacritical characters but rather that something about those paragraphs is not italic. Try removing the Find Format: Italic part. Then it should find every paragraph, no matter what the formatting, right?

      • #83704
        Ari Singer
        Member

        I was thinking the same thing. That’s why I asked him if it does find some paragraphs. If it doesn’t, then it’s one of two options. Either…

        A: Some characters in the paragraph (might even be a space character) are not italic, so GREP doesn’t catch the entire paragraph.

        or

        B: The specified font does not use the strict term ‘Italic’ (such as Helvetica Neue, which has ’36 Thin Italic’, for example). So you need to make sure that the italic style name used in that font is exactly the same as the one entered in the Find Format options.
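Both options can be pictured with a small model. This is only a sketch in Python, not InDesign itself: a paragraph is treated as a list of (text, font style name) runs, and the hypothetical helper `all_italic` stands in for a find that only succeeds when every run carries exactly the style name typed into Find Format.

```python
from typing import List, Tuple

# Hypothetical model: a paragraph is a list of (text, font style name) runs.
Run = Tuple[str, str]

def all_italic(paragraph: List[Run], style_name: str = "Italic") -> bool:
    """True only if every non-empty run uses exactly `style_name`."""
    return all(style == style_name for text, style in paragraph if text)

# Fully italic paragraph: found.
para_ok = [("Tutto il paragrafo in corsivo.", "Italic")]

# Option A: one stray roman character breaks the paragraph-wide match.
para_a = [("Un caff", "Italic"), ("\u00e8", "Regular"), (" freddo.", "Italic")]

# Option B: the font calls its italic something other than plain 'Italic'.
para_b = [("Testo in corsivo.", "36 Thin Italic")]

print(all_italic(para_ok))                    # found
print(all_italic(para_a))                     # missed (option A)
print(all_italic(para_b))                     # missed (option B)
print(all_italic(para_b, "36 Thin Italic"))   # found once the name matches
```

In the model, as in the thread, a single roman character or a differently named italic style is enough for the paragraph to be skipped.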

    • #83706
      Anonymous
      Inactive

      I thought I’d sent a reply, apologies if this becomes a duplicate.

      Thanks very much for the replies. There was an è character in the otherwise all-italic paragraphs that was regular, not italic. That was the problem all along.

      But now I have a more difficult problem. How do I find all the regular è characters that are in an otherwise all-italic paragraph?

      Thanks again

      Jim

    • #83707
      Ari Singer
      Member

      What do you mean by ‘find all the regular è characters that are in an otherwise all italic paragraph’? Do you want to find only that character? Or to find an italic paragraph that also has that character in it?

    • #83719
      Anonymous
      Inactive

      The regular è characters are a mistake that needs to be corrected if they occur within an all-italic paragraph, but not if they occur elsewhere.
      The original question is solved. But it showed up this new problem.

      Jim

    • #83722
      David Blatner
      Keymaster

      I do not think that is possible with InDesign’s GREP.
      However, if you had a lot of this text that required searching, it might be worth the time to export as InDesign Tagged Text. When you do that and open the tagged text in a text editor, you would see something like this:

      <ParaStyle:NormalParagraphStyle><cTypeface:Italic>Natiunt am eumenet idus adi dolora sum harcimus recum aceribus ad quid endamet fugia sin custiae ctibearum sunt aut faces aut<cTypeface:><0x00E9><cTypeface:Italic>nis apienimus maxim voluptia voluptatate prat lita dolorernam que sit a que laborem necaboribus magnatur aut ut utemporem as eos que con elliqui dusdam que nonsequatat.
      

      From there, you could search for places where the typeface changes at the accented e (which is what that Unicode 0x00E9 code is in the middle of the paragraph).

      This is pretty geeky, but it can work, I think.
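As a rough illustration of that search, here is a Python sketch that scans a tagged-text string for a single character code sitting between a typeface reset and a return to italic. The tag spellings are copied from the sample line above and are an assumption about what a real export contains; the helper names are made up for this example. Note that it does not check that the rest of the paragraph is italic.

```python
import re
from typing import List

# A single <0x....> character between "typeface off" and "italic on".
# Tag spellings are taken from the sample line in the post (an assumption).
ROMAN_CHAR_IN_ITALIC = re.compile(
    r"<cTypeface:>(<0x[0-9A-Fa-f]{4}>)<cTypeface:Italic>"
)

def find_roman_chars(tagged: str) -> List[str]:
    """Return the character codes that are roman inside italic text."""
    return ROMAN_CHAR_IN_ITALIC.findall(tagged)

def make_italic(tagged: str) -> str:
    """Drop the two typeface switches so the character stays italic."""
    return ROMAN_CHAR_IN_ITALIC.sub(r"\1", tagged)

sample = (
    "<ParaStyle:NormalParagraphStyle><cTypeface:Italic>Natiunt am "
    "aut<cTypeface:><0x00E9><cTypeface:Italic>nis apienimus maxim."
)

print(find_roman_chars(sample))
print(make_italic(sample))
```

Running the fix on the edited text file and re-importing it would be the tagged-text equivalent of correcting each character by hand.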

    • #83730
      Ari Singer
      Member

      You can do this:

      Find What: (^.+?)(\x{00E9})(.+?$)
      Change To: $1e$3
      Find Format: Italic

      Then make sure the ‘scope’ of the search is set to the entire story, and keep hitting ‘Change All’ until the popup dialog box says ‘Search is completed. 0 replacement(s) made.’ As long as the number in the dialog box is 1 or more, the search is still finding an è; once it says 0, all the è characters have been replaced with a regular e.

      • #83732
        Ari Singer
        Member

        Just want to add one important bit (as I can’t edit my previous post): this won’t find the è if it’s at the beginning of the paragraph. To make sure that an è at the beginning of a paragraph is also captured, use this:
        (^.+?|.?)(\x{00E8})(.+?$)
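The difference between the two expressions can be checked outside InDesign. The sketch below uses Python's `re` module, which is a different engine from InDesign's GREP, so `\x{00E8}` is written as `\u00e8`, and the Find Format part cannot be modelled at all. The sample strings are invented.

```python
import re

# Python spellings of the two expressions from the posts above.
FIRST = re.compile(r"(^.+?)(\u00e8)(.+?$)")
SECOND = re.compile(r"(^.+?|.?)(\u00e8)(.+?$)")

middle = "Un caff\u00e8 freddo"   # è in the middle of the paragraph
start = "\u00e8 vero"             # è as the first character
end = "Un caff\u00e8"             # è as the last character

for name, text in (("middle", middle), ("start", start), ("end", end)):
    first = FIRST.search(text)
    second = SECOND.search(text)
    print(name,
          first.groups() if first else None,
          second.groups() if second else None)
```

In this test the first expression misses an è at the start of the paragraph and the second catches it. Both still require at least one character after the è (the `.+?` in the third group), so an è that ends a paragraph is matched by neither.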

    • #83737
      Anonymous
      Inactive

      Thanks David, Ari for all your help – very much appreciated.

      David, I will look into the tagged option if there are too many characters to change, but I think that there might be only 50 or so in the 350 or so pages. There are only about 80 all-italic paragraphs.

      Ari, I tried your coding and it found any italic text that contained the italic {00E9}, but only up to the character we are trying to find, which is the regular {00E9}. If the italic paragraph did not contain any italic {00E9}, then the para was ignored even if there was a regular {00E9} in the para.

      But at least I can use that to find some of the characters that need changing and I will experiment to see if I can find the others.

      I did a screen capture to show just what’s going on but I cannot see that this forum supports screen capture jpgs?

      Thanks again

      Jim

      • #83739

        You can post a screenshot, but you have to have it stored someplace (like 4shared or something like that).

      • #83743
        Peter Kahrel
        Participant

        Jim,

        This is a good candidate for changing italics formatting to text tags like <i>. . .</i>. That way you can look for a single non-italic character in an italic environment. After the italic conversion, such a character would look like this: </i>è<i> (italic off, è, italic on). Replace that with è and finally change the italic text tags to italic formatting.

        To change italic formatting to tags:

        Find what: .+
        Find format: +Italic

        Change to: <i>$0</i>
        Change format: Regular (or Roman, Medium, Book: whichever is the non-italic style name of your font)

        Now replace </i>è<i> with è. Or place all accented characters in your document in a class: </i>[éëê]<i>. Or target all single characters: </i>.<i>.

        Finally, restore italic formatting:

        Find what: <i>(.+?)</i>
        Find format: <nothing>

        Change to: $1
        Change format: Italic

        Peter
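The three steps can be walked through on a toy model. This is a Python sketch rather than InDesign: a paragraph is a list of (text, is_italic) runs, and the function names are invented for the illustration. It assumes that step 1 wraps each continuous stretch of italic text in its own pair of tags.

```python
import re
from typing import List, Tuple

Run = Tuple[str, bool]  # (text, is_italic): a hypothetical stand-in for formatting

def tag_italics(runs: List[Run]) -> str:
    """Step 1: wrap every italic run in <i>...</i>; all text becomes roman."""
    return "".join(f"<i>{text}</i>" if italic else text for text, italic in runs)

def merge_single_chars(tagged: str) -> str:
    """Step 2: a lone accented character between </i> and <i> joins the italics."""
    return re.sub(r"</i>([\u00e8\u00e9\u00eb\u00ea])<i>", r"\1", tagged)

def restore_italics(tagged: str) -> List[Run]:
    """Step 3: turn <i>...</i> back into italic runs and drop the tags."""
    runs: List[Run] = []
    pos = 0
    for m in re.finditer(r"<i>(.+?)</i>", tagged):
        if m.start() > pos:
            runs.append((tagged[pos:m.start()], False))
        runs.append((m.group(1), True))
        pos = m.end()
    if pos < len(tagged):
        runs.append((tagged[pos:], False))
    return runs

# An all-italic paragraph with one roman è.
italic_para = [("Un caff", True), ("\u00e8", False), (" freddo.", True)]
# A roman paragraph that contains è plus one italic word.
roman_para = [("Un caff\u00e8 ", False), ("freddo", True), (".", False)]

for para in (italic_para, roman_para):
    tagged = tag_italics(para)
    print(tagged)
    print(restore_italics(merge_single_chars(tagged)))
```

The roman è inside the italic paragraph ends up italic, while the è in the roman paragraph is left alone, because it never appears between `</i>` and `<i>`.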

      • #83746
        Ari Singer
        Member

        Peter, I immediately knew that this problem is a good candidate for your trick of your book (pun intended). I just couldn’t figure out how to apply it to this situation. Thanks for coming in to share your advice.

        While I’m at it, I have to thank you for your great book on GREP (I just recently purchased it from Google Play and read it), it’s an amazing book and I recommend it to anyone who wants to have a solid understanding of GREP.

      • #83755
        Anonymous
        Inactive

        Peter, you have a most enjoyably devious mind. :-) :-)
        MANY thanks.

        I was not aware of your GREP book. Thanks for that reference Ari, it’s on my GET list now.

        Your search revealed something interesting – there were invisible (italic) characters at the beginning of some paragraphs that were tagged by the first step.
        At the end of the third step (restoring the italics), these tags were left behind:
        <i></i>
        and
        </i><i>
        Which I just removed and so got rid of those invisible chars

        I read your message in the email notification I got and so used the literal form of < rather than &lt; in the search, but it made no difference in the short test I made.

        Thanks again for the enjoyable solution.

        Jim

    • #83751
      Peter Kahrel
      Participant

      Thanks for your kind words, Ari.

      P.

    • #83762
      Anonymous
      Inactive

      Thanks, very much, again Peter and David for introducing me to the idea of tagging.
      After using Peter’s great idea, I now understand your suggestion David.

      There’s just one anomaly that occurs when using your three steps, Peter
      When a paragraph ends with italic words, they are tagged, but then so is an empty space at the beginning of the next line.

      Like this:

      Thich Nhat hanh, <i>Vita di Siddhatta Il Buddha,</i>
      <i></i>Ubaldini, Roma, 1992.

      The tags at the beginning of the next line are not picked up by the find <i>(.+?)</i> in the last step.

      I’ve started to go through your book but don’t know enough yet to figure out why this is happening and correct it.

      Many thanks

      Jim

      • #83764
        Ari Singer
        Member

        The search doesn’t pick it up because there are no characters between the tags, and the search only matches when there is at least one character in between.

        To clean up these instances just do a Find/Change like this:
        Find What: <i></i>
        Change To: Leave Empty
        Find Format: Leave Empty
        Change Format: LEAVE EMPTY (if you don’t clear this field, no replacements will be made).

    • #83763
      Anonymous
      Inactive

      Further to the previous post, I was too late to edit it, but the example I gave of the tags at the beginning of the second line was of an unedited line, the line break at the end of the first line should not have been there in the first place.

      I’ve not yet finished editing for errors like that. I expect all those <i></i> tags come from instances like that.

      Jim

    • #83765
      Peter Kahrel
      Participant

      Jim,

      .+ means 'one or more characters'.
      .* stands for 'zero or more characters', so that would find </i><i> as well. But . doesn't match the end-of-paragraph marker, and in this case that seems good because </i> at the end of a paragraph and <i> at the beginning of the next one seem legitimate. Or not?

      P.
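The quantifier behaviour is easy to confirm outside InDesign. This sketch uses Python's `re` module, which is not InDesign's GREP engine, but `.+` and `.*` mean the same thing in both. Python's `.` also skips the newline by default, used here only as an analogy for the end-of-paragraph marker. The doubled-tag string is an invented case showing one way a stray `</i><i>` could be left behind.

```python
import re

leftover = "<i></i>Ubaldini, Roma, 1992."

# .+? needs at least one character between the tags, so the empty pair is missed.
print(re.search(r"<i>(.+?)</i>", leftover))

# .*? accepts zero characters, so the empty pair is found and removed.
print(re.sub(r"<i>(.*?)</i>", r"\1", leftover))

# Invented case: with .+? two adjacent empty pairs are treated as one match
# whose "content" is the inner </i><i>, which is then left in the text.
doubled = "<i></i><i></i>"
print(re.sub(r"<i>(.+?)</i>", r"\1", doubled))

# By default . stops at a line break, so a match never runs into the next line.
two_lines = "Vita di Siddhatta Il Buddha,\nUbaldini, Roma, 1992."
print(re.search(r".+", two_lines).group())
```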

    • #83766
      Peter Kahrel
      Participant

      Made a mess of the codes. Sorry. But it’s still readable and correct.

    • #83767
      Anonymous
      Inactive

      Many thanks Peter. Yes, I understand now and everything is cleaned up.

      This has been a real eye-opener for me, and many thanks to you, Ari and David for opening up a new facet of InDesign for me.

      Best wishes

      Jim

  • The forum ‘General InDesign Topics (CLOSED)’ is closed to new topics and replies.