
GREP Find and replace expression


    • #114472
      Anonymous
      Inactive

      Hi all,

      Not sure if this is possible with GREP or whether it needs a script. I’m editing a long text conversation between two parties. Each party starts their message with

      Name1: Message message message
      Name2: Message message message

      but sometimes Name1 sends multiple messages in a row before Name2 responds. In those instances, I’d like to replace “Name1:” with a carriage return and tab. What I am effectively looking for is an expression that says “Find when this happens more than once before the other thing happens, and replace only the second and subsequent instances of it until you find the other one.”

      I like my GREP but I think I might be pushing it. Does this require a script instead?

      Thanks in advance for your help!

    • #114612
      Masood Ahmad
      Participant

      Try this…

      GREP Find/Change:
      Find What: ^(Name1:.+\r)(Name1:.+\r)+
      Change to: $0\r

    • #114643
      Anonymous
      Inactive

      Hi,

      Thanks for your help, but that hasn’t found anything unfortunately! Have tried a few different variations of it to no avail.

    • #114650

      Craig, if Masood’s GREP doesn’t work, please provide a real example instead of “message message” …

    • #114655
      Anonymous
      Inactive

      Thanks – sorry if I wasn’t giving enough detail.

      19:49 Jim: How you doing?
      19:53 Jane: I’ve finished tidying here
      19:53 Jim: Can either start another thing or meet you in a bit?
      20:20 Jane: We’re just sorting the bill now
      20:20 Jane: Shouldn’t be too long
      20:21 Jim: Cool beans
      20:41 Jane: Just walking to the station now

      They’re all formatted like this.

      The ideal result would be:

      19:49 Jim: How you doing?
      19:53 Jane: I’ve finished tidying here
      19:53 Jim: Can either start another thing or meet you in a bit?
      20:20 Jane: We’re just sorting the bill now
      [TAB] Shouldn’t be too long
      20:21 Jim: Cool beans
      20:41 Jane: Just walking to the station now

      The repeated instance has been removed and replaced with a tab character.

      Obviously the times vary, so I have been attempting to get around that with wildcards, but I might be overcomplicating it. I’d be really grateful if someone knows the best way to express this.

    • #114658
      Peter Kahrel
      Participant

      Find what: ^(.+? ).+\r\K\1
      Replace with:

      P.

    • #114669
      Anonymous
      Inactive

      Hi Peter,

      That works partly – thanks so much. However, it gets rid of the second instance of the name without finding those where there are sometimes 3 or 4 instances in a row. I assume this is probably just a single small correction to the formula but I haven’t been able to work it out. Any ideas?

      EDIT: in fact – sometimes it finds the third instance without touching the second…!

    • #114671

      Craig, this needs more clarification!

      If I run Peter’s GREP on your lines, then 19:53 and 20:20 are selected. But at 19:53 these are different people, so in your final result this shouldn’t be removed, should it? So what is the goal: to remove the time only if it is the same person? Is it always the same time, or only the same person?

    • #114718
      Anonymous
      Inactive

      Apologies all for not going into enough detail – really did intend on making this a relatively easy question to answer..!

      Here is a good long sample of text:

      13:29 Jim: Cos I was leaving you
      13:29 Jim: And wandering off
      13:29 Jim: To look at some random store
      13:35 Jane: Oh
      13:35 Jane: Thanks
      13:37 Jane: I can’t find a taxi
      13:37 Jane: I’m going to try and book a late gym class and then get a late train home
      13:37 Jim: ?
      13:38 Jane: Yeah
      13:38 Jane: It’s a pain
      13:39 Jane: Likely won’t get home till 10
      13:39 Jane: But c’est la vid
      13:39 Jim: Come to the office
      13:39 Jim: Grab the key
      13:39 Jane: *vid
      13:39 Jim: Go to mine
      13:39 Jane: Arrr
      13:39 Jane: *vie
      13:39 Jane: No no it’s okay
      13:39 Jim: No it’s really okay
      13:39 Jane: I should probably go home
      13:39 Jane: Don’t worry
      13:39 Jim: Ahhh come on you were clearly asking me if you could come
      13:39 Jim: See
      13:40 Jim: See
      13:40 Jane: I wasn’t
      13:40 Jane: I was venting

      ———-

      Here is the ideal result:

      13:29 Jim: Cos I was leaving you
      [TAB] And wandering off
      [TAB] To look at some random store
      13:35 Jane: Oh
      [TAB] Thanks
      [TAB] I can’t find a taxi
      [TAB] I’m going to try and book a late gym class and then get a late train home
      13:37 Jim: ?
      13:38 Jane: Yeah
      [TAB] It’s a pain
      [TAB] Likely won’t get home till 10
      [TAB] But c’est la vid
      13:39 Jim: Come to the office
      [TAB] Grab the key
      13:39 Jane: *vid
      13:39 Jim: Go to mine
      13:39 Jane: Arrr
      [TAB] *vie
      [TAB] No no it’s okay
      13:39 Jim: No it’s really okay
      13:39 Jane: I should probably go home
      [TAB] Don’t worry
      13:39 Jim: Ahhh come on you were clearly asking me if you could come
      [TAB] See
      [TAB] See
      13:40 Jane: I wasn’t
      [TAB] I was venting

      The aim is to have the conversation read more naturally on the page as a result of the removal, because each individual message after the first from a person will simply read as a new line rather than having to read their name again.

      So, instance 1 of “NAME1:” remains, then instances 2, 3, 4, 5, 6, and so on until “NAME2:” is written are deleted and replaced by a tab character. It is more complex than I explained up there; I understand if the answer is ‘this can’t be done’ – but I live in hope! I have a few thousand messages to apply this to otherwise. We shouldn’t worry too much about the time specifically as I know I can remove that with a different expression (and have done on some sections of text already) but the names are the things giving me issues.

    • #114721
      Peter Kahrel
      Participant

      This one:

      ^.+?(.+?:).+\r\K.+\1

      Looks at names, not times. But it matches only the second identical name. You could do an expression that matches all, but because InDesign doesn’t allow non-contiguous selections, you wouldn’t be able to replace anything.

      P.
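
To illustrate why a pairwise pattern only handles every other line, here is a minimal sketch in Python. It uses Python's `re` module, not InDesign's GREP engine, so there is no `\K`; a capture group stands in for it. The pattern is simplified on the assumption that every line follows the `HH:MM Name:` layout from the sample with a single-word name, and `replace_pairs` is a hypothetical helper name.

```python
import re

# Simplified pairwise pattern, assuming lines look like "HH:MM Name: text".
# Group 1 is the whole first line (kept as-is), group 2 is the speaker's name.
# The "HH:MM Name:" prefix of the following line is what gets swapped for a tab.
PAIR = re.compile(r"^(\d\d:\d\d (\w+):.*\n)\d\d:\d\d \2:", re.MULTILINE)

def replace_pairs(text: str) -> str:
    """Run one find/change pass: keep line one, tab-replace line two's prefix."""
    return PAIR.sub(lambda m: m.group(1) + "\t", text)

sample = (
    "13:29 Jim: Cos I was leaving you\n"
    "13:29 Jim: And wandering off\n"
    "13:29 Jim: To look at some random store\n"
    "13:35 Jane: Oh\n"
)

once = replace_pairs(sample)
twice = replace_pairs(once)  # a second pass changes nothing further
print(once)
```

In this sketch the third Jim line survives both passes: the first match consumes lines one and two, and after the change line two no longer carries a name for line three to be compared against.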

    • #114722
      Anonymous
      Inactive

      Hi Peter,

      That makes sense re: non-contiguous selections. Thanks for that solution. This appears to be finding a bit too much sometimes, though. For the above selection it worked fine for all of them except this:

      13:39 Jim: Go to mine
      13:39 Jane: Arrr
      13:39 Jane: *vie
      13:39 Jane: No no it’s okay
      13:39 Jim: No it’s really okay

      In this one it found

      13:39 Jane: Arrr
      13:39 Jane:

      So would delete/replace one of the messages if used.

      Sorry, I know I am being demanding! Any idea why it’s doing that?

    • #114739
      Peter Kahrel
      Participant

      It’s caused by the identical times. If you change 13:39 to any other time in ‘13:39 Jane: No no it’s okay’ the problem doesn’t occur. But since you can do only every other line, not every line, the GREP query is of limited use anyway. A script would probably be more useful.
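
One way the script route could look, sketched in Python on the exported plain text rather than as an InDesign script: walk the lines in order, remember the previous speaker, and swap the `HH:MM Name:` prefix for a tab whenever the speaker repeats. The line layout, the single-word names, and the function name `collapse_repeats` are assumptions taken from the sample, not part of any InDesign API.

```python
import re

# Assumed layout from the sample: "HH:MM Name: message", single-word names.
LINE = re.compile(r"^(\d{1,2}:\d{2}) (\w+):(.*)$")

def collapse_repeats(text: str) -> str:
    """Replace the time and name with a tab when the same speaker continues."""
    previous_speaker = None
    out = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            # Not a message line: leave it untouched and end the current run.
            previous_speaker = None
            out.append(line)
            continue
        _time, speaker, message = m.groups()
        if speaker == previous_speaker:
            # Same speaker as the line before: keep only the message text.
            out.append("\t" + message)
        else:
            out.append(line)
        previous_speaker = speaker
    return "\n".join(out)

sample = "\n".join([
    "13:39 Jim: Go to mine",
    "13:39 Jane: Arrr",
    "13:39 Jane: *vie",
    "13:39 Jane: No no it's okay",
    "13:39 Jim: No it's really okay",
])

result = collapse_repeats(sample)
print(result)
```

Because the comparison uses only the captured name, identical times on neighbouring lines make no difference here, and a run of any length is handled in a single pass.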

    • #114740
      Anonymous
      Inactive

      Thanks Peter – I’ll try to work out using a combination – maybe where I remove the times first, but either way, this has been really useful, so thanks so much for your help.
