Is there a script that can fix run-together words (e.g. "happybirthday")?

  • Author
    Posts
    • #1179317
      Lala Lala
      Participant

      I know there’s a big potential problem with such a script –
      some legitimate words might be misread as run-together pairs, e.g. “theme” could get turned into “the me”.

      But I’m still hoping some ambitious person came up with some ‘smart’ script that could at least fix obvious
      ‘unspaced’ word pairs, like “wasprobably”.

      Does it exist? I got a document riddled with these.

    • #14324340

      It sounds possible but it would need a massive word list. I have experimented with some large datasets, and there is a limit that can’t be crossed.

      Have you considered doing a spell check?

    • #14324339

      Yes, run a spellcheck – that would get most of your ‘obvious’ ones.

      Also, can you go back to the source document? If it’s riddled with them, there might be two things –
      a/ human error at work somewhere, i.e. it’s a problem for someone else to fix – “this material is of unacceptable quality for me to process”;
      b/ a poor conversion of a PDF, where every new line came out with a hard return, and these have been replaced by nothing, running the words together.

      e.g.
      PDF has
      ‘I know there’s
      a big potential
      problem with
      such a script’

      which has been extracted from the PDF as
      ‘I know there’sa big potentialproblem withsuch a script’

      Just a theory!
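
      To illustrate the theory above, here is a minimal Python sketch (not an InDesign script) using the same example lines. The function name `join_lines` is made up for illustration. It only shows how replacing hard returns with nothing produces the run-together text, while replacing them with a space does not.

```python
# Lines as they might come out of a PDF extraction (from the example above).
extracted_lines = [
    "I know there's",
    "a big potential",
    "problem with",
    "such a script",
]


def join_lines(lines, separator):
    """Join extracted lines, trimming stray whitespace at the line ends."""
    return separator.join(line.strip() for line in lines)


# Replacing each hard return with nothing reproduces the run-together text.
broken = join_lines(extracted_lines, "")

# Replacing each hard return with a single space keeps the words apart.
fixed = join_lines(extracted_lines, " ")

print(broken)
print(fixed)
```

      If the damage did happen this way, the errors would sit only where the original line breaks were, which is worth checking before trying anything heavier.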

    • #14324337
      Lala Lala
      Participant

      Appreciate the thoughts… yeah, spellcheck is how I’m doing it now, and it’s agonizing.
      A handful of errors per page × hundreds of pages. I can’t go back to the original source.

      Annoyingly, spellcheck’s suggestions are not consistent. One problem word might have the correct solution as the first choice, then another might have it as the 4th choice with some nonsensical suggestions ranked above it like “been-pulled” (why??).

      If the first choice was always consistent, like the [word][space][word], then I could at least rapidly spam double clicks on the first suggestion, and fix 99% of the problem words in a matter of minutes. But the slight mental effort and wasted time of finding the correct choice from the suggestion list turns minutes into hours.
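
      The [word][space][word] pattern can be picked out of a suggestion list mechanically, wherever it is ranked. A rough Python sketch follows, assuming the spellchecker's ranked suggestions are already available as a list of strings. How to get that list out of Word or InDesign is not covered here, and the function name `pick_split_suggestion` and the sample suggestions are made up for illustration.

```python
def pick_split_suggestion(word, suggestions):
    """Return the first suggestion that is `word` with exactly one space inserted.

    `suggestions` is assumed to be the ranked list a spellchecker returned
    for `word`; obtaining it depends on the application being scripted.
    """
    for candidate in suggestions:
        parts = candidate.split(" ")
        # Accept only two non-empty parts that rebuild the original word exactly.
        if len(parts) == 2 and all(parts) and "".join(parts) == word:
            return candidate
    return None  # no [word][space][word] suggestion: leave for manual review


# Hypothetical suggestion list, with the wanted fix ranked third.
sample = ["been-pulled", "bumbled", "been pulled"]
print(pick_split_suggestion("beenpulled", sample))
```

      Anything that returns `None` stays untouched, so the odd cases are still left for a human to look at.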

      A script that would run a spellcheck, and automatically replace the problem words strictly with [word1 word2] suggestions, would essentially do the trick. Or a script that uses a custom dictionary, which is just the normal dictionary, but maybe culling words shorter than 5 characters to reduce the number of false positives (“the me” instead of “theme”).
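
      The custom-dictionary idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an existing InDesign script: the tiny `WORDS` set stands in for a real word list, and `split_run_together` is a made-up name. It leaves real words alone, requires both halves to be at least 5 characters, and only splits when exactly one split point works.

```python
MIN_PART_LENGTH = 5  # from the post: ignore dictionary words shorter than 5 characters

# Hypothetical stand-in for a real dictionary; a usable list would be far larger.
WORDS = {
    "the", "me", "theme", "happy", "birthday",
    "was", "probably", "potential", "problem",
}


def split_run_together(token, words=WORDS, min_len=MIN_PART_LENGTH):
    """Insert one space into `token` if it is unambiguously two dictionary words."""
    lowered = token.lower()
    if lowered in words:
        return token  # already a real word, e.g. "theme"

    # Collect every split point where both halves are long enough and known.
    splits = []
    for i in range(min_len, len(token) - min_len + 1):
        if lowered[:i] in words and lowered[i:] in words:
            splits.append(i)

    if len(splits) == 1:
        i = splits[0]
        return token[:i] + " " + token[i:]
    return token  # no split, or more than one: leave for manual review


for sample in ["happybirthday", "theme", "potentialproblem", "wasprobably"]:
    print(sample, "->", split_run_together(sample))
```

      One trade-off shows up straight away: with a 5-character minimum, “wasprobably” is left unchanged because “was” is too short, so a threshold like this would miss many of the common pairs and those would still need the spellcheck route.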

    • #14324332

      Point taken about the spellcheck – quite surprising that it’s so bad at the first choice.
      I just did a trial between Word and InDesign using the same chunk of text with run-together words, and Word was substantially better at suggesting the properly spaced words as first choice, getting it right maybe 80–90% of the time, while ID managed less than 50%.

      Would a visual approach be less mentally taxing? If you set InDesign to ‘dynamic spelling’, all unrecognised words are squiggly underlined, and you just need to click the right spot and hit space. I say ‘just’ as if it’s an easy task, but worth a try?

      I suppose the lesson is to preflight incoming text before putting a lot of work into the layout.
      Not always possible, I know. From experience, you’d at least expect language translators to understand the meaning of the words ‘final version’!

      Good luck

  • The forum ‘General InDesign Topics (CLOSED)’ is closed to new topics and replies.