Dictionary for Transliterated Russian (and other languages)

    • #102055
      Matthew Williams
      Participant

      Hi.

      I’m working on a book that is mostly English but that has a lot of Russian titles transliterated to Latin characters (that is, they’re not written using Cyrillic), and I’m having to guess about how to hyphenate the transliterated words. Does anyone know of a Hunspell dictionary that handles transliterated Russian (or Arabic or Hebrew, which I also have to deal with on a regular basis)? If there’s no such dictionary, does anyone know if I can use a substitute dictionary, say Ukrainian, that would use the Latin characters instead of the Cyrillic?

      Thanks.

    • #102063

      Interesting problem.
      If you can’t find a dictionary, I guess the biggest problem might be unwanted hyphenation separating any digraphs used to represent Cyrillic letters, like ZH for ? (does the forum software do Cyrillic?**). I suppose you could find-replace those digraphs with “no break”.

      Good luck,
      Chris

      **Edit: No it doesn’t do Cyrillic, it’s replaced my ZH with a question mark.
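
A rough sketch of that find-and-replace idea, written in Python rather than in InDesign itself. It only locates the letter groups that a "no break" setting would need to cover. The list of units is an assumption (it depends on which romanization system the book uses), and the names `UNITS` and `protected_spans` are made up for illustration.

```python
import re

# Assumed multi-letter units; adjust to the romanization system actually used.
UNITS = ["shch", "zh", "kh", "ts", "ch", "sh"]

# Longest alternatives first, so "shch" is matched whole rather than as "sh" + "ch".
UNIT_RE = re.compile(
    "|".join(sorted(UNITS, key=len, reverse=True)),
    re.IGNORECASE,
)

def protected_spans(text):
    """Return (start, end, matched_text) for each unit that should not be split."""
    return [(m.start(), m.end(), m.group()) for m in UNIT_RE.finditer(text)]

for word in ["Zhivago", "nastoiashchem", "Imperatorskogo"]:
    print(word, protected_spans(word))
```

A similar alternation could probably be pasted into a GREP search, but that is untested here.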

    • #102065
      Matthew Williams
      Participant

      Thanks, Chris. That’s definitely a step in the right direction, but because I don’t speak Russian and can’t “hear” it in my head, there are also lots and lots of places where I look at InDesign’s hyphenation choice and know that in English the break can’t be right. For example, InDesign wants to break “Imperatorskogo” after the “s.” In Russian, though, perhaps that’s correct. And there are places that are totally foreign to me, like the letter combination “shch” in the word “nastoiashchem.” Is it okay to hyphenate between the first “h” and the “c”, or do the four letters constitute one sound? And what about that “oia” combination? It’s a problem I have with transliterated Hebrew and Arabic, too, but the character strings that occur in transliterations of those seem to be easier to parse.

      Nonetheless, I appreciate your help. At the very least, I can protect “zh” and “kh” and the like.

    • #102066

      Hi Matthew.
      SHCH is a single letter, U+0429 uppercase and U+0449 lower.
      I can more or less read the Cyrillic alphabet, but transliteration (of any language) is a bit of a minefield* unless you know the system that’s been used, or can ask the person who transliterated it.
      Also, it’s entirely possible that the letter pair used as a digraph in the transliteration may also occur as two separate letters (cf. English “weather / outhouse”, where the TH is a digraph the first time and not the second).
      Worst-case scenario: turn off hyphenation!

      *Wikipedia at https://en.wikipedia.org/wiki/Romanization_of_Russian
      says
      “There are a number of incompatible standards for the Romanization of Russian Cyrillic, with none of them having received much popularity and in reality transliteration is often carried out without any uniform standards”

      As I said before, Good Luck!
      C
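
To make the "single letter" point concrete, here is a minimal sketch that splits a transliterated word into units, keeping groups like "shch" together so that a break is only ever considered between units. The unit list and the function name `split_units` are assumptions for illustration, and the greedy matching has exactly the weakness described above: it cannot tell a true digraph from two letters that merely sit next to each other.

```python
# Assumed units, longest first so "shch" wins over "sh" and "ch".
UNITS = ("shch", "zh", "kh", "ts", "ch", "sh")

def split_units(word):
    """Split a word into units, never cutting inside an assumed digraph."""
    units = []
    lower = word.lower()
    i = 0
    while i < len(word):
        for unit in UNITS:
            if lower.startswith(unit, i):
                units.append(word[i:i + len(unit)])  # keep original casing
                i += len(unit)
                break
        else:
            # No multi-letter unit starts here; take a single character.
            units.append(word[i])
            i += 1
    return units

print(split_units("nastoiashchem"))
print(split_units("sovetskii"))
```

In "nastoiashchem" the "shch" stays in one piece. In "sovetskii" the sketch also glues "ts" together, although in the Cyrillic spelling those are two separate letters, so the output is a list of places to be careful about, not a set of hyphenation rules.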

    • #102100
      David Goodrich
      Participant

      I’ve been reading William Taubman’s Gorbachev: His Life and Times (Norton, 2017), a Christmas gift from my daughter. The author explains in the front matter his use of multiple systems for romanizing Cyrillic, which I understand (in a past life I published a few scholarly articles based on Russian sources). But whoever was in charge of production at Norton simply dropped the ball on hyphenation, and not just for romanized Russian: I’ve been startled by the number of recto pages ending in the middle of an English word, entirely avoidable with a setting in ID.

      Elaborating on Chris’s worst case, I might consider creating a character style for romanized terms and setting its language to none ([No Language]), effectively turning off hyphenation. That might work most of the time, though inevitably it would cause some poor spacing. For those instances I could remove the character style and insert my own discretionary hyphens. Someday AI may be able to help, but for now you may need someone who knows Russian to check.

      Good luck,
      David

    • #102101
      Matthew Williams
      Participant

      I had thought of the no-hyphen option, and since the text in this book is left-aligned, it would probably have been acceptable with a bit of futzing to prevent short lines. Fortunately, we have the option that seems to be the best one for now. The author will be reading the first pages, so I’ve asked her somewhat forcefully to pay particular attention to the way that the transliterated text is hyphenated because neither InDesign nor I know where the acceptable breaks are.

      I haven’t seen Taubman’s book, so this might not apply, but in (a rather weak) defense of Norton, InDesign does frequently fail to follow H&J rules (and Keep rules) when footnotes fall at the bottom of the page. That’s no excuse for not looking at every line and page break to make sure they’re right, of course, but I suspect that Norton paged the book using one of the automated packages out there. Someday, those packages will replace me, but they’re not there yet.

      Best wishes and thanks to you both, David and Chris, for your advice.

  • The forum ‘General InDesign Topics (CLOSED)’ is closed to new topics and replies.