Drag and Drop AppleScripts for EPUB, IDML, etc.
For my final tip at last week’s PEPCon, I showed a little AppleScript for working with IDML that I found on an InDesign scripting forum. It was written by renowned AppleScripter Shane Stanley of AppleScript Pro Sessions, and it lets you drag and drop an IDML file onto the script (compiled as an application) to unzip the IDML into a folder structure. You can then poke around the files within, and edit them. You can also use apps with powerful search and replace features like Dreamweaver, Oxygen, BBEdit, and so on to find and modify content within the folder. When you’re done editing, just drop the folder back on top of the script to get a new IDML file with a “+” added to the filename to preserve your original file. Smooth as silk. If you’re a Mac user, it’s a very handy tool to have at your disposal, and big thanks to Shane for making it.
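For anyone not on a Mac, the same round trip can be roughed out in a few lines of Python. This is only a sketch of the idea, not Shane's script: the function names `unpack` and `repack` and the ".unpacked" folder suffix are my own assumptions, and it does not attempt to honor any entry-ordering rules a particular format may impose on its archive.

```python
import zipfile
from pathlib import Path

SUFFIX = ".unpacked"  # hypothetical folder naming, not what the AppleScript uses


def unpack(archive):
    """Unzip e.g. layout.idml into a sibling folder layout.idml.unpacked."""
    archive = Path(archive)
    folder = archive.with_name(archive.name + SUFFIX)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(folder)
    return folder


def repack(folder):
    """Zip the folder back up as e.g. layout+.idml, leaving the original alone."""
    folder = Path(folder)
    name = folder.name
    if name.endswith(SUFFIX):
        name = name[: -len(SUFFIX)]
    original = Path(name)
    # Insert "+" before the extension so the source file is never overwritten.
    target = folder.with_name(original.stem + "+" + original.suffix)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(folder).as_posix())
    return target
```

Calling `unpack("layout.idml")` and, after editing, `repack("layout.idml.unpacked")` should leave you with `layout+.idml` next to the untouched original.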
To take things one step further, I tinkered with the script to make versions for the other file formats that InDesign exports which are basically disguised ZIP archives: the new version of FLA (formerly known as XFL), and of course, EPUB. I also made a version for playing around inside Word’s DOCX files. If you’d like to give them a try, click the image below to download all four scripts for decompressing and recompressing IDML, EPUB, FLA, and DOCX.
Then, with a drag and drop, you can turn a file into a folder of its contents, and back again.
But why would you want to crack open something like a DOCX?
David Creamer gave one answer to that question with his great tip at PEPCon, showing how to extract images from a Word doc. DOCX archives contain all graphics that have been placed into the Word file, in their full, high-res glory. You just have to know where to look. Hint: look in the Media folder ;)
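If you only want the pictures, you don't need to unpack the whole thing. Here is a small Python sketch that copies just the media files out of a DOCX. The function name `extract_docx_media` is made up for illustration, and it assumes the images live under `word/media/` inside the archive, which is where Word normally puts them.

```python
import zipfile
from pathlib import Path


def extract_docx_media(docx_path, out_dir):
    """Copy every file under word/media/ in the DOCX into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as zf:
        for info in zf.infolist():
            # Assumption: embedded graphics sit in word/media/ inside the archive.
            if info.filename.startswith("word/media/") and not info.is_dir():
                target = out_dir / Path(info.filename).name
                target.write_bytes(zf.read(info))
                saved.append(target)
    return saved
```

For example, `extract_docx_media("report.docx", "report_images")` returns the list of files it wrote, so an empty list means nothing was found at that path.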
Microsoft’s Office Open XML is a nasty format for an XML tourist like me to wade through. Reading it makes IDML and EPUB seem like a comic book by comparison. But what good’s having a computer if you can’t break things with it? So in the interest of weird science, I plowed ahead and edited the art and text inside the DOCX shown above. Word complained ominously about corruption when I re-opened the file.
But I called its bluff. The file did open, and it reflected my changes.
That’s what you call taking the bull by the horns.
Interesting Word tip… I’m on Windows so I can’t use those scripts, but I just changed the .docx extension to .zip and unzipped it. Worked in reverse too.
@Stix Hart: This is a hack of a trick. I really like that!
Thanks for sharing.
I’m in a kind of hurry. The AppleScripts have to wait…
Best regards from Austria
Cool tips Mike! Thanks for sharing.
We tend to get a lot of DOCX files with embedded graphics and photos from the two schools our newspaper covers. I’ve found that the easiest way to extract the original graphics is to “Save to Webpage” and grab the odd-numbered JPG files out of the folder.
@Jeremiah: I always did that, too, but Dave Creamer’s tip is far easier… just change .docx to .zip and you’re in like Flynn.
One of the nice things I neglected to highlight about the scripts is that you can batch process a bunch of files by dropping them all on the script at once.
Another nice thing is there’s no typing involved.
You know, if you place the Word file into InDesign, the pictures come along too (as long as you turned on that option in Import Options). They appear in the Links panel as embedded. You can shift-click them and unembed them if you want.
Sorry, I don’t know how to rewrite these scripts in JavaScript, but maybe someone else in the InDesignSecrets ecosystem does.
I don’t really know how much work that would be, but the scripts are only 13 lines of code each, so it might not be too daunting a task.
The drag and drop epub script introduces validation errors on re-packing.
1. ERROR: foobar+.epub: length of first filename in archive must be 8, but was 9
I wondered since the repacking is done in the Finder, if some invisible files are being compressed as well?
I think that error is the result of the + sign being added to the EPUB name when it gets recompressed. Try deleting the + and validate again. If that works (and you have a back-up of the EPUB elsewhere), you could delete the part of the script that adds the +.
Nope, that isn’t it Mike.
1. test.epub on first run of epubcheck:
test.epub is valid!
Epubpreflight runs some of the same validation as epubcheck, but additionally checks for problems like very large file sizes that might affect some reading systems.
2. Same file: test.epub, opened with d&d epub.app.
Did not even open the folder, or modify a single file – simply decompressed and re-compressed using the D&D epub.app.
Deleted the + and changed name of file to test2.epub before validation – as noted in the result below…
test2.epub is NOT valid
1. ERROR: test2.epub: length of first filename in archive must be 8, but was 9
Mike, note that the error says “in archive”. Run the test and validation yourself and see what you get.
I have a hunch it is the DS_Store file myself, but I’m no programmer.
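One way to test that hunch without guessing is to list what is actually inside the repacked archive. The Python sketch below reports the first entry and any hidden-looking entries; `inspect_archive` is a hypothetical helper name, and treating dot-files and `__MACOSX/` entries as "hidden" is my assumption about what a Mac might add. For what it's worth, "mimetype" is 8 characters long and ".DS_Store" is 9, which fits the error message, though that alone doesn't prove it.

```python
import zipfile
from pathlib import PurePosixPath


def inspect_archive(archive):
    """Report the first entry of a ZIP-based file and any hidden-looking entries."""
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()  # entries in the order they are stored
    first = names[0] if names else None
    hidden = [
        n for n in names
        # Assumption: dot-files and __MACOSX/ entries are Mac extras, not content.
        if n.startswith("__MACOSX/") or PurePosixPath(n).name.startswith(".")
    ]
    return {
        "first_entry": first,
        "first_entry_length": len(first) if first is not None else 0,
        "hidden_entries": hidden,
    }
```

Running it on the file before and after the round trip would show whether the first entry changed and whether anything extra crept in.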
The DS_Store file is added to the folder by your friendly Mac OS, and as such, probably not an important part of the IDML specs. … It’s not there in Windows’ IDMLs, for one thing.
Since that’s about all I know about OSX, I’m going to lean back and listen intently to you people trying to prevent OSX from creating and adding this file … Should get interesting, non?
I was afraid of this… it seemed too good to be true that the script would put the epub back together the proper way. ePub has very strict rules for how it needs to be zipped up. That’s why Colin and Gabriel and folks were using Springy (on the Mac) to edit them without unzipping at all. Gabriel P. talks about other options for proper zipping on the Mac (using Terminal) in his InDesign Magazine article (which was reprinted on creativepro.com). Notably, the mimetype file has to go first!
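To make the ordering rule concrete, here is a hedged Python sketch of a repack that writes the `mimetype` file first and uncompressed, then adds everything else compressed while skipping `.DS_Store`. It is not the Terminal recipe from Gabriel's article, `pack_epub` is a name I invented, and I haven't verified that its output passes epubcheck, so treat it as a starting point only.

```python
import zipfile
from pathlib import Path


def pack_epub(folder, target):
    """Zip an unpacked EPUB folder with mimetype as the first, stored entry.

    Write target somewhere outside folder so it is not swept into itself.
    """
    folder = Path(folder)
    target = Path(target)
    mimetype = folder / "mimetype"
    with zipfile.ZipFile(target, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(folder.rglob("*")):
            if not path.is_file():
                continue
            if path == mimetype or path.name == ".DS_Store":
                continue  # already written, or Finder metadata we don't want
            zf.write(
                path,
                path.relative_to(folder).as_posix(),
                compress_type=zipfile.ZIP_DEFLATED,
            )
    return target
```

Something like `pack_epub("test_unpacked", "test+.epub")` keeps the “+” naming from the original scripts; the folder name here is just an example.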
FWIW, before I mentioned these scripts to anyone, I tried and was able to edit content in the epub, use the script to zip it back up and view it in Digital Editions without a problem. I don’t have time to do any more research right now, but perhaps the script could be enhanced to overcome the validation errors.
@Ira visit my site for more information to solve your problem: “Length of first filename in archive must be 8, but was 9”.
hope my site post will help your problem out.
Mike, you are the man! I have been cursing editors for months for embedding their images in word files. This made my day :)
I’m desperately looking for some scripts like the ones you posted here. I thought I succeeded, but…
obviously the scripts work only on PowerPC… :(
You haven’t – by any chance – “updated” your scripts for Intel? Or have some other source for similar scripts?
I’d appreciate it very much. :)
@Eric: PowerPC?! Huh? I’ve used these scripts on an Intel MacBook Pro with no problem. Maybe it’s an OS problem?
I used the Drag & Drop IDML app in 2011 and it was a lifesaver. I was certainly on OSX, probably 10.6.
I haven’t had to use it again until today. I’m now on OSX 10.7. When I dropped an IDML file on it I got the message that Power PC apps are not supported. That’s obviously wrong, but it looks like the app doesn’t work with OSX 10.7 (and probably higher). When I look at the icon for the script, it is overlaid with the circle with a line through it (the “not allowed” symbol, or whatever it’s called).
So, I’m stuck, since I’m on 10.7 at work and 10.10 at home. I clicked on Mike Rankin’s name, but it tries to find his website and I get a “Server can’t find” error.
Is there a working version of this for El Capitan/Yosemite?
It doesn’t work anymore in these newer versions of the OS.