Mail AppleScript-ing Project

I have a folder full of old mail in Apple’s Mail application. It’s gigantic. About 70,000 messages. Most of them are duplicates, because it’s the result of finding old folder of mail upon old folder of mail and merging them together into one great hoard. The actual number of real distinct messages is probably a smallish fraction of 70,000.

What’s worse, some of the 70,000 are blank. In an inept attempt at writing a Python script to clean up a similar uber-mail-folder in the past, I somehow took a lot of old mail and destroyed the body of the emails, leaving the headers intact. So my gigantic folder includes many duplicates, but some of the duplicates aren’t real duplicates because they have missing bodies.

I want to somehow eliminate all the duplicate messages, and there are scripts to do that in Apple Mail. The only one that I would have trusted not to accidentally kill a real message and keep the one without a body chokes and fails on a folder that large. (It also choked and failed on a smaller folder. Maybe something changed in Leopard that breaks that script.)

So I wanted to go through and destroy all the messages which have blank bodies, since they’re no use to me and they make it dangerous to get rid of duplicate messages. I tried exporting everything to an mbox-format file and using some of Python’s nice mailbox-manipulation libraries, but the file was insanely large, and Python on my MacBook staggered under its weight. (Besides, my use of Python caused this problem, a while back…)

So eventually I turned to AppleScript. (I first tried using rb-appscript, but it turns out I don’t need any special Rubyness for this, and it’s easier to learn from examples of AppleScript on the web if I don’t have to translate them into Ruby before I use them.)

I wrote a script in Apple’s Script Editor called “Winnower.” It takes messages in a folder called “doing” and sorts them into two folders, “blank” and “done,” depending on whether there’s any content in the body or attachments on the mail. I put a few thousand messages at a time into the “doing” folder and then run the script. (The full weight of the 70,000+ message folder was too much for this script too.)

It looks like this:

tell application "Mail"
    -- The three mailboxes must already exist in Mail.
    set doingbox to mailbox "doing"
    set blankbox to mailbox "blank"
    set donebox to mailbox "done"
    set doingmessages to messages of doingbox
    repeat with thisMessage in doingmessages
        -- Treat a body made up only of white space as empty.
        ignoring white space
            if mail attachments of thisMessage is {} and content of thisMessage is equal to "" then
                move thisMessage to blankbox
            else
                move thisMessage to donebox
            end if
        end ignoring
    end repeat
end tell
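
The sorting rule is easier to see without Mail in the way. Here is a minimal Ruby sketch of the same test on in-memory data. The Message struct and the sample messages are made up for illustration, and "ignoring white space" is approximated as "the body has no non-whitespace characters", which may not match AppleScript's rules exactly.

```ruby
# Hypothetical stand-in for a Mail message: a subject, a body, and attachments.
Message = Struct.new(:subject, :content, :attachments)

# A message counts as blank when it has no attachments and its body
# contains nothing but whitespace (an approximation of "ignoring white space").
def blank_message?(message)
  message.attachments.empty? && message.content !~ /\S/
end

doing = [
  Message.new("real mail",   "Hello there", []),
  Message.new("wiped body",  " \n\t ",      []),
  Message.new("only a file", "",            ["photo.jpg"]),
]

# partition returns [matches, non-matches]: the "blank" and "done" piles.
blank, done = doing.partition { |m| blank_message?(m) }

puts "blank: #{blank.map(&:subject).inspect}"
puts "done:  #{done.map(&:subject).inspect}"
```

Note that the message with an empty body but an attachment lands in "done", which is the point of checking both conditions.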

Stupid Ruby Serialization Tricks

This post is for the small fraction of readers of this blog that dig programming in Ruby… apologies to anyone else….

So, I was fooling around with Ruby when I came up with the following trick.

Let’s say you have an object that you want to be persistent across invocations of your application. A set of configuration settings, history, who knows what. You don’t want to go nuts and worry about a whole database though. There are a couple of good super-light-weight transactional persistence libraries built in to Ruby, PStore and YAML::Store, which do the job. You could use them like this:

require 'yaml/store'

o = MyCoolObject.new

# to store it first
YAML::Store.new(".storage_for_my_app").transaction do | store |
  store['MyCoolObject'] = o
end

# .. and to retrieve it from storage.
YAML::Store.new(".storage_for_my_app").transaction do | store |
  o = store['MyCoolObject']
end

You’ve got to remember to store it again when you’re done with it of course.
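
A minimal sketch of that load, change, store-again cycle might look like the following. It uses PStore (the other library mentioned above) rather than YAML::Store, on the assumption that Marshal-based storage round-trips custom classes without extra configuration; some newer Ruby versions may be stricter about which classes YAML::Store will load, so check before relying on it. The Settings struct and the temp-directory path are hypothetical.

```ruby
require 'pstore'
require 'tmpdir'

# Hypothetical settings object; any Marshal-able object would do.
Settings = Struct.new(:theme, :history)

# Throwaway location for the demo.
path = File.join(Dir.mktmpdir, "storage_for_my_app")

# First run: create the object and store it.
PStore.new(path).transaction do |store|
  store['settings'] = Settings.new("dark", [])
end

# Later run: load it, change it, and store it again.
PStore.new(path).transaction do |store|
  settings = store['settings']
  settings.history << "opened report.txt"
  store['settings'] = settings   # the explicit "store it again" step
end

# A read-only transaction confirms the change reached the file.
reloaded = PStore.new(path).transaction(true) { |store| store['settings'] }
puts reloaded.history.inspect
```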

I came up with a variation on this:

class MyCoolObject
  def MyCoolObject.stored_in(filename, *newargs)  # class method
    YAML::Store.new(filename).transaction do | store |
      store['self'] ||= MyCoolObject.new(*newargs)
      yield store['self']
    end
  end
end

Now I can make sure everything I do with my cool object “o” is stored persistently. All I have to do is wrap my actions on “o” in a stored_in block:

MyCoolObject.stored_in(".storage_for_my_app", 'a', 'b') do | o |
  # o is either recreated from .storage_for_my_app, or
  # created anew with args 'a' and 'b' passed to its constructor

  # .. here we do stuff with o
end
# and here o is re-serialized, with any changes intact.

When the block opens, ‘o’ is either resurrected from the storage file, or created anew from *newargs if there’s nothing in the file.

When the block ends, any changes to ‘o’ are stored there.

As long as you keep all your interaction with the object inside stored_in blocks, you’re golden. You get persistence!

Note that ‘o’ can hold other objects in its instance variables, which can hold other objects, and so on — anything in there that’s serializable can be stored this way. So you can use this to persist a whole pile of objects in one file as long as they all live in a single object which has a stored_in class method.
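
To see the nesting at work, here is a small sketch under some assumptions: Notebook and Note are hypothetical classes, the file lives in a temp directory, and PStore stands in for YAML::Store (same transaction shape, Marshal instead of YAML). The stored_in method is otherwise the same idea as the one above.

```ruby
require 'pstore'
require 'tmpdir'

# Hypothetical classes for the demo: a Notebook that owns a list of Notes.
Note = Struct.new(:text)

class Notebook
  attr_reader :title, :notes

  def initialize(title)
    @title = title
    @notes = []
  end

  # Same shape as the stored_in method above, with PStore swapped in.
  def self.stored_in(filename, *newargs)
    PStore.new(filename).transaction do |store|
      store['self'] ||= new(*newargs)
      yield store['self']
    end
  end
end

path = File.join(Dir.mktmpdir, "notebook.pstore")

# First block: nothing is on disk yet, so the Notebook is built from 'ideas'.
Notebook.stored_in(path, 'ideas') { |nb| nb.notes << Note.new("learn Haskell") }

# Second block: the Notebook and its nested Note come back from the file,
# so the constructor argument is not used this time.
Notebook.stored_in(path, 'ignored this time') do |nb|
  nb.notes << Note.new("write more Ruby")
end

# Third block: read back the title and every nested note.
title_and_notes = Notebook.stored_in(path) { |nb| [nb.title, nb.notes.map(&:text)] }
p title_and_notes
```

The Notes were never stored explicitly; they ride along because they hang off the Notebook's instance variable.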

I thought this would be cool functionality to include in a module, whereupon I learned that in Ruby, a module’s “class methods” are not added to a class when you include the module in it. (I guess that’s because a module isn’t a class, so it doesn’t really technically have “class methods”…) But you can get the same effect using a tiny bit of trickery with the “included” method of the Module class, which Ruby calls whenever the module is included somewhere.
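
Here is a stripped-down sketch of that behavior with no storage involved. Greeter, Robot, shout, and build_default are all made-up names for illustration.

```ruby
module Greeter
  # Defined on the module itself, so it is NOT copied into including classes.
  def self.shout
    "HELLO"
  end

  # Instance methods are what `include` actually mixes in.
  def greet
    "hello from #{self.class}"
  end

  # Ruby calls this hook with the including class; a class method is added there.
  def self.included(base)
    def base.build_default
      new
    end
  end
end

class Robot
  include Greeter
end

puts Robot.respond_to?(:shout)          # the module's own "class method" did not come along
puts Robot.respond_to?(:build_default)  # the one added by the included hook did
puts Robot.build_default.greet
```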

Here’s a module you can use to give any class this kind of storage ability:

require 'yaml/store'

module StoredInFile
  def self.included(base)
    def base.stored_in(path, *args)
      YAML::Store.new(path).transaction do | store |
        store['self'] ||= self.new(*args)
        yield store['self']
      end
    end
  end
end

You use it like this:

class MyCoolObject
  include StoredInFile

  # other class stuff goes here

end

And that’s it.

UPDATE:

Stupid WordPress makes including code in a page really hard…  keeps eating my formatting.  I think I’ve got it…

Haskell is kind of cool.

Back in about 2000-2001, I was doing first tech support and then configuration management work for a big company in Chicago, and, basically because I was lazy and curious I would spend more time than I should have reading, on the web, about programming, especially programming languages, especially unusual ones.

That was when I first started getting interested in Ruby and read the online Pickaxe Book. That’s when I read Beating the Averages and wanted to be an Eager Young Lisp Cadet (much like the inimitable Bruce!). I downloaded Squeak and learned a little Smalltalk, and I got geeked about pure functional programming by reading John Hughes’ paper, Why Functional Programming Matters. Hey, anything but do the work I was being paid to do!

The Hughes paper led me to Haskell, and I read the Gentle Introduction to Haskell, at least up to the IO chapter, which linked forward to the Monads chapter, which was too much for my poor little brain.

The thing was, at the time, I wasn’t programming professionally or really much at all. I’d read about programming, done tiny little fun programs, done a lot of system scripting in Perl, and learned about the languages, but I’d never been a “real” programmer. This kept my mind open to wacky languages but it kept my understanding shallow.

A couple jobs later, I was doing actual programming for a living, but in Perl (the first language I’d actually used on the job, and so the one I was best at). While I wasn’t paying attention to it, Ruby suddenly became really popular thanks to this “web application framework” called Rails, maybe you’ve heard of it.

Now it seems like Haskell is starting to accumulate buzz. There’s almost as much jibberjabber on Reddit about Haskell (especially Monads) as there is about Ron “we can safely assume 95% of black males are criminals” Paul.

I’ve recently gone back to it, got a copy of the compiler working, and followed some of the good tutorials, and I finally realized that Haskell’s “monads” weren’t really as hard to understand or weird as I had thought.

I even wrote a little program that rolled dice. It compiled. It used the IO Monad. It used the Random Monad (indirectly: you can just pull random numbers into an IO Monad). It was maybe a dozen lines long, and verbose at that. I rewrote every part of it several times, so I wasn’t just cutting and pasting code, but understood exactly how it was doing its thing, and I played around with the monad operators and “do-notation” and all that.

In the end it all turns out not to be a big deal.

OK, now what?

I’d love to go learn more about Haskell. But you know what? I don’t actually program in my spare time much. Just stupid little utility scripts from time to time. Convert videos from flv to mpg using mencoder. Generate clever passwords (I have a command line script that does what this does). Automate an rsync backup. I guess I could try writing those in Haskell instead of shell or Ruby, which is what I usually use. Maybe eventually it will lead to something interesting.

We’ll see. Haskell isn’t the only language that fascinates me, but it’s the one I’ve had a long fascination with and done very little with, mostly because of the silly “oh no I can’t grok monads” hurdle. I was prompted to write this up because I just started following the fascinating Notes on Haskell blog, whose author, Adam Turoff (a pointy-headed comp sci sounding name if there ever was one), wrote up a spiffy three-part intro to Haskell for ONLamp.com, beginning here.

A Glimmer of Haskell

I think I’m starting to actually grasp Haskell’s monads. I’ve been reading the Haskell wikibook and this article on Monads as Containers…

And I’ve been dinking around just a little and little things like this are making sense to me:

Prelude> return("won't you take me to") >>= (\line -> putStrLn (line ++ " funKAYTOWN"))
won't you take me to funKAYTOWN
Prelude> ["won't you take me to"] >>= (\str -> [str ++ " funKAYTOWN"])
["won't you take me to funKAYTOWN"]
Prelude>

Cool. I’ve wanted to learn me some Haskell for a long time (I think I first checked it out in 2001?), but somehow the abstraction in monads was a little more than I could focus on. Now it’s starting, starting to make sense.
The fact that it’s been this hard to get me this far doesn’t suggest that I’m going to be a master Haskellist any time soon, but at least there’s some hope.
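
For anyone more at home in Ruby, the second GHCi line above has a rough cousin in Array#flat_map: apply a function that returns a list to each element, then flatten the results one level. This is only a loose analogy for the list monad's >>=, not a claim that Ruby has monads built in.

```ruby
# Haskell:  ["won't you take me to"] >>= (\str -> [str ++ " funKAYTOWN"])
# Ruby analogue: flat_map applies the block to each element and
# concatenates the resulting arrays.
lines = ["won't you take me to"]
result = lines.flat_map { |str| [str + " funKAYTOWN"] }
p result

# Returning several elements per input shows why it is "flat":
echoed = ["a", "b"].flat_map { |s| [s, s.upcase] }
p echoed
```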

RE: Rails, It Turns Out I’m Just an Idiot, Not A Moron

Or vice versa.

About a week ago I wrote and deleted a fairly whiny post about how I was trying to write a simple Rails application and just didn’t get it. I mean, I could generate scaffolding and stuff, like any chimp could, but every time I tried to do anything in the least bit off the beaten path, I’d end up in a morass.

Having taken a little time off, I started messing around with a simple Rails app again and needed to look something up. I couldn’t find it in the API documentation or by googling around, so I grabbed my ancient (1st edition) Agile Web Development with Rails book and checked the index. Ah, there was what I needed, on pages X, Y, and Z.

In the midst of reading those pages I realized I had never really taken advantage of that book at all.

See, the first umpteen chapters of the book are a tutorial, where you follow along, they say do this and do that, and you are supposed to go “wow, it sure looks easy, of course, I’m not learning anything except what to do if I happen to want to build exactly what they are building in the tutorial example.”

I’d only made it through a few chapters before tossing the book aside as useless, because that sort of thing doesn’t help me at all. I can’t follow along and not understand what’s going on. I want to know what’s going on, how things work, first, and then I may be able to get something useful out of an example or tutorial.

There is basically no useful way (for me at least) to learn Rails on the web. All you have are these whizz-bang follow-along tutorials, which don’t ever give you a complete picture of what’s going on, and the API documentation, which is useful as a reference but horribly painful to try to learn from. It’s hell or high water — either handwaving la-la on the one hand, or details so nitty-gritty that you’ve got to be a lot more of a propellerhead than I am to use them for learning.

Anyone who’s got the Rails book I mentioned can already see why I’m an idiot. It turns out that the latter half of the book, after all that whizz-bang la-la tutorial, is exactly what I needed. It sets out very clearly and comprehensibly what all the various parts of Rails are, how they fit together, what you can do with them, giving you enough details to clearly understand what you can do with each piece, but organizing those details into a comprehensible presentation.

And I’ve owned this book the whole time and I didn’t realize that it contained exactly what I needed to have to learn Rails.

So I’m not a moron who can’t learn what’s supposed to be the easiest web framework in the world in my favorite language in the world, I’m an idiot who was trying to learn it with all the wrong resources. Or vice versa.

I’m glad I got that sorted out.