Sick Creativity: tagrepo

Why is it that when I lie down and try to get some rest I keep coming up with these great ideas or things I want to post to the blog? Grr. Came up with a comic idea in great detail, and also this nifty programming idea invaded my brain fully formed while I was lying in a state of flu- and fever-induced lassitude and misery.
OK, this is something that’s within my programming abilities and would be cool but I will probably never get around to doing it so here’s the idea in case anybody wants to mess.

It’s something you could easily implement in Ruby or Python or Perl, but it’d only really work well on Linux, because it depends on FUSE (Filesystem in Userspace). There are FUSE bindings for many languages.

It’d be called ‘tagrepo’. It is based on the ideas I discussed here, about filesystems being dubious from a usability standpoint.  It’s basically ‘flickrizing’ or ‘del.icio.usizing’ a Linux subdirectory.

Implementation

A tagrepo is implemented as a simple directory, which contains a database (say, a sqlite db) and a subdirectory which stores the files you put into it.

When you store a file in the tagrepo, you give it any tags you want. The tagrepo renames the file to a uuid or content hash or something, and records the new name and the original file extension in the database, along with other relevant goodies like the last modified date, perhaps the date of insertion, and most important of all, the tags.
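
Here is a rough sketch of what that storing step might look like in Python with sqlite. Everything named here is an assumption made up for illustration: the `files` and `tags` tables, the `tagrepo.db` filename, the `files/` subdirectory, and the choice of SHA-256 as the content hash. It also copies the file in rather than renaming it, just to keep the sketch non-destructive.

```python
import hashlib
import os
import shutil
import sqlite3
import time


def open_repo(repo_dir):
    """Create (if needed) and open a tagrepo directory; returns a sqlite connection."""
    os.makedirs(os.path.join(repo_dir, "files"), exist_ok=True)
    db = sqlite3.connect(os.path.join(repo_dir, "tagrepo.db"))
    # Hypothetical schema: one row per stored file, one row per (file, tag) pair.
    db.executescript("""
        CREATE TABLE IF NOT EXISTS files (
            id INTEGER PRIMARY KEY,
            stored_name TEXT UNIQUE,
            extension TEXT,
            modified REAL,
            inserted REAL
        );
        CREATE TABLE IF NOT EXISTS tags (
            file_id INTEGER REFERENCES files(id),
            tag TEXT,
            UNIQUE (file_id, tag)
        );
    """)
    return db


def store_file(db, repo_dir, path, tags):
    """Copy `path` into the repo under its content hash and record its tags."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    extension = os.path.splitext(path)[1]
    shutil.copyfile(path, os.path.join(repo_dir, "files", digest))
    # Storing identical content twice reuses the existing row.
    db.execute(
        "INSERT OR IGNORE INTO files (stored_name, extension, modified, inserted) "
        "VALUES (?, ?, ?, ?)",
        (digest, extension, os.path.getmtime(path), time.time()),
    )
    file_id = db.execute(
        "SELECT id FROM files WHERE stored_name = ?", (digest,)
    ).fetchone()[0]
    db.executemany(
        "INSERT OR IGNORE INTO tags (file_id, tag) VALUES (?, ?)",
        [(file_id, tag) for tag in tags],
    )
    db.commit()
    return file_id
```

Using a content hash as the stored name means the same file stored twice just picks up more tags; a uuid would keep the two copies separate instead.
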
You can query the tagrepo by giving it a few tags. It will return a list of all the files which have those tags, and a list of all the other tags which those files have. (So if you query on [‘music’, ‘Britney Spears’] you’ll get a list of all files tagged with ‘music’ and ‘Britney Spears’, and ALSO all the other tags which appear on the files returned.) This is just a database query.
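
To make "just a database query" a bit more concrete, here is one way it might go, assuming a hypothetical `tags` table with one row per (file_id, tag) pair. The table layout and the `query` function name are my own inventions for the sketch, and it assumes you pass at least one tag.

```python
import sqlite3


def query(db, tags):
    """Return (file_ids, other_tags) for files that carry every tag in `tags`."""
    tags = list(dict.fromkeys(tags))  # drop duplicate tags, keep order
    if not tags:
        raise ValueError("query needs at least one tag")
    marks = ",".join("?" * len(tags))
    # A file matches when it has a row for each of the requested tags.
    matching = (
        f"SELECT file_id FROM tags WHERE tag IN ({marks}) "
        "GROUP BY file_id HAVING COUNT(DISTINCT tag) = ?"
    )
    file_ids = [
        row[0]
        for row in db.execute(matching + " ORDER BY file_id", (*tags, len(tags)))
    ]
    # The other tags are whatever else appears on the matching files.
    other_tags = [
        row[0]
        for row in db.execute(
            f"SELECT DISTINCT tag FROM tags WHERE file_id IN ({matching}) "
            f"AND tag NOT IN ({marks}) ORDER BY tag",
            (*tags, len(tags), *tags),
        )
    ]
    return file_ids, other_tags


# Tiny in-memory demo with made-up data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tags (file_id INTEGER, tag TEXT, UNIQUE (file_id, tag))")
db.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [
        (1, "music"), (1, "Britney Spears"), (1, "pop"),
        (2, "music"), (2, "Britney Spears"), (2, "1999"),
        (3, "music"), (3, "jazz"),
    ],
)
files, others = query(db, ["music", "Britney Spears"])
print(files, others)
```
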

FUSE Layer

That’s nice and all, but this is the gravy, this is what makes it worthwhile.

You set up a FuseFS interface like so.  Let’s say it’s mounted at /home/ed/tagrepo.  If I do an ‘ls’ on ‘/home/ed/tagrepo/’ I get the following:

A subdirectory named ‘untagged’ which contains files which have no tag, with simple number names in some useful order, with their original file extensions.

A subdirectory named ‘all’ which contains ALL THE FILES in the whole repository, with simple number names in some useful order, with their original file extensions.

A subdirectory for each tag that the system knows about.

If you go into a tagged subdirectory, say ‘music’, you get the ‘untagged’ and ‘all’ and tag subdirs, just like the top level directory, but now restricted to files which have the ‘music’ tag.

Thus, ‘/home/ed/tagrepo/music/Britney Spears’ should contain all of Britney Spears’ music in the ‘all’ subdir, the files with no further tags besides ‘music’ and ‘Britney Spears’ in the ‘untagged’ subdir, and a subdirectory for each further tag that appears on those files.
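
The listing logic for any directory in the tree could be sketched like this, leaving the actual FUSE plumbing out entirely. The in-memory `files` dict (file id mapped to an extension and a set of tags) stands in for the database, and `list_tag_dir` is a made-up name; numbering files by id order is just one guess at "some useful order".

```python
def list_tag_dir(files, path_tags):
    """List the virtual directory reached by walking `path_tags`.

    `files` maps a file id to (extension, set_of_tags).
    """
    wanted = set(path_tags)
    # Only files carrying every tag in the path are visible here.
    matching = {
        fid: (ext, tags) for fid, (ext, tags) in files.items() if wanted <= tags
    }

    def numbered(ids):
        # Simple number names: 1.ext, 2.ext, ... in file id order.
        return [f"{n}{files[fid][0]}" for n, fid in enumerate(sorted(ids), start=1)]

    # 'untagged' holds files with no tags beyond the ones already in the path.
    untagged = [fid for fid, (_, tags) in matching.items() if tags == wanted]
    # Every further tag on the matching files becomes a subdirectory.
    further = sorted(set().union(*(tags for _, tags in matching.values())) - wanted)
    return {"all": numbered(matching), "untagged": numbered(untagged), "tags": further}


# Made-up repository contents for the example.
files = {
    1: (".mp3", {"music", "Britney Spears"}),
    2: (".mp3", {"music", "Britney Spears", "Baby One More Time"}),
    3: (".ogg", {"music"}),
    4: (".txt", set()),
}
print(list_tag_dir(files, ["music", "Britney Spears"]))
```

With an empty path this gives the top-level directory, where ‘untagged’ really does mean files with no tags at all.
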

A nice refinement would be that if you narrowed it down to a unique file by a series of tags, it would present that file as if it were named after the last tag: e.g. ‘/home/ed/tagrepo/music/Britney Spears/Baby One More Time.mp3’ rather than just ‘/home/ed/tagrepo/music/Britney Spears/Baby One More Time/all/1.mp3’.
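
That refinement is probably only a few lines on top of the tag lookup. A minimal sketch, again using a hypothetical in-memory `files` dict (file id mapped to an extension and a set of tags) and a made-up function name:

```python
def leaf_name(files, path_tags):
    """Return 'LastTag.ext' if `path_tags` select exactly one file, else None."""
    wanted = set(path_tags)
    # Extensions of the files that carry every tag in the path.
    hits = [ext for ext, tags in files.values() if wanted <= tags]
    if path_tags and len(hits) == 1:
        return path_tags[-1] + hits[0]
    return None


# Made-up repository contents for the example.
songs = {
    1: (".mp3", {"music", "Britney Spears"}),
    2: (".mp3", {"music", "Britney Spears", "Baby One More Time"}),
}
print(leaf_name(songs, ["music", "Britney Spears", "Baby One More Time"]))
```
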

You could combine this basic setup with all kinds of extra grooviness — automated tagging based on ID3 metadata, MusicBrainz lookups, all kinds of things.

I’d think you could get the basic idea in place with less than a day of coding, though.

I’m pretty sure I’m never going to get around to it — anybody want to make this magic happen?

7 thoughts on “Sick Creativity: tagrepo”

  1. That would kick some serious ass. And I think you should make time for it Ed. It could be the killer app that makes Linux a must have OS for a lot of people. (Put that on your resume and smoke you some cigarettes rolled from Hamiltons.) You’d need UI integration, such that save dialog boxes prompted for tags rather than filenames.

  2. That’s the genius thing about FUSE, paul — setting it up that way would give you this shit for free. If you save a file as ~/tagrepo/foo/bar/baz/funky.mp3, then it would tag that file with the tags ‘foo’ ‘bar’ ‘baz’ and ‘funky’. MAGIC.

    The genius thing about FUSE is that it essentially gives you the UI integration for free by allowing your storage system to mimic a traditional filesystem.

  3. No kidding. So, slash delimited tag creation? What if I save a file as:

    ~/tagrepo/My Life with Master/RPG/Half Meme Press.txt

  4. …which matches your Britney Spears example, except del.icio.us has me conditioned to not expect to use spaces in my tags.

  5. bwahahah. exactly.

    There may be big problems I haven’t foreseen of course, but it seems pretty straightforward.

    The only downside to using conventional open/save dialogs is that this filesystem should allow you to save to directories which don’t exist yet, whereas conventional filesystems wouldn’t. That might badly confuse open/save dialogs. That may be a big problem or not, I don’t know.

  6. I wonder if for usability reasons it would be better to reduce all tags to lowercase single words, you know, just so you don’t have to wonder whether you tagged something as ‘my life with master’ or ‘mylifewithmaster’. Is that a *feature* of del.icio.us or a limitation??

  7. To me, it’s a limitation. So I have del.icio.us tags like:

    britney.spears

    You don’t want tags like britneyspears, because it complicates what the user has to do with boolean operators when searching across your tag set. When the user wants to return their whole collection of music by artists named Britney, they want to search on “Britney”.

    And then when you implement a tag cloud, you want a big, aesthetically pleasing “Britney Spears” tag rather than a big “britneyspears” tag.
