Why is it that when I lie down and try to get some rest, I keep coming up with these great ideas or things I want to post to the blog? Grr. I came up with a comic idea in great detail, and this nifty programming idea also invaded my brain fully formed while I was lying in a state of flu- and fever-induced lassitude and misery.
OK, this is something that’s within my programming abilities and would be cool, but I will probably never get around to doing it, so here’s the idea in case anybody wants to mess with it.
It’d be called ‘tagrepo’. It is based on the ideas I discussed here, about filesystems being dubious from a usability standpoint. It’s basically ‘flickrizing’ or ‘del.icio.usizing’ a Linux subdirectory.
A tagrepo is implemented as a simple directory, which contains a database (say, a sqlite db) and a subdirectory which stores the files you put into it.
When you store a file in the tagrepo, you give it any tags you want. It renames the file to a uuid or content hash or something, and records the file’s new name and file extension in the database along with other relevant goodies like the last modified date and perhaps the date of insertion, and most important of all, the tags.
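Here’s a rough sketch of what the storing step might look like in Python with sqlite. Everything named here is my own assumption for illustration: the `tagrepo.db` filename, the `files` subdirectory, the table layout, and the `open_repo` and `add_file` helpers. It uses a SHA-256 content hash as the stored name, which is only one of the options mentioned above.

```python
import hashlib
import os
import shutil
import sqlite3
import time

# Hypothetical schema: one row per stored file, one row per (file, tag) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    stored_name TEXT UNIQUE NOT NULL,
    ext TEXT NOT NULL,
    mtime REAL NOT NULL,
    inserted REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    tag TEXT NOT NULL,
    UNIQUE (file_id, tag)
);
"""


def open_repo(root):
    """Create the repo directory layout if needed and return a db connection."""
    os.makedirs(os.path.join(root, "files"), exist_ok=True)
    db = sqlite3.connect(os.path.join(root, "tagrepo.db"))
    db.executescript(SCHEMA)
    return db


def add_file(db, root, path, tags):
    """Copy a file into the repo under its content hash and record its tags."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    ext = os.path.splitext(path)[1]
    mtime = os.path.getmtime(path)
    shutil.copy2(path, os.path.join(root, "files", digest))
    with db:  # commit on success, roll back on error
        # Identical content added twice maps to the same row; tags are merged.
        db.execute(
            "INSERT OR IGNORE INTO files (stored_name, ext, mtime, inserted) "
            "VALUES (?, ?, ?, ?)",
            (digest, ext, mtime, time.time()),
        )
        file_id = db.execute(
            "SELECT id FROM files WHERE stored_name = ?", (digest,)
        ).fetchone()[0]
        db.executemany(
            "INSERT OR IGNORE INTO tags (file_id, tag) VALUES (?, ?)",
            [(file_id, tag) for tag in tags],
        )
    return file_id
```

One side effect of naming by content hash is that storing the same file twice just adds any new tags to the existing entry. A uuid would keep duplicates as separate entries instead.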
You can query the tagrepo by giving it a few tags. It will return a list of all the files which have those tags, and a list of all the other tags which those files have. (So if you query on [‘music’, ‘Britney Spears’] you’ll get a list of all files tagged with ‘music’ and ‘Britney Spears’, and ALSO all the other tags which appear on the files returned.) This is just a database query.
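To show that it really is just a database query, here’s a minimal sketch. It assumes a hypothetical `tags` table with `file_id` and `tag` columns (my invention, not a settled design) and a made-up `query` helper. The trick is to group by file and keep only the files that matched every requested tag.

```python
import sqlite3


def query(db, tags):
    """Return (file ids carrying every given tag, other tags on those files)."""
    tags = list(dict.fromkeys(tags))  # drop duplicates, keep order
    if not tags:
        raise ValueError("give at least one tag")
    marks = ",".join("?" * len(tags))
    # A file qualifies only if it matched as many distinct tags as were asked for.
    ids = [
        row[0]
        for row in db.execute(
            f"SELECT file_id FROM tags WHERE tag IN ({marks}) "
            "GROUP BY file_id HAVING COUNT(DISTINCT tag) = ? "
            "ORDER BY file_id",
            (*tags, len(tags)),
        )
    ]
    if not ids:
        return [], []
    id_marks = ",".join("?" * len(ids))
    found = {
        row[0]
        for row in db.execute(
            f"SELECT DISTINCT tag FROM tags WHERE file_id IN ({id_marks})", ids
        )
    }
    return ids, sorted(found - set(tags))


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tags (file_id INTEGER, tag TEXT)")
    db.executemany(
        "INSERT INTO tags VALUES (?, ?)",
        [
            (1, "music"), (1, "Britney Spears"), (1, "pop"),
            (2, "music"), (2, "Britney Spears"), (2, "live"),
            (3, "music"), (3, "jazz"),
        ],
    )
    print(query(db, ["music", "Britney Spears"]))
```

With the sample rows above, the query on ‘music’ and ‘Britney Spears’ returns files 1 and 2 plus the extra tags ‘live’ and ‘pop’. File 3 drops out because it only has one of the two tags.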
That’s nice and all, but this is the gravy. This is what makes it worthwhile.
You set up a FUSE filesystem interface like so. Let’s say it’s mounted at /home/ed/tagrepo. If I do an ‘ls’ on ‘/home/ed/tagrepo/’ I get the following:
A subdirectory named ‘untagged’ which contains files which have no tag, with simple number names in some useful order, with their original file extensions.
A subdirectory named ‘all’ which contains ALL THE FILES in the whole repository, with simple number names in some useful order, with their original file extensions.
A subdirectory for each tag that the system knows about.
If you go into a tagged subdirectory, say ‘music’, you get the ‘untagged’ and ‘all’ and tag subdirs, just like the top level directory, but now restricted to files which have the ‘music’ tag.
Thus, ‘/home/ed/tagrepo/music/Britney Spears’ should contain all of Britney Spears’ music in the ‘all’ subdir, the files with no further tags besides ‘music’ and ‘Britney Spears’ in the ‘untagged’ subdir, and a subdirectory for each further tag that appears on those files.
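The directory listing logic can be sketched without any actual FUSE plumbing. This is a toy version under my own assumptions: files live in a plain dict mapping a file id to an (extension, set of tags) pair, `list_dir` is a made-up helper, and ‘all’ and ‘untagged’ are treated as reserved names that can’t be used as tags. A real FUSE readdir handler would call something like this with the path relative to the mount point.

```python
def list_dir(files, path):
    """List the entries a tagrepo mount would show at `path`.

    `files` maps file id -> (extension, set of tags). The path is a chain of
    tags, optionally ending in the reserved names 'all' or 'untagged'.
    """
    parts = [p for p in path.split("/") if p]
    view = None
    if parts and parts[-1] in ("all", "untagged"):
        view = parts.pop()
    wanted = set(parts)

    # Files carrying every tag named in the path, in id order.
    matching = sorted(fid for fid, (_, tags) in files.items() if wanted <= tags)
    if parts and not matching:
        raise FileNotFoundError(path)

    if view == "untagged":
        # Only files with no tags beyond the ones already in the path.
        matching = [fid for fid in matching if files[fid][1] == wanted]
    if view is not None:
        # Simple number names, keeping the original extension.
        return [f"{n}{files[fid][0]}" for n, fid in enumerate(matching, start=1)]

    further = sorted({t for fid in matching for t in files[fid][1]} - wanted)
    return ["all", "untagged"] + further


if __name__ == "__main__":
    files = {
        1: (".mp3", {"music", "Britney Spears", "Baby One More Time"}),
        2: (".mp3", {"music", "Britney Spears"}),
        3: (".txt", set()),
        4: (".mp3", {"music", "jazz"}),
    }
    for p in ("/", "/music/Britney Spears", "/music/Britney Spears/untagged"):
        print(p, list_dir(files, p))
```

Note that the number names here are just positions in the current listing, so the same file can show up as ‘1.mp3’ in one directory and ‘2.mp3’ in another. Whether that counts as a "useful order" is an open design question.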
A nice refinement would be that if you narrowed it down to a unique file by a series of tags, it would present that file as if it were named after the last tag: e.g. ‘/home/ed/tagrepo/music/Britney Spears/Baby One More Time.mp3’ rather than just ‘/home/ed/tagrepo/music/Britney Spears/Baby One More Time/all/1.mp3’
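That refinement could be a small check made while listing a directory. A sketch, assuming the same kind of toy dict of file id to (extension, set of tags) and a made-up `entry_name` helper: if adding one more tag leaves exactly one file, show the tag as a file with that file’s extension, otherwise show it as a plain subdirectory name.

```python
def entry_name(files, parent_tags, tag):
    """Name to show for `tag` inside the directory given by `parent_tags`.

    `files` maps file id -> (extension, set of tags). If the tags narrow the
    repo down to a single file, the entry is named after the tag plus that
    file's extension; otherwise it is just the tag (a subdirectory).
    """
    wanted = set(parent_tags) | {tag}
    exts = [ext for ext, tags in files.values() if wanted <= tags]
    if len(exts) == 1:
        return tag + exts[0]
    return tag


if __name__ == "__main__":
    files = {
        1: (".mp3", {"music", "Britney Spears", "Baby One More Time"}),
        2: (".mp3", {"music", "Britney Spears"}),
    }
    print(entry_name(files, ["music", "Britney Spears"], "Baby One More Time"))
    print(entry_name(files, ["music"], "Britney Spears"))
```

The catch is that an entry flips from file to directory as soon as a second file picks up the same tags, which might confuse programs that remember paths.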
You could combine this basic setup with all kinds of extra grooviness — automated tagging based on ID3 metadata, MusicBrainz lookups, all kinds of things.
The basic idea I’d think you could get in place with less than a day of coding though.
I’m pretty sure I’m never going to get around to it — anybody want to make this magic happen?