2009-01-01

Must... Organize... Music Collection...

Foreign characters... OMG, foreign characters...

I've never cared to blog about what my cat is doing that is so hilarious or some random celebrity and how awesome or full of suck and fail he/she is, so blogging has not been an interest of mine until, well... now. For now I have a problem, and that problem is ID3 tags with foreign characters in a Ruby script that I eventually intend to use to manage my entire music library.

What got me started? I need order and structure in my file system. I need my music to be all neat and tidy and high quality and a few terabytes here and there and... Uh...

Anyway, you know how you acquire a song from the great and terrible internets and it's all like "track# - song title" or the same thing without the dash, or it's just named all wrong in some other fashion, so without looking at the tag info you have no idea who the artist is? If you have a bunch of these in one folder and you don't have the option of sorting by the artist listed in the tag, it's just an organizational nightmare.

Options... iTunes? Bleh. Sure, you could just load all the files into your favourite library-based music player and see order there, but that isn't enough for me. iTunes has the option of organizing your library for you, but there is no way (that I know of) to customize how it does this. I want a structure that looks like this...

.../Artist/[Year - ]Album/Artist - ## - Song Title.[mp3/ogg/m4a/flac]

Let's say I want to throw songs on some form of portable media and I don't need whole albums. I can copy the songs to the media and not have to worry about the sorting problems of files that start with the track number, such as...

01 - Artist A's song
01 - Artist B's song
02 - Artist A's other song
etc

Instead I'd have...

Artist A - 01 - Title
Artist A - 02 - Title
Artist B - 01 - Title
etc
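
The naming scheme above can be sketched in a few lines of Ruby. This is only an illustration: `target_file_name` is a made-up helper, and it assumes the tag fields have already been read and cleaned up.

```ruby
# Hypothetical helper: builds "Artist - ## - Title.ext" from tag fields.
# The track part is left out when the track number is missing or zero.
def target_file_name(artist, track, title, ext)
  number = track.to_i
  track_part = number.zero? ? "" : format("%02d - ", number)
  "#{artist} - #{track_part}#{title}#{ext}"
end

names = [
  target_file_name("Artist B", 1, "Title", ".mp3"),
  target_file_name("Artist A", 2, "Title", ".mp3"),
  target_file_name("Artist A", 1, "Title", ".mp3"),
]

# A plain alphabetical sort now groups by artist first, then by track.
puts names.sort
```

Because the artist comes first and the track number is zero-padded, any file browser that sorts by name keeps each artist's songs together and in order.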

I had many files with the track-number-first problem and I didn't want to spend the next three years organizing them into Artist/Album folders by hand, so I wrote a script! But it has serious bugs, so don't try it till it's fixed.


require "rubygems"
require "id3lib"
require "fileutils"

def mv_mp3
  # This starts in the current folder and finds all the mp3 files in it and its subdirectories.
  files = `dir /s /b *.mp3`.split("\n")
  # The gsub here is a Windows-only thing to replace the / with a \ as a folder separator.
  album_base = File.expand_path(".\\").gsub(/[\/]/, "\\") + "\\"

  files.each do |file| # Test and rename loop for all mp3s found.
    file_name = File.split(file)[1] # This returns the filename without the directory.

    # This is a test to weed out files that are still being downloaded.
    unless file_name[0..10] == "INCOMPLETE~"

      # This is the best ID3 reader/editor lib I've found so far,
      # and it's where my foreign char problem comes from.
      tag = ID3Lib::Tag.new(file)

      # Have to check if the most fundamental tags are available,
      # otherwise we leave the file alone.
      if tag.artist and tag.title

        # Sometimes MP3 tags end up with a bunch of trailing spaces
        # after the actual text, and this is a problem for the script, so
        # this solves that problem. sub! edits the string in place and
        # copes with an empty or all-space tag.
        tag.artist.sub!(/ +\z/, "")
        tag.album.sub!(/ +\z/, "") if tag.album
        tag.title.sub!(/ +\z/, "")

        # If spaces were removed, update the file's tags.
        tagcomp = ID3Lib::Tag.new(file)
        tag.update! if tag.artist != tagcomp.artist or tag.album != tagcomp.album or tag.title != tagcomp.title

        track = "%02d" % tag.track.to_i # Returns track number with leading zero for naming purposes.

        # I just noticed some overkill here... still checking for artist...
        # Anyway, this readies the Artist/Album folder names based on the tags.
        # The gsubs remove illegal filename chars (more yet to do...)
        audio_file_dir = ((tag.artist ? tag.artist : "Unknown Artist") + "\\" + (tag.album ? tag.album : "Singles") + "\\").gsub(/[:]/, " -").gsub(/[.?\"]/, "").gsub(/[\/]/, "-")

        # This does the same as above but for the intended name of the file.
        # There is a nasty bug in this area involving case sensitivity. Details at the end.
        audio_file_name = (tag.artist + " -" + (track != "00" ? " " + track + " - " : " ") + tag.title + File.extname(file)).gsub(/[:]/, " -").gsub(/[?\"]/, "").gsub(/[\/]/, "-")

        work_path = File.split(file)[0] + "\\" # Where is the file now?
        target_dir = album_base + audio_file_dir # Where should it end up?
        target = target_dir + audio_file_name

        # This is to show you what it intends to rename the file to.
        # I should really have a yes/no prompt after it...
        puts work_path + file_name + "\n>> " + target + "\n"

        # Before we can move and rename the file, the folder structure must be created
        # provided it doesn't already exist.
        # And again, this is part of the nasty case sensitivity bug.
        unless work_path == target_dir
          FileUtils.mkpath(target_dir)
          puts "Artist/Album folders created.\n"
        end

        # The final piece of the cs bug is here.
        # This actually moves and renames the file.
        # The bug... Let's say you have a folder /Alice in Chains/Album/Song
        # and the tag info for artist is "Alice In Chains". Note the capitalization difference.
        # This difference can occur in artist, album or title and it all yields the same result...
        # The script stops with an error and the file it was moving vanishes.
        # This wouldn't happen in a case sensitive file system, but in Winblows, it's a problem.
        # I already have ideas to fix this, but as is, this is a danger.
        if file == target
          puts "Already ready already!\n\n"
        else
          # FileUtils.mv raises if the move fails, so getting past it means success.
          FileUtils.mv(file, target)
          puts "Moved and renamed.\n\n"
        end

      else # No artist? No title? Leave it alone!
        puts "Tag issues with " + file_name + ". File unchanged." + "\n\n"
      end # if
    end # unless
  end # each

end

mv_mp3
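
One possible way around the case sensitivity bug described in the comments is to reuse whatever spelling already exists on disk instead of trusting the capitalization in the tag. The sketch below is only that, a sketch: `existing_name` is a hypothetical helper, it has not been tried against the vanishing-file problem itself, and it needs a newer Ruby than the script does (`Dir.children` and `String#casecmp?`).

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: returns the name of an existing entry in `parent`
# that matches `wanted` when case is ignored, or `wanted` itself if
# nothing matches (or `parent` does not exist yet).
def existing_name(parent, wanted)
  return wanted unless File.directory?(parent)
  Dir.children(parent).find { |entry| entry.casecmp?(wanted) } || wanted
end

# Demo in a throwaway folder: the folder on disk says "in",
# the tag says "In", and the folder's spelling wins.
Dir.mktmpdir do |base|
  FileUtils.mkpath(File.join(base, "Alice in Chains"))
  puts existing_name(base, "Alice In Chains")
end
```

The same lookup could be applied in turn to the artist folder, the album folder and the file name, so that the "is it already in place?" comparison sees identical strings when only the capitalization differs.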


The bug I have no idea how to fix is this. Non-American letters, letters with some sort of accent mark, cause unreadable output in the ID3Lib::Tag object. The System of a Down song, "Nügun", produces "\000N\xxx\374\000g\000u\000n" where xxx is either 000 or a number I can't remember. Iconv seems to hate this. It gives me an Illegal Sequence error, but maybe I'm doing it wrong.

Iconv.iconv("LATIN1", "UTF-8", tag.title) # tag.title is the odd string above.
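
A hedged guess at what is going on: `Iconv.iconv` takes the target encoding first and the source encoding second, so the call above claims the input is UTF-8, and a lone \374 byte is not valid UTF-8. If the unknown xxx byte is 000 (an assumption), the string is a zero byte in front of every character, which is what big-endian UTF-16 looks like. Iconv has since been dropped from Ruby's standard library, so this sketch uses the built-in String encoding methods instead. With Iconv the equivalent would presumably be `Iconv.iconv("UTF-8", "UTF-16BE", tag.title)`.

```ruby
# Assumption: the "xxx" byte is 000, which makes this big-endian UTF-16.
# String#b gives a raw binary copy of the bytes the tag library returned.
raw = "\x00N\x00\xFC\x00g\x00u\x00n".b

# Label the bytes with their real encoding, then transcode to UTF-8.
title = raw.force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8)

puts title # prints "Nügun"
```

If xxx turns out to be something other than 000, the encoding guess is wrong and the bytes need a closer look before converting.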

Someone please give me pointers as to how to use UTF-8, and/or a better Tag reading lib. I'd like to be able to use this script on Vorbis and m4a files as well.

I shall continue to google, but help is always appreciated.
