pwhash in Ruby

March 29th, 2009

I spent some time this weekend re-implementing my pwhash functionality in ruby. I don't have much experience with ruby. I got some exposure to it when doing some work for johnlook a while back, but when writing this code it became apparent that I had some gaps in my knowledge.

Learning new programming languages is an interesting thing to do. I've done it a few times now and if the language is good it gives you a few new perspectives and new ideas on how to be a better programmer. I must say that ruby is a nice acquaintance. The learning curve is a bit steeper than with languages like python (or maybe I'm just getting old) but many things are elegant and I hope to get to work more with it in the future.

Anyway, without any further ado I give you pwhash.rb. Feel free to use it in any way that is compatible with GPL3. I'm fully aware that I have yet to master the style and details of ruby, so if you have any criticisms or ideas on how to improve upon it, feel free to drop me a line.

pwhash, password hashing in java

March 28th, 2009

As promised, here is the code to a Java implementation of the principles of password hashing that I outlined in my previous post. I'll put it on a proper project page later on, but for now the full distribution can be downloaded as pwhash-0.9.zip, the binary jar can be found as pwhash-0.9.jar and the source code with documentation can be found at PasswordHasher.java.

Included in the distribution is also a Base64 implementation, Base64.java, that I wrote. Why Sun hasn't included one in Java from the very beginning is a mystery to me. My implementation might not be the fastest or the most robust one around but it is quite readable and performs okay.

Sun’s Java MP3 plugin is no friend of ID3 tags

January 13th, 2009

The Java programming environment has a framework for reading audio files that is extensible and can handle new audio formats not originally supported by the standard platform. It turns out that Sun has released a plugin that adds playback support for the popular MP3 audio format. However, a few days back I learned that Sun's plugin doesn't seem to recognize the Bible MP3 files that we sell at Voxbiblia.

When looking into the problem I found out that the audio format identification functionality doesn't play well with the ID3v2 metadata format that we use at Voxbiblia. The ID3 tag enables users to organize our MP3 files into whole books or even whole Bibles on their computers or portable MP3 players, and it even enables them to read the actual Bible text of the passage recorded in a specific audio segment. As you might imagine the ID3 functionality is quite useful, and also close to universally used and accepted in popular playback products such as iTunes and Windows Media Player, so I find it surprising that Sun's MP3 plugin doesn't at least support it by skipping over the tag.

However, it is not impossible to do just that yourself. If you open the MP3 file yourself and skip forward past the end of the tag before you hand over the InputStream to the AudioSystem framework, it can identify and decode the MP3 stream correctly.

So, I wrote some code that does just that. It's a little bit more tricky than you might think, because the ID3v2 format encodes the tag length in an unusual way (four bytes that each carry only seven bits of the value), but other than that the strategy is quite straightforward.

The class can be downloaded here: MP3Identifier.java. Enjoy.
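
For readers who just want the gist, here is a minimal sketch of the skipping strategy. It is not the code from MP3Identifier.java; the class and method names are made up for illustration, and it assumes the ID3v2 header layout: the three bytes "ID3", two version bytes, one flags byte, and four size bytes with seven significant bits each. The size excludes the 10-byte header and the optional 10-byte footer.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, written for this post as an illustration only.
final class Id3v2Skipper {
    static final int HEADER_LENGTH = 10;

    /** Decodes four size bytes that carry 7 payload bits each, most significant byte first. */
    static int synchsafe(byte[] b, int offset) {
        return (b[offset] & 0x7F) << 21
             | (b[offset + 1] & 0x7F) << 14
             | (b[offset + 2] & 0x7F) << 7
             | (b[offset + 3] & 0x7F);
    }

    /** Returns the total tag length in bytes (header, body and footer), or 0 if there is no tag. */
    static int tagLength(byte[] header) {
        if (header.length < HEADER_LENGTH
                || header[0] != 'I' || header[1] != 'D' || header[2] != '3') {
            return 0;
        }
        // Bit 4 of the flags byte signals an extra 10-byte footer after the tag body.
        int footer = (header[5] & 0x10) != 0 ? 10 : 0;
        return HEADER_LENGTH + synchsafe(header, 6) + footer;
    }

    /** Returns a stream positioned at the first byte after the ID3v2 tag, if there is one. */
    static InputStream skipTag(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(HEADER_LENGTH);
        byte[] header = new byte[HEADER_LENGTH];
        int n = 0;
        while (n < HEADER_LENGTH) {
            int r = in.read(header, n, HEADER_LENGTH - n);
            if (r < 0) break;
            n += r;
        }
        long remaining = (n == HEADER_LENGTH) ? tagLength(header) : 0;
        in.reset(); // rewind so that a stream without a tag is returned untouched
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                if (in.read() < 0) break; // end of stream inside the tag
                skipped = 1;
            }
            remaining -= skipped;
        }
        return in;
    }
}
```

The stream returned by skipTag() is a BufferedInputStream, so it supports mark and reset and should be suitable for handing over to AudioSystem.getAudioInputStream(InputStream).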

The accidental thief

December 29th, 2008

It turns out that the name of the software I released on Saturday was a bit too good, so good in fact that someone had thought of it before. There is a much more ambitious piece of software that does the same thing that my software does. To avoid confusion I have renamed mine, so Voxbiblia's jid3 is now id3j. I'm sorry about any confusion.

Sharing my work

December 27th, 2008

Today I have released a piece of software that I have written as free software. It's a small library that is used to generate the information stuck to an MP3 file that tells your MP3 player what you are listening to, something called an ID3 tag. The software is mostly simple, but it uses some of the advanced CRC32 reversal stuff that I blogged about a while back, so it has some neat features if you're in the business of creating dynamic metadata for audio files stored in dynamically created zip files. My prediction is that very few people will actually use that part, but the standard ID3 generation functionality is probably useful to some.

Anyway, I'm really happy to be able to give back small pieces of software that I have written to the free software community. My dream is to some day write a piece of software that grows its own community around it, with other people contributing new functionality and fixes to problems. I hope that some day that dream will come true. For now it's just me publishing small parts of the software I write. Not all that bad.

For interested parties the information about my library, named id3j, is found at Voxbiblia's Free software page.

Update:
Changed the name of the released package, because of a naming collision.

From Ruby on Rails to Java

December 20th, 2008

I have been tasked with making some extensions and changes to a system written in Ruby on Rails with some parts written in Java. For reasons I will not write about at this time we have decided that we want to move away from Ruby on Rails and instead develop using java and the excellent Spring framework.

Leaving one development framework for another is often a painful experience, where lots of code needs to be thrown away and rewritten in one step. In this particular situation, that kind of massive rewrite and technology change is something we wish to avoid, and here is some info on how we plan to do that.

The first step was to move both the java and the ruby environment into the same namespace from the point of view of the web browser. Since we already use the Apache httpd as a front end for the Ruby environment (executing it via mod_fcgid), this was a simple matter of configuring the same apache httpd instance to proxy all requests destined for the java environment. That way we can make relative links from pages created in the ruby environment to pages in the java environment.

The next problem to tackle was session management. More specifically, when a user logged in using Ruby on Rails and then navigated to a page served by the java environment, we needed to propagate the information about the logged in user to the java code, hopefully without making any drastic changes to the ruby code.

It turns out that doing that was not that difficult. The way our Ruby on Rails environment is configured, it uses Active Record to store session information in the database, associated with a specific cookie in the web browser. Since both ruby and java live on the same web server from the point of view of the browser, any cookies set by rails are also sent by the browser as a part of the requests that end up in the java environment.

I have then written code that uses the cookie to look up the session data in the database from java. That data consists of Base64-encoded binary data produced by the Marshal.dump() facility of Ruby. It turns out that it is not particularly difficult to parse the relevant bits from that data. Looking for the ASCII string of the key of our user id value in the session object in Ruby, then looking for the next double quote char and parsing all ASCII digits before the next double quote occurs seems to be a fully workable solution.

The code to do this can be looked at in RailsSessionIntegrationHelper.java. If you want to get rid of the Spring Framework dependency, just replace the method getSessionData() with one that does raw JDBC instead, and remember to close your connection when you're done.
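
To illustrate the parsing step only, here is a minimal sketch of the scan described above. It is not the code from RailsSessionIntegrationHelper.java. The class name, the method name and the example key "user_id" are assumptions made for illustration, and it uses java.util.Base64, which requires Java 8 or later. The MIME decoder is used because it tolerates line breaks in the encoded data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; the session key name is whatever your Rails code stores.
final class RailsSessionSketch {

    /** Returns the digits found in the string value that follows the key, or -1 if none are found. */
    static long extractUserId(String base64Session, String key) {
        byte[] data = Base64.getMimeDecoder().decode(base64Session);
        byte[] needle = key.getBytes(StandardCharsets.US_ASCII);

        int pos = indexOf(data, needle);
        if (pos < 0) {
            return -1;
        }
        pos += needle.length;

        // Move to the double quote that opens the value following the key.
        while (pos < data.length && data[pos] != '"') {
            pos++;
        }
        pos++;

        // Collect ASCII digits until the next double quote (or the end of the data).
        // Non-digit bytes in between, such as a length prefix, are ignored.
        long id = 0;
        boolean seenDigit = false;
        for (; pos < data.length && data[pos] != '"'; pos++) {
            byte c = data[pos];
            if (c >= '0' && c <= '9') {
                id = id * 10 + (c - '0');
                seenDigit = true;
            }
        }
        return seenDigit ? id : -1;
    }

    /** Naive byte-array search; returns the index of the first match or -1. */
    private static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}
```

This is a heuristic scan, not a Marshal parser. It only works as long as the user id is stored as a string of digits and the key name does not appear earlier in the session data.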

Image resizing in java

December 20th, 2008

Doing good image resizing in your favourite software development environment shouldn't be hard. After all, lots and lots of software that has a graphical user interface of some kind needs to do image resizing. However, in java it isn't as easy to do as it should be. I had some resizing needs, more specifically I needed code to resize a large black-and-white image into thumbnail size.

After googling a bit I came up with a recipe from the official Java 2D FAQ at sun.com and used that. After all, the creators of Java should know how to do it right. However, I was surprised to find out that the visual result of the resizing was terrible. Have a look for yourself:

A black and white letter a, 310 pixels wide

I started out with this letter a in black and white. When resizing that one to a version 40 pixels wide with the code that Sun suggests, it looks like this when magnified:

The letter a resized to 40 pixels wide with Sun's bilinear code, magnified

As you see, there is some grayscale smoothing going on but not much, and the overall impression is quite jagged. Before anyone asks: yes, I'm using the bilinear interpolation option. Compared to the result when resizing in GIMP, ImageMagick or just about any other tool the result is terrible. So I experimented, googled and looked at code here and there.

I was on my way to accepting that java just didn't do this, and to resorting to calling a command line tool from my web application, when I found a piece of code that compares the speed and results of different scaling methods. It turns out that if you use the getScaledInstance() method of the java.awt.Image base class with Image.SCALE_SMOOTH as the last parameter, the result looks much better.

Here is a blown up version of a 40 pixels wide rescaling using that method instead:

The letter a resized to 40 pixels wide with getScaledInstance(), magnified

Ah, much better. Why is this? I don't know. If there is anyone out there who can give me details on why this is, I'm more than happy to be educated.

So, if you want to copy my method, please have a look at ImageResizer.java. The version calling a verbatim copy of the suggested solution from Sun's FAQ is in the method sunResize() and the better looking version is in resize().
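
For quick reference, here is a minimal sketch of the getScaledInstance() approach. It is not a copy of ImageResizer.java; the class name, the method signature and the choice of a grayscale result image are assumptions made for this example.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;

// Hypothetical helper, written as an illustration only.
final class SmoothResizer {

    /** Scales the source to the given width, keeping the aspect ratio. */
    static BufferedImage resize(BufferedImage source, int targetWidth) {
        int targetHeight = Math.max(1,
                Math.round(source.getHeight() * (float) targetWidth / source.getWidth()));

        // SCALE_SMOOTH asks for a scaling algorithm that favours quality over speed.
        Image scaled = source.getScaledInstance(targetWidth, targetHeight, Image.SCALE_SMOOTH);

        // getScaledInstance() returns a plain Image, so draw it into a BufferedImage
        // to get something that can be written out with ImageIO or inspected.
        BufferedImage result =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = result.createGraphics();
        try {
            g.drawImage(scaled, 0, 0, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

Calling SmoothResizer.resize(image, 40) would produce a 40 pixels wide thumbnail like the one discussed above. TYPE_BYTE_GRAY fits a black-and-white source; for colour images another image type such as TYPE_INT_RGB would be the natural choice.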

A CRC-32 reversal implementation

August 5th, 2008

(This post is mainly intended for fellow geeks. Non-programming readers of this blog may safely ignore this, I promise you won't be missing out on anything useful.)

The CRC-32 checksum algorithm is everywhere. It is used to detect random data corruption in ethernet packets, PNG graphics files and, as it turns out, zip files. The basic idea is that a 4 byte checksum of a sequence of bytes of arbitrary length is calculated using a specific algorithm. The sender of a message constructs a checksum of the message and sends it along with the actual data, and then the receiver can compare the checksum with its own checksum calculated from the payload data and use that information to detect random data corruption.

The algorithm is designed to be fast and easy to implement in hardware. You can view it as a machine that you feed bytes of data. At any time you can query the machine for a checksum value of all the data fed into the machine. Each byte fed alters its internal state so that a different 4 byte checksum value is returned. Even a single bit change in the incoming data will alter the checksum to a completely different value.
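
The machine view maps directly onto the java.util.zip.CRC32 class in the standard library. A minimal sketch (the wrapper class and method name are made up for this example):

```java
import java.util.zip.CRC32;

// Hypothetical wrapper that feeds chunks of data into the CRC-32 "machine".
final class CrcMachineDemo {

    /** Feeds the chunks in order and returns the checksum of everything fed so far. */
    static long checksum(byte[]... chunks) {
        CRC32 crc = new CRC32();
        for (byte[] chunk : chunks) {
            crc.update(chunk, 0, chunk.length); // each call advances the internal state
        }
        return crc.getValue(); // can be queried at any time
    }
}
```

Feeding the ASCII bytes of "123456789" in one go or split into several chunks gives the same result, 0xCBF43926, which is the standard check value for CRC-32.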

The algorithm is explained in great detail and with impressive clarity in A Painless Guide to CRC Error Detection Algorithms, so if you are interested, please read that document.

My conundrum with regard to CRC-32 was that I wanted to take a shortcut: calculating the resulting checksum of a short dynamic sequence of bytes (an ID3 tag) combined with a longer, static sequence of bytes (an MP3 file) without needing to re-calculate the checksum of the whole thing every time the dynamic part changed.

So, what to do? Calculating a backwards CRC-32 value for the latter part of the stream to be combined with the checksum of the dynamic part seems to be too hard a problem to solve. However, I got a tip from a friend that with a 4 byte sequence of "compensation data" the checksum can be reset to whatever value one chooses. That way, after the dynamic tag plus four well-chosen bytes, the CRC-32 machine is back in a known good state from which it can chug along and arrive at a known destination.

I got a link to a paper describing an algorithm to calculate the magic compensation bytes, but I found the description of the algorithm so complicated that I decided to figure out a way to do it on my own. So, I did. After a bit of experimenting, looking at the table based algorithm used in the GNU Classpath CRC32.java, I found a way to calculate the magic reversal bytes as a function of some current checksum as well as a desired checksum.

I have released the resulting CRC32Compensator.java under the MIT Free Software license, in the hope that someone else will find it useful or instructive.
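
To show that the trick is less magic than it sounds, here is a minimal sketch of one way to compute such compensation bytes. It is not the code from CRC32Compensator.java and may well differ from it. The class and method names are made up, and the approach assumes the standard table based CRC-32 (reflected polynomial 0xEDB88320), relying on the fact that the top byte of every table entry is unique. The idea is to run the table step backwards four times from the desired state, and then XOR the result with the current state.

```java
// Hypothetical sketch; checksums are the values returned by java.util.zip.CRC32.getValue().
final class Crc32PatchSketch {
    private static final int[] TABLE = new int[256];
    private static final int[] INDEX_BY_TOP_BYTE = new int[256];

    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
            // The top byte of each entry is unique, so it identifies the table index.
            INDEX_BY_TOP_BYTE[c >>> 24] = i;
        }
    }

    /**
     * Returns four bytes that, when fed to a CRC-32 whose checksum is currently
     * currentChecksum, make the checksum equal to desiredChecksum.
     */
    static byte[] compensation(long currentChecksum, long desiredChecksum) {
        // The internal register is the bitwise complement of the reported checksum.
        int current = ~(int) currentChecksum;
        int state = ~(int) desiredChecksum;

        // Undo four table steps, pretending that the four fed bytes were all zero.
        for (int step = 0; step < 4; step++) {
            int index = INDEX_BY_TOP_BYTE[state >>> 24];
            state = ((state ^ TABLE[index]) << 8) | index;
        }

        // The real bytes are the difference between that state and the actual one,
        // with the lowest byte fed first.
        int patch = state ^ current;
        return new byte[] {
            (byte) patch, (byte) (patch >>> 8), (byte) (patch >>> 16), (byte) (patch >>> 24)
        };
    }
}
```

In the ID3 scenario, currentChecksum would be the checksum of the dynamic tag and desiredChecksum a fixed value chosen once. The checksum of the static MP3 data continuing from that fixed state then never needs to be recomputed.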