A Java list randomizer

September 20th, 2009

This week I came across what seemed like a simple programming problem. From a large list of items I wanted to create a smaller list with random items copied from the larger list. The real-world equivalent would be something like "bring me four random records from your CD collection". Sounds simple? Well, there was an additional requirement that I put on the solution: the smaller list should contain no duplicates.

There is a Random class in Java that returns pseudorandom numbers. However, there is no guarantee that calling Random.nextInt() a number of times will not return the same number twice. In other words, the simple strategy of filling a small list with elements picked at random from the large list with the help of Random.nextInt() can produce a result list where the same element has been added twice. Not good enough.

One defense against that problem could be to keep track of which numbers the random generator has returned and simply discard any duplicates. For most cases that strategy is probably efficient, but when the small list is nearly as long as the large one it could lead to having to discard a whole bunch of random numbers. Moreover, the execution time of such a solution would differ a lot depending on which numbers Random.nextInt() returned.
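As a rough illustration, the retry strategy could be sketched like this. The class name RetrySampler and its sample() method are made up for this example, and it uses the bounded Random.nextInt(int) variant so that every number is a valid list index.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: draw random indexes and discard the ones seen before.
final class RetrySampler {
    static <T> List<T> sample(List<T> source, int count, Random random) {
        if (count < 0 || count > source.size()) {
            throw new IllegalArgumentException("count out of range");
        }
        Set<Integer> seen = new HashSet<>();
        List<T> result = new ArrayList<>(count);
        while (result.size() < count) {
            int index = random.nextInt(source.size());
            // Set.add() returns false for a duplicate, which is then skipped.
            if (seen.add(index)) {
                result.add(source.get(index));
            }
        }
        return result;
    }
}
```

The number of loop iterations is not fixed, which is exactly the unpredictable running time mentioned above.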

Another solution would be to copy all of the large list to a temporary list and remove each element taken to the smaller list from the temporary list. That way, no item would be added twice to the result list. However, if the source list is long, say a million entries, creating a temporary list holding all the million entries just to extract 100 random entries would be inefficient.
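A minimal sketch of the copy-and-remove approach, with the hypothetical class name CopySampler, might look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: copy the whole source list, then remove what is picked.
final class CopySampler {
    static <T> List<T> sample(List<T> source, int count, Random random) {
        if (count < 0 || count > source.size()) {
            throw new IllegalArgumentException("count out of range");
        }
        // This copy is the expensive part when the source list is long.
        List<T> remaining = new ArrayList<>(source);
        List<T> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // remove(int) takes the element out, so it cannot be picked again.
            result.add(remaining.remove(random.nextInt(remaining.size())));
        }
        return result;
    }
}
```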

To my mind, the elegant solution to this problem is instead to create a list of offsets into the large list, as long as the small list. The list of offsets is then modified to emulate the effect of having a copy of the large list and removing entries from it. For example, if the first entry of the offset list is 1 and the second entry is also 1, the second entry is changed to 2, because that is the element that would have been at position 1 had the original element at that position been removed.
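The offset idea can be sketched as follows. This is an illustration written for this post's description, not the contents of Randomizer.java, and the names OffsetSampler and sample() are made up. It assumes the offsets already taken are kept sorted, so each new offset can be shifted past every earlier pick at or below it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: pick without duplicates by adjusting offsets
// instead of copying the source list.
final class OffsetSampler {
    static <T> List<T> sample(List<T> source, int count, Random random) {
        if (count < 0 || count > source.size()) {
            throw new IllegalArgumentException("count out of range");
        }
        List<Integer> taken = new ArrayList<>(count); // kept in ascending order
        List<T> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // An offset into the imaginary list with i elements already removed.
            int offset = random.nextInt(source.size() - i);
            int pos = 0;
            // Each earlier pick at or below the offset shifts it one step up.
            while (pos < taken.size() && taken.get(pos) <= offset) {
                offset++;
                pos++;
            }
            taken.add(pos, offset); // insert so that the list stays sorted
            result.add(source.get(offset));
        }
        return result;
    }
}
```

The extra memory is proportional to the small list only, and the random generator is called exactly once per picked element.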

I wrote a class implementing this solution, Randomizer.java. Feel free to use it or modify it any way you wish.

pwhash, password hashing in Java

March 28th, 2009

As promised, here is the code to a Java implementation of the principles of password hashing that I outlined in my previous post. I'll put it on a proper project page later on, but for now the full distribution can be downloaded as pwhash-0.9.zip, the binary jar can be found as pwhash-0.9.jar and the source code with documentation can be found at PasswordHasher.java.

Included in the distribution is also a Base64 implementation, Base64.java, that I wrote. The fact that Sun hasn't included one in Java from the very beginning is a mystery to me. My implementation might not be the fastest or the most robust one around, but it is quite readable and performs okay.

Sun’s Java MP3 plugin is no friend of ID3 tags

January 13th, 2009

The Java programming environment has a framework for reading audio files that is extensible and able to handle new audio formats not originally supported by the standard platform. It turns out that Sun has released a plugin that adds playback support for the popular MP3 audio format. However, a few days back I learned that Sun's plugin doesn't seem to recognize the Bible MP3 files that we sell at Voxbiblia.

When looking into the problem I found out that the audio format identification functionality doesn't play well with the ID3v2 metadata format that we use at Voxbiblia. The ID3 tag enables users to organize our MP3 files into whole books or even whole Bibles on their computers or portable MP3 players, and it even enables them to read the actual Bible text of the passage recorded in a specific audio segment. As you might imagine, the ID3 functionality is quite useful, and also close to universally used and accepted in popular playback products such as iTunes and Windows Media Player, so I find it kind of surprising that Sun's MP3 plugin doesn't at least support it by skipping over the tag.

However, it is not impossible to do just that yourself. If you open the MP3 file yourself and skip forward to after the end of the tag before you hand over the InputStream to the AudioSystem framework, it can identify and decode the MP3 stream correctly.

So, I wrote some code that did just that. It's a little bit trickier than you might think, because the ID3 format encodes the tag length information in an unusual format, but other than that the strategy is quite straightforward.
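For illustration, here is a minimal sketch of the skipping step. It is not the contents of MP3Identifier.java, and the class and method names are hypothetical. It assumes the ID3v2 layout where a 10-byte header starts with "ID3" and ends with a four-byte length in which only the lower seven bits of each byte are used (a so-called synchsafe integer), and where flag bit 0x10 signals an extra 10-byte footer.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: position a stream just after a leading ID3v2 tag.
final class Id3v2Skipper {
    // Four bytes with seven significant bits each make a 28-bit length.
    static int synchsafeToInt(byte[] b, int off) {
        return ((b[off] & 0x7F) << 21) | ((b[off + 1] & 0x7F) << 14)
                | ((b[off + 2] & 0x7F) << 7) | (b[off + 3] & 0x7F);
    }

    static InputStream skipTag(InputStream in) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(in);
        byte[] header = new byte[10];
        buffered.mark(header.length);
        int read = readFully(buffered, header);
        if (read < header.length
                || header[0] != 'I' || header[1] != 'D' || header[2] != '3') {
            buffered.reset(); // no tag found, hand back the stream untouched
            return buffered;
        }
        long remaining = synchsafeToInt(header, 6);
        if ((header[5] & 0x10) != 0) {
            remaining += 10; // footer flag set: ten more bytes after the tag
        }
        while (remaining > 0) {
            long skipped = buffered.skip(remaining);
            if (skipped <= 0) {
                if (buffered.read() < 0) {
                    break; // stream ended before the tag did
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
        return buffered;
    }

    private static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }
}
```

The returned stream supports mark and reset, so it could then be passed to AudioSystem.getAudioInputStream(InputStream).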

The class can be downloaded here: MP3Identifier.java. Enjoy.

Sharing my work

December 27th, 2008

Today I have released a piece of software that I have written as free software. It's a small library that is used to generate the information stuck to an MP3 file that tells your MP3 player what you are listening to, something called an ID3 tag. The software is mostly simple, but it uses some of the advanced CRC32 reversal stuff that I blogged about a while back, so it has some neat features if you're in the business of creating dynamic metadata for audio files stored in dynamically created zip files. My prediction is that very few people will actually use this, but the standard ID3 generation functionality is probably useful to some.

Anyway, I'm really happy to be able to give back to the free software community small pieces of software that I have written. My dream is to some day write a piece of software that grows its own community around it, with other people contributing new functionality and fixes to problems. I hope that some day that dream will come true. For now it's just me publishing small parts of the software I write. Not all that bad.

For interested parties the information about my library, named id3j, is found at Voxbiblia's Free software page.

Update:
Changed the name of the released package, because of a naming collision.

From Ruby on Rails to Java

December 20th, 2008

I have been tasked with making some extensions and changes to a system written in Ruby on Rails with some parts written in Java. For reasons I will not write about at this time, we have decided that we want to move away from Ruby on Rails and instead develop using Java and the excellent Spring framework.

Leaving one development framework for another is often a painful experience, where lots of code needs to be thrown away and rewritten in one step. In this particular situation, a massive rewrite and technology change in one step is something we wish to avoid, and here is some info on how we plan to do that.

The first step was to move both the Java and the Ruby environment into the same namespace from the point of view of the web browser. Since we already use the Apache httpd as a front end for the Ruby environment (executing it via mod_fcgid), this was a simple matter of configuring the same Apache httpd instance to proxy all requests destined for the Java environment. That way we can make relative links from pages created in the Ruby environment to pages in the Java environment.

The next problem to tackle was session management. More specifically, when a user logged in using Ruby on Rails and then navigated to a page served by the Java environment, we needed to propagate the information about the logged-in user to the Java code, preferably without making any drastic changes to the Ruby code.

It turns out that doing that was not that difficult. The way our Ruby on Rails environment is configured, it uses Active Record to store session information, associated with a specific cookie in the web browser, in the database. Since both Ruby and Java live on the same web server from the point of view of the browser, any cookies set by Rails are also sent by the browser as a part of the requests that end up in the Java environment.

I have then written code that uses the cookie to look up the session data in the database from Java. That data consists of Base64-encoded binary data produced by the Marshal.dump() facility of Ruby. It turns out that it is not particularly difficult to parse the relevant bits from that data. Looking for the ASCII string of the key of our user id value in the session object, then looking for the next double quote char and parsing all ASCII digits before the next double quote occurs seems to be a fully workable solution.
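The scanning step described above could be sketched roughly like this. It is an illustration only, not the contents of RailsSessionIntegrationHelper.java: the class name, the method name and the example key "user_id" are assumptions, and the database lookup is left out. It uses java.util.Base64 from Java 8 and later, with the MIME decoder because Base64 text produced by Ruby may contain line breaks.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: pull a numeric user id out of marshalled session data
// by scanning bytes, without parsing the Marshal format as a whole.
final class RailsSessionSketch {
    // Returns the id found after the given key, or -1 if none was found.
    static long extractUserId(String base64Session, String key) {
        byte[] data = Base64.getMimeDecoder().decode(base64Session);
        byte[] needle = key.getBytes(StandardCharsets.US_ASCII);
        int keyStart = indexOf(data, needle);
        if (keyStart < 0) {
            return -1;
        }
        int i = keyStart + needle.length;
        // Move to the next double quote after the key.
        while (i < data.length && data[i] != '"') {
            i++;
        }
        i++; // step past that quote
        long value = 0;
        boolean found = false;
        // Collect the ASCII digits that occur before the following quote.
        for (; i < data.length && data[i] != '"'; i++) {
            if (data[i] >= '0' && data[i] <= '9') {
                value = value * 10 + (data[i] - '0');
                found = true;
            }
        }
        return found ? value : -1;
    }

    private static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}
```

A call would look something like extractUserId(sessionDataFromDatabase, "user_id"), where the first argument is the Base64 text read from the session table.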

The code to do this can be looked at in RailsSessionIntegrationHelper.java. If you want to get rid of the Spring Framework dependency, just replace the method getSessionData() with one that does raw JDBC instead, and remember to close your connection when you're done.

Image resizing in Java

December 20th, 2008

Doing good image resizing in your favourite software development environment shouldn't be hard. After all, lots and lots of software that has a graphical user interface of some kind needs to do image resizing. However, in Java it isn't as easy to do as it should be. I had some resizing needs; more specifically, I needed code to resize a large black-and-white image into thumbnail size.

After googling a bit I came up with a recipe from the official Java 2D FAQ at sun.com and used that. After all, the creators of Java should know how to do it right. However, I was surprised to find out that the visual result of the resizing was terrible. Have a look for yourself:

[Image: a black and white letter a, 310 pixels wide]

I started out with this letter a in black and white. When resizing that one to a version 40 pixels wide with the code that Sun suggests, it looks like this when magnified:

[Image: magnified result of the bilinear resize using Sun's code]

As you see, there is some grayscale smoothing going on, but not much, and the overall impression is quite jagged. Before anyone asks: yes, I'm using the bilinear interpolation option. Compared to the result when resizing in GIMP, ImageMagick or just about any other tool, the result is terrible. So I experimented, googled and looked at code here and there.

I was on my way to accepting that Java just didn't do this, and resorting to calling a command line tool from my web application, when I found a piece of code that compares the speed and results of different scaling methods. It turns out that if you use the getScaledInstance() method of the java.awt.Image base class with Image.SCALE_SMOOTH as the last parameter, the result looks much better.

Here is a blown up version of a 40 pixels wide rescaling using that method instead:

[Image: magnified result of the resize using getScaledInstance()]

Ah, much better. Why is this? I don't know. If there is anyone out there who can give me details on why this is, I'm more than happy to be educated.

So, if you want to copy my method, please have a look at ImageResizer.java. The version calling a verbatim copy of the suggested solution from Sun's FAQ is in the method sunResize() and the better looking version is in resize().
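For readers who just want the gist, a minimal sketch of the getScaledInstance() approach could look like the following. It is not the contents of ImageResizer.java. The class name, the method signature and the choice of TYPE_INT_RGB for the output are assumptions made for this example.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;

// Hypothetical sketch: scale to a target width with Image.SCALE_SMOOTH,
// keeping the aspect ratio.
final class SmoothResizeSketch {
    static BufferedImage resize(BufferedImage source, int targetWidth) {
        int targetHeight = (int) Math.max(1,
                Math.round((double) source.getHeight() * targetWidth / source.getWidth()));
        Image scaled = source.getScaledInstance(targetWidth, targetHeight, Image.SCALE_SMOOTH);
        // getScaledInstance() returns a plain Image, so it is drawn onto a
        // BufferedImage to get something that can be saved or processed.
        BufferedImage result =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            g.drawImage(scaled, 0, 0, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

One plausible explanation for the difference in quality is that SCALE_SMOOTH selects an area-averaging filter in the standard implementation, so every source pixel contributes to the thumbnail, while bilinear interpolation only looks at the few pixels nearest each sample point.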