Eclipse + Leopard = Crash?

I recently upgraded my work MacBook Pro to Leopard. One of the first things I ran into in Eclipse was that any time I used Open Resource (Shift + APPLE + R) to open a file, Eclipse would crash with a nasty exception.

Exception Type: EXC_BAD_ACCESS (SIGBUS)

After a bit of googling, I found this seems to be a bug in SWT on Leopard (see here). I upgraded my Eclipse to the 3.4 Stream Stable Build found at http://download.eclipse.org/eclipse/downloads/ and my problems seem to have gone away.

It’s got a pretty cool new splash screen too 🙂


eclipse-3.4 splash screen


Javascript Password Revealer using Prototype

Inspired by the latest Coding Horror post, here’s a code snippet that lets you implement a password revealer using the Prototype JavaScript library.

Check it out in action below.



I wouldn’t mind seeing more web forms adding this feature for password fields.


Using JMeter for load testing

I came across JMeter a while back but never got a chance to try it out. From the JMeter website:

Apache JMeter is a 100% pure Java desktop application designed to load test functional behavior and measure performance. It was originally designed for testing Web Applications but has since expanded to other test functions.

This weekend I was able to load test the 256 slice that this blog runs on. Here’s what I did:

  1. Download the binary
    You can get the binary here.
  2. Unzip the tarball/zip file
    I extracted it to /Users/theo/tools/jakarta-jmeter-2.3.1
  3. Start up JMeter
    Go to the bin directory. Run jmeter.sh (jmeter.bat if using Windows) from the command line.
  4. Create a Test Plan
    Just give your test plan a name and any description you want.

    jmeter_create_test_plan
  5. Create a Thread Group
    A thread group allows you to specify the amount of load you want to simulate. Select your test plan from the left Tree view, right-click, and select Add -> Thread Group.

    Configurable fields include:

    • Number of threads – the number of connections or users you want to simulate
    • Ramp up period – the amount of time in seconds JMeter takes to start all of the threads specified. If you choose 0, all of the threads will be created at the start of the test.
    • Loop count – you can specify to loop indefinitely or provide a number of times to run through the test.

    jmeter_create_thread_group

  6. Add a Sampler
    A sampler is a type of request you want to make. In this example, I used an HTTP request to test load on a web server. It’s good to note that JMeter supports multiple types of samplers including web services, JMS, and JDBC. Add a sampler by selecting the Thread Group you just created, right-clicking, and selecting Add -> Sampler -> HTTP Request.

    Configurable properties include:

    • Server Name – the IP address or hostname of the server the request is sent to
    • Port – the port the server is listening on
    • Protocol – the protocol (http, https, etc)
    • Method – HTTP method (POST, GET, PUT, DELETE, etc)
    • Path – the URL path to request

    jmeter_create_http_request

  7. Add a Listener
    A listener allows you to collect data points and display them in some fashion like a graph or a table. I used the Graph Results listener by selecting the Thread Group, right-clicking, and selecting Add -> Listener -> Graph Results.
  8. Run the test!
    Now we are ready to run the test. From the menu bar, select Run -> Start. You will be prompted to save your test plan. You can save it or just hit “No”. You should see data points begin to be plotted on the Graph Results or whatever listener you selected.

    jmeter_graph_results
    I monitored the usage from my slice as well and this is what top showed me:

    slice_top_load_test

    You can see the 5 threads we specified in the Thread Group taking up 5 Apache processes.

  9. Interpret the results
    Tests are worthless without interpreting the results. So, what the heck does this graph tell me? Pretty good documentation can be found here. Basically, given the load scenario we have set up, my slice can handle ~643 requests/minute or ~11 requests/second. I am not sure what kind of numbers I should be getting but these seem pretty good to me. One last thing to note is that JMeter is not a web browser so these metrics don’t include rendering time or execution of any JavaScript.
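
For anyone who wants to double-check the arithmetic behind those numbers, here is a rough sketch in Python. The 643 requests/minute and the 5 threads come from the test above. The per-request time is my own back-of-the-envelope estimate, and it assumes the 5 threads send requests back to back with no timers or think time configured.

```python
# Throughput read off the Graph Results listener, and the thread count
# configured in the Thread Group (both taken from the test described above).
requests_per_minute = 643
threads = 5

# Convert the per-minute throughput to per-second.
requests_per_second = requests_per_minute / 60
print(f"{requests_per_second:.1f} requests/second")  # prints "10.7 requests/second"

# Assumption: every thread fires its next request as soon as the previous one
# returns. In that case the average time per request is threads / throughput.
avg_seconds_per_request = threads / requests_per_second
print(f"{avg_seconds_per_request:.2f} seconds/request")  # prints "0.47 seconds/request"
```

If you add timers to the Thread Group, the second number no longer holds, since the threads would spend part of their time waiting instead of requesting.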

Overall, JMeter seems to be a great open source tool to test different kinds of load on servers. I look forward to trying it out at work. One last thing I came across is JMeter integration with your Ant build process. Check that out here. I would like to hear about other people’s experiences with JMeter, too.


Prior work experience not needed?

Jeff Atwood just posted an article on the myth that the more years of experience a developer has, the better candidate they are for a position. In the article he references a previous post that spoke to the hypothesis that there is no correlation between skill in programming and experience. This is exactly what I was thinking when I wrote my reaction to disillusioned young IT workers.

I’m still early in my career and I’ve tried to stay on a career path that is driven by what I enjoy doing, which is hacking away at code. At the same time, I am not blind to the fact that companies do look for work experience in specific areas. This certainly helps you get your foot in the door. One thing I’ve struggled with is how do you make the transition into getting that sought-after experience?

While I have programmed in a number of different programming languages, these experiences are based on my own pet projects and curiosity about the languages. On the other hand, my professional experience can be summed up as a Java developer. I’ve been working with Java since I got out of school. Working with Java is great since I feel I can be productive in it and there are plenty of career opportunities, but the IT industry evolves over time and we see other languages gain traction.

programming_languages_absolute

programming_languages_relative

I don’t have any hard statistics, but I suspect the number of core programming languages an average developer extensively works with throughout their career is probably around 10-12. If you count all the supplemental languages that come with working in certain languages, like HTML, JavaScript, or SQL, this number is probably closer to 20. That sounds like a reasonable guesstimate and if true, I’ve got a long way to go.

This got me thinking about how I can continue to learn if my day-to-day is limited to one programming language. Here’s some advice I have for other Java developers.

  • Follow open source – open source projects are a great way of getting exposure to a lot of the Java/JEE platform. The Java platform is a really big environment to be playing with. Open source has provided baselines for everything from database ORM projects like Hibernate/iBatis to MVC frameworks like Spring MVC/Struts to messaging infrastructure with ActiveMQ/Mule to web services with Xfire/Axis2. There is a lot to be learned and I never see a job description for Java developers without some mention of a Java open source project.
  • Change the focus – What I mean here is change which part of the system most of your work touches. At my first job, I mainly worked on the front-end so it was all HTML/CSS/JSP and Controllers. After a while, I was interested in doing more of the back-end and building out infrastructure with DAOs and web services. At my current employer, I started off again on the front-end. Luckily, it was with a different MVC framework so there was good exposure there. More recently, I’ve had the opportunity to work on infrastructure and I’ve been exposed to about five new technologies I never worked with before.

By doing this, I’ve gained a lot of experience in a short amount of time with respect to working in Java. Basically, it allowed me to get more in-depth experience with Java by using Java in different contexts. This is a great start and if I was planning on working with Java my entire life, I could always continue down this path but that’s not reasonable.

HR departments love to see previous experience and it reminds me of the catch-22 that recent college graduates face. They want a job but the employer wants prior experience. They can’t get experience if the employer doesn’t give them a job.

I would love to hear from people who have worked with multiple programming languages and made the language transition between jobs. If a company had a .NET position or a Rails position and you never had experience in the language, what made the company hire you and allowed you to beat out other candidates that probably had more language experience than you? What inspired you to make the paradigm shift? What advice do you have for other developers with language-limited experience?


Social Matchbox DC

Last week, I attended Social Matchbox DC with Brent and Brendan.

We all thought it would be a showcase of startups around DC talking about the cool things they are doing. To our surprise it was more of a job fair than a social gathering. However, they had free pizza so I can’t complain too much. Still, it’s refreshing to see there is a startup community in the DC area, where it seems like everyone and their uncle works for the government. Clearspring and Freewebs were both there. A while back, I tried out making a widget on Clearspring’s platform and making a web site using Freewebs. Both are very cool companies.

Although the DC area is more of a Government Valley, I wish there were more venues around the area that allow startups and hackers to get together.


Puzzle – 100 Doors

I’m a big fan of logic puzzles. Normally, we only get asked these kinds of questions during interviews but they are pretty fun to think about outside of an interview setting.

Here we go:

There are 100 doors in a long hallway. They are all closed. The first time you walk by each door, you open it. The second time around, you close every second door (since they are all opened). On the third pass you stop at every third door and open it if it’s closed, close it if it’s open. On the fourth pass, you take action on every fourth door. You repeat this pattern for 100 passes.

Question: At the end of 100 passes, what doors are opened and what doors are closed?

Snaps if you can figure it out. (No googling allowed!)

Update:
See answer below:


So, there are many ways to approach this problem.
Here’s how I went about solving this puzzle. There are two states a door can be in, opened or closed. On each pass of a door, the door’s state is toggled. Since all the doors start off closed, we can say that if we visit a door an even number of times, the door will end up closed. If we visit the door an odd number of times, it will end up opened. Let’s see this in action with 5 doors (O = Opened, C = Closed):

Pass #    Door 1  Door 2  Door 3  Door 4  Door 5
Initial   C       C       C       C       C
Pass 1    O       O       O       O       O
Pass 2    O       C       O       C       O
Pass 3    O       C       C       C       O
Pass 4    O       C       C       O       O
Pass 5    O       C       C       O       C

One thing to note is that after pass X, door X never gets toggled again. For example, once we have finished the 10th pass, the 11th and all subsequent passes never touch door 10 or any of the doors before it.

Looking at the 5 doors shown in the table above, we see that only door 1 and door 4 are left open. They will remain open for the rest of the passes to 100. What’s special about 1 and 4? What makes us visit a door an odd number of times versus an even number of times? Let’s look at what made us close doors 2, 3, and 5 and open doors 1 and 4.

On the first pass, we opened all of the doors because it was the 1st pass.
Here we know door 1 is going to remain opened.
On the second pass, we closed all of the doors with numbers that were even or divisible by 2. (doors 2, 4, 6, 8, etc)
Here, we know door 2 is going to remain closed.
On the third pass, door 3 gets closed and remains closed.
On the fourth pass, door 4 gets reopened and remains opened.
On the fifth pass, door 5 is closed.

The only factor of 1 is 1 itself. It has one factor.
The factors of 2, 3, and 5 are 1 and themselves. They each have two factors.
The factors of 4 are 1 and itself, as well as 2. 4 has three factors.

Didn’t we say if we visit a door an odd number of times, it remains open?
It turns out the number of factors a number has is the same as the number of times we visit its door, since pass k stops at door n exactly when k divides n.
If a number has an odd number of factors, we visit it an odd number of times.
If a number has an even number of factors, we visit it an even number of times.

So, the big question is what numbers have an odd number of factors?
Looking at 1 and 4 gives us some insight. Both numbers have factors that are performing double duty.
1 has 1×1 and 4 has 2×2. While factors come in pairs, we only count these factors that are performing double duty as one.

What do we call numbers that have factors that perform this double duty? Perfect squares.

The answer is doors 1, 4, 9, 16, 25, 36, 49, 64, 81, 100 are opened. The rest are closed.
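
If you don’t trust the math, you can brute-force it. Here is a small sketch in Python (not how I solved it, just a way to check the answer) that walks the hallway pass by pass. The function name open_doors is something I made up for this example.

```python
def open_doors(n=100):
    """Simulate n passes over n doors and return the numbers of the doors left open."""
    is_open = [False] * (n + 1)  # index 0 is unused so door numbers match indexes
    for step in range(1, n + 1):
        # On pass number `step`, toggle every step-th door.
        for door in range(step, n + 1, step):
            is_open[door] = not is_open[door]
    return [door for door in range(1, n + 1) if is_open[door]]

print(open_doors(5))    # prints [1, 4]
print(open_doors(100))  # prints [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```

The first call matches the 5-door table above, and the second matches the list of perfect squares.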


Top 8 Unix Commands for the Developer

As a developer, there are certain UNIX commands you find yourself typing repeatedly. Whether it’s to debug a production issue or just modifying some files, these commands have helped me do my job time and time again. Here’s my top 8:

  1. grep – Prints the lines that match the pattern provided in the files specified
    • Usage: grep <options> <pattern> <files>
    • Example: grep -n Exception production.log
      • Prints all the lines (showing line numbers) in the file production.log that contain the string ‘Exception’
  2. tail – Only interested in the last couple of lines in a file? tail allows you to quickly view the end of the file
    • Usage: tail <options> <file>
    • Example: tail -fn100 production.log
      • Shows the last 100 lines of the log and waits to display any new text appended to the file
  3. ssh – Log into remote servers
    • Usage: ssh -p<port> <username>@<hostname>
    • Example: ssh -p1234 theo@production
      • Logs into the server named production on port 1234
  4. scp – Copies files to/from remote servers
    • Usage: scp -P<port> <source> <target>
    • Example: scp -P1234 /home/theo/myfile.txt production:/home/jsmith
      • Copies myfile.txt from /home/theo to the server named production under /home/jsmith
  5. rm – Deletes stuff!
    • Usage: rm <options> <file>
    • Example: rm -rf mydir
      • Removes the directory and everything in it with no prompt for confirmation (Use with caution!)
  6. ps – Shows process status
    • Usage: ps <options>
    • Example: ps aux
      • Displays the status of processes for all users, including those not attached to a terminal (such as daemons); BSD-style versions of ps sort this output by CPU usage
  7. top – Similar to ps but it periodically updates the information such as CPU and memory usage
    • Usage: top
    • Example: top (duh!)
  8. kill – terminates a process
    • Usage: kill <option> <pid>
    • Example: kill -9 12345
      • Terminates the process with id of 12345 using a non-catchable, non-ignorable signal (that just means you REALLY mean to kill it)

I use lots of these commands in combination. For example, if tomcat seems to hang and won’t properly shut down I would do the following:

  >> ps aux | grep tomcat

I would then take the pid of tomcat and run:

  >> kill -9 <tomcat-pid>

Now you may be wondering why the “Top 8”, why not “Top 10”. Well, because 8 is the new 10 and those are all the UNIX commands I know :).

What are some of the commands that you use to get through the day?


Open Source and Caching Algorithms

I wanted to go through the exercise of contributing to open source with a project of my own. After thinking about it for probably 15 minutes, I decided I wanted to try to build my own caching system in Java. Too bad I knew next to nothing about caching. I went off and did some research.

There are certain known algorithms that have become popular when implementing caches. Given that caches have a finite size (you run out of either space or memory), cache algorithms are used to manage what stays in the cache. These algorithms determine things like how long an item remains in the cache and what gets booted out when the cache reaches its maximum size. According to Wikipedia, the most efficient caching algorithm “would be to always discard the information that will not be needed for the longest time in the future”. You need to take a look at the data you want to cache before deciding on a caching strategy. Do you need to support random access (the access to the data is uniformly distributed) or sequential access (you’re interested in large chunks of data at a time)? Is certain data accessed more often than other pieces of data?

Here are a few common algorithms:

  • Least Recently Used (LRU) – the items that have gone the longest without being accessed get the boot first. One way to implement this is to keep a last-access timestamp for every item in the cache. Check out this simple LRU implementation.
  • Least Frequently Used (LFU) – the items that are sitting in the cache but have been accessed the least are booted out first. This is implemented by a counter to see how often an item is accessed.
  • First In First Out (FIFO) – the item that first entered the cache is the first to go when it gets full. This can be easily implemented by a queue.
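
To make the LRU idea concrete, here is a minimal sketch in Python (the project I had in mind was Java, but the idea carries over). Note that instead of timestamps it uses the ordering of an OrderedDict to remember which item was touched last, which is a different technique from the one described in the list. The class and method names are my own invention and are not taken from EHCache, OSCache, or any other library.

```python
from collections import OrderedDict

class LRUCache:
    """A tiny LRU cache sketch: the least recently used entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest access first, newest access last

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")            # "a" is now the most recently used
cache.put("c", 3)         # the cache is full, so "b" gets the boot
print(list(cache.items))  # prints ['a', 'c']
```

A FIFO cache would look almost the same, minus the move_to_end calls, since the order of insertion is all that matters there.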

Of course, there are projects like EHCache and OSCache out there that have addressed this issue.

OSCache provides a FIFO and a LRU implementation of a cache.

In addition to FIFO and LRU, EHCache provides a LFU implementation of a cache.

Thinking about how these algorithms work, it is easy to see that there are cases where using one over the other provides a great advantage. Take LRU, which seems to be the most widely accepted and used caching algorithm. It works great when the majority of the hits go to a very concentrated group of items. This way, most hits, if not all, are served from the cache. However, if there is a large scan of all the data, once the cache reaches its max size LRU will evict an item on every access. If the cache can hold a max of 50 items and you have 100 records, then as you iterate over the 100 records the cache throws out the first 50 records to make room for the second 50, resulting in lots of adding to and removing from the cache and 0 cache hits. Algorithms that prevent this from happening, like LFU, are known as scan-resistant.
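
Here is a rough Python simulation of that scan scenario, using the same numbers (a cache of 50 items and 100 records). The count_hits function is just a helper I made up for this sketch. It replays a list of keys against an LRU cache and counts how many were already cached.

```python
from collections import OrderedDict

def count_hits(accesses, capacity):
    """Replay a sequence of keys against an LRU cache and count the cache hits."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # refresh: now the most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits

scan = list(range(100)) * 3         # three full passes over 100 records
hot = [n % 10 for n in range(300)]  # 300 accesses to the same 10 records

print(count_hits(scan, capacity=50))  # prints 0
print(count_hits(hot, capacity=50))   # prints 290
```

Same cache, same number of accesses, and the scan never gets a single hit while the concentrated workload misses only on the first 10 accesses.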

I was interested in finding out if there was some middle ground that gave me the best of both worlds, LRU and LFU. It turns out there is.

The algorithm is known as Adaptive Replacement Cache (ARC). It gives you the benefits of LRU while also doing a balancing act to prevent data scans from polluting the cache. It does this by keeping track of two lists, one for recently referenced items and another for frequently referenced items. If you read about it, it’s a pretty cool algorithm.

I was excited when I came across this algorithm because I thought it would make such a fine addition as an open source project. And then I discovered it was patented. Apparently, PostgreSQL already went through this exercise and deemed it safer to not use it.

So, now I’m thinking I need a new idea for a project.
