Archive for the ‘code review’ Category

Code Review #6 - ‘Too smart’ aka scared of being dumb

Tuesday, May 6th, 2008

One of the biggest failing junior developers have is that they are too ’smart’. ‘Too smart’??? How can someone be ‘too smart’? Actually pretty easily.

‘Too smart’ is when the person spent hours looking at a problem. And the next day she/he

  • realizes the specs would require violating the speed-of-light;
  • or has the product manager come up to them and say, “Oh yeah, there was a typo in the spec and it should have said ‘… NOT …’”;
  • or the product manager/tech lead (when asked) says, “that’s not what I meant”;
  • or the product manager/tech lead (when asked) says, “I didn’t consider that case” (or don’t worry about that case now);
  • or is told by a fellow developer, “oh I ran into that problem last week and on the 3rd page of the documentation, a workaround solution to that problem is described.”;
  • or discovers they were working on an old version of the code and upgrading to the latest version of code or libraries, solves the problem;
  • or finally another developer looks at the problem and points out a simple condition that has been reversed;
  • or…

In all cases, the person (doesn’t have to be a developer) violated the 20-minute rule. Quite simply the 20-minute rule says

“If you are stuck for more than 20 minutes, and you are no closer to understanding the source of the problem - then ask for help”

Yes, it could be something ’stupid’ but it could also be something else. Sometimes you will be told to keep on looking because the person in question wants you to learn something. But sometimes there is a fundamental misunderstanding.

So to make sure that you are not asking for help too easily there is the corollary to the 20-minute rule, the 10-minute review rule.

Spend 10 minutes reviewing, how you got to this situation.

  • Describe why the previous code is written the way it was. Notice this is not a ‘what’ question. So do not answer ‘what the previous code was doing?’ Answer why did the previous developer did the code this way - do you know how the code is being used? Are you breaking those assumptions?
  • Describe why the previous code is now incorrect. List the changed assumptions or requirements.
  • Describe what your solution is going to accomplish.
  • Describe why your solution is the minimal solution.
  • Describe why your solution is the correct choice.
  • Describe the single problem that if solved would make your solution work correctly.

At this point, one of these choices is the path that needs to be taken:

  • Fundamental misunderstanding about existing code. You realized your initial assumptions were wrong and that your solution is the wrong path. Solution: STOP and look at reevaluate problem discarding old assumptions.
  • Missing information. The problem was not specified clearly enough. There are multiple possible problems that are being asked to be solved. Solution: get clarification. Do not proceed until the problem has been clarified.
  • Reinventing wheel. Has this problem already been solved somewhere else in the code? If so, refactor so the already created/tested solution can be used.
  • Rathole. A simple problem is becoming more and more complex. Are you fixing the problem of the problem of the problem that you started off solving? Are you making changes all over the place? If you are in a rathole, this is a sure sign of Doing-the-wrong-thing. Solution: STOP!
  • I don’t know If you don’t know, then you need to ask for help. Solution: Swallow pride and ask for help.

Code Review #5 - splurge on reporting configuration errors

Tuesday, October 30th, 2007

Configuration problems when deploying a new build are high on the “high anxiety” list. The pressure is high to hurry up and get the build deployed. Often times, the deploy happens late at night when the deployer really just wants to go to bed.

The number one way a developer can help out in this situation is very good error messages. If an exception occurs as a result of a misconfiguration: be verbose. Be overly verbose. The sanity you save may be your own.

This error message does not cut it:

Caused by: java.io.IOException: Couldn’t initialize working directory.
at com.amplafi.core.iomanagement.FtpManagerImpl.initializeWorkingDirectory(FtpManagerImpl.java:83)
at com.amplafi.core.iomanagement.FtpManagerImpl.initializeService(FtpManagerImpl.java:50)
… 16 more
Caused by: java.io.IOException: Couldn’t create directory: ftp-working
at com.amplafi.core.iomanagement.FtpManagerImpl.initializeDirectory(FtpManagerImpl.java:61)
at com.amplafi.core.iomanagement.FtpManagerImpl.initializeWorkingDirectory(FtpManagerImpl.java:81)

The error message should have at the minimum contained the following pieces of information:

  • The directory name as supplied in the configuration
  • The full absolute path to that directory (not the same thing as the configuration name if the name is a relative, not absolute path)
  • The reason for the problem (was the parent directory not writable, something already there, disk full, etc.)
  • A suggestion on how to correct the problem

Code Review #4: Always read the documentation/code - a.k.a. java.net.URL is evil

Sunday, September 9th, 2007

The Setup
Before I plunge into my rant, lets review a little-ole documentation. Under java.lang.Object, for equals() we have this:

It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.

Over in java.util.Collection land, there is this little bit of documentation, highlighting the importance of overriding the default implementations of equals() and hashCode() if one wants to play nice in a Collection:

Many methods in Collections Framework interfaces are defined in terms of the equals method. This specification should not be construed to imply that invoking Collection.contains with a non-null argument o will cause o.equals(e) to be invoked for any element e. Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two elements.

So for instances of a class to function well as a key in a java.util.Map or be placed into a java.util.Set, the class should override the standard equals() and hashCode() provided by Object. Because of this universal need, it is a reasonable expectation that implementors of equals() and hashCode() pay attention to performance. For example, the programmers who wrote java.lang.String did a fairly simple implementation of hashCode() and then cached the result so the calculation was not repeated for a given String object.

The resulting ubiquity of overridden equals() and hashCode() results in a certain set of expectations:

  1. if the object doesn’t change the results of equals()/hashCode() shouldn’t change (otherwise Set collections will break)
  2. equals()/hashCode() should be fast and in the case of hashCode - for potentially expensive hashCode() operations the results should be cached.
  3. Immutables (such as URL) should be good candidates to be keys in a java.util.Map

The Takedown
So I was completely blindside by the brain-dead, high school, freshman implementation of java.net.URL.

The only thing I will say positive about the java.net.URL implementation equals()/hashCode() is this: buried deep, deep in the documentation, the high school students who wrote the implementation do casually tell you that they are going to screw you over.

In java.net.URL.equals() (only if you go look at the detailed documentation) do you see this:

Compares this URL for equality with another object.

If the given object is not a URL then this method immediately returns false.

Two URL objects are equal if they have the same protocol, reference equivalent hosts, have the same port number on the host, and the same file and fragment of the file.

Two hosts are considered equivalent if both host names can be resolved into the same IP addresses; (emphasis mine) else if either host name can’t be resolved, the host names must be equal without regard to case; or both host names equal to null.

Since hosts comparison requires name resolution (emphasis mine), this operation is a blocking operation.

Note: The defined behavior for equals is known to be inconsistent with virtual hosting in HTTP. (emphasis mine)

Translation: We are going to screw you and you will enjoy it.

The Scream
How does this screw you?

  1. Two fundamental operations equals()/hashCode() are now ridiculously expensive since they involve a DNS lookup to see if they resolve to the same ip address. So now an operation that is optimize in other classes is outrageously expensive (milliseconds long) for a very fundamental internet class.
  2. URL looks to be an immutable. Immutable objects should mean that the identity: x.equals(x) is true. But this isn’t the case for URL. If you serialize a URL, the resolved hostaddress (it’s transient) is lost. So serialize/deserialize a URL. Wait a little bit. Add a little bit of the Dynamic DNS magic. Presto “http://google.com” will not equal the deserialized version of itself
  3. It is simple wrong. Take the case of a hosting service that hosts two different sites on the same server. The DNS lookup resolves the 2 domains to the same physical ip address. URL (with an “assist” from URLStreamHandler, InetAddress, and our very expensive DNS lookup) compares URLs based on the ip address not the entered hostname. As a result, http://my-porn-site.com could ‘equal’ http://jesus-has-saved-my-soul.org. I think someone might disagree with this assessment, don’t you?

The Takeaway
So after these very expensive operations URL.equals()/hashCode() is broken. The implementers of URL clearly decided that they were going to be ‘clever’ and for all their ‘cleverness’ the only thing they managed to do is screw users of this class. Sun should just admit the error of its ways and just reimplement URL.equals() (with hashCode() being the equivalent).

public boolean equals(Object o) {
    if( !(o instanceof URL)) {
        return false;
    } else if ( o == this ) {
        return true;
    } else {
        return ((URL)o).toString().equalsIgnoreCase(this.toString());
    }
}

The old functionality should be quietly pushed to a uselessEquals() method some place.

The Net
Performance problems and bugs can be introduced from the wildest and least expected locations. Always recheck your assumptions.

“When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”
-Sherlock Holmes.

Code Review #3: Use the debugger to check “working” code

Friday, September 7th, 2007

There is a bug (or two) in this code:

    protected void compareStreams(InputStream is, InputStream isFromGet)
            throws IOException {
        byte[] newBuf = new byte[4096];
        byte[] oldBuf = new byte[4096];

        int len = 0;
        while ((len = is.read(newBuf)) != -1) {
            boolean flag = false;
            if (isFromGet.read(oldBuf) == len) {
                flag = true;
                newBuf.equals(oldBuf);
            }
            if (!flag) {
                throw new IOException(”New file differs from old one.”);
            }
        }
        isFromGet.read(newBuf);
        if (isFromGet.read(newBuf) != -1) {
            throw new IOException(”New file differs from old one.”);
        }
    }

Did you spot any problems? Well maybe you didn’t spot any right away. But use a debugger, and the bugs would quickly become obvious.

  1. Look at :
                if (isFromGet.read(oldBuf) == len) {
                    flag = true;
                    newBuf.equals(oldBuf);
                }
    

    The results check newBuf.equals(oldBuf) is completely ignored. So what happens if 2 streams of the same length but with different contents are compared? This method says they are the same!

  2. But even if the check was flag = newBuf.equals(oldBuf), it would still be wrong! Using an IDE (or a debugger) quickly discovers that newBuf.equals():

        public boolean equals(Object obj) {
    	return (this == obj);
        }
    

    This is not the same as comparing the contents of the arrays. The correct call is Arrays.equals(newBuf, oldBuf). And of course after each read the arrays in question should be 0-filled to avoid issues with partially buffers.

  3. Next, what happens if the streams are read at different rates? newBuf and oldBuf will miscompared simply because one stream didn’t have data immediately available to it.
  4. Finally,

            isFromGet.read(newBuf);
            if (isFromGet.read(newBuf) != -1) {
                throw new IOException("New file differs from old one.");
            }
    

    What is that first line doing in there!? This allows ‘isFromGet‘ stream to be up to 4096 bytes larger than the ‘is‘ stream!

Yecks! So few lines so many major bugs.

Do a line-by-line code review with a debugger, and discover hidden bugs such as these in “perfect” code.

But finally, haven gotten this far… use the (debugged) work of others…

org.apache.commons.io.IOUtils.contentEquals(InputStream, InputStream) is good and correct. Use that.

Code Review #2: “It ain’t stored until the database says it is”

Monday, September 3rd, 2007

This was a recent comment on a bug report:

EventImpl stores start and end dates as two strings (milliseconds presentation of date). Not clearly understand how information can be lost.

What is wrong with this statement?

Well nothing is “stored” until there is a persistent representation of the data that will outlast the program being stopped. Period.

Putting some value in an instance’s field does not qualify. Unless and until some storage mechanism (database, serialization to a file, etc.) puts it some place permanent - it is not “stored”.

The developer can only say the value is being stored if:

  1. there is an active transaction;
  2. the object being modified is mapped to the database tables.
  3. the field being modified is mapped to some column within that database
  4. the specific object instance being modified is actually participating in the active transaction. There could be an active transaction that this object is not part of!
  5. the transaction commits without errors

For the particular bug report, neither the field nor the object was persisted by the database. Therefore nothing was stored.

Code Review #1: Masking of Throwables

Monday, September 3rd, 2007

When it comes to finally blocks, extra care is required. The code within a finally block can make very few assumptions about the state of variables that it is dealing with. Usually where developers slip-up is with null handling. This error is compounded by finally block lacking the try block exception management protection.

An otherwise well-written method with regards to correct try/catch blocks can still fail badly when it comes to the finally block code. A method that handles all exceptions can still throw an exception from the finally block itself. Usually the finally block throws an exception when try or catch block code threw an exception.

This code has this problem:

   private void copyFile(String fileName) throws IOException {
        File destFile = new File(archiveDirectory, fileName);
        FileChannel srcChannel = null;
        FileChannel dstChannel = null;
        try {
            srcChannel = new FileInputStream(new File(workingDirectory,
                    fileName)).getChannel();
            dstChannel = new FileOutputStream(destFile.getAbsolutePath())
                    .getChannel();
            dstChannel.transferFrom(srcChannel, 0, srcChannel.size());
            destFile = new File(archiveDirectory, fileName+".sha");
            srcChannel = new FileInputStream(new File(workingDirectory,
                    fileName+".sha")).getChannel();
            dstChannel = new FileOutputStream(destFile.getAbsolutePath())
                    .getChannel();
            dstChannel.transferFrom(srcChannel, 0, srcChannel.size());
        } finally {
            srcChannel.close();
            dstChannel.close();
        }
    }

Did you spot the problem?

What happens if either of the first two lines throws an exception?

  • srcChannel or dstChannel will be null.
  • A NullPointerException(NPE) will be thrown from the finally block.
  • The NPE will *mask* the original exception.

This last point bears repeating. The original cause of the problem will be lost! Now you as a developer may assume that the lost exception will be an IOException and 99% of the time especially during development this will be correct.

But many developers forget that there are a few externally generated events that can cause throwables to be thrown. Thats right! The root for “things that can be thrown” is java.lang.Throwable - not java.lang.Exception!

Some non-Exception examples include: StackOverflowError and OutOfMemoryError.

So the masking NPE could easily hide an OutOfMemoryError, which results in a lot of hair pulling when:

  • the exception is only occurring occasionally…
  • only on production boxes…
  • at a customer site…
  • they will not let you update the code until you have a fix…
  • the customer is your company’s largest account…
  • and their CEO is calling your CEO…

It may not play out like this in all its glory. But take the words to heart and make sure you never mask Throwables!

Thoughts on how to evaluate code

Tuesday, May 29th, 2007

I met someone at Startup Camp2 who was non-techincal but had an idea that required technical expertise.

She faced the typical problem of judging and evaluate software code in order to make sure the people she hired were:

  • competent
  • making progress

This is of course hard especially in the beginnings of a project when so much is really building infrastructure code and configurations, none of which involves ‘visible’ progress.

I whipped off a quick email in response that even after sleep I still rather like:

  • Maintainability
  • Performance

These are your 2 key metrics. Poorly written code fails in these two areas.

Maintainability


Commented Code

Look through the code. Do all large methods have well-written comments that *you* can understand. You may not understand all the details but if there are no comments or the English is poor, this is bad. Any other developer coming later is going to have a hard time understanding what the original developer was trying to do and will probably create bugs when adding new features.

Key point: The comment should talk about *why* not just *what* is being done. The developer must describe all convoluted (to you) code in a *written* comment. This comment should be understandable to you, the layperson, to a reasonable degree. Chances are very good that, if he/she cannot that:

  • He is not quite certain himself what it does
  • He probably has not thought through completely all the issues around this code.
  • It has lots of bugs.

Be careful not to get lost in the weeds here. Have the developer take you through the high-level code, not the low-level stuff. Low-level stuff will distract you from seeing the bigger picture. You may want a friend developer who knows the language in question but be prepared to be able to fly solo on this after a few reviews.

Sample comment:
/**
 * Application State Object that tracks the current state of a flow. Holds any
 * state information related to a specific flow
 *
 * Each flow state has all the information to run the flow and re-enter it if
 * needed.
 *
 * defines an actively executing flow. Each FlowState has an attached
 * Flow which is the instantiated definition. This copy is made to avoid
 * problems with flow definitions changing while an instance of a flow is
 * active.
 */

Integration Tests/Unit Tests:

These are automated tests that anyone can run from the command line ( i.e. should not require bringing up a development environment.) You should be able to run a command line tool that reports number of tests run and the code coverage of those tests. These tests should include running something like selenium that will bring up a browser and run through your site.

Packaging

You should have a set of clear step by step instructions to get from a brand new machine to running the tests to bringing up the service. You need to be able to verify this yourself. Using only the directions only as written i.e. *no help from anyone* can you get the machine set up, source code downloaded from a source code repository**, compiled, and running? You should be able to type ‘http://localhost/’ and see your website.

[**Run away from any developer that doesn't understand source code repositories. They are your insurance that 3 months into development the developer's machine crashes and everything is gone.]

Can you bring up the development environment and start the product following the written directions by yourself? This avoids the possibility/probability that the developer’s machine is magically configured and only his machine is set up just so to build the product. Believe it or not, I have worked at large companies that are hair-pulling experiences because everything has to be magically configured to build the product.

Performance


How many people are going to hit your web server? What is the peak load going to be? What kind of response is the developer giving about issues like scalability?

Big issue here. Have the developer create jmeter tests that show how the server behaves under load. When running a jmeter test look at memory usage and CPU % on the server and that database. Ratchet up the number of jmeter users until the service just dies. Is that number acceptable? Look into making the service scalable using Amazon’s ec2 service. Ask questions about how much memory each logged in user takes. If each logged in user takes 1 megabyte of memory, you will only be able to have 300 or so users at a time/machine!

Any developer worth anything knows to use a database and well but they are not experts. Spend the money for a day or two of a database expert’s time. Have them take a look at the queries your service runs against the database. Have he/she do at least a little bit of tuning (this will be on-going process) but could easily allow the service to run 10x - 100x better.