Tuesday, September 13, 2011

Communicate code intention I: Nulls

It's all about communication... Really!

As programmer one thing you do is to communicate with the computer so it does whatever you want.
However the computer isn't the only one you interact with. Your code is is something you tell to the computer and the team. Is communication of actions and intentions what makes the computer do things.
And this is the key part of the communication: intentions.

While the language of your choice may be able to communicate some intentions very explicitly, some others don't. This is where you should put your effort. Communicating your intentions is essential for your team and for your future self. Yes, you think you will remember but you won't.
The more you code, the lesser the probabilities are you will remember what was the intention of each line.

So, how do you manage to communicate all the implicit decisions and knowledge that is used on the code? You don't! Because you cannot reflect all the implicit decisions. But at least, you try to express as much as is useful for you and your team.

Use annotations to communicate expected behavior with nulls

I use two extra annotations just for the purpose of communication. They don't change the behavior of the program nor are compiled. They are discarded on compilation but they help me and the team know what we should expect when using a method or passing parameters.

@Documented
@Retention(RetentionPolicy.SOURCE)
@Target({ ElementType.FIELD, ElementType.LOCAL_VARIABLE, ElementType.PARAMETER, ElementType.METHOD })
public @interface CantBeNull {

}

@Documented
@Retention(RetentionPolicy.SOURCE)
@Target({ ElementType.FIELD, ElementType.LOCAL_VARIABLE, ElementType.PARAMETER, ElementType.METHOD })
public @interface MayBeNull {

}

As you can see these annotations are only retained on the source level and can be used on parameters, methods, variables and class fields.

Of course, you can live without these, or make your own. The idea here is to communicate what your code expects as input and how it will behave as output. Communicating this information on the code has proved to be valuable for others and myself after months of coding.

Don't overuse
Remember the goal: communication. If you use them everywhere and without a second thought they will become noise, and people ignore noise. So all your extra effort will be in vain. Just use it when the code isn't clear or can be prone to errors.
If you use Hibernate or other frameworks that have similar annotations just use those. Don't add these mechanically as they are redundant in presence of similar annotations.

Uses
These are the main use cases

On methods: Indicate whether is safe to use the return value as null-impossible, or you should do extra checks
On method arguments: Indicate whether the method can handle null values or you may provoke a NullPointerException if careless.
On class fields: Indicate whether the class is designed to handle null on that field, or some parts of the class rely on the assumption of not null.
On local variable: This case is rare and I don't use it very much because I try to make short methods. But just in case, you can indicate whether or not a local variable can be made null during a loop or similar.

Tuesday, August 23, 2011

Don't use String literals for Reflection

Do you know what this line of code is for?



@OneToMany(mappedBy = "assignedInvoice")

private List<PurchaseOrderAssignment> purchaseOrders;

And this one?



Class.forName("java.lang.String")

Or this one?



Field nameField = Person.class.getField("name");

Ok, not so difficult.
First one is a Hibernate annotation for mapping a one-to-many relationship indicating wich field is the owner. The second one is how you obtain a reference to a class Object for String class. And the third one is how you obtain the "name" field of the class "Person" using reflection.

If you have been around Java long enough you should be familiar with of these forms of reflection.

Now. Can you tell what's wrong with them? (hint: the tittle of this article can help you)

Yes! There are using string literals to refer to class structure elements.

And why is that wrong?
Because doing so you loose the relationship with the referred element. Your code looses the information about structure dependencies.

Using string literals you cannot know where all the reflections references to some class field are, prior to a refactor.You cannot change the name of the field in one place and expect to automatically update all the references (remember DRY?).

In other words you are making implicit an existing dependency relationship between referred and referral code and you start to loose control over changes.

What should I do then?

Until the RFE(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5043025) is implemented in the JDK, we will have to content ourselves with String constants.

We cannot honor the DRY principle, but at least we can repeat ourselves just once.

I propose you to declare a constant with the name of the field, or the class, or whatever you may need to refer using reflection, very near to the place where it is declared.

As an example to the one-to-many relationship counterpart:



  @ManyToOne

  private Invoice assignedInvoice;

  public static final String assignedInvoice_FIELD = "assignedInvoice";

Here I declare the field and below the constant to refer to it with reflection.
I then, use that constant (assignedInvoice_FIELD) to refer to the field using any form of reflection (annotations too).



@OneToMany(mappedBy = PurchaseOrderAssignment.assignedInvoice_FIELD)

private List<PurchaseOrderAssignment> purchaseOrders;

The sad part is that you have to update the constant every time you rename the field (or class), but luckily you won't be changing that too much and in case you do, you will have to change just one place.

The nice part is that now, you can see where all your reflections references are with most IDEs and if you happen to delete or rename the field carelessly, having that constant near is a reminder of the existing dependencies.

Of course this is not an ideal solution, but is good enough to keep you in control of your code.

Extra note

I prefer to use "_FIELD" as a suffix when the constant refers a field so that the IDE auto-completion capabilities offers me the constant and I know that is not a getter or setter method. I use "_GETTER" suffix when the property is not available as a field on the class.

Other interesting articles

Sunday, August 7, 2011

How to query by proximity in Android

The problem

So you need to query objects near some location?
You say ‘hey, this is Android, it should be easy, everybody makes a “search-near-me app”’, right?

Well, the truth is that is not that easy. Making a query for items near a geographic point is not as easy as one might expect. One of the reasons is that SQLite (SQL engine that comes with Android) doesn’t support trigonometric operations, or custom functions, or spatial extensions.

The short answer

In a nutshell, the solution is to use an alternative distance function to order objects based on this distance to the reference point and filter a few to show them on a map. Specifically, use the Manhattan distance that is based on simple operations easily written in Sqlite. You will get a small ordering error but will be good enough for many cases.

Where

d₁is the distance function

are vectors and

p_i and q_i are scalar values.

The long answer

Let’s see step by step what should you do.

Limit the result set

The first thing we need to accept and to do is to reduce the number of elements we are handling. There are two good reasons for this. The first one is usability; users don’t feel comfortable if you throw a hundreds of items over the map. The second is performance, there’s no point on processing elements that won’t be visible.

So the first thing you should do is to decide the number of elements you will be showing the user. That might depend on the type of application, but you should be careful not to show less or more than expected. Make some tests on real phones to discover hardware limitations to this number.

Let’s say 40 elements is ok for us. We know that we don’t care about all the results; we just want the first forty elements near to the reference location (i.e.: the phone current location). Because we are using an approximate distance for our calculation, we will make the limit a little higher. So let’s make it 65 elements. I will explain why later.

At the end of the query you should have a limit clause, something like ‘LIMIT 65’ as we want the first 65 elements.

Order results by proximity using Manhattan distance

This is a proximity based result display, so we want to show the user the first nearest elements. For that purpose we need to order the results, otherwise they will be in any random order and you will get any 65 elements.

Due to our simple SQL support for mathematical operations we will use a simple form of distance called “Manhattan distance” or taxicab distance (http://en.wikipedia.org/wiki/Taxicab_geometry).

This kind of distance is very easy to calculate for Sqlite and has a small error compared to using a real geometric distance for ordering.

Add an order by clause: ‘ORDER BY abs(latitude - (?)) + abs( longitude - (?))’
Here I’m assuming the query arguments are the user current coordinates (latitude and longitude) and ‘latitude’ and ‘longitude’ are the columns with the values for each element.

The query should look something like this:
"SELECT _id, latitude, longitude FROM EXAMPLE_DOTS ORDER BY abs(latitude - (?)) + abs( longitude - (?)) LIMIT 65"
And we will add as arguments the latitude and longitude of the reference point.

Error in Manhattan distance

Ordered in this way, our query will be enough to make the trick. If you only show the elements in the map, and they are scattered enough, nobody will notice the difference.
But, if you must show a list of elements, or they are not so sparsed you will still have to make a few corrections.

The Manhattan distance has a little problem (compared to real distance); it favors elements close to the axes. That means that elements near the vertical or horizontal axis will get lower values for their distance. That means better positions on the order, and will likely to be first that other elements. In other words elements near 45º, 135º, 225º and 315º will receive a worst order than they should.

If elements are regularly scattered you will get a
diamond-like shape with manhattan distance.

When using this alternative distance to order elements and limiting the results you could include an element farther that is close to an axis and exclude an object that is nearer because it’s not close to the axis.

This is why we add additional elements on the limit clause; we don’t want axis elements to be favored and diagonal elements to be discarded erroneously. To avoid this situation we will use a real geo-distance (at least a far better approximation to real distance).

Use Android Location-distance calculator to reorder and filter

The Android platform comes with an excellent implementation of an approximate geo distance calculation formula based on WGS84 ellipsoid (http://en.wikipedia.org/wiki/World_Geodetic_System#Longitudes_on_WGS_84).
This formula cannot be used inside the query, but as we limited the number of results (only 65) we can use it to re-order them in memory very fast so we have a real distance ordering.

You should call Location.distanceBetween(startlatitude, startLongitude, endLatitude, endLongitude, results) to calculate a better distance and re-order the elements obtained from the database. You can then, discard the exceeding elements.

I do this by using a few helper classes that I attached to this article as an Eclipse project :

/**

* Order the results based on real distance instead of manhattan. This method changes the order

* using memory

* @param referencePoint

* @param nearDots

* Dots to order

* @param resultLimit

* Amount of dots to be returned

* @return The list of re-ordered and limited elements

private List<ExampleDot> orderAndFilterUsingRealDistanceTo(GeoPoint referencePoint, List<ExampleDot> nearDots,

int resultLimit) {

final double referenceLatitude = referencePoint.getLatitudeE6() / 1e6;

final double referenceLongitude = referencePoint.getLongitudeE6() / 1e6;

DistanceCalculator<ExampleDot> distanceCalculator = new DistanceCalculator<ExampleDot>() {

public Double calculateDistanceTo(ExampleDot point) {

double pointLatitude = point.latitude / 1e6;

double pointLongitude = point.longitude / 1e6;

float[] results = new float[1];

Location.distanceBetween(referenceLatitude, referenceLongitude, pointLatitude, pointLongitude, results);

return (double) results[0];

}

};

Comparator<? super ExampleDot> distanceComparator = DistanceComparator.create(distanceCalculator, true);

List<ExampleDot> filtered = ResultsHelper.filterFirstMin(resultLimit, nearDots, distanceComparator);

return filtered;

}

As you can see, the above method is used to re-order the elements using a geo-distance and filtering out the exceeding elements. So, now we have the forty nearest elements we wanted.

With real geo-distance and regularly scattered elements
you should get a circle-like shape (or approximate).

Conclusion

The method I showed above is not for every case but can be used for most of the applications that need to show results near some point.
In fact, just by using Manhattan distance on the query can be sufficient and you don’t need to make extra corrections.

You should be sensible as when to use it due to the error in the order that Manhattan distance makes. I recommend you to test with real data points so you can judge if the results are good enough for you.

Links and downloads: Full eclipse example project
More articles from 10pines.com: http://www.10pines.com/articles

Monday, June 27, 2011

Presentation

This blog is dedicated to collect practices and tools that enhances every day coding.
While this is not a best practices recommendation blog (some of the ideas are to be tested), some others are old enough to be used as if they were ;-)