Moving a Collection Task to Java 8 Lambdas and Streams

On a recent project, I encountered a function that had been copy-pasted to a dozen places in the code base. That in itself is a classic Code Smell, and I determined to extract it to a common, reusable function.

The block of lines also repeated an action over several elements in a larger collection. Since this team had recently moved to Java 8, I decided to rewrite this code using Lambdas and the Stream API.

The project's code base used a class called DBRecord as a very flexible extended Collection, representing the data in a single row from a relational database. It was designed to contain one Collection of the various field values for the row of data, and another Collection of meta-data defining the traits of the fields themselves.

Another wrinkle was the state of denormalization of their underlying data model. The team has traditionally not been interested in following Third-Normal-Form design patterns, resulting in as much duplicated and repeated data in their table structures as there is in their code base.

As a result, one database table had multiple columns generically named "SortField1" through "SortField5". It was sometimes necessary to act when any one or more of the SortFields had data. But they were not guaranteed to be consecutive; any one of these fields could have data or not, independent of the others. So the code base had variations of the following code sprinkled throughout:
if ( record.getString(record.getField("sortfield1")).isEmpty()
        && record.getString(record.getField("sortfield2")).isEmpty()
        && record.getString(record.getField("sortfield3")).isEmpty()
        && record.getString(record.getField("sortfield4")).isEmpty()
        && record.getString(record.getField("sortfield5")).isEmpty() )
{
    // no data so do something appropriate
}
else
{
    // at least 1 piece of data, so do something else
}


This code gets the index of SortField1 with getField(), uses that index to look up the associated value, checks whether that value is empty, and then repeats the same steps for SortFields 2 through 5.

Since this logic was needed multiple times, I decided to extract it into a reusable function, one that takes the DBRecord structure, determines whether the fields are empty, and returns the result. Voila:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
    boolean allEmpty = record.getString(record.getField("sortfield1")).isEmpty()
            && record.getString(record.getField("sortfield2")).isEmpty()
            && record.getString(record.getField("sortfield3")).isEmpty()
            && record.getString(record.getField("sortfield4")).isEmpty()
            && record.getString(record.getField("sortfield5")).isEmpty();
    return !allEmpty;
}

At this point, I wanted to convert it to the Lambdas and Streams of Java 8. But first, the function needed some tests. At a minimum, I needed unit tests to verify the behavior when no field has data; when one field has data; when many fields have data; when the fields have only white space; and other edge cases. Here are some of the tests I wrote in JUnit (there were others that I will omit):

@Test
public void atLeastOneSortFieldHasData_Multiple() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    rec.setField(rec.getField("sortfield1"), "MySortField1");
    rec.setField(rec.getField("sortfield5"), "MySortField5");
    assertTrue("Many sortfields should have been found.", MyClass.atLeastOneSortFieldHasData(rec));
}

@Test
public void atLeastOneSortFieldHasData_SingleNotFirst() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    rec.setField(TranLink.sortField4, "MySortField4");
    assertTrue("Sortfield 4 has data.", MyClass.atLeastOneSortFieldHasData(rec));
}

@Test
public void atLeastOneSortFieldHasData_WhiteSpaceCountsAsEmpty() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    rec.setField(TranLink.sortField1, "     ");
    assertFalse("No sortfield has data, not even SortField 1 with white space.", MyClass.atLeastOneSortFieldHasData(rec));
}
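
One detail worth checking for the white-space test: in plain Java, String.isEmpty() is true only for a zero-length string, so a value of five spaces does not count as empty unless something trims it first. Whether DBRecord.getString() trims its result is not shown in this post, so the sketch below uses ordinary Strings and a hypothetical helper, hasData(), purely to illustrate the difference.

```java
// Sketch only: hasData() is a hypothetical helper, not part of the project's DBRecord API.
class WhitespaceCheck
{
    // Treats null, empty, and whitespace-only values as "no data".
    static boolean hasData(String value)
    {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args)
    {
        String spaces = "     ";
        System.out.println(spaces.isEmpty());        // false: the length is 5
        System.out.println(spaces.trim().isEmpty()); // true: trim() removes the spaces
        System.out.println(hasData(spaces));         // false
        System.out.println(hasData("MySortField1")); // true
    }
}
```

If getString() does not trim, a check along these lines would be one way to make the white-space test pass.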

Now, with the newly extracted function tested, I replaced the code with Lambdas and Streams. I knew that my change was ok once all my tests passed. Here is what the production code looked like, rewritten to use the Stream API of Java 8:

public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
    return record.stream()
            .filter(field -> field.getName().toLowerCase().startsWith("sortfield"))
            .anyMatch(field -> !record.getString(record.getField(field)).isEmpty());
}

When we moved to Java 8, a stream() method was added to the DBRecord class to allow us to stream its collection of database fields. So the first step is to call stream().
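
The DBRecord source is not shown in this post, so as a rough illustration only, here is how a record-like class might expose its fields as a stream by delegating to an internal collection. RowRecord and RowField are hypothetical stand-ins, not the project's classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Sketch only: RowRecord and RowField are hypothetical stand-ins for DBRecord and its field meta-data.
class RecordStreamSketch
{
    static class RowField
    {
        private final String name;

        RowField(String name) { this.name = name; }

        String getName() { return name; }
    }

    static class RowRecord
    {
        private final List<RowField> fields = new ArrayList<>();

        void addField(String name) { fields.add(new RowField(name)); }

        // Delegates to the underlying list, so callers can use the Stream API directly.
        Stream<RowField> stream() { return fields.stream(); }
    }

    public static void main(String[] args)
    {
        RowRecord row = new RowRecord();
        row.addField("Name");
        row.addField("SortField1");
        row.addField("SortField2");

        long sortFields = row.stream()
                .filter(field -> field.getName().toLowerCase().startsWith("sortfield"))
                .count();
        System.out.println(sortFields); // prints 2
    }
}
```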

Next, since the algorithm only cares about the entries named SortField1 through SortField5, the stream is passed through a filter() call, which will select only those fields with the common prefix "sortfield".

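I want the function to return true when one or more of these filter-selected fields is non-blank. The anyMatch() method is perfect for this. For each field, the lambda calls getString() and checks that the resulting String is not empty; anyMatch() returns true as soon as one field passes that check, and false if none do.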
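
To try the same filter() and anyMatch() pipeline without the project's classes, here is a minimal self-contained sketch. It assumes a Map of field name to value as a stand-in for DBRecord, and it uses Locale.ROOT for the lower-casing, which the production code above does not; both are choices made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: a Map of field name -> value stands in for DBRecord. Assumes no null values.
class SortFieldSketch
{
    static boolean atLeastOneSortFieldHasData(Map<String, String> row)
    {
        return row.entrySet().stream()
                // keep only entries whose name starts with "sortfield", ignoring case
                .filter(entry -> entry.getKey().toLowerCase(Locale.ROOT).startsWith("sortfield"))
                // true as soon as one remaining value is non-empty; later entries are not examined
                .anyMatch(entry -> !entry.getValue().isEmpty());
    }

    public static void main(String[] args)
    {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Name", "Widget");
        row.put("SortField1", "");
        row.put("SortField2", "");
        System.out.println(atLeastOneSortFieldHasData(row)); // false

        row.put("SortField4", "MySortField4");
        System.out.println(atLeastOneSortFieldHasData(row)); // true
    }
}
```

Note that a prefix filter also picks up any other field whose name begins with "sortfield", which differs slightly from naming the five fields explicitly.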

When I reran the tests, they all passed. I could now replace the copy-pasted code with calls to this new function.
