Moving a Collection Task to Java 8 Lambdas and Streams

On a recent project, I encountered a block of code that had been copy-pasted into a dozen places in the code base. That in itself is a classic code smell, so I decided to extract it into a common, reusable function.

The block of lines also repeated an action over several elements in a larger collection. Since this team had recently moved to Java 8, I decided to rewrite this code using Lambdas and the Stream API.

The project's code base used a class called DBRecord as a very flexible extended Collection, representing the data in a single row from a relational database. It was designed to contain one Collection of the various field values for the row of data, and another Collection of meta-data defining the traits of the fields themselves.

Another wrinkle was the state of denormalization of their underlying data model. The team has traditionally not been interested in following Third-Normal-Form design patterns, resulting in as much duplicated and repeated data in their table structures as there is in their code base.

As a result, one database table had multiple columns generically named "SortField1" through "SortField5". It was sometimes necessary to act when any one or more of the SortFields had data. But they were not guaranteed to be consecutive; any one of these fields could have data or not, independent of the others. So the code base had variations of the following code sprinkled throughout:
if ( record.getString(record.getField("sortfield1")).isEmpty()
    && record.getString(record.getField("sortfield2")).isEmpty()
    && record.getString(record.getField("sortfield3")).isEmpty()
    && record.getString(record.getField("sortfield4")).isEmpty()
    && record.getString(record.getField("sortfield5")).isEmpty() )
{
    // no data so do something appropriate
}
else
{
    // at least 1 piece of data, so do something else
}


This code gets the index of SortField1 with getField(), and uses that index to look up the associated value, checks if it is empty or not, and repeats for SortFields 2 through 5.
Since this logic was needed multiple times, I decided to extract it into a reusable function, one that will take the DBRecord structure, determine if the fields are empty, and return the result. Voila:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
    boolean allEmpty = record.getString(record.getField("sortfield1")).isEmpty()
        && record.getString(record.getField("sortfield2")).isEmpty()
        && record.getString(record.getField("sortfield3")).isEmpty()
        && record.getString(record.getField("sortfield4")).isEmpty()
        && record.getString(record.getField("sortfield5")).isEmpty();
    return !allEmpty;
}

At this point, I wanted to convert it to the Lambdas and Streams of Java 8. But first, the function needed some tests. At a minimum, I needed unit tests to verify the behavior when no field has data; when one field has data; when several fields have data; when the fields contain only white space; and other edge cases. Here are some of the tests I wrote in JUnit (I will omit the others):

@Test
public void atLeastOneSortFieldHasData_Multiple() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    // Look up the field index on the same record under test (rec).
    rec.setField(rec.getField("sortfield1"), "MySortField1");
    rec.setField(rec.getField("sortfield5"), "MySortField5");
    assertTrue("Many sortfields should have been found.", MyClass.atLeastOneSortFieldHasData(rec));
}

@Test
public void atLeastOneSortFieldHasData_SingleNotFirst() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    rec.setField(TranLink.sortField4, "MySortField4");
    assertTrue("Sortfield 4 has data.", MyClass.atLeastOneSortFieldHasData(rec));
}

@Test
public void atLeastOneSortFieldHasData_WhiteSpaceCountsAsEmpty() throws Exception
{
    DBRecord rec = makeEmptyRecord();
    rec.setField(TranLink.sortField1, "     ");
    assertFalse("No sortfield has data, not even SortField 1 with white space.", MyClass.atLeastOneSortFieldHasData(rec));
}

Now, with the newly extracted function tested, I replaced the code with Lambdas and Streams. I knew that my change was ok once all my tests passed. Here is what the production code looked like, rewritten to use the Stream API of Java 8:

public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
    return record.stream()
        .filter(field -> field.getName().toLowerCase().startsWith("sortfield"))
        .anyMatch(field -> !record.getString(record.getField(field)).isEmpty());
}

When we moved to Java 8, a stream() method was added to the DBRecord class so that we could stream its collection of database fields. So the first step is to call stream().
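The real DBRecord class is project-specific and is not shown in this post, so the following is only a minimal sketch of how a record type might expose its field metadata as a stream. The names RecordSketch, FieldSketch, and addField are hypothetical stand-ins, not the project's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical stand-in for one field's metadata (only the name, for brevity).
final class FieldSketch {
    private final String name;

    FieldSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in for DBRecord: one collection of field metadata
// and one collection of values, keyed here by field name.
final class RecordSketch {
    private final List<FieldSketch> fields = new ArrayList<>();
    private final Map<String, String> values = new LinkedHashMap<>();

    void addField(String name, String value) {
        fields.add(new FieldSketch(name));
        // Assumption: a missing value is stored as "" rather than null.
        values.put(name, value == null ? "" : value);
    }

    String getString(FieldSketch field) {
        return values.getOrDefault(field.getName(), "");
    }

    // The small addition: expose the field metadata as a java.util.stream.Stream.
    Stream<FieldSketch> stream() {
        return fields.stream();
    }
}
```

With a method like this in place, callers can chain filter() and anyMatch() directly on the record instead of looking up each field by hand.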

Next, since the algorithm only cares about the entries named SortField1 through SortField5, the stream is passed through a filter() call, which will select only those fields with the common prefix "sortfield".
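To make the filter step concrete, here is a small sketch that applies the same prefix test to a plain list of column names. The class name PrefixFilterSketch and the sample column names are made up for illustration. Note that a prefix match keeps every name beginning with "sortfield", not only the five numbered columns, so it is slightly broader than the original hand-written check. The sketch also passes Locale.ROOT to toLowerCase(), a defensive habit that keeps the comparison independent of the machine's default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class PrefixFilterSketch {
    // Returns the names that a case-insensitive "sortfield" prefix filter keeps.
    static List<String> selectSortFields(List<String> fieldNames) {
        return fieldNames.stream()
                // Locale.ROOT avoids locale-specific case mapping (for example, Turkish 'I').
                .filter(name -> name.toLowerCase(Locale.ROOT).startsWith("sortfield"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical column names; only the SortField ones echo the post.
        List<String> names = Arrays.asList(
                "Id", "SortField1", "Amount", "SORTFIELD5", "SortFieldNotes");
        System.out.println(selectSortFields(names));
        // prints [SortField1, SORTFIELD5, SortFieldNotes]
    }
}
```

If only the five known columns should ever match, an explicit list of names would be a stricter alternative to the prefix test.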

I want the function to return true when one or more of these filter-selected fields is non-blank. The anyMatch() method is perfect for this: for each field, the lambda calls getString() and checks that the resulting String is not empty.
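The whole filter-then-anyMatch pipeline can be tried without the project's classes. The sketch below is an approximation that uses a plain Map of field name to value in place of DBRecord; the class name AnyMatchSketch is hypothetical. Whether the real getString() trims its result is not shown in this post, so the trim() call here is an assumption made so that whitespace-only values count as empty, as the whitespace test expects. anyMatch() is a short-circuiting operation, so it can stop at the first field that has data.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class AnyMatchSketch {
    // Assumption: a Map of field name -> value stands in for DBRecord.
    static boolean atLeastOneSortFieldHasData(Map<String, String> row) {
        return row.entrySet().stream()
                // Keep only the entries whose name starts with "sortfield".
                .filter(e -> e.getKey().toLowerCase(Locale.ROOT).startsWith("sortfield"))
                // Assumption: null and whitespace-only values count as empty.
                .anyMatch(e -> e.getValue() != null && !e.getValue().trim().isEmpty());
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");          // ignored: not a sort field
        row.put("sortfield1", "   "); // whitespace only
        row.put("sortfield2", "");    // empty
        System.out.println(atLeastOneSortFieldHasData(row)); // prints false

        row.put("sortfield4", "MySortField4");
        System.out.println(atLeastOneSortFieldHasData(row)); // prints true
    }
}
```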

When I reran the tests, they all passed. I could now replace the copy-pasted code with calls to this new function.
