On a recent project, I encountered a function that had been copy-pasted to a dozen places in the code base. That in itself is a classic Code Smell, and I determined to extract it to a common, reusable function.
The block of lines also repeated an action over several elements in a larger collection. Since this team had recently moved to Java 8, I decided to rewrite this code using Lambdas and the Stream API.
The project's code base used a class called DBRecord as a very flexible extended Collection, representing the data in a single row from a relational database. It was designed to contain one Collection of the various field values for the row of data, and another Collection of meta-data defining the traits of the fields themselves.
Another wrinkle was the state of denormalization of their underlying data model. The team has traditionally not been interested in following Third-Normal-Form design patterns, resulting in as much duplicated and repeated data in their table structures as there is in their code base.
As a result, one database table had multiple columns generically named "SortField1" through "SortField5". It was sometimes necessary to act when any one or more of the SortFields had data. But they were not guaranteed to be consecutive; any one of these fields could have data or not, independent of the others. So the code base had variations of the following code sprinkled throughout:
if ( record.getString(record.getField("sortfield1")).isEmpty()
&& record.getString(record.getField("sortfield2")).isEmpty()
&& record.getString(record.getField("sortfield3")).isEmpty()
&& record.getString(record.getField("sortfield4")).isEmpty()
&& record.getString(record.getField("sortfield5")).isEmpty() )
{
// no data so do something appropriate
}
else
{
// at least 1 piece of data, so do something else
}
This code gets the index of SortField1 with getField(), and uses that index to look up the associated value, checks if it is empty or not, and repeats for SortFields 2 through 5.
Since this logic was needed multiple times, I decided to extract it into a reusable function, one that will take the DBRecord structure, determine if the fields are empty, and return the result. Voila:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
boolean allEmpty = record.getString(record.getField("sortfield1")).isEmpty()
&& record.getString(record.getField("sortfield2")).isEmpty()
&& record.getString(record.getField("sortfield3")).isEmpty()
&& record.getString(record.getField("sortfield4")).isEmpty()
&& record.getString(record.getField("sortfield5")).isEmpty();
return !allEmpty;
}
At this point, I wanted to convert it to the Lambdas and Streams of Java 8. But first, the function needed some tests. Minimally, I needed to write unit tests to verify the behavior when no field has data; when one field has data; when many fields have data; when the fields have only white space; and other edge cases. Here are some of the tests I wrote in jUnit (there were others that I will omit):
@Test
public void atLeastOneSortFieldHasData_Multiple() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(record.getField("sortfield1"), "MySortField1");
rec.setField(record.getField("sortfield5"), "MySortField5");
assertTrue("Many sortfields should have been found.", MyClass.atLeastOneSortFieldHasData(rec));
}
@Test
public void atLeastOneSortFieldHasData_SingleNotFirst() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(TranLink.sortField4, "MySortField4");
assertTrue("Sortfield 4 has data.", MyClass.atLeastOneSortFieldHasData(rec));
}
@Test
public void atLeastOneSortFieldHasData_WhiteSpaceCountsAsEmpty() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(TranLink.sortField1, " ");
assertFalse("No sortfield has data, not even SortField 1 with white space.", MyClass.atLeastOneSortFieldHasData(rec));
}
Now, with the newly extracted function tested, I replaced the code with Lambdas and Streams. I knew that my change was ok once all my tests passed. Here is what the production code looked like, rewritten to use the Stream API of Java 8:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
return record.stream().filter(field -> field.getName().toLowerCase().startsWith("sortfield"))
.anyMatch(field -> !record.getString(record.getField(field)).isEmpty());
}
When we moved to Java 8, a stream() method got added to the DBRecord class to allow us to stream its collection of database fields. So the first step is to call stream().
Next, since the algorithm only cares about the entries named SortField1 through SortField5, the stream is passed through a filter() call, which will select only those fields with the common prefix "sortfield".
I want the function to return true when 1 or more of these filter-selected fields is non-blank. The anyMatch() method is perfect for this. We call the getString() method and check if the String object isEmpty().
When I reran the tests, they all passed. I could now replace the copy-pasted code by calls to this new function.
The block of lines also repeated an action over several elements in a larger collection. Since this team had recently moved to Java 8, I decided to rewrite this code using Lambdas and the Stream API.
The project's code base used a class called DBRecord as a very flexible extended Collection, representing the data in a single row from a relational database. It was designed to contain one Collection of the various field values for the row of data, and another Collection of meta-data defining the traits of the fields themselves.
Another wrinkle was the state of denormalization of their underlying data model. The team has traditionally not been interested in following Third-Normal-Form design patterns, resulting in as much duplicated and repeated data in their table structures as there is in their code base.
As a result, one database table had multiple columns generically named "SortField1" through "SortField5". It was sometimes necessary to act when any one or more of the SortFields had data. But they were not guaranteed to be consecutive; any one of these fields could have data or not, independent of the others. So the code base had variations of the following code sprinkled throughout:
if ( record.getString(record.getField("sortfield1")).isEmpty()
&& record.getString(record.getField("sortfield2")).isEmpty()
&& record.getString(record.getField("sortfield3")).isEmpty()
&& record.getString(record.getField("sortfield4")).isEmpty()
&& record.getString(record.getField("sortfield5")).isEmpty() )
{
// no data so do something appropriate
}
else
{
// at least 1 piece of data, so do something else
}
This code gets the index of SortField1 with getField(), and uses that index to look up the associated value, checks if it is empty or not, and repeats for SortFields 2 through 5.
Since this logic was needed multiple times, I decided to extract it into a reusable function, one that will take the DBRecord structure, determine if the fields are empty, and return the result. Voila:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
boolean allEmpty = record.getString(record.getField("sortfield1")).isEmpty()
&& record.getString(record.getField("sortfield2")).isEmpty()
&& record.getString(record.getField("sortfield3")).isEmpty()
&& record.getString(record.getField("sortfield4")).isEmpty()
&& record.getString(record.getField("sortfield5")).isEmpty();
return !allEmpty;
}
At this point, I wanted to convert it to the Lambdas and Streams of Java 8. But first, the function needed some tests. Minimally, I needed to write unit tests to verify the behavior when no field has data; when one field has data; when many fields have data; when the fields have only white space; and other edge cases. Here are some of the tests I wrote in jUnit (there were others that I will omit):
@Test
public void atLeastOneSortFieldHasData_Multiple() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(record.getField("sortfield1"), "MySortField1");
rec.setField(record.getField("sortfield5"), "MySortField5");
assertTrue("Many sortfields should have been found.", MyClass.atLeastOneSortFieldHasData(rec));
}
@Test
public void atLeastOneSortFieldHasData_SingleNotFirst() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(TranLink.sortField4, "MySortField4");
assertTrue("Sortfield 4 has data.", MyClass.atLeastOneSortFieldHasData(rec));
}
@Test
public void atLeastOneSortFieldHasData_WhiteSpaceCountsAsEmpty() throws Exception
{
DBRecord rec = makeEmptyRecord();
rec.setField(TranLink.sortField1, " ");
assertFalse("No sortfield has data, not even SortField 1 with white space.", MyClass.atLeastOneSortFieldHasData(rec));
}
Now, with the newly extracted function tested, I replaced the code with Lambdas and Streams. I knew that my change was ok once all my tests passed. Here is what the production code looked like, rewritten to use the Stream API of Java 8:
public static boolean atLeastOneSortFieldHasData(DBRecord record)
{
return record.stream().filter(field -> field.getName().toLowerCase().startsWith("sortfield"))
.anyMatch(field -> !record.getString(record.getField(field)).isEmpty());
}
When we moved to Java 8, a stream() method got added to the DBRecord class to allow us to stream its collection of database fields. So the first step is to call stream().
Next, since the algorithm only cares about the entries named SortField1 through SortField5, the stream is passed through a filter() call, which will select only those fields with the common prefix "sortfield".
I want the function to return true when 1 or more of these filter-selected fields is non-blank. The anyMatch() method is perfect for this. We call the getString() method and check if the String object isEmpty().
When I reran the tests, they all passed. I could now replace the copy-pasted code by calls to this new function.