Implementing Persistence With Clean Architecture

Over a decade has passed since Robert C. Martin posted an article about Clean Architecture on the Clean Coder Blog in 2012. In 2017, it was followed by an entire book on the same topic. This tutorial does not argue for or against the concept; it presents some practical aspects of implementing the persistence layer in line with the ideas of Clean Architecture.

One aspect discussed in this topic is that the business core (normally referred to as being “inner”) must not depend on any technical or environmental details (referred to as being “outer”). The motivation behind this rule is to express importance: less important parts of the code shall depend on more important ones, but not the other way around. In this regard, business logic is important (it is the very reason the given software exists); how it is deployed and how data is stored is less important. This has several implications. For example, none of the business classes shall carry framework-related annotations; in practice, no Spring annotations on classes with business logic, no JPA annotations on the entities these classes work on, etc.

This is fairly easy to comply with in some aspects. For example, it is completely natural to allow a REST controller to depend on a business class and to disallow the other direction (this is not even specific to Clean Architecture; most layered architectures also contain this rule).

The situation is heavily different in other aspects, such as persistence: unlike layered architecture models, which would allow business classes to depend on “lower” layers, Clean Architecture clearly does not allow business classes to depend on persistence classes. A quick description of how it still works is that instead of persistence offering the business its capabilities, the business shall define what it expects from its persistence layer. Practically speaking, business defines a Java interface with all the business-relevant persistence methods, and regardless of what kind of persistence engine will actually handle data, an adapter is going to bridge business needs and technical capabilities.
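The inversion described above can be sketched in a few lines. This is only an illustration under assumptions: the names `TitleStore`, `TitleService` and `InMemoryTitleStore` are hypothetical and do not come from the article's repository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// --- business core ("inner"): no framework or storage imports ---

/** Port owned by the business: it states what the business expects from persistence. */
interface TitleStore {
    void saveTitle(String isbn, String title);
    Optional<String> findTitle(String isbn);
}

/** Business logic that depends only on the port above. */
class TitleService {
    private final TitleStore store;

    TitleService(TitleStore store) {
        this.store = store;
    }

    void register(String isbn, String title) {
        // A business rule lives here, independent of how data is stored.
        if (title == null || title.isBlank()) {
            throw new IllegalArgumentException("title must not be blank");
        }
        store.saveTitle(isbn, title);
    }

    String describe(String isbn) {
        return store.findTitle(isbn).orElse("unknown book");
    }
}

// --- adapter ("outer"): depends on the business port, never the other way around ---

/** Hypothetical adapter that bridges the port to a plain HashMap. */
class InMemoryTitleStore implements TitleStore {
    private final Map<String, String> titles = new HashMap<>();

    @Override
    public void saveTitle(String isbn, String title) {
        titles.put(isbn, title);
    }

    @Override
    public Optional<String> findTitle(String isbn) {
        return Optional.ofNullable(titles.get(isbn));
    }
}
```

Swapping `InMemoryTitleStore` for an adapter backed by a database would not require touching `TitleService`, which is the point of letting the business own the interface.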

Setting Up Our Domain and Expectations

In order to provide practical hints, let’s define a simple business domain. In this tutorial we are going to implement a small library: it can handle books (identified by ISBN, holding basic attributes like title and number of pages), readers (identified by name, holding address and telephone number), and relations between them (in order to see which person read which book).

Before implementing the actual logic, we can already define the expected behavior of the application. The logic shall be able to:

  • Persist new business entities (book, person)
  • Provide existing entities
  • Register new borrowing events (add the book to the person’s book list and add the person to the book’s reader list)
  • Provide a list of all the book titles accompanied by the lengths of the books

We can define this not only in natural language but also in the form of unit tests, which I did: partly to advocate for TDD in general, and partly because tests are particularly useful for ensuring that the different variations have the same capabilities. You can find the tests, as well as the complete code examples, on GitHub.

The interface that our application must fulfill is then:

import java.util.List;

public interface LibraryService {

    void createNewBook(String isbn,
                       String title,
                       int numberOfPages);

    Object findBookByIsbn(String isbn);

    void createNewReader(String name,
                         String address,
                         String phoneNumber);

    Object findReaderByName(String name);

    void borrowBook(String nameOfPerson,
                    String isbnOfBook);

    List<String> listAllBooksWithLengths();
}

Variation 1: Persistence Manages Its Own, Business-Independent Structures

This might seem the most straightforward way: business core defines its data model and in its persistence interface (which belongs to the business core!) describes commands and queries on them, such as:

public class Book {
    private String title;
    private String isbn;
    private int numberOfPages;
    private List<Person> readers = new ArrayList<>();
}

public class Person {
    private String name;
    private String address;
    private String phoneNumber;
    private List<Book> booksRead = new ArrayList<>();
}

public interface LibraryPersistenceProvider {
    void saveBook(Book book);
    void savePerson(Person person);
    Person findPersonByName(String name);
    Book findBookByIsbn(String isbn);
    List<Book> getBooks();
}

In this variation, the persistence code uses business classes only to read attributes while persisting and set attributes while loading. This also allows persistence to use any kind of data representation, for example in this tutorial everything is persisted into Maps. 
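As a minimal sketch of what "persisted into Maps" can look like: the storage below keeps one attribute map per book and rebuilds a fresh `Book` on every load. The `Book` constructor and getters, the class name `MapBookStorage` and the column names are assumptions made for this illustration; the repository code may differ, and readers are left out to keep it short.

```java
import java.util.HashMap;
import java.util.Map;

/** Business class; the constructor and getters are assumed for this sketch. */
class Book {
    private final String isbn;
    private final String title;
    private final int numberOfPages;

    Book(String isbn, String title, int numberOfPages) {
        this.isbn = isbn;
        this.title = title;
        this.numberOfPages = numberOfPages;
    }

    String getIsbn() { return isbn; }
    String getTitle() { return title; }
    int getNumberOfPages() { return numberOfPages; }
}

/** Persistence side: stores plain attribute maps, never Book instances. */
class MapBookStorage {
    // One "row" per book, keyed by ISBN; each row maps a column name to its value.
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /** Reads the attributes from the business object while persisting. */
    void saveBook(Book book) {
        Map<String, Object> row = new HashMap<>();
        row.put("title", book.getTitle());
        row.put("pages", book.getNumberOfPages());
        rows.put(book.getIsbn(), row);
    }

    /** Builds a new business object from the stored attributes while loading. */
    Book findBookByIsbn(String isbn) {
        Map<String, Object> row = rows.get(isbn);
        if (row == null) {
            return null; // not stored; a real implementation might prefer Optional
        }
        return new Book(isbn, (String) row.get("title"), (Integer) row.get("pages"));
    }
}
```

Because the stored form is independent of the business class, the same pattern would apply to a relational table or a document store.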

This is quite convenient for several purposes but also brings trade-offs: even in such a minimal domain, it is easy to see that if books and people form a fully connected graph, loading any one entity causes the entire data model to be loaded. This motivates the next variation.

Variation 1B: Persistence Uses the Business Classes and Defines Its Own Subclasses

The most straightforward solution for the problem mentioned above is to let persistence fill the list of books read and the list of readers of a book with proxy objects, which would only load actual contents when they are touched – in other words, to implement proxy-based lazy loading. This requires persistence to extend the business classes.
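A minimal sketch of such a persistence-side subclass follows. It assumes that the `readers` field is visible to subclasses (`protected` here) and that the actual database access is handed in as a `Supplier`; the names `LazyBook` and `readerLoader` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Business class, reduced to what the sketch needs. */
class Person {
    private final String name;

    Person(String name) { this.name = name; }

    String getName() { return name; }
}

/** Business class; 'readers' is protected here so that a subclass can fill it. */
class Book {
    private final String isbn;
    protected List<Person> readers = new ArrayList<>();

    Book(String isbn) { this.isbn = isbn; }

    String getIsbn() { return isbn; }

    List<Person> getReaders() { return readers; }
}

/** Persistence-side subclass: loads the readers only on first access. */
class LazyBook extends Book {
    private final Supplier<List<Person>> readerLoader; // stands in for a DB query
    private boolean readersLoaded = false;

    LazyBook(String isbn, Supplier<List<Person>> readerLoader) {
        super(isbn);
        this.readerLoader = readerLoader;
    }

    @Override
    List<Person> getReaders() {
        if (!readersLoaded) {
            readers = readerLoader.get();
            readersLoaded = true;
        }
        return readers;
    }
}
```

Business code that receives a `LazyBook` keeps calling `getReaders()` as before; the load happens at most once, on the first call.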

Note that the Java class hierarchy of Variation 1 contains only business-relevant classes, whereas this variation contains both business-relevant and persistence-relevant classes. Also, while in Variation 1 business always works with its own classes, in this variation business classes might work with instances of persistence classes.

This can lead to unexpected problems. Imagine that lazy loading is not implemented yet, and a new business rule is introduced: only users with the role “admin” may list who read a given book. Let’s assume that, to ensure this always applies, a guard is added to the getter method; for example:

List<Person> getReaders() {
  if (!currentUserHasAdminRights()) {
    throw new AdminRightsNeededException();
  }
  return this.readers;
}

If later on, lazy-loading is added without examining the original getter of the business class, the persistence extension of the class might simply implement the getter as follows:

List<Person> getReaders() {
  if (!readersAlreadyLoaded()) {
    loadReadersFromDB();
  }
  return this.readers;
}

This means that the persistence code silently removes the business guard (which changes the business behavior of the system), while all the unit tests are likely to keep passing, as they are written against the original business class!
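The effect can be reproduced in a small, self-contained sketch. Several things here are simplifying assumptions: readers are plain strings, the security context is a static flag, and the database load is faked. Only the two getters follow the snippets above.

```java
import java.util.ArrayList;
import java.util.List;

class AdminRightsNeededException extends RuntimeException { }

class Book {
    // Stand-in for a real security context; an assumption of this sketch.
    static boolean currentUserIsAdmin = false;

    protected List<String> readers = new ArrayList<>();

    List<String> getReaders() {
        if (!currentUserIsAdmin) {
            throw new AdminRightsNeededException();
        }
        return readers;
    }
}

/** Persistence subclass written without looking at the business getter. */
class LazyBook extends Book {
    private boolean loaded = false;

    @Override
    List<String> getReaders() {
        if (!loaded) {
            readers = new ArrayList<>(List.of("Ann")); // pretend DB load
            loaded = true;
        }
        return readers; // the admin check in Book.getReaders() never runs
    }
}

class GuardDemo {
    /** Returns true if reading the readers is rejected for a non-admin user. */
    static boolean isGuarded(Book book) {
        Book.currentUserIsAdmin = false;
        try {
            book.getReaders();
            return false;
        } catch (AdminRightsNeededException e) {
            return true;
        }
    }
}
```

`GuardDemo.isGuarded(new Book())` reports the guard as active, while `GuardDemo.isGuarded(new LazyBook())` does not. A test suite that only ever instantiates `Book` would therefore never notice the difference.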

Also note that implementing lazy loading this way is something of a guess by persistence that business might not need some fields (or at least might not need them immediately). This raises the idea of letting business tell persistence which fields are needed for a given operation, which leads to our next variation.

Variation 2: Define Only Interfaces

Let’s consider one of the fundamental ideas behind Clean Architecture again: business shall define what is expected from persistence and this set of expectations might contain attribute-level details, too. 

In our example above, to list all book titles with their lengths, the object that persistence provides to business does not even have to contain an ISBN or a list of readers. This can be expressed with interfaces, such as:

public interface LibraryPersistenceProvider {
    <T extends HasTitle & HasNumberOfPages> List<T> getBooks();
}

In this construct, the business entities might be actual classes that simply implement all the HasXXX interfaces, but it is also possible for the business to define its business model purely by using interfaces. For example:

public interface HasTitle { String getTitle(); }
public interface HasIsbn { String getIsbn(); }
public interface HasReaders { List<HasName> getReaders(); }
public interface HasNumberOfPages { int getNumberOfPages(); }

public interface Book extends HasTitle,
                              HasIsbn,
                              HasReaders,
                              HasNumberOfPages { }
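To make the idea concrete, here is a hedged sketch of both sides of the generic `getBooks()` method. The classes `LibraryLogic`, `InMemoryLibraryPersistence` and `Row`, as well as the output format, are assumptions for illustration. Note that the adapter needs an unchecked cast, because the type parameter `T` is chosen by the caller rather than by the implementation.

```java
import java.util.ArrayList;
import java.util.List;

interface HasTitle { String getTitle(); }
interface HasNumberOfPages { int getNumberOfPages(); }

interface LibraryPersistenceProvider {
    <T extends HasTitle & HasNumberOfPages> List<T> getBooks();
}

/** Business logic: sees only the two capabilities it asked for. */
class LibraryLogic {
    private final LibraryPersistenceProvider persistence;

    LibraryLogic(LibraryPersistenceProvider persistence) {
        this.persistence = persistence;
    }

    List<String> listAllBooksWithLengths() {
        List<String> result = new ArrayList<>();
        // 'var' lets the compiler infer the intersection type HasTitle & HasNumberOfPages.
        for (var book : persistence.getBooks()) {
            result.add(book.getTitle() + " (" + book.getNumberOfPages() + " pages)");
        }
        return result;
    }
}

/** Persistence adapter: free to return its own row type. */
class InMemoryLibraryPersistence implements LibraryPersistenceProvider {

    private static final class Row implements HasTitle, HasNumberOfPages {
        private final String title;
        private final int pages;

        Row(String title, int pages) {
            this.title = title;
            this.pages = pages;
        }

        @Override public String getTitle() { return title; }
        @Override public int getNumberOfPages() { return pages; }
    }

    private final List<Row> rows = new ArrayList<>();

    void add(String title, int pages) {
        rows.add(new Row(title, pages));
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T extends HasTitle & HasNumberOfPages> List<T> getBooks() {
        // Unchecked cast: safe only as long as callers rely on nothing
        // beyond the two declared bounds.
        List<?> copy = new ArrayList<>(rows);
        return (List<T>) copy;
    }
}
```

The business method never learns that `Row` exists, and `Row` carries neither an ISBN nor a reader list because nobody asked for them.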

At first sight, this feels more than odd: the business clearly does not know what actual instances it uses in its internal flows. On the other hand, unless every class of the domain model is final, this is true for the previous variations, too; it was just hidden behind the general fact that classes can be extended.

The main benefits of this approach are:

  • It is clear to the business that it uses instances coming from foreign parts of the code, so it shall not assume anything that is not explicitly described in an interface (for example, that a getter also contains a security guard).
  • The business can define exactly what it needs for a given flow.
  • Business methods cannot hide the intentions of the further methods they call.

To understand the last point, take a look at the following example:

private void doSomeBusinessFlow() {
    final var instance = fetchFromDB();
    doSomeAction1(instance);
}

private <T extends HasTitle & HasIsbn & HasReaders> void doSomeAction1(T onThis) {
    // ... some business logic with onThis.getReaders() ...
    doSomeAction2(onThis);
}

private <K extends HasTitle & HasIsbn> void doSomeAction2(K onThis) {
    // ...
}

private <L extends HasTitle & HasIsbn & HasReaders> L fetchFromDB() {
    // ...
}

Here, assuming that doSomeAction2 wants to access Title and ISBN, doSomeAction1 has no way to hide this fact, even if it only wants to access the Readers. Using traditional business objects (like having one Book class) would not bring this kind of transparency. 

Note that this holds not only when the business code is first written; it must be maintained all the time. If doSomeAction2 is reworked and needs access to NumberOfPages too, this change must be propagated through all the calling methods, up to the point where the instance is fetched from the DB. This can help to identify overgrown business methods: the more attributes a method expects, the more it is probably doing, which might indicate that it should be a target for refactoring.

It is also worth noting that by using inline classes and lambdas, creating instances for the given requirements is really straightforward.
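As a hedged illustration, the sketch below reads "inline classes" as classes declared on the spot (local or anonymous classes), which is an interpretation rather than something stated above. A lambda is enough when a single one-method interface is required; a local class can satisfy several interfaces at once. The names `InstanceSketch`, `describe`, `titleOnly` and `describeRow` are hypothetical.

```java
interface HasTitle { String getTitle(); }
interface HasNumberOfPages { int getNumberOfPages(); }

class InstanceSketch {

    /** Business-side consumer with an explicit two-attribute requirement. */
    static <T extends HasTitle & HasNumberOfPages> String describe(T book) {
        return book.getTitle() + ": " + book.getNumberOfPages();
    }

    /** One required attribute: HasTitle has a single method, so a lambda will do. */
    static HasTitle titleOnly(String title) {
        return () -> title;
    }

    /** Several required attributes: a local class can implement all of them. */
    static String describeRow(String title, int pages) {
        class Row implements HasTitle, HasNumberOfPages {
            @Override public String getTitle() { return title; }
            @Override public int getNumberOfPages() { return pages; }
        }
        return describe(new Row());
    }
}
```

A lambda cannot implement `HasTitle & HasNumberOfPages` together, since that combination has two abstract methods; this is why the second case falls back to a local class.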

Conclusion

As shown above, there are multiple ways to implement data access while adhering to the ideas of Clean Architecture. It is also clear that there is no golden hammer that fits all possible situations. This is exactly why considering unusual solutions is important: there are cases where they are a better fit than their usual counterparts.

As hinted above already, full code examples for all three presented variations with the example domain of a library can be found in the GitHub repository linked earlier.

Additional Remarks

The first and most important remark is that Clean Architecture covers way more aspects than summarized above. It is suggested for every IT professional (even for non-developers) to get familiar with it.

As a further remark, note that all variations are tested by the same set of unit tests, which is independent of the variation details. The trade-off is that LibraryService has to return Object rather than variation-specific classes.