Improving Perceived Performance, Hexagonally
In Hexagonal Architecture, who's responsible for running on a separate thread: the Outbound Adapter implementation, or the Application Services Layer that coordinates that work?
Why Ensembler is Slow
I’ve been trying to figure out how to make Ensembler, my scheduling application for group programming sessions (aka MobReg), more responsive. It’s an MPA (Multi-Page App, aka old-school web app), so when you click a button, the server does some work, then redirects you to a new page to see the result. I want that work to not take any time (or at least not hold up the redirect), unless it must be completed in order to display the results on the next page.
For Members (users of Ensembler), the main buttons they can click are Accept and Decline (to accept or decline attending a scheduled Ensemble session). The time spent on the back-end here is mostly sending a single confirmation email (after an Accept), which takes around 100-300ms (depending on the response time of SendGrid). Persisting their RSVP takes almost no time.
However, for me, the admin, actions such as “create a new Ensemble” can take several seconds due to blocking delays in two places:
- Sending emails to members: When a new Ensemble is scheduled, Ensembler doesn’t send an email in bulk to every member. Instead, it sends a customized email to each member so it contains the date/time of the Ensemble in their time zone[1]. This means the total time is linear in the number of members:
  number of members * ~200ms
  There are currently ~15 members (so roughly 15 * 200ms = 3 seconds per new Ensemble), and that number will only grow, so this is the biggest time sink.
- Creating the Zoom meeting: only done once. Takes about 500ms.
Neither of those actions needs to be completed to return a refreshed administration page, so they can both be async.
That is, both the email sending and Zoom meeting creation can run on a background thread, while the server returns from the HTTP request.
The persistence (via ensembleRepository.save()) of the newly scheduled Ensemble needs to be synchronous (otherwise I might not see what I just scheduled!), but it takes less than 50ms, so it’s fine.
Completable Futures to the Rescue
To make the two actions happen “in the background”, I figured I could wrap those actions in CompletableFuture[2] objects, which make it easy to run a fragment of code on a separate thread[3], without having to write complex exception-handling code.
To do this, I’d change the application service (EnsembleService) so that the calls to the Notifier[4] and the VideoConferenceScheduler[5] are wrapped in CompletableFutures using the runAsync() method.
Like this:
CompletableFuture.runAsync(
    () -> notifier.ensembleScheduled(
        ensemble,
        URI.create("https://mobreg.herokuapp.com/"))
);
Easy to do, but I worried that tests checking for a method call (using a pre-programmed Mock) might not see it, as the whole point is to return from the method before the action was complete.
In other words, the test might finish before the async action had a chance to complete.
I was right.
It turns out the ensembleScheduledWithMeetingLinkThenScheduledNotificationIsSent test fails:
org.opentest4j.AssertionFailedError:
[ensembleScheduled() should have been called 1 time,
but was called 0 times.]
at com.jitterted.mobreg.application
.EnsembleServiceEnsembleScheduledNotificationTest$MockEnsembleScheduledNotifier
.verify(EnsembleServiceEnsembleScheduledNotificationTest.java:85)
at com.jitterted.mobreg.application
.EnsembleServiceEnsembleScheduledNotificationTest
.ensembleScheduledWithMeetingLinkThenScheduledNotificationIsSent
(EnsembleServiceEnsembleScheduledNotificationTest.java:59)
Adding a Thread.sleep(10) before the test verifies the method call is enough to get it to pass (yes, only 10ms!), but that is the Wrong Way to verify async behavior in a test (any Thread.sleep() is a test smell).
There are various Right Ways to do this: inject the executor, get access to the CompletableFuture (so you can .join() it), etc.
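As a minimal sketch of those first two options, assume a pared-down service whose constructor takes an Executor. The names here (SketchNotifier, CountingNotifier, SketchEnsembleService, scheduleEnsemble) are hypothetical stand-ins rather than Ensembler's actual classes, and the Ensemble is reduced to a String name. Production code would pass in a real thread pool; a test can pass Runnable::run, which runs the task on the calling thread, so the notifier has already been called by the time the test verifies it.

```java
import java.net.URI;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified port: the real Notifier takes an Ensemble, not a String.
interface SketchNotifier {
    void ensembleScheduled(String ensembleName, URI registrationLink);
}

// Hand-rolled mock that only counts how often it was called.
class CountingNotifier implements SketchNotifier {
    private final AtomicInteger calls = new AtomicInteger();

    @Override
    public void ensembleScheduled(String ensembleName, URI registrationLink) {
        calls.incrementAndGet();
    }

    int callCount() {
        return calls.get();
    }
}

class SketchEnsembleService {
    private final SketchNotifier notifier;
    private final Executor executor; // injected so tests can control threading

    SketchEnsembleService(SketchNotifier notifier, Executor executor) {
        this.notifier = notifier;
        this.executor = executor;
    }

    // Returning the future is the second option: a caller that cares can join() it.
    CompletableFuture<Void> scheduleEnsemble(String ensembleName) {
        return CompletableFuture.runAsync(
                () -> notifier.ensembleScheduled(
                        ensembleName,
                        URI.create("https://mobreg.herokuapp.com/")),
                executor);
    }
}
```

In a test, new SketchEnsembleService(mock, Runnable::run) makes the call synchronous, so no sleep is needed. With a real pool, the test could instead call scheduleEnsemble(...).join() before verifying.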
Violating Hexagonal Architecture?
In a way, though, it feels like I’ve violated Hexagonal Architecture.
Why does the EnsembleService take so long to do its job?
Because of infrastructure (calls to remote services).
If we make the Outbound Adapters return immediately (and do their work on a separate thread, perhaps leveraging Spring’s @Async, or Futures), then the problem goes away.
Or does it? 🤔
What if I need the Zoom meeting information returned by the API call in order to include it in the emails I send?
Then I can’t let the ZoomScheduler do its work later, on a separate thread.
In that case, I need to wait on the answer in order to send the emails with the link.
Obviously, I also need to save the Zoom link in the database, so I have to wait for it anyway, but that doesn’t need to hold up the redirect (so doesn’t affect perceived performance).
It seems like this is a decision that the Application Services Layer might need to make, and possibly even the Domain Layer. Since those layers are the ones that use the returned information, whether a given call runs in the background would be left up to them.
Suppose we define the Port interface so that its methods wrap the return value in a CompletableFuture. That is, instead of:
public interface Notifier {
    void ensembleScheduled(Ensemble ensemble, URI registrationLink);
}
We do:
public interface Notifier {
    CompletableFuture<Void> ensembleScheduled(Ensemble ensemble, URI registrationLink);
}
And we can ignore the return value (it carries no result, since it’s a CompletableFuture<Void>), letting it run and finish in the background.
Or, we can do a .join() to block until the notifier completes.
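To make those two choices concrete, here is a sketch with a fake adapter standing in for SendGridNotifier. Everything named below (AsyncNotifier, FakeEmailNotifier, NotifierUsage) is hypothetical, a String stands in for the Ensemble, and "sending" is simulated by adding to a list. The point is only that the adapter picks the threading, while the caller picks whether to wait.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical variant of the port; String stands in for Ensemble.
interface AsyncNotifier {
    CompletableFuture<Void> ensembleScheduled(String ensembleName, URI registrationLink);
}

// Fake adapter: records "sent" emails instead of calling SendGrid.
class FakeEmailNotifier implements AsyncNotifier {
    final List<String> sent = new CopyOnWriteArrayList<>();

    @Override
    public CompletableFuture<Void> ensembleScheduled(String ensembleName, URI registrationLink) {
        // The adapter, not the caller, decides to run this on another thread.
        return CompletableFuture.runAsync(
                () -> sent.add(ensembleName + " -> " + registrationLink));
    }
}

class NotifierUsage {
    // Ignore the future: the call returns right away and the work finishes in the background.
    static void fireAndForget(AsyncNotifier notifier, String name, URI link) {
        notifier.ensembleScheduled(name, link);
    }

    // Block the calling thread until the adapter has finished.
    static void sendAndWait(AsyncNotifier notifier, String name, URI link) {
        notifier.ensembleScheduled(name, link).join();
    }
}
```

One trade-off worth keeping in mind: join() rethrows a failure from the background task (wrapped in a CompletionException), whereas an ignored future fails silently unless something like exceptionally() is attached to it.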
For the Zoom meeting creation, we can change:
public interface VideoConferenceScheduler {
    ConferenceDetails createMeeting(Ensemble ensemble);
}
to be:
public interface VideoConferenceScheduler {
    CompletableFuture<ConferenceDetails> createMeeting(Ensemble ensemble);
}
And then the adapter can use supplyAsync() internally, while the service uses .thenAccept() to update the meeting link in the Ensemble object[6].
// createMeeting() already returns a CompletableFuture,
// so the service doesn't wrap it in supplyAsync() again
videoConferenceScheduler.createMeeting(ensemble)
    .thenAccept(conferenceDetails -> {
        // update ensemble with info from details
    })
    .join();
Application Services Coordinate & Orchestrate
Doing this means the service only has to coordinate those futures, but doesn’t need to execute them—they’re already running on their own thread. Coordination is hard enough, so letting the Outbound Adapter implementation decide how it’s going to run seems like the right split. I think that’s more in the spirit of Hexagonal Architecture.
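Here is a rough sketch of what that coordination could look like when the emails need the Zoom link, so the notification has to wait for the meeting to be created. The types below (MeetingInfo, MeetingScheduler, LinkNotifier, CoordinatingService) are hypothetical simplifications of the ports above, with a String in place of the Ensemble, and the notifier is assumed to take the meeting link as its second argument. The service only chains the futures with thenCompose(); it never picks a thread itself.

```java
import java.net.URI;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for ConferenceDetails.
record MeetingInfo(URI joinLink) {}

// Hypothetical stand-in for the VideoConferenceScheduler port.
interface MeetingScheduler {
    CompletableFuture<MeetingInfo> createMeeting(String ensembleName);
}

// Hypothetical stand-in for the Notifier port, taking the meeting link.
interface LinkNotifier {
    CompletableFuture<Void> ensembleScheduled(String ensembleName, URI meetingLink);
}

class CoordinatingService {
    private final MeetingScheduler scheduler;
    private final LinkNotifier notifier;

    CoordinatingService(MeetingScheduler scheduler, LinkNotifier notifier) {
        this.scheduler = scheduler;
        this.notifier = notifier;
    }

    // Sequences the two steps: notify only after the meeting exists.
    // Where each step runs is up to the adapters that created the futures.
    CompletableFuture<Void> scheduleEnsemble(String ensembleName) {
        return scheduler.createMeeting(ensembleName)
                .thenCompose(info ->
                        notifier.ensembleScheduled(ensembleName, info.joinLink()));
    }
}
```

The web request can return without touching the resulting future, while a test (or any caller that needs the result) can call .join() on it.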
This will be one of the things I implement on a future Live Coding stream on Twitch.
What do you think? Are there other ways to handle time-consuming operations in your infrastructure without blocking? Let me know on Twitter or my Discord.
Yes, there are ways to do this in a single API call, but let’s assume there aren’t, or that the time for the API call to return is linear in the number of emails. ↩︎
See the API docs: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/CompletableFuture.html ↩︎
It will either create and manage a new Thread for you, or use the ForkJoinPool.commonPool(). ↩︎
An Outbound Port interface implemented by SendGridNotifier to send emails via SendGrid’s Java SDK. ↩︎
A different Outbound Port interface implemented by ZoomScheduler to create meetings via Zoom’s API. ↩︎
Of course, you would have to use a separate transaction to update the Ensemble’s information. ↩︎