Using Sequence diagrams to find missing classes

24 Aug 2023 - Jorge Garcia

Last updated: 2023-10-06


This is part of the lessons learned from Practical Object-Oriented Design in Ruby by Sandi Metz.

In chapter 4, she suggests using sequence diagrams to find missing classes. A couple of hints help us discover these classes:

This is best done before coding, since drawing a diagram is cheaper than writing code and having to roll back changes later.

Context

We wanted to create a data pipeline (called DataFeed here) fed by resources from an external REST API provider, along with the following:

The main problem for this DataFeed is keeping track of the oldest ongoing trip, since the next enqueued job will start from that trip’s start date.

Assuming the data feed runs for the first time, there are 2 possibilities. The time range will be:

starts_at = Time.now - 1.day
ends_at = Time.now
  1. No ongoing (in progress) trips: we save ends_at, and the next time the data feed runs, it starts from there.
  2. At least 1 ongoing trip: we look for the oldest starts_at among the ongoing trips, and the next time the data feed runs, it starts from there.

And repeat ^
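The two cases above can be sketched in plain Ruby. This is only an illustration under assumptions: the Trip struct, its ongoing flag, and the next_cursor method are hypothetical names that do not come from the real code base, and the day is expressed in seconds so the snippet runs without ActiveSupport.

```ruby
# Hypothetical value object; the real trips come from the provider's API.
Trip = Struct.new(:starts_at, :ongoing, keyword_init: true)

# Returns the time the next data feed run should start from.
def next_cursor(trips, ends_at)
  ongoing = trips.select(&:ongoing)

  # Case 1: nothing in progress, so the next run starts where this one ended.
  return ends_at if ongoing.empty?

  # Case 2: start again from the oldest trip that is still in progress.
  ongoing.map(&:starts_at).min
end

ends_at   = Time.utc(2023, 9, 8, 13, 30)
starts_at = ends_at - 24 * 60 * 60 # plain-Ruby stand-in for 1.day

finished  = Trip.new(starts_at: starts_at + 3600, ongoing: false)
ongoing_a = Trip.new(starts_at: starts_at + 7200, ongoing: true)
ongoing_b = Trip.new(starts_at: starts_at + 1800, ongoing: true)

next_cursor([finished], ends_at)                       # case 1: ends_at
next_cursor([finished, ongoing_a, ongoing_b], ends_at) # case 2: ongoing_b.starts_at
```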

We already had something in place to keep track of this cursor:

Trips belong to a record called GpsDevice. We used a jsonb column on these records to keep track of the different cursors, not only for trips but also for other data feeds.

However, this wasn’t that practical: each time we needed to update the cursor, we also needed to lock! the device to prevent another data feed from overwriting the content.

Code example:

device = GpsDevice.find(id)

device.api_metadata
# => {
#   "trips_next_pointer" => "8-09-2023 13:30",
#   "sensor_data_next_pointer" => "8-09-2023 12:55"
# }

# Both data feeds can run at the same time, so each job needs to lock the record
# to prevent one feed from overwriting the other's changes to the jsonb column.

# with_lock opens a transaction and reloads the record with a row lock.
# It does not yield the record, so the block uses the outer `device`.
device.with_lock do
  # find next pointer...
  device.api_metadata["trips_next_pointer"] = time

  device.save!
end
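To see why the lock is needed, here is a small plain-Ruby simulation of the lost update. It is a sketch under assumptions: the row is an ordinary Hash standing in for the database record, the pointer values are made-up labels, and both method names are hypothetical. The point it illustrates is that each feed writes the whole jsonb value back, so a stale copy erases the other feed's change.

```ruby
# Two feeds load the same column, each changes one key, and each writes
# the whole column back. `row` is a plain Hash standing in for the DB row.
def run_without_lock(row)
  trips_copy  = row[:api_metadata].dup # trips feed loads the record
  sensor_copy = row[:api_metadata].dup # sensor feed loads it at the same time

  trips_copy["trips_next_pointer"] = "new-trips"
  row[:api_metadata] = trips_copy      # trips feed saves

  sensor_copy["sensor_data_next_pointer"] = "new-sensor"
  row[:api_metadata] = sensor_copy     # sensor feed saves its stale copy
  row
end

# With a lock, the second feed can only load the record after the first
# one has saved, so it always starts from fresh data.
def run_with_lock(row)
  trips_copy = row[:api_metadata].dup
  trips_copy["trips_next_pointer"] = "new-trips"
  row[:api_metadata] = trips_copy

  sensor_copy = row[:api_metadata].dup
  sensor_copy["sensor_data_next_pointer"] = "new-sensor"
  row[:api_metadata] = sensor_copy
  row
end

fresh_row = lambda do
  { api_metadata: { "trips_next_pointer" => "old", "sensor_data_next_pointer" => "old" } }
end

run_without_lock(fresh_row.call)[:api_metadata]["trips_next_pointer"] # "old": the trips update was lost
run_with_lock(fresh_row.call)[:api_metadata]["trips_next_pointer"]    # "new-trips"
```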

Based on the above, that solution wasn’t that practical, so I knew something was off.

The first sequence diagram was born following the existing implementation.

First Diagram

diagram

Additional context:

This isn’t that bad, but it has the following problem:

Let’s try to imagine the private methods on the DataFeed we need to execute:

To check whether a method violates the Single Responsibility Principle (SRP), I like to use the interrogation method.

OK, DataFeeds::ProviderTrip, answer me:

It seems the methods that resolve the cursor may not belong here.

And don’t get me started on the collection of trips having to be an instance, and each trip an instance as well.

OK, let’s give it a second try:

Second Diagram

diagram

This is much better!

However, the problem of locking the record persists. Each GpsDevice is expected to be updated multiple times per day, and other DataFeeds use it too, so skipping the lock would be ideal. Also, this is debatable, but I do think that storing the cursor on the GpsDevice can be a violation of the SRP. A counter-argument is that using the same object means 1 less object to query, and I see that as a valid point.

And so the third diagram was born.

Third and Final Diagram

diagram

With this approach, the cursor logic is consolidated in a single place: the DataFeedCursor model. We’re not violating the SRP anymore, and the cursor resolution can be extended as needed without impacting other classes.
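As a rough idea of what that consolidation could look like, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the real DataFeedCursor is a model whose columns are not shown in this post, and the attribute names (gps_device_id, feed_name, next_pointer) and the advance method are hypothetical.

```ruby
# Hypothetical plain-Ruby stand-in for the DataFeedCursor model:
# one cursor per (device, feed) pair, owning its own resolution logic.
class DataFeedCursor
  attr_reader :gps_device_id, :feed_name, :next_pointer

  def initialize(gps_device_id:, feed_name:, next_pointer: nil)
    @gps_device_id = gps_device_id
    @feed_name = feed_name
    @next_pointer = next_pointer
  end

  # Moves the cursor to the oldest ongoing start time,
  # or to ends_at when nothing is ongoing.
  def advance(ongoing_starts:, ends_at:)
    @next_pointer = ongoing_starts.min || ends_at
  end
end

ends_at = Time.utc(2023, 9, 8, 13, 30)

trips_cursor  = DataFeedCursor.new(gps_device_id: 1, feed_name: "trips")
sensor_cursor = DataFeedCursor.new(gps_device_id: 1, feed_name: "sensor_data")

trips_cursor.advance(ongoing_starts: [ends_at - 3600, ends_at - 600], ends_at: ends_at)
sensor_cursor.advance(ongoing_starts: [], ends_at: ends_at)
```

Because each feed works on its own cursor object, neither one has to touch (or lock) state that belongs to the other feed or to the GpsDevice itself.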

Summary

Take advantage of sequence diagrams and use them as needed: they’re a powerful yet cheap way to validate our solutions and find new classes.