Refactoring Ruby: DRY up your domain models using a struct table

When you go past a managed forest, you see a mass of tree trunks. Then at a certain point you look again, and you realize they're all in perfect rows. Clarity… what you've been looking at from the wrong angle and not seeing at all.

— The Edge

One of the most powerful problem-solving tools available to us as programmers is the ability to look at a problem from a new angle. The hard part is knowing what other angles we might be missing… and knowing the tools that can make such a shift in perspective possible. 

That's why it can be helpful to watch other programmers demonstrate the perspective changes that they've used to solve big problems. In the screencast and blog post below, you'll watch over the shoulder of RubyTapas guest chef Sam Livingston-Gray as he uses Ruby's class, the Vim editor, and some clever formatting to rearrange his view of a refactoring problem. Each individual step will appear simple by itself… but the end result seems almost miraculous.

Enjoy!

— Avdi

 

A model full of lists

Once upon a time, I joined a team working on a specialized customer relationship manager. As users interacted with our public web app, it would track certain actions and make them available for analysis in our CRM. We also kept track of the contacts that our salespeople had with those end users. For historical reasons, both of these kinds of data were represented by the same SchmactiveRecord model, called UserActivity. Different kinds of activities were distinguished by an “action” field, which had over a dozen possible values.

Looking at the UserActivity model, I noticed a few constants that consisted entirely of lists of values for the “action” field.

Now, in Ruby, all constants are public unless you go out of your way to hide them. So as soon as I see a constant being defined, I immediately wonder how and where it's getting used. I did a quick grep for each one of these lists, and I found that they were used in two different ways.

One way that these lists were being used was to populate select boxes in the web app. These lists also showed up in conditional checks inside various methods in the UserActivity class. As soon as I saw that, I knew this app was missing a concept. Implicit in both the lists and the conditionals is the idea of some behavior that's common among all the UserActivity records with the same value in their “action” field.

Now, because these lists are referenced elsewhere in the app, I can't just get rid of them. But I *can* change the way they get built.

Revealing missing concepts with a new class

In keeping with the first “Once” in “Once and Only Once”, I'm going to make that missing concept explicit, and to do that, I need to give it a name. After a minute or two of thinking about it, I remember one of my rules about naming things: if I can't think of a good name, I should pick a name that's easy to grep for and change later. Because this is about the action field, I'll just call it “action type”.

I need an object that's going to keep track of attributes that are associated with each possible value in the action field. My tool of choice for this is the Struct class. I'll start it out with two attributes: :action and :user_initiated.

I'm going to create a new empty list to hold instances of the new ActionType class, and then populate it with a copy of the USER_INITIATED list.

I'll use the entries in that list to populate the :action field of the ActionType structs, and I'll set the :user_initiated field to true for all of them.

And now that I've got that list of action types, I'll use it to generate the list of user-initiated actions. For now, this is really simple: I'll just map the :action method across the entire list.

That cleaned up the definition of that USER_INITIATED constant, but the ActionType definitions look a little funky. This is already looking like a table, so I'll use an excellent Vim plugin called Tabularize to wrangle the whitespace for me.

I also know that I'm going to add more rows to this table, so I'm going to give myself a commented-out row at the bottom that I can use as a template later.

Next up, let's look at that list of SALESPERSON_ACTIVITIES. I'll start by adding a :salesperson attribute to the ActionType Struct, and set it to false for all of the existing action types.

Normally, when I'm refactoring, I try not to pay attention to what the code I'm looking at *means*; I just treat it as a bunch of symbols to be moved around according to a set of rules. But I happened to notice that the list of salesperson activities is completely disjoint from the list of user-initiated ones. Which means I can copy the next list into this one, add some boilerplate before and after the action, and make sure that the :salesperson value is set to true for these new rows.

As soon as I do this, though, I break a test. Because the definition of USER_INITIATED is mapping :action across the entire list, it now has some extra elements in it. The fix is to filter on the :user_initiated attribute before doing the map.

With the tests back to green, I can change the definition of the SALESPERSON_ACTIVITIES list. Again, I'll filter, then map.

This process gets a little repetitive, so I'm going to handwave past the part where I add the third and fourth columns for EXEMPT and ACTIVITIES_WITH_RECEIVERS.

I notice that these four constants all follow the same pattern, so I'll DRY up some of the boilerplate, to make it easier to see what's interesting about each list.

Clarifying the table

This is already starting to communicate in new ways, but there's still room for improvement. For one thing, the syntax highlighting doesn't make it especially easy to distinguish ‘true' from ‘false'. For our purposes, ‘false' isn't a very interesting value, so I'd like to deemphasize it a bit. I'll swipe a trick from the Ruby Koans project, and define “double underscore” as a stand-in for “false”.

Once I've done that, I can do a quick find-and-replace in the ACTION_TYPES list…

While I'm paying attention to formatting, it wouldn't hurt matters any to have some headers on those columns…

…and now it's a lot easier to notice that “exempt” is highly correlated with “user_initiated”. In fact, except for that “made_note” entry, they're the same list.

When I asked what “exempt” meant, I learned that our salespeople are especially interested in *new* users. When someone signs up, management wants to make sure one of our salespeople talks to them right away. So the user model has a “new to the site” flag, which should remain set until one of our salespeople tries to contact that user. “Exempt” refers to an action that *doesn't* clear this flag.

Like a lot of humans, I have a hard time with double negation, so I try to get rid of it whenever possible. I'd rather say that some actions clear the new flag, and I can express that by adding another column called “clears_new_flag” and making it the inverse of the “exempt” column.

… then, since Struct lets us define methods on the class it generates, I'll override its default implementation of its :exempt method…

… and now I can remove the :exempt column. I still have a list of exempt actions, but now the table is expressed in terms of what an action *does*, rather than what it *doesn't* do.

The benefits of a new perspective

This might seem like an awful lot of work just to change the way four arrays get built… and, yeah, it sort of is. I don't put this level of effort into every source file I run across. But this was one of the most important models in its application, and as a developer new to the codebase, I'd been in this file several times and had found it confusing to work with.

The moment when I've just stopped being confused about a piece of code is a valuable window of opportunity for me to make that code clearer: I get to take my new understanding, as well as my very fresh memories of what information I was missing, and then take a few moments to make that information more obvious.

By changing the representation of these lists, I've taken my own questions about what this all means, and I've used them to organize important information about the program into a more compact and accessible form. As developers, we've already got a lot of experience reading tables; formatting this source file as a table makes it easier to spot certain kinds of patterns.

For example, now that I'm looking at this, I can see that there are two activities related to text messages: “sent_text_message” is a salesperson action, and “received_text_message” is not. But “received_text_message” also isn't user_initiated, which seems like maybe that's a bug. That mismatch was there all along, but it would have taken a lot more effort to find it before.

In a future episode, I'll start using this ActionType class to do some more traditional refactoring.

Until then, happy hacking!


Sam Livingston-Gray is a professional software developer, amateur podcaster, tea-swilling leftist, proud papa, and occasional punster. Despite his lack of appreciation for coffee or beer, he happily resides in a very Hobbit-friendly neighborhood of Portland, Oregon.

Sam podcasts at greaterthancode.com, and tweets at twitter.com/geeksam.

(Cover image: “Rows”, copyright (c) by Gary, made available under the Creative Commons Attribution 2.0 license.)