Practical Recursion?
So, in the process of answering questions on the FreeCodeCamp forums, and on The Odin Project's Discord group, and occasionally on Twitter, I've occasionally seen the question "Is there a practical use for recursion? I mean, it's a thing, but is it a thing we're ever going to actually use?"
And that is a valid question. When the lessons at this or that online curriculum have you use recursion to create an inverse array... is there a practical use for that, as opposed to simply array.reverse()
?
But this is not me saying that there's no use for recursion, quite the opposite. I was tinkering around with an idea, and the upshot is... this post!
A little background
I am working on building this blog up using Next.js and a few other bits. I plan to connect it to Google Firebase, in order to allow commenting (yes, from all two of you). In researching the best way of building comments, there ain't no right way. So I'm rolling my own, simply because I can, and because I enjoy my nerd-moments.
The question of how to structure those comments in the database was my first question. Originally, I was thinking that, being a document-based database system, they would structure naturally as a deeply nested object:
article
|_ comment
|_ comment
|_ nested comment
|_ reply to nested comment
|_ another one
|_ and back a level
|_ a third comment in this second one
|_ another root level comment
Seems eminently sensible, as it's mirroring the object as I might structure it in my front-end application:
const article = {
title: "an article",
content: "...",
children: [
{
id: 001,
author: <authorId>,
comment: "Such a crock!",
children: [
{
id: 011,
author: <anotherId>,
comment: "You just...!",
children: [
{
id: 013,
author: <fooId>,
comment: "I did not!",
children: []
}
]
},{
id: 012,
author: <bobsId>,
comment: "Interesting opinion...",
children: []
}
]
} // end of the first comment's tree...
]
} // and end of the article itself
And that's great... but everything you might read about Firestore (or other NoSQL databases) suggests that that approach might not be the best Kung Fu. The shallower the tree, the more responsive, would be my guess. So another approach is called for. Rather than nesting all the comments under their parent document, I tried simply having a /comments
collection in my Firestore, with each comment having a consistent structure:
const comment = {
id: <uuid>,
article: <uuid>,
author: <uuid>,
comment: <string>,
parent: <uuid>,
};
From this, we simply get a flat array of objects, each containing its own data, but also containing a reference to its parent. Comments at the root node are given a parent
value of null
.
The comment contains its own unique ID, the ID of the article to which it refers, the author of the comment's ID, and the ID of its parent comment. As we go forward, we may be adding a few more, but for our uses, this will let us start.
So now, when we query the database for all comments for a given article ID, we get an array of those comments. No heirarchy, no enforced order... simply a (potentially HUGE) array.
Time for an illustration, maybe?
So the structure of a comments block like I'm envisioning might be something like a Twitter post's. An article can contain any number of first-level comments, and each of those might contain any number below them.
The above is an illustration of a pretty common design pattern, known as the Composite Pattern. Each node in that pattern can be considered a branch, or a leaf. In either case, some sort of common functionality is applied to each of these.
How does that reflect what we're doing? Suppose the root
node in that diagram reflects our article, and the three nodes below it reflect the first-level comments. They contain exactly the same structure, whether they have comments descending from them or not.
So, armed with that information (a bunch of nodes forming a tree structure that function in exactly the same way) and what we're getting from my database (an array of elements that represent the nodes... as an array), how can we convert from the second to the first?
Start from pseudocode
We have this array of comments. We want to run some function on all of them, starting from the first-level comments. Those are the ones with a parent
property equal to null
.
The difference between the array we get from our database and the nested object we want is that each object in the tree contains a children
property, which is an array of comments in reply to this particular parent. So, we need to create a children
property, and either set it to []
(an empty array), or to an array of those children elements.
Let's try out some pseudocode! Note that this is not, by any stretch, formal pseudocode. This is my own internal dialog outlining notes.
buildTree is a function that takes an array of *all* the elements, and a parent id.
First, create a new array of those elements with that parent id as their parent property.
Then, look at the length of that array. Is it zero?
If it is, then there are no children - return an empty array.
If it is *not*, then we have children. A little more processing.
For each child node, we want to add a children property, which will contain the result of calling buildTree with the same array of all the elements, and this current element's ID.
Once we've added that children property, return the updated element to our array.
Finally, having processed all the child nodes within the given parent, return the completed array.
That's the entire function. And note, when we have an array of children elements that we're processing, we call that same buildTree
function again, only we pass this particular element's ID as the parentId
parameter.
Hey, that's RECURSION!
Told you we'd get there. Yeah, this is a practical use for recursion. It doesn't make sense to use a for loop here, or most other looping mechanisms - we don't know how deeply nested they might be, or how we might handle the nesting ( a for loop inside a for loop inside a... EEESH!)
But the point of the composite pattern is that all the elements are handled the same way, they are composeable. So each node will have buildTree()
called on it, and will either end up with an empty array, or an array of children.
To begin, we have this:
const buildTree = (fullArray, parentId=null) => {...}
In the parameters, note that I've included a default value for parentId
- if we don't provide one, assume null
. The next thing we want to do is to get all the children of the given parentId
:
const buildTree = (fullArray, parentId=null) => {
const children = fullArray.filter(element => element.parent===parentId)
}
Now, if you've read the article on Recursion, All The Way Down, you may remember the definition of recursion:
A function or set of rules applied to an element or collection of elements, calling itself repeatedly until an exit condition is met.
Complete paraphrase, but the point remains:
- collection of elements
- function or set of rules
- exit condition
In our case, the exit condition will be when there are no children for the given parentId
. When that happens, we simply want to return an empty array.
const buildTree = (fullArray, parentId=null) => {
const children = fullArray.filter(element => element.parent === parentId)
if (children.length ===0) return []
}
There we go, our exit condition is handled. Anything we do after that return line will only happen if children.length
is greater than zero.
So, what do we want to do next? We want to iterate over each child, and add that children
property to it.
const buildTree = (fullArray, parentId=null) => {
const children = fullArray.filter(element => element.parent === parentId)
if (children.length ===0) return []
return children.map( element => {
return {
...element,
children: /* and something happens here... */
}
}
}
So we use object destructuring to create a new object from our comment object, and we are adding a property to it called children
. But what value do we want it to have?
Here's where the recursive magic happens. We want the children
property to be either an empty array (if this comment has no children), or a tree of children.
Huh. We wrote a function like that somewhere...
const buildTree = (fullArray, parentId=null) => {
const children = fullArray.filter(element => element.parent === parentId)
if (children.length ===0) return []
return children.map( element => {
return {
...element,
children: buildTree(fullArray, element.id)
}
})
}
And in that line, we now call this same function within itself. In the event that the current element contains nothing, we don't go any deeper, we simply exit the recursion and return a []
. If, however, it does contain children, then we recursively call buildTree()
with each child's id.
Wait, that's it?
That's it, yup. It won't take anything more than a simple call, when I get the firebase array of comments, than just calling:
const comments = commentsRef.where('articleId', '===',articleId)
.get()
.then(querySnapshot => {
return buildTree(querySnapshot.docs)
})
The opening line is the firebase code, the then(...)
happens when the results have come back, and we simply call the buildTree(...)
function within that. By not passing it an id, we start the tree from the root of the comments tree, where the parent
is null
, and grow from there.
This is a lot, and it's a confusing topic. Please feel free to email or message me any questions, and be watching - the next post will be on a similar topic. I'd like to build a way to get elements based on their depth in the tree, as well as to drill through both parents and children from a given node.