Did you know that Khan Academy has not only videos, exercises, and computer programs but also articles? These articles form a major part of the art history curriculum as of today (6/1/2015), and as of this date I have yet to find someone outside the company who knew that they existed. If all goes well, hopefully that’ll change real soon! I plan to write about my project in detail soon; this sentence will probably show up in a lot more posts in the meantime.
There are a lot of things that need to happen to get the article writing system ready before the school year starts again. Here’s what my mentor Alex and I tackled first.
The Slug Problem
When an article is written on the platform, it receives a specific URL ending called a slug. For example, when you visit the article I linked above you can see the slug is
(The /a/ right before it in the URL is the content type: it says that this link points to an article.) Now I can copy and paste that URL all over the internet to show all my friends how cool art history is! That slug is inferred from the title of the article, so if I wrote an article titled “Hello world”, the slug would automatically get set to something like
hello-world, and if I change the title to “Hello world v2”, the slug will be updated to look like
And there’s the problem: when the slug changes, all previous links to that article break.
So, we have to decouple the article’s slug from its title so that changing one does not change the other.
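To make the problem concrete, here is a minimal sketch of a title-derived slug. The slugify function below is a hypothetical stand-in I wrote for illustration, not the helper the site actually uses, so the exact output format is an assumption.

```python
import re


def slugify(title):
    """Hypothetical title-to-slug helper (an assumption, not the site's real code)."""
    # Lowercase the title, then collapse each run of non-alphanumeric
    # characters into a single hyphen and trim hyphens from the ends.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


old_slug = slugify("Hello world")
new_slug = slugify("Hello world v2")

# Any link saved with the old slug stops matching once the title changes.
print(old_slug)
print(new_slug)
print(old_slug == new_slug)
```

Because the slug is recomputed from the title every time, there is nowhere for the old value to survive a rename.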
(For KA folks: here’s my diff, although it’s a bit noisy since I did another closely related feature along with this one.)
Tackling the Feature
We started out with the interface itself. Alex walked me through the relevant parts in the UI that would change while I asked questions.
We came up with a high-level breakdown of the final result as follows:
- Add a “Slug” input field to the article editor.
- When that field is edited, grab the result and send it to the server.
- Save the new value of the field into the datastore for an article.
As it turns out, our exploration revealed that videos already have this functionality so there was code we could reference. This was especially helpful since this was my first time ever looking into the codebase and there was a whole lot I was unfamiliar with.
We started with the interface. In a large codebase, how do I even know which file to edit so that my changes show up in the right places? Alex taught me perhaps the simplest way to do so: using the find-all / grep functionality in Sublime. Essentially, we look for text on the page that looks hardcoded and search for all occurrences, hoping that one of the search results will be what we want.
Thank goodness for Cmd+Shift+F
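If you don’t have an editor handy, the same find-all idea can be sketched in a few lines of Python. This is only an illustration of the technique: the function name find_all and the list of file extensions are my own assumptions, not anything from the codebase.

```python
import os


def find_all(root, needle, extensions=(".jsx", ".js", ".py")):
    """Return (path, line_number, line) for every line under root that contains needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at source files; the extension list is an assumption.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as source_file:
                for line_number, line in enumerate(source_file, start=1):
                    if needle in line:
                        matches.append((path, line_number, line.strip()))
    return matches
```

Calling find_all(".", "some text copied from the page") from the root of a checkout lists every file and line where that hardcoded text appears, which is usually enough to find the right view file.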
article-edit-view.jsx was the file that rendered that view. If the view uses React, another simple way to find stuff is to use Inspect Element and the React Devtools extension to find the React component and then find all occurrences of that.
Now I can grep for
Find-all makes hide-and-seek a little one-sided.
This came in handy since I not only found where the component is defined but also where it’s being used. The nice thing about starting from the view first is that I could make some small changes and refresh to verify that these indeed were the files I was looking for.
From there, it was relatively simple to add the input fields that I wanted since we were already using some components for the input elements themselves. I just copy-pasted and made some small edits to wind up with:
placeholder="Identifier shown in the URL"
title="Slug / readable ID"
Code reuse to the rescue!
We said that the field should be named readableID to be consistent with the way it was done for videos. Now to add the capability to store that data.
It ain’t pretty, but it’s there.
(I said something along the lines of “Now we can add a
readable_id column to the Article model” to my mentor, who chuckled. Apparently in Google App Engine speak this is translated to “Now we can add a
readable_id property to the Article entity.”)
Here, Alex helped me out and said that I was looking for the
article_models.py file since there wasn’t anything on the webpage I could easily
grep for. There are a bunch of article-related classes in this file, including
FrozenArticle, and a few others. Quick breakdown:
BaseArticle defines common properties and methods for the other article classes. It was fairly easy to find this out because the docstring said exactly that:
"""Base for Article and ArticleRevision w/ common properties & methods."""
Go docstrings.
ArticleRevision is a version of an article. Every time an article is edited, we make another one of these since they’re cheap. We use a similar type of versioning system for all our content since we have versions of videos, exercises, etc. which are live on the site but also versions that are being drafted. This system also has the nice benefit of letting us diff entities very easily (just compare the newest revision with the one that’s published).
FrozenArticle is essentially the version of an article that is published (visible to the public). It’s designed to be very fast, since it will never change but will be read often. Every time we publish content, we regenerate these so that they have the most up-to-date info.
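Here is a rough sketch of the revision idea, just to show why diffing comes almost for free. The Revision class and diff_revisions function are hypothetical and heavily simplified; the real entities live in the datastore and carry far more properties.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Hypothetical, simplified stand-in for one saved version of an article."""
    title: str
    body: str


def diff_revisions(published, newest):
    """Return unified-diff lines between the published body and the newest body."""
    return list(difflib.unified_diff(
        published.body.splitlines(),
        newest.body.splitlines(),
        fromfile="published",
        tofile="draft",
        lineterm="",
    ))


# Every edit appends a new revision instead of mutating an old one.
revisions = [Revision("Hello world", "First draft.")]
published = revisions[-1]  # "publishing" here just remembers which revision is live
revisions.append(Revision("Hello world", "First draft.\nA new paragraph."))

changes = diff_revisions(published, revisions[-1])
print("\n".join(changes))
```

Since old revisions are never modified, the published version and the latest draft are both always available to compare.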
So, making the change that I wanted (adding a new property to all articles) is done easily with
# Human readable, unique id that can be used in urls.
readable_id = db.StringProperty(indexed=True)
It turns out that on the editing view side, the form autosaves the changes you make for all the input elements (into a new
ArticleRevision!). This is done through Backbone and we didn’t explore that in depth at the time but I know I will sometime.
I quickly noticed that when I edited the field, the editor would save the results and tell me it was successful. However, on a page refresh, the slug I had entered would disappear, meaning that it wasn’t getting saved properly. Meanwhile, changes to any other field saved just fine.
Alex came to the rescue again and noted that we restrict the parameters that can get submitted through the form (for security reasons, like Rails’ strong parameters for my fellow Rails people). We added this line to
_f('readableId', source='readable_id'): sig.nullable(sig.string),
If you were paying attention, this is how we can specify an input for
readableId in the form and map it to
readable_id in the entity to follow the convention of using camelCase for JS and snake_case for Python.
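The idea behind that line can be sketched with a plain dictionary. To be clear, this is not the _f / sig API from the codebase; ALLOWED_FIELDS and filter_params are hypothetical names I made up to show the whitelist-and-rename pattern, and the isAdmin field is an invented example of something that should be rejected.

```python
# Hypothetical whitelist: form field name (camelCase) -> entity property (snake_case).
ALLOWED_FIELDS = {
    "title": "title",
    "readableId": "readable_id",
}


def filter_params(submitted, allowed=ALLOWED_FIELDS):
    """Keep only whitelisted fields, renamed to their entity property names."""
    return {
        allowed[name]: value
        for name, value in submitted.items()
        if name in allowed
    }


form_data = {"title": "Hello world v2", "readableId": "hello-world", "isAdmin": True}
clean = filter_params(form_data)
print(clean)

# Without the readableId entry, the field is dropped without any error,
# which matches the "saves fine, gone after refresh" behavior described above.
dropped = filter_params({"readableId": "hello-world"}, allowed={"title": "title"})
print(dropped)
```

A whitelist that silently ignores unknown fields is safe by default, but it also means a missing entry looks like a successful save.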
After this, the shiny new slug field saves properly!
Now, the final step: getting that new slug field into the URL. It turns out that looking at how it was done for videos made life really easy. In
BaseVideo there was a method called
"""See content.models.LearnableContent.slug for details."""
As the docstring suggested, I took a look in
LearnableContent, the class that
BaseArticle, and the other content classes inherit from. The
slug method was defined there as:
return util.slugify(self.title), so by default the slug of a particular piece of content is derived from its title. Since
BaseArticle didn’t define a
slug method it fell back to this default behavior, so whenever an article’s title changes its slug will change too. We came full circle to our original problem.
In BaseArticle we wrote:
return getattr(self, 'readable_id', None) or util.slugify(self.title)
to override the
slug method and get the whole thing working.
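Putting the pieces together, here is a self-contained sketch of the override. The class names mirror the ones in this post, but the bodies are simplified assumptions: slugify is a hypothetical stand-in for util.slugify, the classes are plain Python objects rather than datastore entities, and slug is written as an ordinary method.

```python
import re


def slugify(title):
    """Hypothetical stand-in for util.slugify (an assumption)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class LearnableContent:
    """Simplified base class: by default the slug is derived from the title."""

    def __init__(self, title, readable_id=None):
        self.title = title
        self.readable_id = readable_id

    def slug(self):
        return slugify(self.title)


class BaseArticle(LearnableContent):
    """Prefers an explicitly stored readable_id over the title-derived slug."""

    def slug(self):
        # Fall back to the title for articles that never had a slug set.
        return getattr(self, "readable_id", None) or slugify(self.title)


article = BaseArticle("Hello world", readable_id="hello-world")
article.title = "Hello world v2"
print(article.slug())  # the stored slug survives the title change

legacy = BaseArticle("Hello world v2")
print(legacy.slug())  # no stored slug, so the title is used
```

The fallback matters: articles created before the new property existed have no readable_id, so they keep their title-derived URLs instead of breaking.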
That was a lot of stuff. Still, I definitely struggled to keep the details to a reasonable level. This is my first “technical” post, so I’ll keep experimenting until I find a style I like.
There was a lot of stuff I left out and may actually never get around to. I’ll end with a couple tips I’d leave for my past self if I could.
- There are many moving parts in a large app and when diving in for the first time your mind simply cannot understand all of them at once. That’s okay. Ask for a high-level description of things you’re interested in and focus instead on what’s most relevant to you right now.
- Related to the above point: during my pairing session with Alex I noticed myself thinking very hard about the implementation – the code itself. When Alex pointed out higher-level things (e.g. “This might cause problems elsewhere.”) I could immediately see why what he said was true, but I felt like I wouldn’t have been able to come up with it myself. That’s okay too. Learn the details of the implementation well now so you can think at a high level soon.
- Ask lots of clarifying questions, because lots of times you’re wrong about how things work and without correction you may end up prey to the Law of the Broken Futon. You shouldn’t expect your mentor to attend to you 24/7, so be liberal with your questions while they’re around. Constantly summarize your thoughts as if you’re explaining them to a fifth-grader and be glad when you can’t; now you know which parts of your work you don’t understand well.
Thankfully, it’s not about impressing anyone with your code. So, choose instead to do whatever it takes to get greater learning and understanding.