How to Improve your Flashcard Knowledge Base

Even the best flashcard developers among us create bad cards on a regular basis (e.g. too long, ambiguous, useless information).

Given that we are all highly imperfect at writing flashcards, what can we do to improve these crummy cards as they age, so that we spend less time reviewing and remember the concepts better?

I call this process flashcard “refactoring” (a term borrowed from software development).

Why refactor flashcards?

Reviewing old flashcards requires time and effort. Here are a few reasons why it’s worth the price:

  • It improves your understanding of the material. The process of breaking learning material down into the smallest “chunks” possible that fit onto flashcards is an extremely valuable exercise. Reviewing troublesome cards clarifies what you don’t understand and forces you to restructure your knowledge in a way that makes sense.
  • Your worst flashcards take up a disproportionate amount of time and effort, while yielding the worst results in terms of retention and usefulness. Following the 80-20 rule, 20% of your cards lead to 80% of the effort in review. So it’s a high-value activity to hunt down this subset of your cards.
  • It provides knowledge construction training. Creating good flashcards is a nontrivial skill built over time. You can read Piotr Wozniak’s Twenty Rules of Formulating Knowledge, but actually observing your own performance on your cards and troubleshooting improvements takes your skills to the next level.

The process I use has two broad steps: selection and revision.

Selecting Problem Cards

I use two main methods to find cards needing review.

The first and most important method is finding cards I keep failing (“lapse” is the term in Anki). In the Anki browser, the search prop:lapses>n finds the cards that have lapsed more than n times. My cards never lapse more than 8 times, because at that point Anki marks the card as a leech and automatically suspends it. Cards that have lapsed 5 or more times are great candidates for refactoring.

The other method is “marking” cards during review when I notice a card is poorly formed. I also try to make notes on marked cards describing what’s causing the problems (coming back to the cards at a later time, you can easily forget the specific issue that tripped you up).
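The two selection methods above can be sketched in a few lines of Python. This is only an illustration: the Card class, its fields, and the sample deck are hypothetical stand-ins of mine, not Anki's own data model or API.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """Hypothetical stand-in for a flashcard; not Anki's own data model."""
    front: str
    lapses: int           # number of times the card has been failed
    marked: bool = False  # flagged by hand during review
    note: str = ""        # reminder of what went wrong


def refactoring_candidates(cards, min_lapses=5):
    """Return cards lapsed at least `min_lapses` times or marked by hand, worst first."""
    picked = [c for c in cards if c.lapses >= min_lapses or c.marked]
    return sorted(picked, key=lambda c: c.lapses, reverse=True)


# Made-up sample deck, for illustration only.
deck = [
    Card("Tail latency amplification", lapses=6),
    Card("Collection (Oracle SQL)", lapses=8),
    Card("Capital of France", lapses=0),
    Card("Object table syntax", lapses=2, marked=True,
         note="forgot what an object table is"),
]

for card in refactoring_candidates(deck):
    print(card.lapses, card.front, card.note)
```

Sorting worst-first means that if I only have time for a handful of cards, I spend it on the ones costing me the most reviews.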

Reviewing and Revising Problem Cards

The first step in examining a difficult card is to ask whether I need this knowledge at all. If not, that’s the end of the process – I just delete the card and I’m done with it. I may also revisit the source material or do some Googling on the topic, which will sometimes reveal that the card is pointless or inaccurate.

If I decide that it’s important and relevant knowledge I want to keep, then I’ll examine the card for issues, using the principles from Piotr Wozniak’s Twenty Rules of Formulating Knowledge.

Example

Consider this data engineering card from my deck which was recently giving me problems:

  • Side 1: Tail latency amplification
  • Side 2: Even if small % backend calls slow, chance of getting a slow call increases if user request requires multiple backend calls, and so a higher proportion of end-user requests end up being slow.

First off, is this card relevant and worthwhile? For me, the answer is definitely yes: it’s relevant both to my job as a data scientist and to my software engineering side projects.

Next, diagnose the problem. On closer examination, there are a few things wrong with the card:

In cases where I add material I don’t fully understand, I find the best approach is to go back to the source (which in this case is the book Designing Data-Intensive Applications by Martin Kleppmann). I then refactored the card like this:

  • Side 1: Tail latency amplification (Kleppmann)
  • Side 2: Multiple back-end calls for a single user request increase the chance of encountering tail latency. (Kleppmann)

As you can see, I added a source to clarify where the information came from (Rule 18: Provide source).

I was curious what other cards I have about tail latency, and it turns out there are none! It seems ridiculous to have a card about tail latency amplification but not a single one about tail latency, which is the more common term. Not having this in my deck probably contributed to interference, since I never tested myself on the distinction between the two concepts. So I added:

  • Side 1: Tail latency (Kleppmann)
  • Side 2: High percentile response time. (Kleppmann)

Note that the tail latency amplification card uses tail latency in its response. I’m hoping this will limit confusion between the two and emphasize the distinction (Rule 13: Refer to other memories). I also italicized amplification, to hopefully further avoid interference.
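As an aside, the amplification effect on the refactored card can be checked with a bit of arithmetic. Here is a minimal sketch, assuming each back-end call is slow independently with the same probability (a simplifying assumption of mine, not something stated on the card). The function name and the 1% figure are illustrative only.

```python
def chance_of_slow_request(p_slow_call, n_calls):
    """Probability that at least one of n independent back-end calls is slow.

    Assumes every call is slow with the same probability and that calls
    are independent of each other.
    """
    return 1 - (1 - p_slow_call) ** n_calls


# With 1% of back-end calls slow, compare requests needing 1, 10 and 100 calls.
for n in (1, 10, 100):
    print(n, round(chance_of_slow_request(0.01, n), 3))
```

Under these assumptions, a request that fans out to 10 back-end calls hits at least one slow call roughly 10% of the time, and a request that fans out to 100 calls does so about 63% of the time, even though only 1% of individual calls are slow.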

Since I’ve made these changes, I haven’t had any problems with these cards and feel like I have a better grasp on the material. Consider doing the same for the important knowledge in your decks causing you trouble.

Tips From Anki Flashcard Refactoring: Add Enough Knowledge to your Deck and Review your Sources

My flashcard refactoring for today is a reminder of the classic knowledge construction advice: do not add what you do not understand. It is also a reminder of the importance of providing enough related cards in your deck for a piece of knowledge.

Here’s the card I came across that was giving me trouble, related to SQL programming (double-sided):

  • Side 1: Oracle SQL syntax for creating object table
  • Side 2: CREATE TABLE (table name) OF (object type)

When revisiting this card, I realized that I didn’t have a good concept of what “object tables” are, so this is definitely a case of not understanding the material before committing it to spaced repetition.

But the thing is, I wouldn’t have added the card if I hadn’t understood object tables at the time. The problem is that I later forgot the concept of “object tables”, and seeing the answer to this card was not enough to bring it back. I didn’t have any other cards in my deck about “object tables” and how they differ from other related concepts in Oracle SQL such as nested tables.

In a situation like this, it helps to go back to the source, clarify any misunderstanding, and add new cards that solidify your knowledge.

So, in this case, I looked up Oracle documentation and found a great article almost immediately that clarified the meaning. It also provided a bunch of useful nomenclature for closely related concepts, providing further scaffolding for the knowledge. This led me to add a bunch of cards:

  • Card 1 (Cloze): Objects can be stored in two types of tables: [object tables] and [relational tables].
  • Card 2 (Basic 1-sided Q&A):
    • Q: What’s the difference between object tables and relational tables? (Oracle SQL)
    • A: Object tables store only objects; relational tables store objects with other table data.
  • Card 3 (Basic 1-sided Q&A):
    • Q: What does each row represent in an object table? (Oracle SQL)
    • A: An Object
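Card 1 above is written with square brackets around the hidden spans. In Anki itself, cloze deletions use the {{c1::...}} notation. A minimal sketch of converting one form to the other follows. The helper name to_anki_cloze is hypothetical, and numbering each span separately (c1, c2, ...) is my assumption about how the card should behave: as I understand it, separate numbers produce separate cards, while reusing one number hides the spans together.

```python
import re
from itertools import count


def to_anki_cloze(text):
    """Rewrite each [bracketed span] as an Anki cloze deletion, numbered c1, c2, ..."""
    numbers = count(1)

    def replace(match):
        # "{{" and "}}" are literal braces here because we use % formatting.
        return "{{c%d::%s}}" % (next(numbers), match.group(1))

    return re.sub(r"\[([^\]]+)\]", replace, text)


card = ("Objects can be stored in two types of tables: "
        "[object tables] and [relational tables].")
print(to_anki_cloze(card))
```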

So to recap, here are the main lessons from this refactoring:

  1. Don’t add stuff to spaced repetition that you don’t understand
  2. Make sure you add enough knowledge about the concept in your deck, so there is sufficient context for you to understand again when you forget
  3. When dealing with 1 or 2, the solution is to go back to the original source to understand the knowledge and add more relevant material.


Tips from Flashcard Refactoring

Include your Sources, Have a Single Answer, and Break-Down Your Cards

Here’s a flashcard related to Oracle SQL that was giving me trouble (lapsed 8 times and was automatically marked as a leech):

  • Side 1: Collection (Oracle SQL)
  • Side 2: Data type in Oracle SQL that lets you internalize parent-child relationships between tables in the parent table.

This was a double-sided card, so both Side 1 and 2 serve as the question. Let’s see if we can improve this one.

First things first: do I need this card at all? Yes: SQL is highly relevant to my career in Data Science, and the organization I work for relies heavily on Oracle database. It’s important knowledge for me that I didn’t want to remove.

Next, figure out the issue with the card. Looking at the card statistics, it turns out I was always getting Side 2 wrong. After some consideration, I realized that this is actually a poor definition of a “Collection”. In fact, it’s not really the “definition” of a Collection, but a characteristic of a Collection. In other words, the flashcard doesn’t have a unique answer: it’s true that a Collection internalizes parent-child relationships, but it does a lot of other things too.

I consulted the original source of the material and there isn’t a clear definition of a Collection there. I did some Googling for other sources and apparently there isn’t really a great definition of an Oracle Collection. It turns out that Collection refers to a generic programming idea not specific to Oracle.

So, rather than trying to define Collection, I’ve opted to break the existing card down into multiple cards, following Rule Number 4 of Knowledge Construction: stick to the minimum information principle, which means if you can break a card into multiple simpler, easier-to-answer cards, do it.

Card 1 (one-sided):

  • Side 1: What Oracle SQL data type lets you internalize parent-child relationships in the parent table?
  • Side 2: Collection

Card 2 (one-sided):

  • Side 1: What kind of relationship does an Oracle SQL Collection help you represent?
  • Side 2: Parent-child (aka “one to many”)

Card 3 (one-sided):

  • Side 1: Does the Oracle SQL Collection data type internalize parent-child relationships in the parent table or child table?
  • Side 2: Parent table

I also tracked down a good definition of the generic “Collection” concept in Computer Science, and added it:

Card 4-5 (double-sided):

  • Side 1: Collection (Computer Science)
  • Side 2: Object that groups multiple items together as a single unit (Computer Science)
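To make the “internalize parent-child relationships in the parent table” idea concrete for myself, here is a loose Python analogy. It is not Oracle SQL and makes no claim about how Oracle stores collections. The Parent and Child classes, the row dictionaries, and the nest helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    name: str


@dataclass
class Parent:
    name: str
    # The "collection": child items grouped together inside the parent.
    children: list = field(default_factory=list)


# Two-table style: each child row carries a key pointing back at its parent.
parent_rows = [{"parent_id": 1, "name": "Order 1"}]
child_rows = [
    {"parent_id": 1, "name": "pen"},
    {"parent_id": 1, "name": "ink"},
]


def nest(parent_row, all_child_rows):
    """Fold the matching child rows into a collection held by the parent."""
    kids = [Child(r["name"]) for r in all_child_rows
            if r["parent_id"] == parent_row["parent_id"]]
    return Parent(parent_row["name"], kids)


order = nest(parent_rows[0], child_rows)
print(order)
```

In the nested form the one-to-many relationship lives inside the parent itself, which is the characteristic Cards 1 to 3 test.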

I feel confident these cards will be easier to remember, cost less time and frustration, and help me remember the concept much better.

Lessons learned:

  • Flashcards should have a single answer. Having multiple correct answers for a card is a recipe for confusion and frustration. Interestingly, this isn’t included in Piotr Wozniak’s Twenty Rules of Formulating Knowledge, although you could interpret this as a form of interference (Rule #11).
  • Keep track of your source material when making cards. It makes it easy to look up more details when needed. 
  • Browse related sources through Google search if you’re unsure about what to do to an item. This will give you more context around the card to see whether the knowledge is even required at all. You may also come across a clarification or better formulation. In the example above, I discovered the generic concept of “Collection” in programming and realized that it was futile to try to include a definition specific to Oracle SQL.
  • Break cards down into a larger number of simpler cards. This is classic knowledge construction advice that is often not heeded, because it feels like more cards means more work. Counterintuitively, it is really a free lunch: you remember the concept better, you spend less time reviewing than you would have with the single complicated card, and reviews become much more enjoyable. 

Anki / Spaced Repetition Tip: Review your Weak Flashcards

I’ve been a long-time user of spaced repetition tools. I’ll never forget first hearing about SuperMemo from a close friend as I started my undergraduate degree in 2005. I was immediately sold on the value of spaced repetition, and I particularly liked the idea of computers automatically taking care of review scheduling for you. I started using SuperMemo as a central tool for studying, and saw my academic performance skyrocket.

Over the years, I’ve slowly improved my skill in designing flashcards. It is by no means a trivial skill: it took me years to get pretty good at it, and to this day I still often make flashcards that are complete failures.

I believe there will eventually be an open collaborative platform for flashcard development and sharing, where experts can contribute and refine perfectly crafted cards. Users contribute their deck statistics, revealing poorly formed cards and contributing to our understanding of optimal flashcards.

But until that day, it pays to develop your flashcard creation skills.

Flashcard quality is top of mind for me since I’ve revisited the classic article by Piotr Wozniak (of SuperMemo fame), “Effective Learning: Twenty Rules of Formulating Knowledge”. It is a must-read for anyone who creates flashcards for learning (i.e. almost everyone at some point in their life). I’ve published my summary notes on this article (aside: my notetaking tool of choice is Roam, and my notes are easy to copy-paste into your own Roam database if you happen to use it as well).

One great way to improve your flashcard development skills, while simultaneously improving the quality of your deck, is to review your old cards regularly. Review your top 10-20 most problematic cards weekly, and for each one you encounter, do one of the following things:

  • Revise: With the Twenty Rules of Formulating Knowledge by your side, refine your card or break it down into a larger number of small, easy to digest cards.
  • Suspend: If you don’t think you need to have a card in spaced repetition anymore, but don’t want to delete it entirely, suspending is a good option.
  • Delete: If you know the knowledge is completely useless to you, trash the card entirely.

But what cards should you review? If you’re like me, you have a pretty big collection, and it’s just not feasible to review all your cards every week to find the weak ones.

Anki makes it quite easy to find these problematic cards. Two main search commands in the Anki Browser are useful here:

  • tag:leech – this finds all of the “leeches” in your Anki deck, which are cards that you keep forgetting. By default, Anki tags your card as “leech” when you fail a card 8 times.
  • prop:lapses>n – this reveals all of the cards you have failed (“lapsed”) more than n times. You can set n to whatever number you like. Start with high-n cards and work your way down.
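If you like to keep your searches in a script or a note, here is a minimal sketch of a helper that builds these search strings. The function name problem_card_query and its parameters are hypothetical. Joining the terms with “or” relies on my understanding of Anki's search syntax, so treat that part as an assumption and check it against the Anki manual. The output is just text to paste into the Anki Browser.

```python
def problem_card_query(lapses_over=None, leeches=False):
    """Build an Anki Browser search string matching any of the chosen criteria."""
    terms = []
    if lapses_over is not None:
        # Matches cards that have lapsed MORE than this many times.
        terms.append(f"prop:lapses>{lapses_over}")
    if leeches:
        terms.append("tag:leech")
    if not terms:
        raise ValueError("choose at least one criterion")
    # Assumption: Anki's search syntax accepts "or" between terms.
    return " or ".join(terms)


# Start with high-n cards and work your way down.
for n in (7, 5, 3):
    print(problem_card_query(lapses_over=n))

print(problem_card_query(lapses_over=5, leeches=True))
```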

In addition to using these search techniques, I try to make a habit of “marking” cards that are problematic or poorly formed in some way, during review. If it’s an easy correction (e.g. obvious suspension, or small text changes), I’ll make the change right away in the mobile app. Otherwise, I will simply mark the card and filter it out during weekly review to make improvements.

When you do revise your cards, I recommend “resetting” the card so it’s like a “do-over” – the card should be reviewed again as if you just created it. This serves two purposes: it ensures that the card will no longer show up in your “problem cards” lists when you do the above queries. It also provides you with more opportunities to review your new formulation of the knowledge.

Unfortunately, it seems the only way to do this in Anki is to create new card(s) with the information you want and delete the old one. There is an option for “rescheduling” the card, but this only restarts the review process and doesn’t delete your review history. As a result, the card will still appear as one of your problem cards if you do a query like prop:lapses>n. Luckily, it’s not much extra effort to do this.

I have to admit that I do not entirely practice what I preach here. Weekly review of my cards is something I haven’t fully incorporated yet, but I’m resolving to start doing it today. In the next weeks, I’m going to experiment with a Flashcard Refactoring series to illustrate the card refinement process. Stay tuned!

Roam Notes on Piotr Wozniak (SuperMemo) Twenty Rules of Formulating Knowledge

  • "Author::" [[Piotr Wozniak]]
  • "Source::" https://www.supermemo.com/en/archives1990-2015/articles/20rules
  • "Recommended By::"
  • "Tags:: " #Flashcards #[[Spaced Repetition]] #[[flashcard design]] #Learning
  • Summary

  • The rules are listed in order from most important / common to least.
  • Rule 1: Do not learn if you do not understand. Trying to memorize things you don’t understand increases the time to learn and, more importantly, reduces the value of the knowledge to nothing (e.g. memorizing a German history book when you don’t know German – you won’t know any of its history). #[[Flashcard Tip: Don’t add Things you Don’t Understand]]
  • Rule 2: Learn before you memorize. He recommends building an overall picture of the learned knowledge before memorizing. You’ll reduce learning time when the individual pieces fit a single coherent structure. So, read the chapter first, then add the cards. #[[Flashcard Tip: Learn Before you Memorize]]
    • Notes: Why can’t you learn with [[Flashcards]] alone? Perhaps this is efficient if presented in the proper order. Also, perhaps the cards need to change when first learning when compared to committing to long-term memory. If so, how do they change? In other words, how are "questions for learning" different than "questions for retention"? #[[Personal Ideas]]
  • Rule 3: Build upon the basics. Start simple, and build from there. Don’t hesitate to memorize basic, obvious things. The cost of memorizing them is small, because they’re easy to answer. "usually you spend 50% of your time repeating just 3-5% of the learned material" #[[Flashcard Tip: Build Upon the Basics]]
    • Notes: The basics provide [[scaffolding]] that you can build upon. This reminds me of the [[80-20 rule]], where a big chunk of your time is spent on a small number of [[flashcards]]. #[[Flashcard Tip: Track Down and Eliminate Your Problem Cards]].
  • Rule 4: Stick to the minimum information principle. Formulate knowledge as simply as possible. Simple is easy to remember, and having a complex answer means there is more to remember – a larger number of simpler cards covering the same knowledge lets you review each sub-component at its own appropriate pace. #[[Minimum Information Principle]] #[[Flashcard Tip: Follow the Minimum Information Principle]]
  • Rule 5: Cloze deletion is easy and effective. #[[Flashcard Tip: Use Cloze Deletion]]
  • Rule 6: Use imagery. Our brains are wired for them. They usually take more time to create though compared to a basic verbal card, so weigh the benefits. #[[Flashcard Tip: Use Images]]
  • Rule 7: Use mnemonic techniques. He makes an interesting point that these do not solve the problem of forgetting, since the bottleneck is long-lasting and useful memory, not quickly memorizing knowledge. For that, you need #[[Spaced Repetition]]. "Experience shows that with a dose of training you will need to consciously apply mnemonic techniques in only 1-5% of your items". #[[Flashcard Tip: Save Mnemonics for Difficult Cards]] #mnemonics
  • Rule 8: Graphic deletion is as good as cloze deletion. #[[Flashcard Tip: Use Image Occlusion]]
  • Rule 9: Avoid sets. Sets are unordered collections of objects. Very difficult to memorize. If you must, use [[enumerations]] instead, which are ordered in some way. #sets #[[Flashcard Tip: Avoid sets]]
  • Rule 10: Avoid enumerations #enumerations #[[Flashcard Tip: Avoid Enumerations]]
    • He includes a nice method for [[memorizing text]] such as [[poems]] or [[prayers]], without using [[cloze deletion]]
  • Rule 11: Combat interference: #[[memory interference]] #[[Flashcard Tip: Combat Interference]]
    • Learning similar things tends to make you confuse them. [[memory interference]] – "knowledge of one item tends to make it harder to remember another item".
    • "Interference is probably the single greatest cause of forgetting in collections of an experienced user of [[SuperMemo]]."
    • The only strategy to work against this is to detect and eliminate. It’s hard to know you’ll face interference at card creation time.
  • Rule 12: Optimize wording #[[Flashcard Tip: Optimize Wording]]
    • Shave down the number of words you use. Make your cards as clear and concise as possible. Focus on the piece of information that is important.
  • Rule 13: Refer to other memories #[[Flashcard Tip: Refer to Other Memories]]
    • When you add a new card, try incorporating things you’ve learned from other cards.
  • Rule 14: Personalize and provide examples: #[[Flashcard Tip: Personalize and Provide Examples]]
    • Link your cards to your personal life.
  • Rule 15: Rely on emotional states: #[[Flashcard Tip: Rely on Emotional States]]
    • We remember things better that are vivid or shocking.
  • Rule 16: Context cues simplify wording: #[[Flashcard Tip: Use Context Cues]]
    • They often reduce the number of words you need
  • Rule 17: Redundancy does not contradict minimum information principle #[[Flashcard Tip: Use Redundancy]]
    • Redundancy – more information than needed or duplicate information.
    • It can be good, and minimum information principle does not mean minimum number of characters in your deck.
  • Rule 18: Provide source: #[[Flashcard Tip: Provide Sources]]
  • Rule 19: Provide date stamping: #[[Flashcard Tip: Use Date Stamps]]
    • Particularly for knowledge that changes over time and can become obsolete.
  • Rule 20: Prioritize: #[[Flashcard Tip: Prioritize]]
    • There is way more knowledge in the world than you’ll be able to absorb and remember long-term.
    • Focus on adding knowledge that is most relevant and important to you.

Excerpts from “The Use of Flashcards in an Introduction to Psychology Class”

Excerpts

  • Abstract: Four hundred fifteen undergraduate students in an Introduction to Psychology course voluntarily reported their use of [[Flashcards]] on three exams as well as answered other questions dealing with flashcard use (e.g., when did a student first use flashcards). Almost 70% of the class used flashcards to study for one or more exams. Students who used flashcards for all three exams had significantly higher exam scores overall than those students who did not use flashcards at all or only used flashcards on one or two exams. These results are discussed in terms of [[retrieval]] practice, a specific component of using flashcards.
  • Despite their apparent prevalence and impressive claims regarding their effectiveness, there appear to be no published studies examining whether flashcard use increases students’ exam performance in a naturalistic context.
    • Researchers have investigated flashcard effectiveness in laboratory settings.
  • A [[crib sheet]] (or cheat sheet) is an index card that contains ‘‘brief written notes’’ for a class and that a student can use during an exam (Dickson & Miller, 2005).
    • some research on crib sheets may pertain to how [[Flashcards]] influence exam performance. Studies have shown that merely creating crib sheets does not aid in student learning because students depend on being able to use the crib sheets during an exam and may not actually learn the exam material (Dickson & Bauer, 2008; Funk & Dickson, 2011). Yet, Funk and Dickson (2011) found that when students created crib sheets but did not expect to use them during an exam, they performed better on that exam than on another exam for which they expected to use their crib sheets. The former condition may be similar to creating flashcards in that students generate and use flashcards with the clear understanding that these cards will not be used during the exam. #[[How Much Does Flashcard Creation Aid Learning?]] #[[Blog Post: How Much Does Flashcard Creation Aid Learning?]]
  • [[Descriptive Statistics About Flashcard Use]]
    • Overall, 69.9% of the class used flashcards for at least one of the three exams; 65.5% used written flashcards, 3.9% used computer flashcards, and 0.5% used both self-generated and [[computer flashcards]]. Also, 55.2% of the class used flashcards (either written or computer) to study for two of the three exams and 34.9% used flashcards to study for all exams.
    • The results showed that flashcards were also used in other classes: 48% used only written flashcards in other classes, 2% used only [[computer flashcards]] in other classes, and 6.5% used written and computer flashcards in other classes. About half of students (49%) who used flashcards in the present Psychology course used them in other courses. Only about a quarter of students (23%) did not use flashcards in any class. Finally, only a small percentage of students (7%) did not use flashcards in Introduction to Psychology, but used flashcards in other courses
  • In our study, students primarily used self-generated [[Flashcards]]. In fact, so few students used [[computer flashcards]] that analyses could not be conducted comparing the two types of flashcards.
  • it is likely that the proliferation of smaller computers and electronic devices (e.g., iPads) will lead to an increase in [[computer flashcard]] use in the years ahead.
  • Flashcard use should be examined in greater detail by investigating the composition of the flashcards that are generated (i.e., what is on each card), how students actually use the cards (e.g., how often do the students test themselves, how long do students spend generating and using flashcards), whether other study techniques are used in conjunction with flashcards, and how the nature of the materials to be studied impacts flashcard use. #[[Gaps in Flashcard Research]]
  • three important [[methodological limitations]] that should be noted
    • there is the possibility that students may have exaggerated or misremembered information about flashcard use
    • the survey was only conducted with a single Psychology class
    • the present study did not include information that might differentiate flashcard users and nonusers #[[selection bias]]