These are some ideas I have for creating a usable flashcard program. They are somewhat organized by topic, but not placed in any particular order. This is sort of a brain-dump for me. I wanted to get these ideas out of my head and written down. Some of these ideas may be great, some may be not-so-great.

Algorithms

Variable Algorithms

Allow the user to write his own algorithms in a Python include file or something similar. Settings for each card would allow you to select the algorithm, or a progression through several algorithms.

All algorithms would have to provide maybe two functions: Return interval for desired retention rate, and Return retention rate for given interval.

Database would hold the identity of the algorithm along with a fixed number of columns for parameters. All non-null parameters would be passed along in column order (0-5).
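
A rough sketch of what such a plug-in interface might look like. Everything named here (SchedulingAlgorithm, FixedInterval, REGISTRY, load_algorithm) is made up for illustration, and the toy algorithm returns False for "no retention estimate" only because that is the convention used for the Fluency Algorithm in these notes.

```python
from abc import ABC, abstractmethod


class SchedulingAlgorithm(ABC):
    """Hypothetical base class that every user-written algorithm would extend."""

    def __init__(self, *params):
        # Parameters arrive in column order with the null columns dropped.
        self.params = params

    @abstractmethod
    def interval_for_retention(self, retention):
        """Return the interval expected to yield the desired retention rate."""

    @abstractmethod
    def retention_for_interval(self, interval):
        """Return the expected retention rate after the given interval."""


class FixedInterval(SchedulingAlgorithm):
    """Toy algorithm: always schedules params[0] days ahead."""

    def interval_for_retention(self, retention):
        return self.params[0]

    def retention_for_interval(self, interval):
        return False  # this toy algorithm makes no retention estimate


# Hypothetical lookup from the name stored in the database to a class.
REGISTRY = {"fixed": FixedInterval}


def load_algorithm(name, columns):
    """Build an algorithm from a row's algorithm name and parameter columns."""
    non_null = [value for value in columns if value is not None]
    return REGISTRY[name](*non_null)
```

With this shape, a row holding ("fixed", 3.0, NULL, NULL, NULL, NULL, NULL) would load as FixedInterval(3.0). One consequence of dropping nulls is that a null in a middle column shifts later parameters forward, so algorithms would need to fill their columns from the left.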

Fluency Algorithm

  • passCount = number of times card has been marked ‘correct’
  • param1 = initial interval
  • param2 = frame size
  • param3 = increase per frame
  • next interval = param1 + (param3 * int(passCount/param2))
  • retention rate = return False

On failing a card, the card would simply be presented again after a few minutes' delay. (Perhaps implement a failed-cards queue such that any due cards in this queue take highest priority.)
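
Here is a minimal sketch of the interval formula and a possible failed-cards queue. The function and class names are invented, the counter in the formula is simply called count, and the 5-minute default delay is an assumption standing in for "a few minutes".

```python
import heapq


def fluency_next_interval(count, initial_interval, frame_size, increase_per_frame):
    """next interval = param1 + param3 * int(count / param2)."""
    frames_completed = int(count / frame_size)
    return initial_interval + increase_per_frame * frames_completed


def fluency_retention_rate(interval):
    """The fluency algorithm makes no retention estimate."""
    return False


class FailedQueue:
    """Hypothetical queue of failed cards, ordered by when they come due again."""

    def __init__(self, delay_minutes=5):  # assumed default delay
        self.delay_minutes = delay_minutes
        self._heap = []

    def push(self, card_id, now_minutes):
        heapq.heappush(self._heap, (now_minutes + self.delay_minutes, card_id))

    def pop_due(self, now_minutes):
        """Return the earliest due failed card, or None if none is due yet."""
        if self._heap and self._heap[0][0] <= now_minutes:
            return heapq.heappop(self._heap)[1]
        return None
```

The review loop would call pop_due first and fall back to the normal schedule only when it returns None, which gives failed cards the highest priority.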

Speed Drill Algorithm

This algorithm would schedule speed drill exercises. See the ‘Card’ as external process section below.

This is a very simple algorithm. Whether the user passes the speed drill (achieves a specified competency/speed rate), or not, this algorithm would simply schedule the next task for n days later. The value of n would be set as one of the card/item properties.

Essentially, we would be scheduling daily reviews for speed-building exercises. Failing to meet a certain speed criterion would not mean that the user has to try again that day; he would just have another chance the following day. The user may not be able to achieve the speed required yet, so asking him to redo the card until he achieves this speed would be unrealistic.
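
As a sketch, the whole algorithm fits in one small function. The item is represented here as a plain dict with made-up keys (n, due, passCount, failCount); the real storage would be whatever the database uses.

```python
from datetime import date, timedelta


def review_speed_drill(item, reviewed_on, passed):
    """Record the result, then schedule n days ahead regardless of the result."""
    counter = "passCount" if passed else "failCount"
    item[counter] += 1
    # The outcome is recorded above but deliberately ignored for scheduling.
    item["due"] = reviewed_on + timedelta(days=item["n"])
    return item
```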

Retrieval Probability Algorithm

This algorithm would attempt to schedule items to hit a certain retrieval probability on the next review. This is what SuperMemo claims to attempt. It’s a very difficult task because retrieval probability cannot be measured by an individual item’s review; all that tells us is whether we retrieved the item, not what our chances were.

And while we can plot retrieval rates after given intervals and fit a forgetting curve model to the data, I don’t think there’s much known about how memory is strengthened following review at any given retrieval rate. It might be something like an inverted U shape; as retrieval probability falls, the benefit of review increases until it hits a peak and then begins to fall… but who knows?

Once we choose a model of the forgetting curve, we can make some assumptions about how reviews will affect the parameters. We can then attempt to estimate the interval before retrieval probability hits any given value for an item (target retrieval rate could be stored for each item). Our naive assumptions may give us wildly inaccurate estimates, but they might still be useful for determining which items are most in need of review (i.e. most over-due).
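
To make this concrete, here is a sketch that assumes the simplest possible model, an exponential forgetting curve R(t) = exp(-t / S) with a single "stability" parameter S. The model choice and all the function names are assumptions for illustration; any other curve could be swapped in.

```python
import math


def retrieval_probability(elapsed, stability):
    """Assumed model: R(t) = exp(-t / S)."""
    return math.exp(-elapsed / stability)


def interval_for_target(target, stability):
    """Invert the model: the time at which R falls to the target rate."""
    return -stability * math.log(target)


def overdue_ratio(elapsed, target, stability):
    """Above 1.0 means the item is past its target interval."""
    return elapsed / interval_for_target(target, stability)
```

Even if the stability estimates are poor, sorting items by overdue_ratio gives a usable "most in need of review" ordering, which is the fallback use described above.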

Tuning the Retrieval Probability Algo

As individual items don’t yield much data, it’s probably best to look at groups of similar items when trying to fit equation parameters to the data that we’ve collected. For this, of course, we’d need to somehow group items by similarity.

One simple option might be to group items based on their passCount and failCount. Forgetting curves for items with the same pass and fail Counts could be grouped together to create the data for fitting a forgetting curve. Subsequent items could then adopt those parameters rather than the default ones created using naive assumptions.

For example, we might look at the review history for all items that had previously had 2 fails and 5 passes. We could round all intervals to the nearest hour, then calculate the average retrieval rate for each nearest-hour group. Then, using a curve-fitting routine (SciPy has one, scipy.optimize.curve_fit), we would fit our forgetting curve model to the data points. We could figure out some reasonable trigger after which we decide to use our empirical curve instead of the theoretical one we started out with (perhaps when a goodness-of-fit measure of our curve reaches a certain level).
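
The bucketing and fitting steps might be sketched like this. Two things are assumptions: the forgetting curve is taken to be R(t) = exp(-t / S), and the fit is a plain least-squares line through the origin on ln R, done with the standard library rather than a general curve-fitting routine. The goodness-of-fit trigger is not shown.

```python
import math
from collections import defaultdict


def bucket_by_hour(reviews):
    """Average retrieval rate per nearest-hour bucket.

    `reviews` is an iterable of (interval_hours, passed) pairs.
    """
    tallies = defaultdict(lambda: [0, 0])  # hour -> [passes, total]
    for interval_hours, passed in reviews:
        hour = int(interval_hours + 0.5)  # round half up; intervals are >= 0
        tallies[hour][0] += 1 if passed else 0
        tallies[hour][1] += 1
    return {hour: passes / total for hour, (passes, total) in sorted(tallies.items())}


def fit_stability(rates):
    """Fit S in R(t) = exp(-t / S) by least squares on ln R = -t / S.

    Returns None when the data cannot pin down a decay rate.
    """
    numerator = denominator = 0.0
    for hours, rate in rates.items():
        if hours > 0 and rate > 0:  # ln(0) is undefined; t = 0 carries no information
            numerator += hours * math.log(rate)
            denominator += hours * hours
    if denominator == 0 or numerator >= 0:
        return None
    return -denominator / numerator
```

Buckets with a 0% retrieval rate are skipped because their logarithm is undefined, which biases the fit when such buckets are common; a non-linear fit on the raw rates would avoid that.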

Max-Efficiency Algorithm

This algorithm would attempt to schedule reviews so that memory stability increase is maximized, while overall workload is minimized. Essentially the ratio of (memory boost over time) / (work load over time) is maximized.

Data-Mining Algorithm

In order to gather data for curve-fitting purposes, we might create an algorithm designed to create experimental data.

For example, this algorithm could at first randomly assign cards new intervals of one of [15min, 30min, 1hr, 2hr, 6hr, 12hr, 24hr, 48hr] or some sequence like that. After the first review, the cards would be repeated until passed and then randomly assigned to the same list of intervals, but 2.5 times longer, i.e. [2.5 x 15min, 2.5 x 30min, etc.]. The general formula for the list would essentially be [15min x 2.5^passCount, 30min x 2.5^passCount, etc.]
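
A sketch of the interval assignment, assuming the list grows by a factor of 2.5 per pass (2.5 raised to passCount), which matches the "2.5 times longer" example after the first pass. The names are made up.

```python
import random

# Base intervals from the example above, in minutes.
BASE_INTERVALS_MIN = [15, 30, 60, 120, 360, 720, 1440, 2880]


def experimental_intervals(pass_count, growth=2.5):
    """Candidate intervals (minutes) for a card with the given pass count."""
    return [base * growth ** pass_count for base in BASE_INTERVALS_MIN]


def assign_interval(pass_count, rng=random):
    """Pick one candidate interval uniformly at random."""
    return rng.choice(experimental_intervals(pass_count))
```

Passing a seeded random.Random as rng makes the assignments reproducible, which might matter if the experimental data is later pooled across users.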

Users could assign their own cards to this algorithm, or we could use some sort of systematic association pairs for this, e.g. nonsense syllables and meanings, random face images and names, obscure words and meanings, obscure trivia questions and answers.

Switching Algorithms

Implement ways for a user to manually switch algorithms, and ways for him to automate the switch (through a cron script perhaps after streak hits a certain number).

Figure out how to handle streak and fail settings when changing algorithms… possibly reset both back to zero… or make that a default and give user control over how to handle this.

For automated switches, will need user to provide initial parameters for each algorithm, or accept defaults.
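
One way the automated switch might work, sketched with plain dicts. The rule format, the key names and the DEFAULT_PARAMS values are all invented for illustration; resetting both counters is the default here only because that is the tentative default suggested above.

```python
# Hypothetical default parameters per algorithm, used when a rule supplies none.
DEFAULT_PARAMS = {"fluency": [1, 3, 2], "retrieval": [0.9]}


def maybe_switch(card, rule, reset_counters=True):
    """Switch the card's algorithm if the rule's streak condition is met.

    `rule` has keys 'from', 'to', 'streak_at_least' and optionally 'params'.
    Returns True if the card was switched.
    """
    if card["algorithm"] != rule["from"] or card["streak"] < rule["streak_at_least"]:
        return False
    card["algorithm"] = rule["to"]
    card["params"] = list(rule.get("params") or DEFAULT_PARAMS[rule["to"]])
    if reset_counters:
        card["streak"] = 0
        card["fails"] = 0
    return True
```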

Syncing

Forget about custom syncing. Leave it to the user to use existing technologies (rsync, syncthing, dropbox, etc.)

Cron Job

Set a cron job to perform database functions such as:

  • fitting forgetting curves to existing data
  • switching cards to another algorithm when conditions are triggered
  • exporting anonymized data

Collecting anonymized data

  • Create a github project for data collection.
  • Optional Cron job exports anonymized data
  • Export file is named with the user’s UUID (generated when he first starts the program)
  • Setting to auto-create pull request, or leave this as a manual task for user
    • perhaps program prompts user to make pull request when started
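
A sketch of the export step. The file locations, the JSON format and the list of fields considered safe to export are all assumptions; the point is only that card content is dropped and the file is named after a randomly generated UUID.

```python
import json
import uuid
from pathlib import Path

# Hypothetical whitelist: only scheduling data is exported, never card content.
EXPORTED_FIELDS = ("algorithm", "interval_hours", "passed", "passCount", "failCount")


def get_user_uuid(config_dir):
    """Return the stored UUID, generating and saving one on first use."""
    id_file = Path(config_dir) / "user_uuid"
    if id_file.exists():
        return id_file.read_text().strip()
    new_id = str(uuid.uuid4())
    id_file.write_text(new_id)
    return new_id


def export_anonymized(reviews, config_dir, out_dir):
    """Write whitelisted review fields to <out_dir>/<uuid>.json."""
    rows = [{key: r[key] for key in EXPORTED_FIELDS if key in r} for r in reviews]
    out_path = Path(out_dir) / f"{get_user_uuid(config_dir)}.json"
    out_path.write_text(json.dumps(rows, indent=2))
    return out_path
```

Using a whitelist rather than a blacklist means that any new field added to the review records stays private until someone decides it is safe to export.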

Cards / Items to be scheduled

Traditional Flashcards

Randomization

Include a feature to randomize elements of the card. This is useful for learning underlying patterns that show up in many particular situations, e.g. verb conjugations follow patterns, but the verbs and direct objects can vary.

Include some sort of tag that is replaced by the output of a custom-written user function. This allows the user to set values from a list, or from a database query (perhaps all nouns with streak > 3). The tag string might be something like: { % randfunc:username.knownWords('Noun', field1, field2) % }
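
A minimal sketch of how such tags might be expanded. It assumes the user functions are registered in a dict under their dotted names and that the arguments are passed along as plain strings with quotes stripped; the example function known_words and its word list are invented.

```python
import random
import re

# Matches tags of the form { % randfunc:dotted.name(arg, arg) % }
TAG = re.compile(r"\{\s*%\s*randfunc:([\w.]+)\((.*?)\)\s*%\s*\}")


def render(template, functions, rng=random):
    """Replace every randfunc tag with the output of the named user function."""
    def substitute(match):
        name, raw_args = match.group(1), match.group(2)
        args = [a.strip().strip("'\"") for a in raw_args.split(",") if a.strip()]
        return str(functions[name](rng, *args))
    return TAG.sub(substitute, template)


def known_words(rng, part_of_speech):
    """Hypothetical user function; a real one might query the database."""
    words = {"Noun": ["oso"], "Verb": ["come"]}
    return rng.choice(words[part_of_speech])
```

For example, render("El { % randfunc:username.knownWords('Noun') % } duerme.", {"username.knownWords": known_words}) would pick a noun from the list each time the card is shown.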

Such randomized cards also create the possibility of scheduling card elements instead of cards themselves (See Scheduling Elements below).

Make Cloze Deletion a priority

Cloze-deletion cards are much faster to make and just as good as traditional question/answer cards. Although you have to not be an idiot who makes huge cards with tons of contextual clues (a common ‘criticism’ of cloze cards by idiots on /r/anki).

  • cloze-deletion with hint feature similar to Anki
  • allow for nested clozes
  • allow for overlapping clozes
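
As a starting point, here is a sketch that renders simple clozes with optional hints, assuming Anki-style {{c1::answer::hint}} markup. It handles neither nested nor overlapping clozes; those would need a real parser rather than a single regular expression.

```python
import re

# {{c<number>::answer}} or {{c<number>::answer::hint}}, non-nested only.
CLOZE = re.compile(r"\{\{c(\d+)::(.*?)(?:::(.*?))?\}\}")


def render_cloze(text, active):
    """Hide cloze number `active` and reveal the answers of all other clozes."""
    def substitute(match):
        number, answer, hint = int(match.group(1)), match.group(2), match.group(3)
        if number != active:
            return answer
        return f"[{hint}]" if hint else "[...]"
    return CLOZE.sub(substitute, text)
```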

‘Card’ as external process/modules

Instead of restricting use to scheduling cards, allow scheduling of virtually any activity. This means we schedule items instead of cards. These items can be traditional flashcards, or they can be some sort of exercise done through another module of an external program.

Example Use Scenario 1

Imagine you don’t already touch-type, but you want to learn how. Using an efficient spaced algorithm for this is not going to work well because you want to build speed for this; not just be able to remember where a key is after a few seconds of thinking about it.

So you create a module that runs gtypist, a terminal typing tutor, in an x-terminal and automatically loads one of its exercises (I think this is possible through the command-line, but let’s just assume it is for now).

You create an item that uses this gtypist module. The item is set to use a home-row exercise, and to use an algorithm that simply schedules review for the next day whether the user passes or fails the exercise. You also set the pass criteria to “Accuracy >= 50% and Speed >= 10wpm”.

So this item is scheduled for daily review. You start your reviews, and after a few card reviews, this item is selected. Gtypist pops up with the home-row exercise, you practice typing and then quit after the exercise is over. Now the flashcard program asks if you scored >= 50% accuracy and >= 10wpm. Your answer is recorded in the passCount and failCount for the item, but the scheduling algorithm ignores this and just schedules this item for the following day regardless of success.

Through a separate cron-script, you check on the progress of this item each night at midnight. Once the passCount hits 10, the item is modified to be scheduled every 3 days and a similar gtypist item involving a top-row exercise is set to active so that you will begin reviewing it daily beginning with your next review session.
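
The nightly check could be as small as this sketch. Items and rules are plain dicts with made-up keys; a real script would read and write the database instead.

```python
def nightly_check(items, rules):
    """Apply each promotion rule once, when its pass-count threshold is reached.

    Each rule has keys 'item', 'pass_count', 'new_interval_days' and 'activate'.
    """
    for rule in rules:
        item = items[rule["item"]]
        if rule.get("done") or item["passCount"] < rule["pass_count"]:
            continue
        item["interval_days"] = rule["new_interval_days"]
        items[rule["activate"]]["active"] = True
        rule["done"] = True  # so the rule does not fire again the next night
```

For the scenario above, the single rule would be {"item": "home-row", "pass_count": 10, "new_interval_days": 3, "activate": "top-row"}.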

Example Use Scenario 2

You’re an air force pilot training yourself to quickly recognize common aircraft of your enemies and your allies. (This was actually done with pilots during WWII, after pilots had mistakenly fired upon unrecognized allied aircraft.)

You create a program that flashes random images of aircraft on the screen for 200 milliseconds. After the flashed image, the user tries to identify the aircraft he’s just seen, then hits a button to see the name of the aircraft. Then the user marks the slide as correct or incorrect. When a specified number of images have been displayed, the program/module returns ‘pass’ if the user identified at least 75% correct, and ‘fail’ otherwise.

A cron job checks the database each night at midnight. Once the passCount for this item reaches 10, the settings are changed so that images are flashed for only 150 milliseconds and the number of images shown by the module is increased. (These would be parameter settings stored in the database record for the item, which the program passes to the image-flash program.)
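
Both halves of this scenario are simple enough to sketch: the module's pass/fail decision and the nightly tightening of its settings. The setting names and the number of extra images are assumptions, since the notes only say the count is "increased".

```python
def drill_result(marks, pass_fraction=0.75):
    """Return 'pass' if at least pass_fraction of the slides were marked correct."""
    if not marks:
        return "fail"
    correct = sum(1 for mark in marks if mark)
    return "pass" if correct / len(marks) >= pass_fraction else "fail"


def tighten(settings, pass_count, threshold=10, new_display_ms=150, extra_images=10):
    """Return harder settings once pass_count reaches the threshold."""
    if pass_count >= threshold and settings["display_ms"] > new_display_ms:
        return {
            **settings,
            "display_ms": new_display_ms,
            "image_count": settings["image_count"] + extra_images,  # assumed increase
        }
    return settings
```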

Training like this was done with actual pilots during WWII and was very successful. I think the device they created to flash images for fractions of a second was called a tachistoscope.

Scheduling Elements of Cards instead of Cards

Using modules to dynamically generate cards lets us do some interesting things that might increase efficiency.

For example, let’s say we’re learning Spanish, and we create a function to generate random (but grammatically correct) sentences using words from a vocabulary database. Let’s say that the sentences always have a noun as a subject, a verb, a noun as a direct object, and a noun as an indirect object (e.g. The bear gave the rose to the tiger). These elements are selected from the database, so cards generated by this function contain different possible sentences.

We use this module to practice reading comprehension. The nouns vary, the verb varies, and the tense of the verb varies, so these cards give us some practice with vocabulary (nouns and verbs) and tense conjugation.

The standard SM2/Anki way of handling cards is to make them static and separate, and to have them meet the minimum information principle. This has advantages. For example, if we fail a card, we know what the problem is because the card (in theory) tests only one thing. This allows for efficient scheduling.

But maximizing efficiency isn’t the only approach you might want to take. I believe that approaching everything from this granular, efficiency-seeking perspective causes problems for some tasks. A good example is verb conjugation. This is a skill that needs to become fluent for understanding Spanish. The approach of waiting to review conjugations until you can still remember them after a brief moment probably isn’t going to build the kind of fluency you need.

I think the lack of fluency leads to users feeling like they don’t “get the big picture”, and other such discomforting feelings.

The implications of this are that users should add more big picture cards for conceptual material, and use techniques outside of SM2/Anki to build fluency. Either that, or create a flashcard program that allows for fluency-building exercises in addition to efficient ones.

An unfortunate implication of the SM2/Anki tactic of adding large numbers of cards, both to meet the minimum information principle and to provide the big-picture feeling, is that it can quickly lead to large workloads and backlogs when users inevitably fall behind. This is only worsened by SM2’s attempt to maintain high retention rates for all cards.

If we also have a module that presents simple vocabulary cards that require recall of a single word, then those are going to be reinforcing the same words that our random sentence cards are reinforcing. For some words, I think targeting efficiency is a very appropriate goal. There are so many common nouns, verbs, and adjectives that are used very infrequently, that efficiency is probably the best option. So let’s say that for common nouns and verbs, we want to study their meaning efficiently.

Well, in that case, we can have our random-sentence module keep track of each element used in the sentence and then reschedule each element according to whatever algorithm we’ve chosen. Our simple vocabulary card then, could search for the ripest vocabulary terms to present. Any word recently seen in a random sentence that we understood would then be very un-ripe and would not be chosen.
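
A sketch of the bookkeeping this implies. "Ripeness" is taken here to be time since last seen divided by the element's current interval, which is just one plausible definition; the dict keys and function names are made up, and next_interval stands in for whichever scheduling algorithm has been chosen.

```python
def record_sentence_result(elements, words, understood, now, next_interval):
    """Reschedule every element that appeared in a generated sentence."""
    for word in words:
        element = elements[word]
        element["passCount" if understood else "failCount"] += 1
        element["last_seen"] = now
        element["interval"] = next_interval(element)


def ripeness(element, now):
    """Assumed measure: elapsed time as a fraction of the scheduled interval."""
    return (now - element["last_seen"]) / element["interval"]


def ripest(elements, now, count=1):
    """Return the words most in need of review, ripest first."""
    ranked = sorted(elements, key=lambda w: ripeness(elements[w], now), reverse=True)
    return ranked[:count]
```

The simple vocabulary module would call ripest to pick its next word, so a word just understood in a random sentence has a ripeness near zero and falls to the bottom of the ranking.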