Being able to use overlapping and nested cloze deletions is super useful in a flashcard application. It’s something that I felt was sorely missing from the flashcard applications that I’ve used.

  • [These are overlapping] cloze deletions
  • These are [overlapping cloze deletions]
  • And these are [nested] cloze deletions
  • And these [are nested cloze] deletions

So for my own personal project, I thought I’d take a crack at coding this feature myself.

cloze.py

UPDATE: This code has been updated. See my newer post: Score Nested Cloze Deletions When You Remember The Parent.

Here’s the most important function: the one that takes a string and returns the question and answer sides for each cloze deletion.

# python3
# Cloze-Deletion Parsing Function
import re

# function to remove the cloze codes from other cloze deletions
# and to unescape any backslash-escaped codes
def removeClozeCodes(text):
    text = re.sub(r'(?<!\\){[0-9]+::', '', text)
    text = re.sub(r'(?<!\\):[^:]*?:[0-9]+}', '', text)
    text = re.sub(r'\\:', ':', text)
    text = re.sub(r'\\{', '{', text)
    return text


# mkClozes
## Returns a numbered dictionary of cards.
## cards[n] = card with cloze number n deleted
## cards[n][0] = list of tuples (string, type) for question side
## cards[n][1] = list of tuples (string, type) for answer side
## type codes:
##    0 = normal text
##    1 = cloze deletion text
##    2 = hint text
def mkClozes(s):
    startMatch = r'(?<!\\){([0-9]+)::'
    endMatch = r'(?<!\\):([^:]*?):([0-9]+)}'
    i = re.finditer(startMatch, s)
    starts = {}
    ends = {}
    clozeHint = {}
    cards = {}
    for m in i:
        n = int(m.group(1))
        if n not in starts:
            starts[n] = []
        starts[n].append(m.span())
    
    i = re.finditer(endMatch, s)
    for m in i:
        n = int(m.group(2))
        if n not in clozeHint:
            clozeHint[n] = []
        hint = m.group(1)
        clozeHint[n].append(hint)
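        if n not in ends:
            ends[n] = []
        ends[n].append(m.span())
    
    # every cloze number needs as many end codes as start codes
    if starts.keys() != ends.keys() or any(
            len(starts[n]) != len(ends[n]) for n in starts):
        print("Mismatching starts and ends to clozes: %s" % s )
        return None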
    
    for n in starts:
        # the next two lists will be lists of tuples
        # (string, type) 0=normal  1=cloze 2=hint
        clozeQuestion = []
        clozeAnswer = []
        d = 0
        for i in range(len(starts[n])):
            a = starts[n][i][0]
            b = starts[n][i][1]
            clozeQ = s[d:a]
            clozeA = s[d:a]
            # all cloze tags left in new string should be removed
            # they are from other clozes
            clozeQ = removeClozeCodes(clozeQ)
            clozeA = removeClozeCodes(clozeA)
            # add strings to list
            clozeQuestion.append((clozeQ,0))
            clozeAnswer.append((clozeA,0))
            c = ends[n][i][0]
            d = ends[n][i][1]
            clz = s[b:c]
            # remove any cloze tags left by other cloze deletions
            clz = removeClozeCodes(clz)
            clozeAnswer.append((clz,1))
            if clozeHint[n][i]:
                clozeQuestion.append(("[",1))
                clozeQuestion.append((clozeHint[n][i],2))
                clozeQuestion.append(("]",1))
            else:
                clozeQuestion.append(("[...]",1))
        clozeQ = s[d:]
        clozeA = s[d:]
        # all cloze tags left in new string should be removed
        # they are from other clozes
        clozeQ = removeClozeCodes(clozeQ)
        clozeA = removeClozeCodes(clozeA)
        # add strings to list
        clozeQuestion.append((clozeQ,0))
        clozeAnswer.append((clozeA,0))
        # add strings to card dictionary
        cards[n] = (clozeQuestion, clozeAnswer)
    return cards

Usage

Cloze Start and End Codes

The codes used for starting and ending the cloze deletions are similar to Anki’s.

Start a cloze deletion with {1:: where 1 can be any arbitrary number. End it with ::1} (where 1 is the same number as in the start code).

I figured that these codes would be easy enough to type, but also rare enough that you’d never need to use this sort of sequence for the content of your card. If for some reason you do want to use a code like this in your card (e.g. making flashcards about how to use this function), you can escape any of these codes by prefixing them with a backslash. (See the test string in test.py below.)
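As a small illustration of the escaping rule, the sketch below uses the same start-code pattern as cloze.py (with the brace escaped for clarity) and shows that a backslash-prefixed code is skipped by the negative lookbehind. The sample sentence and the names START, text, and numbers are made up for this demonstration.

```python
import re

# Start-code pattern from cloze.py: a "{", a number, then "::",
# but only when the "{" is not preceded by a backslash.
START = r'(?<!\\)\{([0-9]+)::'

# Raw string, so "\{" really is a backslash followed by a brace.
text = r"Type \{1:: to show the code itself, or {2::hide this::2}."

# Only the unescaped start code is picked up.
numbers = [int(m.group(1)) for m in re.finditer(START, text)]
print(numbers)
```

Only cloze number 2 is detected here; the escaped \{1:: stays in the card text and has its backslash removed later by removeClozeCodes.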

Hints

Add any hint text you want between the two colons of the end code, e.g. {1::This is the cloze text:and this is the hint:1}

The limitation is that you can’t use a colon in your hint.
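To see why the colon matters, here is a rough sketch that runs the end-code pattern from cloze.py on two made-up strings. The names END, hints, and clipped are illustrative only.

```python
import re

# End-code pattern from cloze.py: ":", an optional hint without
# colons, ":", the cloze number, then "}".
END = r'(?<!\\):([^:]*?):([0-9]+)\}'

text = "{1::Paris:capital of France:1} and {2::Berlin::2}"
hints = {int(m.group(2)): m.group(1) for m in re.finditer(END, text)}
print(hints)

# A colon inside the hint shifts where the end code begins:
# only the part after the last colon is treated as the hint.
clipped = [m.group(1) for m in re.finditer(END, "{3::time:e.g. 12:30:3}")]
print(clipped)
```

In the second string the hint comes out as just "30", and the rest ("time:e.g. 12") would end up as part of the cloze text, so it is safest to keep colons out of hints entirely.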

Calling the Function

You pass mkClozes a string in which the start and end of each cloze deletion are marked with the codes described above. The function returns a dictionary keyed by cloze number (cards[1] is cloze deletion 1, etc.), or None if the start and end codes don’t match up.

Each entry in the dictionary consists of two lists: one for the question side and one for the answer side. Each item in a list is a tuple containing two items: first a string of text, and then a numeric code describing the text (0=normal text, 1=cloze text, 2=hint text).

So you would grab the list from the cloze deletion you want and then concatenate the strings one-by-one using whatever formatting you want.

Here’s how I do that for the curses interface I’m using:

test.py
import curses
import cloze

# set up ncurses
stdscr = curses.initscr()
curses.noecho()
curses.cbreak()
# colorFunctions
curses.start_color()
curses.use_default_colors()
# pair numbers must stay below curses.COLOR_PAIRS
for i in range(0, min(curses.COLORS, curses.COLOR_PAIRS - 1)):
    curses.init_pair(i + 1, i, -1)


# curses color pairs for demonstration
clozeColor = 3
hintColor = 4

# demonstrate the mkClozes function
# test string
s = "Start a cloze deletion with {8::'\\{1::'::8} (that was an escaped cloze start) and say {1::\"hello world\":what programmers always say first:1} as your first cloze. End a cloze deletion with {9::'\\::1}'::9} (or whatever number you're using). {4::You can {3::nest:birds do it:3} one cloze in another:what was that about nesting?:4} and {5::clozes can {6::even overlap:... over-:5}, which can be very useful:... over- ...:6}. And several {7::cloze deletions::7} can share {7::the same number::7}. {100::And yes, any arbitrary number will work:Does any arbitrary number work?:100} in a cloze deletion. {7::I'm part of cloze deletion #7 too!::7}"
cards = cloze.mkClozes(s)
for c in sorted(cards.keys()):
    for i in [0,1]:
        stdscr.addstr("Card %s:\n\n" % c)
        for text, code in cards[c][i]:
            if code == 1:
                stdscr.addstr(text, curses.color_pair(clozeColor))
            elif code == 2:
                stdscr.addstr(text, curses.color_pair(hintColor))
            else:
                stdscr.addstr(text)
        stdscr.getkey()
        stdscr.clear()

curses.nocbreak()
stdscr.keypad(False)
curses.echo()
curses.endwin()
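The same loop works for any other front end. As a minimal sketch, assuming you wanted HTML instead of curses colors, you could map each type code to a wrapper and join the pieces. The WRAPPERS table, the render_html helper, and the two hand-written lists are hypothetical; the lists only mimic the shape of what mkClozes returns.

```python
from html import escape

# Hypothetical markup per type code: 0 = normal, 1 = cloze, 2 = hint
WRAPPERS = {0: "{}", 1: "<b>{}</b>", 2: "<i>{}</i>"}

def render_html(parts):
    # parts is a list of (string, type) tuples, like cards[n][0]
    return "".join(WRAPPERS[code].format(escape(text)) for text, code in parts)

# Hand-written stand-ins for cards[n][0] and cards[n][1]
question = [("Say ", 0), ("[", 1), ("a greeting", 2), ("]", 1), (" first.", 0)]
answer = [("Say ", 0), ("hello world", 1), (" first.", 0)]

question_html = render_html(question)
answer_html = render_html(answer)
print(question_html)
print(answer_html)
```

Keeping the type codes separate from the text is what makes this possible: the parsing function never has to know how the card will be displayed.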

Background

I used Anki for many years. In early versions, handling cloze deletions was pretty rudimentary; each cloze deletion had to be a separate card. In later versions, it became much more sophisticated; one cloze deletion note could generate several cards. But simple overlapping cloze deletions were still missing.

Overlapping Clozes are Great for Taking Baby Steps

Overlapping clozes are very useful for when you want to work your way up to recalling more and more information. For example, let’s say you’re having trouble remembering when Columbus sailed across the Atlantic. You find a nice phrase that should be pretty easy to memorize (In 1492, Columbus sailed the ocean blue), but for some reason you still have trouble remembering the date.

So you decide to break it up into smaller steps and then tackle the whole thing once you’ve learned the parts. You create the following series of cloze deletions:

  1. In […]92, Columbus Sailed the Ocean Blue.
  2. In 14[…], Columbus Sailed the Ocean Blue.
  3. In […], Columbus Sailed the Ocean Blue.

This way, you concentrate on the first part of the year, then the second, and finally the whole thing.

Using the function above, you could create all three cards with the following:

In {3::{1::14::1}{2::92::2}::3}, Columbus Sailed the Ocean Blue.
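To check that this one line really yields the three question sides listed earlier, here is a condensed, self-contained sketch. It reuses the two patterns from cloze.py but is not the full function: question_side and strip_codes are simplified helpers written for this example, they assume each cloze number appears only once, and they return plain strings rather than typed tuples.

```python
import re

START = r'(?<!\\)\{([0-9]+)::'
END = r'(?<!\\):([^:]*?):([0-9]+)\}'

def strip_codes(text):
    # Remove start and end codes that belong to other clozes.
    text = re.sub(START, '', text)
    return re.sub(END, '', text)

def question_side(s, n):
    # Find the (single) start and end code for cloze number n.
    start = next(m for m in re.finditer(START, s) if int(m.group(1)) == n)
    end = next(m for m in re.finditer(END, s) if int(m.group(2)) == n)
    hint = end.group(1) or "..."
    before = strip_codes(s[:start.start()])
    after = strip_codes(s[end.end():])
    return before + "[" + hint + "]" + after

note = "In {3::{1::14::1}{2::92::2}::3}, Columbus Sailed the Ocean Blue."
questions = [question_side(note, n) for n in (1, 2, 3)]
for q in questions:
    print(q)
```

Each question hides only its own span, while the codes of the other two clozes are stripped so their text shows through.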

You can create a single note that will generate these cards in Anki, but the process is much more convoluted. I’ve described how to create overlapping clozes in Anki in this post.