Note: This page describes the instructional purpose of the N-Dolphin tool; the user's manual is a available as a single large document, or in structured form in which you can click on just the section you want.
Deductive reasoning is important for practical programming activities, not just formal proofs in theoretical CS classes, as careful programmers must think about universal properties of code, i.e., things that are true for any input, not just "what happened during the test cases I thought of". For example, we may want to know that a given loop can't run forever in any use of our software, not just that it didn't happen to run forever in a given set of tests.
As educators, we are concerned that the reliance on mathematical proof to teach deduction may amplify inequities that arise from disparate pre-college mathematics experiences. Thus, we wish to explore deduction in an environment that centers code and code-centric activities such as refactoring, rather than traditional mathematical notation and structures. Note that classic refactoring steps, such as in-lining a function or variable definition, can be combined with steps so simple they're not usually even called "refactoring" (e.g., turning 1+1 into 2, or reducing a statement that starts "if False" to just the code in the "else" clause), to express the execution of code (i.e., serve as a notional machine). This fact has been known for some time, but it is not usually brought to bear on the question of introducing deduction in a course on practical programming.
The N-Dolphin tool, and it's cousin ORC2A (when configured for the same logic as N-Dolphin), are visualizers that combine several important features not found together in other tools:
N-Dolphin currently renders code in Python, though other languages could be added, as long as they are sufficiently expressive of pure-functional code.
Specific examples of the pedagogy are given, along with full defintions of rules, in Section 2 of the "Bridge" paper (using ORC2A); and, with less verbiage and detail, below.
Other things are either simple/familiar ("Python follows the usual rules for arithmetic, so 22+5*4 is 42, not 27*4") or illustrated clearly in a standard debugger ("the arguments given in a function call are used to initialize the parameters"). But the process of returning a result seems to be neither, for beginning students. N-Dolphin can illustrate both simple substitutions (3+4 becoming 7) and subtle ones (the result of a function call appearing where the call expression had been) via repeated clicks of the "Step in" button.
An N-Dolphin session starts with standard Python code, typically some defintions followed by an expression, as shown on the left below. The command-line interpreter would respond to this code by printing the value of the expression; N-Dolphin shows us how the value can be produced by refactoring/substitution rules. Clicking on "Step in" performs only the selected addition step (i.e., the red 3+4), inserting the result in the appropriate place, as shown in the top-right image.
![]() |
|
The replacement of (3+4) with 7 is unually completely unsurprising for students; The second image on the right-hand side (above) shows the result of several more clicks, with square(7) being the next step needed in the calculation.
Additional clicks of "Step in" bring us to the next image, in which square(7) has been replaced with 49 (the result of the function), illustrating a process that we've found to be less intuitive for students: the result returned by the function takes the place of the call expression. This process can also be illustrated more concisely via the "Step over" button, as we will discuss below.
The final result of repeated clicks of "Step in" is shown at the bottom of the figure: the result that would have been printed by the python command-line interpreter,
Since N-Dolphin only allows code in which variables and objects don't change (i.e., pure-functional code), there's no need to worry about how substitution might interact with a variable changing value at some point.
Since N-Dolphin only works for code in which variables and objects don't change after initialization/creation, we can get the same (correct) answer regardless of the order in which we run separate elements of an expression. Note that this freedom does not extend to changing the structure to combine elements that should have been separate; i.e., in the expression ((3+4)*20)-(2*square_0(3+4)), we could execute the (3+4)*20 before, or after, the 2*square_0(3+4), but we can't re-combine it to do the 20-2 first.
N-Dolphin lets the user select valid components, but only as long as they are valid, so we could choose either of the 3+4's, but not 4+20 or 20-2, in the full expression above. Choosing the left or right side of that substraction is not very interesting; somewhat more interesting is the use of this flexibility to delay the execution of a function parameter, i.e., for square_0(3+4), we could put the 3+4 into the return expression before adding to produce 7, getting (3+4)*(3+4) as an intermediate step ... of course, this still produces 49, as we would hope. But, so far, we've only used N-Dolphin as a rather ecclectic visualizer of program execution. This is about to change: Flexibility of execution order opens the door to direct proof techniques.
For our first proof, we'll use the direct-proof technique, simply relying on the "substitution of equals for equals" principle, with specific rules about what things must be equal. We'll do an extremely simple proof, demonstrating via substition, that power(any_b, 2) is just any_b*any_b, which should come as no surprise. A standard debugger can be used to confirm this fact for any specific value of any_b, but can't show us what must be true for all executions.
In N-Dolphin, we simply guide the sequence of substitutions to express the same things we'd express in a traditional mathematical proof. Depending on the desired pedagogy, we can describe this to the students as refactoring, symbolic execution, or as a proof in which we decide what axiom to apply where, and the computer handles the mechanics.
We begin with what's known: the definition of our power function, expressed recursively to stay within the pure-functional approach, and the starting point of our proof, "power(any_b, 2)", as shown below on the left. Clicking "step in" has no effect, as the first thing to be evaluated is any_b, and we've not defined it. But, if we select the full "power(any_b, 2)", we gain access to the "function-name to body" rule (visible at the bottom of the image), which can be applied even without specific values for parameters, producing the top image on the right. Note that N-Dolphin has expressed the if-else as a Python if-expression, rather than a statement, to ensure the code is legal Python.
Clicking the "function name to body" button replaces the
|
|
Many standard debuggers provide a "Step over" feature; we have found this to be a valuable tool in teaching abstraction and recursion, as it emphasises the purpose and overall effect of the function.
N-Dolphin provides a similar feature; in the previous example, we could have clicked "Step over" instead of repeatedly clicking "Step in", to turn square_0(7) directly into 49.
Additionally, (details coming soon-ish ... email davew if you want them now, or see the "Bridge" paper).
Note that we still consider inductive proof to be a form of deductive reasoning, in the sense that it "[starts] from a set of general premises and then [draws] a specific conclusion that contains no more information than the premises themselves", rather than generalizing from premises based on experience/experiment, i.e., testing (an important software-development technique in its own right, but different from refactoring in terms of reasoning structure).
In Fall 2024, we plan to upload resources and archive data sets from our studies of the teaching and testing of abstraction-centric recursion, including a link to a git repo. If you need something promptly, please email davew@cs.haverford.edu.
There are some known bugs with the logic in the Bridge paper, having to do with making incorrect code seem to execute correctly. We are in the process of correcting those in N-Dolphin, a list of known problems is here.