I am an Associate Professor in Logical Foundations and Formal Methods at the University of Cambridge. I study programming languages and semantics using type theory, category theory, domain theory, and topos theory as a guide. My other interests include Near Eastern, Classical, and Germanic philology. Find my contact information here.
Central to both the design of programming languages and the practice of software engineering is the tension between abstraction and composition. I employ semantic methods from category theory and type theory to design, verify, and implement languages that enable both programmers and mathematicians to negotiate the different levels of abstraction that arise in their work. I apply my research to global safety and security properties of programming languages as well as the design and implementation of interactive theorem provers for higher-dimensional mathematics. I develop foundational and practical mathematical tools to deftly weave together verifications that cut across multiple levels of abstraction.
See my page for student project ideas and contact me if you are interested in one of them.
This website is a “forest” created using the Forester tool. I organize my thoughts here on a variety of topics at a granular level; sometimes these thoughts are self-contained, and at times I may organize them into larger notebooks or lecture notes. My nascent ideas about the design of tools for scientific thought are here. I welcome collaboration on any of the topics represented in my forest. To navigate my forest, press Ctrl-K.
I maintain several public bibliographies on topics of interest to me. You can also access my personal bibliography or my curriculum vitæ.
This is my blog, in which I write about a variety of topics including computer science, mathematics, and the design of tools for scientists.
Apart from my day-job at the University of Cambridge, I am independently researching tools for scientific thought and developing software like Forester that you can use to unlock your brain. If you have benefited from this work or the writings on my blog, please consider supporting me with a sponsorship on Ko-fi.
I am a proponent of using plain text (or Markdown-formatted) emails rather than HTML-formatted emails. Although there are some legitimate uses for rich formatting in email, I find the well is poisoned: HTML-formatted email manipulates, surveils, and controls.
Although plain text email has gone the way of the dinosaur in most parts of society, it remains in use among many technical communities (such as the Linux kernel developers). Recently, SourceHut has emerged as a free and open source code forge whose workflow is entirely based on plain text email, as described on the useplaintext.email website. Naturally, this has led to both curiosity and friction as users attempt to adapt to working technology of the 1990s that Capital has made increasingly difficult to interact with using present-day software. As a maintainer of several projects on SourceHut, I have seen many of my valued collaborators fall flat when it comes to following the etiquette recommendations for plaintext emails (especially text wrapping), and this is certainly not for lack of trying on their part. The available technological options are just that difficult to use.
The main viable tools for working with plain text email are terminal-based: for example, aerc and mutt and its derivatives. These are, however, a non-starter for most users (even technical users): for example, they are difficult to configure, and integrating with address books from standard email providers is nearly impossible or requires a Nobel prize; a bigger problem is that for most of these tools, one must store passwords in plain text (or use some very complicated authentication scheme that a typical user will be unable to set up).
The only viable modern GUI client recommended by useplaintext.email for macOS is MailMate. Unfortunately, MailMate’s visible settings do not include any option for wrapping text, which would seem to contradict its viability for plain text email users. Furthermore, the versions of MailMate available for download from its website (both the released version and the prerelease) seem to be incompatible with macOS Sonoma. By doing a bit of spelunking, I have found that both these problems can be solved (at least for now).
First, download the latest prerelease binary for MailMate from this directory: http://updates.mailmate-app.com/archives/?C=M;O=D. I found that this program can take a very long time to open the first time, but after that it is speedy.
Second, enable format=flowed using the following completely undocumented command:
defaults write com.freron.MailMate MmFormatFlowedEnabled -bool true
Note that this option is so undocumented that it does not even appear in the list of hidden preferences. There is some discussion of this on the mailing list.
Although the format=flowed standard is very rarely implemented by clients, email sent using this flag will be formatted correctly by SourceHut Lists.
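For readers unfamiliar with format=flowed (RFC 3676), its core idea is simple: a line ending in a space is a “soft” break that the receiving client may rejoin into a paragraph, while a line without a trailing space is a hard break. The following is a minimal sketch of the receiver side, ignoring space-stuffing and quote depth for brevity; the function name unflow is mine, not from any library.

```python
# Minimal sketch of rejoining a format=flowed message body (RFC 3676).
# A trailing space marks a soft break; the signature separator "-- "
# is special-cased, since it must not be flowed despite its trailing space.

def unflow(text: str) -> str:
    paragraphs = []
    current = ""
    for line in text.split("\n"):
        if line.endswith(" ") and line != "-- ":
            current += line  # soft break: keep accumulating the paragraph
        else:
            paragraphs.append(current + line)  # hard break ends the paragraph
            current = ""
    if current:
        paragraphs.append(current)
    return "\n".join(paragraphs)

flowed = "This paragraph was wrapped \nat 72 columns by the sender.\nThis is a second line."
print(unflow(flowed))
# prints:
# This paragraph was wrapped at 72 columns by the sender.
# This is a second line.
```

The point of the convention is that a sender can wrap at 72 columns for old-fashioned clients, while modern clients reflow the same text to the reader’s window width.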
I have been thinking about monoidal closed structures induced by slicing over a monoid, which has been considered by Combette and Munch-Maccagnoni as a potential denotational semantics of destructors à la C++. It occurred to me that this construction is, in fact, almost a degenerate case of Day convolution on an internal monoidal category — and this made me realize that there might be a nice way to understand Day convolution in the language of fibered categories. In fact, many of these results (in particular the relativization to internal monoidal categories) are probably a special case of Theorem 11.22 of Shulman’s paper on enriched indexed categories.
Let
Note that
To understand the construction of the Day tensor, we will go through it step-by-step. Let
Under appropriate assumptions, we may also compute a “Day hom” by adjointness.
Let
First of all, we note that the pullback functor
Let
If
I believe, but did not check carefully, that when
There remain some interesting directions to explore. First of all, the claims above would obviously lead to a new construction of the Day convolution monoidal structure on the 1-category of discrete fibrations on
Let
is highly non-trivial, as monoidal bicategories are extremely difficult to construct explicitly. I am hoping that Mike Shulman’s ideas involving monoidal double categories could potentially help.
I have been thinking again about the relationship between quantitative type theory and synthetic Tait computability and other approaches to type refinements. One of the defining characteristics of QTT that I thought distinguished it from STC was the treatment of types: in QTT, types only depend on the “computational” / unrefined aspect of their context, whereas types in STC are allowed to depend on everything. In the past, I mistakenly believed that this was due to the realizability-style interpretation of QTT, in contrast with STC’s gluing interpretation. It is now clear to me that (1) QTT is actually glued (in the sense of q-realizability, no pun intended), and (2) the nonstandard interpretation of types in QTT corresponds to adding an additional axiom to STC, namely the tininess of the generic proposition.
It has been suggested to me by Neel Krishnaswami that this property of QTT may not be desirable in all cases (sometimes you want the types to depend on quantitative information), and that for this reason, graded type theories might be a better way forward in some applications. My results today show that STC is, in essence, what you get when you relax QTT’s assumption that types do not depend on quantitative information. This suggests that we should explore the idea of multiplicities within the context of STC; as any monoidal product on the subuniverse spanned by closed-modal types induces quite directly a form of variable multiplicity in STC, I expect this direction to be fruitful.
My thoughts on the precise relationship between the QTT models and Artin gluing will be elucidated at a different time. Today, I will restrict myself to sketching an interpretation of a QTT-style language in STC assuming the generic proposition is internally tiny.
Let
We now consider the interpretation of a language of (potentially quantitative) refinements into
So far we have not needed anything beyond the base structure of STC in order to give an interpretation of types in QTT’s style. But to extend this interpretation to a universe, we must additionally assume that
As
Although
We will now see how to use the adjunction
Thus we see that if
Finally, we comment that the tininess of
For any small category
We will assume here that
For any small category
The universal property of
Given a morphism of topoi
In case
In particular, let
To give an explicit computation of the tensor product, we first compute the inverse image of any
We are now prepared to compute the tensor product of any
Finally, we can relate the computation above to that of Fiore in terms of coends.
Above, we have written
Thanks to Marcelo Fiore and Daniel Gratzer for helpful discussions.
I have long been an enthusiast for outliners, a genre of computer software that deserves more than almost any other to be called an “elegant weapon for a more civilized age”. Recently I have been enjoying experimenting with Jesse Grosjean’s highly innovative outliner for macOS called Bike, which builds on a lot of his previous (highly impressive) work in the area with a level of fit and finish that is rare even in the world of macOS software. Bike costs $29.99 and is well-worth it; watch the introductory video or try the demo to see for yourself.
The purpose of outliners is to provide room to actively think; Grothendieck is said to have been unable to think at all without a pen in his hand, and I think of outliners as one way to elevate the tactile aspect of active thinking using the unique capabilities of software. Tools for thinking must combat stress and mental weight, and the most immediate way that outliners achieve this is through the ability to focus on a subtree: narrowing into a portion of the hierarchy and treating it as if it were the entire document, putting its context aside. This feature, which some of my readers may recognize from Emacs org-mode, is well-represented in Bike, without suffering the noticeable quirks that come from the Emacs environment, nor the ill-advised absolute/top-down model of hierarchy sadly adopted by org-mode.
As a scientist in academia, one of the most frequent things I am doing when I am not writing my own papers or working with students is refereeing other scientists’ papers. For those who are unfamiliar, this means carefully studying a paper and then producing a detailed and well-structured report that includes a summary of the paper, my personal assessment of its scientific validity and value, and a long list of corrections, questions, comments, and suggestions. Referee reports of this kind are then used by journal editors and conference program committees to decide which papers deserve to be published.
In this post, I will give an overview of my refereeing workflow with Bike and how I overcame the challenges transferring finished referee reports from Bike into the text-based formats used by conference refereeing platforms like HotCRP and EasyChair using a combination of XSLT 2.0 and Pandoc. This tutorial applies to Bike 1.14; I hope the format will not change too much, but I cannot make promises about what I do not control.
Most scientific conferences solicit and organize reviews for papers using a web platform such as HotCRP and EasyChair; although these are not the same, the idea is similar. Once you have been assigned to referee a paper, you will receive a web form with several sections and large text areas in which to put the components of your review; not all conferences ask for the same components, but usually one is expected to include the following in addition to your (numerical) assessment of the paper’s merit and your expertise:
Usually you will be asked to enter your comments under each section in a plain text format like Markdown. The first thing a new referee learns is not to type answers directly into the web interface, because this is an extremely reliable way to lose hours of your time when a browser or server glitch deletes all your work. Most of us instead write out our answers in a separate text editor, and paste them into the web interface when we are satisfied with them. In the past, I have done this with text files on my computer, but today I want to discuss how to draft referee reports as outlines in Bike; then I will show you how to convert them to the correct plain text format that can be pasted into your conference’s preferred web-based refereeing platform.
To start off, have a look at the figure below, which shows my refereeing template outline in Bike.
As you can see, there is a healthy combination of hierarchy and formatting in a Bike outline.
Bike is a rich text editor, but one that (much like GNU TeXmacs) avoids the classic pitfalls of nearly all rich text editors, such as the ubiquitous and dreaded “Is the space italic?!” user-experience failure; I will not outline Bike’s innovative approach to rich text editing here, but I suggest you check it out for yourself.
One of the most useful features of Bike’s approach to formatting is the concept of a row type, which is a semantic property of a row that has consequences for its visual presentation. Bike currently supports the following row types:
My refereeing template uses several of these row types (headings, notes, tasks) as well as some of the rich text formatting (highlighting). When I fill out the refereeing outline, I will use other row types as well, including quotes, ordered, and unordered rows. I will create a subheading under Detailed comments for the author to contain my questions and comments, which I enter as ordered rows; then I make a separate subheading at the same level for Typographical errors, which I populate with unordered rows. Unordered rows are best for typos, because each typo is already accompanied by a line number. If I need to quote an extended portion of the paper, I will use a quote row.
When working on the report outline, I will constantly be focusing on individual sections to avoid not only distractions but also the intense mental weight of unfinished sections. Focusing means that the entire outline is narrowed to a subtree that can be edited away from its context; this is achieved in Bike by pressing the gray south-easterly arrows to the right of each heading, as seen in the figure.
Although we have seen how pleasant it is to use an outliner like Bike to draft a referee report, we obviously cannot submit a .bike file to a conference reviewing website or a journal editor. Most conference reviewing systems accept plain text or Markdown responses, and so our goal will be to convert a Bike outline into reasonably formatted Markdown.
It happens that Bike’s underlying format is HTML, so one idea would be to use Pandoc to process this HTML into Markdown. This would work, except that Bike’s model is sufficiently structured that it must make deeply idiosyncratic use of HTML tags, as can be seen from the listing below.
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta charset="utf-8"/>
  </head>
  <body>
    <ul id="2sbcmmms">
      <li id="3C" data-type="heading">
        <p>Tasks</p>
        <ul>
          <li id="cs" data-type="task">
            <p>read through paper on iPad and highlight</p>
          </li>
        </ul>
      </li>
      <li id="XZ" data-type="heading">
        <p>Syntax and semantics of foo bar baz</p>
        <ul>
          <li id="Kp">
            <p><mark>Overall merit:</mark><span> </span></p>
          </li>
          <li id="d9">
            <p><mark>Reviewer Expertise:</mark></p>
          </li>
          <li id="uM" data-type="heading">
            <p>Summary of the paper</p>
            <ul>
              <li id="oB" data-type="note">
                <p>Please give a brief summary of the paper</p>
              </li>
              <li id="TWZ">
                <p>This paper describes the syntax and semantics of foo, bar, and baz.</p>
              </li>
            </ul>
          </li>
          <li id="V8" data-type="heading">
            <p>Assessment of the paper</p>
            <ul>
              <li id="vD" data-type="note">
                <p>Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.</p>
              </li>
            </ul>
          </li>
          <li id="zo" data-type="heading">
            <p>Detailed comments for authors</p>
            <ul>
              <li id="o0" data-type="note">
                <p>Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.</p>
              </li>
              <li id="bgy" data-type="heading">
                <p>Minor comments</p>
                <ul>
                  <li id="tMq" data-type="unordered">
                    <p>line 23: "teh" => "the"</p>
                  </li>
                  <li id="EmX" data-type="unordered">
                    <p>line 99: "fou" => "foo"</p>
                  </li>
                </ul>
              </li>
            </ul>
          </li>
          <li id="aN" data-type="heading">
            <p>Questions to be addressed by author response</p>
            <ul>
              <li id="7s" data-type="note">
                <p>Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.</p>
              </li>
            </ul>
          </li>
          <li id="4S" data-type="heading">
            <p>Comments for PC and other reviewers</p>
            <ul>
              <li id="bN" data-type="note">
                <p>Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.</p>
              </li>
              <li id="i2b">
                <p>In case any one is wondering, I am an expert in foo, but not in bar nor baz.</p>
              </li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </body>
</html>
The Markdown that would result from postprocessing a Bike outline directly with Pandoc would be deeply unsuitable for submission. We will, however, use a version of this idea: first we will preprocess the Bike format into more conventional (unstructured) HTML using XSLT 2.0, and then we will use Pandoc to convert this into Markdown.
XSLT 2.0 is unfortunately not widely implemented outside of Saxon, developed by Saxonica. Nonetheless, it is possible to freely install Saxon on macOS using Homebrew:
brew install saxon
You must also install Pandoc, which is also conveniently available as a binary on Homebrew:
brew install pandoc
With the system requirements out of the way, we can proceed to prepare an XSLT stylesheet that will convert Bike’s idiosyncratic use of HTML tags to more conventional HTML that can be processed into Markdown by Pandoc. The stylesheet bike-to-html.xsl is described and explained in the listing below.
We can convert Bike outlines to reasonable HTML using an XSLT 2.0 stylesheet, bike-to-html.xsl, detailed below.
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
                xmlns="http://www.w3.org/1999/xhtml"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xhtml="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="xhtml">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
  <xsl:strip-space elements="*" />
We will allow several tags to be copied verbatim into the output, as Bike uses these in the same way that idiomatic HTML does.
<xsl:template match="xhtml:html | xhtml:body | xhtml:code | xhtml:strong | xhtml:em | xhtml:mark">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*" />
  </xsl:copy>
</xsl:template>
Bike leaves behind a lot of empty span elements; we drop these.
<xsl:template match="xhtml:span">
  <xsl:apply-templates />
</xsl:template>
Bike uses ul for all lists; the list type is determined not at this level, but rather by each individual item’s @data-type attribute. To get this data into the HTML list model, we must group items that have the same @data-type and wrap them in an appropriate list-forming element.
To do this, we use XSLT 2.0’s xsl:for-each-group instruction to group adjacent li elements by their @data-type attribute. (It is extremely difficult and error-prone to write equivalent code in the more widely available XSLT 1.0.) We must convert @data-type to a string: otherwise, the transformer will crash when it encounters an item without a @data-type attribute.
<xsl:template match="xhtml:ul">
  <xsl:for-each-group select="xhtml:li" group-adjacent="string(@data-type)">
    <xsl:choose>
      <xsl:when test="@data-type='ordered' or @data-type='task'">
        <ol>
          <xsl:apply-templates select="current-group()" />
        </ol>
      </xsl:when>
      <xsl:when test="@data-type='unordered'">
        <ul>
          <xsl:apply-templates select="current-group()" />
        </ul>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="current-group()" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:template>
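For readers curious whether this grouping step truly requires XSLT 2.0: the same adjacent-grouping idea can be sketched in ordinary Python with itertools.groupby. This is a standalone illustration, not part of the pipeline; the function name group_rows is hypothetical, though the li element and @data-type conventions follow the Bike listing above.

```python
# Sketch of xsl:for-each-group's group-adjacent semantics in Python:
# maximal runs of adjacent list items sharing a data-type are grouped together.
from itertools import groupby
import xml.etree.ElementTree as ET

def group_rows(ul: ET.Element):
    """Yield (data_type, [li, ...]) for each maximal run of adjacent rows."""
    items = [li for li in ul if li.tag.endswith("li")]
    # string(@data-type): items without the attribute group under ""
    for key, run in groupby(items, key=lambda li: li.get("data-type", "")):
        yield key, list(run)

ul = ET.fromstring(
    '<ul>'
    '<li data-type="unordered"><p>line 23</p></li>'
    '<li data-type="unordered"><p>line 99</p></li>'
    '<li><p>plain row</p></li>'
    '</ul>'
)
for key, run in group_rows(ul):
    print(key or "(none)", len(run))
# prints:
# unordered 2
# (none) 1
```

Like group-adjacent, itertools.groupby only merges neighbouring items with equal keys, which is exactly the behaviour we want for turning runs of rows into lists.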
Next, we match each individual li element; the content of a list item is stored in a p element directly under li, so we let the transformer fall through the parent and then format the content underneath according to the @data-type of the item.
<xsl:template match="xhtml:li">
  <xsl:apply-templates />
</xsl:template>

<xsl:template match="xhtml:li[@data-type='ordered' or @data-type='unordered' or @data-type='task']/xhtml:p">
  <li>
    <xsl:apply-templates />
  </li>
</xsl:template>
Bike has correctly adopted the optimal explicit and relative model of hierarchy, in contrast to HTML; this means that the depth of a heading is not reflected in the element that introduces it, but is instead inferred from its actual position in the outline hierarchy. To convert Bike outlines to idiomatic HTML, we must flatten the hierarchy and introduce explicit heading levels; luckily, this is easy to accomplish in XSLT by counting the ancestors of heading type.
<xsl:template match="xhtml:li[@data-type='heading']/xhtml:p">
  <xsl:element name="h{count(ancestor::xhtml:li[@data-type='heading'])}">
    <xsl:apply-templates />
  </xsl:element>
</xsl:template>
The remainder of the row types are not difficult to render; you may prefer alternative formatting depending on your goals.
<xsl:template match="xhtml:li[@data-type='quote']/xhtml:p">
  <blockquote>
    <xsl:apply-templates />
  </blockquote>
</xsl:template>

<xsl:template match="xhtml:li[@data-type='note']/xhtml:p">
  <p>
    <em>
      <xsl:apply-templates />
    </em>
  </p>
</xsl:template>

<xsl:template match="xhtml:li[not(@data-type)]/xhtml:p">
  <p>
    <xsl:apply-templates />
  </p>
</xsl:template>

</xsl:stylesheet>
Next, we can use Saxon to convert a Bike outline to idiomatic HTML using the stylesheet above.
cat review.bike | saxon -xsl:bike-to-html.xsl - > review.html
Go ahead and open the resulting HTML file in a text editor and a browser to see the results.
The following is the result of transforming an example Bike outline to idiomatic HTML using an XSLT stylesheet.
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Tasks</h1>
    <ol>
      <li>read through paper on iPad and highlight</li>
    </ol>
    <h1>Syntax and semantics of foo bar baz</h1>
    <p><mark>Overall merit:</mark></p>
    <p><mark>Reviewer Expertise:</mark></p>
    <h2>Summary of the paper</h2>
    <p><em>Please give a brief summary of the paper</em></p>
    <p>This paper describes the syntax and semantics of foo, bar, and baz.</p>
    <h2>Assessment of the paper</h2>
    <p><em>Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.</em></p>
    <h2>Detailed comments for authors</h2>
    <p><em>Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.</em></p>
    <h3>Minor comments</h3>
    <ul>
      <li>line 23: "teh" => "the"</li>
      <li>line 99: "fou" => "foo"</li>
    </ul>
    <h2>Questions to be addressed by author response</h2>
    <p><em>Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.</em></p>
    <h2>Comments for PC and other reviewers</h2>
    <p><em>Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.</em></p>
    <p>In case any one is wondering, I am an expert in foo, but not in bar nor baz.</p>
  </body>
</html>
Next, we will process this HTML file using Pandoc; unfortunately, Pandoc leaves behind a lot of garbage character escapes that are not suitable for submission anywhere, so we must filter those out using sed.
cat review.html | pandoc -f html -t markdown-raw_html-native_divs-native_spans-fenced_divs-bracketed_spans-smart | sed 's/\\//g'
The following is the result of converting the idiomatic HTML representation of a Bike outline to Markdown using Pandoc, with some light postprocessing by sed.
# Tasks

1. read through paper on iPad and highlight

# Syntax and semantics of foo bar baz

Overall merit:

Reviewer Expertise:

## Summary of the paper

*Please give a brief summary of the paper*

This paper describes the syntax and semantics of foo, bar, and baz.

## Assessment of the paper

*Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.*

## Detailed comments for authors

*Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.*

### Minor comments

- line 23: "teh" => "the"
- line 99: "fou" => "foo"

## Questions to be addressed by author response

*Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.*

## Comments for PC and other reviewers

*Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.*

In case any one is wondering, I am an expert in foo, but not in bar nor baz.
We can combine these steps into a single pipeline as follows:
cat review.bike | saxon -xsl:bike-to-html.xsl - | pandoc -f html -t markdown-raw_html-native_divs-native_spans-fenced_divs-bracketed_spans-smart | sed 's/\\//g'
I have gathered the scripts to convert Bike outlines into Markdown via idiomatic HTML in a Git repository where they can be easily downloaded. If you have any improvements to these scripts, please submit them as a patch to my public inbox! I am also interested in whether it is possible to rewrite the XSLT 2.0 stylesheet as equivalent XSLT 1.0, to avoid requiring the Saxon tool. Feel free also to send comments on this post to my public inbox, or discuss with me on Mastodon.
Apart from my day-job at the University of Cambridge, I am independently researching tools for scientific thought and developing software like Forester that you can use to unlock your brain. If you have benefited from this work or the writings on my blog, please consider supporting me with a sponsorship on Ko-fi.
Lurie states the following result as Proposition 7.2.1.14 of Higher Topos Theory:
Let $\mathcal{X}$ be an ∞-topos and let $\tau_{\leq 0} \colon \mathcal{X} \to \tau_{\leq 0}\mathcal{X}$ be the left adjoint to the inclusion. A morphism $\phi \colon U \to X$ in $\mathcal{X}$ is an effective epimorphism if and only if $\tau_{\leq 0}(\phi)$ is an effective epimorphism in the ordinary topos $\mathrm{h}(\tau_{\leq 0}\mathcal{X})$.
Several users of MathOverflow have noticed that the proof of this result given by Lurie is circular. The result is true and can be recovered in a variety of ways, as pointed out in the MathOverflow discussion. For expository purposes, I would like to show how to prove this result directly in univalent foundations using standard results about
First we observe a trivial lemma about truncations.
Any proposition is orthogonal to the map
This follows from the dependent induction principle of set-truncation, as any proposition is a set.
It follows from that
A function
We deduce that
by definition
by induction
by Lemma
by HoTT Book, Theorem 7.3.12
by HoTT Book, Theorem 7.3.9
by definition
Incidentally, the appeal to Theorem 7.3.12 of the HoTT Book can be replaced by the more general Proposition 2.26 of Christensen et al., which applies to an arbitrary reflective subuniverse and the corresponding subuniverse of separated types.
The Stacks project is the most successful scientific hypertext project in history. Its goal is to lay the foundations for the theory of algebraic stacks; to facilitate its scalable and sustainable development, several important innovations have been introduced, with the tags system being the most striking.
Each tag refers to a unique item (section, lemma, theorem, etc.) in order for this project to be referenceable. These tags don't change even if the item moves within the text. (Tags explained, The Stacks Project).
Many working scientists, students, and hobbyists have wished to create their own tag-based hypertext knowledge base, but the combination of tools historically required to make this happen is extremely daunting. Both the Stacks project and Kerodon use a cluster of software called Gerby, but bitrot has set in and it is no longer possible to build its dependencies on a modern environment without significant difficulty, raising questions of longevity.
Moreover, Gerby’s deployment involves running a database on a server (in spite of the fact that almost the entire functionality is static HTML), an architecture that is incompatible with the constraints of the everyday working scientist or student who knows at most how to upload static files to their university-provided public storage. The nLab’s recent pandemic-era hiatus and near-death experience has demonstrated with some urgency the precarity faced by any project relying heavily on volunteer system administrators.
After spending two years exploring the design of tools for scientific thought that meet the unique needs of real, scalable scientific writing in hypertext, I have created a tool called Forester which has the following benefits:
Forester combines associative and hierarchical networks of evergreen notes (called “trees”) into hypertext sites called “forests”.
A forest of evergreen notes (or a forest for short) is loosely defined to be a collection of evergreen notes in which multiple hierarchical structures are allowed to emerge and evolve over time. Concretely, one note may contextualize several other notes via transclusion within its textual structure; in the context of a forest, we refer to an individual note as a tree. Of course, a tree can be viewed as a forest that has a root node.
Trees correspond roughly to what are referred to as “tags” in the Stacks Project.
In this article, I will show you how to set up your own forest using the Forester software. These instructions pertain to Forester 2.3.
In this section, we will walk through the installation of the Forester software.
Forester requires a unix-based system to run; it has been tested on both macOS and Linux. Windows support is desirable, but there are no concrete plans to implement it at this time.
Forester is a tool written in the OCaml programming language, and makes use of the latest features of OCaml 5. Most users should install Forester through OCaml's opam package manager; instructions to install opam and OCaml simultaneously can be found here.
If you intend to embed
It is best practice to maintain your forest inside of distributed version control. This serves not only as a way to prevent data loss (because you will be pushing frequently to a remote repository); it also allows you to easily roll back to an earlier version of your forest, or to create “branches” in which you prepare trees that are not yet ready to be integrated into the forest.
The recommended distributed version control system is git, which comes preinstalled on many unix-based systems and is easy to install otherwise. Git is not the most user-friendly piece of software, unfortunately, but it is ubiquitous. It is possible (but not recommended) to use Forester without version control, but note that the simplest way to initialize your own forest involves cloning a git repository.
Once you have met the system requirements, installing Forester requires only a single shell command:
opam install forester
To verify that Forester is installed, run forester --version in your shell.
Now that you have installed the Forester software, it is time to set up your forest. Currently, the simplest way to set up a new forest is to clone your own copy of the forest-template
repository which has been prepared for this purpose. In the future, I plan to have a more user-friendly way to set up new forests; please contact the mailing list if this is an urgent need, and it will be prioritized accordingly.
We will initialize a new forest in a folder called forest
. Open your shell to a suitable parent directory (e.g. your documents folder), and type the following commands:
git init forest
cd forest
The first command above creates a new folder called forest
and initializes a git repository inside that folder. The second command instructs your shell to enter the forest
directory. Next, we will initialize your new forest with the contents of the default template by pulling from the template repository:
git pull https://git.sr.ht/~jonsterling/forest-template
A tree in Forester is associated to an address of the form xxx-NNNN
where xxx
is your chosen “namespace” (most likely your initials) and NNNN
is a four-digit base-36 number. The purpose of the namespace and the base-36 code is to uniquely identify a tree, not only within your forest but across all forests. A tree with address xxx-NNNN
is always stored in a file named xxx-NNNN.tree
.
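In practice Forester chooses the next free address for you (via the forester new command covered below), but the base-36 convention is easy to illustrate. The following Python sketch, which is my own illustration and not part of Forester, computes the successor of an address:

```python
import string

# Base-36 digit alphabet used by tree addresses: 0-9 followed by A-Z.
DIGITS = string.digits + string.ascii_uppercase

def to_base36(n: int, width: int = 4) -> str:
    """Render a non-negative integer in base 36, zero-padded to `width`."""
    out = ""
    while n > 0:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out.rjust(width, "0")

def next_address(addr: str) -> str:
    """Compute the successor of an address like 'xxx-007J'."""
    prefix, code = addr.rsplit("-", 1)
    return f"{prefix}-{to_base36(int(code, 36) + 1, width=len(code))}"
```

For example, next_address("xxx-000Z") yields "xxx-0010", since Z is the largest base-36 digit.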
Note that the format of tree addresses is purely a matter of convention, and is not forced by the Forester tool. Users are free to use their own format for tree addresses, and in some cases alternative (human-readable) formats may be desirable: this includes trees representing bibliographic references, as well as biographical trees.
The template forest assumes that your chosen namespace prefix is xxx
, and comes equipped with a “root” tree located at trees/xxx-0001.tree
. You should choose your own namespace prefix, likely your personal initials, and then rename trees/xxx-0001.tree
accordingly.
To build your forest, run the following Forester command in your shell, where xxx-0001 is the name of your root tree:
forester build --dev --root=xxx-0001 trees/
The --dev
flag is optional, and when activated supplies metadata to the generated website to support an “edit button” on each tree; this flag is meant to be used when developing your forest locally, and should not be used when building the forest to be uploaded to your public web host. The --root=xxx-0001
option has the effect of rendering the tree xxx-0001
to a file named output/index.xml
rather than to output/xxx-0001.xml
.
Forester renders your forest to some XML files in the output/
directory; XML is, like HTML, a format for structured documents that can be displayed by web browsers. The forest template comes equipped with a built-in XSLT stylesheet (assets/forest.xsl
) which is used to instruct web browsers how to render your forest into a pleasing and readable format.
The recommended and most secure way to view your forest while editing it is to serve it from a local web server. To do this, first ensure that you have Python 3 correctly installed. Then run the following command from the root directory of your forest:
python3 -m http.server 1313 -d output
While this command is running, you will be able to access your forest by navigating to localhost:1313/index.xml
in your preferred web browser.
In the future, Forester may be able to run its own local server to avoid the dependency on external tools like Python.
It is also possible to open the generated file output/index.xml
directly in your web browser. Unfortunately, modern web browsers by default prevent the use of XSLT stylesheets on the local file system for security reasons. Because Forester's output format is XML, the output cannot be viewed directly in your web browser without disabling this security feature (at your own risk). Users who do not understand the risks involved should turn back and use a local web server instead, which is more secure; if you understand and are willing to accept the risks, you may proceed as follows depending on your browser.
To configure Firefox for viewing your local forest, navigate to about:config in your address bar. Search for the preference security.fileuri.strict_origin_policy, which will appear set to true. Double-click on the word true to toggle it to false.
To configure Safari for viewing your local forest, you must activate the Develop menu and then toggle one setting.
The first tree that you should create is a biographical tree to represent your own identity; ultimately you will link to this tree when you set the authors of other trees that you create later on. Although most trees will be addressed by identifiers of the form xxx-NNNN
, it is convenient to simply use a person’s full name to address a biographical tree. My own biographical tree is located at trees/people/jonmsterling.tree
and contains the following source code:
\title{Jon Sterling}
\taxon{person}
\meta{external}{https://www.jonmsterling.com/}
\meta{institution}{[[ucam]]}
\meta{orcid}{0000-0002-0585-5564}
\meta{position}{Associate Professor}

\p{Associate Professor in Logical Foundations and Formal Methods at University of Cambridge. Formerly a [Marie Skłodowska-Curie Postdoctoral Fellow](jms-0061) hosted at Aarhus University by [Lars Birkedal](larsbirkedal), and before this a PhD student of [Robert Harper](robertharper).}
Let’s break this code down to understand what it does.
- \title{Jon Sterling} sets the title of the tree to my name.
- The \taxon{person} declaration informs Forester that the tree is biographical. Not every tree needs to have a taxon; common taxa include person, theorem, definition, lemma, blog, etc. You are free to use whatever you want, but some taxa are treated specially by Forester.
- The \meta declarations attach additional information to the tree that can be used during rendering. These declarations are optional, and you are free to put whatever metadata you want.
- The main content of the tree is given in a paragraph \p{...}. Do not hard-wrap your text, as this can have a visible impact on how trees are rendered; it is recommended that you use a text editor with good support for soft-wrapping, like Visual Studio Code.
You can see that the concrete syntax of Forester's trees looks superficially like a combination of TeX and Markdown.
forester new
Creating a new tree in your forest is as simple as adding a .tree
file to the trees
folder. Because it is hard to manually choose the next incremental tree address, Forester provides a command to do this automatically. If your chosen namespace prefix is xxx
, then you should use the following command in your shell to create a new tree:
forester new --dir=trees --prefix=xxx
In return, Forester should output a message like Created tree xxx-007J
; this means that there is a new file located in trees/xxx-007J.tree
which you can populate. If we look at the contents of this new file, we will see that it is empty except for metadata assigning a date to the tree:
\date{2023-08-15}
Most trees should have a \date
annotation; this date is meant to be the date of the tree's creation. You should proceed by adding further metadata: the title and the author; for the latter, you will use the address of your personal biographical tree.
\title{my first tree}
\author{jonmsterling}
Tree titles should be given in lower case (except for proper names, etc.); these titles will be rendered by Forester in sentence case. A tree can have as many \author
declarations as it has authors; these will be rendered in their order of appearance.
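Putting these pieces together, the frontmatter of a freshly created tree might read as follows (with jonmsterling replaced by the address of your own biographical tree, and the date as generated by Forester):

```
\date{2023-08-15}
\title{my first tree}
\author{jonmsterling}
```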
Now you can begin to populate the tree with its content, written in the Forester markup language. Think carefully about keeping each tree relatively independent and atomic.
You may be used to writing
Forester’s bottom-up approach to section hierarchy works via something called transclusion. The idea is that at any time, you can include (“transclude”) the full contents of another tree into the current tree as a subsection by adding the following code:
\transclude{xxx-NNNN}
This is somewhat like the \input command of LaTeX, but much better behaved: for instance, section numbers are computed on the fly. This entire tutorial is cobbled together by transcluding many smaller trees, each with their own independent existence. For example, the following two sections are transcluded from an entirely different part of my forest:
It is easy to make the mistake of prematurely imposing a complex hierarchical structure on a network of notes, which leads to excessive refactoring. Hierarchy should be used sparingly, and its strength is for the large-scale organization of ideas. The best structure to impose on a network of many small related ideas is a relatively flat one. I believe that this is one of the mistakes made in the writing of the foundations of relative category theory, whose hierarchical nesting was too complex and quite beholden to my experience with pre-hypertext media.
One of the immediate impacts and strengths of Forester’s transclusion model is that a given tree has no canonical “geographic” location in the forest. One tree can appear as a child of many other trees, which allows the same content to be incorporated into different textual and intellectual narratives.
Multiple hierarchical structures can be imposed on the same associative network of nodes; a hierarchical structure amounts to a “narrative” that contextualizes a given subgraph of the network. One example could be the construction of lecture notes; another example could be a homework sheet; a further example could be a book chapter or scientific article. Although these may draw from the same body of definitions, theorems, examples, and exercises, these objects are contextualized within a different narrative, often toward fundamentally different ends.
As a result, any interface for navigating the neighbor-relation in hierarchically organized notes would need to take account of the multiplicity of parent nodes. Most hypertext tools assume that the position of a node in the hierarchy is unique, and therefore have a single “next/previous” navigation interface; we must investigate the design of interfaces that surface all parent/neighbor relations.
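As a concrete sketch of this multiplicity (using hypothetical addresses), a lecture-notes tree might impose one narrative on a handful of smaller trees simply by transcluding them:

```
\title{lecture notes on categories}
\author{jonmsterling}
\date{2023-08-15}

\p{These notes collect some definitions and examples from elsewhere in the forest.}

\transclude{xxx-0002}
\transclude{xxx-0003}
\transclude{xxx-0004}
```

A homework sheet elsewhere in the forest could transclude xxx-0003 just as well; neither tree is its canonical home.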
A tree in Forester is a single file written in a markup language designed specifically for scientific writing with bottom-up hierarchy via transclusion. A tree has two components: the frontmatter and the mainmatter.
The frontmatter of a Forester tree is a sequence of declarations that we summarize below.
Declaration | Meaning
---|---
\title{...} | sets the title of the tree; can contain mainmatter markup
\author{name} | sets the author of the tree to be the biographical tree at address name
\date{YYYY-MM-DD} | sets the creation date of the tree
\taxon{taxon} | sets the taxon of the tree; example taxa include lemma, theorem, person, reference; the latter two are treated specially by Forester for tracking biographical and bibliographical trees respectively
\def\ident[x][y]{body} | defines and exports from the current tree a function named \ident with two arguments; subsequently, the expression \ident{u}{v} expands to body with the values of u, v substituted for \x, \y
\import{xxx-NNNN} | brings the functions exported by the tree xxx-NNNN into scope
\export{xxx-NNNN} | brings the functions exported by the tree xxx-NNNN into scope, and re-exports them from the current tree
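For instance, suppose a hypothetical tree xxx-0005 uses \def in its frontmatter to define and export macros \CCat and \Hom, in the style of the complete example at the end of this section. Another tree can then bring that notation into scope:

```
\import{xxx-0005}
\title{an example using imported notation}

\p{The hom-set #{\Hom{X}{Y}} consists of morphisms in \CCat from #{X} to #{Y}.}
```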
Below we summarize the concrete syntax of the mainmatter in a Forester tree.
Function | Meaning
---|---
\p{...} | creates a paragraph containing ...; unlike Markdown, it is mandatory to annotate paragraphs explicitly
\em{...} | typesets the content in italics
\strong{...} | typesets the content in boldface
#{...} | typesets the content in inline math mode
##{...} | typesets the content in display math mode
\transclude{xxx-NNNN} | transcludes the tree at address xxx-NNNN as a subsection
[title](address) | formats the text title as a hyperlink to address; if address is the address of a tree, the link will point to that tree, and otherwise it is treated as a URL
\let\ident[x][y]{body} | defines a local function named \ident with two arguments; subsequently, the expression \ident{u}{v} expands to body with the values of u, v substituted for \x, \y
\code{...} | typesets the content in monospace
\tex{...} | renders the body externally as a LaTeX figure, using any \texpackage{pkgname} declarations in the frontmatter
An example of a complete tree in the Forester markup language can be seen below.
\title{creation of (co)limits}
\date{2023-02-11}
\taxon{definition}
\author{jonmsterling}

\def\CCat{#{\mathcal{C}}}
\def\DCat{#{\mathcal{D}}}
\def\ICat{#{\mathcal{I}}}
\def\Mor[arg1][arg2][arg3]{#{{\arg2}\xrightarrow{\arg1}{\arg3}}}

\p{Let \Mor{U}{\CCat}{\DCat} be a functor and let \ICat be a category. The functor #{U} is said to \em{create (co)limits of #{\ICat}-figures} when for any diagram \Mor{C_\bullet}{\ICat}{\CCat} such that #{\ICat\xrightarrow{C_\bullet}\CCat\xrightarrow{F}\DCat} has a (co)limit, then #{C_\bullet} has a (co)limit that is both preserved and reflected by #{F}.}
The code above results in the rendered tree, which is transcluded at this point on the original page.
Now that you have created your forest and added a few trees of your own, it is time to upload it to your web host. Many users of Forester will have university-supplied static web hosting, and others may prefer to use GitHub pages; deploying a forest works the same way in either case.
Simply take the contents of the output directory and upload them to your preferred web host.
I am eager to see the new forests that people create using Forester. I am happy to offer personal assistance via the mailing list.
Many aspects of Forester are in flux and not fully documented; it will often be instructive to consult the source of existing forests, such as my own. Please be aware that my own forest tends to use the latest (often unreleased) version of Forester, so there may be slight incompatibilities.
Have fun, and be sure to send me links to your forests when you have made them!
We develop a denotational semantics for general reference types in an impredicative version of guarded homotopy type theory, an adaptation of synthetic guarded domain theory to Voevodsky’s univalent foundations. We observe for the first time the profound impact of univalence on the denotational semantics of mutable state. Univalence automatically ensures that all computations are invariant under symmetries of the heap—a bountiful source of program equivalences. In particular, even the most simplistic univalent model enjoys many new program equivalences that do not hold when the same constructions are carried out in the universes of traditional set-level (extensional) type theory.
We present decalf, a directed, effectful cost-aware logical framework for studying quantitative aspects of functional programs with effects. Like calf, the language is based on a formal phase distinction between the extension and the intension of a program, its pure behavior as distinct from its cost measured by an effectful step-counting primitive. The type theory ensures that the behavior is unaffected by the cost accounting. Unlike calf, the present language takes account of effects, such as probabilistic choice and mutable state; this extension requires a reformulation of calf’s approach to cost accounting: rather than rely on a “separable” notion of cost, here a cost bound is simply another program. To make this formal, we equip every type with an intrinsic preorder, relaxing the precise cost accounting intrinsic to a program to a looser but nevertheless informative estimate. For example, the cost bound of a probabilistic program is itself a probabilistic program that specifies the distribution of costs. This approach serves as a streamlined alternative to the standard method of isolating a recurrence that bounds the cost in a manner that readily extends to higher-order, effectful programs.
The development proceeds by first introducing the decalf type system, which is based on an intrinsic ordering among terms that restricts in the extensional phase to extensional equality, but in the intensional phase reflects an approximation of the cost of a program of interest. This formulation is then applied to a number of illustrative examples, including pure and effectful sorting algorithms, simple probabilistic programs, and higher-order functions. Finally, we justify decalf via a model in the topos of augmented simplicial sets.
This paper considers direct encodings of generalized algebraic data types (GADTs) in a minimal suitable lambda-calculus. To this end, we develop an extension of System Fω with recursive types and internalized type equalities with injective constant type constructors. We show how GADTs and associated pattern-matching constructs can be directly expressed in the calculus, thus showing that it may be treated as a highly idealized modern functional programming language. We prove that the internalized type equalities in conjunction with injectivity rules increase the expressive power of the calculus by establishing a non-macro-expressibility result in Fω, and prove the system type-sound via a syntactic argument. Finally, we build two relational models of our calculus: a simple, unary model that illustrates a novel, two-stage interpretation technique, necessary to account for the equational constraints; and a more sophisticated, binary model that relaxes the construction to allow, for the first time, formal reasoning about data-abstraction in a calculus equipped with GADTs.
Separation logic is used to reason locally about stateful programs. State of the art program logics for higher-order store are usually built on top of untyped operational semantics, in part because traditional denotational methods have struggled to simultaneously account for general references and parametric polymorphism. The recent discovery of simple denotational semantics for general references and polymorphism in synthetic guarded domain theory has enabled us to develop Tulip, a higher-order separation logic over the typed equational theory of higher-order store for a monadic version of System
It often happens that free algebras for a given theory satisfy useful reasoning principles that are not preserved under homomorphisms of algebras, and hence need not hold in an arbitrary algebra. For instance, if
As type theories have become increasingly more sophisticated, it has become more and more difficult to establish the useful properties of their free models that facilitate effective implementation. These obstructions have facilitated a fruitful return to foundational work in type theory, which has taken on a more geometrical flavor than ever before. Here we expose a modern way to prove a highly non-trivial injectivity law for free models of Martin-Löf type theory, paying special attention to the ways that contemporary methods in type theory have been influenced by three important ideas of the Grothendieck school: the relative point of view, the language of universes, and the recollement of generalized spaces.
Jacobs has proposed definitions for (weak, strong, split) generic objects for a fibered category; building on his definition of (split) generic objects, Jacobs develops a menagerie of important fibrational structures with applications to categorical logic and computer science, including higher order fibrations, polymorphic fibrations, 𝜆2-fibrations, triposes, and others. We observe that a split generic object need not in particular be a generic object under the given definitions, and that the definitions of polymorphic fibrations, triposes, etc. are strict enough to rule out some fundamental examples: for instance, the fibered preorder induced by a partial combinatory algebra in realizability is not a tripos in this sense. We propose a new alignment of terminology that emphasizes the forms of generic object appearing most commonly in nature, i.e. in the study of internal categories, triposes, and the denotational semantics of polymorphism. In addition, we propose a new class of acyclic generic objects inspired by recent developments in higher category theory and the semantics of homotopy type theory, generalizing the realignment property of universes to the setting of an arbitrary fibration.
Several different topoi have played an important role in the development and applications of synthetic guarded domain theory (SGDT), a new kind of synthetic domain theory that abstracts the concept of guarded recursion frequently employed in the semantics of programming languages. In order to unify the accounts of guarded recursion and coinduction, several authors have enriched SGDT with multiple “clocks” parameterizing different time-streams, leading to more complex and difficult to understand topos models. Until now these topoi have been understood very concretely qua categories of presheaves, and the logico-geometrical question of what theories these topoi classify has remained open. We show that several important topos models of SGDT classify very simple geometric theories, and that the passage to various forms of multi-clock guarded recursion can be rephrased more compositionally in terms of the lower bagtopos construction of Vickers and variations thereon due to Johnstone. We contribute to the consolidation of SGDT by isolating the universal property of multi-clock guarded recursion as a modular construction that applies to any topos model of single-clock guarded recursion.
We present XTT, a version of Cartesian cubical type theory specialized for Bishop sets à la Coquand, in which every type enjoys a definitional version of the uniqueness of identity proofs. Using cubical notions, XTT reconstructs many of the ideas underlying Observational Type Theory, a version of intensional type theory that supports function extensionality. We prove the canonicity property of XTT (that every closed boolean is definitionally equal to a constant) by Artin gluing.
We present calf, a cost-aware logical framework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account of phase distinctions, we argue that the cost structure of programs motivates a phase distinction between intension and extension. Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which cost-aware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a full-spectrum dependent type theory, calf presents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants.
We evaluate calf as a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: the method of recurrence relations and physicist’s method for amortized analysis. We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential and parallel complexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly we substantiate the soundness of quantitative reasoning in calf by means of a model construction.
We propose a new sheaf semantics for secure information flow over a space of abstract behaviors, based on synthetic domain theory: security classes are open/closed partitions, types are sheaves, and redaction of sensitive information corresponds to restricting a sheaf to a closed subspace. Our security-aware computational model satisfies termination-insensitive noninterference automatically, and therefore constitutes an intrinsic alternative to state of the art extrinsic/relational models of noninterference. Our semantics is the latest application of Sterling and Harper’s recent re-interpretation of phase distinctions and noninterference in programming languages in terms of Artin gluing and topos-theoretic open/closed modalities. Prior applications include parametricity for ML modules, the proof of normalization for cubical type theory by Sterling and Angiuli, and the cost-aware logical framework of Niu et al. In this paper we employ the phase distinction perspective twice: first to reconstruct the syntax and semantics of secure information flow as a lattice of phase distinctions between “higher” and “lower” security, and second to verify the computational adequacy of our sheaf semantics with respect to a version of Abadi et al.’s dependency core calculus to which we have added a construct for declassifying termination channels.
In the published version of this paper, there were a few mistakes that have been corrected in the local copy hosted here.
A serious (and as-yet unfixed) problem was discovered in July of 2023 by Yue Niu, which undermines the proof of adequacy given; in particular, the proof that the logical relation on free algebras is admissible is not correct. I believe there is a different proof of adequacy for the calculus described, but it will have a different structure from what currently appears in the paper. We thank Yue Niu for his attention to detail and careful reading of this paper.
The theory of program modules is of interest to language designers not only for its practical importance to programming, but also because it lies at the nexus of three fundamental concerns in language design: the phase distinction, computational effects, and type abstraction. We contribute a fresh “synthetic” take on program modules that treats modules as the fundamental constructs, in which the usual suspects of prior module calculi (kinds, constructors, dynamic programs) are rendered as derived notions in terms of a modal type-theoretic account of the phase distinction. We simplify the account of type abstraction (embodied in the generativity of module functors) through a lax modality that encapsulates computational effects, placing projectibility of module expressions on a type-theoretic basis.
Our main result is a (significant) proof-relevant and phase-sensitive generalization of the Reynolds abstraction theorem for a calculus of program modules, based on a new kind of logical relation called a parametricity structure. Parametricity structures generalize the proof-irrelevant relations of classical parametricity to proof-relevant families, where there may be non-trivial evidence witnessing the relatedness of two programs—simplifying the metatheory of strong sums over the collection of types, for although there can be no “relation classifying relations,” one easily accommodates a “family classifying small families.”
Using the insight that logical relations/parametricity is itself a form of phase distinction between the syntactic and the semantic, we contribute a new synthetic approach to phase separated parametricity based on the slogan logical relations as types, by iterating our modal account of the phase distinction. We axiomatize a dependent type theory of parametricity structures using two pairs of complementary modalities (syntactic, semantic) and (static, dynamic), substantiated using the topos theoretic Artin gluing construction. Then, to construct a simulation between two implementations of an abstract type, one simply programs a third implementation whose type component carries the representation invariant.
After going to press, we fixed a few mistakes; the local copy hosted here has the corrections implemented.
We prove normalization for (univalent, Cartesian) cubical type theory, closing the last major open problem in the syntactic metatheory of cubical type theory. Our normalization result is reduction-free, in the sense of yielding a bijection between equivalence classes of terms in context and a tractable language of
Extending Martín Hötzel Escardó’s effectful forcing technique, we give a new proof of a well-known result: Brouwer’s monotone bar theorem holds for any bar that can be realized by a functional of type
We contribute XTT, a cubical reconstruction of Observational Type Theory [Altenkirch et al., 2007] which extends Martin-Löf's intensional type theory with a dependent equality type that enjoys function extensionality and a judgmental version of the unicity of identity proofs principle (UIP): any two elements of the same equality type are judgmentally equal. Moreover, we conjecture that the typing relation can be decided in a practical way. In this paper, we establish an algebraic canonicity theorem using a novel extension of the logical families or categorical gluing argument inspired by Coquand and Shulman: every closed element of boolean type is derivably equal to either true or false.
Modalities are everywhere in programming and mathematics! Despite this, however, there are still significant technical challenges in formulating a core dependent type theory with modalities. We present a dependent type theory MLTT🔒 supporting the connectives of standard Martin-Löf Type Theory as well as an S4-style necessity operator. MLTT🔒 supports a smooth interaction between modal and dependent types and provides a common basis for the use of modalities in programming and in synthetic mathematics. We design and prove the soundness and completeness of a type checking algorithm for MLTT🔒, using a novel extension of normalization by evaluation. We have also implemented our algorithm in a prototype proof assistant for MLTT🔒, demonstrating the ease of applying our techniques.
Nakano’s later modality can be used to specify and define recursive functions which are causal or synchronous; in concert with a notion of clock variable, it is possible to also capture the broader class of productive (co)programs. Until now, it has been difficult to combine these constructs with dependent types in a way that preserves the operational meaning of type theory and admits a hierarchy of universes. We present an operational account of guarded dependent type theory with clocks called Guarded Computational Type Theory, featuring a novel clock intersection connective that enjoys the clock irrelevance principle, as well as a predicative hierarchy of universes which does not require any indexing in clock contexts. Guarded Computational Type Theory is simultaneously a programming language with a rich specification logic, as well as a computational metalanguage that can be used to develop semantics of other languages and logics.
In this paper, we present an extension to Martin-Löf’s Intuitionistic Type Theory which gives natural solutions to problems in pragmatics, such as pronominal reference and presupposition. Our approach also gives a simple account of donkey anaphora without resorting to exotic scope extension of the sort used in Discourse Representation Theory and Dynamic Semantics, thanks to the proof-relevant nature of type theory.
We present a survey of the two-dimensional and tensorial structure of the lifting doctrine in constructive domain theory. We establish the universal property of lifting of directed-complete partial orders (dcpos) as the Sierpiński cone, from which we deduce (1) that lifting forms a Kock–Zöberlein doctrine, (2) that lifting algebras, pointed dcpos, and inductive partial orders form canonically equivalent locally posetal 2-categories, and (3) that the category of lifting algebras is cocomplete, with connected colimits created by the forgetful functor to dcpos. Finally we deduce the symmetric monoidal closure of the Eilenberg–Moore resolution of the lifting 2-monad by means of smash products; these are shown to classify both bilinear maps and strict maps, which we prove to coincide in the constructive setting. We provide several concrete computations of the smash product as dcpo coequalisers and lifting algebra coequalisers, and compare these with the more abstract results of Seal. Although all these results are well-known classically, the existing proofs do not apply in a constructive setting; indeed, the classical analysis of the Eilenberg–Moore category of the lifting monad relies on the fact that all lifting algebras are free, a condition that is not known to hold constructively.
We present a novel mechanism for controlling the unfolding of definitions in dependent type theory. Traditionally, proof assistants let users specify whether each definition can or cannot be unfolded in the remainder of a development; unfolding definitions is often necessary in order to reason about them, but an excess of unfolding can result in brittle proofs and intractably large proof goals. In our system, definitions are by default not unfolded, but users can selectively unfold them in a local manner. We justify our mechanism by means of elaboration to a core type theory with extension types, a connective first introduced in the context of homotopy type theory. We prove a normalization theorem for our core calculus and have implemented our system in the cooltt proof assistant, providing both theoretical and practical evidence for it.
We contribute the first denotational semantics of polymorphic dependent type theory extended by an equational theory for general (higher-order) reference types and recursive types, based on a combination of guarded recursion and impredicative polymorphism; because our model is based on recursively defined semantic worlds, it is compatible with polymorphism and relational reasoning about stateful abstract datatypes. We then extend our language with modal constructs for proof-relevant relational reasoning based on the logical relations as types principle, in which equivalences between imperative abstract datatypes can be established synthetically. Finally we develop a decomposition of the store model as a general construction that extends an arbitrary polymorphic call-by-push-value adjunction with higher-order store, improving on Levy's possible worlds model construction; what is new in relation to prior typed denotational models of higher-order store is that our Kripke worlds need not be syntactically definable, and are thus compatible with relational reasoning in the heap. Our work combines recent advances in the operational semantics of state with the purely denotational viewpoint of synthetic guarded domain theory.
Existential types are reconstructed in terms of small reflective subuniverses and dependent sums. The folklore decomposition detailed here gives rise to a particularly simple account of first-class modules as a mode of use of traditional second-class modules in connection with the modal operator induced by a reflective subuniverse, leading to a semantic justification for the rules of first-class modules in languages like OCaml and MoscowML. Additionally, we expose several constructions that give rise to semantic models of ML-style programming languages with both first-class modules and realistic computational effects, culminating in a model that accommodates higher-order first-class recursive modules and higher-order store.
Logical relations are the main tool for proving positive properties of logics, type theories, and programming languages: canonicity, normalization, decidability, conservativity, computational adequacy, and more. Logical relations combine pure syntax with non-syntactic objects that are parameterized in syntax in a somewhat complex way; the sophistication of possible parameterizations makes logical relations a tool that is primarily accessible to specialists. In the spirit of Halmos' book Naïve Set Theory, I advocate for a new viewpoint on logical relations based on synthetic Tait computability, the internal language of categories of logical relations. In synthetic Tait computability, logical relations are manipulated as if they were sets, making the essence of many complex logical relations arguments accessible to non-specialists.
Hofmann and Streicher famously showed how to lift Grothendieck universes into presheaf topoi, and Streicher has extended their result to the case of sheaf topoi by sheafification. In parallel, van den Berg and Moerdijk have shown in the context of algebraic set theory that similar constructions continue to apply even in weaker metatheories. Unfortunately, sheafification seems not to preserve an important realignment property enjoyed by the presheaf universes that plays a critical role in models of univalent type theory as well as synthetic Tait computability, a recent technique to establish syntactic properties of type theories and programming languages. In the context of multiple universes, the realignment property also implies a coherent choice of codes for connectives at each universe level, thereby interpreting the cumulativity laws present in popular formulations of Martin-Löf type theory.
We observe that a slight adjustment to an argument of Shulman constructs a cumulative universe hierarchy satisfying the realignment property at every level in any Grothendieck topos. Hence one has direct-style interpretations of Martin-Löf type theory with cumulative universes into all Grothendieck topoi. A further implication is to extend the reach of recent synthetic methods in the semantics of cubical type theory and the syntactic metatheory of type theory and programming languages to all Grothendieck topoi.
My research interests lie mainly in type theory, proof assistants, homotopy theory and constructive mathematics.
PhD student of Robert Harper.
The course aims to introduce the mathematics of discrete structures, showing it as an essential tool for computer science that can be clever and beautiful.
Michaelmas term lectured by Marcelo Fiore; Lent term lectured by Jon Sterling.
These lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
A (binary) relation
Let
We recall that relation from
In the simplest terms, a specification of a program is a relation that describes the possible input/output pairs that can occur. For example, the specification that a given program compute the square root is captured by the relation
Let
Let
Let
A network is given by a set of nodes
We now come to an example of a relation between multiple sets: we could define a relation
For any two sets
For any two sets
For any set
We have already seen the square root relation from positive reals to reals, which corresponds to a total but many-valued function. We can define an analogous relation below from positive integers (naturals) to integers, which will correspond to a partial and many-valued function.
The square root operation corresponds to a relation
A useful way to visualise a relation between two sets is by means of internal diagrams: each set is depicted as a blob containing its elements, and lines are drawn from the elements of one blob to the elements of the second blob when they are related.
In particular, let
We can depict
Draw the internal diagram corresponding to the following relation:
Given relations
Recall the square root relation
By relational extensionality, it suffices to check that for any
Relational composition is associative and has the identity relation as a neutral element.
To prove associativity, we fix relations
Using the above, we can compute the full composites:
For the right and left neutrality, we must prove that
Relations between finite sets can be described in a more computationally friendly way by their tabulation as matrices. In particular, we shall see in that an
For natural numbers
Then
For any
The identity matrix is sometimes called the diagonal matrix, for reasons that become apparent when visualising it according to :
Let
We have defined a function
Let
We have defined a function
The associated matrix function
We must check that
The other composite
The idempotent
We fix a matrix
Because
Thus we conclude that an
Although we do not explore it in this course, the viewpoint of matrix multiplication as relational composition generalises to a correct explanation of the role of matrices in geometry as presentations of linear maps between vector spaces in terms of an (uncanonical) choice of basis.
Let
The product of matrices is associative and has the identity matrix as neutral element.
For associativity, fix matrices
For one unit law, we fix
Unfolding the definition of
The other unit law follows in an analogous way.
Under the correspondence between boolean matrices and finite relations (, ), matrix products of boolean matrices correspond to relational composites. In particular, given
We use the fact that the additive operation of the boolean semiring is disjunction and the multiplicative operation is conjunction:
Let
A zero matrix is one whose cells all contain
Matrix addition forms an associative and commutative operation on
This follows immediately from the associativity, commutativity, and unit laws of the additive submonoid
The pointwise sum
Moreover, the zero matrix corresponds to the empty relation:
This follows because addition and zero in the boolean semiring are given by disjunction and falsehood respectively.
A directed graph
For every set
We have seen in that relational composition is associative and has identity relations
For
More precisely, we define
Let
Let
Let
We shall prove
For
In the case of an endo-relation
There are only
Let
This is an immediate consequence of : by we have
We have seen in Lecture 13 () how to turn any relation
Let
Thus we have a recursive algorithm that can decide whether there is a path between two elements in a finite directed graph: indeed, there is a path from
We first unravel the first few
By induction, we can extend the pattern above to a closed form for each
Our goal is to show that
By (reflexive-transitive closure of finite endo-relations), we have
By (matrix addition and union of relations), we have:
A preorder
In other words, a preorder is a directed graph such that the edge relation is both reflexive and transitive.
Preorders are everywhere in both general mathematics and computer science.
The first examples above happen to be total orders (meaning that the underlying directed graph is connected): so
All the examples we have seen so far are, furthermore, partial orders: a preorder is a partial order when we have
The reflexive-transitive closure
To be more precise, let
Then we have
To show that
We first note that
It follows from
Our second goal is to prove
Fixing such a preorder
Therefore,
A relation
We refer to a functional relation as a partial function; we often use letters like
Partial functions are closed under identity and composition in the following sense:
First, we prove that each identity relation
Next, we fix partial functions
Let
The domain of definition of a partial function
Let
One direction is trivial. In the other direction, we assume
Fixing
The following are examples of partial functions:
For all finite sets
A partial function
A partial function
Just as we write
Any relation
Symbolically, our second criterion is
For all finite sets
A function
Functions are closed under identity and composition.
We have already seen in that partial functions are closed under identity and composition. Therefore, it suffices to argue that the identity partial function is total, and that the composition of two total partial functions is total.
We have already seen some inductive definitions:
In this section, we will study this concept in more detail.
An inductive definition on the naturals defines a structure for each number
Likewise, the function
In order to make precise the description of , , and as “inductive definitions”, we must give an actual formal definition of what an inductive definition is.
Let
Note that our specification of inductively defined functions in asserts, without proof, that there does in fact exist a unique function
We have seen two informal examples of inductive definitions in ; we will now update these examples in light of :
Given an element
In light of , we see that is defining
Given an element
We first notice that
We must show that
For the base case, we fix
We choose
In the inductive step, we assume
We know that for any
We choose
Therefore, we have some
To show that
A section-retraction pair is defined to be a pair of functions
A function
Every section-retraction pair determines an idempotent. In particular, when
We proceed by calculation:
It happens that every idempotent arises from a section-retraction pair in the manner of .
Let
We must check that
A function
We will often write
To see that
For all finite sets
A bijection between two sets is a way of associating precisely one element of the first set with precisely one element of the second set. These associations are embodied in the functions
Assuming two finite sets
The identity function is a bijection, and the composition of bijections yields a bijection.
The inverse to the identity function
Fix bijections
Two sets
A bijection between finite sets is just a “relabeling” of its elements: for example, we have
Although we speak of “being isomorphic” as the property that there exists a bijection between two sets, in practice it is almost never useful to know merely that some undetermined bijection exists; it is always important to know which bijection. Indeed, we have seen in that there can be many distinct bijections between two sets.
A relation
We will write
Let
You may hear some people say “an equivalence” instead of “an equivalence relation”. This usage is wrong and leads to extremely confused thinking: you must not repeat it!
A partition of a set
We will write
For every set
We must first argue that
The inverse
To check that
As
Conversely, we must show that
The concept of isomorphism between two sets satisfies laws like those of an equivalence relation. In particular:
For all sets
Fix bijections
Given
Next, we check that
A predicate on a set
The indicator function or characteristic function of
Let
The mappings
Fixing
Conversely, we fix
For any set
We first note that
Now that we have learned about bijections, we are ready to replace our intuitive/informal understanding of finite sets and finite cardinality with a formal one.
A set
There are many identities that relate set-theoretic operations on finite sets to arithmetic operations on their cardinalities. We will illustrate a few of them here.
For all
We can prove this from a combinatorial point of view.
To choose a pair
We can also prove this isomorphism more formally by means of a specific bijection.
We can think of the cartesian product
We do need to check that
The inverse function
For all
We can think of
The function
A set
The basic axioms of set theory that we have learned in this course so far do not in fact guarantee the existence of any infinite sets. For this reason, one usually adds the axiom of infinity below to set theory.
The natural numbers form a set.
For a function
We first prove that
Next, we show that we have
A function
Any bijection
The unique function
Suppose that
Conversely, if
Let
We must show that for every equivalence class
The projection function
Suppose that
Conversely, suppose that either
The identity function is a surjection and the composition of surjections yields a surjection.
For the identity function on a set
Let
A set
In an enumeration
A countable set is one that is either empty or enumerable.
A set
Suppose that
Conversely, suppose that
We can define a bijective enumeration of the integers by “zigzagging” between the positive and the negative following the pattern
Any non-empty subset of an enumerable set is enumerable.
We will use the same method as in our proof of .
Let
The cartesian product of countable sets is countable.
Let
Every surjection has a section.
A function
Every section is injective.
Let
The function
Immediate: the inclusion maps a given element to itself!
The identity function is an injection and the composition of injections yields an injection.
The identity function is clearly injective, as
Let
Now that we have enough definitions in hand to see it, the import of is then to state that a function is bijective if and only if it is both injective and surjective.
For finite sets
We will argue from a combinatorial perspective.
An injection
Therefore, choosing an injection amounts to making two independent choices: first we choose a subset of
Of course, if there is no subset of
Let
Let
A family of sets indexed in a set
Thus
Let
This is almost saying that the direct image of
may seem a bit obscure (and it is!). But an example of a set that we need replacement to define is the set of iterated powersets
Let
By assumption, we have an enumeration
Let
The function
Given any set
Assume that we do have a surjection
Because
Thus for each
A fixed point of a function
Given sets
Let
Because
Finally, we conclude that
To deduce from , we suppose that there is a surjection
The set
Of course, as
Let
The “strictly less-than” relation
Almost immediate: given a non-empty set of naturals, pick the smallest number.
The strictly less-than relation
There is no smallest integer, so the set
A relation
Let
In other words, to show that every element of
The “strong” or “complete” induction principle for natural numbers states that for any subset
Let
We must show that any string
If
Suppose on the other hand that
By definition of
Let
We must show that any string
If
Suppose that
We inductively define a sequence of states
Define
We proceed by induction on
It follows from the above that we have transitions
Why can’t we define
Because if we did, then would fail already in the case of the empty string. We would have
I offer a variety of paid consulting services, ranging from language design to proof+software engineering to private tutoring; please contact me for a consultation.
This page accumulates project ideas that could be taken on by Part III/MPhil students at University of Cambridge, or very advanced Part II students in some cases.
Please note that an MPhil/Part III project report is different from a Part II dissertation, and so not all projects here will be suitable for all students. Please contact me at js2878@cl.cam.ac.uk if you are interested in one of these projects, and we can discuss what would be suitable for you.
Certain foundational systems for mathematics employ global choice or indefinite description operators; operators of this kind are assumed by Bourbaki, Isabelle/HOL, and Lean. Indefinite description is similar to the axiom of choice, but quite a bit stronger:
In the presence of subset types, indefinite description in the sense above is equivalent to the presence of a single global choice operator:
Experience with formalising classical mathematics in both Isabelle/HOL and Lean suggests that indefinite description is a very convenient feature to have in a language, not least because it greatly simplifies the treatment of partial functions as Larry Paulson has pointed out. For example, you can define division in Isabelle as a total function by sending
Unfortunately, global choice and indefinite description are inconsistent with homotopy type theory / univalent foundations, as they imply that every type is a set. In fact, as can be seen from the proof of Theorem 3.2.2 of the HoTT Book, it is even inconsistent with univalent foundations to assume global choice only for sets!
To express classical mathematics in univalent foundations, it is therefore necessary to use the more cumbersome axiom of choice — which is strictly weaker than global choice. When
I believe that it is nonetheless possible to have the “best of both worlds” and elaborate a high-level language with indefinite descriptions to homotopy type theory with the axiom of choice. The idea has two main parts:
Thus we have a language for univalent foundations in which indefinite descriptions can be used. There is a minor trade-off in the proposed treatment, however: two occurrences of
One of the most common tasks when working in univalent foundations is to characterize the path space of a given type, i.e. establish a structure identity principle. Formalized SIP proofs often follow a baroque style involving identity systems and the even more complex displayed identity systems, etc. that sometimes obscure the very simple informal reasoning behind a given structure identity principle; the proposal is to develop a formal proof language to better reflect this informal reasoning, backed by a suite of lemmas and tactics.
To understand what I mean, let us consider how we might characterize the path spaces of morphisms in a co-slice
Let
Let
We expand the dependent sum
First we re-order the telescope:
By homotopy induction (function extensionality), the first two fields contract to a point with substitution
Zooming into the
By dependent homotopy induction (also function extensionality), the remaining data collapse to a point with substitution
The steps performed in the proof of the SIP lemma for co-slice morphisms above are most naturally expressed at the level of a flat telescope (a context). The goal is always to contract a telescope to a point, and each step is either a rearrangement of the telescope, a “zooming in” step that adjusts a component of the telescope, or an application of some form of identity induction to kill off part of the telescope — path induction, homotopy induction, equivalence induction, etc. In general, we want to be able to easily point out the fragment of a telescope that contracts to a point, and then use the resulting substitution to modify the telescope and proceed further until the entire telescope is gone.
I believe that there is a reasonable and practical formal logic for doing such things, and that this logic can be implemented as a series of lemmas in homotopy type theory that can then be animated via tactics for interactive backward-inference proofs of contractibility.
These projects have already been taken up by a student or are otherwise unavailable.
There is a common thread in modern approaches to domain theory, theory of computation, and topology: one has an object
In addition to the path order described above, the Sierpiński space has its own intrinsic order: for
The map
A concrete model of domain theory (like cpos) is always a poset-enriched category, which can also be thought of as a (locally posetal) 2-category. I have discovered that it is possible to re-formulate Phoa’s principle in terms of 2-categorical universal properties of comma and cocomma squares, and it can be shown that in a cartesian closed locally posetal 2-category, an appropriate version of Phoa’s principle always holds.
In fact, an appropriate version of Phoa’s principle should hold in any cartesian closed 2-category (including non-posetal ones), and the proposal is to prove that this is the case using string diagrams. This result will have implications for the understanding of higher-dimensional domain theory as applied to concurrency.
Homotopy type theory / univalent foundations is an exciting new foundation of mathematics in which the classical notion of equality is extended to include not only the strict identifications between elements of sets but also symmetries on higher-dimensional spaces. The simplest example of a higher-dimensional space is the space of all sets, in which “equality” corresponds to bijection. When a mathematical result is formalised in homotopy type theory, it can have unexpectedly strong consequences: the result holds in all (∞,1)-toposes.
Synthetic domain theory is an approach to studying the semantics of computer programs in which a few small axioms are added to intuitionistic logic or intuitionistic type theory, centering on an object that shall play the role of the Sierpiński space, which can be thought of as the space of programs that produce no output (but might diverge). In the past, synthetic domain theory has mainly been studied in the presence of very strong classical and infinitary axioms: in one tradition, Markov’s principle is always assumed, and in another tradition one makes use of infinitary axioms such as the assumption that the initial lifting algebra (a recursively defined gluing of the Sierpiński space onto itself ad infinitum) can be computed as a sequential colimit.
Overemphasis of classical and infinitary assumptions has certainly dampened efforts to develop a unified synthetic domain theory that can serve as base-line truth on which to build further and more refined notions of computational space. Furthermore, many difficult aspects of synthetic domain theory were developed prior to the current understanding of type theory (with its universes, etc.) — and are thus ripe for revisiting and simplification under the aegis of modern dependent type theory. I propose a project to formalise the basics of synthetic domain theory in homotopy type theory, most likely as an extension to Martín Hötzel Escardó’s TypeTopology library.
Central to both the design of programming languages and the practice of software engineering is the tension between abstraction and composition. I employ semantic methods from category theory and type theory to design, verify, and implement languages that enable both programmers and mathematicians to negotiate the different levels of abstraction that arise in their work. I apply my research to global safety and security properties of programming languages as well as the design and implementation of interactive theorem provers for higher-dimensional mathematics. I develop foundational and practical mathematical tools to deftly weave together verifications that cut across multiple levels of abstraction.
From September 2023, I am an Associate Professor in Logical Foundations and Formal Methods at University of Cambridge.
From 2022, I was a Marie Skłodowska-Curie Postdoctoral Fellow hosted at Aarhus University working with Professor Lars Birkedal.
From 2016 to 2021, I was a PhD student of Professor Robert Harper at Carnegie Mellon University, where I wrote my doctoral thesis on synthetic Tait computability and its application to normalization for cubical type theory.
The annual Symposium on Principles of Programming Languages is a forum for the discussion of all aspects of programming languages and programming systems. Both theoretical and experimental papers are welcome on topics ranging from formal frameworks to experience reports. We seek submissions that make principled, enduring contributions to the theory, design, understanding, implementation or application of programming languages.
Applied category theory is a topic of interest for a growing community of researchers, interested in studying many different kinds of systems using category-theoretic tools. These systems are found across computer science, mathematics, and physics, as well as in social science, linguistics, cognition, and neuroscience. The background and experience of our members is as varied as the systems being studied. The goal of Applied Category Theory is to bring researchers in the field together, disseminate the latest results, and facilitate further development of the field.
Homotopy Type Theory is a young area of logic, combining ideas from several established fields: the use of dependent type theory as a foundation for mathematics, inspired by ideas and tools from abstract homotopy theory. Univalent Foundations are foundations of mathematics based on the homotopical interpretation of type theory.
The goal of this workshop is to bring together researchers interested in all aspects of Homotopy Type Theory/Univalent Foundations: from the study of syntax and semantics of type theory to practical formalization in proof assistants based on univalent type theory.
The Second Workshop on the Implementation of Type Systems (WITS 2023) will be held on August 28, 2023, in Braga, Portugal, co-located with IFL 2023. The goal of this workshop is to bring together the implementors of a variety of languages with advanced type systems. The main focus is on the practical issues that come up in the implementation of these systems, rather than the theoretical frameworks that underlie them. In particular, we want to encourage exchanging ideas between the communities around specific systems that would otherwise be accessible to only a very select group. The workshop will have a mix of invited and contributed talks, organized discussion times, and informal collaboration time. We invite participants to share their experiences, study differences among the implementations, and generalize lessons from those. We also want to promote the creation of a shared vocabulary and set of best practices for implementing type systems.
Homotopy Type Theory is a young area of logic, combining ideas from several established fields: the use of dependent type theory as a foundation for mathematics, inspired by ideas and tools from abstract homotopy theory. Univalent Foundations are foundations of mathematics based on the homotopical interpretation of type theory.
The goal of this workshop is to bring together researchers interested in all aspects of Homotopy Type Theory/Univalent Foundations: from the study of syntax and semantics of type theory to practical formalization in proof assistants based on univalent type theory.
The Workshop on Type-Driven Development (TyDe) aims to show how static type information may be used effectively in the development of computer programs. Co-located with ICFP, this workshop brings together leading researchers and practitioners who are using or exploring types as a means of program development.
PI: Jonathan Sterling
Institution: University of Cambridge
Funding agency: United States Air Force Office of Scientific Research
Program officer: Dr. Tristan Nguyen
Award no.: FA9550-23-1-0728
Years: 2023–2028
Amount: 1,221,099 USD
Status: awarded 2023-09-27
See project bibliography.
Abstract. What does it mean for a piece of software to be correct? There are many possible degrees and dimensions of correctness, e.g. safety, functional correctness, computational complexity, and security. To grapple with this diversity of verification requirements, semanticists develop mathematical models of program behavior that put into relief different aspects of the physical reality of program execution on hardware, just as physicists create many idealized mathematical models to study different aspects of the material reality of the universe.
Mathematical models enable us to reason about program behavior by viewing highly complex objects as being glued together from smaller, simpler objects that are easier to study in isolation. For instance, operational models aim to reduce the behavior of a process to that of individual steps of discrete computation that take place on an idealized computer; in contrast, denotational models reduce the complex global behavior of a process to the simpler local behavior of its constituent subroutines. One advantage of operational methods is that they are applicable even in situations that challenge the modularity of denotational semantics, e.g. where it is not yet understood how to reduce the global behavior of a program to that of its components. On the other hand, denotational methods provide vastly stronger and simpler reasoning principles for program verification when available.
The central thesis of denotational semantics is that programs arrange themselves into geometrical spaces called computational domains, and that a computational process can be thought of as the limit of a sequence of continuous transformations on these domains. Although this thesis has been amply borne out for simple kinds of program, today’s most urgent verification requirements pertain to program constructs like concurrency and side effects whose treatment requires the introduction of new kinds of space: for instance, the correct treatment of branching behavior for concurrent processes requires the introduction of higher-dimensional computational domains in which programs can “remember” the specific way that they were glued together.
This project will extend the reach of denotational semantics and its attendant advantages for program verification into terrains where scientists have historically struggled to enact the reduction of global behavior to local behavior, making essential use of new advances in the homotopical and geometrical understanding of computation via higher dimensional category theory and topos theory. I will investigate two areas that are ripe for reaping the benefits of a modern denotational semantics: the semantics of side-effects which govern the interaction of a program with the computer’s memory, and the semantics of concurrent processes.
Beneficiary: | Jonathan Sterling |
Award: | Marie Skłodowska-Curie Actions Postdoctoral Fellowship |
Funder: | European Commission, Horizon Europe Framework Programme (HORIZON) |
Host: | Aarhus University, Center for Basic Research in Program Verification |
Years: | 2022–2024 (terminated 2023) |
Amount: | 214,934.4 EUR |
See the Final Report and Bibliography.
Abstract. Software systems mediate a growing proportion of human activity, e.g. communication, transport, medicine, industrial and agricultural production, etc. As a result, it is urgent to understand and better control both the correctness and security properties of these increasingly complex software systems. The diversity of verification requirements speaks to a need for models of program execution that smoothly interpolate between many different levels of abstraction.
Models of program execution vary in expressiveness along the spectrum of possible programming languages and specification logics. At one extreme, dependent type theory is a language for mathematically-inspired functional programming that is sufficiently expressive to serve as its own specification logic. Dependent type theory has struggled, however, to incorporate several computational effects that are common in every-day programming languages, such as state and concurrency. Languages that support these features require very sophisticated specification logics due to the myriad details that must be surfaced in their semantic models.
In the context of dependent type theory, I have recently developed a new technique called Synthetic Tait Computability or STC that smoothly combines multiple levels of abstraction into a single language. Inspired by sophisticated mathematical techniques invented in topos theory and category theory for entirely different purposes, STC enables low-level details (even down to execution steps) to be manipulated in a simpler and more abstract way than ever before, making them easier to control mathematically. Perhaps more importantly, the STC method makes it possible to import ideas and techniques from other mathematical fields that are comparatively more developed than programming languages.
The goal of the TypeSynth project is to extend the successful STC approach to a wider class of programming models, in particular programming languages with effects.
PIs: | Stephanie Balzer and Robert Harper |
Unfunded: | Jonathan Sterling |
Funding Agency: | United States Air Force Office of Scientific Research |
Program Officer: | Dr. Tristan Nguyen |
Award No.: | FA9550-21-1-0385 |
Years: | 2021–2022
The purpose of this research is to investigate the development of programming language techniques to express and enforce constraints on the flow of information in a program.
Type systems are the most widely applicable tool for enforcing such restrictions within and among programs. The aim of the project is to investigate the development of suitable type systems for information flow security in two settings, and to understand their inter-relationship. In each case the goal is to state and prove non-interference properties of programs that ensure the independence of non-sensitive outputs on sensitive inputs to a system. Both methods draw on the method of logical relations to establish these properties.
The first setting is that of session types for communicating programs. In their most basic form session types express the protocols for interaction among programs that interact along data-carrying communication channels. A key characteristic of session types is that they are able to track changes of state in a computation using methods drawn from substructural logic. The intent of the investigation is to extend session types to track information flow in a composite program using refinement types that encode security levels of data.
The second setting is that of program modules which govern the construction of programs from separable, reusable components. Type systems for modularity are primarily concerned with the interfaces between components, which ensure that the effects of changes to a module implementation on other modules in a system can be tightly controlled. The project will investigate the extension of module type systems to express information flow dependencies among components using a generalization of the phase distinction between static and dynamic aspects of a program to account for a richer hierarchy of security levels.
My research interests lie mainly in type theory, proof assistants, homotopy theory and constructive mathematics.
PhD student of Robert Harper.
The course aims to introduce the mathematics of discrete structures, showing it as an essential tool for computer science that can be clever and beautiful.
Michaelmas term lectured by Marcelo Fiore; Lent term lectured by Jon Sterling.
These lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
A (binary) relation
Let
We recall that relation from
In the simplest terms, a specification of a program is a relation that describes the possible input/output pairs that can occur. For example, the specification that a given program compute the square root is captured by the relation
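As a concrete sketch of this idea (illustrative Python, not part of the lectures; the names are invented), we can represent a specification over small finite sets literally as a set of admissible input/output pairs, and check candidate outputs by membership:

```python
# A specification as a relation: the set of admissible (input, output) pairs.
# For the square-root specification over a small finite domain, we relate
# each non-negative integer to the integers whose square it is.
inputs = range(0, 10)
outputs = range(-9, 10)

sqrt_spec = {(x, y) for x in inputs for y in outputs if y * y == x}

# A program satisfies the specification on input x when its output y
# stands in the relation, i.e. (x, y) in sqrt_spec.
assert (4, 2) in sqrt_spec
assert (4, -2) in sqrt_spec      # the relation is many-valued
assert (3, 2) not in sqrt_spec   # 3 has no integer square root
```

Note that nothing forces a specification to be single-valued or total: the square-root relation above relates 4 to both 2 and −2, and relates 3 to nothing at all.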
Let
Let
Let
A network is given by a set of nodes
We now come to an example of a relation between multiple sets: we could define a relation
For any two sets
For any two sets
For any set
We have already seen the square root relation from positive reals to reals, which corresponds to a total but many-valued function. We can define an analogous relation below, from positive integers (naturals) to integers, which will correspond to a partial and many-valued function.
The square root operation corresponds to a relation
A useful way to visualise a relation between two sets is by means of internal diagrams: each set is depicted as a blob containing its elements, and lines are drawn from the elements of one blob to the elements of the second blob when they are related.
In particular, let
We can depict
Draw the internal diagram corresponding to the following relation:
Given relations
Recall the square root relation
By relational extensionality, it suffices to check that for any
Relational composition is associative and has the identity relation as a neutral element.
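These laws are easy to check computationally on small finite examples. The following Python sketch (illustrative only; the names are invented) represents relations as sets of pairs and verifies associativity and the identity laws on a particular instance:

```python
def compose(R, S):
    """Relational composite of R followed by S (diagrammatic order):
    pairs (a, c) such that (a, b) in R and (b, c) in S for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def identity(A):
    """The identity relation on a set A."""
    return {(a, a) for a in A}

A = {0, 1, 2}
R = {(0, 1), (1, 2)}
S = {(1, 2), (2, 0)}
T = {(2, 2), (0, 1)}

# Associativity and the neutral-element laws, checked on this example:
assert compose(compose(R, S), T) == compose(R, compose(S, T))
assert compose(identity(A), R) == R == compose(R, identity(A))
```

Of course, a check on one example is not a proof; the proof proceeds by relational extensionality as in the notes.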
To prove associativity, we fix relations
Using the above, we can compute the full composites:
For the right and left neutrality, we must prove that
Relations between finite sets can be described in a more computationally friendly way by their tabulation as matrices. In particular, we shall see in that an
For natural numbers
Then
For any
The identity matrix is sometimes called the diagonal matrix, for reasons that become apparent when visualising it according to :
Let
We have defined a function
Let
We have defined a function
The associated matrix function
We must check that
The other composite
The idempotent
We fix a matrix
Because
Thus we conclude that an
Although we do not explore it in this course, the viewpoint of matrix multiplication as relational composition generalises to a correct explanation of the role of matrices in geometry as presentations of linear maps between vector spaces in terms of an (uncanonical) choice of basis.
Let
The product of matrices is associative and has the identity matrix as neutral element.
For associativity, fix matrices
For one unit law, we fix
Unfolding the definition of
The other unit law follows in an analogous way.
Under the correspondence between boolean matrices and finite relations (, ), matrix products of boolean matrices correspond to relational composites. In particular, given
We use the fact that the additive operation of the boolean semiring is disjunction and the multiplicative operation is conjunction:
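The correspondence can be checked mechanically. In the following Python sketch (illustrative names, not part of the lectures), we tabulate finite relations as boolean matrices and verify that the boolean matrix product of the tabulations equals the tabulation of the relational composite:

```python
def matrix_of(R, A, B):
    """Tabulate a relation R ⊆ A × B as a boolean matrix with rows
    indexed by A and columns indexed by B."""
    return [[(a, b) in R for b in B] for a in A]

def bool_matmul(M, N):
    """Boolean matrix product: the additive operation of the boolean
    semiring is disjunction (any/or) and the multiplicative operation
    is conjunction (and)."""
    return [[any(M[i][k] and N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A, B, C = [0, 1], [0, 1, 2], [0, 1]
R = {(0, 1), (1, 2)}              # R ⊆ A × B
S = {(1, 0), (2, 1)}              # S ⊆ B × C
RS = {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

assert bool_matmul(matrix_of(R, A, B), matrix_of(S, B, C)) == matrix_of(RS, A, C)
```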
Let
A zero matrix is one whose cells all contain
Matrix addition forms an associative and commutative operation on
This follows immediately from the associativity, commutativity, and unit laws of the additive submonoid
The pointwise sum
Moreover, the zero matrix corresponds to the empty relation:
This follows because addition and zero in the boolean semiring are given by disjunction and falsehood respectively.
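As a small computational check (illustrative Python, with invented names): the pointwise boolean sum of the tabulations of two relations is the tabulation of their union, and the zero matrix tabulates the empty relation.

```python
def bool_matadd(M, N):
    """Pointwise sum in the boolean semiring: disjunction in each cell."""
    return [[M[i][j] or N[i][j] for j in range(len(M[0]))]
            for i in range(len(M))]

A, B = [0, 1], [0, 1]

def matrix_of(R):
    """Tabulate R ⊆ A × B as a boolean matrix."""
    return [[(a, b) in R for b in B] for a in A]

R, S = {(0, 0)}, {(0, 0), (1, 1)}

# The sum of the tabulations is the tabulation of the union, and the
# zero matrix (all cells False) tabulates the empty relation:
assert bool_matadd(matrix_of(R), matrix_of(S)) == matrix_of(R | S)
assert matrix_of(set()) == [[False, False], [False, False]]
```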
A directed graph
For every set
We have seen in that relational composition is associative and has identity relations
For
More precisely, we define
Let
Let
Let
We shall prove
For
In the case of an endo-relation
There are only
Let
This is an immediate consequence of : by we have
We have seen in Lecture 13 () how to turn any relation
Let
Thus we have a recursive algorithm that can decide whether there is a path between two elements in a finite directed graph: indeed, there is a path from
We first unravel the first few
By induction, we can extend the pattern above to a closed form for each
Our goal is to show that
By (reflexive-transitive closure of finite endo-relations), we have
By (matrix addition and union of relations), we have:
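The reflexive-transitive closure of a finite endo-relation can be computed directly by accumulating powers, as in this Python sketch (illustrative only; the names are invented). This gives the promised decision procedure for path existence in a finite directed graph:

```python
def compose(R, S):
    """Relational composite: pairs (a, c) with (a, b) in R, (b, c) in S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def rt_closure(R, A):
    """Reflexive-transitive closure of an endo-relation R on a finite
    set A, accumulating id ∪ R ∪ R² ∪ … ; |A| iterations suffice, since
    any path visiting more than |A| nodes must repeat a node."""
    closure = {(a, a) for a in A}          # paths of length 0
    power = {(a, a) for a in A}
    for _ in A:
        power = compose(power, R)          # paths of length n + 1
        closure |= power
    return closure

# The edge relation of a small directed graph on {0, 1, 2, 3}:
A = {0, 1, 2, 3}
E = {(0, 1), (1, 2)}

paths = rt_closure(E, A)
assert (0, 2) in paths       # there is a path 0 → 1 → 2
assert (0, 0) in paths       # the empty path
assert (2, 0) not in paths   # no path back
```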
A preorder
In other words, a preorder is a directed graph such that the edge relation is both reflexive and transitive.
Preorders are everywhere in both general mathematics and computer science.
The first examples above happen to be total orders (meaning that the underlying directed graph is connected): so
All the examples we have seen so far are, furthermore, partial orders: a preorder is a partial order when we have
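These axioms are straightforward to check mechanically on finite examples. The following Python sketch (illustrative names, not part of the lectures) tests whether a finite relation is a preorder or a partial order, using divisibility as the example:

```python
def is_preorder(A, R):
    """A preorder: a reflexive and transitive endo-relation on A."""
    reflexive = all((a, a) in R for a in A)
    transitive = all((a, c) in R
                     for (a, b) in R for (b2, c) in R if b == b2)
    return reflexive and transitive

def is_partial_order(A, R):
    """A partial order: a preorder that is moreover antisymmetric."""
    antisymmetric = all(a == b for (a, b) in R if (b, a) in R)
    return is_preorder(A, R) and antisymmetric

A = {1, 2, 3, 4, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}

assert is_partial_order(A, divides)    # divisibility is a partial order
```

Note that divisibility on this set is a partial order but not a total order: neither of 4 and 6 divides the other.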
The reflexive-transitive closure
To be more precise, let
Then we have
To show that
We first note that
It follows from
Our second goal is to prove
Fixing such a preorder
Therefore,
A relation
We refer to a functional relation as a partial function; we often use letters like
Partial functions are closed under identity and composition in the following sense:
First, we prove that each identity relation
Next, we fix partial functions
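Computationally, a partial function between finite sets can be represented by a dictionary that is simply undefined outside its domain of definition; composition is then defined exactly where both stages are. The following Python sketch is illustrative only (the names are invented):

```python
def pcompose(f, g):
    """Composite of partial functions represented as dicts (f first,
    then g): defined at x only when f is defined at x and g is
    defined at f(x)."""
    return {x: g[f[x]] for x in f if f[x] in g}

# Two partial functions on small sets of integers:
f = {0: 1, 1: 2}           # undefined at 2
g = {2: 5}                 # undefined at 0 and 1

h = pcompose(f, g)
assert h == {1: 5}         # defined only where both stages are defined
```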
Let
The domain of definition of a partial function
Let
One direction is trivial. In the other direction, we assume
Fixing
The following are examples of partial functions:
For all finite sets
A partial function
A partial function
Just as we write
Any relation
Symbolically, our second criterion is
For all finite sets
A function
Functions are closed under identity and composition.
We have already seen in that partial functions are closed under identity and composition. Therefore, it suffices to argue that the identity partial function is total, and that the composition of two total partial functions is total.
We have already seen some inductive definitions:
In this section, we will study this concept in more detail.
An inductive definition on the naturals defines a structure for each number
Likewise, the function
In order to make precise the description of , , and as “inductive definitions”, we must give an actual formal definition of what an inductive definition is.
Let
Note that our specification of inductively defined functions in asserts, without proof, that there does in fact exist a unique function
We have seen two informal examples of inductive definitions in ; we will now update these examples in light of :
Given an element
In light of , we see that is defining
Given an element
We first notice that
We must show that
For the base case, we fix
We choose
In the inductive step, we assume
We know that for any
We choose
Therefore, we have some
To show that
A section-retraction pair is defined to be a pair of functions
A function
Every section-retraction pair determines an idempotent. In particular, when
We proceed by calculation:
It happens that every idempotent arises from a section-retraction pair in the manner of .
Let
We must check that
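Both directions of this correspondence are easy to verify on a small concrete instance. In the Python sketch below (illustrative only; the functions are invented finite examples represented as dicts), we exhibit a section-retraction pair and check that the composite in the other order is idempotent:

```python
# A section-retraction pair between {0, 1} and {0, 1, 2}:
s = {0: 0, 1: 2}                 # section     s : {0, 1} → {0, 1, 2}
r = {0: 0, 1: 1, 2: 1}           # retraction  r : {0, 1, 2} → {0, 1}

assert all(r[s[x]] == x for x in s)          # r ∘ s = id

# The composite e = s ∘ r in the other order is idempotent, because
# e ∘ e = s ∘ (r ∘ s) ∘ r = s ∘ id ∘ r = s ∘ r = e.
e = {x: s[r[x]] for x in r}
assert all(e[e[x]] == e[x] for x in r)
```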
A function
We will often write
To see that
For all finite sets
A bijection between two sets is a way of associating precisely one element of the first set with precisely one element of the second set. These associations are embodied in the functions
Assuming two finite sets
The identity function is a bijection, and the composition of bijections yields a bijection.
The inverse to the identity function
Fix bijections
Two sets
A bijection between finite sets is just a “relabeling” of its elements: for example, we have
Although we are speaking of “being isomorphic” as the property of there being a bijection between them, in practice, it is almost never useful to know that there exists some undetermined bijection between given sets; it is always important to know which bijection. Indeed, we have seen in that there can be many distinct bijections between two sets.
A relation
We will write
Let
You may hear some people say “an equivalence” instead of “an equivalence relation”. This usage is wrong and leads to extremely confused thinking: you must not repeat it!
A partition of a set
We will write
For every set
We must first argue that
The inverse
To check that
As
Conversely, we must show that
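One direction of this correspondence can be computed directly: from an equivalence relation we form the partition into equivalence classes. The Python sketch below is illustrative only (the names are invented), using "same parity" as the equivalence relation:

```python
def classes(A, R):
    """The quotient A/R induced by an equivalence relation R on A:
    the set of equivalence classes, each class being the set of
    elements related to a given element."""
    return {frozenset(b for b in A if (a, b) in R) for a in A}

A = {0, 1, 2, 3}
# "Same parity" is reflexive, symmetric, and transitive on A:
R = {(a, b) for a in A for b in A if a % 2 == b % 2}

# The induced partition has one block per parity:
assert classes(A, R) == {frozenset({0, 2}), frozenset({1, 3})}
```

In the other direction, a partition induces the equivalence relation "lies in the same block", and the two constructions are mutually inverse.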
The concept of isomorphism between two sets satisfies laws like those of an equivalence relation. In particular:
For all sets
Fix bijections
Given
Next, we check that
A predicate on a set
The indicator function or characteristic function of
Let
The mappings
Fixing
Conversely, we fix
For any set
We first note that
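The bijection between subsets of a set and their indicator functions can be checked concretely, as in the following illustrative Python sketch (the names are invented; indicator functions on a finite set are represented as dicts into the booleans):

```python
def indicator(U, A):
    """The characteristic function of a subset U ⊆ A, valued in the
    booleans: True exactly on the elements of U."""
    return {a: a in U for a in A}

def subset_of(chi):
    """Recover a subset from its indicator function."""
    return {a for a, b in chi.items() if b}

A = {0, 1, 2}
U = {0, 2}

# The two mappings are mutually inverse:
assert subset_of(indicator(U, A)) == U
assert indicator(subset_of(indicator(U, A)), A) == indicator(U, A)
```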
Now that we have learned about bijections, we are ready to replace our intuitive/informal understanding of finite sets and finite cardinality with a formal one.
A set
There are many identities that relate set-theoretic operations on finite sets to arithmetic operations on their cardinalities. We will illustrate a few of them here.
For all
We can prove this from a combinatorial point of view.
To choose a pair
We can also prove this isomorphism more formally by means of a specific bijection.
We can think of the cartesian product
We do need to check that
The inverse function
For all
We can think of
The function
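The "m rows of n cells" picture of the cartesian product can be made into an explicit bijection with an initial segment of the naturals, witnessing the cardinality identity. The Python sketch below is illustrative only (the names are invented):

```python
def pair_to_index(i, j, n):
    """The bijection [m] × [n] → [m·n] sending (i, j) to i·n + j:
    think of the product as m rows of n cells, read row by row."""
    return i * n + j

def index_to_pair(k, n):
    """The inverse bijection: quotient and remainder by n."""
    return divmod(k, n)

m, n = 3, 4
indices = [pair_to_index(i, j, n) for i in range(m) for j in range(n)]
assert sorted(indices) == list(range(m * n))            # bijective onto [m·n]
assert all(index_to_pair(pair_to_index(i, j, n), n) == (i, j)
           for i in range(m) for j in range(n))         # mutually inverse
```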
A set
The basic axioms of set theory that we have learned in this course so far do not in fact guarantee the existence of any infinite sets. For this reason, one usually adds the axiom of infinity below to set theory.
The natural numbers form a set.
For a function
We first prove that
Next, we show that we have
A function
Any bijection
The unique function
Suppose that
Conversely, if
Let
We must show that for every equivalence class
The projection function
Suppose that
Conversely, suppose that either
The identity function is a surjection and the composition of surjections yields a surjection.
For the identity function on a set
Let
A set
In an enumeration
A countable set is one that is either empty or enumerable.
A set
Suppose that
Conversely, suppose that
We can define a bijective enumeration of the integers by “zigzagging” between the positive and the negative following the pattern
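The zigzag enumeration admits a simple closed form, sketched below in illustrative Python (the name is invented):

```python
def zigzag(n):
    """A bijective enumeration ℕ → ℤ following the pattern
    0, 1, -1, 2, -2, 3, -3, …: odd inputs go to the positives,
    even inputs to zero and the negatives."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

assert [zigzag(n) for n in range(7)] == [0, 1, -1, 2, -2, 3, -3]
```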
Any non-empty subset of an enumerable set is enumerable.
We will use the same method as in our proof of .
Let
The cartesian product of countable sets is countable.
Let
Every surjection has a section.
A function
Every section is injective
Let
The function
Immediate: the inclusion maps a given element to itself!
The identity function is an injection and the composition of injections yields an injection.
The identity function is clearly injective, as
Let
Now that we have enough definitions in hand to see it, the import of is then to state that a function is bijective if and only if it is both injective and surjective.
For finite sets
We will argue from a combinatorial perspective.
An injection
Therefore, choosing an injection amounts to making two independent choices: first we choose a subset of
Of course, if there is no subset of
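The resulting count is the falling factorial, which we can sanity-check against a brute-force enumeration in the following illustrative Python sketch (the names are invented):

```python
from itertools import permutations

def count_injections(k, n):
    """Number of injections from a k-element set into an n-element set:
    the falling factorial n · (n-1) · … · (n-k+1), which vanishes
    as soon as k > n (some factor is zero)."""
    count = 1
    for i in range(k):
        count *= n - i
    return count

# Brute force: an injection from [k] into [n] is a length-k sequence
# of pairwise distinct elements of [n].
k, n = 3, 5
assert count_injections(k, n) == len(list(permutations(range(n), k)))
assert count_injections(4, 3) == 0    # no injection from a bigger set
```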
Let
Let
A family of sets indexed in a set
Thus
Let
This is almost saying that the direct image of
may seem a bit obscure (and it is!). But an example of a set that we need replacement to define is the set of iterated powersets
Let
By assumption, we have an enumeration
Let
The function
Given any set
Assume that we do have a surjection
Because
Thus for each
A fixed point of a function
Given sets
Let
Because
Finally, we conclude that
To deduce from , we suppose that there is a surjection
The set
Of course, as
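The diagonal construction at the heart of this argument can be exercised on a finite sample, as in the following illustrative Python sketch (the names and the candidate surjection are invented): for any function from a set to its powerset, the diagonal set differs from every set in the image.

```python
# The diagonal argument on a finite sample: given e : A → P(A), form
# the "diagonal" set D = {a ∈ A : a ∉ e(a)}.  For each a, the sets D
# and e(a) disagree at a itself, so D is not in the image of e.
A = {0, 1, 2}
e = {0: frozenset(), 1: frozenset({0, 1}), 2: frozenset({2})}

D = {a for a in A if a not in e[a]}
assert all(D != set(e[a]) for a in A)
```

For a finite set this merely reflects that 2^n > n; the force of the argument is that it applies verbatim to infinite sets as well.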
Let
The “strictly less-than” relation
Almost immediate: given a non-empty set of naturals, pick the smallest number.
The strictly less-than relation
There is no smallest integer, so the set
A relation
Let
In other words, to show that every element of
The “strong” or “complete” induction principle for natural numbers states that for any subset
Let
We must show that any string
If
Suppose on the other hand that
By definition of
Let
We must show that any string
If
Suppose that
We inductively define a sequence of states
Define
We proceed by induction on
It follows from the above that we have transitions
Why can’t we define
Because if we did, then would fail already in the case of the empty string. We would have
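The inductive definition of the extended transition function can be written out concretely. The following Python sketch is illustrative only (the automaton and names are invented): the base case handles the empty string by staying at the start state, and each character performs one transition step.

```python
def run(delta, q0, w):
    """Extended transition function of a deterministic automaton,
    defined by induction on the string: the empty string produces no
    transitions, and each character takes one step."""
    q = q0
    for c in w:
        q = delta[(q, c)]
    return q

# A two-state automaton over {a, b} tracking the parity of a's:
delta = {("even", "a"): "odd",  ("odd", "a"): "even",
         ("even", "b"): "even", ("odd", "b"): "odd"}

assert run(delta, "even", "") == "even"     # base case: no input, no move
assert run(delta, "even", "ab") == "odd"
assert run(delta, "even", "aba") == "even"
```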
This award, established in 2018, is endowed with a generous contribution from Emeritus Professor Ed Clarke and his wife Martha. It is presented annually to an outstanding dissertation in the Computer Science Department.
Dear friends,
I am delighted to announce that Jonathan Sterling has been selected as the winner of the 2022 Edmund M. Clarke Dissertation Award in CSD for his thesis:
First Steps in Synthetic Tait Computability, October 2021
Jonathan’s PhD research was advised by Robert Harper.
This award, endowed by a generous contribution from Professor Ed Clarke and his wife Martha, is given annually to an outstanding dissertation in the Computer Science Department.
Congratulations Jonathan!
The implementation and semantics of dependent type theories can be studied in a syntax-independent way: the objective metatheory of dependent type theories exploits the universal properties of their syntactic categories to endow them with computational content, mathematical meaning, and practical implementation (normalization, type checking, elaboration). The semantic methods of the objective metatheory inform the design and implementation of correct-by-construction elaboration algorithms, promising a principled interface between real proof assistants and ideal mathematics.
In this dissertation, I add synthetic Tait computability to the arsenal of the objective metatheorist. Synthetic Tait computability is a mathematical machine to reduce difficult problems of type theory and programming languages to trivial theorems of topos theory. First employed by Sterling and Harper to reconstruct the theory of program modules and their phase separated parametricity, synthetic Tait computability is deployed here to resolve the last major open question in the syntactic metatheory of cubical type theory: normalization of open terms.
Modalities are everywhere in programming and mathematics! Despite this, however, there are still significant technical challenges in formulating a core dependent type theory with modalities. We present a dependent type theory MLTT🔒 supporting the connectives of standard Martin-Löf Type Theory as well as an S4-style necessity operator. MLTT🔒 supports a smooth interaction between modal and dependent types and provides a common basis for the use of modalities in programming and in synthetic mathematics. We design and prove the soundness and completeness of a type checking algorithm for MLTT🔒, using a novel extension of normalization by evaluation. We have also implemented our algorithm in a prototype proof assistant for MLTT🔒, demonstrating the ease of applying our techniques.
We contribute XTT, a cubical reconstruction of Observational Type Theory [Altenkirch et al., 2007] which extends Martin-Löf's intensional type theory with a dependent equality type that enjoys function extensionality and a judgmental version of the unicity of identity proofs principle (UIP): any two elements of the same equality type are judgmentally equal. Moreover, we conjecture that the typing relation can be decided in a practical way. In this paper, we establish an algebraic canonicity theorem using a novel extension of the logical families or categorical gluing argument inspired by Coquand and Shulman: every closed element of boolean type is derivably equal to either true or false.
Named in honor of Professor W. Kendrick Pritchett, a distinguished scholar and a devoted teacher of Greek, upon his retirement, a monetary prize to the enrolled student who performs best in the first year of Greek (sometimes awarded to co-winners). Awarded annually since 1977 and administered by the Department. Each May, the instructors of Greek 2 and (when offered) 10 meet after final exams to discuss the best students in each class; in consultation with the Department Chair or a delegate of the Chair, one or two winners are selected.
I maintain several public bibliographies.
This page collects papers and dissertations about Cubical Type Theory; if you have written a paper or dissertation on this topic, please write to me to have it added to this list.
This page collects papers and dissertations about Guarded Domain Theory; if you have written a paper or dissertation on this topic, please write to me to have it added to this list.
This page collects papers and dissertations about Synthetic Tait Computability, also known as the logical relations as types / LRAT principle; if you have written a paper or dissertation on this topic, please write to me to have it added to this list.
Use Bike to think, write, organize.
Make lists, take notes, create documents.
Bike’s an unusually fast outliner designed for your Mac.
Kerodon is an online textbook on categorical homotopy theory and related mathematics. It currently consists of a handful of chapters, but should grow (slowly) over time. It is modeled on the Stacks project, and is maintained by Jacob Lurie.
Its infrastructure uses Gerby, which is maintained by Pieter Belmans. The design was created by Opus Design.
These are notes about the theory of Fibred Categories as I have learned it from Jean Bénabou. I also have used results from the Thesis of Jean-Luc Moens from 1982 in those sections where I discuss the fibered view of geometric morphisms. Thus, almost all of the contents is not due to me but most of it cannot be found in the literature since Bénabou has given many talks on it but most of his work on fibered categories is unpublished. But I am solely responsible for the mistakes and for misrepresentations of his views. And certainly these notes do not cover all the work he has done on fibered categories. I just try to explain the most important notions he has come up with in a way trying to be as close as possible to his intentions and intuitions. I started these notes in 1999 when I gave a course on some of the material at a workshop in Munich. They have developed quite a lot over the years and I have tried to include most of the things I want to remember.
If you need to convert files from one markup format into another, pandoc is your swiss-army knife.
Do you wish to write beautiful scientific documents with GNU TeXmacs? Then be guided by The Jolly Writer on your way to mastering the secrets of elegant layout and mathematical typography.
The book covers the main features of the latest version of TeXmacs: its wysiwyg user interface, mathematics, tables, professional typography, technical pictures, laptop presentations, structured editing, interfaces to systems for symbolic and numeric computations, customizable style sheets, the versioning tool, and much more.
The primary aim of The Jolly Writer is to guide you through this world of possibilities. While hacking your way through the jungle of style tags, keyboard shortcuts, mathematical homoglyphs, hanging punctuation, structured insertions, morphing animations, among others, you will slowly but surely unravel their mysteries and deeper why. At the end of the road, maybe you will become a jolly writer yourself.
We study localization at a prime in homotopy type theory, using self maps of the circle. Our main result is that for a pointed, simply connected type
We model exceptions in a linear, effectful setting by relaxing the notion of monadic strength to contexts that are discardable, in the spirit of C++ destructors. This gives rise to a resource modality reminiscent of unique_ptr and move semantics in C++11. We explore consequences in language design for resource management in functional programming languages.
The Stacks project is an ever growing open source textbook and reference work on algebraic stacks and the algebraic geometry needed to define them. Here are some quick facts:
For a longer discussion, please read the blog post What is the stacks project?.
Homotopy type theory is a new branch of mathematics that combines aspects of several different fields in a surprising way. It is based on a recently discovered connection between homotopy theory and type theory. It touches on topics as seemingly distant as the homotopy groups of spheres, the algorithms for type checking, and the definition of weak
We develop a theory of categories which are simultaneously (1) indexed over a base category
We present a method of constructing symmetric monoidal bicategories from symmetric monoidal double categories that satisfy a lifting condition. Such symmetric monoidal double categories frequently occur in nature, so the method is widely applicable, though not universally so.
The paper develops a mathematical theory in the spirit of categorical algebra that provides a model theory for second-order and dependently-sorted syntax. The theory embodies notions such as alpha-equivalence, variable binding, capture-avoiding simultaneous substitution, term metavariable, meta-substitution, mono and multi sorting, and sort dependency. As a matter of illustration, a model is used to extract a second-order syntactic theory, which is thus guaranteed to be correct by construction.
If you have a LaTeX document which
you will run into the problem that
Gerby addresses these problems by providing an online tag-based view, instead of just having a big PDF. Gerby is tailored towards making large online textbooks and reference works more accessible.
In case you were wondering, a gerbe is a kind of stack (in the mathematical sense), and the software was originally meant for the Stacks project.
Higher category theory is generally regarded as technical and forbidding, but part of it is considerably more tractable: the theory of infinity-categories, higher categories in which all higher morphisms are assumed to be invertible. In Higher Topos Theory, Jacob Lurie presents the foundations of this theory, using the language of weak Kan complexes introduced by Boardman and Vogt, and shows how existing theorems in algebraic topology can be reformulated and generalized in the theory’s new language. The result is a powerful theory with applications in many areas of mathematics.
The book’s first five chapters give an exposition of the theory of infinity-categories that emphasizes their role as a generalization of ordinary categories. Many of the fundamental ideas from classical category theory are generalized to the infinity-categorical setting, such as limits and colimits, adjoint functors, ind-objects and pro-objects, locally accessible and presentable categories, Grothendieck fibrations, presheaves, and Yoneda’s lemma. A sixth chapter presents an infinity-categorical version of the theory of Grothendieck topoi, introducing the notion of an infinity-topos, an infinity-category that resembles the infinity-category of topological spaces in the sense that it satisfies certain axioms that codify some of the basic principles of algebraic topology. A seventh and final chapter presents applications that illustrate connections between the theory of higher topoi and ideas from classical topology.
We present Quantitative Type Theory, a Type Theory that records usage information for each variable in a judgement, based on a previous system by McBride. The usage information is used to give a realizability semantics using a variant of Linear Combinatory Algebras, refining the usual realizability semantics of Type Theory by accurately tracking resource behaviour. We define the semantics in terms of Quantitative Categories with Families, a novel extension of Categories with Families for modelling resource sensitive type theories.