Jon Sterling [jms-0001]
Jon Sterling, Alejandro Aguirre, Andrew Pitts, Frank Stajano, Lars Birkedal, Marcelo Fiore
I am an Associate Professor in Logical Foundations and Formal Methods at the University of Cambridge. I study programming languages and semantics using type theory, category theory, domain theory, and topos theory as a guide. My other interests include Near Eastern, Classical, and Germanic philology. Find my contact information here.
Research themes [jms-008N]
Jon Sterling, 2023-09-08
Central to both the design of programming languages and the practice of software engineering is the tension between abstraction and composition. I employ semantic methods from category theory and type theory to design, verify, and implement languages that enable both programmers and mathematicians to negotiate the different levels of abstraction that arise in their work. I apply my research to global safety and security properties of programming languages as well as the design and implementation of interactive theorem provers for higher-dimensional mathematics. I develop foundational and practical mathematical tools to deftly weave together verifications that cut across multiple levels of abstraction.
About this website [jms-005U]
Jon Sterling, 2023-07-17
This website is a “forest” created using the Forester tool. I organize my thoughts here on a variety of topics at a granular level; sometimes these thoughts are self-contained, and at times I may organize them into larger notebooks or lecture notes. My nascent ideas about the design of tools for scientific thought are here. I welcome collaboration on any of the topics represented in my forest. To navigate my forest, press Ctrl-K.
I maintain several public bibliographies on topics that interest me. You can also access my personal bibliography or my curriculum vitæ.
Blog posts [jms-007W]
Jon Sterling
This is my blog, in which I write about a variety of topics including computer science, mathematics, and the design of tools for scientists.
☕ Crowdfunding and sponsorship [jms-0088]
Jon Sterling
Apart from my day job at the University of Cambridge, I am independently researching tools for scientific thought and developing software like Forester that you can use to unlock your brain. If you have benefited from this work or the writings on my blog, please consider supporting me with a sponsorship on Ko-fi.
Tips for using plain text email on macOS [jms-00QB]
Jon Sterling, 2024-03-10
I am a proponent of using plain text (or Markdown-formatted) emails rather than HTML-formatted emails. Although there are some legitimate uses for rich formatting in email, I find the well is poisoned: HTML-formatted email manipulates, surveils, and controls.
A renaissance for plain text email?
Although plain text email has gone the way of the dinosaur in most parts of society, it remains in use among many technical communities (such as the Linux kernel developers). Recently, SourceHut has emerged as a free and open source code forge whose workflow is entirely based on plain text email as described on the useplaintext.email website. Naturally, this has led to both curiosity and friction as users attempt to adapt to working technology of the 1990s that Capital has made increasingly difficult to interact with using present-day software. As a maintainer of several projects on SourceHut, I have seen many of my valued collaborators completely fall on the ground when it comes to following the etiquette recommendations for plaintext emails (especially text wrapping) and this is certainly not for lack of trying on their part. The available technological options are just that difficult to use.
The main viable tools for working with plain text email are terminal-based: for example, aerc and mutt and its derivatives. These are, however, a non-starter for most users (even technical users): for example, they are difficult to configure, and integrating with address books from standard email providers is nearly impossible or requires a Nobel prize; a bigger problem is that for most of these tools, one must store passwords in plain text (or use some very complicated authentication scheme that a typical user will be unable to set up).
The only viable modern GUI client recommended by useplaintext.email for macOS is MailMate. 
Unfortunately, MailMate’s visible settings do not have any option for wrapping text, which would undermine its viability for plaintext email users. Furthermore, the versions of MailMate available for download from its website (both the released version and the prerelease) seem to be incompatible with macOS Sonoma. By doing a bit of spelunking, I have found that both these problems can be solved (at least for now).
Using MailMate for plain text email [jms-00QD]
Step. Download the prerelease of MailMate
First, download the latest prerelease binary for MailMate from this directory: http://updates.mailmate-app.com/archives/?C=M;O=D. I found that this program can take a very long time to open for the first time, but after this it is speedy.
Step. Enable format=flowed
Second, enable format=flowed using the following completely undocumented command:
defaults write com.freron.MailMate MmFormatFlowedEnabled -bool true
Note that this option is so undocumented that it does not even appear in the list of hidden preferences. There is some discussion of this on the mailing list.
Although the format=flowed standard is very rarely implemented by clients, email sent using this flag will be formatted correctly by SourceHut Lists.
Day tensors of fibered categories [jms-009F]
Jon Sterling, 2023-09-20, 2023-12-17
I have been thinking about monoidal closed structures induced by slicing over a monoid, which has been considered by Combette and Munch-Maccagnoni as a potential denotational semantics of destructors à la C++. It occurred to me that this construction is, in fact, almost a degenerate case of Day convolution on an internal monoidal category, and this made me realize that there might be a nice way to understand Day convolution in the language of fibered categories. 
In fact, many of these results (in particular the relativization to internal monoidal categories) are probably a special case of Theorem 11.22 of Shulman’s paper on enriched indexed categories.
Definition. The “Day tensor” of fibered categories [jms-009A]
Let E and F be two fibered categories over a semimonoidal category { \mathopen {} \left ( B , \otimes , \alpha \right ) \mathclose {}}. We may define the “Day tensor” product E \otimes _ B F of E and F to be the following fibered category over B:
E \otimes _ B F : \equiv \otimes _! { \mathopen {} \left ( \pi _1^* E \times _{ B \times B } \pi _2^* F \right ) \mathclose {}}
Note that f_! refers to the left adjoint of base change f^* for fibered categories, which is not postcomposition.
Exegesis. Stepping through the Day tensor of fibered categories [jms-009C]
To understand the construction of the Day tensor, we will go through it step-by-step. Let E and F be two fibered categories over a semimonoidal category { \mathopen {} \left ( B , \otimes , \alpha \right ) \mathclose {}}.
Given that E and F are displayed over B, we may restrict them to lie in the two disjoint regions of B \times B by base change along the two projections:
\RequirePackage {tikz}
\RequirePackage {amsmath}
\usetikzlibrary {backgrounds, intersections, calc, spath3, fit}
\definecolor {catccolor}{RGB}{255,244,138}
\tikzstyle {dot}=[circle, draw=black, fill=black, minimum size=1mm, inner sep=0mm]
\tikzstyle {catc}=[catccolor!60]
\tikzstyle {catd}=[orange!40]
\tikzstyle {cate}=[red!40]
\tikzstyle {catf}=[blue!10]
\tikzstyle {catg}=[green!25]
\tikzstyle {blue halo}=[fill=blue!10, opacity=0.7, rounded corners]
\tikzstyle {white halo}=[fill=white, opacity=0.7, rounded corners]
\NewDocumentCommand \CreateRect {D<>{} m m}{
\path
coordinate (#1nw)
++(#2,-#3) coordinate (#1se)
coordinate (#1sw) at (#1se -| #1nw)
coordinate (#1ne) at (#1nw -| #1se)
;
\path [spath/save = #1north] (#1nw) to (#1ne);
\path [spath/save = #1west] (#1nw) to (#1sw);
\path [spath/save = #1east] (#1ne) to (#1se);
\path [spath/save = #1south] (#1sw) to (#1se);
}
\usepackage{tikz, tikz-cd, mathtools, amssymb, stmaryrd}
\usetikzlibrary{matrix,arrows}
\usetikzlibrary{backgrounds,fit,positioning,calc,shapes}
\usetikzlibrary{decorations.pathreplacing}
\usetikzlibrary{decorations.pathmorphing}
\usetikzlibrary{decorations.markings}
\tikzset{
desc/.style={sloped, fill=white,inner sep=2pt},
upright desc/.style={fill=white,inner sep=2pt},
pullback/.style = {
append after command={
\pgfextra{
\draw ($(\tikzlastnode) + (.2cm,-.5cm)$) -- ++(0.3cm,0) -- ++(0,0.3cm);
}
}
},
pullback 45/.style = {
append after command={
\pgfextra{
\draw[rotate = 45] ($(\tikzlastnode) + (.2cm,-.5cm)$) -- ++(0.3cm,0) -- ++(0,0.3cm);
}
}
},
ne pullback/.style = {
append after command={
\pgfextra{
\draw ($(\tikzlastnode) + (-.2cm,-.5cm)$) -- ++(-0.3cm,0) -- ++(0,0.3cm);
}
}
},
sw pullback/.style = {
append after command={
\pgfextra{
\draw ($(\tikzlastnode) + (.2cm,.5cm)$) -- ++(0.3cm,0) -- ++(0,-0.3cm);
}
}
},
dotted pullback/.style = {
append after command={
\pgfextra{
\draw [densely dotted] ($(\tikzlastnode) + (.2cm,-.5cm)$) -- ++(0.3cm,0) -- ++(0,0.3cm);
}
}
},
muted pullback/.style = {
append after command={
\pgfextra{
\draw [gray] ($(\tikzlastnode) + (.2cm,-.5cm)$) -- ++(0.3cm,0) -- ++(0,0.3cm);
}
}
},
pushout/.style = {
append after command={
\pgfextra{
\draw ($(\tikzlastnode) + (-.2cm,.5cm)$) -- ++(-0.3cm,0) -- ++(0,-0.3cm);
}
}
},
between/.style args={#1 and #2}{
at = ($(#1)!0.5!(#2)$)
},
diagram/.style = {
on grid,
node distance=2cm,
commutative diagrams/every diagram,
line width = .5pt,
every node/.append style = {
commutative diagrams/every cell,
}
},
fibration/.style = {
-{Triangle[open]}
},
etale/.style = {
-{Triangle[open]}
},
etale cover/.style= {
>={Triangle[open]},->.>
},
opfibration/.style = {
-{Triangle}
},
lies over/.style = {
|-{Triangle[open]}
},
op lies over/.style = {
|-{Triangle}
},
embedding/.style = {
{right hook}->
},
open immersion/.style = {
{right hook}-{Triangle[open]}
},
closed immersion/.style = {
{right hook}-{Triangle}
},
closed immersion*/.style = {
{left hook}-{Triangle}
},
embedding*/.style = {
{left hook}->
},
open immersion*/.style = {
{left hook}-{Triangle[open]}
},
exists/.style = {
densely dashed
},
}
\newlength{\dontworryaboutit}
\tikzset{
inline diagram/.style = {
commutative diagrams/every diagram,
commutative diagrams/cramped,
line width = .5pt,
every node/.append style = {
commutative diagrams/every cell,
anchor = base,
inner sep = 0pt
},
every path/.append style = {
outer xsep = 2pt
}
}
}
\tikzset{
square/nw/.style = {},
square/ne/.style = {},
square/se/.style = {},
square/sw/.style = {},
square/north/.style = {->},
square/south/.style = {->},
square/west/.style = {->},
square/east/.style = {->},
square/north/node/.style = {above},
square/south/node/.style = {below},
square/west/node/.style = {left},
square/east/node/.style = {right},
}
\ExplSyntaxOn
\bool_new:N \l_jon_glue_west
\keys_define:nn { jon-tikz/diagram } {
nw .tl_set:N = \l_jon_tikz_diagram_nw,
sw .tl_set:N = \l_jon_tikz_diagram_sw,
ne .tl_set:N = \l_jon_tikz_diagram_ne,
se .tl_set:N = \l_jon_tikz_diagram_se,
width .tl_set:N = \l_jon_tikz_diagram_width,
height .tl_set:N = \l_jon_tikz_diagram_height,
north .tl_set:N = \l_jon_tikz_diagram_north,
south .tl_set:N = \l_jon_tikz_diagram_south,
west .tl_set:N = \l_jon_tikz_diagram_west,
east .tl_set:N = \l_jon_tikz_diagram_east,
nw/style .code:n = {\tikzset{square/nw/.style = {#1}}},
sw/style .code:n = {\tikzset{square/sw/.style = {#1}}},
ne/style .code:n = {\tikzset{square/ne/.style = {#1}}},
se/style .code:n = {\tikzset{square/se/.style = {#1}}},
glue .choice:,
glue / west .code:n = {\bool_set:Nn \l_jon_glue_west \c_true_bool},
glue~target .tl_set:N = \l_jon_tikz_glue_target,
north/style .code:n = {\tikzset{square/north/.style = {#1}}},
north/node/style .code:n = {\tikzset{square/north/node/.style = {#1}}},
south/style .code:n = {\tikzset{square/south/.style = {#1}}},
south/node/style .code:n = {\tikzset{square/south/node/.style = {#1}}},
west/style .code:n = {\tikzset{square/west/.style = {#1}}},
west/node/style .code:n = {\tikzset{square/west/node/.style = {#1}}},
east/style .code:n = {\tikzset{square/east/.style = {#1}}},
east/node/style .code:n = {\tikzset{square/east/node/.style = {#1}}},
draft .meta:n = {
nw = {\__jon_tikz_diagram_fmt_placeholder:n {nw}},
sw = {\__jon_tikz_diagram_fmt_placeholder:n {sw}},
se = {\__jon_tikz_diagram_fmt_placeholder:n {se}},
ne = {\__jon_tikz_diagram_fmt_placeholder:n {ne}},
north = {\__jon_tikz_diagram_fmt_placeholder:n {north}},
south = {\__jon_tikz_diagram_fmt_placeholder:n {south}},
west = {\__jon_tikz_diagram_fmt_placeholder:n {west}},
east = {\__jon_tikz_diagram_fmt_placeholder:n {east}},
}
}
\tl_set:Nn \l_jon_tikz_diagram_width { 2cm }
\tl_set:Nn \l_jon_tikz_diagram_height { 2cm }
\cs_new:Npn \__jon_tikz_diagram_fmt_placeholder:n #1 {
\texttt{\textcolor{red}{#1}}
}
\keys_set:nn { jon-tikz/diagram } {
glue~target = {},
}
\cs_new:Nn \__jon_tikz_render_square:nn {
\group_begin:
\keys_set:nn {jon-tikz/diagram} {#2}
\bool_if:nTF \l_jon_glue_west {
\node (#1ne) [right = \l_jon_tikz_diagram_width~of~\l_jon_tikz_glue_target ne,square/ne] {$\l_jon_tikz_diagram_ne$};
\node (#1se) [below = \l_jon_tikz_diagram_height~of~#1ne,square/se] {$\l_jon_tikz_diagram_se$};
\draw[square/north] (\l_jon_tikz_glue_target ne) to node [square/north/node] {$\l_jon_tikz_diagram_north$} (#1ne);
\draw[square/east] (#1ne) to node [square/east/node] {$\l_jon_tikz_diagram_east$} (#1se);
\draw[square/south] (\l_jon_tikz_glue_target se) to node [square/south/node] {$\l_jon_tikz_diagram_south$} (#1se);
} {
\node (#1nw) [square/nw] {$\l_jon_tikz_diagram_nw$};
\node (#1sw) [below = \l_jon_tikz_diagram_height~of~#1nw,square/sw] {$\l_jon_tikz_diagram_sw$};
\draw[square/west] (#1nw) to node [square/west/node] {$\l_jon_tikz_diagram_west$} (#1sw);
\node (#1ne) [right = \l_jon_tikz_diagram_width~of~#1nw,square/ne] {$\l_jon_tikz_diagram_ne$};
\node (#1se) [below = \l_jon_tikz_diagram_height~of~#1ne,square/se] {$\l_jon_tikz_diagram_se$};
\draw[square/north] (#1nw) to node [square/north/node] {$\l_jon_tikz_diagram_north$} (#1ne);
\draw[square/east] (#1ne) to node [square/east/node] {$\l_jon_tikz_diagram_east$} (#1se);
\draw[square/south] (#1sw) to node [square/south/node] {$\l_jon_tikz_diagram_south$} (#1se);
}
\group_end:
}
\NewDocumentCommand\SpliceDiagramSquare{D<>{}m}{
\__jon_tikz_render_square:nn {#1} {#2}
}
\NewDocumentCommand\DiagramSquare{D<>{}O{}m}{
\begin{tikzpicture}[diagram,#2,baseline=(#1sw.base)]
\__jon_tikz_render_square:nn {#1} {#3}
\end{tikzpicture}
}
\ExplSyntaxOff
\DiagramSquare {
nw = \pi _1^* E ,
sw = B \times B ,
se = B ,
ne = E ,
east/style = {|->},
west/style = {|->,exists},
north/style = {->,exists},
south = \pi _1,
height = 1.5cm,
}
\qquad
\DiagramSquare {
nw = \pi _2^* F ,
sw = B \times B ,
se = B ,
ne = F ,
east/style = {|->},
west/style = {|->,exists},
north/style = {->,exists},
south = \pi _2,
height = 1.5cm,
}
The above can be seen as cartesian lifts in the 2-bifibration of fibered categories over \mathbf {Cat}.
Next, we take the fiber product of fibered categories, giving a vertical span over B \times B:
\pi _1^* E \leftarrow \pi _1^* E \times _{ B \times B } \pi _2^* F \rightarrow \pi _2^* F
Of course, this corresponds to pullback in \mathbf {Cat} or cartesian product in { \mathbf {Cat} } _{ / B }.
Finally, we take a cocartesian lift along the tensor functor { B \times B } \xrightarrow {{ \otimes }}{ B } to obtain the Day tensor:
\DiagramSquare {
nw = \pi _1^* E \times _{ B \times B } \pi _2^* F ,
sw = B \times B ,
south = \otimes ,
ne = E \otimes _{ B } F ,
se = B,
north/style = {->,exists},
east/style = {|->,exists},
west/style = {|->},
height = 1.5cm,
width = 3cm,
}
Note: the cocartesian lift above does not correspond to composition in \mathbf {Cat} except in the discrete case.
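As a sanity check on this note, here is a worked special case; it is a sketch under assumptions that go beyond the text. Suppose that B is a discrete category, i.e. the underlying set of a semigroup, and read “the discrete case” as referring to this situation (one plausible reading, not confirmed by the source). A fibered category over B is then just a family of categories { \mathopen {} \left ( E_a \right ) \mathclose {}} _{a \in B}, base change along a function is reindexing, and its left adjoint sums over fibers. Under these assumptions the Day tensor would unfold, fiber by fiber over c \in B, to:
{ \mathopen {} \left ( E \otimes _ B F \right ) \mathclose {}} _c \simeq \coprod _{(a,b) \in B \times B, \ a \otimes b = c} E_a \times F_b
In this situation \otimes _! does agree with postcomposition along \otimes, which appears to recover the construction by slicing over a monoid that motivated the discussion.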
Under appropriate assumptions, we may also compute a “Day hom” by adjointness.
Definition. The “Day hom” of fibered categories [jms-009G]
Let E and F be two fibered categories over a semimonoidal category { \mathopen {} \left ( B , \otimes , \alpha \right ) \mathclose {}}. We may define the “Day hom” E \multimap _ B F of E and F to be the following fibered category over B:
E \multimap _ B F : \equiv { \mathopen {} \left ( \pi _1 \right ) \mathclose {}} _* { \mathopen {} \left ( \pi _2^* E \Rightarrow _{ B \times B } \otimes ^* F \right ) \mathclose {}}
Proof.
First of all, we note that the pullback functor { { \mathbf {Cat} } _{ / B } } \xrightarrow {{ \pi _1^* }}{ { \mathbf {Cat} } _{ / B \times B } } has a right adjoint \pi _1^* \dashv { \mathopen {} \left ( \pi _1 \right ) \mathclose {}} _*, as { B \times B } \xrightarrow {{ \pi _1 }}{ B } is (as any cartesian fibration) a Conduché functor, i.e. an exponentiable arrow in \mathbf {Cat}. We have already assumed that E is a cartesian fibration, and thus so is its restriction \pi _2^*E; it therefore follows that \pi _2^*E is exponentiable, which is what the exponential \pi _2^* E \Rightarrow \otimes ^* F below requires. With that out of the way, we may compute the hom by adjoint calisthenics:
\begin {aligned} & \mathbf {hom} _{ { \mathbf {Cat} } _{ / B } } { \mathopen {} \left ( X \otimes _B E , F \right ) \mathclose {}} \\ & \quad \equiv \mathbf {hom} _{ { \mathbf {Cat} } _{ / B } } { \mathopen {} \left ( \otimes _! { \mathopen {} \left ( \pi _1^*X \times _{B \times B} \pi _2^* E \right ) \mathclose {}} , F \right ) \mathclose {}} \\ & \quad \simeq \mathbf {hom} _{ { \mathbf {Cat} } _{ / B \times B } } { \mathopen {} \left ( \pi _1^*X \times _{B \times B} \pi _2^* E , \otimes ^* F \right ) \mathclose {}} \\ & \quad \simeq \mathbf {hom} _{ { \mathbf {Cat} } _{ / B \times B } } { \mathopen {} \left ( \pi _1^*X , \pi _2^* E \Rightarrow \otimes ^* F \right ) \mathclose {}} \\ & \quad \simeq \mathbf {hom} _{ { \mathbf {Cat} } _{ / B } } { \mathopen {} \left ( X , { \mathopen {} \left ( \pi _1 \right ) \mathclose {}} _* { \mathopen {} \left ( \pi _2^* E \Rightarrow \otimes ^* F \right ) \mathclose {}} \right ) \mathclose {}} \end {aligned}
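To see the computation above in a concrete instance, one can specialize to a discrete base B (an assumption made only for illustration), so that objects of { \mathbf {Cat} } _{ / B } are families of categories indexed by the elements of B. Writing [ - , - ] for the functor category (notation introduced here, not used in the source), the Day hom would unfold fiberwise over a \in B to:
{ \mathopen {} \left ( E \multimap _ B F \right ) \mathclose {}} _a \simeq \prod _{b \in B} [ E_b , F_{a \otimes b} ]
and the chain of equivalences would reduce to currying in each component:
\begin {aligned} \mathbf {hom} _{ { \mathbf {Cat} } _{ / B } } { \mathopen {} \left ( X \otimes _B E , F \right ) \mathclose {}} & \simeq \prod _{c \in B} \Big [ \coprod _{a \otimes b = c} X_a \times E_b , F_c \Big ] \\ & \simeq \prod _{a , b \in B} [ X_a \times E_b , F_{a \otimes b} ] \\ & \simeq \prod _{a \in B} \Big [ X_a , \prod _{b \in B} [ E_b , F_{a \otimes b} ] \Big ] \end {aligned}
The first step uses the fiberwise description of X \otimes _B E over a discrete base, the second the universal property of the coproduct, and the third the universal properties of the product and of the functor category.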
Definition. The “Day unit” of fibered categories [jms-009B]
Let { \mathopen {} \left ( B , \otimes ,I, \alpha , \lambda , \rho \right ) \mathclose {}} be a monoidal category. We may define a “Day unit” for fibered categories over B to be given by the discrete fibration { B } _{ / I } \to B, which corresponds under the Grothendieck construction to the presheaf represented by I.
Conjecture. The Day tensor preserves cartesian fibrations [jms-009D]
If E and F are cartesian fibrations over a semimonoidal category { \mathopen {} \left ( B , \otimes , \alpha \right ) \mathclose {}}, then the Day tensor E \otimes _ B F is also a cartesian fibration.
I believe, but did not check carefully, that when E and F are discrete fibrations over a semimonoidal category { \mathopen {} \left ( B , \otimes , \alpha \right ) \mathclose {}} then the Day tensor is precisely the discrete fibration corresponding to the (contravariant) Day convolution of the presheaves corresponding to E and F. Likewise when { \mathopen {} \left ( B , \otimes ,I, \alpha , \lambda , \rho \right ) \mathclose {}} is monoidal, it appears that the Day unit corresponds precisely to the traditional one.
There remain some interesting directions to explore. First of all, the claims above would obviously lead to a new construction of the Day convolution monoidal structure on the 1-category of discrete fibrations on B that coincides with the traditional one up to the Grothendieck construction. But in general, we should expect to exhibit both { \mathbf {Cat} } _{ / B } and \mathbf {Fib}_{ B } as monoidal bicategories, a result that I have not seen before.
Conjecture. A monoidal bicategory of fibered categories [jms-009E]
Let { \mathopen {} \left ( B , \otimes ,I, \alpha , \lambda , \rho \right ) \mathclose {}} be a monoidal category. Then the Day tensor and unit extend to a monoidal structure on the bicategory of fibered categories over B.
The conjecture above is highly non-trivial, as monoidal bicategories are extremely difficult to construct explicitly. I am hoping that Mike Shulman’s ideas involving monoidal double categories could potentially help.
On the relationship between QTT and STC [jms-0094]
Jon Sterling, 2023-09-17
I have been thinking again about the relationship between quantitative type theory and synthetic Tait computability and other approaches to type refinements. One of the defining characteristics of QTT that I thought distinguished it from STC was the treatment of types: in QTT, types only depend on the “computational” / unrefined aspect of their context, whereas types in STC are allowed to depend on everything. In the past, I mistakenly believed that this was due to the realizability-style interpretation of QTT, in contrast with STC’s gluing interpretation. It is now clear to me that (1) QTT is actually glued (in the sense of q-realizability, no pun intended), and (2) the nonstandard interpretation of types in QTT corresponds to adding an additional axiom to STC, namely the tininess of the generic proposition.
It has been suggested to me by Neel Krishnaswami that this property of QTT may not be desirable in all cases (sometimes you want the types to depend on quantitative information), and that for this reason, graded type theories might be a better way forward in some applications. My results today show that STC is, in essence, what you get when you relax QTT’s assumption that types do not depend on quantitative information. This suggests that we should explore the idea of multiplicities within the context of STC: as any monoidal product on the subuniverse spanned by closed-modal types induces quite directly a form of variable multiplicity in STC, I expect this direction to be fruitful.
My thoughts on the precise relationship between the QTT models and Artin gluing will be elucidated at a different time. 
Today, I will restrict myself to sketching an interpretation of a QTT-style language in STC assuming the generic proposition is internally tiny.
Let \mathscr {Q} be an elementary topos equipped with a subterminal object \P \hookrightarrow \mathbf {1} inducing an open subtopos \mathscr {E} \simeq { \mathscr {Q} } _{ / \P } \hookrightarrow \mathscr {Q} and its complementary closed subtopos \mathscr {F} \hookrightarrow \mathscr {Q}. This structure is the basis of the interpretation of STC; if you think of STC in terms of refinements, then stuff from \mathscr {E} is “computational” and stuff from \mathscr {F} is “logical”.
We now consider the interpretation of a language of (potentially quantitative) refinements into \mathscr {Q}. A context \Gamma is interpreted by an object of \mathscr {Q}; a type \Gamma \vdash A is interpreted by a family A \to \bigcirc { \Gamma }; a term \Gamma \vdash a : A is interpreted as a map \Gamma \to A such that \Gamma \to A \to \bigcirc \Gamma is the unit of the monad.
So far we have not needed anything beyond the base structure of STC in order to give an interpretation of types in QTT’s style. But to extend this interpretation to a universe, we must additionally assume that \P is internally tiny, in the sense that the exponential functor { \mathopen {} \left ( - \right ) \mathclose {}} ^ \P is a left adjoint. Under these circumstances, the idempotent monad \bigcirc \equiv j_*j^* : \mathscr {Q} \to \mathscr {Q} corresponding to the open immersion j : \mathscr {E} \hookrightarrow \mathscr {Q} has a right adjoint \square : \mathscr {Q} \to \mathscr {Q}, an idempotent comonad.
Proof.
As \bigcirc \equiv j_*j^* is the exponential functor { \mathopen {} \left ( - \right ) \mathclose {}} ^ \P, its right adjoint \square is therefore the “root functor” { \mathopen {} \left ( - \right ) \mathclose {}} _ \P that exhibits \P as an internally tiny object.
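Concretely, and using only the tininess hypothesis above, the adjunction \bigcirc \dashv \square amounts to a bijection, natural in objects X and Y of \mathscr {Q} (names introduced here for illustration):
\mathbf {hom} _{ \mathscr {Q} } { \mathopen {} \left ( X ^ \P , Y \right ) \mathclose {}} \cong \mathbf {hom} _{ \mathscr {Q} } { \mathopen {} \left ( X , Y _ \P \right ) \mathclose {}}
where X ^ \P = \bigcirc X and Y _ \P = \square Y.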
Although \square lifts to each slice of \mathscr {Q}, these liftings do not commute with base change; this will, however, not be an obstacle for us.
We will now see how to use the adjunction \bigcirc \dashv \square to interpret a universe, either for the purpose of interpreting universes of refinement types, or for the purpose of strictifying the model that we have sketched. Let \mathcal {V} be a (standard) universe in \mathscr {Q}, e.g. a Hofmann–Streicher universe; we shall then interpret the corresponding universe of refinements as \mathcal {U} : \equiv \square \mathcal {V}. To see that \mathcal {U} classifies \mathcal {V}-small families of refinements, we compute as follows:
A code \Gamma \vdash \hat {A} : \mathcal {U} amounts to nothing more than a morphism \hat {A}: \Gamma \to \square { \mathcal {V} }.
By adjoint transpose, this is the same as a morphism \hat {A}^ \sharp : \bigcirc { \Gamma } \to \mathcal {V}.
Thus we see that if \mathcal {V} is generic for \mathcal {V}-small families of (arbitrary) types in \mathscr {Q}, then \mathcal {U} \equiv \square \mathcal {V} is generic for \mathcal {V}-small families of type refinements, i.e. types whose context is \bigcirc-modal.
Finally, we comment that the tininess of \P is satisfied in many standard examples, the simplest of which is the Sierpiński topos \mathbf {Set} ^{ \to }.
Classifying topoi and generalised abstract syntax [jms-008J]
Jon Sterling, 2023-09-05
For any small category \mathscr {C}, Fiore treats \mathscr {C}-sorted abstract syntax in the functor category { \mathopen {} \left [ \mathscr {C} , \operatorname {Pr} { \mathopen {} \left ( \operatorname { \mathbb {L}} { \mathscr {C} } \right ) \mathclose {}} \right ] \mathclose {}} where \operatorname { \mathbb {L}} is some 2-monad on \mathbf {Cat}; any such functor P denotes a set that is indexed in sorts and contexts (where the 2-monad \operatorname { \mathbb {L}} takes a category of sorts to the corresponding category of contexts). When \mathscr {C} is a set and \operatorname { \mathbb {L}} is either finite limit or finite product completion, we recover the standard notions of many-sorted abstract syntax; in general, we get a variety of forms of dependently sorted or generalised abstract syntax.
We will assume here that \operatorname { \mathbb {L}} is the free finite limit completion 2-monad; our goal is to study Fiore’s general substitution monoidal structure from the point of view of classifying topoi, building on Johnstone’s analogous observations (Elephant, D3.2) on the non-symmetric monoidal structure of the object classifier. 
The topos theoretic viewpoint that we will explore is nothing more than a rephrasing of Fiore’s account in terms of the Kleisli composition in a 2-monad; nonetheless the perspective of classifying topoi is enlightening, as it provides an explanation for precisely what internal geometrical structure one expects in a given topos for abstract syntax, potentially leading to improved internal languages.For any small category \mathscr {C}, the category of presheaves \operatorname {Pr} { \mathopen {} \left ( \operatorname { \mathbb {L}} \mathscr {C} \right ) \mathclose {}} corresponds to the classifying topos of diagrams of shape \mathscr {C}. Following Anel and Joyal, we shall write \mathbb {A} ^{ \mathscr {C} } for this “affine” classifying topos; under the conventions of op. cit., we may then identify the category of sheaves \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } with the presheaf category \operatorname {Pr} { \mathopen {} \left ( \operatorname { \mathbb {L}} { \mathscr {C} } \right ) \mathclose {}}.The universal property of \mathbb {A} ^{ \mathscr {C} } as the classifying topos of \mathscr {C}-diagrams means that for any topos \mathcal {X}, a diagram { \mathscr {C} } \xrightarrow {{ P }}{ \operatorname {Sh} { \mathcal {X} } } corresponds essentially uniquely (by left Kan extension) to a morphism of topoi { \mathcal {X} } \xrightarrow {{ \bar {P} }}{ \mathbb {A} ^{ \mathscr {C} } }. We have a generic \mathscr {C}-shaped diagram { \mathscr {C} } \xrightarrow {{ \mathrm {G}_{ \mathscr {C} } }}{ \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } } corresponding under this identification to the identity map on \mathbb {A} ^{ \mathscr {C} }. 
More explicitly, the diagram \mathrm {G}_{ \mathscr {C} } is the following composite: \mathrm {G}_{ \mathscr {C} } : \equiv \mathscr {C} \xrightarrow { \eta _ \mathscr {C} } \operatorname { \mathbb {L}} \mathscr {C} \xrightarrow { よ_{ \mathscr {C} } } \operatorname {Pr} { \mathopen {} \left ( \operatorname { \mathbb {L}} \mathscr {C} \right ) \mathclose {}} = \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } Given a morphism of topoi { \mathcal {X} } \xrightarrow {{ f }}{ \mathbb {A} ^{ \mathscr {C} } }, we may recover the diagram \mathscr {C} \to { \operatorname {Sh} { \mathcal {X} }} that it classifies as the composite \mathscr {C} \xrightarrow { \mathrm {G}_{ \mathscr {C} } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow { f ^{ * } } \operatorname {Sh} { \mathcal {X} }. In case \mathcal {X} \equiv \mathbb {A} ^{ \mathscr {C} }, then, we have a correspondence between \mathscr {C}-shaped diagrams of sheaves on \mathbb {A} ^{ \mathscr {C} } and endomorphisms of \mathbb {A} ^{ \mathscr {C} }; we are interested in representing the composition of such endomorphisms as a tensor product on the functor category { \mathopen {} \left [ \mathscr {C} , \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \right ] \mathclose {}}. In particular, let { \mathscr {C} } \xrightarrow {{ P,Q }}{ \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } } be two diagrams; taking characteristic maps, we have endomorphisms of affine topoi { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow {{ \bar {P}, \bar {Q} }}{ \mathbb {A} ^{ \mathscr {C} } }, which we may compose to obtain { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow {{ \bar {P} \circ \bar {Q} }}{ \mathbb {A} ^{ \mathscr {C} } }; then, we will define the tensor P \bullet Q to be the diagram whose characteristic morphism of affine topoi is \bar {P} \circ \bar {Q}. 
In other words: \begin {aligned} P \bullet Q &: \equiv \mathscr {C} \xrightarrow { \mathrm {G}_{ \mathscr {C} } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow { { \mathopen {} \left ( \bar {P} \circ \bar {Q} \right ) \mathclose {}} ^{ * } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \\ &= \mathscr {C} \xrightarrow { \mathrm {G}_{ \mathscr {C} } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow { \bar {P} ^{ * } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow { \bar {Q} ^{ * } } \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } \end {aligned} To give an explicit computation of the tensor product, we first compute the inverse image of any { \mathbb {A} ^{ \mathscr {C} } } \xrightarrow {{ f }}{ \mathbb {A} ^{ \mathscr {C} } } on representables よ_{ \mathscr {C} } \Gamma for \Gamma \in \operatorname { \mathbb {L}} \mathscr {C}. As any left exact functor { \operatorname { \mathbb {L}} \mathscr {C} } \xrightarrow {{ H }}{ \mathscr {E} } is the right Kan extension of { \mathscr {C} } \xrightarrow {{ H \circ \eta _ \mathscr {C} }}{ \mathscr {E} } along { \mathscr {C} } \xrightarrow {{ \eta _ \mathscr {C} }}{ \operatorname { \mathbb {L}} \mathscr {C} }, we can conclude that H \Gamma \cong \operatorname {lim}_{ \Gamma \to \eta _ \mathscr {C} {d} } H { \mathopen {} \left ( \eta _ \mathscr {C} {d} \right ) \mathclose {}}. We will use this in our calculation below, setting H: \equiv f ^{ * } \circ よ_{ \mathscr {C} }. \begin {aligned} f ^{ * } { よ_{ \mathscr {C} } \Gamma } & \cong \operatorname {lim}_{ \Gamma \to \eta _ \mathscr {C} {d} } f ^{ * } { よ_{ \mathscr {C} } { \eta _ \mathscr {C} {d}} } \\ & \cong \operatorname {lim}_{ \Gamma \to \eta _ \mathscr {C} {d} } f ^{ * } { \mathrm {G}_{ \mathscr {C} } {d}} \end {aligned} We are now prepared to compute the tensor product of any { \mathscr {C} } \xrightarrow {{ P,Q }}{ \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } }. 
\begin {aligned} { \mathopen {} \left ( P \bullet Q \right ) \mathclose {}} c &= \bar {Q} ^{ * } \bar {P} ^{ * } \mathrm {G}_{ \mathscr {C} } {c} \\ & \cong \bar {Q} ^{ * } { \mathopen {} \left ( Pc \right ) \mathclose {}} \\ & \cong \bar {Q} ^{ * } \operatorname {colim}_{ よ_{ \mathscr {C} } \Delta \to Pc } よ_{ \mathscr {C} } \Delta \\ & \cong \operatorname {colim}_{ よ_{ \mathscr {C} } \Delta \to Pc } \bar {Q} ^{ * } よ_{ \mathscr {C} } \Delta \\ & \cong \operatorname {colim}_{ よ_{ \mathscr {C} } \Delta \to Pc } \operatorname {lim}_{ \Delta \to \eta _ \mathscr {C} {d} } \bar {Q} ^{ * } \mathrm {G}_{ \mathscr {C} } d \\ & \cong \operatorname {colim}_{ よ_{ \mathscr {C} } \Delta \to Pc } \operatorname {lim}_{ \Delta \to \eta _ \mathscr {C} {d} } Qd \end {aligned} Finally, we can relate the computation above to that of Fiore in terms of coends. \begin {aligned} { \mathopen {} \left ( P \bullet Q \right ) \mathclose {}} \, c & \cong \operatorname {colim}_{ よ_{ \mathscr {C} } \Delta \to Pc } \operatorname {lim}_{ \Delta \to \eta _ \mathscr {C} {d} } Qd \\ & \cong \int ^{ \Delta \in \operatorname { \mathbb {L}} \mathscr {C} } { \mathopen {} \left [ よ_{ \mathscr {C} } \Delta ,Pc \right ] \mathclose {}} \cdot \operatorname {lim}_{ \Delta \to \eta _ \mathscr {C} {d} } Qd \\ & \cong \int ^{ \Delta \in \operatorname { \mathbb {L}} \mathscr {C} } P \, c \, \Delta \cdot \operatorname {lim}_{ \Delta \to \eta _ \mathscr {C} {d} } Qd \\ & \cong \int ^{ \Delta \in \operatorname { \mathbb {L}} \mathscr {C} } P \, c \, \Delta \cdot \int _{d \in \mathscr {C} } { \mathopen {} \left [ \Delta , \eta _ \mathscr {C} {d} \right ] \mathclose {}} \pitchfork Qd \end {aligned} Above, we have written { \mathopen {} \left ( \cdot \right ) \mathclose {}} and { \mathopen {} \left ( \pitchfork \right ) \mathclose {}} for the tensoring and cotensoring of \operatorname {Sh} { \mathbb {A} ^{ \mathscr {C} } } over \mathbf {Set} respectively. 
Thus, the fully pointwise computation is as follows: { \mathopen {} \left ( P \bullet Q \right ) \mathclose {}} \, c \, \Gamma \cong \int ^{ \Delta \in \operatorname { \mathbb {L}} \mathscr {C} } P \, c \, \Delta \times \int _{d \in \mathscr {C} } { \mathopen {} \left [ \Delta , \eta _ \mathscr {C} {d} \right ] \mathclose {}} \Rightarrow Q \, d \, \Gamma Thanks to Marcelo Fiore and Daniel Gratzer for helpful discussions.1001jms-0084jms-0084.xmlScientific refereeing using Bike Outliner2023831Jon SterlingI have long been an enthusiast for outliners, a genre of computer software that deserves more than almost any other to be called an “elegant weapon for a more civilized age”. Recently I have been enjoying experimenting with Jesse Grosjean’s highly innovative outliner for macOS called Bike, which builds on a lot of his previous (highly impressive) work in the area with a level of fit and finish that is rare even in the world of macOS software. Bike costs $29.99 and is well worth it; watch the introductory video or try the demo to see for yourself.The purpose of outliners is to provide room to actively think; Grothendieck is said to have been unable to think at all without a pen in his hand, and I think of outliners as one way to elevate the tactile aspect of active thinking using the unique capabilities of software. Tools for thinking must combat stress and mental weight, and the most immediate way that outliners achieve this is through the ability to focus on a subtree: narrowing into a portion of the hierarchy and treating it as if it were the entire document, putting its context aside. 
This feature, which some of my readers may recognize from Emacs org-mode, is well-represented in Bike, without, of course, suffering the noticeable quirks that come from the Emacs environment, nor the ill-advised absolute/top-down model of hierarchy sadly adopted by org-mode.As a scientist in academia, one of the things I do most frequently when I am not writing my own papers or working with students is refereeing other scientists’ papers. For those who are unfamiliar, this means carefully studying a paper and then producing a detailed and well-structured report that includes a summary of the paper, my personal assessment of its scientific validity and value, and a long list of corrections, questions, comments, and suggestions. Referee reports of this kind are then used by journal editors and conference program committees to decide which papers deserve to be published.In this post, I will give an overview of my refereeing workflow with Bike and how I overcame the challenges of transferring finished referee reports from Bike into the text-based formats used by conference refereeing platforms like HotCRP and EasyChair using a combination of XSLT 2.0 and Pandoc. This tutorial applies to Bike 1.14; I hope the format will not change too much, but I cannot make promises about what I do not control.982jms-0089jms-0089.xmlRefereeing in an outliner2023829Jon SterlingMost scientific conferences solicit and organize reviews for papers using a web platform such as HotCRP or EasyChair; although these are not the same, the idea is similar. Once you have been assigned to referee a paper, you will receive a web form with several sections and large text areas in which to put the components of your review; not all conferences ask for the same components, but usually one is expected to include the following in addition to your (numerical) assessment of the paper’s merit and your expertise:
A summary of the paper
Your assessment of the paper
Detailed comments for the authors
Questions to be addressed by author response
Comments for the PC (program committee) and other reviewersUsually you will be asked to enter your comments under each section in a plain text format like Markdown. The first thing a new referee learns is not to type answers directly into the web interface, because this is an extremely reliable way to lose hours of your time when a browser or server glitch deletes all your work. Most of us instead write out our answers in a separate text editor, and paste them into the web interface when we are satisfied with them. In the past, I have done this with text files on my computer, but today I want to discuss how to draft referee reports as outlines in Bike; then I will show you how to convert them to the correct plain text format that can be pasted into your conference’s preferred web-based refereeing platform.To start off, have a look at the figure below, which shows my refereeing template outline in Bike.976Figurejms-0086jms-0086.xmlA Bike outline for conference refereeing2023828Jon Sterlinghttps://git.sr.ht/~jonsterling/bike-convertors/tree/main/item/example.bikehttp://www.w3.org/1999/xhtml{
This Bike outline contains the skeleton of a referee report for a major computer science conference. Bike’s row types are used extensively: for example, sections are formatted as headings, and prompts are formatted as notes
}As you can see, there is a healthy combination of hierarchy and formatting in a Bike outline.978jms-008Bjms-008B.xmlRich text editing in Bike2023829Jon SterlingBike is a rich text editor, but one that (much like GNU TeXmacs) avoids the classic pitfalls of nearly all rich text editors, such as the ubiquitous and dreaded “Is the space italic?!” user-experience failure; I will not outline Bike’s innovative approach to rich text editing here, but I suggest you check it out for yourself.980jms-008Ajms-008A.xmlRow types in Bike2023829Jon SterlingOne of the most useful features of Bike’s approach to formatting is the concept of a row type, which is a semantic property of a row that has consequences for its visual presentation. Bike currently supports the following row types:
Plain rows
Heading rows, formatted in boldface
Note rows, formatted in gray italics
Quote rows, formatted with a vertical bar to their left
Ordered rows, formatted with an (automatically chosen) numeral to their left
Unordered rows, formatted with a bullet to their left
Task rows, formatted with a checkbox to their left
My refereeing template uses several of these row types (headings, notes, tasks) as well as some of the rich text formatting (highlighting). When I fill out the refereeing outline, I will use other row types as well, including quotes, ordered, and unordered rows. I will create a subheading under Detailed comments for the author to contain my questions and comments, which I enter as ordered rows; then I make a separate subheading at the same level for Typographical errors, which I populate with unordered rows. Unordered rows are best for typos, because typos are already accompanied by line numbers. If I need to quote an extended portion of the paper, I will use a quote row.When working on the report outline, I will constantly be focusing on individual sections to avoid not only distractions but also the intense mental weight of unfinished sections. Focusing means that the entire outline is narrowed to a subtree that can be edited away from its context; this is achieved in Bike by pressing the gray south-easterly arrows to the right of each heading, as seen in the figure. 994jms-008Cjms-008C.xmlFrom an outline to a plain text referee report2023831Jon SterlingWe have seen how pleasant it is to use an outliner like Bike to draft a referee report, but we obviously cannot submit a .bike file to a conference reviewing website or a journal editor. Most conference reviewing systems accept plain text or Markdown responses, and so our goal will be to convert a Bike outline into reasonably formatted Markdown.It happens that Bike’s underlying format is HTML, so one idea would be to use Pandoc to process this HTML into Markdown. This would work, except that Bike’s model is sufficiently structured that it must make deeply idiosyncratic use of HTML tags, as can be seen from the listing below.
984Listingjms-008Gjms-008G.xmlThe source code to a typical Bike outline2023831Jon Sterling<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8"/>
</head>
<body>
<ul id="2sbcmmms">
<li id="3C" data-type="heading">
<p>Tasks</p>
<ul>
<li id="cs" data-type="task">
<p>read through paper on iPad and highlight</p>
</li>
</ul>
</li>
<li id="XZ" data-type="heading">
<p>Syntax and semantics of foo bar baz</p>
<ul>
<li id="Kp">
<p><mark>Overall merit:</mark><span> </span></p>
</li>
<li id="d9">
<p><mark>Reviewer Expertise:</mark></p>
</li>
<li id="uM" data-type="heading">
<p>Summary of the paper</p>
<ul>
<li id="oB" data-type="note">
<p>Please give a brief summary of the paper</p>
</li>
<li id="TWZ">
<p>This paper describes the syntax and semantics of foo, bar, and baz.</p>
</li>
</ul>
</li>
<li id="V8" data-type="heading">
<p>Assessment of the paper</p>
<ul>
<li id="vD" data-type="note">
<p>Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.</p>
</li>
</ul>
</li>
<li id="zo" data-type="heading">
<p>Detailed comments for authors</p>
<ul>
<li id="o0" data-type="note">
<p>Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.</p>
</li>
<li id="bgy" data-type="heading">
<p>Minor comments</p>
<ul>
<li id="tMq" data-type="unordered">
<p>line 23: "teh" => "the"</p>
</li>
<li id="EmX" data-type="unordered">
<p>line 99: "fou" => "foo"</p>
</li>
</ul>
</li>
</ul>
</li>
<li id="aN" data-type="heading">
<p>Questions to be addressed by author response</p>
<ul>
<li id="7s" data-type="note">
<p>Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.</p>
</li>
</ul>
</li>
<li id="4S" data-type="heading">
<p>Comments for PC and other reviewers</p>
<ul>
<li id="bN" data-type="note">
<p>Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.</p>
</li>
<li id="i2b">
<p>In case any one is wondering, I am an expert in foo, but not in bar nor baz.</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</body>
</html>
The Markdown that would result from postprocessing a Bike outline directly with Pandoc would be deeply unsuitable for submission. We will, however, use a version of this idea: first we will preprocess the Bike format into more conventional (unstructured) HTML using XSLT 2.0, and then we will use Pandoc to convert this into Markdown.
986jms-008Ejms-008E.xmlSystem requirements2023831Jon SterlingXSLT 2.0 is unfortunately only implemented by proprietary tools like Saxon, developed by Saxonica. Nonetheless, it is possible to freely install Saxon on macOS using Homebrew:
brew install saxon
You must also install Pandoc, which is also conveniently available as a binary on Homebrew:
brew install pandoc
With the system requirements out of the way, we can proceed to prepare an XSLT stylesheet that will convert Bike’s idiosyncratic use of HTML tags to more conventional HTML that can be processed into Markdown by Pandoc. The stylesheet bike-to-html.xsl is described and explained in the listing below.
988Listingjms-0087jms-0087.xmlAn XSLT 2.0 transformer to convert Bike outlines to HTML2023829Jon Sterlinghttps://git.sr.ht/~jonsterling/bike-convertors/tree/main/item/bike-to-html.xslWe can convert Bike outlines to reasonable HTML using an XSLT 2.0 stylesheet, bike-to-html.xsl, detailed below.
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:html="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="html">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />We will allow several tags to be copied verbatim into the output, as Bike uses these in the same way that idiomatic HTML does. <xsl:template match="html:html | html:body | html:code | html:strong | html:em | html:mark">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>Bike leaves behind a lot of empty span elements; we drop these. <xsl:template match="html:span">
<xsl:apply-templates />
</xsl:template>Bike uses ul for all lists; the list type is determined not at this level, but rather by each individual item’s @data-type attribute. To get this data into the HTML list model, we must group items that have the same @data-type and wrap them in an appropriate list-forming element.To do this, we use XSLT 2.0’s xsl:for-each-group instruction to group adjacent li elements by their @data-type attribute. (It is extremely difficult and error-prone to write equivalent code in the more widely available XSLT 1.0.) We must convert @data-type to a string: otherwise, the transformer will crash when it encounters an item without a @data-type attribute. <xsl:template match="html:ul">
<xsl:for-each-group select="html:li" group-adjacent="string(@data-type)">
<xsl:choose>
<xsl:when test="@data-type='ordered' or @data-type='task'">
<ol>
<xsl:apply-templates select="current-group()" />
</ol>
</xsl:when>
<xsl:when test="@data-type='unordered'">
<ul>
<xsl:apply-templates select="current-group()" />
</ul>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()" />
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:template>Next, we match each individual li element; the content of a list item is stored in a p element directly under li, so we let the transformer fall through the parent and then format the content underneath according to the @data-type of the item. <xsl:template match="html:li">
<xsl:apply-templates />
</xsl:template>
<xsl:template
match="html:li[@data-type='ordered' or @data-type='unordered' or @data-type='task']/html:p">
<li>
<xsl:apply-templates />
</li>
</xsl:template>Bike has correctly adopted the optimal explicit and relative model of hierarchy, in contrast to HTML; this means that the depth of a heading is not reflected in the element that introduces it, but is instead inferred from its actual position in the outline hierarchy. To convert Bike outlines to idiomatic HTML, we must flatten the hierarchy and introduce explicit heading levels; luckily, this is easy to accomplish in XSLT by counting the ancestors of heading type. <xsl:template match="html:li[@data-type='heading']/html:p">
<xsl:element
name="h{count(ancestor::html:li[@data-type='heading'])}">
<xsl:apply-templates />
</xsl:element>
</xsl:template>The remainder of the row types are not difficult to render; you may prefer alternative formatting depending on your goals. <xsl:template match="html:li[@data-type='quote']/html:p">
<blockquote>
<xsl:apply-templates />
</blockquote>
</xsl:template>
<xsl:template match="html:li[@data-type='note']/html:p">
<p>
<em>
<xsl:apply-templates />
</em>
</p>
</xsl:template>
<xsl:template match="html:li[not(@data-type)]/html:p">
<p>
<xsl:apply-templates />
</p>
</xsl:template>
</xsl:stylesheet>
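For comparison, the grouping step performed by xsl:for-each-group can be sketched outside XSLT. The following Python fragment is only an illustration using the standard library; the function name group_rows and the sample outline are hypothetical and are not part of the stylesheet or of Bike. It mirrors group-adjacent="string(@data-type)" by treating a missing attribute as the empty string.

```python
from itertools import groupby
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

def group_rows(ul):
    """Group adjacent <li> children of a Bike <ul> by their data-type.

    A missing data-type attribute is read as "", which plays the role of
    string(@data-type) in the stylesheet.
    """
    items = ul.findall(f"{{{XHTML}}}li")
    return [
        (kind, [li.get("id") for li in rows])
        for kind, rows in groupby(items, key=lambda li: li.get("data-type", ""))
    ]

# Hypothetical sample outline, made up for this illustration.
SAMPLE = f"""
<ul xmlns="{XHTML}">
  <li id="a" data-type="unordered"><p>one</p></li>
  <li id="b" data-type="unordered"><p>two</p></li>
  <li id="c"><p>plain</p></li>
  <li id="d" data-type="ordered"><p>first</p></li>
</ul>
""".strip()

groups = group_rows(ET.fromstring(SAMPLE))
print(groups)
```

Each resulting pair corresponds to one iteration of the xsl:for-each-group body: the key decides whether the rows are wrapped in ol, ul, or nothing, and the list of rows plays the role of current-group().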
Next, we can use Saxon to convert a Bike outline to idiomatic HTML using the stylesheet above.
cat review.bike | saxon -xsl:bike-to-html.xsl - > review.html
Go ahead and open the resulting HTML file in a text editor and a browser to see the results.
990Listingjms-008Fjms-008F.xmlA Bike outline transformed to idiomatic HTML2023831Jon SterlingThe following is the result of transforming an example Bike outline to idiomatic HTML using an XSLT stylesheet.<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<h1>Tasks</h1>
<ol>
<li>read through paper on iPad and highlight</li>
</ol>
<h1>Syntax and semantics of foo bar baz</h1>
<p>
<mark>Overall merit:</mark>
</p>
<p>
<mark>Reviewer Expertise:</mark>
</p>
<h2>Summary of the paper</h2>
<p>
<em>Please give a brief summary of the paper</em>
</p>
<p>This paper describes the syntax and semantics of foo, bar, and baz.</p>
<h2>Assessment of the paper</h2>
<p>
<em>Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.</em>
</p>
<h2>Detailed comments for authors</h2>
<p>
<em>Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.</em>
</p>
<h3>Minor comments</h3>
<ul>
<li>line 23: "teh" => "the"</li>
<li>line 99: "fou" => "foo"</li>
</ul>
<h2>Questions to be addressed by author response</h2>
<p>
<em>Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.</em>
</p>
<h2>Comments for PC and other reviewers</h2>
<p>
<em>Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.</em>
</p>
<p>In case any one is wondering, I am an expert in foo, but not in bar nor baz.</p>
</body>
</html>
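The heading levels in the listing above (h1, h2, h3) come from counting the heading rows that enclose each heading. As a rough illustration of that rule, here is a small Python sketch; the function heading_levels and the sample outline are hypothetical names introduced only for this example.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
LI = f"{{{XHTML}}}li"
P = f"{{{XHTML}}}p"

def heading_levels(node, depth=0, out=None):
    """Collect (level, text) for every heading row, in document order.

    The level of a heading is the number of heading rows on the path from
    the root down to and including the row itself; rows of other types do
    not contribute to the count.
    """
    if out is None:
        out = []
    for child in node:
        if child.tag == LI and child.get("data-type") == "heading":
            level = depth + 1
            out.append((level, child.findtext(P)))
            heading_levels(child, level, out)
        else:
            heading_levels(child, depth, out)
    return out

# Hypothetical sample outline, made up for this illustration.
SAMPLE = f"""
<ul xmlns="{XHTML}">
  <li data-type="heading"><p>Report</p>
    <ul>
      <li data-type="heading"><p>Summary</p></li>
      <li data-type="heading"><p>Detailed comments</p>
        <ul>
          <li data-type="heading"><p>Minor comments</p></li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
""".strip()

levels = heading_levels(ET.fromstring(SAMPLE))
print(levels)
```

Note that HTML only defines h1 through h6, so an outline with more than six nested headings would need some clamping or fallback that neither this sketch nor the rule described above provides.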
Next, we will process this HTML file using Pandoc; unfortunately, Pandoc leaves behind a lot of garbage character escapes that are not suitable for submission anywhere, so we must filter those out using sed.
cat review.html | pandoc -f html -t markdown-raw_html-native_divs-native_spans-fenced_divs-bracketed_spans-smart | sed 's/\\//g'
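To make the effect of the sed filter concrete, here is a minimal Python sketch of the same operation; the function name strip_backslashes and the sample string are hypothetical and chosen only for illustration, not taken from real Pandoc output.

```python
def strip_backslashes(markdown: str) -> str:
    """Delete every backslash, like the sed expression s/\\//g."""
    return markdown.replace("\\", "")

# Hypothetical escaped input, made up for this illustration.
escaped = 'line 23: \\"teh\\" =\\> \\"the\\"'
cleaned = strip_backslashes(escaped)
print(cleaned)
```

The filter is deliberately blunt: it removes every backslash, including any that were typed on purpose (for instance in a LaTeX snippet quoted in the review), so such passages may need to be restored by hand.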
992Listingjms-008Hjms-008H.xmlA Bike outline transformed to Markdown2023831Jon SterlingThe following is the result of converting the idiomatic HTML representation of a Bike outline to Markdown using Pandoc, with some light postprocessing by sed.# Tasks
1. read through paper on iPad and highlight
# Syntax and semantics of foo bar baz
Overall merit:
Reviewer Expertise:
## Summary of the paper
*Please give a brief summary of the paper*
This paper describes the syntax and semantics of foo, bar, and baz.
## Assessment of the paper
*Please give a balanced assessment of the paper's strengths and
weaknesses and a clear justification for your review score.*
## Detailed comments for authors
*Please give here any additional detailed comments or questions that you
would like the authors to address in revising the paper.*
### Minor comments
- line 23: "teh" => "the"
- line 99: "fou" => "foo"
## Questions to be addressed by author response
*Please list here any specific questions you would like the authors to
address in their author response. Since authors have limited time in
which to prepare their response, please only ask questions here that are
likely to affect your accept/reject decision.*
## Comments for PC and other reviewers
*Please list here any additional comments you have which you want the PC
and other reviewers to see, but not the authors.*
In case any one is wondering, I am an expert in foo, but not in bar nor
baz.
We can group these tasks into a one-liner as follows:
cat review.bike | saxon -xsl:bike-to-html.xsl - | pandoc -f html -t markdown-raw_html-native_divs-native_spans-fenced_divs-bracketed_spans-smart | sed 's/\\//g'
996jms-008Ijms-008I.xmlA convenient Bike-to-Markdown script2023831Jon Sterlinghttps://git.sr.ht/~jonsterling/bike-convertorsI have gathered the scripts to convert Bike outlines into Markdown via idiomatic HTML in a Git repository where they can be easily downloaded. If you have any improvements to these scripts, please submit them as a patch to my public inbox! I am also interested in whether it is possible to write the XSLT 2.0 stylesheet as equivalent XSLT 1.0, to avoid requiring the proprietary Saxon tool. Feel free also to send comments on this post to my public inbox, or discuss with me on Mastodon.
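For readers who would rather drive the same pipeline from Python than from the shell, the following is a minimal sketch and not one of the scripts in the repository. The function names pipeline_commands and bike_to_markdown are hypothetical; the command-line arguments are copied from the one-liner in this post, and the sketch assumes that the saxon and pandoc executables are on the PATH.

```python
import shlex
import subprocess
from pathlib import Path

# Pandoc output format, copied verbatim from the one-liner in the post.
PANDOC_TARGET = (
    "markdown-raw_html-native_divs-native_spans"
    "-fenced_divs-bracketed_spans-smart"
)

def pipeline_commands(stylesheet="bike-to-html.xsl"):
    """Return the argv lists for the two external stages of the pipeline."""
    return [
        ["saxon", f"-xsl:{stylesheet}", "-"],
        ["pandoc", "-f", "html", "-t", PANDOC_TARGET],
    ]

def bike_to_markdown(bike_path, stylesheet="bike-to-html.xsl"):
    """Feed a .bike file through Saxon and Pandoc, then drop backslashes."""
    data = Path(bike_path).read_bytes()
    for argv in pipeline_commands(stylesheet):
        # Each stage reads the previous stage's output on standard input.
        data = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        ).stdout
    return data.decode("utf-8").replace("\\", "")

# Show the commands that would be run, without running them.
for argv in pipeline_commands():
    print(shlex.join(argv))
```

Calling bike_to_markdown("review.bike") would then return the Markdown text as a string, raising an exception if either external tool fails.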
999☕jms-0088jms-0088.xmlCrowdfunding and sponsorshipJon SterlingApart from my day-job at the University of Cambridge, I am independently researching tools for scientific thought and developing software like Forester that you can use to unlock your brain. If you have benefited from this work or the writings on my blog, please consider supporting me with a sponsorship on Ko-fi.
1033jms-007Tjms-007T.xmlA synthetic proof of HTT 7.2.1.142023818Jon SterlingLurie states the following result as Proposition 7.2.1.14 of Higher Topos Theory:
Let \mathcal {X} be an ∞-topos and let { \mathcal {X} } \xrightarrow {{ \tau _{ \leq 0} }}{ \tau _{ \leq 0} \mathcal {X} } be the left adjoint to the inclusion. A morphism { U } \xrightarrow {{ \phi }}{ X } in \mathcal {X} is an effective epimorphism if and only if \tau _{ \leq 0} { \mathopen {} \left ( \phi \right ) \mathclose {}} is an effective epimorphism in the ordinary topos \mathrm {h} { \mathopen {} \left ( \tau _{ \leq 0 } \mathcal {X} \right ) \mathclose {}}.
Several users of MathOverflow have noticed that the proof of this result given by Lurie is circular. The result is true and can be recovered in a variety of ways, as pointed out in the MathOverflow discussion. For expository purposes, I would like to show how to prove this result directly in univalent foundations using standard results about n-truncations from the HoTT Book.First we observe a trivial lemma about truncations.1005Lemmajms-007Ujms-007U.xmlPropositions do not notice set truncation of indices2023818Jon SterlingAny proposition is orthogonal to the map \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } } B \left \lvert x \right \rvert _{ 0 } \to \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } } Bx sending { \mathopen {} \left ( x,y \right ) \mathclose {}} to { \mathopen {} \left ( \left \lvert x \right \rvert _{ 0 } ,y \right ) \mathclose {}}.
1003Proof#492unstable-492.xml2023818Jon Sterlingjms-007U
This follows from the dependent induction principle of set-truncation, as any proposition is a set.
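To spell this step out, here is a sketch of the argument, reading orthogonality as unique extension along the map; the name P for the arbitrary proposition is introduced here only for illustration.

```latex
% P is an arbitrary proposition (a name introduced for this sketch).
\begin{aligned}
\Bigl( \textstyle\sum_{x : \lVert A \rVert_0} B x \to P \Bigr)
  &\simeq \textstyle\prod_{x : \lVert A \rVert_0} (B x \to P)
  && \text{(currying)} \\
  &\simeq \textstyle\prod_{x : A} (B \lvert x \rvert_0 \to P)
  && \text{(set-truncation induction; each } B x \to P \text{ is a proposition, hence a set)} \\
  &\simeq \Bigl( \textstyle\sum_{x : A} B \lvert x \rvert_0 \to P \Bigr)
  && \text{(currying)}
\end{aligned}
```

All of the types involved are propositions, and the composite from top to bottom is precomposition with the map sending (x,y) to (\lvert x \rvert_0, y); orthogonality is the statement that this precomposition is an equivalence.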
It follows from the lemma above that \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( x:A \right ) \mathclose {}} } } B \left \lvert x \right \rvert _{ 0 } \right \rVert _{ -1 } is equivalent to \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( x: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } } Bx \right \rVert _{ -1 }. This is enough to deduce the main result.1031Theoremjms-007Vjms-007V.xmlSet-truncation preserves and reflects surjections2023818Jon SterlingA function f:A \to B is surjective (i.e. an effective epimorphism) if and only if its set-truncation \left \lVert f \right \rVert _{ 0 } : \left \lVert A \right \rVert _{ 0 } \to \left \lVert B \right \rVert _{ 0 } is surjective.
1029Proof#491unstable-491.xml2023818Jon Sterlingjms-007V
We deduce that \mathsf { isSurjective } \, \left \lVert f \right \rVert _{ 0 } is equivalent to \mathsf { isSurjective } \, f as follows:
\mathsf { isSurjective } \, \left \lVert f \right \rVert _{ 0 }
\equiv \mathchoice { \textstyle \prod _{ { \mathopen {} \left ( b: \left \lVert B \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \textstyle \prod _{ { \mathopen {} \left ( b: \left \lVert B \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptstyle \prod _{ { \mathopen {} \left ( b: \left \lVert B \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptscriptstyle \prod _{ { \mathopen {} \left ( b: \left \lVert B \right \rVert _{ 0 } \right ) \mathclose {}} } } \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } } \left \lVert f \right \rVert _{ 0 } a=b \right \rVert _{ -1 }
by definition
\simeq \mathchoice { \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptscriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } } \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( a: \left \lVert A \right \rVert _{ 0 } \right ) \mathclose {}} } } \left \lVert f \right \rVert _{ 0 } a= \left \lvert b \right \rvert _{ 0 } \right \rVert _{ -1 }
by induction
\simeq \mathchoice { \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptscriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } } \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } } \left \lvert fa \right \rvert _{ 0 } = \left \lvert b \right \rvert _{ 0 } \right \rVert _{ -1 }
by the Lemma above
\simeq \mathchoice { \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptscriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } } \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } } \left \lVert fa=b \right \rVert _{ -1 } \right \rVert _{ -1 }
by HoTT Book, Theorem 7.3.12
\simeq \mathchoice { \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \textstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } }{ \scriptscriptstyle \prod _{ { \mathopen {} \left ( b:B \right ) \mathclose {}} } } \left \lVert \mathchoice { \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \textstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } }{ \scriptscriptstyle \sum _{ { \mathopen {} \left ( a:A \right ) \mathclose {}} } } fa=b \right \rVert _{ -1 }
by HoTT Book, Theorem 7.3.9
\equiv \mathsf { isSurjective } \, f
by definition
Incidentally, the appeal to Theorem 7.3.12 of the HoTT Book can be replaced by the more general Proposition 2.26 of Christensen et al., which applies to an arbitrary reflective subuniverse and the corresponding subuniverse of separated types.
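The step marked “by induction” in the chain above can be unpacked as follows. This is only a sketch: the name P is introduced here for illustration and stands for an arbitrary family of propositions indexed by the set-truncation of B.

```latex
% P is a hypothetical name for a family of propositions over \lVert B \rVert_0.
\[
  \Bigl( \prod_{b : \lVert B \rVert_0} P\,b \Bigr)
  \;\simeq\;
  \Bigl( \prod_{b : B} P\,\lvert b \rvert_0 \Bigr)
\]
% Left to right: precompose with the unit \lvert - \rvert_0 : B \to \lVert B \rVert_0.
% Right to left: each P\,b is a proposition, hence a set, so the dependent
% induction principle of set-truncation produces a section over \lVert B \rVert_0.
% Both sides are propositions, so these two maps already form an equivalence.
% In the proof above, P\,b is
%   \lVert \sum_{a : \lVert A \rVert_0} \lVert f \rVert_0\,a = b \rVert_{-1}.
```

This is the same argument that proves the Lemma, applied to a dependent product rather than to a map out of a dependent sum.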
1035jms-0052jms-0052.xmlBuild your own Stacks Project in 10 minutes20235142024425Jon SterlingThe Stacks project is the most successful scientific hypertext project in history. Its goal is to lay the foundations for the theory of algebraic stacks; to facilitate its scalable and sustainable development, several important innovations have been introduced, with the tags system being the most striking.
Each tag refers to a unique item (section, lemma, theorem, etc.) in order for this project to be referenceable. These tags don't change even if the item moves within the text. (Tags explained, The Stacks Project).
Many working scientists, students, and hobbyists have wished to create their own tag-based hypertext knowledge base, but the combination of tools historically required to make this happen is extremely daunting. Both the Stacks project and Kerodon use a cluster of software called Gerby, but bitrot has set in and it is no longer possible to build its dependencies on a modern environment without significant difficulty, raising questions of longevity. Moreover, Gerby’s deployment involves running a database on a server (in spite of the fact that almost the entire functionality is static HTML), an architecture that is incompatible with the constraints of the everyday working scientist or student who knows at most how to upload static files to their university-provided public storage. The recent experience of the nLab’s pandemic-era hiatus and near-death experience has demonstrated with some urgency the precarity faced by any project relying heavily on volunteer system administrators.
786jms-0053jms-0053.xmlIntroducing Forester: a tool for scientific thought2023514Jon SterlingAfter spending two years exploring the design of tools for scientific thought that meet the unique needs of real, scalable scientific writing in hypertext, I have created a tool called Forester which has the following benefits:
Forester is tag-based like Gerby, and can therefore power large-scale generational projects like Stacks and Kerodon.
Forester produces static content that can be uploaded to any web hosting service without needing to run or install any serverside software.
Forester is easy to install on your own machine.
To prevent bitrot, Forester is a single tool rather than a composition of several tools.
Forester satisfies all the requirements of serious scientific writing, including sophisticated notational macros, typesetting of diagrams, etc.
Forester combines associative and hierarchical networks of evergreen notes (called “trees”) into hypertext sites called “forests”.
784Definitiontfmt-000Rtfmt-000R.xmlForests and trees of evergreen notes202334Jon SterlingA forest of evergreen notes (or a forest for short) is loosely defined to be a collection of evergreen notes in which multiple hierarchical structures are allowed to emerge and evolve over time. Concretely, one note may contextualize several other notes via transclusion within its textual structure; in the context of a forest, we refer to an individual note as a tree. Of course, a tree can be viewed as a forest that has a root node.
Trees correspond roughly to what are referred to as “tags” in the Stacks Project.
In this article, I will show you how to set up your own forest using the Forester software. These instructions pertain to the Forester 4.0 version.
801jms-006Rjms-006R.xmlPreparing to run the Forester software2023813Jon SterlingIn this section, we will walk through the installation of the Forester software.
797jms-006Sjms-006S.xmlSystem requirements of Forester2023813Jon Sterling
789Requirementjms-006Tjms-006T.xmlA unix-based system2023813Jon SterlingForester requires a unix-based system to run; it has been tested on both macOS and Linux. Windows support is desirable, but there are no concrete plans to implement it at this time.
791Requirementjms-006Ujms-006U.xmlA working OCaml 5 installation2023813Jon SterlingForester is a tool written in the OCaml programming language, and makes use of the latest features of OCaml 5. 
Most users should install Forester through OCaml's opam package manager; instructions to install opam and OCaml simultaneously can be found here.
793Requirementjms-006Vjms-006V.xmlA working \LaTeX installation2023813Jon SterlingIf you intend to embed \LaTeX-rendered diagrams in your forest, you will need a working installation of \LaTeX, such as TeX Live. If all your mathematical expressions are supported by \KaTeX, this is not necessary.
795Requirementjms-006Yjms-006Y.xmlThe git distributed version control system2023814Jon SterlingIt is best practice to maintain your forest under distributed version control. This serves not only as a way to prevent data loss (because you will be pushing frequently to a remote repository); it also allows you to easily roll back to an earlier version of your forest, or to create “branches” in which you prepare trees that are not yet ready to be integrated into the forest.
The recommended distributed version control system is git, which comes preinstalled on many unix-based systems and is easy to install otherwise. Git is not the most user-friendly piece of software, unfortunately, but it is ubiquitous. It is possible (but not recommended) to use Forester without version control, but note that the simplest way to initialize your own forest involves cloning a git repository.
799jms-006Wjms-006W.xmlInstalling the Forester software2023813Jon SterlingOnce you have met the system requirements, installing Forester requires only a single shell command:
opam install forester
To verify that Forester is installed, please run forester --version in your shell.
818jms-006Xjms-006X.xmlSetting up your forest from the template20238142024425Jon SterlingNow that you have installed the Forester software, it is time to set up your forest. Currently, the simplest way to set up a new forest is to clone your own copy of the forest-template repository which has been prepared for this purpose. 
In the future, I plan to have a more user-friendly way to set up new forests; please contact the mailing list if this is an urgent need, and it will be prioritized accordingly.
We will initialize a new forest in a folder called forest. Open your shell to a suitable parent directory (e.g. your documents folder), and type the following commands:
git init forest
cd forest
The first command above creates a new folder called forest and initializes a git repository inside that folder. The second command instructs your shell to enter the forest directory. Next, we will initialize your new forest with the contents of the default template by pulling from the template repository:
git pull https://git.sr.ht/~jonsterling/forest-template
Your forest contains a configuration file, forest.toml; this file specifies the locations of your trees, assets, etc.
804jms-0073jms-0073.xmlNamespaces and addresses in a forest2023814Jon SterlingA tree in Forester is associated to an address of the form xxx-NNNN where xxx is your chosen “namespace” (most likely your initials) and NNNN is a four-digit base-36 number. The purpose of the namespace and the base-36 code is to uniquely identify a tree, not only within your forest but across all forests. A tree with address xxx-NNNN is always stored in a file named xxx-NNNN.tree.
Note that the format of tree addresses is purely a matter of convention, and is not forced by the Forester tool. Users are free to use their own format for tree addresses, and in some cases alternative (human-readable) formats may be desirable: this includes trees representing bibliographic references, as well as biographical trees.
The template forest assumes that your chosen namespace prefix is xxx, and comes equipped with a “root” tree located at trees/xxx-0001.tree. 
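To make the xxx-NNNN convention concrete, here is a minimal shell sketch that formats a counter as a four-digit base-36 address. The helper name tree_address is hypothetical and not part of Forester, and the digit order (0 to 9, then A to Z) is an assumption based on the addresses visible in this forest.

```shell
# tree_address PREFIX N: print the address PREFIX-NNNN, where NNNN is N
# written as a four-digit base-36 number (digits 0-9, then A-Z).
# Hypothetical helper for illustration only; it is not a Forester command.
tree_address() (
  prefix=$1
  n=$2
  digits=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
  out=
  for i in 1 2 3 4; do
    # Pick the digit for the current remainder (cut counts from 1).
    digit=$(printf '%s' "$digits" | cut -c "$((n % 36 + 1))")
    out=$digit$out
    n=$((n / 36))
  done
  printf '%s-%s\n' "$prefix" "$out"
)
```

For instance, tree_address jms 282 prints jms-007U. In practice there is no need to compute addresses by hand, since Forester can allocate the next one for you.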
You should choose your own namespace prefix, likely your personal initials, and then rename trees/xxx-0001.tree accordingly.
816jms-007Djms-007D.xmlBuilding and viewing your forest for the first time20238152024425Jon SterlingTo build your forest, run the following Forester command in your shell:
forester build forest.toml
The build command also accepts an optional --dev flag, which supplies metadata to the generated website to support an “edit button” on each tree; this flag is meant to be used when developing your forest locally, and should not be used when building the forest to be uploaded to your public web host.
806jms-007Gjms-007G.xmlForester renders each tree to an XML document2023815Jon SterlingForester renders your forest to XML files in the output/ directory; XML is, like HTML, a format for structured documents that can be displayed by web browsers. The forest template comes equipped with a built-in XSLT stylesheet (assets/forest.xsl) which is used to instruct web browsers how to render your forest into a pleasing and readable format.
808jms-007Ijms-007I.xmlServing and viewing your forest from a local web server2023815Jon SterlingThe recommended and most secure way to view your forest while editing it is to serve it from a local web server. To do this, first ensure that you have Python 3 correctly installed. Then run the following command from the root directory of your forest:
python3 -m http.server 1313 -d output
While this command is running, you will be able to access your forest by navigating to localhost:1313/index.xml in your preferred web browser.
In the future, Forester may be able to run its own local server to avoid the dependency on external tools like Python.
814jms-007Jjms-007J.xmlViewing your forest locally without a local web server2023815Jon SterlingIt is also possible to open the generated file output/index.xml directly in your web browser. 
Unfortunately, modern web browsers by default prevent the use of XSLT stylesheets on the local file system for security reasons. Because Forester's output format is XML, the output cannot be viewed directly in your web browser without disabling this security feature (at your own risk). Users who do not understand the risks involved should turn back and use a local web server instead, which is more secure; if you understand and are willing to accept the risks, you may proceed as follows depending on your browser.
810jms-007Ejms-007E.xmlConfiguring Firefox for viewing a local forest2023815Jon SterlingTo configure Firefox for viewing your local forest, navigate to about:config in your address bar.
Firefox will present a page warning you to “Proceed with Caution”: you must “Accept the Risk and Continue”.
In the “Search preference name” box, search for security.fileuri.strict_origin_policy.
Most likely, the security.fileuri.strict_origin_policy preference will appear set to true. Double click on the word true to toggle it to false.
812jms-007Fjms-007F.xmlConfiguring Safari for viewing a local forest2023815Jon SterlingTo configure Safari for viewing your local forest, you must activate the Develop menu and then toggle one setting.
Open Safari's settings window.
In the Advanced tab, check the “Show Develop menu in menu bar” checkbox at the bottom.
Open the Develop menu in the menu bar, and select “Disable Local File Restrictions”.
820jms-007Kjms-007K.xmlCreating your personal biographical tree2023815Jon SterlingThe first tree that you should create is a biographical tree to represent your own identity; ultimately you will link to this tree when you set the authors of other trees that you create later on. Although most trees will be addressed by identifiers of the form xxx-NNNN, it is convenient to simply use a person’s full name to address a biographical tree. My own biographical tree is located at trees/people/jonmsterling.tree and contains the following source code:
\title{Jon Sterling}
\taxon{person}
\meta{external}{https://www.jonmsterling.com/}
\meta{institution}{[[ucam]]}
\meta{orcid}{0000-0002-0585-5564}
\meta{position}{Associate Professor}
\p{Associate Professor in Logical Foundations and Formal Methods at University of Cambridge. Formerly a [Marie Skłodowska-Curie Postdoctoral Fellow](jms-0061) hosted at Aarhus University by [Lars Birkedal](larsbirkedal), and before this a PhD student of [Robert Harper](robertharper).}
Let’s break this code down to understand what it does.
The declaration \title{Jon Sterling} sets the title of the tree to my name.
The \taxon{person} declaration informs Forester that the tree is biographical. Not every tree needs to have a taxon; common taxa include person, theorem, definition, lemma, blog, etc. You are free to use whatever you want, but some taxa are treated specially by Forester.
The subsequent \meta declarations attach additional information to the tree that can be used during rendering. These declarations are optional, and you are free to put whatever metadata you want.
As in HTML, paragraphs must be wrapped in \p{...}.
Do not hard-wrap your text, as this can have visible impact on how trees are rendered; it is recommended that you use a text editor with good support for soft-wrapping, like Visual Studio Code.
You can see that the concrete syntax of Forester's trees looks superficially like a combination of \LaTeX and Markdown; Markdown-style links are used both for links to other trees and for links to external URLs. Forester's concrete syntax is not fully documented, but it is less ambiguous than both \LaTeX and Markdown.
822jms-007Hjms-007H.xmlCreating a new tree using forester new20238152024425Jon SterlingCreating a new tree in your forest is as simple as adding a .tree file to the trees folder. Because it is hard to manually choose the next incremental tree address, Forester provides a command to do this automatically. If your chosen namespace prefix is xxx, then you should use the following command in your shell to create a new tree:
forester new forest.toml --dest=trees --prefix=xxx
In return, Forester should output the location of the new tree, e.g. trees/xxx-0002.tree. If we look at the contents of this new file, we will see that it is empty except for metadata assigning a date to the tree:
\date{2023-08-15}
Most trees should have a \date annotation; this date is meant to be the date of the tree's creation. You should proceed by adding further metadata: the title and the author; for the latter, you will use the address of your personal biographical tree.
\title{my first tree}
\author{jonmsterling}
Tree titles should be given in lower case (except for proper names, etc.); these titles will be rendered by Forester in sentence case. A tree can have as many \author declarations as it has authors; these will be rendered in their order of appearance.
Now you can begin to populate the tree with its content, written in the Forester markup language. Think carefully about keeping each tree relatively independent and atomic.
828jms-007Ljms-007L.xmlBottom-up hierarchy via transclusion2023815Jon SterlingYou may be used to writing \LaTeX documents, where you work from the top down: you create some section headings, put some text under those headings, make some deeper section headings, put more text, etc. Forests work in the opposite way, from the bottom up: you start by writing independent, atomic notes/trees and then only later start to (sparingly) assemble these into a hierarchy in order to reify the emerging structure.
Forester’s bottom-up approach to section hierarchy works via something called transclusion. The idea is that at any time, you can include (“transclude”) the full contents of another tree into the current tree as a subsection by adding the following code:
\transclude{xxx-NNNN}
This is kind of like \LaTeX’s \input command, but much better behaved: for instance, section numbers are computed on the fly. This entire tutorial is cobbled together by transcluding many smaller trees, each with their own independent existence. For example, the following two sections are transcluded from an entirely different part of my forest:
824tfmt-0009tfmt-0009.xmlThe best structure to impose is relatively flat20221227Jon SterlingIt is easy to make the mistake of prematurely imposing a complex hierarchical structure on a network of notes, which leads to excessive refactoring. Hierarchy should be used sparingly, and its strength is for the large-scale organization of ideas. 
The best structure to impose on a network of many small related ideas is a relatively flat one. I believe that this is one of the mistakes made in the writing of the foundations of relative category theory, whose hierarchical nesting was too complex and quite beholden to my experience with pre-hypertext media.One of the immediate impacts and strengths of Forester’s transclusion model is that a given tree has no canonical “geographic” location in the forest. One tree can appear as a child of many other trees, which allows the same content to be incorporated into different textual and intellectual narratives.826tfmt-0006tfmt-0006.xmlHierarchical structure as non-unique narrative20221226Jon SterlingMultiple hierarchical structures can be imposed on the same associative network of nodes; a hierarchical structure amounts to a “narrative” that contextualizes a given subgraph of the network. One example could be the construction of lecture notes; another example could be a homework sheet; a further example could be a book chapter or scientific article. Although these may draw from the same body of definitions, theorems, examples, and exercises, these objects are contextualized within a different narrative, often toward fundamentally different ends.As a result, any interface for navigating the neighbor-relation in hierarchically organized notes would need to take account of the multiplicity of parent nodes. Most hypertext tools assume that the position of a node in the hierarchy is unique, and therefore have a single “next/previous” navigation interface; we must investigate the design of interfaces that surface all parent/neighbor relations.897jms-007Njms-007N.xmlThe Forester markup language2023816Jon SterlingA tree in Forester is a single file written in a markup language designed specifically for scientific writing with bottom-up hierarchy via transclusion. 
A tree has two components: the frontmatter and the mainmatter.855jms-007Pjms-007P.xmlForester markup: frontmatter2023816Jon SterlingThe frontmatter of a Forester tree is a sequence of declarations that we summarize below.
Declaration
Meaning
\title{...}
sets the title of the tree; can contain mainmatter markup
\author{name}
sets the author of the tree to be the biographical tree at address name
\date{YYYY-MM-DD}
sets the creation date of the tree
\taxon{taxon}
sets the taxon of the tree; example taxa include lemma, theorem, person, reference; the latter two taxa are treated specially by Forester for tracking biographical and bibliographical trees respectively
\def\ident[x][y]{body}
defines and exports from the current tree a function named \ident with two arguments; subsequently, the expression \ident{u}{v} would expand to body with the values of u,v substituted for \x,\y
\import{xxx-NNNN}
brings the functions exported by the tree xxx-NNNN into scope
\export{xxx-NNNN}
brings the functions exported by the tree xxx-NNNN into scope, and exports them from the current tree
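The interplay of \def and \import can be illustrated with two small trees. This is a sketch using only the declarations summarized above; the addresses xxx-0003 and xxx-0004 and the macro name \Hom are hypothetical, and the use of % to begin a comment is an assumption not documented in this tutorial.

```text
% Contents of trees/xxx-0003.tree (hypothetical address): defines and exports \Hom.
\title{notation for hom-sets}
\date{2023-08-16}
\def\Hom[x][y]{#{\mathrm{Hom}({\x},{\y})}}

% Contents of trees/xxx-0004.tree (hypothetical address): imports and uses \Hom.
\title{a use of the hom-set notation}
\date{2023-08-16}
\import{xxx-0003}
\p{A morphism from #{A} to #{B} is an element of \Hom{A}{B}.}
```

Replacing \import with \export in the second tree would additionally make \Hom available to any tree that imports xxx-0004.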
891jms-007Ojms-007O.xmlForester markup: mainmatter2023816Jon SterlingBelow we summarize the concrete syntax of the mainmatter in a Forester tree.
Function
Meaning
\p{...}
creates a paragraph containing ...; unlike Markdown, it is mandatory to annotate paragraphs explicitly
\em{...}
typesets the content in italics
\strong{...}
typesets the content in boldface
#{...}
typesets the content in (inline) math mode using \KaTeX; note that math mode is idempotent in Forester
##{...}
typesets the content in (display) math mode using \KaTeX
\transclude{xxx-NNNN}
transcludes the tree at address xxx-NNNN as a subsection
[title](address)
formats the text title as a hyperlink to address address; if address is the address of a tree, the link will point to that tree, and otherwise it is treated as a URL
\let\ident[x][y]{body}
defines a local function named \ident with two arguments; subsequently, the expression \ident{u}{v} would expand to body with the values of u,v substituted for \x,\y.
\code{...}
typesets the content in monospace
\tex{...}
typesets the value of the body externally using \LaTeX, taking account of any \texpackage{pkgname} declarations in the frontmatter
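The following sketch combines several of the constructs summarized above in one small tree. The address xxx-0002 and the macro name \Tensor are hypothetical, the % comment syntax is an assumption, and the placement of display math outside a paragraph is a guess rather than documented behaviour.

```text
% Hypothetical tree; xxx-0002 stands for the address of another tree in your forest.
\title{a sampler of mainmatter syntax}
\date{2023-08-16}
\let\Tensor[x][y]{#{{\x}\otimes{\y}}}
\p{We write \Tensor{A}{B} for the \em{tensor product} of #{A} and #{B}; see [a related tree](xxx-0002) or [an external page](https://www.jonmsterling.com/) for \strong{background}.}
##{{A}\otimes{B}\to{B}\otimes{A}}
\transclude{xxx-0002}
```

Unlike \def, the \let binding is local, so \Tensor is not visible to trees that import this one.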
895Examplejms-007Qjms-007Q.xmlA complete worked example tree in Forester2023816Jon SterlingAn example of a complete tree in the Forester markup language can be seen below.
\title{creation of (co)limits}
\date{2023-02-11}
\taxon{definition}
\author{jonmsterling}
\def\CCat{#{\mathcal{C}}}
\def\DCat{#{\mathcal{D}}}
\def\ICat{#{\mathcal{I}}}
\def\Mor[arg1][arg2][arg3]{#{{\arg2}\xrightarrow{\arg1}{\arg3}}}
\p{Let \Mor{U}{\CCat}{\DCat} be a functor and let \ICat be a category. The functor #{U} is said to \em{create (co)limits of #{\ICat}-figures} when for any diagram \Mor{C_\bullet}{\ICat}{\CCat} such that #{\ICat\xrightarrow{C_\bullet}\CCat\xrightarrow{U}\DCat} has a (co)limit, then #{C_\bullet} has a (co)limit that is both preserved and reflected by #{U}.}
The code above results in the following tree:
893Definitionjms-001Hjms-001H.xmlCreation of (co)limits2023211Jon SterlingLet { \mathcal {C} } \xrightarrow {{ U }}{ \mathcal {D} } be a functor and let \mathcal {I} be a category. The functor U is said to create (co)limits of \mathcal {I}-figures when for any diagram { \mathcal {I} } \xrightarrow {{ C_ \bullet }}{ \mathcal {C} } such that \mathcal {I} \xrightarrow {C_ \bullet } \mathcal {C} \xrightarrow {U} \mathcal {D} has a (co)limit, then C_ \bullet has a (co)limit that is both preserved and reflected by U.
899jms-007Rjms-007R.xmlDeploying your forest to a web host2023816Jon SterlingNow that you have created your forest and added a few trees of your own, it is time to upload it to your web host. Many users of Forester will have university-supplied static web hosting, and others may prefer to use GitHub Pages; deploying a forest works the same way in either case.
First, make sure your forest is built using the earlier instructions.
Then take the entire contents of your output directory and upload them to your preferred web host.
901jms-007Sjms-007S.xmlLet a hundred forests bloom!2023816Jon SterlingI am eager to see the new forests that people create using Forester. I am happy to offer personal assistance via the mailing list.
Many aspects of Forester are in flux and not fully documented; it will often be instructive to consult the source of existing forests, such as my own. Please be aware that my own forest tends to be using the latest (often unreleased) version of Forester, so there may be slight incompatibilities.
Have fun, and be sure to send me links to your forests when you have made them!
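The deployment steps described earlier (build, then upload the contents of the output directory) can be sketched as a small shell helper. The function name publish_forest is hypothetical and not part of Forester, and the sketch assumes your web host serves files from a directory you can write to locally, such as a mounted public folder or a checkout of a pages repository; if your host needs a remote upload tool instead, substitute it for the cp step.

```shell
# publish_forest SRC DEST: copy a built forest from SRC into DEST.
# Hypothetical helper for illustration only; it is not a Forester command.
publish_forest() (
  src=$1    # Forester's output directory, e.g. output
  dest=$2   # directory served by your web host (an assumption)
  # Refuse to publish if the forest has not been built yet.
  if [ ! -f "$src/index.xml" ]; then
    echo "publish_forest: $src/index.xml not found; build the forest first" >&2
    exit 1
  fi
  mkdir -p "$dest"
  # The trailing /. copies the contents of SRC rather than SRC itself.
  cp -R "$src"/. "$dest"/
)
```

A typical session would run forester build forest.toml first and then publish_forest output followed by the path of the served directory.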
904☕jms-0088jms-0088.xmlCrowdfunding and sponsorshipJon SterlingApart from my day-job at the University of Cambridge, I am independently researching tools for scientific thought and developing software like Forester that you can use to unlock your brain. If you have benefited from this work or the writings on my blog, please consider supporting me with a sponsorship on Ko-fi.
4661#246unstable-246.xmlPublicationsJon Sterlingjms-00014590Referencesterling-2024-liftingsterling-2024-lifting.xmlTensorial structure of the lifting doctrine in constructive domain theory2023122720244162024426Jon SterlingProceedings of Category Theory at Work in Computational Mathematics and Theoretical Informaticspapers/sterling-2024-lifting.pdf10.48550/arXiv.2312.17023We present a survey of the two-dimensional and tensorial structure of the lifting doctrine in constructive domain theory, i.e. in the theory of directed-complete partial orders (dcpos) over an arbitrary elementary topos. We establish the universal property of lifting of dcpos as the Sierpiński cone, from which we deduce (1) that lifting forms a Kock–Zöberlein doctrine, (2) that lifting algebras, pointed dcpos, and inductive partial orders form canonically equivalent locally posetal 2-categories, and (3) that the category of lifting algebras is cocomplete, with connected colimits created by the forgetful functor to dcpos. Finally we deduce the symmetric monoidal closure of the Eilenberg–Moore resolution of the lifting 2-monad by means of smash products; these are shown to classify both bilinear maps and strict maps, which we prove to coincide in the constructive setting. We provide several concrete computations of the smash product as dcpo coequalisers and lifting algebra coequalisers, and compare these with the more abstract results of Seal. 
Although all these results are well-known classically, the existing proofs do not apply in a constructive setting; indeed, the classical analysis of the Eilenberg–Moore category of the lifting monad relies on the fact that all lifting algebras are free, a condition that is not known to hold constructively.4592Referenceniu-sterling-harper-2024niu-sterling-harper-2024.xmlCost-sensitive computational adequacy of higher-order recursion in synthetic domain theory2024329Yue NiuJon SterlingRobert Harper10.48550/arXiv.2404.00212MFPS ’24: 40th International Conference on Mathematical Foundations of Programming SemanticsWe study a cost-aware programming language for higher-order recursion dubbed PCFcost in the setting of synthetic domain theory (SDT). Our main contribution relates the denotational cost semantics of PCFcost to its computational cost semantics, a new kind of dynamic semantics for program execution that serves as a mathematically natural alternative to operational semantics in SDT. In particular we prove an internal, cost-sensitive version of Plotkin’s computational adequacy theorem, giving a precise correspondence between the denotational and computational semantics for complete programs at base type. The constructions and proofs of this paper take place in the internal dependent type theory of an SDT topos extended by a phase distinction in the sense of Sterling and Harper. By controlling the interpretation of cost structure via the phase distinction in the denotational semantics, we show that PCFcost programs also evince a noninterference property of cost and behavior. 
We verify the axioms of the type theory by means of a model construction based on relative sheaf models of SDT.4596Referencesterling-gratzer-birkedal-2024-univalentsterling-gratzer-birkedal-2024-univalent.xmlTowards univalent reference types202427Jon SterlingDaniel GratzerLars Birkedal10.4230/LIPIcs.CSL.2024.47CSL ’24: 32nd EACSL Annual Conference on Computer Science Logic 2024@inproceedings{sterling-gratzer-birkedal-2024-univalent,
author = {Sterling, Jonathan and Gratzer, Daniel and Birkedal, Lars},
title = {{Towards Univalent Reference Types: The Impact of Univalence on Denotational Semantics}},
booktitle = {32nd EACSL Annual Conference on Computer Science Logic (CSL 2024)},
pages = {47:1--47:21},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-310-2},
ISSN = {1868-8969},
year = {2024},
volume = {288},
editor = {Murano, Aniello and Silva, Alexandra},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
doi = {10.4230/LIPIcs.CSL.2024.47},
}We develop a denotational semantics for general reference types in an impredicative version of guarded homotopy type theory, an adaptation of synthetic guarded domain theory to Voevodsky’s univalent foundations. We observe for the first time the profound impact of univalence on the denotational semantics of mutable state. Univalence automatically ensures that all computations are invariant under symmetries of the heap—a bountiful source of program equivalences. In particular, even the most simplistic univalent model enjoys many new program equivalences that do not hold when the same constructions are carried out in the universes of traditional set-level (extensional) type theory.4600Referencegrodin-niu-sterling-harper-2024grodin-niu-sterling-harper-2024.xml decalf: a directed, effectful cost-aware logical framework202415Harrison GrodinYue NiuJon SterlingRobert HarperPOPL ’24: 51st ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3632852https://arxiv.org/abs/2307.05938@article{grodin-niu-sterling-harper-2024,
author = {Grodin, Harrison and Niu, Yue and Sterling, Jonathan and Harper, Robert},
title = {Decalf: A Directed, Effectful Cost-Aware Logical Framework},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {8},
number = {POPL},
doi = {10.1145/3632852},
journal = {Proc. ACM Program. Lang.},
month = {jan},
articleno = {10},
numpages = {29},
}We present decalf, a directed, effectful cost-aware logical framework for studying quantitative aspects of functional programs with effects. Like calf, the language is based on a formal phase distinction between the extension and the intension of a program, its pure behavior as distinct from its cost measured by an effectful step-counting primitive. The type theory ensures that the behavior is unaffected by the cost accounting. Unlike calf, the present language takes account of effects, such as probabilistic choice and mutable state; this extension requires a reformulation of calf’s approach to cost accounting: rather than rely on a “separable” notion of cost, here a cost bound is simply another program. To make this formal, we equip every type with an intrinsic preorder, relaxing the precise cost accounting intrinsic to a program to a looser but nevertheless informative estimate. For example, the cost bound of a probabilistic program is itself a probabilistic program that specifies the distribution of costs. This approach serves as a streamlined alternative to the standard method of isolating a recurrence that bounds the cost in a manner that readily extends to higher-order, effectful programs.The development proceeds by first introducing the decalf type system, which is based on an intrinsic ordering among terms that restricts in the extensional phase to extensional equality, but in the intensional phase reflects an approximation of the cost of a program of interest. This formulation is then applied to a number of illustrative examples, including pure and effectful sorting algorithms, simple probabilistic programs, and higher-order functions. 
Finally, we justify decalf via a model in the topos of augmented simplicial sets.4607Referencesieczkowski-stepanenko-sterling-birkedal-2024sieczkowski-stepanenko-sterling-birkedal-2024.xmlThe essence of generalized algebraic data types202415Filip SieczkowskiSergei StepanenkoJon SterlingLars BirkedalPOPL ’24: 51st ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3632866@article{sieczkowski-stepanenko-sterling-birkedal-2024,
author = {Sieczkowski, Filip and Stepanenko, Sergei and Sterling, Jonathan and Birkedal, Lars},
title = {The Essence of Generalized Algebraic Data Types},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {8},
number = {POPL},
doi = {10.1145/3632866},
journal = {Proc. ACM Program. Lang.},
month = {jan},
articleno = {24},
numpages = {29},
}This paper considers direct encodings of generalized algebraic data types (GADTs) in a minimal suitable lambda-calculus. To this end, we develop an extension of System Fω with recursive types and internalized type equalities with injective constant type constructors. We show how GADTs and associated pattern-matching constructs can be directly expressed in the calculus, thus showing that it may be treated as a highly idealized modern functional programming language. We prove that the internalized type equalities in conjunction with injectivity rules increase the expressive power of the calculus by establishing a non-macro-expressibility result in Fω, and prove the system type-sound via a syntactic argument. Finally, we build two relational models of our calculus: a simple, unary model that illustrates a novel, two-stage interpretation technique, necessary to account for the equational constraints; and a more sophisticated, binary model that relaxes the construction to allow, for the first time, formal reasoning about data-abstraction in a calculus equipped with GADTs.4612Referenceaagaard-sterling-birkedal-2023aagaard-sterling-birkedal-2023.xmlA denotationally-based program logic for higher-order store20231123Frederik Lerbjerg AagaardJon SterlingLars Birkedal10.46298/entics.1223239th International Conference on Mathematical Foundations of Programming SemanticsSeparation logic is used to reason locally about stateful programs. State of the art program logics for higher-order store are usually built on top of untyped operational semantics, in part because traditional denotational methods have struggled to simultaneously account for general references and parametric polymorphism. 
The recent discovery of simple denotational semantics for general references and polymorphism in synthetic guarded domain theory has enabled us to develop Tulip, a higher-order separation logic over the typed equational theory of higher-order store for a monadic version of System \textbf {F}^{ \mu , \textit {ref}}. The Tulip logic differs from operationally-based program logics in two ways: predicates range over the meanings of typed terms rather than over the raw code of untyped terms, and they are automatically invariant under the equational congruence of higher-order store, which applies even underneath a binder. As a result, “pure” proof steps that conventionally require focusing the Hoare triple on an operational redex are replaced by a simple equational rewrite in Tulip. We have evaluated Tulip against standard examples involving linked lists in the heap, comparing our abstract equational reasoning with more familiar operational-style reasoning. Our main result is the soundness of Tulip, which we establish by constructing a BI-hyperdoctrine over the denotational semantics of \textbf {F}^{ \mu , \textit {ref}} in an impredicative version of synthetic guarded domain theory.4616Referencesterling-2023-grothendiecksterling-2023-grothendieck.xmlTowards a geometry for syntax2023928Jon Sterling10.48550/arXiv.2307.09497Invited contribution to the proceedings of the Chapman Grothendieck Conference, to appearIt often happens that free algebras for a given theory satisfy useful reasoning principles that are not preserved under homomorphisms of algebras, and hence need not hold in an arbitrary algebra. For instance, if M is the free monoid on a set A, then the scalar multiplication function A \times M \to M is injective. 
Therefore, when reasoning in the formal theory of monoids under A, it is possible to use this injectivity law to make sound deductions even about monoids under A for which scalar multiplication is not injective — a principle known in algebra as the permanence of identity. Properties of this kind are of fundamental practical importance to the logicians and computer scientists who design and implement computerized proof assistants like Lean and Coq, as they enable the formal reductions of equational problems that make type checking tractable.As type theories have become increasingly more sophisticated, it has become more and more difficult to establish the useful properties of their free models that facilitate effective implementation. These obstructions have facilitated a fruitful return to foundational work in type theory, which has taken on a more geometrical flavor than ever before. Here we expose a modern way to prove a highly non-trivial injectivity law for free models of Martin-Löf type theory, paying special attention to the ways that contemporary methods in type theory have been influenced by three important ideas of the Grothendieck school: the relative point of view, the language of universes, and the recollement of generalized spaces.4618Referencesterling-2023-genericsterling-2023-generic.xmlWhat should a generic object be?2023425Jon Sterling@article{sterling-2023-generic,
author = {Sterling, Jonathan},
publisher = {Cambridge University Press},
date = {2023},
doi = {10.1017/S0960129523000117},
journaltitle = {Mathematical Structures in Computer Science},
pages = {1--22},
title = {What should a generic object be?},
}10.1017/S0960129523000117Mathematical Structures in Computer ScienceJacobs has proposed definitions for (weak, strong, split) generic objects for a fibered category; building on his definition of (split) generic objects, Jacobs develops a menagerie of important fibrational structures with applications to categorical logic and computer science, including higher order fibrations, polymorphic fibrations, 𝜆2-fibrations, triposes, and others. We observe that a split generic object need not in particular be a generic object under the given definitions, and that the definitions of polymorphic fibrations, triposes, etc. are strict enough to rule out some fundamental examples: for instance, the fibered preorder induced by a partial combinatory algebra in realizability is not a tripos in this sense. We propose a new alignment of terminology that emphasizes the forms of generic object appearing most commonly in nature, i.e. in the study of internal categories, triposes, and the denotational semantics of polymorphism. In addition, we propose a new class of acyclic generic objects inspired by recent developments in higher category theory and the semantics of homotopy type theory, generalizing the realignment property of universes to the setting of an arbitrary fibration.4620Referencepalombi-sterling-2023palombi-sterling-2023.xmlClassifying topoi in synthetic guarded domain theory: the universal property of multi-clock guarded recursion2023222Daniele PalombiJon Sterling@inproceedings{palombi-sterling-2023,
author = {Palombi, Daniele and Sterling, Jonathan},
booktitle = {Proceedings 38th Conference on Mathematical Foundations of Programming Semantics, {MFPS} 2022},
year = {2023},
month = feb,
title = {Classifying topoi in synthetic guarded domain theory},
doi = {10.46298/entics.10323},
}10.46298/entics.1032338th International Conference on Mathematical Foundations of Programming SemanticsSeveral different topoi have played an important role in the development and applications of synthetic guarded domain theory (SGDT), a new kind of synthetic domain theory that abstracts the concept of guarded recursion frequently employed in the semantics of programming languages. In order to unify the accounts of guarded recursion and coinduction, several authors have enriched SGDT with multiple “clocks” parameterizing different time-streams, leading to more complex and difficult to understand topos models. Until now these topoi have been understood very concretely qua categories of presheaves, and the logico-geometrical question of what theories these topoi classify has remained open. We show that several important topos models of SGDT classify very simple geometric theories, and that the passage to various forms of multi-clock guarded recursion can be rephrased more compositionally in terms of the lower bagtopos construction of Vickers and variations thereon due to Johnstone. We contribute to the consolidation of SGDT by isolating the universal property of multi-clock guarded recursion as a modular construction that applies to any topos model of single-clock guarded recursion.4623Referencesterling-angiuli-gratzer-2022sterling-angiuli-gratzer-2022.xmlA cubical language for Bishop sets202229Jon SterlingCarlo AngiuliDaniel Gratzer@article{sterling-angiuli-gratzer-2022,
author = {Sterling, Jonathan and Angiuli, Carlo and Gratzer, Daniel},
year = {2022},
month = mar,
doi = {10.46298/lmcs-18(1:43)2022},
eprint = {2003.01491},
eprintclass = {cs.LO},
eprinttype = {arXiv},
issue = {1},
journal = {Logical Methods in Computer Science},
title = {{A Cubical Language for Bishop Sets}},
volume = {18},
}10.46298/lmcs-18(1:43)2022Logical Methods in Computer ScienceWe present XTT, a version of Cartesian cubical type theory specialized for Bishop sets à la Coquand, in which every type enjoys a definitional version of the uniqueness of identity proofs. Using cubical notions, XTT reconstructs many of the ideas underlying Observational Type Theory, a version of intensional type theory that supports function extensionality. We prove the canonicity property of XTT (that every closed boolean is definitionally equal to a constant) by Artin gluing.4627Referenceniu-sterling-grodin-harper-2022niu-sterling-grodin-harper-2022.xmlA cost-aware logical framework202211Yue NiuJon SterlingHarrison GrodinRobert HarperProceedings of the ACM on Programming Languages, Volume 6, Issue POPL10.1145/3498670We present calf, a cost-aware logical framework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account of phase distinctions, we argue that the cost structure of programs motivates a phase distinction between intension and extension. Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which cost-aware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a full-spectrum dependent type theory, calf presents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants.We evaluate calf as a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: the method of recurrence relations and physicist’s method for amortized analysis. 
We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential and parallel complexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly we substantiate the soundness of quantitative reasoning in calf by means of a model construction.4636Referencesterling-harper-2022sterling-harper-2022.xmlSheaf semantics of termination-insensitive noninterference2022Jon SterlingRobert Harper10.4230/LIPIcs.FSCD.2022.5papers/sterling-harper-2022.pdf7th International Conference on Formal Structures for Computation and Deduction (FSCD 2022)We propose a new sheaf semantics for secure information flow over a space of abstract behaviors, based on synthetic domain theory: security classes are open/closed partitions, types are sheaves, and redaction of sensitive information corresponds to restricting a sheaf to a closed subspace. Our security-aware computational model satisfies termination-insensitive noninterference automatically, and therefore constitutes an intrinsic alternative to state of the art extrinsic/relational models of noninterference. Our semantics is the latest application of Sterling and Harper’s recent re-interpretation of phase distinctions and noninterference in programming languages in terms of Artin gluing and topos-theoretic open/closed modalities. Prior applications include parametricity for ML modules, the proof of normalization for cubical type theory by Sterling and Angiuli, and the cost-aware logical framework of Niu et al. 
In this paper we employ the phase distinction perspective twice: first to reconstruct the syntax and semantics of secure information flow as a lattice of phase distinctions between “higher” and “lower” security, and second to verify the computational adequacy of our sheaf semantics with respect to a version of Abadi et al.’s dependency core calculus to which we have added a construct for declassifying termination channels.4632Erratumjms-005Yjms-005Y.xmlMinor mistakes in sheaf semantics of noninterference2023Jon SterlingIn the published version of this paper, there were a few mistakes that have been corrected in the local copy hosted here.In the Critique of relational semantics for information flow, our discussion of the Failure of monotonicity stated incorrectly that algebras for the sealing monad at a higher security level could not be transformed into algebras for the sealing monad at a lower security level in the semantics of Abadi et al. This is not true, as pointed out to us privately by Carlos Tomé Cortiñas. What we meant to say was that it is not the case that a type whose component at a high security level is trivial shall always remain trivial at a lower security level.
In the original version of the extended edition of this paper, we claimed that the constructive existence of tensor products on pointed dcpos was obvious; in fact, tensor products do exist, but their construction involves a reflexive coequalizer of pointed dcpos.4634Erratumjms-005Zjms-005Z.xmlAdequacy of sheaf semantics of noninterference2023717Jon SterlingA serious (and as-yet unfixed) problem was discovered in July of 2023 by Yue Niu, which undermines the proof of adequacy given; in particular, the proof that the logical relation on free algebras is admissible is not correct. I believe there is a different proof of adequacy for the calculus described, but it will have a different structure from what currently appears in the paper. We thank Yue Niu for his attention to detail and careful reading of this paper.4639Referencesterling-harper-2021sterling-harper-2021.xmlLogical relations as types: proof-relevant parametricity for program modules2021121Jon SterlingRobert Harperpapers/sterling-harper-2021.pdfJournal of the ACM, Volume 68, Issue 610.1145/3474834The theory of program modules is of interest to language designers not only for its practical importance to programming, but also because it lies at the nexus of three fundamental concerns in language design: the phase distinction, computational effects, and type abstraction. We contribute a fresh “synthetic” take on program modules that treats modules as the fundamental constructs, in which the usual suspects of prior module calculi (kinds, constructors, dynamic programs) are rendered as derived notions in terms of a modal type-theoretic account of the phase distinction. 
We simplify the account of type abstraction (embodied in the generativity of module functors) through a lax modality that encapsulates computational effects, placing projectibility of module expressions on a type-theoretic basis.Our main result is a (significant) proof-relevant and phase-sensitive generalization of the Reynolds abstraction theorem for a calculus of program modules, based on a new kind of logical relation called a parametricity structure. Parametricity structures generalize the proof-irrelevant relations of classical parametricity to proof-relevant families, where there may be non-trivial evidence witnessing the relatedness of two programs—simplifying the metatheory of strong sums over the collection of types, for although there can be no “relation classifying relations,” one easily accommodates a “family classifying small families.”Using the insight that logical relations/parametricity is itself a form of phase distinction between the syntactic and the semantic, we contribute a new synthetic approach to phase separated parametricity based on the slogan logical relations as types, by iterating our modal account of the phase distinction. We axiomatize a dependent type theory of parametricity structures using two pairs of complementary modalities (syntactic, semantic) and (static, dynamic), substantiated using the topos theoretic Artin gluing construction. Then, to construct a simulation between two implementations of an abstract type, one simply programs a third implementation whose type component carries the representation invariant.1604Erratumjms-0060jms-0060.xmlMinor mistakes in logical relations as types2021Jon SterlingAfter going to press, we have fixed the following mistakes:In the definition of a logos, we mistakenly said that "colimits commute with finite limits" but we meant to say that they are preserved by pullback. We thank Sarah Z. Rovner-Frydman for noticing this mistake.
In Remark 5.15, we used the notation for the closed immersion prior to introducing it.
We have fixed a few broken links in the bibliography.The local copy hosted here has the corrections implemented.4642Referencesterling-angiuli-2021sterling-angiuli-2021.xmlNormalization for cubical type theory202177Jon SterlingCarlo Angiuli2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)10.1109/LICS52264.2021.9470719We prove normalization for (univalent, Cartesian) cubical type theory, closing the last major open problem in the syntactic metatheory of cubical type theory. Our normalization result is reduction-free, in the sense of yielding a bijection between equivalence classes of terms in context and a tractable language of \beta/\eta-normal forms. As corollaries we obtain both decidability of judgmental equality and the injectivity of type constructors.4645Referencesterling-2021-bhfssterling-2021-bhfs.xmlHigher order functions and Brouwer’s Thesis2021519Jon Sterling@article{sterling-2021-bhfs,
author = {Sterling, Jonathan},
publisher = {Cambridge University Press},
date = {2021},
doi = {10.1017/S0956796821000095},
eprint = {1608.03814},
eprintclass = {math.LO},
eprinttype = {arXiv},
journaltitle = {Journal of Functional Programming},
note = {\emph{Bob Harper Festschrift Collection}},
pages = {e11},
title = {Higher order functions and Brouwer's thesis},
volume = {31},
}http://www.jonmsterling.com/agda-effectful-forcing/index.html10.1017/S0956796821000095Journal of Functional Programming, Bob Harper Festschrift CollectionExtending Martín Hötzel Escardó’s effectful forcing technique, we give a new proof of a well-known result: Brouwer’s monotone bar theorem holds for any bar that can be realized by a functional of type { \mathopen {} \left ( \mathbb {N} \to \mathbb {N} \right ) \mathclose {}} \to \mathbb {N} in Gödel’s System T. Effectful forcing is an elementary alternative to standard sheaf-theoretic forcing arguments, using ideas from programming languages, including computational effects, monads, the algebra interpretation of call-by-name λ-calculus, and logical relations. Our argument proceeds by interpreting System T programs as well-founded dialogue trees whose nodes branch on a query to an oracle of type \mathbb {N} \to \mathbb {N}, lifted to higher type along a call-by-name translation. To connect this interpretation to the bar theorem, we then show that Brouwer’s famous "mental constructions" of barhood constitute an invariant form of these dialogue trees in which queries to the oracle are made maximally and in order.4647Referencesterling-angiuli-gratzer-2019sterling-angiuli-gratzer-2019.xmlCubical syntax for reflection-free extensional equality2019Jon SterlingCarlo AngiuliDaniel Gratzer@inproceedings{sterling-angiuli-gratzer-2019,
author = {Sterling, Jonathan and Angiuli, Carlo and Gratzer, Daniel},
editor = {Geuvers, Herman},
location = {Dagstuhl, Germany},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
url = {http://drops.dagstuhl.de/opus/volltexte/2019/10538},
booktitle = {Proceedings of the 4th International Conference on Formal Structures for Computation and Deduction (FSCD 2019)},
date = {2019},
doi = {10.4230/LIPIcs.FSCD.2019.31},
eprint = {1904.08562},
eprinttype = {arXiv},
isbn = {978-3-95977-107-8},
issn = {1868-8969},
pages = {31:1--31:25},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
title = {Cubical Syntax for Reflection-Free Extensional Equality},
volume = {131},
}slides/sterling-angiuli-gratzer-2019.pdf10.4230/LIPIcs.FSCD.2019.31FSCD ’19: International Conference on Formal Structures for Computation and DeductionWe contribute XTT, a cubical reconstruction of Observational Type Theory [Altenkirch et al., 2007] which extends Martin-Löf's intensional type theory with a dependent equality type that enjoys function extensionality and a judgmental version of the unicity of identity proofs principle (UIP): any two elements of the same equality type are judgmentally equal. Moreover, we conjecture that the typing relation can be decided in a practical way. In this paper, we establish an algebraic canonicity theorem using a novel extension of the logical families or categorical gluing argument inspired by Coquand and Shulman: every closed element of boolean type is derivably equal to either true or false.4651Referencegratzer-sterling-birkedal-2019gratzer-sterling-birkedal-2019.xmlImplementing a modal dependent type theory2019Daniel GratzerJon SterlingLars Birkedal@article{gratzer-sterling-birkedal-2019,
author = {Gratzer, Daniel and Sterling, Jonathan and Birkedal, Lars},
location = {New York, NY, USA},
publisher = {ACM},
date = {2019-07},
doi = {10.1145/3341711},
issn = {2475-1421},
journaltitle = {Proceedings of the ACM on Programming Languages},
keywords = {Modal types,dependent types,normalization by evaluation,type-checking},
number = {ICFP},
pages = {107:1--107:29},
title = {Implementing a Modal Dependent Type Theory},
volume = {3},
}10.1145/3341711ICFP ’19: The 24th ACM SIGPLAN International Conference on Functional ProgrammingModalities are everywhere in programming and mathematics! Despite this, however, there are still significant technical challenges in formulating a core dependent type theory with modalities. We present a dependent type theory MLTT🔒 supporting the connectives of standard Martin-Löf Type Theory as well as an S4-style necessity operator. MLTT🔒 supports a smooth interaction between modal and dependent types and provides a common basis for the use of modalities in programming and in synthetic mathematics. We design and prove the soundness and completeness of a type checking algorithm for MLTT🔒, using a novel extension of normalization by evaluation. We have also implemented our algorithm in a prototype proof assistant for MLTT🔒, demonstrating the ease of applying our techniques.4655Referencesterling-harper-2018sterling-harper-2018.xmlGuarded computational type theory2018Jon SterlingRobert Harper@inproceedings{sterling-harper-2018,
author = {Sterling, Jonathan and Harper, Robert},
title = {Guarded Computational Type Theory},
booktitle = {Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science},
series = {LICS '18},
year = {2018},
isbn = {978-1-4503-5583-4},
location = {Oxford, United Kingdom},
pages = {879--888},
numpages = {10},
url = {http://doi.acm.org/10.1145/3209108.3209153},
doi = {10.1145/3209108.3209153},
acmid = {3209153},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {clocks, dependent types, guarded recursion, operational semantics, type theory},
}slides/sterling-harper-2018.pdf10.1145/3209108.3209153LICS ’18: 33rd Annual ACM/IEEE Symposium on Logic in Computer ScienceNakano’s later modality can be used to specify and define recursive functions which are causal or synchronous; in concert with a notion of clock variable, it is possible to also capture the broader class of productive (co)programs. Until now, it has been difficult to combine these constructs with dependent types in a way that preserves the operational meaning of type theory and admits a hierarchy of universes. We present an operational account of guarded dependent type theory with clocks called Guarded Computational Type Theory, featuring a novel clock intersection connective that enjoys the clock irrelevance principle, as well as a predicative hierarchy of universes which does not require any indexing in clock contexts. Guarded Computational Type Theory is simultaneously a programming language with a rich specification logic, as well as a computational metalanguage that can be used to develop semantics of other languages and logics.4658Referencevalentine-sterling-2016valentine-sterling-2016.xmlDependent types for pragmatics2016Rebecca ValentineJon Sterling@inbook{valentine-sterling-2016,
author={Valentine, Rebecca and Sterling, Jonathan},
editor={Redmond, Juan and Pombo Martins, Olga and Nepomuceno Fern{\'a}ndez, {\'A}ngel},
title={Dependent Types for Pragmatics},
booktitle={Epistemology, Knowledge and the Impact of Interaction},
year={2016},
publisher={Springer International Publishing},
address={Cham},
pages={123--139},
isbn={978-3-319-26506-3},
doi={10.1007/978-3-319-26506-3_4},
}10.1007/978-3-319-26506-3_4In: Redmond J., Pombo Martins O., Nepomuceno Fernández Á. (eds) Epistemology, Knowledge and the Impact of Interaction. Logic, Epistemology, and the Unity of Science, vol 38. Springer, Cham.In this paper, we present an extension to Martin-Löf’s Intuitionistic Type Theory which gives natural solutions to problems in pragmatics, such as pronominal reference and presupposition. Our approach also gives a simple account of donkey anaphora without resorting to exotic scope extension of the sort used in Discourse Representation Theory and Dynamic Semantics, thanks to the proof-relevant nature of type theory.4691#247unstable-247.xmlPreprintsJon Sterlingjms-00014663Referencegratzer-shulman-sterling-2024-universesgratzer-shulman-sterling-2024-universes.xmlStrict universes for Grothendieck topoi20222242024516Daniel GratzerMike ShulmanJon Sterling@unpublished{gratzer-shulman-sterling-2024-universes,
author = {Gratzer, Daniel and Shulman, Michael and Sterling, Jonathan},
year = {2024},
month = may,
doi = {10.48550/arXiv.2202.12012},
eprint = {2202.12012},
eprintclass = {math.CT},
eprinttype = {arXiv},
note = {Unpublished manuscript},
title = {Strict universes for Grothendieck topoi},
}10.48550/arXiv.2202.12012Hofmann and Streicher famously showed how to lift Grothendieck universes into presheaf topoi, and Streicher has extended their result to the case of sheaf topoi by sheafification. In parallel, van den Berg and Moerdijk have shown in the context of algebraic set theory that similar constructions continue to apply even in weaker metatheories. Unfortunately, sheafification seems not to preserve an important realignment property enjoyed by presheaf universes that plays a critical role in models of univalent type theory as well as synthetic Tait computability. When multiple universes are present, realignment also implies a coherent interpretation of connectives across all universes that justifies the cumulativity laws present in popular formulations of Martin-Löf type theory.We observe that a slight adjustment to an argument of Shulman constructs a cumulative universe hierarchy satisfying the realignment property at every level in any Grothendieck topos. Hence one has direct-style interpretations of Martin-Löf type theory with cumulative universes into all Grothendieck topoi. A further implication is to extend the reach of recent synthetic methods in the semantics of cubical type theory and the syntactic metatheory of type theory and programming languages to all Grothendieck topoi.4667Referencesterling-2024-liftingsterling-2024-lifting.xmlTensorial structure of the lifting doctrine in constructive domain theory2023122720244162024426Jon SterlingProceedings of Category Theory at Work in Computational Mathematics and Theoretical Informaticspapers/sterling-2024-lifting.pdf10.48550/arXiv.2312.17023We present a survey of the two-dimensional and tensorial structure of the lifting doctrine in constructive domain theory, i.e. in the theory of directed-complete partial orders (dcpos) over an arbitrary elementary topos. 
We establish the universal property of lifting of dcpos as the Sierpiński cone, from which we deduce (1) that lifting forms a Kock–Zöberlein doctrine, (2) that lifting algebras, pointed dcpos, and inductive partial orders form canonically equivalent locally posetal 2-categories, and (3) that the category of lifting algebras is cocomplete, with connected colimits created by the forgetful functor to dcpos. Finally we deduce the symmetric monoidal closure of the Eilenberg–Moore resolution of the lifting 2-monad by means of smash products; these are shown to classify both bilinear maps and strict maps, which we prove to coincide in the constructive setting. We provide several concrete computations of the smash product as dcpo coequalisers and lifting algebra coequalisers, and compare these with the more abstract results of Seal. Although all these results are well-known classically, the existing proofs do not apply in a constructive setting; indeed, the classical analysis of the Eilenberg–Moore category of the lifting monad relies on the fact that all lifting algebras are free, a condition that is not known to hold constructively.4669Referencesterling-2024-lensessterling-2024-lenses.xmlReflexive graph lenses in univalent foundations2024330Jon Sterlinghttps://arxiv.org/abs/2404.07854Martin-Löf’s identity types provide a generic (albeit opaque) notion of identification or “equality” between any two elements of the same type, embodied in a canonical reflexive graph structure (=_A, \mathbf {refl}) on any type A. The miracle of Voevodsky’s univalence principle is that it ensures, for essentially any naturally occurring structure in mathematics, that the resultant notion of identification is equivalent to the type of isomorphisms in the category of such structures. 
Characterisations of this kind are not automatic and must be established one-by-one; to this end, several authors have employed reflexive graphs and displayed reflexive graphs to organise the characterisation of identity types. We contribute reflexive graph lenses, a new family of intermediate abstractions lying between families of reflexive graphs and displayed reflexive graphs that simplifies the characterisation of identity types for complex structures. Every reflexive graph lens gives rise to a (more complicated) displayed reflexive graph, and our experience suggests that many naturally occuring displayed reflexive graphs arise in this way. Evidence for the utility of reflexive graph lenses is given by means of several case studies, including the theory of reflexive graphs itself as well as that of polynomial type operators. Finally, we exhibit an equivalance between the type of reflexive graph fibrations and the type of univalent reflexive graph lenses.4671Referenceniu-sterling-harper-2024niu-sterling-harper-2024.xmlCost-sensitive computational adequacy of higher-order recursion in synthetic domain theory2024329Yue NiuJon SterlingRobert Harper10.48550/arXiv.2404.00212MFPS ’24: 40th International Conference on Mathematical Foundations of Programming SemanticsWe study a cost-aware programming language for higher-order recursion dubbed PCFcost in the setting of synthetic domain theory (SDT). Our main contribution relates the denotational cost semantics of PCFcost to its computational cost semantics, a new kind of dynamic semantics for program execution that serves as a mathematically natural alternative to operational semantics in SDT. In particular we prove an internal, cost-sensitive version of Plotkin’s computational adequacy theorem, giving a precise correspondence between the denotational and computational semantics for complete programs at base type. 
The constructions and proofs of this paper take place in the internal dependent type theory of an SDT topos extended by a phase distinction in the sense of Sterling and Harper. By controlling the interpretation of cost structure via the phase distinction in the denotational semantics, we show that PCFcost programs also evince a noninterference property of cost and behavior. We verify the axioms of the type theory by means of a model construction based on relative sheaf models of SDT.4675Referencesterling-2023-grothendiecksterling-2023-grothendieck.xmlTowards a geometry for syntax2023928Jon Sterling10.48550/arXiv.2307.09497Invited contribution to the proceedings of the Chapman Grothendieck Conference, to appearIt often happens that free algebras for a given theory satisfy useful reasoning principles that are not preserved under homomorphisms of algebras, and hence need not hold in an arbitrary algebra. For instance, if M is the free monoid on a set A, then the scalar multiplication function A \times M \to M is injective. Therefore, when reasoning in the formal theory of monoids under A, it is possible to use this injectivity law to make sound deductions even about monoids under A for which scalar multiplication is not injective, a principle known in algebra as the permanence of identity. Properties of this kind are of fundamental practical importance to the logicians and computer scientists who design and implement computerized proof assistants like Lean and Coq, as they enable the formal reductions of equational problems that make type checking tractable. As type theories have become increasingly sophisticated, it has become more and more difficult to establish the useful properties of their free models that facilitate effective implementation. These obstructions have facilitated a fruitful return to foundational work in type theory, which has taken on a more geometrical flavor than ever before. 
Here we expose a modern way to prove a highly non-trivial injectivity law for free models of Martin-Löf type theory, paying special attention to the ways that contemporary methods in type theory have been influenced by three important ideas of the Grothendieck school: the relative point of view, the language of universes, and the recollement of generalized spaces.4677Referencegratzer-sterling-angiuli-coquand-birkedal-2022gratzer-sterling-angiuli-coquand-birkedal-2022.xmlControlling unfolding in type theory20221010Daniel GratzerJon SterlingCarlo AngiuliThierry CoquandLars Birkedal10.48550/arXiv.2210.05420@unpublished{gratzer-sterling-angiuli-coquand-birkedal-2022,
doi = {10.48550/ARXIV.2210.05420},
author = {Gratzer, Daniel and Sterling, Jonathan and Angiuli, Carlo and Coquand, Thierry and Birkedal, Lars},
title = {Controlling unfolding in type theory},
year = {2022},
note = {Unpublished manuscript}
}We present a novel mechanism for controlling the unfolding of definitions in
dependent type theory. Traditionally, proof assistants let users specify
whether each definition can or cannot be unfolded in the remainder of a
development; unfolding definitions is often necessary in order to reason about
them, but an excess of unfolding can result in brittle proofs and intractably
large proof goals. In our system, definitions are by default not unfolded, but
users can selectively unfold them in a local manner. We justify our mechanism
by means of elaboration to a core type theory with extension types, a
connective first introduced in the context of homotopy type theory. We prove a
normalization theorem for our core calculus and have implemented our system in
the cooltt proof assistant, providing both theoretical and practical evidence
for it.4683Referencesterling-gratzer-birkedal-2022sterling-gratzer-birkedal-2022.xmlDenotational semantics of general store and polymorphism2022106Jon SterlingDaniel GratzerLars Birkedal10.48550/arXiv.2210.02169@unpublished{sterling-gratzer-birkedal-2022,
author = {Sterling, Jonathan and Gratzer, Daniel and Birkedal, Lars},
year = {2022},
month = jul,
note = {Unpublished manuscript},
title = {Denotational semantics of general store and polymorphism},
}We contribute the first denotational semantics of polymorphic dependent type theory extended by an equational theory for general (higher-order) reference types and recursive types, based on a combination of guarded recursion and impredicative polymorphism; because our model is based on recursively defined semantic worlds, it is compatible with polymorphism and relational reasoning about stateful abstract datatypes. We then extend our language with modal constructs for proof-relevant relational reasoning based on the logical relations as types principle, in which equivalences between imperative abstract datatypes can be established synthetically. Finally we develop a decomposition of the store model as a general construction that extends an arbitrary polymorphic call-by-push-value adjunction with higher-order store, improving on Levy's possible worlds model construction; what is new in relation to prior typed denotational models of higher-order store is that our Kripke worlds need not be syntactically definable, and are thus compatible with relational reasoning in the heap. Our work combines recent advances in the operational semantics of state with the purely denotational viewpoint of synthetic guarded domain theory.4687Referencesterling-2022-existentialssterling-2022-existentials.xmlReflections on existential types2022101Jon Sterling@article{sterling-2022-existentials,
doi = {10.48550/ARXIV.2210.00758},
author = {Sterling, Jonathan},
title = {Reflections on existential types},
publisher = {arXiv},
year = {2022},
note = {Unpublished manuscript},
}10.48550/arXiv.2210.00758Existential types are reconstructed in terms of small reflective subuniverses and dependent sums. The folklore decomposition detailed here gives rise to a particularly simple account of first class modules as a mode of use of traditional second class modules in connection with the modal operator induced by a reflective subuniverse, leading to a semantic justification for the rules of first-class modules in languages like OCaml and MoscowML. Additionally, we expose several constructions that give rise to semantic models of ML-style programming languages with both first-class modules and realistic computational effects, culminating in a model that accommodates higher-order first class recursive modules and higher-order store.4689Referencesterling-2022-naivesterling-2022-naive.xmlNaïve logical relations in synthetic Tait computability20226Jon Sterling@unpublished{sterling-2022-naive,
author = {Sterling, Jonathan},
year = {2022},
month = jun,
note = {Unpublished manuscript},
title = {Na\"{i}ve logical relations in synthetic {Tait} computability},
}papers/sterling-2022-naive.pdfLogical relations are the main tool for proving positive properties of logics, type theories, and programming languages: canonicity, normalization, decidability, conservativity, computational adequacy, and more. Logical relations combine pure syntax with non-syntactic objects that are parameterized in syntax in a somewhat complex way; the sophistication of possible parameterizations makes logical relations a tool that is primarily accessible to specialists. In the spirit of Halmos' book Naïve Set Theory, I advocate for a new viewpoint on logical relations based on synthetic Tait computability, the internal language of categories of logical relations. In synthetic Tait computability, logical relations are manipulated as if they were sets, making the essence of many complex logical relations arguments accessible to non-specialists.
4707jms-0063jms-0063.xmlStudentsJon Sterlingjms-00014694jms-00DVjms-00DV.xmlMasters-level students2023119jms-00634693Personleonipughleonipugh.xmlLeoni PughPart III StudentUniversity of Cambridgejonmsterling4698jms-00B5jms-00B5.xmlBachelor-level students20231020jms-00634695Personzhiyiliuzhiyiliu.xmlZhiyi LiuUniversity of CambridgeUndergraduate Studentjonmsterling4696Persondanielepalombidanielepalombi.xmlDaniele Palombihttps://dpl0a.github.io/jonmsterlingSapienza University of Rome, 20[ ]0000-0002-8107-54394697Personaoyangyuaoyangyu.xmlAoyang Yujonmsterlinghttps://permui.github.ioZhejiang University4703jms-00B6jms-00B6.xmlPhD thesis committees20231020jms-00634699Personfilipposestinifilipposestini.xmlFilippo SestiniFunctional Software EngineerImandrahttp://www.cs.nott.ac.uk/~psxfs5/0000-0002-8701-56134700Personloïcpujetloïcpujet.xmlLoïc Pujethttps://pujet.fr/Stockholm UniversitySverker Lerheden Postdoctoral FellowMy research interests lie mainly in type theory, proof assistants, homotopy theory and constructive mathematics.4701Personyueniuyueniu.xmlYue NiuPhD StudentrobertharperCarnegie Mellon University0000-0003-4888-6042PhD student of Robert Harper.4702Personwojciechnawrockiwojciechnawrocki.xmlWojciech NawrockiCarnegie Mellon UniversityPhD Studenthttps://voidma.in4705jms-00SOjms-00SO.xmlPostdocsJon Sterlingjms-00634704Personandrewslatteryandrewslattery.xmlAndrew Slatteryhttps://andrewslattery.github.ioPhD Student; Research AssistantUniversity of Leeds; Cambridge Computer Laboratorynicolagambinojonmsterling
4723jms-007Zjms-007Z.xmlTeachingJon SterlingAlejandro AguirreAndrew PittsFrank StajanoLars BirkedalMarcelo Fiorejms-00014714Coursejms-0081jms-0081.xmlDiscrete Mathematics (2023–24)2023Marcelo FioreJon SterlingAndrew PittsFrank Stajanohttps://www.cl.cam.ac.uk/teaching/2324/DiscMath/University of CambridgeThe course aims to introduce the mathematics of discrete structures, showing it as an essential tool for computer science that can be clever and beautiful.Michaelmas term lectured by Marcelo Fiore; Lent term lectured by Jon Sterling.4709jms-00JBjms-00JB.xmlLectures on discrete mathematics2024118Jon SterlingAndrew PittsFrank StajanoMarcelo FioreDiscrete Mathematics (2023–24)3479jms-00I6jms-00I6.xmlLecture 13: relations and matrices2024119Jon SterlingMarcelo Fiore
3355jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3365jms-00J6jms-00J6.xmlBasic definitions2024118Jon SterlingMarcelo Fiore3357Definitionjms-00I7jms-00I7.xmlRelation2024117Jon SterlingMarcelo FioreA (binary) relation R from a set A to a set B, written R \colon A \mathbin { \nrightarrow } B or R \in \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} is defined to be a subset R \subseteq A \times B. We shall typically write a \mathrel {R}b for { \mathopen {} \left ( a,b \right ) \mathclose {}} \in R. More generally, a relation between multiple sets { \mathopen {} \left ( A_i \right ) \mathclose {}} _{i \in I} is defined to be a subset of the cartesian product \prod _{i \in I} A_i.3362Lemmajms-00IMjms-00IM.xmlRelational extensionality2024117Jon SterlingMarcelo FioreLet A and B be two sets, and let R,S \colon A \mathbin { \nrightarrow } B be two relations from A to B. Then we have R=S if and only if \forall {a \in A} \mathpunct {.} \forall {b \in B} \mathpunct {.} a \mathrel {R}b \Longleftrightarrow a \mathrel {S}b.
3360Proof#425unstable-425.xml2024117Jon Sterlingjms-00IM
We recall that a relation from A to B is nothing more than a subset of A \times B. By the axiom of extensionality, two subsets of A \times B are equal if and only if they contain precisely the same elements.
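As an informal illustration of relational extensionality, the following Python sketch models a relation between finite sets as a set of pairs. The particular sets A and B, the relations R and S, and the helper `related` are our own illustrative choices and do not come from the lectures.

```python
# A relation from A to B is modelled as a Python set of (a, b) pairs.
# All names here (A, B, R, S, related) are hypothetical, chosen for illustration.
A = {0, 1, 2}
B = {"x", "y"}

# Two descriptions of what should be the same relation.
R = {(0, "x"), (1, "y")}
S = {(a, b) for a in A for b in B if (a == 0 and b == "x") or (a == 1 and b == "y")}


def related(rel, a, b):
    """Return True when a is related to b by rel."""
    return (a, b) in rel


# Pointwise agreement on every pair of A x B ...
pointwise = all(related(R, a, b) == related(S, a, b) for a in A for b in B)

# ... coincides with equality of the underlying subsets of A x B.
assert pointwise == (R == S)
```

Here equality of Python sets plays the role of the axiom of extensionality: two sets of pairs are equal exactly when they contain the same pairs.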
3384jms-00I8jms-00I8.xmlUses of relations in computer science2024117Jon SterlingMarcelo Fiore3368Examplejms-00I9jms-00I9.xmlRelations in program specification2024117Jon SterlingMarcelo FioreIn the simplest terms, a specification of a program is a relation that describes the possible input/output pairs that can occur. For example, the specification that a given program compute the square root is captured by the relation \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} given by pairs { \mathopen {} \left ( x,y \right ) \mathclose {}} such that x = y^2.3371Examplejms-00IAjms-00IA.xmlRelations in operational semantics2024117Jon SterlingLet E represent the set of states in a machine; then the behavior of this machine is usually described by a pair of relations S \colon E \mathbin { \nrightarrow } E and V \colon E \mathbin { \nrightarrow } { \mathopen {} \left \{ \star \right \} \mathclose {}}, such that e \mathrel {S} e' when it is possible for the machine to transition from state e to e' and such that e \mathrel {V} \star when the machine can halt in state e.3373Examplejms-00IBjms-00IB.xmlRelations in program typing2024117Jon SterlingMarcelo FioreLet E be the set of expressions in a given programming language, and let T be the set of types in that programming language. Then the property of a given program having a certain type forms a relation E \mathbin { \nrightarrow } T.3376Examplejms-00ICjms-00IC.xmlRelations for program equivalence2024117Jon SterlingLet e,e' be two programs of type \tau. We say that e and e' are observationally equivalent when, for any program h \colon \tau \to { \mathopen {} \left ( \right ) \mathclose {}}, h { \mathopen {} \left ( e \right ) \mathclose {}} terminates if and only if h { \mathopen {} \left ( e' \right ) \mathclose {}} terminates. 
If E_ \tau is the set of programs of type \tau, observational equivalence therefore forms a relation E_ \tau \mathbin { \nrightarrow } E_ \tau.3378Examplejms-00IDjms-00ID.xmlNetworks as relations2024117Jon SterlingMarcelo FioreA network is given by a set of nodes N and a relation C \colon N \mathbin { \nrightarrow } N expressing which pairs of nodes are connected.3381Examplejms-00IEjms-00IE.xmlRelations in databases2024117Jon SterlingMarcelo FioreWe now come to an example of a relation between multiple sets: we could define a relation R \subseteq \text {Movies} \times \text {Directors} \times \text {Years} \times \text {People} consisting of tuples { \mathopen {} \left ( m,d,y,p \right ) \mathclose {}} where m is a movie directed by d in year y with p as a cast member.3398jms-00IFjms-00IF.xmlFormal examples of relations2024117Jon SterlingMarcelo Fiore3387Examplejms-00IGjms-00IG.xmlThe empty relation2024117Jon SterlingMarcelo FioreFor any two sets A and B, we may form the empty relation \varnothing \colon A \mathbin { \nrightarrow } B that relates no elements. In other words, \varnothing is the empty subset of A \times B.3390Examplejms-00IHjms-00IH.xmlThe full relation2024117Jon SterlingMarcelo FioreFor any two sets A and B, we may form the full relation { \mathopen {} \left ( A \times B \right ) \mathclose {}} \colon A \mathbin { \nrightarrow } B, also called the total relation, so that a \mathrel { { \mathopen {} \left ( A \times B \right ) \mathclose {}} }b for all a \in A and b \in B. In other words, { \mathopen {} \left ( A \times B \right ) \mathclose {}} is the total subset of A \times B. 3393Examplejms-00IIjms-00II.xmlThe identity relation2024117Jon SterlingMarcelo FioreFor any set A, we can form the identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A, also called the equality relation, which relates each element of A to itself. 
In other words, we have a \mathrel { \mathsf {id}_{ A } } a' if and only if a=a'. We have already seen the square root relation from non-negative reals to reals, which corresponds to a total but many-valued function. We can define an analogous relation below from non-negative integers (naturals) to integers, which will correspond to a partial and many-valued function.3396Examplejms-00IJjms-00IJ.xmlThe integer square root relation2024117Jon SterlingThe square root operation corresponds to a relation R_2 \colon \mathbb {N} \mathbin { \nrightarrow } \mathbb {Z} such that m \mathrel {R_2} n if and only if m = n^2.3407jms-00J5jms-00J5.xmlVisualising relations2024118Jon SterlingMarcelo Fiore3401Notationjms-00IKjms-00IK.xmlInternal diagrams of relations2024117Jon SterlingMarcelo FioreA useful way to visualise a relation between two sets is by means of internal diagrams: each set is depicted as a blob containing its elements, and lines are drawn from the elements of one blob to the elements of the second blob when they are related.In particular, let R \colon \mathbb {N} \mathbin { \nrightarrow } \mathbb {Z} be the following relation: R = { \mathopen {} \left \{ { \mathopen {} \left ( 0,0 \right ) \mathclose {}} , { \mathopen {} \left ( 1,-1 \right ) \mathclose {}} , { \mathopen {} \left ( 0,1 \right ) \mathclose {}} , { \mathopen {} \left ( 1,2 \right ) \mathclose {}} , { \mathopen {} \left ( 1,1 \right ) \mathclose {}} , { \mathopen {} \left ( 2,1 \right ) \mathclose {}} \right \} \mathclose {}} We can depict R by the following internal diagram:
\usepackage{tikz, tikz-cd, mathtools, amssymb, stmaryrd}
\usetikzlibrary{matrix,arrows}
\usetikzlibrary{backgrounds,fit,positioning,calc,shapes}
\usetikzlibrary{decorations.pathreplacing}
\usetikzlibrary{decorations.pathmorphing}
\usetikzlibrary{decorations.markings}
\begin {tikzpicture}
\begin {scope}
\node (l/0) {$0$};
\node [below = .5cm of l/0] (l/1) {$1$};
\node [below = .5cm of l/1] (l/2) {$2$};
\end {scope}
\begin {scope}[shift={(1.5cm,0)}]
\node (r/-1) {$-1$};
\node [below = .5cm of r/-1] (r/0) {$0$};
\node [below = .5cm of r/0] (r/1) {$1$};
\node [below = .5cm of r/1] (r/2) {$2$};
\end {scope}
\draw [thick] (l/0) to (r/0);
\draw [thick] (l/1) to (r/-1);
\draw [thick] (l/0) to (r/1);
\draw [thick] (l/1) to (r/2);
\draw [thick] (l/1) to (r/1);
\draw [thick] (l/2) to (r/1);
\begin {scope}[on background layer]
\node [rectangle, rounded corners=10pt, fill=yellow!20,thick,fit=(l/0)(l/2),inner sep=5pt] {};
\node [rectangle, rounded corners=10pt, fill=red!20,thick,fit=(r/-1)(r/2),inner sep=5pt] {};
\end {scope}
\end {tikzpicture}
3404Exercisejms-00ILjms-00IL.xmlAn internal diagram2024117Jon SterlingMarcelo FioreDraw the internal diagram corresponding to the following relation: \begin {aligned} S& \colon \mathbb {Z} \mathbin { \nrightarrow } \mathbb {Z} \\ S&= { \mathopen {} \left \{ { \mathopen {} \left ( 1,0 \right ) \mathclose {}} , { \mathopen {} \left ( 1,2 \right ) \mathclose {}} , { \mathopen {} \left ( 2,1 \right ) \mathclose {}} , { \mathopen {} \left ( 2,3 \right ) \mathclose {}} \right \} \mathclose {}} \end {aligned} 3423jms-00IQjms-00IQ.xmlRelational composition2024117Jon SterlingMarcelo Fiore3410Definitionjms-00INjms-00IN.xmlRelational composite2024117Jon SterlingMarcelo FioreGiven relations R \colon A \mathbin { \nrightarrow } B and S \colon B \mathbin { \nrightarrow } C, we can define the relational composite S \circ R \colon A \mathbin { \nrightarrow } C in a way that generalises composition of functions. In particular, we define S \circ R to be the following subset of A \times C: \begin {aligned} S \circ R & \colon A \mathbin { \nrightarrow } C \\ S \circ R &= { \mathopen {} \left \{ { \mathopen {} \left ( a,c \right ) \mathclose {}} \in A \times C \mid \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S} c \right \} \mathclose {}} \end {aligned} 3415Examplejms-00IOjms-00IO.xmlNegation invariance of the square root relation2024117Jon SterlingMarcelo FioreRecall the square root relation \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} from the example on relations in program specification, and let \mathsf {neg} \colon \mathbb {R} \mathbin { \nrightarrow } \mathbb {R} be the relation { \mathopen {} \left \{ (x,y) \in \mathbb {R}^2 \mid x = -y \right \} \mathclose {}}. Then the relational composite \mathsf {neg} \circ \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} is equal to \mathsf {sq}.
3413Proof#424unstable-424.xml2024117Jon Sterlingjms-00IO
By relational extensionality, it suffices to check that for any x \in \mathbb {R}_{ \geq 0} and y \in \mathbb {R}, we have x \mathrel { { \mathopen {} \left ( \mathsf {neg} \circ \mathsf {sq} \right ) \mathclose {}} } y if and only if x \mathrel { \mathsf {sq}}y. We compute:
\begin {aligned} x \mathrel { { \mathopen {} \left ( \mathsf {neg} \circ \mathsf {sq} \right ) \mathclose {}} } y & \Longleftrightarrow \exists z \in \mathbb {R} \mathpunct {.} x \mathrel { \mathsf {sq}}z \land z \mathrel { \mathsf {neg}}y \\ & \Longleftrightarrow \exists z \in \mathbb {R} \mathpunct {.} x=z^2 \land z =-y \\ & \Longleftrightarrow x= { \mathopen {} \left ( -y \right ) \mathclose {}} ^2 \\ & \Longleftrightarrow x = y^2 \\ & \Longleftrightarrow x \mathrel { \mathsf {sq}}y \end {aligned}
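The computation above can be replayed on a finite fragment. The following Python sketch is only a sanity check under stated assumptions: the reals are replaced by the integers from -3 to 3, the non-negative reals by their squares, and the names `compose`, `X`, `Y`, `sq` and `neg` are our own.

```python
def compose(S, R):
    """Relational composite S ∘ R: relate a to c when a R b and b S c for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


# Finite stand-ins (an assumption of this sketch, not the relations on the reals).
Y = range(-3, 4)              # plays the role of the reals
X = {y * y for y in Y}        # plays the role of the non-negative reals

sq = {(x, y) for x in X for y in Y if x == y * y}    # x sq y  iff  x = y^2
neg = {(x, y) for x in Y for y in Y if x == -y}      # x neg y iff  x = -y

# Negation invariance on this fragment: neg ∘ sq equals sq.
assert compose(neg, sq) == sq
```

The check succeeds because the chosen range is symmetric about zero; on a range that is not closed under negation the two sides would differ, which reflects the role of z = -y in the proof.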
3420Lemmajms-00IPjms-00IP.xmlAssociativity and unit laws of relational composition2024117Jon SterlingMarcelo FioreRelational composition is associative and has the identity relation as a neutral element.
3418Proof#423unstable-423.xml2024117Jon Sterlingjms-00IP
To prove associativity, we fix relations R \colon A \mathbin { \nrightarrow } B, S \colon B \mathbin { \nrightarrow } C, and T \colon C \mathbin { \nrightarrow } D to prove { \mathopen {} \left ( T \circ S \right ) \mathclose {}} \circ R = T \circ { \mathopen {} \left ( S \circ R \right ) \mathclose {}}. To get started, we compute the intermediate composites:
\begin {aligned} b \mathrel { { \mathopen {} \left ( T \circ S \right ) \mathclose {}} }d & \Longleftrightarrow \exists c \in C \mathpunct {.} b \mathrel {S}c \land c \mathrel {T}d \\ a \mathrel { { \mathopen {} \left ( S \circ R \right ) \mathclose {}} }c & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \end {aligned}
Using the above, we can compute the full composites:
\begin {aligned} a \mathrel { { \mathopen {} \left ( { \mathopen {} \left ( T \circ S \right ) \mathclose {}} \circ R \right ) \mathclose {}} } d & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel { { \mathopen {} \left ( T \circ S \right ) \mathclose {}} } d \\ & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land \exists c \in C \mathpunct {.} b \mathrel {S}c \land c \mathrel {T}d \\ & \Longleftrightarrow \exists b \in B \mathpunct {.} \exists c \in C \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \land c \mathrel {T}d \\ & \Longleftrightarrow \exists c \in C \mathpunct {.} { \mathopen {} \left ( \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \right ) \mathclose {}} \land c \mathrel {T}d \\ & \Longleftrightarrow \exists c \in C \mathpunct {.} a \mathrel { { \mathopen {} \left ( S \circ R \right ) \mathclose {}} }c \land c \mathrel {T}d \\ & \Longleftrightarrow a \mathrel { { \mathopen {} \left ( T \circ { \mathopen {} \left ( S \circ R \right ) \mathclose {}} \right ) \mathclose {}} }d \end {aligned}
For the right and left neutrality, we must prove that R \circ \mathsf {id}_{ A } = R = \mathsf {id}_{ B } \circ R for all R \colon A \mathbin { \nrightarrow } B. We prove only the first law, as the other proof is analogous:
\begin {aligned} a \mathrel { { \mathopen {} \left ( R \circ \mathsf {id}_{ A } \right ) \mathclose {}} } b & \Longleftrightarrow \exists a' \in A \mathpunct {.} a \mathrel { \mathsf {id}_{ A } }a' \land a' \mathrel {R}b \\ & \Longleftrightarrow \exists a' \in A \mathpunct {.} a = a' \land a' \mathrel {R}b \\ & \Longleftrightarrow a \mathrel {R} b \end {aligned}
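The two laws can also be checked exhaustively on small finite sets, which is a useful complement to the proof but of course no substitute for it. In the Python sketch below, the carrier sets and the helpers `compose`, `identity` and `all_relations` are hypothetical names of our own choosing.

```python
from itertools import chain, combinations, product


def compose(S, R):
    """Relational composite S ∘ R: relate a to c when a R b and b S c for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def identity(A):
    """Identity relation on A."""
    return {(a, a) for a in A}


def all_relations(A, B):
    """Every relation from A to B, i.e. every subset of A x B."""
    pairs = list(product(A, B))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1)
    )
    return [set(s) for s in subsets]


# Small carriers chosen for illustration; 16 * 16 * 4 = 1024 triples are checked.
A, B, C, D = [0, 1], ["p", "q"], ["x", "y"], ["u"]

assoc_ok = all(
    compose(compose(T, S), R) == compose(T, compose(S, R))
    for R in all_relations(A, B)
    for S in all_relations(B, C)
    for T in all_relations(C, D)
)

unit_ok = all(
    compose(R, identity(A)) == R == compose(identity(B), R)
    for R in all_relations(A, B)
)

assert assoc_ok and unit_ok
```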
3476jms-00IRjms-00IR.xmlRelations and matrices2024117Jon SterlingMarcelo FioreRelations between finite sets can be desribed in a more computationally friendly way by their tabulation as matrices. In particular, we shall see in that an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over the boolean semiring is precisely the same thing as a relation from { \mathopen {} \left [ m \right ] \mathclose {}} to { \mathopen {} \left [ n \right ] \mathclose {}}, where { \mathopen {} \left [ l \right ] \mathclose {}} = { \mathopen {} \left \{ i \mid 0 \leq i < l \right \} \mathclose {}} is the set of natural numbers strictly smaller than l. Then we will see that relational composition is, under this correspondence, the same as matrix multiplication.3429Definitionjms-00ISjms-00IS.xmlMatrix over a semiring2024117Jon SterlingMarcelo FioreFor natural numbers m and n, an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over a semiring { \mathopen {} \left ( S,0, \oplus ,1, \odot \right ) \mathclose {}} is given by entries M_{i,j} \in S for all i \in { \mathopen {} \left [ m \right ] \mathclose {}} and j \in { \mathopen {} \left [ n \right ] \mathclose {}}. We will write \mathrm {Mat}_{ S } { \mathopen {} \left ( m , m \right ) \mathclose {}} for the set of { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrices.3433Notationjms-00J4jms-00J4.xmlMatrices as tables2024117Jon SterlingMarcelo Fiore{ \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrices can be depicted in tables or grids with rows in the first dimension and columns in the second dimension. 
For example, let M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( 3 , 2 \right ) \mathclose {}} be the matrix over the booleans defined by the following equation: M_{i,j} = \begin {cases} \mathsf {true} & \text {if } \mathsf {parity} { \mathopen {} \left ( i \right ) \mathclose {}} = \mathsf {parity} { \mathopen {} \left ( j \right ) \mathclose {}} \\ \mathsf {false} & \text {otherwise} \end {cases} Then M is depicted by the following table with three rows and two columns: M = \begin {bmatrix} \mathsf {true} & \mathsf {false} \\ \mathsf {false} & \mathsf {true} \\ \mathsf {true} & \mathsf {false} \end {bmatrix} 3437Definitionjms-00ITjms-00IT.xmlThe identity matrix2024117Jon SterlingFor any m \in \mathbb {N}, we define the identity { \mathopen {} \left ( m \times m \right ) \mathclose {}}-matrix over a given semiring S as follows: I^m_{i,j} = \begin {cases} 1& \text {if } i=j \\ 0& \text {otherwise} \end {cases} The identity matrix is sometimes called the diagonal matrix, for reasons that become apparent when visualising it as a table: I^4 = \begin {bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end {bmatrix} 3454jms-00J1jms-00J1.xmlCorrespondence between matrices and finite relations2024117Jon Sterling3440Definitionjms-00IYjms-00IY.xmlThe matrix associated to a finite relation2024117Jon SterlingLet R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} be a relation; for any semiring S, we may form the { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over S associated to R as follows: { \mathopen {} \left ( \operatorname { \underline {mat}} _S{R} \right ) \mathclose {}} _{i \in { \mathopen {} \left [ m \right ] \mathclose {}} ,j \in { \mathopen {} \left [ n \right ] \mathclose {}} } = \begin {cases} 1& \text {if } i \mathrel {R}j \\ 0& \text {otherwise} \end {cases} We have defined a function \operatorname { \underline {mat}} _S 
\colon \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}}.3443Definitionjms-00IZjms-00IZ.xmlThe relation associated to a matrix2024117Jon SterlingLet M be an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over a semiring S. We define the relation associated to M below: \begin {aligned} \operatorname { \underline {rel}} _S{M} & \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} \\ i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _S{M} \right ) \mathclose {}} } j & \Longleftrightarrow M_{i,j} = 1 \end {aligned} We have defined a function \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}}.3447Lemmajms-00J0jms-00J0.xmlA retraction from matrices to finite relations2024117Jon SterlingThe associated matrix function \operatorname { \underline {mat}} _S \colon \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} is a section of the associated relation function \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} for any semiring S.
3445Proof#421unstable-421.xml2024117Jon Sterlingjms-00J0
We must check that \operatorname { \underline {rel}} _S \circ \operatorname { \underline {mat}} _S = \mathsf {id}_{ \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} }. Fixing a relation R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}}, we compute:
\begin {aligned} i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _S { \mathopen {} \left ( \operatorname { \underline {mat}} _SR \right ) \mathclose {}} \right ) \mathclose {}} } j & \Longleftrightarrow { \mathopen {} \left ( \operatorname { \underline {mat}} _SR \right ) \mathclose {}} _{i,j} = 1 \\ & \Longleftrightarrow i \mathrel {R}j \end {aligned}
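The two functions can be sketched in Python. This is only an illustration under stated assumptions: a relation from [m] to [n] is modelled as a set of pairs, a matrix as a list of rows, and the names `mat`, `rel`, `zero` and `one` are our own choices rather than anything fixed by the notes.

```python
# Sketch only: relations are sets of pairs (i, j); matrices are lists of m rows
# of n scalars. The names mat, rel, zero and one are illustrative assumptions.

def mat(R, m, n, zero=False, one=True):
    """Matrix associated to the relation R from [m] to [n]."""
    return [[one if (i, j) in R else zero for j in range(n)] for i in range(m)]

def rel(M, one=True):
    """Relation associated to M: i is related to j exactly when M[i][j] equals 1."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x == one}

# rel after mat is the identity on relations, here over the booleans ...
R = {(0, 0), (1, 1), (2, 0)}
assert rel(mat(R, 3, 2)) == R

# ... and also over the natural numbers, with scalars 0 and 1.
assert rel(mat(R, 3, 2, zero=0, one=1), one=1) == R

# mat after rel need not be the identity: over the natural numbers,
# an entry such as 2 is sent to 0.
M = [[2, 0], [0, 1]]
assert mat(rel(M, one=1), 2, 2, zero=0, one=1) == [[0, 0], [0, 1]]
```

The relation `R` chosen here is the one tabulated by the parity matrix in the earlier example on matrices as tables.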
The other composite \operatorname { \underline {mat}} _S \circ \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} is not in general the identity function, but is (necessarily) an idempotent on the set of matrices over S. We will see that this idempotent, in some sense, measures the degree to which the base semiring S is not boolean.3451Lemmajms-00IXjms-00IX.xmlFinite relations as matrices over the booleans2024117Jon SterlingThe idempotent { \operatorname { \underline {mat}} _ \mathbb {B} \circ \operatorname { \underline {rel}} _ \mathbb {B} \colon \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} } is in fact the identity function on matrices over the boolean semiring \mathbb {B}.
3449Proof#422unstable-422.xml2024117Jon Sterlingjms-00IX
We fix a matrix M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} and compute:
\begin {aligned} { \mathopen {} \left ( \operatorname { \underline {mat}} _ \mathbb {B} { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} \right ) \mathclose {}} _{i,j} &= \begin {cases} 1 & \text {if } i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} } j \\ 0 & \text {otherwise} \end {cases} \\ &= \begin {cases} 1 & \text {if } M_{i,j} = 1 \\ 0 & \text {otherwise} \end {cases} \end {aligned}
Because \mathbb {B} is the boolean semiring, any scalar s \in \mathbb {B} is either 0 or 1. Therefore, we conclude:
{ \mathopen {} \left ( \operatorname { \underline {mat}} _ \mathbb {B} { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} \right ) \mathclose {}} _{i,j} = M_{i,j}
Thus we conclude that an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over the booleans is the same thing as a relation from { \mathopen {} \left [ m \right ] \mathclose {}} to { \mathopen {} \left [ n \right ] \mathclose {}}.3473jms-00IWjms-00IW.xmlMultiplication of matrices2024117Jon SterlingMarcelo Fiore3456Side remarkjms-00IUjms-00IU.xmlMatrix multiplication in geometry2024117Jon SterlingAlthough we do not explore it in this course, the viewpoint of matrix multiplication as relational composition generalises to a correct explanation of the role of matrices in geometry as presentations of linear maps between vector spaces in terms of a (non-canonical) choice of basis.3461Definitionjms-00IVjms-00IV.xmlProduct of matrices2024117Jon SterlingLet M be an { \mathopen {} \left ( l \times m \right ) \mathclose {}}-matrix and let N be an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix. The product of M and N, written N \cdot M, is the following { \mathopen {} \left ( l \times n \right ) \mathclose {}}-matrix: { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{i \in { \mathopen {} \left [ l \right ] \mathclose {}} ,j \in { \mathopen {} \left [ n \right ] \mathclose {}} } = \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \cdot N_{k,j} 3465Lemmajms-00J3jms-00J3.xmlAssociativity and unit laws of matrix products2024117Jon SterlingMarcelo FioreThe product of matrices is associative and has the identity matrix as neutral element.
3463Proof#419unstable-419.xml2024117Jon Sterlingjms-00J3
For associativity, fix matrices L \in \mathrm {Mat}_{ S } { \mathopen {} \left ( k , l \right ) \mathclose {}}, M \in \mathrm {Mat}_{ S } { \mathopen {} \left ( l , m \right ) \mathclose {}} and N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} to check that { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} \cdot L = N \cdot { \mathopen {} \left ( M \cdot L \right ) \mathclose {}}. Below we use the associativity of multiplication, commutativity of addition, and distributivity of multiplication over addition in the semiring S:
\begin {aligned} { \mathopen {} \left ( { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} \cdot L \right ) \mathclose {}} _{a,b} &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{c,b} \\ &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{c,d} \cdot N_{d,b} \\ &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } L_{a,c} \cdot M_{c,d} \cdot N_{d,b} \\ &= \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } { \mathopen {} \left ( \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot M_{c,d} \right ) \mathclose {}} \cdot N_{d,b} \\ &= \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } { \mathopen {} \left ( M \cdot L \right ) \mathclose {}} _{a,d} \cdot N_{d,b} \\ &= { \mathopen {} \left ( N \cdot { \mathopen {} \left ( M \cdot L \right ) \mathclose {}} \right ) \mathclose {}} _{a,b} \end {aligned}
For one unit law, we fix M \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} and recall the definition of the identity matrix to check that M \cdot I^m = M.
\begin {aligned} { \mathopen {} \left ( M \cdot I^m \right ) \mathclose {}} _{i,j} &= \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } I^m_{i,k} \cdot M_{k, j} \end {aligned}
Unfolding the definition of I^m_{i,k} as given in the definition of the identity matrix, we see that the iterated sum above expands to m-1 copies of 0 \cdot M_{k,j} for the various k \not = i and one copy of 1 \cdot M_{i,j}. Thus, we conclude that { \mathopen {} \left ( M \cdot I^m \right ) \mathclose {}} _{i,j} = M_{i,j} using the absorption and unit laws for multiplication as well as the unit laws for addition in S.
The other unit law follows in an analogous way.
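The laws just proved can be spot-checked on small examples. The following is a minimal sketch, assuming a semiring is passed around as a tuple `(zero, add, one, mul)` and that all dimensions are non-zero; the names `NAT`, `BOOL`, `identity` and `product` are hypothetical and chosen only for this illustration. Note that `product(N, M, S)` follows the argument order of the notes, so that entry (i, j) of N \cdot M sums M_{i,k} \cdot N_{k,j} over k.

```python
from functools import reduce

# Sketch only: a semiring is a tuple (zero, add, one, mul). The names NAT, BOOL,
# identity and product are illustrative assumptions, not taken from the notes.
NAT = (0, lambda x, y: x + y, 1, lambda x, y: x * y)
BOOL = (False, lambda x, y: x or y, True, lambda x, y: x and y)

def identity(m, S):
    """The identity (m x m)-matrix over the semiring S."""
    zero, _, one, _ = S
    return [[one if i == j else zero for j in range(m)] for i in range(m)]

def product(N, M, S):
    """The product N . M: entry (i, j) is the sum over k of M[i][k] * N[k][j]."""
    zero, add, _, mul = S
    return [[reduce(add, (mul(M[i][k], N[k][j]) for k in range(len(N))), zero)
             for j in range(len(N[0]))]
            for i in range(len(M))]

L = [[1, 2]]                   # a (1 x 2)-matrix
M = [[0, 1, 1], [2, 0, 1]]     # a (2 x 3)-matrix
N = [[1], [0], [3]]            # a (3 x 1)-matrix

# Associativity: (N . M) . L = N . (M . L)
assert product(product(N, M, NAT), L, NAT) == product(N, product(M, L, NAT), NAT)

# Unit laws: M . I^2 = M and I^3 . M = M
assert product(M, identity(2, NAT), NAT) == M
assert product(identity(3, NAT), M, NAT) == M
```

A check on one example is of course no substitute for the proof; it only guards against misreading the index conventions.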
3470Lemmajms-00J2jms-00J2.xmlMatrix product is relational composition2024117Jon SterlingMarcelo FioreUnder the correspondence between boolean matrices and finite relations established above, matrix products of boolean matrices correspond to relational composites. In particular, given M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( l , m \right ) \mathclose {}} and N \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}}, we have: \operatorname { \underline {rel}} _ \mathbb {B} { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} = \operatorname { \underline {rel}} _ \mathbb {B} N \circ \operatorname { \underline {rel}} _ \mathbb {B} M
3468Proof#420unstable-420.xml2024117Jon Sterlingjms-00J2
We use the fact that the additive operation of the boolean semiring is disjunction and the multiplicative operation is conjunction:
\begin {aligned} { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{i,j} &= \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \cdot N_{k,j} \\ &= \bigvee _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \land N_{k,j} \\ &= \exists k \in { \mathopen {} \left [ m \right ] \mathclose {}} \mathpunct {.} M_{i,k} \land N_{k,j} \end {aligned}
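The computation above can be replayed on a concrete pair of boolean matrices. This is a sketch under the same modelling assumptions as before (relations as sets of pairs, matrices as lists of rows, non-zero dimensions); the names `compose`, `bool_product` and `rel` are illustrative.

```python
# Sketch only: compose, bool_product and rel are illustrative names.

def compose(S, R):
    """Relational composite S after R: a relates to c when a R b and b S c for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def bool_product(N, M):
    """Boolean product N . M: entry (i, j) holds when M[i][k] and N[k][j] for some k."""
    return [[any(M[i][k] and N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def rel(M):
    """Relation associated to a boolean matrix."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x}

M = [[True, False, True],
     [False, False, True]]     # a (2 x 3)-matrix
N = [[False, True],
     [True, True],
     [True, False]]            # a (3 x 2)-matrix

# rel(N . M) = rel(N) after rel(M)
assert rel(bool_product(N, M)) == compose(rel(N), rel(M))
```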
3556jms-00JAjms-00JA.xmlLecture 14: directed graphs and paths2024122Jon SterlingMarcelo Fiore
3482jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3496jms-00JSjms-00JS.xmlAdding matrices together2024120Jon Sterling3484Definitionjms-00JOjms-00JO.xmlPointwise addition of matrices2024120Jon SterlingLet M,N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} be two matrices over the same semiring with the same dimensions. We will write M+N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} for the (pointwise) addition of the two matrices: { \mathopen {} \left ( M+N \right ) \mathclose {}} _{i,j} = M_{i,j} + N_{i,j} 3486Definitionjms-00JPjms-00JP.xmlThe zero matrix2024120Jon SterlingA zero matrix is one whose cells all contain 0. In particular, we have: \begin {aligned} \mathbf {0} & \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \\ \mathbf {0}_{i,j} &= 0 \end {aligned} 3490Lemmajms-00JQjms-00JQ.xmlAssociativity, commutativity, and unit laws of matrix addition2024120Jon SterlingMatrix addition forms an associative and commutative operation on \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} with zero matrices \mathbf {0} as neutral element.
3488Proof#415unstable-415.xml2024120Jon Sterlingjms-00JQ
This follows immediately from the associativity, commutativity, and unit laws of the additive submonoid { \mathopen {} \left ( +,0 \right ) \mathclose {}} of the semiring S.
3494Lemmajms-00JRjms-00JR.xmlMatrix addition and union of relations2024120Jon SterlingThe pointwise sum M+N of M,N \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} corresponds to the union of the relations associated to M and N: \operatorname { \underline {rel}} _{ \mathbb {B}} { \mathopen {} \left ( M+N \right ) \mathclose {}} = \operatorname { \underline {rel}} _{ \mathbb {B}}M \cup \operatorname { \underline {rel}} _{ \mathbb {B}}N Moreover, the zero matrix corresponds to the empty relation: \operatorname { \underline {rel}} _{ \mathbb {B}} \mathbf {0} = \varnothing.
3492Proof#414unstable-414.xml2024120Jon Sterlingjms-00JR
This follows because addition and zero in the boolean semiring are given by disjunction and falsehood respectively.
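As a small illustration, pointwise addition of boolean matrices can be compared with union of the associated relations. The sketch below assumes matrices are lists of rows of booleans; the names `add`, `zero_matrix` and `rel` are hypothetical.

```python
# Sketch only: add, zero_matrix and rel are illustrative names.

def add(M, N):
    """Pointwise sum of two boolean matrices of the same dimensions (disjunction)."""
    return [[x or y for x, y in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

def zero_matrix(m, n):
    """The (m x n) zero matrix over the booleans."""
    return [[False] * n for _ in range(m)]

def rel(M):
    """Relation associated to a boolean matrix."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x}

M = [[True, False], [False, False]]
N = [[False, False], [True, False]]

assert rel(add(M, N)) == rel(M) | rel(N)    # sum corresponds to union
assert rel(zero_matrix(2, 2)) == set()      # zero matrix is the empty relation
```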
3539jms-00JTjms-00JT.xmlDirected graphs and paths2024120Jon SterlingMarcelo Fiore3498Definitionjms-00JCjms-00JC.xmlDirected graph2024118Jon SterlingMarcelo FioreA directed graph { \mathopen {} \left ( A,R \right ) \mathclose {}} consists of a set A and a binary relation R \colon A \mathbin { \nrightarrow } A from A to itself; such a relation from a set to itself is called an endo-relation. We will abbreviate \mathrm {Rel} { \mathopen {} \left ( A,A \right ) \mathclose {}} by \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of all endo-relations on A.3503Corollaryjms-00JDjms-00JD.xmlMonoid structure of endo-relations2024118Jon SterlingMarcelo FioreFor every set A, the structure { \mathopen {} \left ( \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} , \mathsf {id}_{ A } , \circ \right ) \mathclose {}} given by endo-relations is a monoid.
3501Proof#418unstable-418.xml2024118Jon Sterlingjms-00JD
We have already seen that relational composition is associative and has identity relations \mathsf {id}_{ A } as neutral elements.
3507Definitionjms-00JEjms-00JE.xmlIteration of endo-relations2024118Jon SterlingMarcelo FioreFor R \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} and n \in \mathbb {N}, we define the n-fold iteration of R to be the following endo-relation R^{ \circ n} \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}}: R^{ \circ n} = \underbrace {R \circ \cdots \circ R}_{n \text { times}} More precisely, we define R^{ \circ n} by recursion on n: \begin {aligned} R^{ \circ 0} &= \mathsf {id}_{ A } \\ R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} } &= R \circ R^{ \circ n} \end {aligned} 3510Definitionjms-00JFjms-00JF.xmlPath in a directed graph2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For s,t \in A, a path of length n \in \mathbb {N} in R with source s and target t is a tuple { \mathopen {} \left ( a_0, \ldots ,a_n \right ) \mathclose {}} \in A^{n+1} such that a_0=s, a_n=t, and a_i \mathrel {R}a_{i+1} for all 0 \leq i <n.3515Lemmajms-00JGjms-00JG.xmlPaths and iteration2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For all n \in \mathbb {N} and s,t \in A, we have s \mathrel {R^{ \circ n}}t if and only if there exists a path of length n in R with source s and target t.
3513Proof#417unstable-417.xml2024118Jon Sterlingjms-00JG
Let P(n) be the proposition that for all s,t \in A we have s \mathrel {R^{ \circ n}}t if and only if there exists a path of length n in R with source s and target t.
We shall prove P { \mathopen {} \left ( n \right ) \mathclose {}} for all n by induction.
In the base case, we must prove P { \mathopen {} \left ( 0 \right ) \mathclose {}}: this states that for all s,t \in A, we have s=t if and only if there exists a path in R of length 0 from s to t. This holds by definition of paths, considering the unary tuple.
In the inductive step, we may assume P { \mathopen {} \left ( n \right ) \mathclose {}} to prove P { \mathopen {} \left ( n+1 \right ) \mathclose {}}. Fixing s,t \in A we must show that s \mathrel {R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }} t if and only if there exists a path of length n+1 in R with source s and target t. By definition of R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} } = R \circ R^{ \circ n} and of relational composition, we have s \mathrel {R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }} t if and only if there exists t' such that s \mathrel {R^{ \circ n}}t' and t' \mathrel {R} t. By our inductive hypothesis P { \mathopen {} \left ( n \right ) \mathclose {}} applied to s,t', the former holds if and only if there exists a path of length n from s to t' in R. As we can extend such a path by the link t' \mathrel {R}t, and conversely every path of length n+1 from s to t arises in this way, we are done.
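The statement of the lemma can be checked by brute force on a small graph. The sketch below assumes a finite vertex set and models the edge relation as a set of pairs; the names `compose`, `iterate` and `has_path` are our own, and the path search simply enumerates every tuple in A^{n+1}, so it is only suitable for tiny examples.

```python
from itertools import product

# Sketch only: compose, iterate and has_path are illustrative names.

def compose(S, R):
    """Relational composite S after R."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def iterate(R, A, n):
    """n-fold iteration: R^0 is the identity on A, and R^(n+1) = R o R^n."""
    result = {(a, a) for a in A}
    for _ in range(n):
        result = compose(R, result)
    return result

def has_path(R, A, n, s, t):
    """Search for a tuple (a_0, ..., a_n) with a_0 = s, a_n = t and a_i R a_(i+1)."""
    return any(p[0] == s and p[-1] == t
               and all((p[i], p[i + 1]) in R for i in range(n))
               for p in product(A, repeat=n + 1))

A = range(4)
R = {(0, 1), (1, 2), (2, 0), (2, 3)}

for n in range(5):
    paths = {(s, t) for s in A for t in A if has_path(R, A, n, s, t)}
    assert iterate(R, A, n) == paths
```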
3518Definitionjms-00JHjms-00JH.xmlThe reflexive-transitive closure of an endo-relation2024118Jon SterlingMarcelo FioreFor R \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}}, we define R^{ \circ *} \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} to be the smallest relation closed under all finite paths in R, i.e. the reflexive-transitive closure of R. R^{ \circ *} = \bigcup _{n \in \mathbb {N}} R^{ \circ n} 3523Observationjms-00JUjms-00JU.xmlReflexive-transitive closure of finite endo-relations2024121Jon SterlingIn the case of an endo-relation R \in \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}}, the reflexive-transitive closure R^{ \circ *} can be computed using a finite union over k \leq n rather than an infinite union over all of \mathbb {N}; in particular, we have R^{ \circ *} = \bigcup _{k \leq n} R^{ \circ k}.
3521Proof#413unstable-413.xml2024121Jon Sterlingjms-00JU
There are only n distinct elements that could be related, so a path that visits no element twice has length at most n-1; moreover, any finite path in R from s to t can be replaced by one that has no repetitions, by cutting out the segment between two visits to the same element.
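The finite union can be computed directly. The following sketch assumes the vertex set is [n] = {0, ..., n-1} and that relations are sets of pairs; `compose` and `closure` are hypothetical names.

```python
# Sketch only: compose and closure are illustrative names.

def compose(S, R):
    """Relational composite S after R."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def closure(R, n):
    """Union of the iterates R^0, ..., R^n of an endo-relation R on [n]."""
    power = {(a, a) for a in range(n)}    # R^0 is the identity relation
    result = set(power)
    for _ in range(n):
        power = compose(R, power)         # R^(k+1) = R o R^k
        result |= power
    return result

R = {(0, 1), (1, 2), (2, 0), (3, 3)}
star = closure(R, 4)

# Taking one more step along R adds nothing new, as the observation predicts.
assert compose(R, star) <= star
```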
3527Corollaryjms-00JIjms-00JI.xmlThe path relation via reflexive-transitive closure2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For all s,t \in A, we have s \mathrel {R^{ \circ *}}t if and only if there exists a path with source s and target t in R.
3525Proof#416unstable-416.xml2024118Jon Sterlingjms-00JI
This is an immediate consequence of the lemma on paths and iteration: by the definition of the reflexive-transitive closure, we have s \mathrel {R^{ \circ *}}t if and only if there exists n \in \mathbb {N} such that s \mathrel {R^{ \circ n}}t, which by that lemma holds if and only if there exists a path from s to t in R of some length n \in \mathbb {N}.
3530Definitionjms-00JNjms-00JN.xmlAdjacency matrix of a finite directed graph2024120Jon SterlingMarcelo FioreWe have seen in Lecture 13 how to turn any relation R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} into a matrix \operatorname { \underline {mat}} _{ \mathbb {B}}R \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} over the boolean semiring. When R \in \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} is the edge relation in a graph with vertices in the finite set { \mathopen {} \left [ n \right ] \mathclose {}}, we refer to \operatorname { \underline {mat}} _{ \mathbb {B}}R \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( n , n \right ) \mathclose {}} as the adjacency matrix of R.3536Algorithmjms-00JVjms-00JV.xmlThe adjacency matrix of the reflexive-transitive closure2024121Jon SterlingMarcelo FioreLet M = \operatorname { \underline {mat}} _{ \mathbb {B}}R be the adjacency matrix of a finite directed graph { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} ,R \right ) \mathclose {}}. We will show that the adjacency matrix M^* of the reflexive-transitive closure R^{ \circ *} can be computed recursively using the additive and multiplicative structure of matrices. In particular, we can define M^* = M_n where M_k is computed recursively as below: \begin {aligned} M_0 &= I^n \\ M_{k+1} &= I^n + { \mathopen {} \left ( M \cdot M_k \right ) \mathclose {}} \end {aligned} Thus we have a recursive algorithm that can decide whether there is a path between two elements in a finite directed graph: indeed, there is a path from s to t in R if and only if M^*_{s,t} = \mathsf {true}.
3534Proof#412unstable-412.xml2024121Jon Sterlingjms-00JV
We first unravel the first few M_k’s to get a better understanding.
\begin {aligned} M_1 &= I^n + M \cdot M_0 \\ &= I^n+M \cdot I^n \\ &= I^n + M \\ M_2 &= I^n + M \cdot M_1 \\ &= I^n + M \cdot (I^n + M) \\ &= I^n + M \cdot I^n + M \cdot M \\ &= I^n + M + M^2 \end {aligned}
By induction, we can extend the pattern above to a closed form for each M_k. In particular, we deduce that M_k = \sum _{l \leq k} M^l by recalling that the identity matrix is the neutral element for matrix multiplication. Therefore, we have M_n = \sum _{k \leq n}M^k.
Our goal is to show that M_n = \operatorname { \underline {mat}} _{ \mathbb {B}} R^{ \circ *}; we have already seen that matrix addition corresponds to union of relations and matrix multiplication corresponds to relational composition.
By the observation on the reflexive-transitive closure of finite endo-relations, we have R^{ \circ *} = \bigcup _{k \leq n} R^{ \circ k}. Therefore:
\operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ *} = \operatorname { \underline {mat}} _{ \mathbb {B}} \bigcup _{k \leq n} R^{ \circ k}
By the lemma on matrix addition and union of relations, we have:
\operatorname { \underline {mat}} _{ \mathbb {B}} \bigcup _{k \leq n} R^{ \circ k} = \sum _{k \leq n} \operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ k}
By the lemma that matrix product is relational composition, we have:
\sum _{k \leq n} \operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ k} = \sum _{k \leq n} { \mathopen {} \left ( \operatorname { \underline {mat}} _{ \mathbb {B}}R \right ) \mathclose {}} ^k = \sum _{k \leq n} M^k = M_n
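The recursion M_0 = I^n, M_{k+1} = I^n + (M \cdot M_k) translates almost line by line into code. This is a minimal sketch assuming boolean matrices are lists of rows; the names `identity`, `add`, `bool_product` and `star` are illustrative, and no attempt is made at efficiency.

```python
# Sketch only: identity, add, bool_product and star are illustrative names.

def identity(n):
    """The identity (n x n)-matrix over the booleans."""
    return [[i == j for j in range(n)] for i in range(n)]

def add(M, N):
    """Pointwise sum (disjunction) of two boolean matrices."""
    return [[x or y for x, y in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

def bool_product(N, M):
    """Boolean product N . M: entry (i, j) holds when M[i][k] and N[k][j] for some k."""
    return [[any(M[i][k] and N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def star(M):
    """Adjacency matrix of the reflexive-transitive closure: M_n in the recursion."""
    n = len(M)
    Mk = identity(n)                                   # M_0 = I^n
    for _ in range(n):
        Mk = add(identity(n), bool_product(M, Mk))     # M_(k+1) = I^n + M . M_k
    return Mk

# The graph 0 -> 1 -> 2 together with an isolated vertex 3.
M = [[False, True, False, False],
     [False, False, True, False],
     [False, False, False, False],
     [False, False, False, False]]
reach = star(M)

assert reach[0][2]          # there is a path from 0 to 2
assert not reach[2][0]      # but none from 2 to 0
assert reach[3][3]          # the empty path from 3 to itself
```

Each of the n rounds costs one boolean matrix product, so this sketch performs on the order of n^4 scalar operations.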
3553jms-00JWjms-00JW.xmlPreorders and relations2024121Jon SterlingMarcelo Fiore3542Definitionjms-00JXjms-00JX.xmlPreorder2024121Jon SterlingMarcelo FioreA preorder { \mathopen {} \left ( P, \sqsubseteq \right ) \mathclose {}} consists of a set P and a relation { \sqsubseteq } \colon P \mathbin { \nrightarrow } P satisfying the following two axioms:Reflexivity. \forall x \in P \mathpunct {.} x \sqsubseteq x
Transitivity. \forall x,y,z \in P \mathpunct {.} { \mathopen {} \left ( x \sqsubseteq y \land y \sqsubseteq z \right ) \mathclose {}} \implies x \sqsubseteq zIn other words, a preorder is a directed graph such that the edge relation is both reflexive and transitive.3545Examplejms-00JYjms-00JY.xmlExamples of preorders2024121Jon SterlingMarcelo FiorePreorders are everywhere in both general mathematics and computer science.The real numbers equipped with their inequality relation form a preorder { \mathopen {} \left ( \mathbb {R}, \leq \right ) \mathclose {}}. The converse relation { \mathopen {} \left ( \mathbb {R}, \geq \right ) \mathclose {}} of course also forms a preorder. This example can be specialised to different classes of numbers, such as integers or naturals.
Subsets with their inclusion and containment relations form preorders { \mathopen {} \left ( \mathcal {P}(A), \subseteq \right ) \mathclose {}} and { \mathopen {} \left ( \mathcal {P}(A), \supseteq \right ) \mathclose {}}. This example can be specialised to other examples that have additional structure, such as the preorder \mathcal {P}_f(A) of finite subsets of a given set A, or the preorder \mathcal {O}(X) of open sets of a topological space X.
A slightly more nontrivial example is furnished by the preorder { \mathopen {} \left ( \mathbb {Z},{ \mid } \right ) \mathclose {}} of the integers and the divisibility relation, where m \mid n if and only if \exists k \in \mathbb {Z} \mathpunct {.} m k = n: indeed, we certainly have n divides n, and if l divides m and m divides n, then l divides n. The first example above happens to be a total order (meaning that the underlying directed graph is connected), whereas { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} , \subseteq \right ) \mathclose {}} and { \mathopen {} \left ( \mathbb {Z},{ \mid } \right ) \mathclose {}} are examples of non-total preorders.All the examples we have seen so far, apart from divisibility on the integers (where n and -n divide one another), are furthermore partial orders: a preorder is a partial order when we have a \sqsubseteq b \land b \sqsubseteq a if and only if a=b. In fact, there are arguably very few useful preorders that are not partial orders, and it is always possible to replace a preorder by an equivalent partial order; we do not pursue this here.3550Theoremjms-00JZjms-00JZ.xmlPreorders and reflexive-transitive closure2024121Jon SterlingMarcelo FioreThe reflexive-transitive closure R^{ \circ *} of an endo-relation R \colon A \mathbin { \nrightarrow } A is, by definition, a preorder. We will show that R^{ \circ *} enjoys a universal property as the smallest preorder containing R as a subrelation.To be more precise, let \mathcal {F}_R be the set of preorders on A that contain R as a subrelation: \mathcal {F}_R = { \mathopen {} \left \{ Q \colon A \mathbin { \nrightarrow } A \, \middle \vert \, R \subseteq Q \land Q \text { is a preorder} \right \} \mathclose {}} Then we have R^{ \circ *}= \bigcap \mathcal {F}_R, and so the reflexive-transitive closure is the smallest preorder containing R as a subrelation. (Indeed, recall that the intersection of an intersection-closed class of relations is the smallest relation in that class.)
3548Proof#411unstable-411.xml2024121Jon Sterlingjms-00JZ
To show that R^{ \circ *}= \bigcap \mathcal {F}_R, it suffices to check that \bigcap \mathcal {F}_R \subseteq R^{ \circ *} and R^{ \circ *} \subseteq \bigcap \mathcal {F}_R.
We first note that R^{ \circ *} \in \mathcal {F}_R, which holds almost by definition:
We first need to show that R \subseteq R^{ \circ *}, i.e. that when a \mathrel {R}b we have a \mathrel {R^{ \circ *}}b. This is evident, as a \mathrel {R}b gives a path of length 1 in R from a to b, and R^{ \circ *} contains all finite paths in R.
Next, we need to show that R^{ \circ *} is a preorder. But this holds because R^{ \circ *} contains the empty path from an element to itself, and because a path of length m from a to b can be concatenated with a path of length n from b to c to obtain a path (of length m+n) from a to c.
It follows from R^{ \circ *} \in \mathcal {F}_R that \bigcap \mathcal {F}_R \subseteq R^{ \circ *}. Indeed:
a \mathrel { { \mathopen {} \left ( \bigcap \mathcal {F}_R \right ) \mathclose {}} }b \Longleftrightarrow \forall Q \in \mathcal {F}_R \mathpunct {.} a \mathrel {Q}b
So if a \mathrel { { \mathopen {} \left ( \bigcap \mathcal {F}_R \right ) \mathclose {}} }b and R^{ \circ *} \in \mathcal {F}_R, we may choose Q := R^{ \circ *} to deduce a \mathrel {R^{ \circ *}}b. Therefore, \bigcap \mathcal {F}_R \subseteq R^{ \circ *}.
Our second goal is to prove R^{ \circ *} \subseteq \bigcap \mathcal {F}_R; by the definition of intersections, this holds if and only if \forall Q \in \mathcal {F}_R \mathpunct {.} R^{ \circ *} \subseteq Q.
Fixing such a preorder Q over A containing R as a subrelation, we must prove that R^{ \circ *} \subseteq Q. Recalling that R^{ \circ *} is the union \bigcup _{n \in \mathbb {N}}R^{ \circ n}, we see by the definition of unions that R^{ \circ *} \subseteq Q if and only if for each n \in \mathbb {N}, we have R^{ \circ n} \subseteq Q. This we prove by induction on n \in \mathbb {N}.
In the base case, we must show that R^{ \circ0 } = \mathsf {id}_{ A } is a subrelation of Q. This is equivalent to Q being reflexive, which we deduce from our assumption that Q is a preorder.
In the inductive step, we may assume R^{ \circ n} \subseteq Q and we must prove that R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }=R \circ R^{ \circ n} is a subrelation of Q. For any a,b \in A, we have a \mathrel { { \mathopen {} \left ( R \circ R^{ \circ n} \right ) \mathclose {}} } b if and only if there exists some c \in A such that a \mathrel {R^{ \circ n}}c and c \mathrel {R} b. By our inductive hypothesis, we have a \mathrel {Q}c; because we have assumed that R \subseteq Q, we have c \mathrel {Q}b. Because Q is a preorder and therefore transitive, we therefore have a \mathrel {Q}b.
Therefore, R^{ \circ *} is indeed the intersection of \mathcal {F}_R.
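On a very small carrier set the theorem can be confirmed by exhaustive search: we enumerate all 2^9 endo-relations on a three-element set, keep the preorders containing R, and intersect them. The sketch below fixes A = {0, 1, 2}; the names `is_preorder`, `closure` and `smallest_preorder_containing` are our own and the enumeration is deliberately naive.

```python
from itertools import product

# Sketch only: the carrier A and all function names are illustrative assumptions.
A = range(3)
PAIRS = list(product(A, A))

def is_preorder(Q):
    """Check reflexivity and transitivity of Q on A."""
    reflexive = all((a, a) in Q for a in A)
    transitive = all((a, c) in Q for (a, b) in Q for (b2, c) in Q if b == b2)
    return reflexive and transitive

def closure(R):
    """Reflexive-transitive closure, as the union of the iterates R^0, ..., R^|A|."""
    power = {(a, a) for a in A}
    result = set(power)
    for _ in range(len(A)):
        power = {(a, c) for (a, b) in power for (b2, c) in R if b == b2}
        result |= power
    return result

def smallest_preorder_containing(R):
    """Intersection of the family of all preorders on A that contain R."""
    all_relations = ({p for p, bit in zip(PAIRS, bits) if bit}
                     for bits in product([False, True], repeat=len(PAIRS)))
    family = [Q for Q in all_relations if R <= Q and is_preorder(Q)]
    return set.intersection(*family)   # non-empty: the total relation is a preorder

R = {(0, 1), (1, 2)}
assert is_preorder(closure(R))
assert closure(R) == smallest_preorder_containing(R)
```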
3611jms-00K6jms-00K6.xmlLecture 15: functions and inductive definitions2024124Jon SterlingMarcelo Fiore
3605jms-00KJjms-00KJ.xmlPartial and total functions2024123Jon SterlingMarcelo Fiore3562Definitionjms-00K7jms-00K7.xmlPartial function2024122Jon SterlingMarcelo FioreA relation R \colon A \mathbin { \nrightarrow } B is said to be functional when it relates an element of A to at most one element of B: \forall a \in A \mathpunct {.} \forall b_1,b_2 \in B \mathpunct {.} a \mathrel {R}b_1 \land a \mathrel {R}b_2 \implies b_1 = b_2 We refer to a functional relation as a partial function; we often use letters like f,g to refer to partial functions rather than R,S. When we write “f \colon A \mathbin { \rightharpoonup } B”, we mean that f is a partial function from A to B.3567Theoremjms-00K8jms-00K8.xmlClosure of partial functions under identity and composition2024122Jon SterlingMarcelo FiorePartial functions are closed under identity and composition in the following sense:For any set A, the identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A is functional.
For any sets A,B,C and functional relations f \colon A \mathbin { \rightharpoonup } B and g \colon B \mathbin { \rightharpoonup } C, the relational composite g \circ f \colon A \mathbin { \nrightarrow } C is functional.
3565Proof#410unstable-410.xml2024122Jon Sterlingjms-00K8
First, we prove that each identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A is functional. Fixing a,a_1,a_2 \in A such that a=a_1 and a=a_2, we must show that a_1=a_2; this follows from transitivity and symmetry of equality.
Next, we fix partial functions f \colon A \mathbin { \rightharpoonup } B and g \colon B \mathbin { \rightharpoonup } C to check that g \circ f \colon A \mathbin { \nrightarrow } C is functional. Fixing a \in A and c_1,c_2 \in C such that a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c_1 and a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c_2, we must check that c_1=c_2. By definition of the relational composite, we have b_1,b_2 \in B such that a \mathrel {f}b_1 \mathrel {g}c_1 and a \mathrel {f}b_2 \mathrel {g}c_2. Because f \colon A \mathbin { \rightharpoonup } B is functional, we note that b_1 = b_2. Writing b=b_1=b_2, we therefore have b \mathrel {g}c_1 and b \mathrel {g}c_2. Because g \colon B \mathbin { \rightharpoonup } C is functional, we conclude that c_1=c_2.
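To make the closure properties concrete, here is a sketch that models a relation as a set of pairs and tests functionality directly from the definition. The names `is_functional`, `compose` and `identity`, and the sample data, are hypothetical.

```python
# Sketch only: is_functional, compose and identity are illustrative names.

def is_functional(R):
    """A relation is functional when no element is related to two different values."""
    seen = {}
    for a, b in R:
        if seen.setdefault(a, b) != b:
            return False
    return True

def compose(g, f):
    """Relational composite g after f."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

def identity(A):
    """The identity relation on the set A."""
    return {(a, a) for a in A}

f = {(0, "x"), (1, "y")}          # a partial function, undefined at 2
g = {("x", True)}                 # a partial function, undefined at "y"

assert is_functional(identity({0, 1, 2}))
assert is_functional(compose(g, f))
assert compose(g, f) == {(0, True)}              # the composite is defined only at 0
assert not is_functional({(0, "x"), (0, "y")})   # relates 0 to two different values
```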
3570Notationjms-00KAjms-00KA.xmlValues of partial functions2024122Jon SterlingMarcelo FioreLet f \colon A \mathbin { \rightharpoonup } B be a partial function. Given a \in A, we will write f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } to mean that f is defined at a, i.e. there exists some (necessarily unique) b \in B such that a \mathrel {f}b. When f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } holds, we may write f { \mathopen {} \left ( a \right ) \mathclose {}} to mean the unique b such that a \mathrel {f}b, which we shall refer to as the value of f at a.3573Definitionjms-00KBjms-00KB.xmlDomain of definition of a partial function2024122Jon SterlingThe domain of definition of a partial function f \colon A \mathbin { \rightharpoonup } B is the subset { \mathopen {} \left \{ a \in A \, \middle \vert \, f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right \} \mathclose {}} \subseteq A spanned by elements of the domain on which the partial function is defined.3577Lemmajms-00K9jms-00K9.xmlPartial functional extensionality2024122Jon SterlingMarcelo FioreLet f,g \colon A \mathbin { \rightharpoonup } B be two partial functions. We have f=g if and only if for all a \in A, f is defined at a if and only if g is defined at a and, moreover, the value of f at a is the same as the value of g at a: f=g \Longleftrightarrow \forall a \in A \mathpunct {.} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \Leftrightarrow g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right ) \mathclose {}} \land f { \mathopen {} \left ( a \right ) \mathclose {}} = g { \mathopen {} \left ( a \right ) \mathclose {}}
3575Proof#409unstable-409.xml2024122Jon Sterlingjms-00K9
One direction is trivial. In the other direction, we assume
\forall a \in A \mathpunct {.} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \Leftrightarrow g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right ) \mathclose {}} \land f { \mathopen {} \left ( a \right ) \mathclose {}} = g { \mathopen {} \left ( a \right ) \mathclose {}}
to deduce f=g. By relational extensionality, it suffices to show
\forall a \in A \mathpunct {.} \forall b \in B \mathpunct {.} a \mathrel {f}b \Longleftrightarrow a \mathrel {g}b.
Fixing a \in A and b \in B, we assume a \mathrel {f} b to prove a \mathrel {g}b. As a \mathrel {f}b, we have f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } and b=f { \mathopen {} \left ( a \right ) \mathclose {}}. By assumption, we deduce g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } from f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow }, so there exists b'=g { \mathopen {} \left ( a \right ) \mathclose {}} such that a \mathrel {g}b'. By our other assumption, we know that f { \mathopen {} \left ( a \right ) \mathclose {}} =g { \mathopen {} \left ( a \right ) \mathclose {}}, so b=b' and we conclude a \mathrel {g}b. The other direction works the same way.
3580Examplejms-00KDjms-00KD.xmlExamples of partial functions2024122Jon SterlingThe following are examples of partial functions:rational division { \div } \colon \mathbb {Q} \times \mathbb {Q} \mathbin { \rightharpoonup } \mathbb {Q} defined by { \mathopen {} \left \{ { \mathopen {} \left ( { \mathopen {} \left ( r,s \right ) \mathclose {}} ,t \right ) \mathclose {}} \in { \mathopen {} \left ( \mathbb {Q} \times \mathbb {Q} \right ) \mathclose {}} \times \mathbb {Q} \, \middle \vert \, r=s \cdot t \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ { \mathopen {} \left ( r,s \right ) \mathclose {}} \in \mathbb {Q} \times \mathbb {Q} \, \middle \vert \, s \not =0 \right \} \mathclose {}};
integer square root \sqrt {-} \colon \mathbb {Z} \mathbin { \rightharpoonup } \mathbb {Z} defined by { \mathopen {} \left \{ { \mathopen {} \left ( m,n \right ) \mathclose {}} \in \mathbb {Z} \times \mathbb {Z} \, \middle \vert \, m=n^2 \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ m \in \mathbb {Z} \, \middle \vert \, \exists n \in \mathbb {Z} \mathpunct {.} m=n^2 \right \} \mathclose {}};
real square root \sqrt {-} \colon \mathbb {R} \mathbin { \rightharpoonup } \mathbb {R} defined by { \mathopen {} \left \{ { \mathopen {} \left ( x,y \right ) \mathclose {}} \in \mathbb {R} \times \mathbb {R} \, \middle \vert \, x=y^2 \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ x \in \mathbb {R} \, \middle \vert \, x \geq 0 \right \} \mathclose {}}.3584Lemmajms-00KEjms-00KE.xmlCardinality of the set of partial functions2024122Jon SterlingMarcelo FioreFor all finite sets A and B, we have \mathord { \# } { \mathopen {} \left ( A \mathbin { \rightharpoonup } B \right ) \mathclose {}} = { \mathopen {} \left ( \mathord { \# } {B}+1 \right ) \mathclose {}} ^{ \mathord { \# } {A}}, recalling that \mathord { \# } {S} is the cardinality of a set S.
3582Proof#408unstable-408.xml2024122Jon Sterlingjms-00KE
A partial function f \colon A \mathbin { \rightharpoonup } B associates to each element a \in A either a unique element f { \mathopen {} \left ( a \right ) \mathclose {}} \in B when f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow }, or it associates nothing. Therefore, specifying f amounts to making \mathord { \# } {A} many independent choices, one for each element of A, each among \mathord { \# } {B}+1 many possibilities; in total, there are { \mathopen {} \left ( \mathord { \# } {B}+1 \right ) \mathclose {}} ^{ \mathord { \# } {A}} partial functions.
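The count can be confirmed by brute force on small sets. The sketch below assumes relations are modelled as Python sets of pairs and enumerates every relation from A to B, keeping the functional ones; the function names are hypothetical and chosen for illustration.

```python
from itertools import combinations, product

def all_relations(A, B):
    """Yield every relation from A to B, i.e. every subset of A x B."""
    pairs = list(product(A, B))
    for k in range(len(pairs) + 1):
        for subset in combinations(pairs, k):
            yield set(subset)

def is_functional(rel):
    """A set of pairs is functional when no source occurs twice."""
    sources = [a for a, _ in rel]
    return len(sources) == len(set(sources))

def count_partial_functions(A, B):
    """Count the functional relations from A to B by exhaustive search."""
    return sum(1 for rel in all_relations(A, B) if is_functional(rel))

A = ["a", "b", "c"]
B = [0, 1]
# Compare the brute-force count with the closed formula (#B + 1) ** #A.
print(count_partial_functions(A, B), (len(B) + 1) ** len(A))
```

The search visits all 2^(#A · #B) relations, so it is only practical for very small sets.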
3587Definitionjms-00KFjms-00KF.xmlFunction2024122Jon SterlingMarcelo FioreA partial function f \colon A \mathbin { \rightharpoonup } B is called total whenever its domain of definition coincides with its domain (source), i.e. we have f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } for all a \in A. In this case, we will write f \colon A \to B and refer to it as a function or a map. Sometimes, we redundantly refer to it as a total function.Just as we write \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} and { \mathopen {} \left ( A \mathbin { \rightharpoonup } B \right ) \mathclose {}}for the sets of relations and partial functions from A to B respectively, we shall write { \mathopen {} \left ( A \to B \right ) \mathclose {}} for the set of functions from A to B.3592Lemmajms-00KGjms-00KG.xmlFunctions are uniquely-valued relations2024122Jon SterlingMarcelo FioreAny relation R \colon A \mathbin { \nrightarrow } B is a function if and only if for all a \in A, there exists some unique b \in B such that a \mathrel {R}b.
3590Proof#407unstable-407.xml2024122Jon Sterlingjms-00KG
Symbolically, our second criterion is \forall a \in A \mathpunct {.} \exists ! b \in B \mathpunct {.} a \mathrel {R}b. Unfolding the definition of unique existence, this is equivalent to \forall a \in A \mathpunct {.} { \mathopen {} \left ( \forall b,b' \in B \mathpunct {.} a \mathrel {R}b \land a \mathrel {R}b' \implies b=b' \right ) \mathclose {}} \land { \mathopen {} \left ( \exists b \in B \mathpunct {.} a \mathrel {R}b \right ) \mathclose {}} which is, by the distributivity of universal quantification over conjunction, equivalent to the claim that R is both functional and total.
3597Lemmajms-00KHjms-00KH.xmlCardinality of the set of functions2024122Jon SterlingMarcelo FioreFor all finite sets A and B, we have \# { \mathopen {} \left ( A \to B \right ) \mathclose {}} = \# B^{ \# A}.
3595Proof#406unstable-406.xml2024122Jon Sterlingjms-00KH
A function f \colon A \to B associates to each element a \in A a single element f { \mathopen {} \left ( a \right ) \mathclose {}} \in B. Therefore, specifying f amounts to making \# A many independent choices, one for each element of A, each among \# B many possibilities; in total, there are \# B^{ \# A} functions.
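As a small sanity check, total functions between finite sets can be enumerated directly. This sketch assumes a function is modelled as a Python dict with one key per element of A; the name all_functions is an illustrative assumption.

```python
from itertools import product

def all_functions(A, B):
    """Yield every total function from A to B as a dict."""
    A = list(A)
    # One independent choice of value in B for each element of A.
    for values in product(B, repeat=len(A)):
        yield dict(zip(A, values))

A = ["a", "b", "c"]
B = [0, 1]
functions = list(all_functions(A, B))
# Compare the number of enumerated functions with #B ** #A.
print(len(functions), len(B) ** len(A))
```

The edge cases behave as the formula predicts: there is exactly one function out of the empty set, and none from a non-empty set into the empty set.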
3602Theoremjms-00KIjms-00KI.xmlClosure of functions under identity and composition2024122Jon SterlingMarcelo FioreFunctions are closed under identity and composition.
3600Proof#405unstable-405.xml2024122Jon Sterlingjms-00KI
We have already seen that partial functions are closed under identity and composition. Therefore, it suffices to argue that the identity partial function is total, and that the composition of two total partial functions is total.
We must show that for all a \in A, there exists a' \in A such that a \mathrel { \mathsf {id}_{ A } } a'; unfolding the definition of the identity relation we may choose a' = a.
Given f \colon A \to B and g \colon B \to C, we must check that for all a \in A, there exists some c \in C such that a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c. By the definition of relational composition, we must show that there exist b \in B and c \in C such that a \mathrel {f}b and b \mathrel {g}c. We choose b = f { \mathopen {} \left ( a \right ) \mathclose {}} and c = g { \mathopen {} \left ( b \right ) \mathclose {}} =g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}}.
3608jms-00KKjms-00KK.xmlInductive definitions2024123Jon SterlingMarcelo FioreWe have already seen some inductive definitions: one defines n-fold iterations of endo-relations by induction on n;
another computes the adjacency matrix of the reflexive-transitive closure of a finite directed graph by induction on the number of vertices in the graph. In this section, we will study this concept in more detail.3233Examplejms-00KLjms-00KL.xmlInformal examples of inductively defined functions2024123Jon SterlingMarcelo FioreAn inductive definition on the naturals defines a structure for each number n \in \mathbb {N} by first defining it for n=0 and then specifying how to augment a structure for n=m to a structure for n=m+1. From the point of view of computer programming, inductive definitions proceed by structural recursion. For example, addition of natural numbers can be defined by induction on the second argument (or the first, if you prefer!): \begin {aligned} \mathsf {add} & \colon \mathbb {N}^2 \to \mathbb {N} \\ \mathsf {add} { \mathopen {} \left ( m,0 \right ) \mathclose {}} &= m \\ \mathsf {add} { \mathopen {} \left ( m,n+1 \right ) \mathclose {}} &= \mathsf {add} { \mathopen {} \left ( m,n \right ) \mathclose {}} +1 \end {aligned} Likewise, the function \Sigma \colon \mathbb {N} \to \mathbb {N} that takes n \in \mathbb {N} to the sum \sum _{0 \leq i < n}i can be defined inductively: \begin {aligned} \Sigma & \colon \mathbb {N} \to \mathbb {N} \\ \Sigma { \mathopen {} \left ( 0 \right ) \mathclose {}} &= 0 \\ \Sigma { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= \mathsf {add} { \mathopen {} \left ( n, \Sigma { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}} \end {aligned} In order to make precise the description of such constructions as “inductive definitions”, we must give an actual formal definition of what an inductive definition is.3236Definitionjms-00KMjms-00KM.xmlInductively defined function2024123Jon SterlingMarcelo FioreLet A be a set, and fix a \in A and f \colon \mathbb {N} \times A \to A. 
The function inductively defined from a and f is defined to be the unique function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A satisfying the following equations: \begin {aligned} \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} &= a \\ \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}} \end {aligned} Note that our specification of inductively defined functions asserts, without proof, that there does in fact exist a unique function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A satisfying the described equations. It is not blindingly obvious that this claim is true; after seeing a few examples to motivate the definition, we will explicitly justify it by means of an existence theorem.3239Examplejms-00KNjms-00KN.xmlFormal examples of inductively defined functions2024123Jon SterlingMarcelo FioreWe have seen two informal examples of inductive definitions; we will now update these examples in light of the formal definition:
For every m \in \mathbb {N}, the function \mathsf {add} { \mathopen {} \left ( m,- \right ) \mathclose {}} \colon \mathbb {N} \to \mathbb {N} is the function \mathbf { \rho }_{ m , { \mathopen {} \left ( i,n \right ) \mathclose {}} \mapsto n+1 } inductively defined by m \in \mathbb {N} and f { \mathopen {} \left ( i,n \right ) \mathclose {}} = n+1.
The function \Sigma \colon \mathbb {N} \to \mathbb {N} taking n \in \mathbb {N} to \sum _{0 \leq i<n}i is the function \mathbf { \rho }_{ 0 , \mathsf {add} } inductively defined by 0 \in \mathbb {N} and \mathsf {add} \colon \mathbb {N} \times \mathbb {N} \to \mathbb {N}.
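From the programming point of view, the two formal examples can be sketched as follows. This is a minimal illustration, assuming naturals are modelled as non-negative Python integers; the names rho, add, and sigma are our own, and the loop is just one way to compute the values prescribed by the defining equations.

```python
def rho(a, f):
    """Return the function N -> A inductively defined from a and f."""
    def defined(n):
        value = a                  # rho(0) = a
        for i in range(n):
            value = f(i, value)    # rho(i + 1) = f(i, rho(i))
        return value
    return defined

def add(m, n):
    """add(m, -) is rho with base value m and step (i, acc) |-> acc + 1."""
    return rho(m, lambda i, acc: acc + 1)(n)

# Sigma is rho with base value 0 and step add, so Sigma(n + 1) = add(n, Sigma(n)).
sigma = rho(0, add)

print(add(3, 4), [sigma(n) for n in range(6)])
```

Each value of sigma agrees with the sum of all i with 0 ≤ i < n, which is the informal description we started from.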
3243Definitionjms-00KOjms-00KO.xml{ \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation2024123Jon SterlingMarcelo FioreGiven an element a \in A and a function f \colon \mathbb {N} \times A \to A, we call a relation R \colon \mathbb {N} \mathbin { \nrightarrow } A { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed whenever we have both 0 \mathrel {R}a and n \mathrel {R}x \implies { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}f { \mathopen {} \left ( n,x \right ) \mathclose {}} for all n \in \mathbb {N} and x \in A. In light of this definition, we see that our earlier specification defines \mathbf { \rho }_{ a , f } to be the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function. We still have to show that this exists.3265Theoremjms-00KPjms-00KP.xmlExistence of inductively defined functions2024123Jon SterlingMarcelo FioreGiven an element a \in A and a function f \colon \mathbb {N} \times A \to A, now let \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A be the intersection of all the { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relations R \colon \mathbb {N} \mathbin { \nrightarrow } A.The relation \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is functional and total, and therefore a function.
The function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A is the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function, i.e. it is the unique function satisfying both \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} =a and \forall n \in \mathbb {N} \mathpunct {.} \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} =f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}}.
3263Proof#404unstable-404.xml2024123Jon Sterlingjms-00KP
We first notice that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is itself { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed, which can be seen by observing that the intersection of a set of { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relations is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. With this in hand, we proceed.
We must show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is a function.
To show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is functional, we must show that for all n \in \mathbb {N} and x,y \in A, if n \mathrel { \mathbf { \rho }_{ a , f } } x and n \mathrel { \mathbf { \rho }_{ a , f } }y, then x=y. We proceed to prove P { \mathopen {} \left ( n \right ) \mathclose {}} = \forall x,y \in A \mathpunct {.} n \mathrel { \mathbf { \rho }_{ a , f } } x \land n \mathrel { \mathbf { \rho }_{ a , f } }y \implies x=y by induction on n \in \mathbb {N}.
For the base case, we fix x,y \in A such that 0 \mathrel { \mathbf { \rho }_{ a , f } }x and 0 \mathrel { \mathbf { \rho }_{ a , f } } y to check that x=y. By definition, we know that for any { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation R \colon \mathbb {N} \mathbin { \nrightarrow } A we have both 0 \mathrel {R}x and 0 \mathrel {R}y.
We choose R to be the union of { \mathopen {} \left \{ { \mathopen {} \left ( 0,a \right ) \mathclose {}} \right \} \mathclose {}} and { \mathopen {} \left \{ { \mathopen {} \left ( n,x \right ) \mathclose {}} \, \middle \vert \, n > 0 \land n \mathrel { \mathbf { \rho }_{ a , f } } x \right \} \mathclose {}}, which is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed; we have used the fact that \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. As this union is disjoint, we can see that 0 \mathrel {R}z if and only if z=a, so we have x=a=y.
In the inductive step, we assume P { \mathopen {} \left ( n \right ) \mathclose {}} to prove P { \mathopen {} \left ( n+1 \right ) \mathclose {}}. Fixing x,y \in A such that { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }x and { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }y, we must show that x=y.
We know that for any { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation R \colon \mathbb {N} \mathbin { \nrightarrow } A we have both { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}x and { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}y.
We choose R to be the union of { \mathopen {} \left \{ { \mathopen {} \left ( m+1,f { \mathopen {} \left ( m,z \right ) \mathclose {}} \right ) \mathclose {}} \, \middle \vert \, m=n \land m \mathrel { \mathbf { \rho }_{ a , f } }{z} \right \} \mathclose {}} with { \mathopen {} \left \{ { \mathopen {} \left ( m,z \right ) \mathclose {}} \, \middle \vert \, m \mathrel { \mathbf { \rho }_{ a , f } }z \land m \not =n+1 \right \} \mathclose {}}, noting that R is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed because \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. As the union defining R is disjoint, for any z \in A we know that { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}z if and only if z=f { \mathopen {} \left ( n,z' \right ) \mathclose {}} for some z' with n \mathrel { \mathbf { \rho }_{ a , f } } z'.
Therefore, we have some x',y' \in A such that x=f { \mathopen {} \left ( n,x' \right ) \mathclose {}} and y=f { \mathopen {} \left ( n,y' \right ) \mathclose {}} and n \mathrel { \mathbf { \rho }_{ a , f } } x' and n \mathrel { \mathbf { \rho }_{ a , f } }y'. By our inductive hypothesis, we know that x'=y', and hence x=y.
To show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \rightharpoonup } A is total, we must show that for every n \in \mathbb {N}, we have \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} { \downarrow }. This is proved by induction on n \in \mathbb {N}:
In the base case, we deduce \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} { \downarrow } from the fact that 0 \mathrel { \mathbf { \rho }_{ a , f } }a, which holds because \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed.
In the inductive step, we assume that \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} { \downarrow } and must show that \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} { \downarrow }. By our assumption, we have n \mathrel { \mathbf { \rho }_{ a , f } } \mathbf { \rho }_{ a , f } { { \mathopen {} \left ( n \right ) \mathclose {}} }; because \mathbf { \rho }_{ a , f } is an { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation, we may conclude { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}}, and hence \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} { \downarrow }.
It remains to show that the function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A is the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function. This too can be proved by induction, as functionality and { \mathopen {} \left ( a,f \right ) \mathclose {}}-closure fully specify the values of \mathbf { \rho }_{ a , f } on all inputs.
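The construction can be illustrated on a finite truncation. The intersection in the theorem ranges over all closed relations, which cannot be enumerated; the sketch below instead assumes that, up to a chosen bound, the intersection coincides with the relation generated from the pair (0, a) by repeatedly applying the closure rule. The name least_closed_relation and the bound parameter are hypothetical.

```python
def least_closed_relation(a, f, bound):
    """Pairs (n, x) with n <= bound forced by (a, f)-closure, and nothing else."""
    rel = {(0, a)}                 # closure rule 1: 0 is related to a
    frontier = {(0, a)}
    while frontier:
        # closure rule 2: n R x implies (n + 1) R f(n, x), truncated at the bound
        new = {(n + 1, f(n, x)) for (n, x) in frontier if n < bound} - rel
        rel |= new
        frontier = new
    return rel

# Example: a = 1 and f(n, x) = (n + 1) * x generate the factorial function.
rel = least_closed_relation(1, lambda n, x: (n + 1) * x, 5)
sources = sorted(n for n, _ in rel)

print(sorted(rel))
```

On this truncation, each n from 0 to the bound occurs exactly once as a source, matching the claims of functionality and totality.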
3675jms-00KXjms-00KX.xmlLecture 16: bijections, equivalence relations, and partitions2024126Jon SterlingMarcelo Fiore
3614jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3628jms-00LDjms-00LD.xmlSections and retractions2024125Jon Sterling3616Definitionjms-00KSjms-00KS.xmlSection-retraction pair2024123Jon SterlingA section-retraction pair is defined to be a pair of functions s \colon B \to A and r \colon A \to B such that r \circ s = \mathsf {id}_{ B }. In this case, s is referred to as a section of r, and r is referred to as a retraction of s.3618Definitionjms-00LAjms-00LA.xmlIdempotent function2024125Jon SterlingA function f \colon A \to A is called idempotent when we have f \circ f = f.3622Lemmajms-00LBjms-00LB.xmlThe idempotent determined by a section-retraction pair2024125Jon SterlingEvery section-retraction pair determines an idempotent. In particular, when s \colon B \to A is a section of r \colon A \to B, the composite s \circ r \colon A \to A is idempotent.
3620Proof#399unstable-399.xml2024125Jon Sterlingjms-00LB
We proceed by calculation:
\begin {aligned} { \mathopen {} \left ( s \circ r \right ) \mathclose {}} \circ { \mathopen {} \left ( s \circ r \right ) \mathclose {}} &= s \circ { \mathopen {} \left ( r \circ s \right ) \mathclose {}} \circ r \\ &= s \circ \mathsf {id}_{ B } \circ r \\ &= s \circ r \end {aligned}
3626Lemmajms-00LCjms-00LC.xmlSplitting idempotents2024125Jon SterlingIt happens that every idempotent arises from a section-retraction pair in the manner of the preceding lemma.
3624Proof#398unstable-398.xml2024125Jon Sterlingjms-00LC
Let f \colon A \to A be an idempotent on a set A. Let B be the subset { \mathopen {} \left \{ f { \mathopen {} \left ( a \right ) \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \subseteq A, sometimes called the direct image of f in A. Then f \colon A \to A factors, by definition, through some unique r \colon A \to B, where r { \mathopen {} \left ( x \right ) \mathclose {}} = f { \mathopen {} \left ( x \right ) \mathclose {}} for all x \in A. We define s \colon B \to A to be the subset inclusion.
We must check that r \circ s = \mathsf {id}_{ B }; in other words, we fix b \in B to check that r { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} = b. By definition of B, we have b=f { \mathopen {} \left ( a \right ) \mathclose {}} for some a \in A, and the subset inclusion s \colon B \to A takes b to the same element s { \mathopen {} \left ( b \right ) \mathclose {}} =b, now viewed as an element of A. Because f is idempotent, we have f { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =f { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} =f { \mathopen {} \left ( a \right ) \mathclose {}} =b. By definition of r \colon A \to B, we therefore have r { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =f { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b. Finally, we have s \circ r = f because s { \mathopen {} \left ( r { \mathopen {} \left ( x \right ) \mathclose {}} \right ) \mathclose {}} =r { \mathopen {} \left ( x \right ) \mathclose {}} =f { \mathopen {} \left ( x \right ) \mathclose {}} for all x \in A.
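The splitting can be carried out concretely for an idempotent on a finite set. This is a minimal sketch, assuming functions are modelled as Python dicts; the name split_idempotent and the example idempotent are illustrative assumptions.

```python
def split_idempotent(f, A):
    """Split an idempotent f (a dict on A) into a retraction r and a section s."""
    assert all(f[f[x]] == f[x] for x in A), "f must be idempotent"
    B = {f[x] for x in A}          # the image of f
    r = {x: f[x] for x in A}       # r : A -> B with r(x) = f(x)
    s = {b: b for b in B}          # s : B -> A, the subset inclusion
    return B, r, s

A = {0, 1, 2, 3, 4, 5}
f = {x: x - (x % 2) for x in A}    # round down to the nearest even number
B, r, s = split_idempotent(f, A)

# r . s is the identity on B, and s . r recovers f.
print(all(r[s[b]] == b for b in B), all(s[r[x]] == f[x] for x in A))
```

Here the image B consists of the even numbers in A, and the retraction r remembers only which even number each element rounds down to.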
3656jms-00KQjms-00KQ.xmlBijections2024123Jon SterlingMarcelo Fiore3632Definitionjms-00KRjms-00KR.xmlBijection2024123Jon SterlingMarcelo FioreA function f \colon A \to B is said to be bijective, or a bijection, whenever there exists a (necessarily unique) function g \colon B \to A (referred to as the inverse of f) that is simultaneously a retraction and a section of f \colon A \to B in the sense that g \circ f= \mathsf {id}_{ A } and f \circ g = \mathsf {id}_{ B }. We will often write f ^{-1} \colon B \to A for the inverse to a bijection f \colon A \to B.
3630Proof#403unstable-403.xml2024123Jon Sterlingjms-00KR
To see that g \colon B \to A is unique, we fix any other g' that is simultaneously a retraction and a section of f and show that g=g':
\begin {aligned} g &= \mathsf {id}_{ A } \circ g \\ &= { \mathopen {} \left ( g' \circ f \right ) \mathclose {}} \circ g \\ &= g' \circ { \mathopen {} \left ( f \circ g \right ) \mathclose {}} \\ &= g' \circ \mathsf {id}_{ B } \\ &= g' \end {aligned}
3640Propositionjms-00KTjms-00KT.xmlCounting bijections between finite sets2024123Jon SterlingMarcelo FioreFor all finite sets A and B, we have: \mathord { \# } { \mathrm {Bij} { \mathopen {} \left ( A , B \right ) \mathclose {}} } = \begin {cases} 0 & \text {if } \mathord { \# } {A} \not = \mathord { \# } {B} \\ n!& \text {if } \mathord { \# } {A}= \mathord { \# } {B}=n \end {cases}
3638Proof#402unstable-402.xml2024123Jon Sterlingjms-00KT
A bijection between two sets is a way of associating precisely one element of the first set with precisely one element of the second set. These associations are embodied in the functions f \colon A \to B and g \colon B \to A, and the uniqueness of the associations is expressed by the condition that g be simultaneously a section and a retraction of f. Two sets of different cardinality cannot be associated in this way, because there would always be left over elements that are not associated to any corresponding element. In this way, we see that a bijection exists between two sets if and only if they have the same cardinality.
Assuming two finite sets A,B with cardinality \mathord { \# } {A}= \mathord { \# } {B}=n, we must now count precisely how many bijections there are from A to B. To construct such a bijection, we put all the elements of B into a pile. Then, for each element of A we must successively choose and then discard an element of B; we have to discard the element because we are not allowed to choose it again (for, if we did, the association would not be bijective). At the beginning (“round k=0”), there are n many choices we could make; in the next round k=1, there are n-1 many choices that we could make; in the kth round, there are n-k many choices that we could make. At the very end (after the (n-1)th round), we shall have constructed a bijection by making n successive choices, and the number of ways to make them is n! = \prod _{0 \leq k<n} { \mathopen {} \left ( n-k \right ) \mathclose {}}. Thus the number of distinct bijections from A to B is n!.
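The choose-and-discard procedure amounts to listing the elements of B in some order without repetition. A brute-force check along these lines, assuming bijections are modelled as Python dicts and that the inputs contain no repeated elements (the name all_bijections is hypothetical):

```python
from itertools import permutations
from math import factorial

def all_bijections(A, B):
    """Yield every bijection from A to B as a dict; none if the sizes differ."""
    A, B = list(A), list(B)
    if len(A) != len(B):
        return
    # Each ordering of B corresponds to one round-by-round sequence of choices.
    for image in permutations(B):
        yield dict(zip(A, image))

A = ["a", "b", "c", "d"]
B = [0, 1, 2, 3]
count = len(list(all_bijections(A, B)))

print(count, factorial(len(A)))
```

When the two sets have different sizes, the generator yields nothing, in agreement with the first case of the proposition.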
3645Theoremjms-00KZjms-00KZ.xmlClosure of bijections under identity and composition2024123Jon SterlingMarcelo FioreThe identity function is a bijection, and the composition of bijections yields a bijection.
3643Proof#401unstable-401.xml2024123Jon Sterlingjms-00KZ
The inverse to the identity function \mathsf {id}_{ A } \colon A \to A is simply itself.
Fix bijections f \colon A \to B and g \colon B \to C, so that we have inverses f ^{-1} \colon B \to A and g ^{-1} \colon C \to B. The inverse to the composite function g \circ f \colon A \to C is the composite f ^{-1} \circ g ^{-1} \colon C \to A. The section/retraction conditions follow from those for f and g using the associative and unit laws of composition.
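The formula for the inverse of a composite can be checked on a small example. The sketch assumes bijections between finite sets are modelled as Python dicts; inverse and compose are illustrative helper names.

```python
def inverse(f):
    """Inverse of a bijection given as a dict; fails if f is not injective."""
    inv = {b: a for a, b in f.items()}
    assert len(inv) == len(f), "f is not injective"
    return inv

def compose(g, f):
    """Composite g . f of functions given as dicts: (g . f)(a) = g(f(a))."""
    return {a: g[b] for a, b in f.items()}

f = {1: "x", 2: "y", 3: "z"}       # a bijection A -> B
g = {"x": 10, "y": 20, "z": 30}    # a bijection B -> C

lhs = inverse(compose(g, f))              # the inverse of g . f
rhs = compose(inverse(f), inverse(g))     # f^{-1} . g^{-1}

print(lhs == rhs)
```

Note the reversal of order: undoing g . f means undoing g first and f second.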
3648Definitionjms-00L0jms-00L0.xmlIsomorphic sets2024123Jon SterlingMarcelo FioreTwo sets A and B are said to be isomorphic (and have the same cardinality) whenever there is a bijection between them. In this case, we write A \cong B or \mathord { \# } {A}= \mathord { \# } {B}3651Examplejms-00L1jms-00L1.xmlExamples of isomorphic sets2024123Jon SterlingMarcelo FioreA bijection between finite sets is just a “relabeling” of its elements: for example, we have { \mathopen {} \left \{ 0,1 \right \} \mathclose {}} \cong { \mathopen {} \left \{ \mathsf {false} , \mathsf {true} \right \} \mathclose {}}. Isomorphism of infinite sets can be more confusing: for example, we have \mathbb {N} \cong { \mathopen {} \left \{ n: \mathbb {N} \, \middle \vert \, n>0 \right \} \mathclose {}}, \mathbb {N} \cong \mathbb {Z}, \mathbb {N} \cong \mathbb {N} \times \mathbb {N}, \mathbb {N} \cong \mathbb {Q}, and even { \mathopen {} \left ( \mathbb {N} \to \mathbb {N} \right ) \mathclose {}} \cong \mathbb {R}.3654Remarkjms-00L2jms-00L2.xmlWhich bijection?2024123Jon SterlingAlthough we are speaking of “being isomorphic” as the property of there being a bijection between them, in practice, it is almost never useful to know that there exists some undetermined bijection between given sets; it is always important to know which bijection. 
Indeed, we have seen that there can be many distinct bijections between two sets.3672jms-00L3jms-00L3.xmlEquivalence relations and set partitions2024123Jon SterlingMarcelo Fiore3659Definitionjms-00L4jms-00L4.xmlEquivalence relation2024123Jon SterlingMarcelo FioreA relation E \colon A \mathbin { \nrightarrow } A is called an equivalence relation when it is reflexive, transitive, and symmetric: \forall x \in A \mathpunct {.} x \mathrel {E}x; \forall x,y,z \in A \mathpunct {.} x \mathrel {E} y \land y \mathrel {E} z \implies x \mathrel {E} z; and \forall x,y \in A \mathpunct {.} x \mathrel {E} y \implies y \mathrel {E}x. We will write \mathrm {EqRel} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of all equivalence relations on A.3662Definitionjms-00L6jms-00L6.xmlEquivalence class2024123Jon SterlingLet E \colon A \mathbin { \nrightarrow } A be an equivalence relation on a set A. The equivalence class of a given element a \in A in E is the subset { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \subseteq A spanned by elements related to a in E, i.e. { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } = { \mathopen {} \left \{ x \in A \, \middle \vert \, x \mathrel {E}a \right \} \mathclose {}}.You may hear some people say “an equivalence” instead of “an equivalence relation”. This usage is wrong and leads to extremely confused thinking: you must not repeat it!3664Definitionjms-00L5jms-00L5.xmlSet partition2024123Jon SterlingMarcelo FioreA partition of a set A is a set P \subseteq \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} of non-empty subsets of A whose elements are referred to as blocks, satisfying the following conditions:the union of all blocks is all of A, i.e. \bigcup P = A;
the blocks are pairwise disjoint, i.e. for all b_1 \not = b_2 \in P we have b_1 \cap b_2 = \varnothing.We will write \mathrm {Part} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of partitions of A.3670Theoremjms-00L7jms-00L7.xmlBijection between equivalence relations and partitions2024123Jon SterlingFor every set A, we can define a bijection \Phi \colon \mathrm {EqRel} { \mathopen {} \left ( A \right ) \mathclose {}} \to \mathrm {Part} { \mathopen {} \left ( A \right ) \mathclose {}} sending every equivalence relation E on A to the partition \Phi { \mathopen {} \left ( E \right ) \mathclose {}} = { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \, \middle \vert \, a \in A \right \} \mathclose {}} whose blocks consist of the equivalence classes of each element of A.
3668Proof#400unstable-400.xml2024123Jon Sterlingjms-00L7
We must first argue that { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \, \middle \vert \, a \in A \right \} \mathclose {}} in fact constitutes a partition. Every a \in A lies in its own class { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } by reflexivity, so each equivalence class is non-empty and the union of all equivalence classes is indeed all of A. Moreover, any two distinct equivalence classes have empty intersection: if two classes shared an element, then symmetry and transitivity would force them to be identical.
The inverse \Phi ^{-1} sends a partition P to the following equivalence relation:
x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} } y \Longleftrightarrow \exists b \in P \mathpunct {.} x \in b \land y \in b
To check that \Phi ^{-1} is a section of \Phi, we fix a partition P to check that \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} = P.
\begin {aligned} \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} &= { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} } \, \middle \vert \, a \in A \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ { \mathopen {} \left \{ x \in A \, \middle \vert \, x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} } a \right \} \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ { \mathopen {} \left \{ x \in A \, \middle \vert \, \exists b \in P \mathpunct {.} x \in b \land a \in b \right \} \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \end {aligned}
As P is a partition, there is exactly one block containing a given element a \in A — no more than one by disjointness, and no fewer than one because the union of a partition must be the entire set. Therefore, the set { \mathopen {} \left \{ x \in A \, \middle \vert \, \exists b \in P \mathpunct {.} x \in b \land a \in b \right \} \mathclose {}} is precisely the unique block of P containing a. Thus we have \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} = P.
Conversely, we must show that \Phi ^{-1} is a retraction of \Phi. Fixing an equivalence relation E \colon A \mathbin { \nrightarrow } A, we must check that \Phi ^{-1} { \mathopen {} \left ( \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \right ) \mathclose {}} = E. Fixing x,y \in A, we proceed using relational extensionality:
\begin {aligned} x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \right ) \mathclose {}} \right ) \mathclose {}} }y & \Longleftrightarrow \exists b \in \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \mathpunct {.} x \in b \land y \in b \\ & \Longleftrightarrow \exists a \in A \mathpunct {.} x \in { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \land y \in { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \\ & \Longleftrightarrow \exists a \in A \mathpunct {.} x \mathrel {E}a \land y \mathrel {E}a \\ & \Longleftrightarrow x \mathrel {E} y \end {aligned}
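The two mappings can be tried out on a small finite set. The following is only a sketch under stated assumptions: relations are modelled as Python sets of pairs and partitions as sets of `frozenset` blocks, and the names `phi` and `phi_inv` are illustrative stand-ins for \Phi and \Phi ^{-1}, not part of the notes.

```python
def phi(A, E):
    """Send an equivalence relation E on A (a set of pairs) to its set of classes."""
    # The class of a is {x in A | x E a}, as in the computation above.
    return {frozenset(x for x in A if (x, a) in E) for a in A}


def phi_inv(A, P):
    """Send a partition P of A to the relation 'x and y lie in a common block'."""
    return {(x, y) for x in A for y in A if any(x in b and y in b for b in P)}


# A hypothetical example: a four-element set with three blocks.
A = {0, 1, 2, 3}
P = {frozenset({0, 1}), frozenset({2}), frozenset({3})}
E = phi_inv(A, P)
```

Comparing `phi(A, phi_inv(A, P))` with `P`, and `phi_inv(A, phi(A, E))` with `E`, checks the section and retraction equations on this one example; it is of course no substitute for the proof.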
3733jms-00LHjms-00LH.xmlLecture 17: bijections, indicators, finite cardinality, infinity axiom2024129Jon SterlingMarcelo Fiore
3678jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3687jms-00LIjms-00LI.xmlCalculus of bijections, I2024125Jon SterlingMarcelo Fiore3680Lemmajms-00LJjms-00LJ.xmlProperties of isomorphism2024125Jon SterlingThe concept of isomorphism between two sets satisfies laws like those of an equivalence relation. In particular:reflexivity: A \cong A;
transitivity: A \cong B \land B \cong C \implies A \cong C;
symmetry: A \cong B \implies B \cong A.3684Examplejms-00LKjms-00LK.xmlInvariance under isomorphism2024125Jon SterlingMarcelo FioreFor all sets A,B,X,Y, we have A \cong X \land B \cong Y \implies \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} \cong \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}}.
3682Proof#397unstable-397.xml2024125Jon Sterlingjms-00LK
Fix bijections f \colon A \to X and g \colon B \to Y. We will define a bijection H \colon \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}}.
Given R \colon A \mathbin { \nrightarrow } B, we define H { \mathopen {} \left ( R \right ) \mathclose {}} \colon X \mathbin { \nrightarrow } Y to be the relational composite g \circ R \circ f ^{-1}. The inverse mapping H ^{-1} \colon \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} is defined to take S \colon X \mathbin { \nrightarrow } Y to the relational composite g ^{-1} \circ S \circ f.
Next, we check that H ^{-1} is simultaneously a retraction and a section of H.
\begin {aligned} H ^{-1} { \mathopen {} \left ( H { \mathopen {} \left ( R \right ) \mathclose {}} \right ) \mathclose {}} &= g ^{-1} \circ H { \mathopen {} \left ( R \right ) \mathclose {}} \circ f \\ &= g ^{-1} \circ { \mathopen {} \left ( g \circ R \circ f ^{-1} \right ) \mathclose {}} \circ f \\ &= { \mathopen {} \left ( g ^{-1} \circ g \right ) \mathclose {}} \circ R \circ { \mathopen {} \left ( f ^{-1} \circ f \right ) \mathclose {}} \\ &= \mathsf {id}_{ B } \circ R \circ \mathsf {id}_{ A } \\ &= R \\ H { \mathopen {} \left ( H ^{-1} { \mathopen {} \left ( S \right ) \mathclose {}} \right ) \mathclose {}} &= g \circ H ^{-1} { \mathopen {} \left ( S \right ) \mathclose {}} \circ f ^{-1} \\ &= g \circ { \mathopen {} \left ( g ^{-1} \circ S \circ f \right ) \mathclose {}} \circ f ^{-1} \\ &= { \mathopen {} \left ( g \circ g ^{-1} \right ) \mathclose {}} \circ S \circ { \mathopen {} \left ( f \circ f ^{-1} \right ) \mathclose {}} \\ &= \mathsf {id}_{ Y } \circ S \circ \mathsf {id}_{ X } \\ &= S \end {aligned}
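For finite sets, the construction of H and H ^{-1} can be sketched concretely. The encoding here is an assumption made for illustration: a relation is a Python set of pairs, a bijection is a dictionary, and the helper names `transport`, `transport_back`, and `inverse` are hypothetical.

```python
def inverse(f):
    """Invert a bijection given as a dictionary."""
    return {v: k for k, v in f.items()}


def transport(R, f, g):
    """H(R) = g . R . f^{-1}: relate f(a) to g(b) whenever a R b."""
    return {(f[a], g[b]) for (a, b) in R}


def transport_back(S, f, g):
    """H^{-1}(S) = g^{-1} . S . f: relate a to b whenever f(a) S g(b)."""
    f_inv, g_inv = inverse(f), inverse(g)
    return {(f_inv[x], g_inv[y]) for (x, y) in S}


# Hypothetical bijections f : A -> X and g : B -> Y, and a relation R from A to B.
f = {0: "x", 1: "y"}
g = {"p": 10, "q": 20}
R = {(0, "p"), (1, "p"), (1, "q")}
```

Transporting `R` forward and then back should return `R` unchanged, mirroring the first half of the calculation above.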
3706jms-00LLjms-00LL.xmlIndicator functions and comprehension2024125Jon SterlingMarcelo Fiore3690Definitionjms-00LOjms-00LO.xmlPredicate2024125Jon SterlingA predicate on a set A is defined to be a function \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} from A to the two-element set { \mathopen {} \left [ 2 \right ] \mathclose {}} = { \mathopen {} \left \{ 0,1 \right \} \mathclose {}}. We say that a predicate \phi holds of a given element a \in A when \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1.3692Definitionjms-00LMjms-00LM.xmlThe indicator function of a subset2024125Jon SterlingMarcelo FioreThe indicator function or characteristic function of S \subseteq A is defined to be the predicate \chi _S \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} defined piecewise below: \chi _S { \mathopen {} \left ( a \right ) \mathclose {}} = \begin {cases} 1 & \text {if } a \in S \\ 0 & \text {if } a \not \in S \end {cases} 3695Definitionjms-00LPjms-00LP.xmlThe comprehension of a predicate2024125Jon SterlingLet \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} be a predicate on a set A. 
The comprehension of \phi is defined to be the following subset { \mathopen {} \left [ \phi \right ] \mathclose {}} \subseteq A spanned by elements at which \phi holds, as specified below: { \mathopen {} \left [ \phi \right ] \mathclose {}} = { \mathopen {} \left \{ a \in A \, \middle \vert \, \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \right \} \mathclose {}} 3699Theoremjms-00LNjms-00LN.xmlUniversal property of indicator functions2024125Jon SterlingMarcelo FioreThe mappings \chi _{ { \mathopen {} \left ( - \right ) \mathclose {}} } \colon \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} \to { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} and { \mathopen {} \left [ - \right ] \mathclose {}} \colon { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} \to \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} given by indicator functions and comprehension are mutually inverse. Thus we have \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} \cong { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}}.
3697Proof#396unstable-396.xml2024125Jon Sterlingjms-00LN
Fixing S \subseteq A, we compute:
\begin {aligned} { \mathopen {} \left [ \chi _{S} \right ] \mathclose {}} &= { \mathopen {} \left \{ a \in A \, \middle \vert \, \chi _S { \mathopen {} \left ( a \right ) \mathclose {}} =1 \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ a \in A \, \middle \vert \, a \in S \right \} \mathclose {}} \\ &= S \end {aligned}
Conversely, we fix \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} and compute:
\begin {aligned} \chi _{ { \mathopen {} \left [ \phi \right ] \mathclose {}} } { \mathopen {} \left ( a \right ) \mathclose {}} &= \begin {cases} 1 & \text {if } a \in { \mathopen {} \left [ \phi \right ] \mathclose {}} \\ 0 & \text {if } a \not \in { \mathopen {} \left [ \phi \right ] \mathclose {}} \end {cases} \\ &= \begin {cases} 1 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \\ 0 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} \not =1 \end {cases} \\ &= \begin {cases} 1 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \\ 0 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =0 \end {cases} \\ &= \phi { \mathopen {} \left ( a \right ) \mathclose {}} \end {aligned}
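The two round trips in this proof can be mirrored on a finite set. This is a minimal sketch, assuming a predicate is represented as a dictionary from elements to 0 or 1; the function names `indicator` and `comprehension` are illustrative.

```python
def indicator(A, S):
    """The indicator function of S as a dictionary A -> {0, 1}."""
    return {a: (1 if a in S else 0) for a in A}


def comprehension(A, phi):
    """The subset of A spanned by elements at which the predicate phi holds."""
    return {a for a in A if phi[a] == 1}


# Hypothetical sample data.
A = {0, 1, 2}
S = {0, 2}
phi = {0: 0, 1: 1, 2: 1}
```

Here `comprehension(A, indicator(A, S))` recovers `S`, and `indicator(A, comprehension(A, phi))` recovers `phi`.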
3704Examplejms-00LQjms-00LQ.xmlAn identity involving indicator functions2024125Jon SterlingFor any set X, we have \mathcal {P} { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \cong \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}}, where + denotes disjoint union.
3702Proof#395unstable-395.xml2024125Jon Sterlingjms-00LQ
We first note that \mathcal {P} { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \cong { \mathopen {} \left ( { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} by the universal property of indicator functions. A map out of a disjoint union is given by one map for each side, so this is isomorphic to { \mathopen {} \left ( X \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} \times { \mathopen {} \left ( { \mathopen {} \left [ 1 \right ] \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}}. Applying the universal property of indicator functions again in the other direction, and noting that a function { \mathopen {} \left [ 1 \right ] \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} is determined by a single element of { \mathopen {} \left [ 2 \right ] \mathclose {}}, this is isomorphic to \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 2 \right ] \mathclose {}}. We finally compute:
\begin {aligned} \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 2 \right ] \mathclose {}} &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left ( { \mathopen {} \left [ 1 \right ] \mathclose {}} + { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \\ &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 1 \right ] \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 1 \right ] \mathclose {}} \\ &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \end {aligned}
3730jms-00LWjms-00LW.xmlFinite and infinite sets2024128Jon SterlingMarcelo FioreNow that we have learned about bijections, we are ready to replace our intuitive/informal understanding of finite sets and finite cardinality with a formal one.3710Definitionjms-00LXjms-00LX.xmlFinite set2024128Jon SterlingMarcelo FioreA set A is said to be finite or have finite cardinality when it is isomorphic to a set of the form { \mathopen {} \left [ n \right ] \mathclose {}} for some n \in \mathbb {N}; in this case, we therefore have \mathord { \# } {A}=n. (Recall that { \mathopen {} \left [ n \right ] \mathclose {}} is the standard n-element set { \mathopen {} \left \{ i \in \mathbb {N} \, \middle \vert \, 0 \leq i < n \right \} \mathclose {}}.)There are many identities that relate set-theoretic operations on finite sets to arithmetic operations on their cardinalities. We will illustrate a few of them here.3717Examplejms-00LYjms-00LY.xmlCardinality of the cartesian product2024128Jon SterlingFor all m,n \in \mathbb {N}, we have { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} \cong { \mathopen {} \left [ m \cdot n \right ] \mathclose {}}.We can prove this from a combinatoric point of view.
3713Proof#393unstable-393.xml2024128Jon Sterlingjms-00LY
To choose a pair { \mathopen {} \left ( i,j \right ) \mathclose {}} \in { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}}, we first choose an element of { \mathopen {} \left [ m \right ] \mathclose {}} and we second choose an element of { \mathopen {} \left [ n \right ] \mathclose {}}. There are m possibilities for the first choice and n possibilities for the second choice; as these choices are independent, we simply have m \cdot n choices in total.
We can also prove this isomorphism more formally by means of a specific bijection.
3715Proof#394unstable-394.xml2024128Jon Sterlingjms-00LY
We can think of the cartesian product { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} as the set of cells in a table or matrix with m columns and n rows; an element { \mathopen {} \left ( i,j \right ) \mathclose {}} with i \in { \mathopen {} \left [ m \right ] \mathclose {}} and j \in { \mathopen {} \left [ n \right ] \mathclose {}} is then the coordinate of a given cell. Naturally, the total number of cells would be m \cdot n, but we can prove this formally by constructing the function that takes a coordinate { \mathopen {} \left ( i,j \right ) \mathclose {}} to its absolute index in the counting of cells from left to right and top to bottom:
\begin {aligned} I & \colon { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m \cdot n \right ] \mathclose {}} \\ I { \mathopen {} \left ( i,j \right ) \mathclose {}} &= m \cdot j + i \end {aligned}
We do need to check that I { \mathopen {} \left ( i,j \right ) \mathclose {}} < m \cdot n for all i<m and j<n. The maximum value of I is naturally I { \mathopen {} \left ( m-1,n-1 \right ) \mathclose {}}: \begin {aligned} I { \mathopen {} \left ( m-1,n-1 \right ) \mathclose {}} &= m \cdot { \mathopen {} \left ( n-1 \right ) \mathclose {}} + { \mathopen {} \left ( m-1 \right ) \mathclose {}} \\ &= m \cdot n - m + m - 1 \\ &= m \cdot n - 1 \\ &< m \cdot n \end {aligned} The inverse function I ^{-1} \colon { \mathopen {} \left [ m \cdot n \right ] \mathclose {}} \to { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} takes i<m \cdot n to the pair { \mathopen {} \left ( \operatorname {rem} { \mathopen {} \left ( i,m \right ) \mathclose {}} , \operatorname {quo} { \mathopen {} \left ( i,m \right ) \mathclose {}} \right ) \mathclose {}}.3721Examplejms-00LZjms-00LZ.xmlCardinality of the disjoint sum2024128Jon SterlingMarcelo FioreFor all m,n \in \mathbb {N}, we have { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \cong { \mathopen {} \left [ m+n \right ] \mathclose {}}.
3719Proof#392unstable-392.xml2024128Jon Sterlingjms-00LZ
We can think of { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} as the set of coordinates into two strips of cells, the first of which has m cells and the second of which has n cells. Conversely, we can think of { \mathopen {} \left [ m+n \right ] \mathclose {}} as the set of coordinates into a single strip of m+n cells. A function I \colon { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m+n \right ] \mathclose {}} could then be thought of as witnessing a method to lay out the two strips in serial; we will define such a function and show that it is a bijection.
\begin {aligned} I & \colon { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m+n \right ] \mathclose {}} \\ I { \mathopen {} \left ( 0,i<m \right ) \mathclose {}} &= i \\ I { \mathopen {} \left ( 1,j<n \right ) \mathclose {}} &= m+j \end {aligned}
The function I keeps a cell from the first strip at its original position, and places a cell from the second strip sequentially after the first m cells of the serialised strip. The inverse can be defined as follows:
I ^{-1} { \mathopen {} \left ( k \right ) \mathclose {}} = \begin {cases} { \mathopen {} \left ( 0,k \right ) \mathclose {}} & \text {when } k < m \\ { \mathopen {} \left ( 1,k-m \right ) \mathclose {}} & \text {when } k \geq m \end {cases}
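Both indexing bijections, for the cartesian product and for the disjoint sum, are easy to check by machine for small m and n. The sketch below is an illustration only: the function names are hypothetical, an element of the disjoint sum is modelled as a pair of a tag (0 or 1) and an index, and the product inverse assumes m > 0.

```python
def index_product(m, i, j):
    """I(i, j) = m * j + i for a cell in column i < m and row j < n."""
    return m * j + i


def coord_product(m, k):
    """Inverse of index_product: (rem(k, m), quo(k, m)); assumes m > 0."""
    return (k % m, k // m)


def index_sum(m, tag, i):
    """Lay the two strips out in serial: the second strip starts at offset m."""
    return i if tag == 0 else m + i


def coord_sum(m, k):
    """Inverse of index_sum."""
    return (0, k) if k < m else (1, k - m)


# Hypothetical sizes for a quick check.
m, n = 3, 4
product_indices = sorted(index_product(m, i, j) for i in range(m) for j in range(n))
sum_indices = [index_sum(m, 0, i) for i in range(m)] + [index_sum(m, 1, j) for j in range(n)]
```

With these sizes, `product_indices` enumerates every index below m \cdot n exactly once and `sum_indices` enumerates every index below m+n exactly once, which is what bijectivity requires.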
3724Definitionjms-00M1jms-00M1.xmlInfinite set2024128Jon SterlingMarcelo FioreA set A is called infinite when it is not finite in the sense of .The basic axioms of set theory that we have learned in this course so far do not in fact guarantee the existence of any infinite sets. For this reason, one usually adds the axiom of infinity below to set theory.3727Axiomjms-00M0jms-00M0.xmlThe axiom of infinity2024128Jon SterlingMarcelo FioreThe natural numbers form a set.3830jms-00M6jms-00M6.xmlLecture 18: surjections, injections, choice, and enumerability2024131Jon SterlingMarcelo Fiore
3740Theoremjms-00M7jms-00M7.xmlLogical characterisation of bijections2024129Jon SterlingMarcelo FioreFor a function f \colon A \to B, the following are equivalent:the function f \colon A \to B is bijective;
we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b;
we have both \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b and \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2.
3738Proof#391unstable-391.xml2024129Jon Sterlingjms-00M7
We first prove that f \colon A \to B is bijective if and only if we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b.
Indeed, suppose that f is a bijection and fix b \in B to check that there exists some unique a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} =b. We let a = f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} and we do indeed have f { \mathopen {} \left ( a \right ) \mathclose {}} =f { \mathopen {} \left ( f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b; to see that a is unique with this property, we fix a_1,a_2 \in A such that f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =b and f { \mathopen {} \left ( a_2 \right ) \mathclose {}} =b. We have a_1=f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a_1 \right ) \mathclose {}} \right ) \mathclose {}} =f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} =f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \right ) \mathclose {}} =a_2.
Conversely, suppose that \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b; then we may define an inverse function f ^{-1} \colon B \to A sending b \in B to the unique element a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} =b. We therefore have f { \mathopen {} \left ( f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b by definition and f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} is defined to be the unique element a' \in A such that f { \mathopen {} \left ( a' \right ) \mathclose {}} = f { \mathopen {} \left ( a \right ) \mathclose {}}; as a' is assumed unique with this property and we clearly have f { \mathopen {} \left ( a \right ) \mathclose {}} =f { \mathopen {} \left ( a \right ) \mathclose {}}, we may conclude that a'=a and so f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} =a.
Next, we show that we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b if and only if we have both \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b and \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2. In fact, this holds immediately by unfolding the definition of unique existence and the distribution of universal quantification over conjunction.
3772jms-00MFjms-00MF.xmlSurjective functions2024129Jon SterlingMarcelo Fiore3743Definitionjms-00M8jms-00M8.xmlSurjection2024129Jon SterlingMarcelo FioreA function f \colon A \to B is said to be surjective or a surjection when we have \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b. A function satisfying this condition is written f \colon A \twoheadrightarrow B.3748Corollaryjms-00M9jms-00M9.xmlBijections are surjective2024129Jon SterlingMarcelo FioreAny bijection f \colon A \to B is surjective.
3746Proof#390unstable-390.xml2024129Jon Sterlingjms-00M9
By the logical characterisation of bijections.
3753Examplejms-00MAjms-00MA.xmlNon-empty sets and surjections2024129Jon SterlingMarcelo FioreThe unique function !_A \colon A \to { \mathopen {} \left [ 1 \right ] \mathclose {}} is surjective if and only if A \not = \varnothing.
3751Proof#389unstable-389.xml2024129Jon Sterlingjms-00MA
Suppose that A is nonempty, so that we have some a \in A. We must show that for every i \in { \mathopen {} \left [ 1 \right ] \mathclose {}}, there exists some x \in A such that !_A { \mathopen {} \left ( x \right ) \mathclose {}} =i. Letting i=0 be the unique element of { \mathopen {} \left [ 1 \right ] \mathclose {}}, we may set x:= a.
Conversely, if !_A \colon A \to { \mathopen {} \left [ 1 \right ] \mathclose {}} is surjective, we know that there exists some a \in A such that !_A { \mathopen {} \left ( a \right ) \mathclose {}} =0; therefore, A is nonempty.
3758Examplejms-00MBjms-00MB.xmlQuotients and surjections2024129Jon SterlingMarcelo FioreLet E be an equivalence relation on a set A, and let q \colon A \to A/E be the quotient function that sends a \in A to its equivalence class { \mathopen {} \left [ a \right ] \mathclose {}} _{ E }. Then q \colon A \to A/E is surjective.
3756Proof#388unstable-388.xml2024129Jon Sterlingjms-00MB
We must show that for every equivalence class u \in A/E there exists an element a \in A such that q { \mathopen {} \left ( a \right ) \mathclose {}} =u. By definition u= { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } =q { \mathopen {} \left ( a \right ) \mathclose {}} for some a \in A.
3764Examplejms-00MCjms-00MC.xmlProjection functions and surjections2024129Jon SterlingMarcelo FioreThe projection function \pi _1 \colon A \times B \to A sending { \mathopen {} \left ( a,b \right ) \mathclose {}} to a is surjective if and only if either B \not = \varnothing or A= \varnothing.
3762Proof#387unstable-387.xml2024129Jon Sterlingjms-00MC
Suppose that \pi _1 \colon A \times B \to A is surjective. We proceed by cases on whether A is empty:
If A is empty, we are done.
If A is inhabited by some a \in A, then by assumption there exists u \in A \times B such that \pi _1 { \mathopen {} \left ( u \right ) \mathclose {}} =a. Thus B is inhabited by \pi _2 { \mathopen {} \left ( u \right ) \mathclose {}}.
Conversely, suppose that either B \not = \varnothing or A= \varnothing. Fixing a \in A, we must show that there exists u \in A \times B such that \pi _1 { \mathopen {} \left ( u \right ) \mathclose {}} =a. If A= \varnothing, then we have a contradiction already; on the other hand, if there exists any b \in B, we may define u:= { \mathopen {} \left ( a,b \right ) \mathclose {}} and we are done.
3769Theoremjms-00MDjms-00MD.xmlClosure of surjections under identity and composition2024129Jon SterlingMarcelo FioreThe identity function is a surjection and the composition of surjections yields a surjection.
3767Proof#386unstable-386.xml2024129Jon Sterlingjms-00MD
For the identity function on a set A, we must check that for all a \in A, there exists an element a' \in A such that a'=a. Of course, we choose a' := a.
Let f \colon A \twoheadrightarrow B and g \colon B \twoheadrightarrow C be two surjections. To show that g \circ f \colon A \to C is a surjection, we fix c \in C to exhibit some a \in A such that g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} = c. Because g \colon B \twoheadrightarrow C is surjective, there exists b \in B such that g { \mathopen {} \left ( b \right ) \mathclose {}} = c. Because f \colon A \twoheadrightarrow B is surjective, there exists a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} = b. Thus we have g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} =g { \mathopen {} \left ( b \right ) \mathclose {}} =c.
3798jms-00MGjms-00MG.xmlEnumerability and countability2024129Jon SterlingMarcelo Fiore3775Definitionjms-00MEjms-00ME.xmlEnumerable set2024129Jon SterlingMarcelo FioreA set A is said to be enumerable whenever there exists a surjection \mathbb {N} \twoheadrightarrow A, referred to as an enumeration.In an enumeration e \colon \mathbb {N} \twoheadrightarrow A of a set A, we think of n \in \mathbb {N} as a “code” for a \in A when e { \mathopen {} \left ( n \right ) \mathclose {}} =a; these “codes” are not unique unless e is a bijection. By virtue of the viewpoint of quotients as surjections developed in , enumerability expresses when a given set can be built up from the natural numbers by quotienting (i.e. identifying codes that are taken to the same element by the enumeration).3778Definitionjms-00MHjms-00MH.xmlCountable set2024129Jon SterlingMarcelo FioreA countable set is one that is either empty or enumerable.3783Lemmajms-00MIjms-00MI.xmlAlternative definition of countability2024129Jon SterlingA set A is countable if and only if the disjoint union A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} is enumerable.
3781Proof#385unstable-385.xml2024129Jon Sterlingjms-00MI
Suppose that A is countable. We must find a surjection \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}}; if A is empty, this is the same as to find a surjection \mathbb {N} \to { \mathopen {} \left [ 1 \right ] \mathclose {}}, which we have by the example on non-empty sets and surjections, because \mathbb {N} is non-empty. On the other hand, if we have an enumeration e \colon \mathbb {N} \twoheadrightarrow {A}, we may define an enumeration e' \colon \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} as follows:
\begin {aligned} e' { \mathopen {} \left ( 0 \right ) \mathclose {}} &= 0 \\ e' { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= e { \mathopen {} \left ( n \right ) \mathclose {}} \end {aligned}
Conversely, suppose that A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} is enumerable by e \colon \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}}. If A is empty, we are done. If to the contrary A is inhabited by some a_0, we define e' \colon \mathbb {N} \twoheadrightarrow A as follows:
e' { \mathopen {} \left ( i \right ) \mathclose {}} = \begin {cases} e { \mathopen {} \left ( i \right ) \mathclose {}} & \text {when } e { \mathopen {} \left ( i \right ) \mathclose {}} \in A \\ a_0 & \text {otherwise} \end {cases}
3785Examplejms-00MJjms-00MJ.xmlA bijective enumeration of the integers2024129Jon SterlingMarcelo FioreWe can define a bijective enumeration of the integers by “zigzagging” between the positive and the negative following the pattern { \mathopen {} \left ( 0, -1, 1,-2,2,-3,3, \ldots \right ) \mathclose {}}: \begin {aligned} e& \colon \mathbb {N} \twoheadrightarrow \mathbb {Z} \\ e { \mathopen {} \left ( i \right ) \mathclose {}} &= \begin {cases} 0 & \text {when } i=0 \\ i \div 2 & \text {when } i > 0 \land 2 \mid i \\ - { \mathopen {} \left ( i+1 \right ) \mathclose {}} \div 2 & \text {when } i > 0 \land \lnot { \mathopen {} \left ( 2 \mid i \right ) \mathclose {}} \end {cases} \end {aligned} 3790Lemmajms-00MKjms-00MK.xmlNon-empty subsets of enumerable sets are enumerable2024129Jon SterlingMarcelo FioreAny non-empty subset of an enumerable set is enumerable.
3788Proof#384unstable-384.xml2024129Jon Sterlingjms-00MK
We will use the same method as in our proof of the alternative definition of countability.
Let S \subseteq A be a non-empty subset of an enumerable set A, so that we have some s \in S and a surjection e \colon \mathbb {N} \twoheadrightarrow A. We define e' \colon \mathbb {N} \twoheadrightarrow {S} to send i \in \mathbb {N} to e { \mathopen {} \left ( i \right ) \mathclose {}} if e { \mathopen {} \left ( i \right ) \mathclose {}} \in S and to s otherwise.
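The fallback construction in this proof can be tried on the zigzag enumeration of the integers from the earlier example. The sketch below is illustrative only: the names `zigzag` and `restrict` are hypothetical, the subset S is taken to be the even integers with fallback element s = 0, and a program can only inspect a finite prefix of an enumeration, so it does not establish surjectivity.

```python
def zigzag(i):
    """The enumeration e : N ->> Z following the pattern 0, -1, 1, -2, 2, ..."""
    if i == 0:
        return 0
    if i % 2 == 0:
        return i // 2
    return -((i + 1) // 2)


def restrict(e, in_S, s):
    """Build e' sending i to e(i) if e(i) lies in S, and to the fallback s otherwise."""
    return lambda i: e(i) if in_S(e(i)) else s


# Assumed example: S is the set of even integers, with fallback element s = 0.
evens = restrict(zigzag, lambda z: z % 2 == 0, 0)
prefix = [evens(i) for i in range(7)]
```

Every value produced by `evens` lies in S, and each even integer is still reached at the same code at which `zigzag` reaches it.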
3795Lemmajms-00MLjms-00ML.xmlCartesian product of countable sets2024129Jon SterlingMarcelo FioreThe cartesian product of countable sets is countable.
3793Proof#383unstable-383.xml2024129Jon Sterlingjms-00ML
Let A and B be countable sets. If either A or B is empty, then A \times B is empty and thus countable. Otherwise, let e_A \colon \mathbb {N} \twoheadrightarrow A and e_B \colon \mathbb {N} \twoheadrightarrow B be enumerations of A and B respectively. Then the function e_A \times e_B \colon \mathbb {N} \times \mathbb {N} \to A \times B can be seen to be surjective; letting f \colon \mathbb {N} \twoheadrightarrow \mathbb {N} \times \mathbb {N} be any enumeration of \mathbb {N} \times \mathbb {N}, we can compose to obtain a surjection { \mathopen {} \left ( e_A \times e_B \right ) \mathclose {}} \circ f \colon \mathbb {N} \twoheadrightarrow A \times B by the closure of surjections under composition.
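The proof allows any enumeration of \mathbb {N} \times \mathbb {N}. One possible choice, which is an assumption of this sketch and not something fixed by the notes, is the inverse of the Cantor pairing function, which walks through the anti-diagonals of the grid. The names `unpair` and `product_enumeration` are hypothetical.

```python
from math import isqrt


def unpair(k):
    """Inverse Cantor pairing: a bijective enumeration f : N -> N x N."""
    w = (isqrt(8 * k + 1) - 1) // 2  # index of the anti-diagonal containing k
    t = w * (w + 1) // 2             # code of the first pair on that anti-diagonal
    y = k - t
    x = w - y
    return (x, y)


def product_enumeration(e_A, e_B):
    """The composite (e_A x e_B) . f enumerating the product of A and B."""
    def e(k):
        i, j = unpair(k)
        return (e_A(i), e_B(j))
    return e


# Hypothetical enumerations of A = {0, 1} and B = {"a", "b", "c"}.
e_A = lambda i: i % 2
e_B = lambda j: "abc"[j % 3]
e_AB = product_enumeration(e_A, e_B)
```

Since the first ten codes cover every pair of indices whose sum is at most 3, the values `e_AB(0)` through `e_AB(9)` already hit all six elements of this particular product.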
3801Axiomjms-00MMjms-00MM.xmlThe axiom of choice2024129Jon SterlingMarcelo FioreEvery surjection has a section.3827jms-00MNjms-00MN.xmlInjective functions2024129Jon SterlingMarcelo Fiore3804Definitionjms-00MOjms-00MO.xmlInjection2024129Jon SterlingMarcelo FioreA function f \colon A \to B is said to be injective, or an injection, whenever we have \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2. Such a function is written f \colon A \rightarrowtail B.3809Examplejms-00N1jms-00N1.xmlSections are injective2024130Jon SterlingMarcelo FioreEvery section is injective.
3807Proof#382unstable-382.xml2024130Jon Sterlingjms-00N1
Let s \colon B \to A be a section of r \colon A \to B. Fixing b_1,b_2 \in B such that s { \mathopen {} \left ( b_1 \right ) \mathclose {}} =s { \mathopen {} \left ( b_2 \right ) \mathclose {}}, we must show that b_1=b_2. We have r { \mathopen {} \left ( s { \mathopen {} \left ( b_1 \right ) \mathclose {}} \right ) \mathclose {}} =r { \mathopen {} \left ( s { \mathopen {} \left ( b_2 \right ) \mathclose {}} \right ) \mathclose {}} by assumption, and thus b_1=r { \mathopen {} \left ( s { \mathopen {} \left ( b_1 \right ) \mathclose {}} \right ) \mathclose {}} =r { \mathopen {} \left ( s { \mathopen {} \left ( b_2 \right ) \mathclose {}} \right ) \mathclose {}} =b_2.
3814Examplejms-00N2jms-00N2.xmlSubset inclusions are injective2024130Jon SterlingMarcelo FioreThe function i_S \colon S \to A including a subset S \subseteq A into A is injective.
3812Proof#381unstable-381.xml2024130Jon Sterlingjms-00N2
Immediate: the inclusion maps a given element to itself!
3819Theoremjms-00N3jms-00N3.xmlClosure of injections under identity and composition2024130Jon SterlingMarcelo FioreThe identity function is an injection and the composition of injections yields an injection.
3817Proof#380unstable-380.xml2024130Jon Sterlingjms-00N3
The identity function is clearly injective, as a_0=a_1 implies a_0=a_1.
Let f \colon A \rightarrowtail B and g \colon B \rightarrowtail C be injections. To show that the composite function g \circ f \colon A \to C is injective, we fix a_0,a_1 \in A to check that g { \mathopen {} \left ( f { \mathopen {} \left ( a_0 \right ) \mathclose {}} \right ) \mathclose {}} = g { \mathopen {} \left ( f { \mathopen {} \left ( a_1 \right ) \mathclose {}} \right ) \mathclose {}} implies a_0=a_1. Because g is injective, we have f { \mathopen {} \left ( a_0 \right ) \mathclose {}} =f { \mathopen {} \left ( a_1 \right ) \mathclose {}}; therefore, because f is injective, we have a_0=a_1 as desired.
Now that we have enough definitions in hand to see it, the import of the logical characterisation of bijections is then to state that a function is bijective if and only if it is both injective and surjective.3824Propositionjms-00N4jms-00N4.xmlCounting injections between finite sets2024130Jon SterlingMarcelo FioreFor finite sets A and B, we can describe the number of injections from A to B as follows: \mathord { \# } { \mathrm {Inj} { \mathopen {} \left ( A , B \right ) \mathclose {}} } = \begin {cases} { \mathord { \# } {B} \choose \mathord { \# } {A}} \cdot { \mathopen {} \left ( \mathord { \# } {A} \right ) \mathclose {}} ! & \text {when } \mathord { \# } {A} \leq \mathord { \# } {B} \\ 0 & \text {otherwise} \end {cases}
3822Proof#379unstable-379.xml2024130Jon Sterlingjms-00N4
We will argue from a combinatoric perspective.
An injection f from A to B associates no more than one a \in A to the same b \in B. Thus if we consider the subset U \subseteq B spanned by elements of the form f { \mathopen {} \left ( a \right ) \mathclose {}}, we should then get a bijection from A to U, as each b \in U comes from a unique a \in A.
Therefore, choosing an injection amounts to making two independent choices: first we choose a subset of B that is in bijection with A, and then we choose a specific bijection from A to that subset. The number of such subsets is precisely { \mathord { \# } {B} \choose \mathord { \# } {A}}, and we have already seen in that the number of such bijections is the factorial { \mathopen {} \left ( \mathord { \# } {A} \right ) \mathclose {}} !.
Of course, if there is no subset of B equinumerous with A, then there can be no injection.
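The counting formula above can be checked on small cases. The following is a minimal sketch, assuming finite sets are modelled as ranges of integers; the function names are hypothetical and chosen for illustration only. It compares a brute-force enumeration of injective functions against the closed form.

```python
from itertools import product
from math import comb, factorial


def count_injections_brute_force(a_size: int, b_size: int) -> int:
    """Count injections from {0..a_size-1} to {0..b_size-1} by enumeration."""
    count = 0
    # Each tuple `images` encodes a function: element i of A is sent to images[i].
    for images in product(range(b_size), repeat=a_size):
        if len(set(images)) == a_size:  # no two inputs share an output
            count += 1
    return count


def count_injections_formula(a_size: int, b_size: int) -> int:
    """Closed form: choose the image subset, then a bijection onto it."""
    if a_size > b_size:
        return 0
    return comb(b_size, a_size) * factorial(a_size)
```

For instance, with two elements in A and four in B, both functions should agree on the value (4 choose 2) times 2!. The brute-force version is exponential in the size of A, so it is only suitable as a sanity check.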
3909jms-00N5jms-00N5.xmlLecture 19: relational images, families, diagonalisation, well-foundedness202422Jon SterlingMarcelo Fiore
3833jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3841jms-00NCjms-00NC.xmlRelational images2024131Jon SterlingMarcelo Fiore3835Definitionjms-00N6jms-00N6.xmlDirect image2024131Jon SterlingMarcelo FioreLet R \colon A \mathbin { \nrightarrow } B be a relation. The direct image of X \subseteq A under R is the set R _* X \subseteq B defined below: R _* X = { \mathopen {} \left \{ b \in B \, \middle \vert \, \exists x \in X \mathpunct {.} x \mathrel {R}b \right \} \mathclose {}} 3838Definitionjms-00N7jms-00N7.xmlInverse image2024131Jon SterlingMarcelo FioreLet R \colon A \mathbin { \nrightarrow } B be a relation. The inverse image of Y \subseteq B under R is the set R ^* Y \subseteq A defined below: R ^* Y = { \mathopen {} \left \{ a \in A \, \middle \vert \, \forall b \in B \mathpunct {.} a \mathrel {R}b \implies b \in Y \right \} \mathclose {}} 3857jms-00NDjms-00ND.xmlFamilies of sets and replacement2024131Jon SterlingMarcelo Fiore3844Definitionjms-00N9jms-00N9.xmlFamily of sets2024131Jon SterlingA family of sets indexed in a set I is defined to be a set S equipped with a function \pi _S \colon S \to I. 
For i \in I, we will write S_i \subseteq S for the inverse image \pi _S ^* { \mathopen {} \left \{ i \right \} \mathclose {}} \subseteq S: \begin {aligned} S_i &= \pi _S ^* { \mathopen {} \left \{ i \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \forall j \in I \mathpunct {.} s \mathrel { \pi _S} j \implies j \in { \mathopen {} \left \{ i \right \} \mathclose {}} \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \forall j \in I \mathpunct {.} \pi _S { \mathopen {} \left ( s \right ) \mathclose {}} =j \implies j=i \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \pi _S { \mathopen {} \left ( s \right ) \mathclose {}} =i \right \} \mathclose {}} \end {aligned} Thus S is the disjoint union of all the S_i.3846Axiomjms-00N8jms-00N8.xmlAxiom scheme of replacement2024131Jon SterlingMarcelo FioreLet I be a set, and let \mathtt {P} { \mathopen {} \left ( \mathtt {x}, \mathtt {y} \right ) \mathclose {}} be a formula in the language of set theory such that \forall i \in I \mathpunct {.} \exists ! S \mathpunct {.} \mathtt {P} { \mathopen {} \left ( i,S \right ) \mathclose {}} holds. Then the collection of sets S such that \mathtt {P} { \mathopen {} \left ( i,S \right ) \mathclose {}} holds for some i \in I forms a set. This is almost saying that the direct image of I under \mathtt {P} is a set, but this doesn’t actually make sense because we do not have a way to speak about a “relation” between a set and the class of all sets. This strange situation is clarified by other foundational systems that go beyond the limitations of the Zermelo-Fraenkel set theory that we have studied in this course.3849Examplejms-00NAjms-00NA.xmlThe set of iterated powersets2024131Jon Sterling may seem a bit obscure (and it is!). 
But an example of a set that we need replacement to define is the set of iterated powersets { \mathopen {} \left \{ \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} , \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \right ) \mathclose {}} , \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \right ) \mathclose {}} \right ) \mathclose {}} , \ldots \right \} \mathclose {}}.3854Propositionjms-00NBjms-00NB.xmlDisjoint unions of enumerable sets2024131Jon SterlingMarcelo FioreLet I be an enumerable set and let A be a family of sets indexed in I such that each A_i is enumerable. Then the disjoint union A = \coprod _{i \in I}A_i is enumerable.
3852Proof#378unstable-378.xml2024131Jon Sterlingjms-00NB
By assumption, we have an enumeration e_I \colon \mathbb {N} \twoheadrightarrow I; as each A_i is enumerable, we may use the axiom of choice to assign to each i \in I a specific e_i \colon \mathbb {N} \twoheadrightarrow A_i.
Let h \colon \mathbb {N} \twoheadrightarrow \mathbb {N} \times \mathbb {N} be any enumeration of \mathbb {N} \times \mathbb {N}. Define k \colon \mathbb {N} \times \mathbb {N} \to A as follows:
k { \mathopen {} \left ( m,n \right ) \mathclose {}} = e_{e_I { \mathopen {} \left ( m \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}}
The function k \colon \mathbb {N} \times \mathbb {N} \to A is surjective: fixing a \in A, we must find { \mathopen {} \left ( m,n \right ) \mathclose {}} such that k { \mathopen {} \left ( m,n \right ) \mathclose {}} =a. Let i= \pi _A { \mathopen {} \left ( a \right ) \mathclose {}}; because e_I is surjective, we may find m \in \mathbb {N} such that e_I { \mathopen {} \left ( m \right ) \mathclose {}} = \pi _A { \mathopen {} \left ( a \right ) \mathclose {}}, and because e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } is surjective, we may find n \in \mathbb {N} such that e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} = a. Therefore, we have k { \mathopen {} \left ( m,n \right ) \mathclose {}} = e_{e_I { \mathopen {} \left ( m \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} = e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} =a. As surjections are closed under composition, k \circ h \colon \mathbb {N} \twoheadrightarrow A is an enumeration of A.
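The construction in this proof can be sketched concretely. The following is a small illustration under stated assumptions: the index set I = {0, 1, 2}, the sets A_i = {(i, 0), ..., (i, i)}, and the enumerations `e_I` and `e` are hypothetical examples invented here, and the inverse of the Cantor pairing function is used as one possible choice of h.

```python
from math import isqrt


def unpair(z: int) -> tuple[int, int]:
    """Inverse of the Cantor pairing function, a bijection N -> N x N (one choice of h)."""
    w = (isqrt(8 * z + 1) - 1) // 2  # index of the anti-diagonal containing z
    n = z - w * (w + 1) // 2         # position of z along that anti-diagonal
    return w - n, n


def e_I(m: int) -> int:
    """Hypothetical enumeration of the index set I = {0, 1, 2}."""
    return m % 3


def e(i: int, n: int) -> tuple[int, int]:
    """Hypothetical enumeration e_i of A_i = {(i, 0), ..., (i, i)}."""
    return (i, n % (i + 1))


def k(m: int, n: int) -> tuple[int, int]:
    """k(m, n) = e_{e_I(m)}(n), as in the proof."""
    return e(e_I(m), n)


def enumerate_union(z: int) -> tuple[int, int]:
    """The composite k . h, enumerating the disjoint union of the A_i."""
    return k(*unpair(z))
```

Elements of the disjoint union are represented as pairs whose first component plays the role of the projection to the index set. In this finite toy case no choice principle is needed, because the enumerations e_i are given by an explicit formula.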
3880jms-00NEjms-00NE.xmlDiagonalisation and fixed point theorems2024131Jon SterlingMarcelo Fiore3862Theoremjms-00NFjms-00NF.xmlCantor’s theorem2024131Jon SterlingMarcelo FioreGiven any set A, there can be no surjection from A to \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}}.
3860Proof#377unstable-377.xml2024131Jon Sterlingjms-00NF
Assume that we do have a surjection e \colon A \twoheadrightarrow \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}}. Consider the subset U \in \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} defined below:
U = { \mathopen {} \left \{ x \in A \, \middle \vert \, x \not \in e { \mathopen {} \left ( x \right ) \mathclose {}} \right \} \mathclose {}}
Because e \colon A \twoheadrightarrow \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} is surjective, we may find some a \in A such that e { \mathopen {} \left ( a \right ) \mathclose {}} =U.
Thus for each x \in A, we have
x \in e { \mathopen {} \left ( a \right ) \mathclose {}} \Longleftrightarrow x \in U \Longleftrightarrow x \not \in e { \mathopen {} \left ( x \right ) \mathclose {}} . Setting x:= a, we have a \in e { \mathopen {} \left ( a \right ) \mathclose {}} \Longleftrightarrow a \not \in e { \mathopen {} \left ( a \right ) \mathclose {}}, a contradiction.
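The diagonal argument can be observed directly on small finite sets. The sketch below is an illustration only, with hypothetical function names: it enumerates every function e from a finite set A to its powerset and checks that the diagonal set U = {x in A | x not in e(x)} never lies in the image of e.

```python
from itertools import combinations, product


def powerset(elements):
    """All subsets of `elements`, as frozensets."""
    items = list(elements)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def diagonal_set(a, e):
    """The set U = {x in a | x not in e(x)} from the proof."""
    return frozenset(x for x in a if x not in e[x])


def diagonal_is_always_missed(a) -> bool:
    """Check that for every e : A -> P(A), the diagonal set is not in the image of e."""
    a = list(a)
    subsets = powerset(a)
    # Each tuple `images` encodes a function e sending a[i] to images[i].
    for images in product(subsets, repeat=len(a)):
        e = dict(zip(a, images))
        if diagonal_set(a, e) in images:
            return False
    return True
```

The exhaustive check grows very quickly with the size of A, so it is only meant for sets with three or four elements. It does not replace the proof, which covers arbitrary sets.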
3865Definitionjms-00NGjms-00NG.xmlFixed point2024131Jon SterlingMarcelo FioreA fixed point of a function f \colon X \to X is an element x \in X such that f { \mathopen {} \left ( x \right ) \mathclose {}} =x.3870Theoremjms-00NHjms-00NH.xmlLawvere’s fixed point theorem2024131Jon SterlingMarcelo FioreGiven sets A and X, if there exists a surjection A \twoheadrightarrow { \mathopen {} \left ( A \to X \right ) \mathclose {}} then every function X \to X has a fixed point; and hence X is a singleton.
3868Proof#376unstable-376.xml2024131Jon Sterlingjms-00NH
Let e \colon A \twoheadrightarrow { \mathopen {} \left ( A \to X \right ) \mathclose {}} be a surjection.
Let f \colon X \to X be a function of which we are trying to find a fixed point. Let h \colon A \to X be the following function:
h { \mathopen {} \left ( a \right ) \mathclose {}} = f { \mathopen {} \left ( e { \mathopen {} \left ( a \right ) \mathclose {}} { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}}
Because e \colon A \twoheadrightarrow { \mathopen {} \left ( A \to X \right ) \mathclose {}} is a surjection, we may find a_h \in A such that e { \mathopen {} \left ( a_h \right ) \mathclose {}} = h. Therefore, we have e { \mathopen {} \left ( a_h \right ) \mathclose {}} { \mathopen {} \left ( a_h \right ) \mathclose {}} = f { \mathopen {} \left ( e { \mathopen {} \left ( a_h \right ) \mathclose {}} { \mathopen {} \left ( a_h \right ) \mathclose {}} \right ) \mathclose {}} and so e { \mathopen {} \left ( a_h \right ) \mathclose {}} { \mathopen {} \left ( a_h \right ) \mathclose {}} is a fixed point of f.
Finally, we conclude that X is a singleton. First of all, X has at least one element (the fixed point of the identity function). Now suppose, for the sake of contradiction, that x_0 \not = x_1 \in X are two unequal elements of X, and let f \colon X \to X be the function that sends x_0 to x_1 and everything else to x_0. By the above, we have a fixed point p \in X of f. We now proceed by cases:
If p=x_0, then we have f { \mathopen {} \left ( p \right ) \mathclose {}} =x_1 and f { \mathopen {} \left ( p \right ) \mathclose {}} =p and thus x_0=x_1, a contradiction.
If p \not = x_0, then we have f { \mathopen {} \left ( p \right ) \mathclose {}} =x_0 and f { \mathopen {} \left ( p \right ) \mathclose {}} =p, also a contradiction.
3873jms-00NIjms-00NI.xmlDeducing Cantor’s theorem à la Lawvere2024131Jon SterlingTo deduce from , we suppose that there is a surjection A \twoheadrightarrow \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}}; by , this is equivalent to saying that we have a surjection e \colon A \twoheadrightarrow { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}}. From , it follows that { \mathopen {} \left [ 2 \right ] \mathclose {}} is a singleton — a contradiction.3877Corollaryjms-00NJjms-00NJ.xmlUncountability of the reals2024131Jon SterlingMarcelo FioreThe set \mathbb {R} of real numbers is uncountable.
3875Proof#375unstable-375.xml2024131Jon Sterlingjms-00NJ
Of course, as \mathbb {R} is non-empty, it is enough to show that it is not enumerable. We know that \mathbb {R} \cong { \mathopen {} \left [ 0,1 \right ] \mathclose {}} \cong { \mathopen {} \left ( \mathbb {N} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} \cong \mathcal {P} { \mathopen {} \left ( \mathbb {N} \right ) \mathclose {}}, and it follows from that the latter is not enumerable.
3906jms-00NKjms-00NK.xmlWell-foundedness and induction2024131Jon SterlingMarcelo Fiore3884Definitionjms-00NLjms-00NL.xmlWell-founded relation2024131Jon SterlingMarcelo FioreLet \prec \colon A \mathbin { \nrightarrow } A be a relation on a set A.An element m \in S \subseteq A is a \prec-minimal (henceforth, “minimal”) element of S when \lnot { \mathopen {} \left ( \exists x \in S \mathpunct {.} x \prec m \right ) \mathclose {}}.
The binary relation \prec on A is called well-founded whenever each non-empty subset of A has a minimal element.3889Examplejms-00NMjms-00NM.xmlA well-founded relation on the naturals2024131Jon SterlingThe “strictly less-than” relation < \colon \mathbb {N} \mathbin { \nrightarrow } \mathbb {N} is well-founded.
3887Proof#374unstable-374.xml2024131Jon Sterlingjms-00NM
Almost immediate: given a non-empty set of naturals, pick the smallest number.
3894Counterexamplejms-00NNjms-00NN.xmlA non-well-founded relation on the integers2024131Jon SterlingThe strictly less-than relation < \colon \mathbb {Z} \mathbin { \nrightarrow } \mathbb {Z} is not well-founded.
3892Proof#373unstable-373.xml2024131Jon Sterlingjms-00NN
There is no smallest integer, so the set \mathbb {Z} has no <-minimal element.
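For a relation on a finite set, the definition of well-foundedness can be checked by brute force. The following is a minimal sketch with hypothetical function names, assuming the relation is given as a two-argument predicate `prec(x, y)` meaning that x is related to y.

```python
from itertools import combinations


def minimal_elements(subset, prec):
    """Elements m of `subset` such that no x in `subset` satisfies prec(x, m)."""
    return {m for m in subset if not any(prec(x, m) for x in subset)}


def is_well_founded(carrier, prec) -> bool:
    """Check that every non-empty subset of a finite carrier has a minimal element."""
    items = list(carrier)
    for r in range(1, len(items) + 1):
        for subset in combinations(items, r):
            if not minimal_elements(subset, prec):
                return False
    return True
```

The strict order on an initial segment of the naturals passes this check, whereas a cyclic relation such as 0 before 1 before 2 before 0 fails it, because the full set has no minimal element. The integers themselves cannot be tested this way, since the check only makes sense for finite carriers.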
3897Propositionjms-00NOjms-00NO.xmlCharacterisation of well-foundedness in terms of chains2024131Jon SterlingMarcelo FioreA relation \prec \colon A \mathbin { \nrightarrow } A is well-founded if and only if there are no infinite descending \prec-chains in A, i.e. no infinite sequences a_0,a_1, \ldots ,a_i, \ldots such that a_0 \succ a_1 \succ \cdots \succ a_i \succ \cdots.3900Propositionjms-00NPjms-00NP.xmlThe principle of well-founded induction2024131Jon SterlingMarcelo FioreLet \prec \colon A \mathbin { \nrightarrow } A be a well-founded relation and fix a subset S \subseteq A. Then A \subseteq S holds if and only if for each x \in A such that \forall y \in A \mathpunct {.} y \prec x \implies y \in S, we have x \in S. In other words, to show that every element of A lies in S, it is enough to show that for any element x \in A, if all the elements below x lie in S, then x itself lies in S.3903Corollaryjms-00NQjms-00NQ.xmlStrong induction as well-founded induction2024131Jon SterlingMarcelo FioreThe “strong” or “complete” induction principle for natural numbers states that for any subset S \subseteq \mathbb {N}, we have \forall x \in \mathbb {N} \mathpunct {.} x \in S if and only if we have 0 \in S and, for any n \in \mathbb {N} such that m \in S for all m<n, we have n \in S. 
This is precisely the statement of well-founded induction for the strictly less-than relation.3912jms-00OCjms-00OC.xmlLecture 22: finite automata202429Jon SterlingAndrew PittsFrank Stajano3316Lemmajms-00ODjms-00OD.xmlSoundness of the subset construction202429Jon SterlingAndrew PittsFrank StajanoLet M be an NFA with \epsilon-transitions; then \mathcal {L} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} \right ) \mathclose {}} \subseteq \mathcal {L} { \mathopen {} \left ( M \right ) \mathclose {}}.3297Proof#369unstable-369.xml202429Jon Sterlingjms-00ODWe must show that any string u \in \mathcal {L} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} \right ) \mathclose {}} lies in \mathcal {L} { \mathopen {} \left ( M \right ) \mathclose {}}.3293Case#370unstable-370.xmlEmpty string202429Jon Sterling#369If u= \epsilon, then by definition we have s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \in F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } and so by definition, there exists q \in s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } such that q \in F_{ M }. By definition of s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }, we therefore have s_{ M } \xRightarrow { \epsilon }q; because q \in F_{ M }, we have \epsilon \in \mathcal {L} { \mathopen {} \left ( M \right ) \mathclose {}}.3295Case#371unstable-371.xmlNon-empty string202429Jon Sterling#369Suppose on the other hand that u=a_1 \cdots a_n, so that we have transitions s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \xrightarrow {a_1}S_1 \xrightarrow { \cdots }S_{n-1} \xrightarrow {a_n}S_n with S_n \in F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }, and so there exists some q_n \in S_n \cap F_{ M }. 
Define S_0 = s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }.By definition of \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}}, from the transition S_{n-1} \xrightarrow {a_n}S_n we may find some q_{n-1} \in S_{n-1} such that q_{n-1} \xRightarrow {a_n}q_n. Iterating this process, we may find states q_i \in S_{i} such that we have a sequence of transitions q_{i-1} \xRightarrow {a_i}q_i in M; thus we have a sequence of transitions q_0 \xRightarrow {u}q_n. Because q_0 \in S_0= s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }, we have s_{ M } \xRightarrow { \epsilon }q_0 and thus s_{ M } \xRightarrow { \epsilon }q_0 \xRightarrow {u}q_n and therefore s_{ M } \xRightarrow {u} q_n. Because q_n \in F_{ M }, we are done.3320Lemmajms-00OEjms-00OE.xmlCompleteness of the subset construction202429Jon SterlingLet M be an NFA with \epsilon-transitions; then \mathcal {L} { \mathopen {} \left ( M \right ) \mathclose {}} \subseteq \mathcal {L} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} \right ) \mathclose {}}.3278Proof#365unstable-365.xml202429Jon Sterlingjms-00OEWe must show that any string u \in \mathcal {L} { \mathopen {} \left ( M \right ) \mathclose {}} lies in \mathcal {L} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} \right ) \mathclose {}}.3272Case#366unstable-366.xmlEmpty string202429Jon Sterling#365If u= \epsilon, then we have assumed that s_{ M } \xRightarrow { \epsilon }q such that q \in F_{ M }; therefore, by definition, we have q \in s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }. 
Thus we have q \in s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \cap F_{ M } and so it follows by definition that s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \in F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }.3276Case#367unstable-367.xmlNon-empty string202429Jon Sterling#365Suppose that u=a_1 \cdots a_n, and so we have transitions s_{ M } \xRightarrow {a_1}q_1 \xRightarrow { \cdots } q_{n-1} \xRightarrow {a_n}q_n with q_n \in F_{ M }. We inductively define a sequence of states S_i in \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} as follows: \begin {aligned} S_0 &= s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \\ S_{k+1} &= \delta _{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } { \mathopen {} \left ( S_k,a_{k+1} \right ) \mathclose {}} \end {aligned} Define q_0= s_{ M }. Then for all 0 \leq i \leq n, we observe that we have q_i \in S_i.3274Subproof#368unstable-368.xml202429Jon Sterling#367We proceed by induction on i. For i=0, we need s_{ M } \in s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }; but this holds by definition. Next, assuming q_i \in S_i we must show that q_{i+1} \in S_{i+1} = \delta _{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } { \mathopen {} \left ( S_i,a_{i+1} \right ) \mathclose {}}. Unfolding definitions, this follows from the transition q_i \xRightarrow {a_{i+1}} q_{i+1} that we have assumed, as our inductive hypothesis ensures q_i \in S_i. It follows from the above that we have transitions S_0= s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \xrightarrow {a_1}S_1 \xrightarrow { \cdots }S_{n-1} \xrightarrow {a_n}S_n. Because we have shown q_n \in S_n and we have assumed q_n \in F_{ M }, we have S_n \in F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }. 
Thus it follows that \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} accepts u.3322Questionjms-00OGjms-00OG.xmlAlternative definition of F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }?202429Jon Sterlingjms-00OCWhy can’t we define F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } to be the set of singletons { \mathopen {} \left \{ q \right \} \mathclose {}} such that q \in F_{ M }?3308Answer#372unstable-372.xml202429Jon Sterlingjms-00OGBecause if we did, then would fail already in the case of the empty string. We would have s_{ M } \xRightarrow { \epsilon }{q} with q \in F_{ M } and thus q \in s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} }; we would need s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } \in F_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } and with the proposed definition, this would need to be a singleton. There is, however, nothing about our assumptions that allows us to deduce that and, indeed, we can certainly come up with examples where s_{ \mathcal {P} { \mathopen {} \left ( M \right ) \mathclose {}} } is not a singleton (e.g. if we have \epsilon-transitions of the form s_{ M } \xRightarrow { \epsilon }q for q \not = s_{ M }).4719Coursejms-0080jms-0080.xmlCategory Theory (Fall 2022)2022Alejandro AguirreLars BirkedalJon SterlingAarhus Universityhttps://jonsterling.github.io/courses/ct-fall-2022/
4730jms-00M4jms-00M4.xmlConsulting2024129Jon Sterlingjms-0001I offer a variety of paid consulting services, ranging from language design to proof+software engineering to private tutoring; please contact me for a consultation.
21173jms-008Ljms-008L.xmlJon Sterling › curriculum vitæ202398Jon SterlingAlejandro AguirreAndrew PittsFrank StajanoKarl CraryLars BirkedalMarcelo FioreRobert HarperStephanie Balzerjms-0001false
12012jms-008Njms-008N.xmlResearch themes202398Jon Sterlingjms-0001Central to both the design of programming languages and the practice of software engineering is the tension between abstraction and composition. I employ semantic methods from category theory and type theory to design, verify, and implement languages that enable both programmers and mathematicians to negotiate the different levels of abstraction that arise in their work. I apply my research to global safety and security properties of programming languages as well as the design and implementation of interactive theorem provers for higher-dimensional mathematics. I develop foundational and practical mathematical tools to deftly weave together verifications that cut across multiple levels of abstraction.
12014jms-008Mjms-008M.xmlProfessional history202398Jon Sterlingjms-008LFrom September 2023, I am an Associate Professor in Logical Foundations and Formal Methods at the University of Cambridge. From 2022, I was a Marie Skłodowska-Curie Postdoctoral Fellow hosted at Aarhus University working with Professor Lars Birkedal. From 2016 to 2021, I was a PhD student of Professor Robert Harper at Carnegie Mellon University, where I wrote my doctoral thesis on synthetic Tait computability and its application to normalization for cubical type theory.
12073#249unstable-249.xmlRefereed papers202398Jon Sterlingjms-008L12016Referencesterling-gratzer-birkedal-2024-univalentsterling-gratzer-birkedal-2024-univalent.xmlTowards univalent reference types202427Jon SterlingDaniel GratzerLars Birkedal10.4230/LIPIcs.CSL.2024.47CSL ’24: 32nd EACSL Annual Conference on Computer Science Logic 2024@inproceedings{sterling-gratzer-birkedal-2024-univalent,
author = {Sterling, Jonathan and Gratzer, Daniel and Birkedal, Lars},
title = {{Towards Univalent Reference Types: The Impact of Univalence on Denotational Semantics}},
booktitle = {32nd EACSL Annual Conference on Computer Science Logic (CSL 2024)},
pages = {47:1--47:21},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-310-2},
ISSN = {1868-8969},
year = {2024},
volume = {288},
editor = {Murano, Aniello and Silva, Alexandra},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
doi = {10.4230/LIPIcs.CSL.2024.47},
}We develop a denotational semantics for general reference types in an impredicative version of guarded homotopy type theory, an adaptation of synthetic guarded domain theory to Voevodsky’s univalent foundations. We observe for the first time the profound impact of univalence on the denotational semantics of mutable state. Univalence automatically ensures that all computations are invariant under symmetries of the heap—a bountiful source of program equivalences. In particular, even the most simplistic univalent model enjoys many new program equivalences that do not hold when the same constructions are carried out in the universes of traditional set-level (extensional) type theory.12020Referencegrodin-niu-sterling-harper-2024grodin-niu-sterling-harper-2024.xml decalf: a directed, effectful cost-aware logical framework202415Harrison GrodinYue NiuJon SterlingRobert HarperPOPL ’24: 51st ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3632852https://arxiv.org/abs/2307.05938@article{grodin-niu-sterling-harper-2024,
author = {Grodin, Harrison and Niu, Yue and Sterling, Jonathan and Harper, Robert},
title = {Decalf: A Directed, Effectful Cost-Aware Logical Framework},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {8},
number = {POPL},
doi = {10.1145/3632852},
journal = {Proc. ACM Program. Lang.},
month = {jan},
articleno = {10},
numpages = {29},
}We present decalf, a directed, effectful cost-aware logical framework for studying quantitative aspects of functional programs with effects. Like calf, the language is based on a formal phase distinction between the extension and the intension of a program, its pure behavior as distinct from its cost measured by an effectful step-counting primitive. The type theory ensures that the behavior is unaffected by the cost accounting. Unlike calf, the present language takes account of effects, such as probabilistic choice and mutable state; this extension requires a reformulation of calf’s approach to cost accounting: rather than rely on a “separable” notion of cost, here a cost bound is simply another program. To make this formal, we equip every type with an intrinsic preorder, relaxing the precise cost accounting intrinsic to a program to a looser but nevertheless informative estimate. For example, the cost bound of a probabilistic program is itself a probabilistic program that specifies the distribution of costs. This approach serves as a streamlined alternative to the standard method of isolating a recurrence that bounds the cost in a manner that readily extends to higher-order, effectful programs.The development proceeds by first introducing the decalf type system, which is based on an intrinsic ordering among terms that restricts in the extensional phase to extensional equality, but in the intensional phase reflects an approximation of the cost of a program of interest. This formulation is then applied to a number of illustrative examples, including pure and effectful sorting algorithms, simple probabilistic programs, and higher-order functions. 
Finally, we justify decalf via a model in the topos of augmented simplicial sets.12025Referencesieczkowski-stepanenko-sterling-birkedal-2024sieczkowski-stepanenko-sterling-birkedal-2024.xmlThe essence of generalized algebraic data types202415Filip SieczkowskiSergei StepanenkoJon SterlingLars BirkedalPOPL ’24: 51st ACM SIGPLAN Symposium on Principles of Programming Languages10.1145/3632866@article{sieczkowski-stepanenko-sterling-birkedal-2024,
author = {Sieczkowski, Filip and Stepanenko, Sergei and Sterling, Jonathan and Birkedal, Lars},
title = {The Essence of Generalized Algebraic Data Types},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {8},
number = {POPL},
doi = {10.1145/3632866},
journal = {Proc. ACM Program. Lang.},
month = {jan},
articleno = {24},
numpages = {29},
}This paper considers direct encodings of generalized algebraic data types (GADTs) in a minimal suitable lambda-calculus. To this end, we develop an extension of System Fω with recursive types and internalized type equalities with injective constant type constructors. We show how GADTs and associated pattern-matching constructs can be directly expressed in the calculus, thus showing that it may be treated as a highly idealized modern functional programming language. We prove that the internalized type equalities in conjunction with injectivity rules increase the expressive power of the calculus by establishing a non-macro-expressibility result in Fω, and prove the system type-sound via a syntactic argument. Finally, we build two relational models of our calculus: a simple, unary model that illustrates a novel, two-stage interpretation technique, necessary to account for the equational constraints; and a more sophisticated, binary model that relaxes the construction to allow, for the first time, formal reasoning about data-abstraction in a calculus equipped with GADTs.12030Referenceaagaard-sterling-birkedal-2023aagaard-sterling-birkedal-2023.xmlA denotationally-based program logic for higher-order store20231123Frederik Lerbjerg AagaardJon SterlingLars Birkedal10.46298/entics.1223239th International Conference on Mathematical Foundations of Programming SemanticsSeparation logic is used to reason locally about stateful programs. State of the art program logics for higher-order store are usually built on top of untyped operational semantics, in part because traditional denotational methods have struggled to simultaneously account for general references and parametric polymorphism. 
The recent discovery of simple denotational semantics for general references and polymorphism in synthetic guarded domain theory has enabled us to develop Tulip, a higher-order separation logic over the typed equational theory of higher-order store for a monadic version of System \textbf {F}^{ \mu , \textit {ref}}. The Tulip logic differs from operationally-based program logics in two ways: predicates range over the meanings of typed terms rather than over the raw code of untyped terms, and they are automatically invariant under the equational congruence of higher-order store, which applies even underneath a binder. As a result, “pure” proof steps that conventionally require focusing the Hoare triple on an operational redex are replaced by a simple equational rewrite in Tulip. We have evaluated Tulip against standard examples involving linked lists in the heap, comparing our abstract equational reasoning with more familiar operational-style reasoning. Our main result is the soundness of Tulip, which we establish by constructing a BI-hyperdoctrine over the denotational semantics of \textbf {F}^{ \mu , \textit {ref}} in an impredicative version of synthetic guarded domain theory.12034Referencesterling-2023-genericsterling-2023-generic.xmlWhat should a generic object be?2023425Jon Sterling@article{sterling-2023-generic,
author = {Sterling, Jonathan},
publisher = {Cambridge University Press},
date = {2023},
doi = {10.1017/S0960129523000117},
journaltitle = {Mathematical Structures in Computer Science},
pages = {1--22},
title = {What should a generic object be?},
}10.1017/S0960129523000117Mathematical Structures in Computer ScienceJacobs has proposed definitions for (weak, strong, split) generic objects for a fibered category; building on his definition of (split) generic objects, Jacobs develops a menagerie of important fibrational structures with applications to categorical logic and computer science, including higher order fibrations, polymorphic fibrations, 𝜆2-fibrations, triposes, and others. We observe that a split generic object need not in particular be a generic object under the given definitions, and that the definitions of polymorphic fibrations, triposes, etc. are strict enough to rule out some fundamental examples: for instance, the fibered preorder induced by a partial combinatory algebra in realizability is not a tripos in this sense. We propose a new alignment of terminology that emphasizes the forms of generic object appearing most commonly in nature, i.e. in the study of internal categories, triposes, and the denotational semantics of polymorphism. In addition, we propose a new class of acyclic generic objects inspired by recent developments in higher category theory and the semantics of homotopy type theory, generalizing the realignment property of universes to the setting of an arbitrary fibration.12036Referencepalombi-sterling-2023palombi-sterling-2023.xmlClassifying topoi in synthetic guarded domain theory: the universal property of multi-clock guarded recursion2023222Daniele PalombiJon Sterling@inproceedings{palombi-sterling-2023,
author = {Palombi, Daniele and Sterling, Jonathan},
booktitle = {Proceedings 38th Conference on Mathematical Foundations of Programming Semantics, {MFPS} 2022},
year = {2023},
month = feb,
title = {Classifying topoi in synthetic guarded domain theory},
doi = {10.46298/entics.10323},
}10.46298/entics.1032338th International Conference on Mathematical Foundations of Programming SemanticsSeveral different topoi have played an important role in the development and applications of synthetic guarded domain theory (SGDT), a new kind of synthetic domain theory that abstracts the concept of guarded recursion frequently employed in the semantics of programming languages. In order to unify the accounts of guarded recursion and coinduction, several authors have enriched SGDT with multiple “clocks” parameterizing different time-streams, leading to more complex and difficult to understand topos models. Until now these topoi have been understood very concretely qua categories of presheaves, and the logico-geometrical question of what theories these topoi classify has remained open. We show that several important topos models of SGDT classify very simple geometric theories, and that the passage to various forms of multi-clock guarded recursion can be rephrased more compositionally in terms of the lower bagtopos construction of Vickers and variations thereon due to Johnstone. We contribute to the consolidation of SGDT by isolating the universal property of multi-clock guarded recursion as a modular construction that applies to any topos model of single-clock guarded recursion.12039Referencesterling-angiuli-gratzer-2022sterling-angiuli-gratzer-2022.xmlA cubical language for Bishop sets202229Jon SterlingCarlo AngiuliDaniel Gratzer@article{sterling-angiuli-gratzer-2022,
author = {Sterling, Jonathan and Angiuli, Carlo and Gratzer, Daniel},
year = {2022},
month = mar,
doi = {10.46298/lmcs-18(1:43)2022},
eprint = {2003.01491},
eprintclass = {cs.LO},
eprinttype = {arXiv},
issue = {1},
journal = {Logical Methods in Computer Science},
title = {{A Cubical Language for Bishop Sets}},
volume = {18},
}10.46298/lmcs-18(1:43)2022Logical Methods in Computer ScienceWe present XTT, a version of Cartesian cubical type theory specialized for Bishop sets à la Coquand, in which every type enjoys a definitional version of the uniqueness of identity proofs. Using cubical notions, XTT reconstructs many of the ideas underlying Observational Type Theory, a version of intensional type theory that supports function extensionality. We prove the canonicity property of XTT (that every closed boolean is definitionally equal to a constant) by Artin gluing.12043Referenceniu-sterling-grodin-harper-2022niu-sterling-grodin-harper-2022.xmlA cost-aware logical framework202211Yue NiuJon SterlingHarrison GrodinRobert HarperProceedings of the ACM on Programming Languages, Volume 6, Issue POPL10.1145/3498670We present calf, a cost-aware logical framework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account of phase distinctions, we argue that the cost structure of programs motivates a phase distinction between intension and extension. Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which cost-aware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a full-spectrum dependent type theory, calf presents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants.We evaluate calf as a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: the method of recurrence relations and physicist’s method for amortized analysis. 
We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential and parallel complexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly we substantiate the soundness of quantitative reasoning in calf by means of a model construction.12048Referencesterling-harper-2022sterling-harper-2022.xmlSheaf semantics of termination-insensitive noninterference2022Jon SterlingRobert Harper10.4230/LIPIcs.FSCD.2022.5papers/sterling-harper-2022.pdf7th International Conference on Formal Structures for Computation and Deduction (FSCD 2022)We propose a new sheaf semantics for secure information flow over a space of abstract behaviors, based on synthetic domain theory: security classes are open/closed partitions, types are sheaves, and redaction of sensitive information corresponds to restricting a sheaf to a closed subspace. Our security-aware computational model satisfies termination-insensitive noninterference automatically, and therefore constitutes an intrinsic alternative to state of the art extrinsic/relational models of noninterference. Our semantics is the latest application of Sterling and Harper’s recent re-interpretation of phase distinctions and noninterference in programming languages in terms of Artin gluing and topos-theoretic open/closed modalities. Prior applications include parametricity for ML modules, the proof of normalization for cubical type theory by Sterling and Angiuli, and the cost-aware logical framework of Niu et al. 
In this paper we employ the phase distinction perspective twice: first to reconstruct the syntax and semantics of secure information flow as a lattice of phase distinctions between “higher” and “lower” security, and second to verify the computational adequacy of our sheaf semantics with respect to a version of Abadi et al.’s dependency core calculus to which we have added a construct for declassifying termination channels.4632Erratumjms-005Yjms-005Y.xmlMinor mistakes in sheaf semantics of noninterference2023Jon SterlingIn the published version of this paper, there were a few mistakes that have been corrected in the local copy hosted here.In the Critique of relational semantics for information flow, our discussion of the Failure of monotonicity stated incorrectly that algebras for the sealing monad at a higher security level could not be transformed into algebras for the sealing monad at a lower security level in the semantics of Abadi et al. This is not true, as pointed out to us privately by Carlos Tomé Cortiñas. What we meant to say was that it is not the case that a type whose component at a high security level is trivial shall always remain trivial at a lower security level.
In the original version of the extended edition of this paper, we claimed that the constructive existence of tensor products on pointed dcpos was obvious; in fact, tensor products do exist, but their construction involves a reflexive coequalizer of pointed dcpos.4634Erratumjms-005Zjms-005Z.xmlAdequacy of sheaf semantics of noninterference2023717Jon SterlingA serious (and as-yet unfixed) problem was discovered in July of 2023 by Yue Niu, which undermines the proof of adequacy given; in particular, the proof that the logical relation on free algebras is admissible is not correct. I believe there is a different proof of adequacy for the calculus described, but it will have a different structure from what currently appears in the paper. We thank Yue Niu for his attention to detail and careful reading of this paper.12051Referencesterling-harper-2021sterling-harper-2021.xmlLogical relations as types: proof-relevant parametricity for program modules2021121Jon SterlingRobert Harperpapers/sterling-harper-2021.pdfJournal of the ACM, Volume 68, Issue 610.1145/3474834The theory of program modules is of interest to language designers not only for its practical importance to programming, but also because it lies at the nexus of three fundamental concerns in language design: the phase distinction, computational effects, and type abstraction. We contribute a fresh “synthetic” take on program modules that treats modules as the fundamental constructs, in which the usual suspects of prior module calculi (kinds, constructors, dynamic programs) are rendered as derived notions in terms of a modal type-theoretic account of the phase distinction. 
We simplify the account of type abstraction (embodied in the generativity of module functors) through a lax modality that encapsulates computational effects, placing projectibility of module expressions on a type-theoretic basis.Our main result is a (significant) proof-relevant and phase-sensitive generalization of the Reynolds abstraction theorem for a calculus of program modules, based on a new kind of logical relation called a parametricity structure. Parametricity structures generalize the proof-irrelevant relations of classical parametricity to proof-relevant families, where there may be non-trivial evidence witnessing the relatedness of two programs—simplifying the metatheory of strong sums over the collection of types, for although there can be no “relation classifying relations,” one easily accommodates a “family classifying small families.”Using the insight that logical relations/parametricity is itself a form of phase distinction between the syntactic and the semantic, we contribute a new synthetic approach to phase separated parametricity based on the slogan logical relations as types, by iterating our modal account of the phase distinction. We axiomatize a dependent type theory of parametricity structures using two pairs of complementary modalities (syntactic, semantic) and (static, dynamic), substantiated using the topos theoretic Artin gluing construction. Then, to construct a simulation between two implementations of an abstract type, one simply programs a third implementation whose type component carries the representation invariant.1604Erratumjms-0060jms-0060.xmlMinor mistakes in logical relations as types2021Jon SterlingAfter going to press, we have fixed the following mistakes:In the definition of a logos, we mistakenly said that "colimits commute with finite limits" but we meant to say that they are preserved by pullback. We thank Sarah Z. Rovner-Frydman for noticing this mistake.
In Remark 5.15, we used the notation for the closed immersion prior to introducing it.
We have fixed a few broken links in the bibliography.The local copy hosted here has the corrections implemented.12054Referencesterling-angiuli-2021sterling-angiuli-2021.xmlNormalization for cubical type theory202177Jon SterlingCarlo Angiuli2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)10.1109/LICS52264.2021.9470719We prove normalization for (univalent, Cartesian) cubical type theory, closing the last major open problem in the syntactic metatheory of cubical type theory. Our normalization result is reduction-free, in the sense of yielding a bijection between equivalence classes of terms in context and a tractable language of \beta/\eta-normal forms. As corollaries we obtain both decidability of judgmental equality and the injectivity of type constructors.12057Referencesterling-2021-bhfssterling-2021-bhfs.xmlHigher order functions and Brouwer’s Thesis2021519Jon Sterling@article{sterling-2021-bhfs,
author = {Sterling, Jonathan},
publisher = {Cambridge University Press},
date = {2021},
doi = {10.1017/S0956796821000095},
eprint = {1608.03814},
eprintclass = {math.LO},
eprinttype = {arXiv},
journaltitle = {Journal of Functional Programming},
note = {\emph{Bob Harper Festschrift Collection}},
pages = {e11},
title = {Higher order functions and Brouwer's thesis},
volume = {31},
}http://www.jonmsterling.com/agda-effectful-forcing/index.html10.1017/S0956796821000095Journal of Functional Programming, Bob Harper Festschrift CollectionExtending Martín Hötzel Escardó’s effectful forcing technique, we give a new proof of a well-known result: Brouwer’s monotone bar theorem holds for any bar that can be realized by a functional of type { \mathopen {} \left ( \mathbb {N} \to \mathbb {N} \right ) \mathclose {}} \to \mathbb {N} in Gödel’s System T. Effectful forcing is an elementary alternative to standard sheaf-theoretic forcing arguments, using ideas from programming languages, including computational effects, monads, the algebra interpretation of call-by-name λ-calculus, and logical relations. Our argument proceeds by interpreting System T programs as well-founded dialogue trees whose nodes branch on a query to an oracle of type \mathbb {N} \to \mathbb {N}, lifted to higher type along a call-by-name translation. To connect this interpretation to the bar theorem, we then show that Brouwer’s famous "mental constructions" of barhood constitute an invariant form of these dialogue trees in which queries to the oracle are made maximally and in order.12059Referencesterling-angiuli-gratzer-2019sterling-angiuli-gratzer-2019.xmlCubical syntax for reflection-free extensional equality2019Jon SterlingCarlo AngiuliDaniel Gratzer@inproceedings{sterling-angiuli-gratzer-2019,
author = {Sterling, Jonathan and Angiuli, Carlo and Gratzer, Daniel},
editor = {Geuvers, Herman},
location = {Dagstuhl, Germany},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
url = {http://drops.dagstuhl.de/opus/volltexte/2019/10538},
booktitle = {Proceedings of the 4th International Conference on Formal Structures for Computation and Deduction (FSCD 2019)},
date = {2019},
doi = {10.4230/LIPIcs.FSCD.2019.31},
eprint = {1904.08562},
eprinttype = {arXiv},
isbn = {978-3-95977-107-8},
issn = {1868-8969},
pages = {31:1--31:25},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
title = {Cubical Syntax for Reflection-Free Extensional Equality},
volume = {131},
}slides/sterling-angiuli-gratzer-2019.pdf10.4230/LIPIcs.FSCD.2019.31FSCD ’19: International Conference on Formal Structures for Computation and DeductionWe contribute XTT, a cubical reconstruction of Observational Type Theory [Altenkirch et al., 2007] which extends Martin-Löf's intensional type theory with a dependent equality type that enjoys function extensionality and a judgmental version of the unicity of identity proofs principle (UIP): any two elements of the same equality type are judgmentally equal. Moreover, we conjecture that the typing relation can be decided in a practical way. In this paper, we establish an algebraic canonicity theorem using a novel extension of the logical families or categorical gluing argument inspired by Coquand and Shulman: every closed element of boolean type is derivably equal to either true or false.12063Referencegratzer-sterling-birkedal-2019gratzer-sterling-birkedal-2019.xmlImplementing a modal dependent type theory2019Daniel GratzerJon SterlingLars Birkedal@article{gratzer-sterling-birkedal-2019,
author = {Gratzer, Daniel and Sterling, Jonathan and Birkedal, Lars},
location = {New York, NY, USA},
publisher = {ACM},
date = {2019-07},
doi = {10.1145/3341711},
issn = {2475-1421},
journaltitle = {Proceedings of the ACM on Programming Languages},
keywords = {Modal types,dependent types,normalization by evaluation,type-checking},
number = {ICFP},
pages = {107:1--107:29},
title = {Implementing a Modal Dependent Type Theory},
volume = {3},
}10.1145/3341711ICFP ’19: The 24th ACM SIGPLAN International Conference on Functional ProgrammingModalities are everywhere in programming and mathematics! Despite this, however, there are still significant technical challenges in formulating a core dependent type theory with modalities. We present a dependent type theory MLTT🔒 supporting the connectives of standard Martin-Löf Type Theory as well as an S4-style necessity operator. MLTT🔒 supports a smooth interaction between modal and dependent types and provides a common basis for the use of modalities in programming and in synthetic mathematics. We design and prove the soundness and completeness of a type checking algorithm for MLTT🔒, using a novel extension of normalization by evaluation. We have also implemented our algorithm in a prototype proof assistant for MLTT🔒, demonstrating the ease of applying our techniques.12067Referencesterling-harper-2018sterling-harper-2018.xmlGuarded computational type theory2018Jon SterlingRobert Harper@inproceedings{sterling-harper-2018,
author = {Sterling, Jonathan and Harper, Robert},
title = {Guarded Computational Type Theory},
booktitle = {Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science},
series = {LICS '18},
year = {2018},
isbn = {978-1-4503-5583-4},
location = {Oxford, United Kingdom},
pages = {879--888},
numpages = {10},
url = {http://doi.acm.org/10.1145/3209108.3209153},
doi = {10.1145/3209108.3209153},
acmid = {3209153},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {clocks, dependent types, guarded recursion, operational semantics, type theory},
}slides/sterling-harper-2018.pdf10.1145/3209108.3209153LICS ’18: 33rd Annual ACM/IEEE Symposium on Logic in Computer ScienceNakano’s later modality can be used to specify and define recursive functions which are causal or synchronous; in concert with a notion of clock variable, it is possible to also capture the broader class of productive (co)programs. Until now, it has been difficult to combine these constructs with dependent types in a way that preserves the operational meaning of type theory and admits a hierarchy of universes. We present an operational account of guarded dependent type theory with clocks called Guarded Computational Type Theory, featuring a novel clock intersection connective that enjoys the clock irrelevance principle, as well as a predicative hierarchy of universes which does not require any indexing in clock contexts. Guarded Computational Type Theory is simultaneously a programming language with a rich specification logic, as well as a computational metalanguage that can be used to develop semantics of other languages and logics.12070Referencevalentine-sterling-2016valentine-sterling-2016.xmlDependent types for pragmatics2016Rebecca ValentineJon Sterling@inbook{valentine-sterling-2016,
author={Valentine, Rebecca and Sterling, Jonathan},
editor={Redmond, Juan and Pombo Martins, Olga and Nepomuceno Fern{\'a}ndez, {\'A}ngel},
title={Dependent Types for Pragmatics},
booktitle={Epistemology, Knowledge and the Impact of Interaction},
year={2016},
publisher={Springer International Publishing},
address={Cham},
pages={123--139},
isbn={978-3-319-26506-3},
doi={10.1007/978-3-319-26506-3_4},
}10.1007/978-3-319-26506-3_4In: Redmond J., Pombo Martins O., Nepomuceno Fernández Á. (eds) Epistemology, Knowledge and the Impact of Interaction. Logic, Epistemology, and the Unity of Science, vol 38. Springer, Cham.In this paper, we present an extension to Martin-Löf’s Intuitionistic Type Theory which gives natural solutions to problems in pragmatics, such as pronominal reference and presupposition. Our approach also gives a simple account of donkey anaphora without resorting to exotic scope extension of the sort used in Discourse Representation Theory and Dynamic Semantics, thanks to the proof-relevant nature of type theory.
12075jms-008Ojms-008O.xmlAcademic service202398Jon Sterlingjms-008L
11755jms-00AFjms-00AF.xmlConference program committees20231012Jon Sterlingjms-000111752Conferenceicfp-2023icfp-2023.xmlICFP ’23: The 28th ACM SIGPLAN International Conference on Functional Programming20239https://icfp23.sigplan.org/11753Conferencepopl-2023popl-2023.xmlPOPL ’23: 50th ACM SIGPLAN Symposium on Principles of Programming Languages20231https://popl23.sigplan.org/The annual Symposium on Principles of Programming Languages is a forum for the discussion of all aspects of programming languages and programming systems. Both theoretical and experimental papers are welcome on topics ranging from formal frameworks to experience reports. We seek submissions that make principled, enduring contributions to the theory, design, understanding, implementation or application of programming languages.11754Conferenceact-2022act-2022.xmlACT ’22: International Conference on Applied Category Theory20227https://msp.cis.strath.ac.uk/act2022/Applied category theory is a topic of interest for a growing community of researchers, interested in studying many different kinds of systems using category-theoretic tools. These systems are found across computer science, mathematics, and physics, as well as in social science, linguistics, cognition, and neuroscience. The background and experience of our members is as varied as the systems being studied. The goal of Applied Category Theory is to bring researchers in the field together, disseminate the latest results, and facilitate further development of the field.
11757jms-00AGjms-00AG.xmlWorkshop program committees20231012Jon Sterlingjms-00011537Workshophott-uf-2024hott-uf-2024.xmlHoTT-UF ’24: Workshop on Homotopy Type Theory/ Univalent Foundations202442EuroProofNet WG6 meeting in Leuvenhttps://hott-uf.github.io/2024/Homotopy Type Theory is a young area of logic, combining ideas from several established fields: the use of dependent type theory as a foundation for mathematics, inspired by ideas and tools from abstract homotopy theory. Univalent Foundations are foundations of mathematics based on the homotopical interpretation of type theory.The goal of this workshop is to bring together researchers interested in all aspects of Homotopy Type Theory/Univalent Foundations: from the study of syntax and semantics of type theory to practical formalization in proof assistants based on univalent type theory.abstract submission deadline: 2024-01-19
PC review deadline: 2024-02-141538Workshopwits-2023wits-2023.xmlWITS ’23: Workshop on the Implementation of Type Systems20238https://ifl23.github.io/call_papers_wits.htmlThe 35th Symposium on Implementation and Application of Functional LanguagesThe Second Workshop on the Implementation of Type Systems (WITS 2023) will be held on August 28, 2023, in Braga, Portugal, co-located with IFL 2023. The goal of this workshop is to bring together the implementors of a variety of languages with advanced type systems. The main focus is on the practical issues that come up in the implementation of these systems, rather than the theoretical frameworks that underlie them. In particular, we want to encourage exchanging ideas between the communities around specific systems that would otherwise be accessible to only a very select group. The workshop will have a mix of invited and contributed talks, organized discussion times, and informal collaboration time. We invite participants to share their experiences, study differences among the implementations, and generalize lessons from those. We also want to promote the creation of a shared vocabulary and set of best practices for implementing type systems.1539Workshophott-uf-2023hott-uf-2023.xmlHoTT-UF ’23: Workshop on Homotopy Type Theory/ Univalent Foundations2023422EuroProofNet WG6 meeting in Vienna in April 2023https://hott-uf.github.io/2023/Homotopy Type Theory is a young area of logic, combining ideas from several established fields: the use of dependent type theory as a foundation for mathematics, inspired by ideas and tools from abstract homotopy theory. 
Univalent Foundations are foundations of mathematics based on the homotopical interpretation of type theory.The goal of this workshop is to bring together researchers interested in all aspects of Homotopy Type Theory/Univalent Foundations: from the study of syntax and semantics of type theory to practical formalization in proof assistants based on univalent type theory.1540Workshoptyde-2022tyde-2022.xmlTyDe ’22: Workshop on Type-driven Development20229https://icfp22.sigplan.org/home/tyde-2022The 27th ACM SIGPLAN International Conference on Functional ProgrammingThe Workshop on Type-Driven Development (TyDe) aims to show how static type information may be used effectively in the development of computer programs. Co-located with ICFP, this workshop brings together leading researchers and practitioners who are using or exploring types as a means of program development.
12104jms-006Bjms-006B.xmlFunding and grantsJon SterlingRobert HarperStephanie Balzer12077Grantjms-008Kjms-008K.xmlNew spaces for denotational semantics20239Jon SterlingUnited States Air Force Office of Scientific Research
PI:
Jonathan Sterling
institution:
University of Cambridge
Funding Agency:
United States Air Force Office of Scientific Research
Program Officer:
Dr. Tristan Nguyen
Award No.:
FA9550-23-1-0728
Years:
2023–2028
Amount:
1,221,099 USD
Status:
awarded 2023-09-27
See project bibliography.Abstract. What does it mean for a piece of software to be correct? There are many possible degrees and dimensions of correctness, e.g. safety, functional correctness, computational complexity, and security. To grapple with this diversity of verification requirements, semanticists develop mathematical models of program behavior that put into relief different aspects of the physical reality of program execution on hardware, just as physicists create many idealized mathematical models to study different aspects of the material reality of the universe.Mathematical models enable us to reason about program behavior by viewing highly complex objects as being glued together from smaller, simpler objects that are easier to study in isolation. For instance, operational models aim to reduce the behavior of a process to that of individual steps of discrete computation that take place on an idealized computer; in contrast, denotational models reduce the complex global behavior of a process to the simpler local behavior of its constituent subroutines. One advantage of operational methods is that they are applicable even in situations that challenge the modularity of denotational semantics, e.g. where it is not yet understood how to reduce the global behavior of a program to that of its components. On the other hand, denotational methods provide vastly stronger and simpler reasoning principles for program verification when available.The central thesis of denotational semantics is that programs arrange themselves into geometrical spaces called computational domains, and that a computational process can be thought of as the limit of a sequence of continuous transformations on these domains. 
Although this thesis has been amply borne out for simple kinds of program, today’s most urgent verification requirements pertain to program constructs like concurrency and side effects whose treatment requires the introduction of new kinds of space: for instance, the correct treatment of branching behavior for concurrent processes requires the introduction of higher-dimensional computational domains in which programs can “remember” the specific way that they were glued together.This project will extend the reach of denotational semantics and its attendant advantages for program verification into terrains where scientists have historically struggled to enact the reduction of global behavior to local behavior, making essential use of new advances in the homotopical and geometrical understanding of computation via higher dimensional category theory and topos theory. I will investigate two areas that are ripe for reaping the benefits of a modern denotational semantics: the semantics of side-effects which govern the interaction of a program with the computer’s memory, and the semantics of concurrent processes.12079Fellowshipjms-0061jms-0061.xmlTypeSynth: synthetic methods in program verification2022Jon Sterling10.3030/101065303Marie Skłodowska-Curie Actions Postdoctoral Fellowship
Beneficiary:
Jonathan Sterling
Award:
Marie Skłodowska-Curie Actions Postdoctoral Fellowship
Funder:
European Commission, Horizon Europe Framework Programme (HORIZON)
Host:
Aarhus University, Center for Basic Research in Program Verification
Years:
2022–2024 (terminated 2023)
Amount:
214,934.4 EUR
See the Final Report and Bibliography.Abstract. Software systems mediate a growing proportion of human activity, e.g. communication, transport, medicine, industrial and agricultural production, etc. As a result, it is urgent to understand and better control both the correctness and security properties of these increasingly complex software systems. The diversity of verification requirements speaks to a need for models of program execution that smoothly interpolate between many different levels of abstraction.Models of program execution vary in expressiveness along the spectrum of possible programming languages and specification logics. At one extreme, dependent type theory is a language for mathematically-inspired functional programming that is sufficiently expressive to serve as its own specification logic. Dependent type theory has struggled, however, to incorporate several computational effects that are common in everyday programming languages, such as state and concurrency. Languages that support these features require very sophisticated specification logics due to the myriad details that must be surfaced in their semantic models.In the context of dependent type theory, I have recently developed a new technique called Synthetic Tait Computability or STC that smoothly combines multiple levels of abstraction into a single language. Inspired by sophisticated mathematical techniques invented in topos theory and category theory for entirely different purposes, STC enables low-level details (even down to execution steps) to be manipulated in a simpler and more abstract way than ever before, making them easier to control mathematically. 
Perhaps more importantly, the STC method makes it possible to import ideas and techniques from other mathematical fields that are comparatively more developed than programming languages.The goal of the TypeSynth project is to extend the successful STC approach to a wider class of programming models, in particular programming languages with effects.12100Grantjms-006Cjms-006C.xmlSession types and phase distinctions for noninterference2021Stephanie BalzerRobert HarperJon SterlingUnited States Air Force Office of Scientific Research
PIs:
Stephanie Balzer and Robert Harper
Unfunded:
Jonathan Sterling
Funding Agency:
United States Air Force Office of Scientific Research
Program Officer:
Dr. Tristan Nguyen
Award No.:
FA9550-21-1-0385
Years:
2021–2022
The purpose of this research is to investigate the development of programming language techniques to express and enforce constraints on the flow of information in a program.Type systems are the most widely applicable tool for enforcing such restrictions within and among programs. The aim of the project is to investigate the development of suitable type systems for information flow security in two settings, and to understand their inter-relationship. In each case the goal is to state and prove non-interference properties of programs that ensure the independence of non-sensitive outputs from sensitive inputs to a system. Both methods draw on the method of logical relations to establish these properties.The first setting is that of session types for communicating programs. In their most basic form session types express the protocols for interaction among programs that interact along data-carrying communication channels. A key characteristic of session types is that they are able to track changes of state in a computation using methods drawn from substructural logic. The intent of the investigation is to extend session types to track information flow in a composite program using refinement types that encode security levels of data.The second setting is that of program modules which govern the construction of programs from separable, reusable components. Type systems for modularity are primarily concerned with the interfaces between components, which ensure that the effects of changes to a module implementation on other modules in a system can be tightly controlled. The project will investigate the extension of module type systems to express information flow dependencies among components using a generalization of the phase distinction between static and dynamic aspects of a program to account for a richer hierarchy of security levels.
12108jms-0063jms-0063.xmlStudentsJon Sterlingjms-00014694jms-00DVjms-00DV.xmlMasters-level students2023119jms-00634693Personleonipughleonipugh.xmlLeoni PughPart III StudentUniversity of Cambridgejonmsterling4698jms-00B5jms-00B5.xmlBachelor-level students20231020jms-00634695Personzhiyiliuzhiyiliu.xmlZhiyi LiuUniversity of CambridgeUndergraduate Studentjonmsterling4696Persondanielepalombidanielepalombi.xmlDaniele Palombihttps://dpl0a.github.io/jonmsterlingSapienza University of Rome, 20[ ]0000-0002-8107-54394697Personaoyangyuaoyangyu.xmlAoyang Yujonmsterlinghttps://permui.github.ioZhejiang University4703jms-00B6jms-00B6.xmlPhD thesis committees20231020jms-00634699Personfilipposestinifilipposestini.xmlFilippo SestiniFunctional Software EngineerImandrahttp://www.cs.nott.ac.uk/~psxfs5/0000-0002-8701-56134700Personloïcpujetloïcpujet.xmlLoïc Pujethttps://pujet.fr/Stockholm UniversitySverker Lerheden Postdoctoral FellowMy research interests lie mainly in type theory, proof assistants, homotopy theory and constructive mathematics.4701Personyueniuyueniu.xmlYue NiuPhD StudentrobertharperCarnegie Mellon University0000-0003-4888-6042PhD student of Robert Harper.4702Personwojciechnawrockiwojciechnawrocki.xmlWojciech NawrockiCarnegie Mellon UniversityPhD Studenthttps://voidma.in4705jms-00SOjms-00SO.xmlPostdocsJon Sterlingjms-00634704Personandrewslatteryandrewslattery.xmlAndrew Slatteryhttps://andrewslattery.github.ioPhD Student; Research AssistantUniversity of Leeds; Cambridge Computer Laboratorynicolagambinojonmsterling
12110jms-007Zjms-007Z.xmlTeachingJon SterlingAlejandro AguirreAndrew PittsFrank StajanoLars BirkedalMarcelo Fiorejms-00014714Coursejms-0081jms-0081.xmlDiscrete Mathematics (2023–24)2023Marcelo FioreJon SterlingAndrew PittsFrank Stajanohttps://www.cl.cam.ac.uk/teaching/2324/DiscMath/University of CambridgeThe course aims to introduce the mathematics of discrete structures, showing it as an essential tool for computer science that can be clever and beautiful.Michaelmas term lectured by Marcelo Fiore; Lent term lectured by Jon Sterling.4709jms-00JBjms-00JB.xmlLectures on discrete mathematics2024118Jon SterlingAndrew PittsFrank StajanoMarcelo FioreDiscrete Mathematics (2023–24)3479jms-00I6jms-00I6.xmlLecture 13: relations and matrices2024119Jon SterlingMarcelo Fiore
3355jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3365jms-00J6jms-00J6.xmlBasic definitions2024118Jon SterlingMarcelo Fiore3357Definitionjms-00I7jms-00I7.xmlRelation2024117Jon SterlingMarcelo FioreA (binary) relation R from a set A to a set B, written R \colon A \mathbin { \nrightarrow } B or R \in \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} is defined to be a subset R \subseteq A \times B. We shall typically write a \mathrel {R}b for { \mathopen {} \left ( a,b \right ) \mathclose {}} \in R. More generally, a relation between multiple sets { \mathopen {} \left ( A_i \right ) \mathclose {}} _{i \in I} is defined to be a subset of the cartesian product \prod _{i \in I} A_i.3362Lemmajms-00IMjms-00IM.xmlRelational extensionality2024117Jon SterlingMarcelo FioreLet A and B be two sets, and let R,S \colon A \mathbin { \nrightarrow } B be two relations from A to B. Then we have R=S if and only if \forall {a \in A} \mathpunct {.} \forall {b \in B} \mathpunct {.} a \mathrel {R}b \Longleftrightarrow a \mathrel {S}b.
3360Proof#425unstable-425.xml2024117Jon Sterlingjms-00IM
We recall that a relation from A to B is nothing more than a subset of A \times B. By the axiom of extensionality, two subsets of A \times B are equal if and only if they contain precisely the same elements.
3384jms-00I8jms-00I8.xmlUses of relations in computer science2024117Jon SterlingMarcelo Fiore3368Examplejms-00I9jms-00I9.xmlRelations in program specification2024117Jon SterlingMarcelo FioreIn the simplest terms, a specification of a program is a relation that describes the possible input/output pairs that can occur. For example, the specification that a given program compute the square root is captured by the relation \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} given by pairs { \mathopen {} \left ( x,y \right ) \mathclose {}} such that x = y^2.3371Examplejms-00IAjms-00IA.xmlRelations in operational semantics2024117Jon SterlingLet E represent the set of states in a machine; then the behavior of this machine is usually described by a pair of relations S \colon E \mathbin { \nrightarrow } E and V \colon E \mathbin { \nrightarrow } { \mathopen {} \left \{ \star \right \} \mathclose {}}, such that e \mathrel {S} e' when it is possible for the machine to transition from state e to e' and such that e \mathrel {V} \star when the machine can halt in state e.3373Examplejms-00IBjms-00IB.xmlRelations in program typing2024117Jon SterlingMarcelo FioreLet E be the set of expressions in a given programming language, and let T be the set of types in that programming language. Then the property of a given program having a certain type forms a relation E \mathbin { \nrightarrow } T.3376Examplejms-00ICjms-00IC.xmlRelations for program equivalence2024117Jon SterlingLet e,e' be two programs of type \tau. We say that e and e' are observationally equivalent when, for any other program h \colon \tau \to { \mathopen {} \left ( \right ) \mathclose {}}, the program h { \mathopen {} \left ( e \right ) \mathclose {}} terminates if and only if h { \mathopen {} \left ( e' \right ) \mathclose {}} terminates. 
If E_ \tau is the set of programs of type \tau, observational equivalence therefore forms a relation E_ \tau \mathbin { \nrightarrow } E_ \tau.3378Examplejms-00IDjms-00ID.xmlNetworks as relations2024117Jon SterlingMarcelo FioreA network is given by a set of nodes N and a relation C \colon N \mathbin { \nrightarrow } N expressing which two nodes are connected.3381Examplejms-00IEjms-00IE.xmlRelations in databases2024117Jon SterlingMarcelo FioreWe now come to an example of a relation between multiple sets: we could define a relation R \subseteq \text {Movies} \times \text {Directors} \times \text {Years} \times \text {People} consisting of tuples { \mathopen {} \left ( m,d,y,p \right ) \mathclose {}} where m is a movie directed by d in year y with p as a cast member.3398jms-00IFjms-00IF.xmlFormal examples of relations2024117Jon SterlingMarcelo Fiore3387Examplejms-00IGjms-00IG.xmlThe empty relation2024117Jon SterlingMarcelo FioreFor any two sets A and B, we may form the empty relation \varnothing \colon A \mathbin { \nrightarrow } B that relates no elements. In other words, \varnothing is the empty subset of A \times B.3390Examplejms-00IHjms-00IH.xmlThe full relation2024117Jon SterlingMarcelo FioreFor any two sets A and B, we may form the full relation { \mathopen {} \left ( A \times B \right ) \mathclose {}} \colon A \mathbin { \nrightarrow } B, also called the total relation, so that a \mathrel { { \mathopen {} \left ( A \times B \right ) \mathclose {}} }b for all a \in A and b \in B. In other words, { \mathopen {} \left ( A \times B \right ) \mathclose {}} is the total subset of A \times B. 3393Examplejms-00IIjms-00II.xmlThe identity relation2024117Jon SterlingMarcelo FioreFor any set A, we can form the identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A, also called the equality relation, which relates each element of A to itself. 
In other words, we have a \mathrel { \mathsf {id}_{ A } } a' if and only if a=a'. We have already seen the square root relation from non-negative reals to reals, which corresponds to a total but many-valued function. We can define an analogous relation below from the natural numbers to the integers, which corresponds to a partial and many-valued function.3396Examplejms-00IJjms-00IJ.xmlThe integer square root relation2024117Jon SterlingThe square root operation corresponds to a relation R_2 \colon \mathbb {N} \mathbin { \nrightarrow } \mathbb {Z} such that m \mathrel {R_2} n if and only if m = n^2.3407jms-00J5jms-00J5.xmlVisualising relations2024118Jon SterlingMarcelo Fiore3401Notationjms-00IKjms-00IK.xmlInternal diagrams of relations2024117Jon SterlingMarcelo FioreA useful way to visualise a relation between two sets is by means of internal diagrams: each set is depicted as a blob containing its elements, and lines are drawn from the elements of one blob to the elements of the second blob when they are related. In particular, let R \colon \mathbb {N} \mathbin { \nrightarrow } \mathbb {Z} be the following relation: R = { \mathopen {} \left \{ { \mathopen {} \left ( 0,0 \right ) \mathclose {}} , { \mathopen {} \left ( 1,-1 \right ) \mathclose {}} , { \mathopen {} \left ( 0,1 \right ) \mathclose {}} , { \mathopen {} \left ( 1,2 \right ) \mathclose {}} , { \mathopen {} \left ( 1,1 \right ) \mathclose {}} , { \mathopen {} \left ( 2,1 \right ) \mathclose {}} \right \} \mathclose {}} We can depict R by the following internal diagram:
\usepackage{tikz, tikz-cd, mathtools, amssymb, stmaryrd}
\usetikzlibrary{matrix,arrows}
\usetikzlibrary{backgrounds,fit,positioning,calc,shapes}
\usetikzlibrary{decorations.pathreplacing}
\usetikzlibrary{decorations.pathmorphing}
\usetikzlibrary{decorations.markings}
\begin {tikzpicture}
\begin {scope}
\node (l/0) {$0$};
\node [below = .5cm of l/0] (l/1) {$1$};
\node [below = .5cm of l/1] (l/2) {$2$};
\end {scope}
\begin {scope}[shift={(1.5cm,0)}]
\node (r/-1) {$-1$};
\node [below = .5cm of r/-1] (r/0) {$0$};
\node [below = .5cm of r/0] (r/1) {$1$};
\node [below = .5cm of r/1] (r/2) {$2$};
\end {scope}
\draw [thick] (l/0) to (r/0);
\draw [thick] (l/1) to (r/-1);
\draw [thick] (l/0) to (r/1);
\draw [thick] (l/1) to (r/2);
\draw [thick] (l/1) to (r/1);
\draw [thick] (l/2) to (r/1);
\begin {scope}[on background layer]
\node [rectangle, rounded corners=10pt, fill=yellow!20,thick,fit=(l/0)(l/2),inner sep=5pt] {};
\node [rectangle, rounded corners=10pt, fill=red!20,thick,fit=(r/-1)(r/2),inner sep=5pt] {};
\end {scope}
\end {tikzpicture}
3404Exercisejms-00ILjms-00IL.xmlAn internal diagram2024117Jon SterlingMarcelo FioreDraw the internal diagram corresponding to the following relation: \begin {aligned} S& \colon \mathbb {Z} \mathbin { \nrightarrow } \mathbb {Z} \\ S&= { \mathopen {} \left \{ { \mathopen {} \left ( 1,0 \right ) \mathclose {}} , { \mathopen {} \left ( 1,2 \right ) \mathclose {}} , { \mathopen {} \left ( 2,1 \right ) \mathclose {}} , { \mathopen {} \left ( 2,3 \right ) \mathclose {}} \right \} \mathclose {}} \end {aligned} 3423jms-00IQjms-00IQ.xmlRelational composition2024117Jon SterlingMarcelo Fiore3410Definitionjms-00INjms-00IN.xmlRelational composite2024117Jon SterlingMarcelo FioreGiven relations R \colon A \mathbin { \nrightarrow } B and S \colon B \mathbin { \nrightarrow } C, we can define the relational composite S \circ R \colon A \mathbin { \nrightarrow } C in a way that generalises composition of functions. In particular, we define S \circ R to be the following subset of A \times C: \begin {aligned} S \circ R & \colon A \mathbin { \nrightarrow } C \\ S \circ R &= { \mathopen {} \left \{ { \mathopen {} \left ( a,c \right ) \mathclose {}} \in A \times C \mid \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S} c \right \} \mathclose {}} \end {aligned} 3415Examplejms-00IOjms-00IO.xmlNegation invariance of the square root relation2024117Jon SterlingMarcelo FioreRecall the square root relation \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} from the example on relations in program specification, and let \mathsf {neg} \colon \mathbb {R} \mathbin { \nrightarrow } \mathbb {R} be the relation { \mathopen {} \left \{ (x,y) \in \mathbb {R}^2 \mid x = -y \right \} \mathclose {}}. Then the relational composite \mathsf {neg} \circ \mathsf {sq} \colon \mathbb {R}_{ \geq 0} \mathbin { \nrightarrow } \mathbb {R} is equal to \mathsf {sq}.
3413Proof#424unstable-424.xml2024117Jon Sterlingjms-00IO
By relational extensionality, it suffices to check that for any x \in \mathbb {R}_{ \geq 0} and y \in \mathbb {R}, we have x \mathrel { { \mathopen {} \left ( \mathsf {neg} \circ \mathsf {sq} \right ) \mathclose {}} } y if and only if x \mathrel { \mathsf {sq}}y. We compute:
\begin {aligned} x \mathrel { { \mathopen {} \left ( \mathsf {neg} \circ \mathsf {sq} \right ) \mathclose {}} } y & \Longleftrightarrow \exists z \in \mathbb {R} \mathpunct {.} x \mathrel { \mathsf {sq}}z \land z \mathrel { \mathsf {neg}}y \\ & \Longleftrightarrow \exists z \in \mathbb {R} \mathpunct {.} x=z^2 \land z =-y \\ & \Longleftrightarrow x= { \mathopen {} \left ( -y \right ) \mathclose {}} ^2 \\ & \Longleftrightarrow x = y^2 \\ & \Longleftrightarrow x \mathrel { \mathsf {sq}}y \end {aligned}
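The computation above can be spot-checked on a finite sample. The following is only a sketch: it represents a relation as a Python set of pairs, the helper name `compose` is an assumption introduced here, and the real numbers are replaced by small integer ranges, so it illustrates the proof rather than replacing it.

```python
from itertools import product


def compose(S, R):
    """Relational composite "S after R": all (a, c) with a R b and b S c for some b."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


# Finite stand-ins for the non-negative reals and the reals (an assumption of this sketch).
xs = range(0, 17)
ys = range(-4, 5)  # symmetric about 0, so negation stays inside the sample

sq = {(x, y) for x, y in product(xs, ys) if x == y * y}
neg = {(x, y) for x, y in product(ys, ys) if x == -y}

# Negation invariance on the sample: neg after sq relates exactly the same pairs as sq.
assert compose(neg, sq) == sq
```

The sample for the codomain is chosen symmetric about zero on purpose: with an asymmetric range the two sides would differ at the boundary, which is an artefact of truncation and not of the relations themselves.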
3420Lemmajms-00IPjms-00IP.xmlAssociativity and unit laws of relational composition2024117Jon SterlingMarcelo FioreRelational composition is associative and has the identity relation as a neutral element.
3418Proof#423unstable-423.xml2024117Jon Sterlingjms-00IP
To prove associativity, we fix relations R \colon A \mathbin { \nrightarrow } B, S \colon B \mathbin { \nrightarrow } C, and T \colon C \mathbin { \nrightarrow } D to prove { \mathopen {} \left ( T \circ S \right ) \mathclose {}} \circ R = T \circ { \mathopen {} \left ( S \circ R \right ) \mathclose {}}. To get started, we compute the intermediate composites:
\begin {aligned} b \mathrel { { \mathopen {} \left ( T \circ S \right ) \mathclose {}} }d & \Longleftrightarrow \exists c \in C \mathpunct {.} b \mathrel {S}c \land c \mathrel {T}d \\ a \mathrel { { \mathopen {} \left ( S \circ R \right ) \mathclose {}} }c & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \end {aligned}
Using the above, we can compute the full composites:
\begin {aligned} a \mathrel { { \mathopen {} \left ( { \mathopen {} \left ( T \circ S \right ) \mathclose {}} \circ R \right ) \mathclose {}} } d & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel { { \mathopen {} \left ( T \circ S \right ) \mathclose {}} } d \\ & \Longleftrightarrow \exists b \in B \mathpunct {.} a \mathrel {R}b \land \exists c \in C \mathpunct {.} b \mathrel {S}c \land c \mathrel {T}d \\ & \Longleftrightarrow \exists b \in B \mathpunct {.} \exists c \in C \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \land c \mathrel {T}d \\ & \Longleftrightarrow \exists c \in C \mathpunct {.} { \mathopen {} \left ( \exists b \in B \mathpunct {.} a \mathrel {R}b \land b \mathrel {S}c \right ) \mathclose {}} \land c \mathrel {T}d \\ & \Longleftrightarrow \exists c \in C \mathpunct {.} a \mathrel { { \mathopen {} \left ( S \circ R \right ) \mathclose {}} }c \land c \mathrel {T}d \\ & \Longleftrightarrow a \mathrel { { \mathopen {} \left ( T \circ { \mathopen {} \left ( S \circ R \right ) \mathclose {}} \right ) \mathclose {}} }d \end {aligned}
For the right and left neutrality, we must prove that R \circ \mathsf {id}_{ A } = R = \mathsf {id}_{ B } \circ R for all R \colon A \mathbin { \nrightarrow } B. We prove only the first law, as the other proof is analogous:
\begin {aligned} a \mathrel { { \mathopen {} \left ( R \circ \mathsf {id}_{ A } \right ) \mathclose {}} } b & \Longleftrightarrow \exists a' \in A \mathpunct {.} a \mathrel { \mathsf {id}_{ A } }a' \land a' \mathrel {R}b \\ & \Longleftrightarrow \exists a' \in A \mathpunct {.} a = a' \land a' \mathrel {R}b \\ & \Longleftrightarrow a \mathrel {R} b \end {aligned}
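As a sanity check on the lemma, one can test the associativity and unit laws on randomly generated finite relations. This is a sketch under assumptions: relations are sets of pairs, and the names `compose`, `identity` and `random_relation` are hypothetical helpers introduced for illustration. Passing the checks is evidence, not a proof.

```python
import random


def compose(S, R):
    """Relational composite "S after R"."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def identity(A):
    """The identity relation on the finite set A."""
    return {(a, a) for a in A}


def random_relation(A, B, rng):
    """A random subset of A x B, each pair included with probability 1/2."""
    return {(a, b) for a in A for b in B if rng.random() < 0.5}


rng = random.Random(0)  # fixed seed so the run is reproducible
A, B, C, D = range(3), range(4), range(2), range(3)

for _ in range(100):
    R = random_relation(A, B, rng)
    S = random_relation(B, C, rng)
    T = random_relation(C, D, rng)
    # Associativity: (T after S) after R equals T after (S after R).
    assert compose(compose(T, S), R) == compose(T, compose(S, R))
    # Unit laws: composing with an identity relation on either side changes nothing.
    assert compose(R, identity(A)) == R == compose(identity(B), R)
```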
3476jms-00IRjms-00IR.xmlRelations and matrices2024117Jon SterlingMarcelo FioreRelations between finite sets can be described in a more computationally friendly way by their tabulation as matrices. In particular, we shall see that an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over the boolean semiring is precisely the same thing as a relation from { \mathopen {} \left [ m \right ] \mathclose {}} to { \mathopen {} \left [ n \right ] \mathclose {}}, where { \mathopen {} \left [ l \right ] \mathclose {}} = { \mathopen {} \left \{ i \mid 0 \leq i < l \right \} \mathclose {}} is the set of natural numbers strictly smaller than l. Then we will see that relational composition is, under this correspondence, the same as matrix multiplication.3429Definitionjms-00ISjms-00IS.xmlMatrix over a semiring2024117Jon SterlingMarcelo FioreFor natural numbers m and n, an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over a semiring { \mathopen {} \left ( S,0, \oplus ,1, \odot \right ) \mathclose {}} is given by entries M_{i,j} \in S for all i \in { \mathopen {} \left [ m \right ] \mathclose {}} and j \in { \mathopen {} \left [ n \right ] \mathclose {}}. We will write \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} for the set of { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrices over S.3433Notationjms-00J4jms-00J4.xmlMatrices as tables2024117Jon SterlingMarcelo Fiore{ \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrices can be depicted in tables or grids with rows in the first dimension and columns in the second dimension. 
For example, let M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( 3 , 2 \right ) \mathclose {}} be the matrix over the booleans defined by the following equation: M_{i,j} = \begin {cases} \mathsf {true} & \text {if } \mathsf {parity} { \mathopen {} \left ( i \right ) \mathclose {}} = \mathsf {parity} { \mathopen {} \left ( j \right ) \mathclose {}} \\ \mathsf {false} & \text {otherwise} \end {cases} Then M is depicted by the following table with three rows and two columns: M = \begin {bmatrix} \mathsf {true} & \mathsf {false} \\ \mathsf {false} & \mathsf {true} \\ \mathsf {true} & \mathsf {false} \end {bmatrix} 3437Definitionjms-00ITjms-00IT.xmlThe identity matrix2024117Jon SterlingFor any m \in \mathbb {N}, we define the identity { \mathopen {} \left ( m \times m \right ) \mathclose {}}-matrix over a given semiring S as follows: I^m_{i,j} = \begin {cases} 1& \text {if } i=j \\ 0& \text {otherwise} \end {cases} The identity matrix is sometimes called the diagonal matrix, for reasons that become apparent when visualising it as a table: I^4 = \begin {bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end {bmatrix} 3454jms-00J1jms-00J1.xmlCorrespondence between matrices and finite relations2024117Jon Sterling3440Definitionjms-00IYjms-00IY.xmlThe matrix associated to a finite relation2024117Jon SterlingLet R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} be a relation; for any semiring S, we may form the { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over S associated to R as follows: { \mathopen {} \left ( \operatorname { \underline {mat}} _S{R} \right ) \mathclose {}} _{i \in { \mathopen {} \left [ m \right ] \mathclose {}} ,j \in { \mathopen {} \left [ n \right ] \mathclose {}} } = \begin {cases} 1& \text {if } i \mathrel {R}j \\ 0& \text {otherwise} \end {cases} We have defined a function \operatorname { \underline {mat}} _S 
\colon \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}}.3443Definitionjms-00IZjms-00IZ.xmlThe relation associated to a matrix2024117Jon SterlingLet M be an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over a semiring S. We define the relation associated to M below: \begin {aligned} \operatorname { \underline {rel}} _S{M} & \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} \\ i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _S{M} \right ) \mathclose {}} } j & \Longleftrightarrow M_{i,j} = 1 \end {aligned} We have defined a function \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}}.3447Lemmajms-00J0jms-00J0.xmlA retraction from matrices to finite relations2024117Jon SterlingThe associated matrix function \operatorname { \underline {mat}} _S \colon \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} is a section of the associated relation function \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} for any semiring S.
3445Proof#421unstable-421.xml2024117Jon Sterlingjms-00J0
We must check that \operatorname { \underline {rel}} _S \circ \operatorname { \underline {mat}} _S = \mathsf {id}_{ \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ m \right ] \mathclose {}} , { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} }. Fixing a relation R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}}, we compute:
\begin {aligned} i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _S { \mathopen {} \left ( \operatorname { \underline {mat}} _SR \right ) \mathclose {}} \right ) \mathclose {}} } j & \Longleftrightarrow { \mathopen {} \left ( \operatorname { \underline {mat}} _SR \right ) \mathclose {}} _{i,j} = 1 \\ & \Longleftrightarrow i \mathrel {R}j \end {aligned}
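The round trip just computed can be illustrated concretely. In this sketch a matrix is a list of rows and a relation is a set of index pairs. The function names `mat` and `rel` mirror the notation of the text, but their Python signatures (including the `zero` and `one` keyword arguments standing for the semiring constants) are assumptions made here for illustration.

```python
def mat(R, m, n, zero=False, one=True):
    """Matrix with m rows and n columns: entry (i, j) is `one` if i R j, else `zero`."""
    return [[one if (i, j) in R else zero for j in range(n)] for i in range(m)]


def rel(M, one=True):
    """Relation associated to a matrix: i is related to j exactly when M[i][j] equals `one`."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x == one}


# A relation from [3] to [2].
R = {(0, 0), (0, 1), (2, 1)}

# Over the booleans: rel after mat gives back the relation we started with.
assert rel(mat(R, 3, 2)) == R

# The same holds over the natural numbers with 0 and 1 as the semiring constants.
assert rel(mat(R, 3, 2, zero=0, one=1), one=1) == R
```

The check relies on the constants `zero` and `one` being distinct, which the computation in the proof also uses implicitly.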
The other composite \operatorname { \underline {mat}} _S \circ \operatorname { \underline {rel}} _S \colon \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} is not in general the identity function, but is (necessarily) an idempotent on the set of matrices over S. We will see that this idempotent, in some sense, measures the degree to which the base semiring S is not boolean.3451Lemmajms-00IXjms-00IX.xmlFinite relations as matrices over the booleans2024117Jon SterlingThe idempotent { \operatorname { \underline {mat}} _ \mathbb {B} \circ \operatorname { \underline {rel}} _ \mathbb {B} \colon \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} \to \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} } is in fact the identity function on matrices over the boolean semiring \mathbb {B}.
3449Proof#422unstable-422.xml2024117Jon Sterlingjms-00IX
We fix a matrix M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} and compute:
\begin {aligned} { \mathopen {} \left ( \operatorname { \underline {mat}} _ \mathbb {B} { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} \right ) \mathclose {}} _{i,j} &= \begin {cases} 1 & \text {if } i \mathrel { { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} } j \\ 0 & \text {otherwise} \end {cases} \\ &= \begin {cases} 1 & \text {if } M_{i,j} = 1 \\ 0 & \text {otherwise} \end {cases} \end {aligned}
Because \mathbb {B} is the boolean semiring, any scalar s \in \mathbb {B} is either 0 or 1. Therefore, we conclude:
{ \mathopen {} \left ( \operatorname { \underline {mat}} _ \mathbb {B} { \mathopen {} \left ( \operatorname { \underline {rel}} _ \mathbb {B} M \right ) \mathclose {}} \right ) \mathclose {}} _{i,j} = M_{i,j}
Thus we conclude that an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix over the booleans is the same thing as a relation from { \mathopen {} \left [ m \right ] \mathclose {}} to { \mathopen {} \left [ n \right ] \mathclose {}}.3473jms-00IWjms-00IW.xmlMultiplication of matrices2024117Jon SterlingMarcelo Fiore3456Side remarkjms-00IUjms-00IU.xmlMatrix multiplication in geometry2024117Jon SterlingAlthough we do not explore it in this course, the viewpoint of matrix multiplication as relational composition generalises to a correct explanation of the role of matrices in geometry as presentations of linear maps between vector spaces in terms of a (non-canonical) choice of basis.3461Definitionjms-00IVjms-00IV.xmlProduct of matrices2024117Jon SterlingLet M be an { \mathopen {} \left ( l \times m \right ) \mathclose {}}-matrix and let N be an { \mathopen {} \left ( m \times n \right ) \mathclose {}}-matrix. The product of M and N, written N \cdot M, is the following { \mathopen {} \left ( l \times n \right ) \mathclose {}}-matrix: { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{i \in { \mathopen {} \left [ l \right ] \mathclose {}} ,j \in { \mathopen {} \left [ n \right ] \mathclose {}} } = \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \cdot N_{k,j} 3465Lemmajms-00J3jms-00J3.xmlAssociativity and unit laws of matrix products2024117Jon SterlingMarcelo FioreThe product of matrices is associative and has the identity matrix as neutral element.
3463Proof#419unstable-419.xml2024117Jon Sterlingjms-00J3
For associativity, fix matrices L \in \mathrm {Mat}_{ S } { \mathopen {} \left ( k , l \right ) \mathclose {}}, M \in \mathrm {Mat}_{ S } { \mathopen {} \left ( l , m \right ) \mathclose {}} and N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} to check that { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} \cdot L = N \cdot { \mathopen {} \left ( M \cdot L \right ) \mathclose {}}. Below we use the associativity of multiplication, commutativity of addition, and distributivity of multiplication over addition in the semiring S:
\begin {aligned} { \mathopen {} \left ( { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} \cdot L \right ) \mathclose {}} _{a,b} &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{c,b} \\ &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{c,d} \cdot N_{d,b} \\ &= \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } L_{a,c} \cdot M_{c,d} \cdot N_{d,b} \\ &= \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } { \mathopen {} \left ( \bigoplus _{c \in { \mathopen {} \left [ l \right ] \mathclose {}} } L_{a,c} \cdot M_{c,d} \right ) \mathclose {}} \cdot N_{d,b} \\ &= \bigoplus _{d \in { \mathopen {} \left [ m \right ] \mathclose {}} } { \mathopen {} \left ( M \cdot L \right ) \mathclose {}} _{a,d} \cdot N_{d,b} \\ &= { \mathopen {} \left ( N \cdot { \mathopen {} \left ( M \cdot L \right ) \mathclose {}} \right ) \mathclose {}} _{a,b} \end {aligned}
For one unit law, we fix M \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} and recall the definition of the identity matrix to check that M \cdot I^m = M.
\begin {aligned} { \mathopen {} \left ( M \cdot I^m \right ) \mathclose {}} _{i,j} &= \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } I^m_{i,k} \cdot M_{k, j} \end {aligned}
Unfolding the definition of the identity matrix entries I^m_{i,k}, we see that the iterated sum above expands to m-1 copies of 0 \cdot M_{k,j} for various k \not = i and one copy of 1 \cdot M_{i,j}. Thus, we conclude that { \mathopen {} \left ( M \cdot I^m \right ) \mathclose {}} _{i,j} = M_{i,j} using the absorption and unit laws for multiplication as well as the unit laws for addition in S.
The other unit law follows in an analogous way.
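The lemma can be spot-checked on concrete matrices. The following is a minimal sketch in Python, assuming matrices are represented as lists of rows with at least one row and one column; the names `matrix_product`, `identity_matrix`, and `nat_product` are illustrative and do not come from the notes. The semiring is passed in as its addition, multiplication, and zero, and the example uses the natural numbers.

```python
import operator


def matrix_product(M, N, add, mul, zero):
    """Return the matrix whose (i, j) entry is the sum over k of M[i][k] * N[k][j].

    M is an l x m matrix and N is an m x n matrix, both given as lists of rows.
    In the notation of the notes this matrix is written N . M.
    """
    m = len(N)
    assert all(len(row) == m for row in M), "inner dimensions must agree"
    n = len(N[0])
    result = []
    for row in M:
        out = []
        for j in range(n):
            acc = zero
            for k in range(m):
                acc = add(acc, mul(row[k], N[k][j]))
            out.append(acc)
        result.append(out)
    return result


def identity_matrix(n, zero, one):
    """The n x n matrix with `one` on the diagonal and `zero` elsewhere."""
    return [[one if i == j else zero for j in range(n)] for i in range(n)]


def nat_product(M, N):
    """Matrix product over the semiring of natural numbers."""
    return matrix_product(M, N, operator.add, operator.mul, 0)


# A 1 x 2, a 2 x 3, and a 3 x 1 matrix over the natural numbers.
L = [[1, 2]]
M = [[0, 1, 2], [3, 4, 5]]
N = [[1], [0], [2]]

left = nat_product(nat_product(L, M), N)   # bracketed to the left
right = nat_product(L, nat_product(M, N))  # bracketed to the right
```

Both bracketings give the same 1 x 1 matrix, and multiplying `M` by an identity matrix of the appropriate size on either side returns `M` unchanged. A check on one example is of course no substitute for the proof above.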
3470Lemmajms-00J2jms-00J2.xmlMatrix product is relational composition2024117Jon SterlingMarcelo FioreUnder the correspondence between boolean matrices and finite relations, matrix products of boolean matrices correspond to relational composites. In particular, given M \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( l , m \right ) \mathclose {}} and N \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}}, we have: \operatorname { \underline {rel}} _ \mathbb {B} { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} = \operatorname { \underline {rel}} _ \mathbb {B} N \circ \operatorname { \underline {rel}} _ \mathbb {B} M
3468Proof#420unstable-420.xml2024117Jon Sterlingjms-00J2
We use the fact that the additive operation of the boolean semiring is disjunction and the multiplicative operation is conjunction:
\begin {aligned} { \mathopen {} \left ( N \cdot M \right ) \mathclose {}} _{i,j} &= \bigoplus _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \cdot N_{k,j} \\ &= \bigvee _{k \in { \mathopen {} \left [ m \right ] \mathclose {}} } M_{i,k} \land N_{k,j} \\ &= \exists k \in { \mathopen {} \left [ m \right ] \mathclose {}} \mathpunct {.} M_{i,k} \land N_{k,j} \end {aligned}
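The correspondence can be illustrated on a small example. This is a sketch in Python under two assumptions that are not taken from the notes: the finite set [n] is represented by the indices 0, ..., n-1, and a relation is a set of pairs. The helper names `mat_of_rel`, `rel_of_mat`, `compose`, and `bool_product` are hypothetical.

```python
def mat_of_rel(R, m, n):
    """Boolean m x n matrix whose (i, j) entry records whether i R j."""
    return [[(i, j) in R for j in range(n)] for i in range(m)]


def rel_of_mat(M):
    """The relation consisting of the positions of the true entries of M."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x}


def compose(R, S):
    """Relational composite 'first R, then S', written S o R in the notes."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def bool_product(M, N):
    """Entry (i, j) is the disjunction over k of M[i][k] and N[k][j]."""
    return [
        [any(M[i][k] and N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
        for i in range(len(M))
    ]


R = {(0, 1), (1, 2)}  # a relation from [2] to [3]
S = {(1, 0), (2, 1)}  # a relation from [3] to [2]

via_matrices = rel_of_mat(bool_product(mat_of_rel(R, 2, 3), mat_of_rel(S, 3, 2)))
via_relations = compose(R, S)
```

Here `via_matrices` and `via_relations` are the same set of pairs, as the lemma predicts.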
3556jms-00JAjms-00JA.xmlLecture 14: directed graphs and paths2024122Jon SterlingMarcelo Fiore
3482jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3496jms-00JSjms-00JS.xmlAdding matrices together2024120Jon Sterling3484Definitionjms-00JOjms-00JO.xmlPointwise addition of matrices2024120Jon SterlingLet M,N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} be two matrices over the same semiring with the same dimensions. We will write M+N \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} for the (pointwise) addition of the two matrices: { \mathopen {} \left ( M+N \right ) \mathclose {}} _{i,j} = M_{i,j} + N_{i,j} 3486Definitionjms-00JPjms-00JP.xmlThe zero matrix2024120Jon SterlingA zero matrix is one whose cells all contain 0. In particular, we have: \begin {aligned} \mathbf {0} & \in \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} \\ \mathbf {0}_{i,j} &= 0 \end {aligned} 3490Lemmajms-00JQjms-00JQ.xmlAssociativity, commutativity, and unit laws of matrix addition2024120Jon SterlingMatrix addition forms an associative and commutative operation on \mathrm {Mat}_{ S } { \mathopen {} \left ( m , n \right ) \mathclose {}} with zero matrices \mathbf {0} as neutral element.
3488Proof#415unstable-415.xml2024120Jon Sterlingjms-00JQ
This follows immediately from the associativity, commutativity, and unit laws of the additive monoid { \mathopen {} \left ( +,0 \right ) \mathclose {}} of the semiring S, applied to each cell separately.
3494Lemmajms-00JRjms-00JR.xmlMatrix addition and union of relations2024120Jon SterlingThe pointwise sum M+N of M,N \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} corresponds to the union of the relations associated to M and N: \operatorname { \underline {rel}} _{ \mathbb {B}} { \mathopen {} \left ( M+N \right ) \mathclose {}} = \operatorname { \underline {rel}} _{ \mathbb {B}}M \cup \operatorname { \underline {rel}} _{ \mathbb {B}}N Moreover, the zero matrix corresponds to the empty relation: \operatorname { \underline {rel}} _{ \mathbb {B}} \mathbf {0} = \varnothing.
3492Proof#414unstable-414.xml2024120Jon Sterlingjms-00JR
This follows because addition and zero in the boolean semiring are given by disjunction and falsehood respectively.
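A small Python sketch of this correspondence, again assuming that [n] is represented by the indices 0, ..., n-1 and that relations are sets of pairs; the names `mat_of_rel`, `rel_of_mat`, and `bool_add` are illustrative only.

```python
def mat_of_rel(R, m, n):
    """Boolean m x n matrix whose (i, j) entry records whether i R j."""
    return [[(i, j) in R for j in range(n)] for i in range(m)]


def rel_of_mat(M):
    """The relation consisting of the positions of the true entries of M."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x}


def bool_add(M, N):
    """Pointwise sum in the boolean semiring, i.e. pointwise disjunction."""
    return [[x or y for x, y in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


R = {(0, 0), (1, 2)}
S = {(0, 0), (0, 1)}
sum_as_relation = rel_of_mat(bool_add(mat_of_rel(R, 2, 3), mat_of_rel(S, 2, 3)))
zero_as_relation = rel_of_mat(mat_of_rel(set(), 2, 3))
```

The pointwise sum corresponds to `R | S`, and the zero matrix corresponds to the empty set of pairs.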
3539jms-00JTjms-00JT.xmlDirected graphs and paths2024120Jon SterlingMarcelo Fiore3498Definitionjms-00JCjms-00JC.xmlDirected graph2024118Jon SterlingMarcelo FioreA directed graph { \mathopen {} \left ( A,R \right ) \mathclose {}} consists of a set A and a binary relation R \colon A \mathbin { \nrightarrow } A from A to itself; such a relation from a set to itself is called an endo-relation. We will abbreviate \mathrm {Rel} { \mathopen {} \left ( A,A \right ) \mathclose {}} by \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of all endo-relations on A.3503Corollaryjms-00JDjms-00JD.xmlMonoid structure of endo-relations2024118Jon SterlingMarcelo FioreFor every set A, the structure { \mathopen {} \left ( \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} , \mathsf {id}_{ A } , \circ \right ) \mathclose {}} given by endo-relations is a monoid.
3501Proof#418unstable-418.xml2024118Jon Sterlingjms-00JD
We have seen that relational composition is associative and has identity relations \mathsf {id}_{ A } as neutral elements.
3507Definitionjms-00JEjms-00JE.xmlIteration of endo-relations2024118Jon SterlingMarcelo FioreFor R \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} and n \in \mathbb {N}, we define the n-fold iteration of R to be the following endo-relation R^{ \circ n} \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}}: R^{ \circ n} = \underbrace {R \circ \cdots \circ R}_{n \text { times}} More precisely, we define R^{ \circ n} by recursion on n: \begin {aligned} R^{ \circ 0} &= \mathsf {id}_{ A } \\ R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} } &= R \circ R^{ \circ n} \end {aligned} 3510Definitionjms-00JFjms-00JF.xmlPath in a directed graph2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For s,t \in A, a path of length n \in \mathbb {N} in R with source s and target t is a tuple { \mathopen {} \left ( a_0, \ldots ,a_n \right ) \mathclose {}} \in A^{n+1} such that a_0=s, a_n=t, and a_i \mathrel {R}a_{i+1} for all 0 \leq i <n.3515Lemmajms-00JGjms-00JG.xmlPaths and iteration2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For all n \in \mathbb {N} and s,t \in A, we have s \mathrel {R^{ \circ n}}t if and only if there exists a path of length n in R with source s and target t.
3513Proof#417unstable-417.xml2024118Jon Sterlingjms-00JG
Let P(n) be the proposition that for all s,t \in A we have s \mathrel {R^{ \circ n}}t if and only if there exists a path of length n in R with source s and target t.
We shall prove P { \mathopen {} \left ( n \right ) \mathclose {}} for all n by induction.
In the base case, we must prove P { \mathopen {} \left ( 0 \right ) \mathclose {}}: this states that for all s,t \in A, we have s=t if and only if there exists a path in R of length 0 from s to t. This holds by definition of paths, considering the unary tuple.
In the inductive step, we may assume P { \mathopen {} \left ( n \right ) \mathclose {}} to prove P { \mathopen {} \left ( n+1 \right ) \mathclose {}}. Fixing s,t \in A we must show that s \mathrel {R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }} t if and only if there exists a path of length n+1 in R from s to t. By definition, we have s \mathrel {R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }} t if and only if there exists s' such that s \mathrel {R}s' and s' \mathrel {R^{ \circ n}} t. By our inductive hypothesis P { \mathopen {} \left ( n \right ) \mathclose {}} applied to s',t, the latter holds if and only if there exists a path of length n from s' to t in R. Such a path extends along the link s \mathrel {R}s' to a path of length n+1 from s to t, and every path of length n+1 from s to t arises in this way, so we are done.
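The statement of the lemma can be checked by brute force on a small graph. The sketch below is in Python and assumes relations are sets of pairs; the names `compose`, `iterate`, and `has_path` are illustrative and not part of the notes. The path search simply enumerates all tuples of the right length, so it is only suitable for tiny examples.

```python
from itertools import product


def compose(R, S):
    """Relational composite 'first R, then S', written S o R in the notes."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def iterate(R, A, n):
    """The n-fold iteration of R: the identity for n = 0, then one more R per step."""
    result = {(a, a) for a in A}
    for _ in range(n):
        result = compose(result, R)  # this computes R o result
    return result


def has_path(R, A, s, t, n):
    """Search all tuples (a_0, ..., a_n) over A for a path of length n from s to t."""
    return any(
        p[0] == s and p[-1] == t and all((p[i], p[i + 1]) in R for i in range(n))
        for p in product(A, repeat=n + 1)
    )


A = [0, 1, 2]
R = {(0, 1), (1, 2), (2, 0)}  # a directed 3-cycle

agrees = all(
    ((s, t) in iterate(R, A, n)) == has_path(R, A, s, t, n)
    for n in range(4)
    for s in A
    for t in A
)
```

On this example `agrees` is true: membership in the n-fold iteration coincides with the existence of a path of length n for every n below 4.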
3518Definitionjms-00JHjms-00JH.xmlThe reflexive-transitive closure of an endo-relation2024118Jon SterlingMarcelo FioreFor R \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}}, we define R^{ \circ *} \in \mathrm {Rel} { \mathopen {} \left ( A \right ) \mathclose {}} to be the smallest relation closed under all finite paths in R, i.e. the reflexive-transitive closure of R: R^{ \circ *} = \bigcup _{n \in \mathbb {N}} R^{ \circ n} 3523Observationjms-00JUjms-00JU.xmlReflexive-transitive closure of finite endo-relations2024121Jon SterlingIn the case of an endo-relation R \in \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}}, the reflexive-transitive closure R^{ \circ *} can be computed using a finite union over k \leq n rather than an infinite union over all of \mathbb {N}; in particular, we have R^{ \circ *} = \bigcup _{k \leq n} R^{ \circ k}.
3521Proof#413unstable-413.xml2024121Jon Sterlingjms-00JU
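There are only n distinct elements that could be related, so a path in which no element is repeated has length less than n. Moreover, any finite path in R from s to t can be replaced by one that has no repetitions, by cutting out the segment between two occurrences of the same element. Hence any pair related by some R^{ \circ k} is already related by some R^{ \circ k'} with k' \leq n.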
3527Corollaryjms-00JIjms-00JI.xmlThe path relation via reflexive-transitive closure2024118Jon SterlingMarcelo FioreLet { \mathopen {} \left ( A,R \right ) \mathclose {}} be a directed graph. For all s,t \in A, we have s \mathrel {R^{ \circ *}}t if and only if there exists a path with source s and target t in R.
3525Proof#416unstable-416.xml2024118Jon Sterlingjms-00JI
This is an immediate consequence of the lemma on paths and iteration: by the definition of the reflexive-transitive closure we have s \mathrel {R^{ \circ *}}t if and only if there exists n \in \mathbb {N} such that s \mathrel {R^{ \circ n}}t, which by that lemma holds if and only if there exists a path from s to t in R of some length n \in \mathbb {N}.
3530Definitionjms-00JNjms-00JN.xmlAdjacency matrix of a finite directed graph2024120Jon SterlingMarcelo FioreWe have seen in Lecture 13 how to turn any relation R \colon { \mathopen {} \left [ m \right ] \mathclose {}} \mathbin { \nrightarrow } { \mathopen {} \left [ n \right ] \mathclose {}} into a matrix \operatorname { \underline {mat}} _{ \mathbb {B}}R \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( m , n \right ) \mathclose {}} over the boolean semiring. When R \in \mathrm {Rel} { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} \right ) \mathclose {}} is the edge relation in a graph with vertices in the finite set { \mathopen {} \left [ n \right ] \mathclose {}}, we refer to \operatorname { \underline {mat}} _{ \mathbb {B}}R \in \mathrm {Mat}_{ \mathbb {B} } { \mathopen {} \left ( n , n \right ) \mathclose {}} as the adjacency matrix of R.3536Algorithmjms-00JVjms-00JV.xmlThe adjacency matrix of the reflexive-transitive closure2024121Jon SterlingMarcelo FioreLet M = \operatorname { \underline {mat}} _{ \mathbb {B}}R be the adjacency matrix of a finite directed graph { \mathopen {} \left ( { \mathopen {} \left [ n \right ] \mathclose {}} ,R \right ) \mathclose {}}. We will show that the adjacency matrix M^* of the reflexive-transitive closure R^{ \circ *} can be computed recursively using the additive and multiplicative structure of matrices. In particular, we can define M^* = M_n where M_k is computed recursively as below: \begin {aligned} M_0 &= I^n \\ M_{k+1} &= I^n + { \mathopen {} \left ( M \cdot M_k \right ) \mathclose {}} \end {aligned} Thus we have a recursive algorithm that can decide whether there is a path between two elements in a finite directed graph: indeed, there is a path from s to t in R if and only if M^*_{s,t} = \mathsf {true}.
3534Proof#412unstable-412.xml2024121Jon Sterlingjms-00JV
We first unravel the first few M_k’s to get a better understanding.
\begin {aligned} M_1 &= I^n + M \cdot M_0 \\ &= I^n+M \cdot I^n \\ &= I^n + M \\ M_2 &= I^n + M \cdot M_1 \\ &= I^n + M \cdot (I^n + M) \\ &= I^n + M \cdot I^n + M \cdot M \\ &= I^n + M + M^2 \end {aligned}
By induction, we can extend the pattern above to a closed form for each M_k. In particular, we deduce that M_k = \sum _{l \leq k} M^l by recalling that the identity matrix is the neutral element for matrix multiplication and that multiplication distributes over addition. Therefore, we have M_n = \sum _{k \leq n}M^k.
Our goal is to show that M_n = \operatorname { \underline {mat}} _{ \mathbb {B}} R^{ \circ *}; we have already seen that matrix addition corresponds to union of relations and matrix multiplication corresponds to relational composition.
By (reflexive-transitive closure of finite endo-relations), we have R^{ \circ *} = \bigcup _{k \leq n} R^{ \circ k}. Therefore:
\operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ *} = \operatorname { \underline {mat}} _{ \mathbb {B}} \bigcup _{k \leq n} R^{ \circ k}
By (matrix addition and union of relations), we have:
\operatorname { \underline {mat}} _{ \mathbb {B}} \bigcup _{k \leq n} R^{ \circ k} = \sum _{k \leq n} \operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ k}
By (matrix product is relational composition), we have:
\sum _{k \leq n} \operatorname { \underline {mat}} _{ \mathbb {B}}R^{ \circ k} = \sum _{k \leq n} { \mathopen {} \left ( \operatorname { \underline {mat}} _{ \mathbb {B}}R \right ) \mathclose {}} ^k = \sum _{k \leq n} M^k = M_n
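The recursion for M_k translates directly into a short program. The following Python sketch assumes boolean matrices are lists of rows of `True`/`False` values and that the vertices of the graph are the indices 0, ..., n-1; the names `bool_identity`, `bool_add`, `bool_product`, and `closure_matrix` are hypothetical. Since every M_k is a sum of powers of M, it does not matter on which side M is multiplied.

```python
def bool_identity(n):
    """The n x n boolean identity matrix."""
    return [[i == j for j in range(n)] for i in range(n)]


def bool_add(M, N):
    """Pointwise disjunction of two boolean matrices of the same shape."""
    return [[x or y for x, y in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


def bool_product(M, N):
    """Entry (i, j) is the disjunction over k of M[i][k] and N[k][j]."""
    return [
        [any(M[i][k] and N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
        for i in range(len(M))
    ]


def closure_matrix(M):
    """Compute M_n by the recursion M_0 = I and M_{k+1} = I + M * M_k."""
    n = len(M)
    identity = bool_identity(n)
    current = identity  # this is M_0
    for _ in range(n):
        current = bool_add(identity, bool_product(current, M))
    return current


T, F = True, False
# Edges 0 -> 1, 1 -> 2, and a loop 3 -> 3 on the vertex set {0, 1, 2, 3}.
M = [
    [F, T, F, F],
    [F, F, T, F],
    [F, F, F, F],
    [F, F, F, T],
]
M_star = closure_matrix(M)
```

In `M_star`, the entry at row 0 and column 2 is true because there is a path 0, 1, 2, while the entry at row 2 and column 0 is false because no path leads back. Each of the n rounds costs one boolean matrix product, so this naive version takes on the order of n^4 boolean operations.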
3553jms-00JWjms-00JW.xmlPreorders and relations2024121Jon SterlingMarcelo Fiore3542Definitionjms-00JXjms-00JX.xmlPreorder2024121Jon SterlingMarcelo FioreA preorder { \mathopen {} \left ( P, \sqsubseteq \right ) \mathclose {}} consists of a set P and a relation { \sqsubseteq } \colon P \mathbin { \nrightarrow } P satisfying the following two axioms:Reflexivity. \forall x \in P \mathpunct {.} x \sqsubseteq x
Transitivity. \forall x,y,z \in P \mathpunct {.} { \mathopen {} \left ( x \sqsubseteq y \land y \sqsubseteq z \right ) \mathclose {}} \implies x \sqsubseteq z In other words, a preorder is a directed graph such that the edge relation is both reflexive and transitive.3545Examplejms-00JYjms-00JY.xmlExamples of preorders2024121Jon SterlingMarcelo FiorePreorders are everywhere in both general mathematics and computer science. The real numbers equipped with their inequality relation form a preorder { \mathopen {} \left ( \mathbb {R}, \leq \right ) \mathclose {}}. The converse relation { \mathopen {} \left ( \mathbb {R}, \geq \right ) \mathclose {}} of course also forms a preorder. This example can be specialised to different classes of numbers, such as integers or naturals.
Subsets with their inclusion and containment relations form preorders { \mathopen {} \left ( \mathcal {P}(A), \subseteq \right ) \mathclose {}} and { \mathopen {} \left ( \mathcal {P}(A), \supseteq \right ) \mathclose {}}. This example can be specialised to other examples that have additional structure, such as the preorder \mathcal {P}_f(A) of finite subsets of a given set A, or the preorder \mathcal {O}(X) of open sets of a topological space X.
A slightly more nontrivial example is furnished by the preorder { \mathopen {} \left ( \mathbb {Z},{ \mid } \right ) \mathclose {}} of the integers and the divisibility relation, where m \mid n if and only if \exists k \in \mathbb {Z} \mathpunct {.} m k = n: indeed, we certainly have n divides n, and if l divides m and m divides n, then l divides n. The examples in the first item above happen to be total orders (meaning that any two elements are related in one direction or the other), whereas { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} , \subseteq \right ) \mathclose {}} and { \mathopen {} \left ( \mathbb {Z},{ \mid } \right ) \mathclose {}} are examples of non-total preorders. With the exception of divisibility on \mathbb {Z} (where 1 \mid -1 and -1 \mid 1 although 1 \not = -1), all the examples we have seen so far are, furthermore, partial orders: a preorder is a partial order when we have a \sqsubseteq b \land b \sqsubseteq a if and only if a=b. In fact, there are arguably very few useful preorders that are not partial orders, and it is always possible to replace a preorder by an equivalent partial order; we do not pursue this here.3550Theoremjms-00JZjms-00JZ.xmlPreorders and reflexive-transitive closure2024121Jon SterlingMarcelo FioreThe reflexive-transitive closure R^{ \circ *} of an endo-relation R \colon A \mathbin { \nrightarrow } A is, by definition, a preorder. We will show that R^{ \circ *} enjoys a universal property as the smallest preorder containing R as a subrelation. To be more precise, let \mathcal {F}_R be the set of preorders on A that contain R as a subrelation: \mathcal {F}_R = { \mathopen {} \left \{ Q \colon A \mathbin { \nrightarrow } A \, \middle \vert \, R \subseteq Q \land \text {$Q$ is a preorder} \right \} \mathclose {}} Then we have R^{ \circ *}= \bigcap \mathcal {F}_R, and so the reflexive-transitive closure is the smallest preorder containing R as a subrelation. (Indeed, recall that the intersection of an intersection-closed class of relations is the smallest relation in that class.)
3548Proof#411unstable-411.xml2024121Jon Sterlingjms-00JZ
To show that R^{ \circ *}= \bigcap \mathcal {F}_R, it suffices to check that \bigcap \mathcal {F}_R \subseteq R^{ \circ *} and R^{ \circ *} \subseteq \bigcap \mathcal {F}_R.
We first note that R^{ \circ *} \in \mathcal {F}_R, which holds almost by definition:
To begin, we need to show that R \subseteq R^{ \circ *}, i.e. that when a \mathrel {R}b we have a \mathrel {R^{ \circ *}}b. This is evident, as a \mathrel {R}b is a path of length 1 in R and R^{ \circ *} contains all finite paths in R.
Next, we need to show that R^{ \circ *} is a preorder. But this holds because R^{ \circ *} contains the empty path from an element to itself, and because a path of length m from a to b can be concatenated with a path of length n from b to c to obtain a path (of length m+n) from a to c.
It follows from R^{ \circ *} \in \mathcal {F}_R that \bigcap \mathcal {F}_R \subseteq R^{ \circ *}. Indeed:
a \mathrel { { \mathopen {} \left ( \bigcap \mathcal {F}_R \right ) \mathclose {}} }b \Longleftrightarrow \forall Q \in \mathcal {F}_R \mathpunct {.} a \mathrel {Q}b
So if a \mathrel { { \mathopen {} \left ( \bigcap \mathcal {F}_R \right ) \mathclose {}} }b and R^{ \circ *} \in \mathcal {F}_R, we may choose Q := R^{ \circ *} to deduce a \mathrel {R^{ \circ *}}b. Therefore, \bigcap \mathcal {F}_R \subseteq R^{ \circ *}.
Our second goal is to prove R^{ \circ *} \subseteq \bigcap \mathcal {F}_R; by the definition of intersections, this holds if and only if \forall Q \in \mathcal {F}_R \mathpunct {.} R^{ \circ *} \subseteq Q.
Fixing such a preorder Q over A containing R as a subrelation, we must prove that R^{ \circ *} \subseteq Q. Recalling that R^{ \circ *} is the union \bigcup _{n \in \mathbb {N}}R^{ \circ n}, we see by the definition of unions that R^{ \circ *} \subseteq Q if and only if for each n \in \mathbb {N}, we have R^{ \circ n} \subseteq Q. This we prove by induction on n \in \mathbb {N}.
In the base case, we must show that R^{ \circ0 } = \mathsf {id}_{ A } is a subrelation of Q. This is equivalent to Q being reflexive, which we deduce from our assumption that Q is a preorder.
In the inductive step, we may assume R^{ \circ n} \subseteq Q and we must prove that R^{ \circ { \mathopen {} \left ( n+1 \right ) \mathclose {}} }=R \circ R^{ \circ n} is a subrelation of Q. For any a,b \in A, we have a \mathrel { { \mathopen {} \left ( R \circ R^{ \circ n} \right ) \mathclose {}} } b if and only if there exists some c \in A such that a \mathrel {R}c and c \mathrel {R^{ \circ n}} b. Because we have assumed that R \subseteq Q, we have a \mathrel {Q}c; by our inductive hypothesis, we have c \mathrel {Q}b. Because Q is a preorder and therefore transitive, we conclude that a \mathrel {Q}b.
Therefore, R^{ \circ *} is indeed the intersection of \mathcal {F}_R.
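On a small carrier set, the theorem can be confirmed by exhaustive search. The sketch below is in Python and assumes relations are sets of pairs over a finite list `A`; all function names are illustrative. It computes the closure as the finite union of iterates (using the observation on finite endo-relations) and compares it with the intersection of every preorder on `A` that contains `R`. The search visits all subsets of A x A, so it is only feasible for very small `A`.

```python
from itertools import combinations


def compose(R, S):
    """Relational composite 'first R, then S'."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def rt_closure(R, A):
    """Union of the iterates of R of order k for k <= len(A)."""
    power = {(a, a) for a in A}  # the iterate of order 0
    closure = set(power)
    for _ in range(len(A)):
        power = compose(power, R)
        closure |= power
    return closure


def is_preorder(Q, A):
    """Check reflexivity and transitivity of Q on the carrier A."""
    reflexive = all((a, a) in Q for a in A)
    transitive = compose(Q, Q) <= Q
    return reflexive and transitive


def preorders_containing(R, A):
    """Enumerate every preorder on A that contains R as a subrelation."""
    pairs = [(a, b) for a in A for b in A]
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            Q = set(chosen)
            if R <= Q and is_preorder(Q, A):
                yield Q


def smallest_preorder_containing(R, A):
    """Intersection of the family; nonempty because the total relation is a preorder."""
    return set.intersection(*preorders_containing(R, A))


A = [0, 1, 2]
R = {(0, 1), (1, 2)}
closure = rt_closure(R, A)
smallest = smallest_preorder_containing(R, A)
```

For this `R`, both computations produce the three reflexive pairs together with (0, 1), (1, 2), and (0, 2).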
3611jms-00K6jms-00K6.xmlLecture 15: functions and inductive definitions2024124Jon SterlingMarcelo Fiore
3605jms-00KJjms-00KJ.xmlPartial and total functions2024123Jon SterlingMarcelo Fiore3562Definitionjms-00K7jms-00K7.xmlPartial function2024122Jon SterlingMarcelo FioreA relation R \colon A \mathbin { \nrightarrow } B is said to be functional when it relates each element of A to at most one element of B: \forall a \in A \mathpunct {.} \forall b_1,b_2 \in B \mathpunct {.} a \mathrel {R}b_1 \land a \mathrel {R}b_2 \implies b_1 = b_2 We refer to a functional relation as a partial function; we often use letters like f,g to refer to partial functions rather than R,S. When we write “f \colon A \mathbin { \rightharpoonup } B”, we mean that f is a partial function from A to B.3567Theoremjms-00K8jms-00K8.xmlClosure of partial functions under identity and composition2024122Jon SterlingMarcelo FiorePartial functions are closed under identity and composition in the following sense: For any set A, the identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A is functional.
For any sets A,B,C and functional relations f \colon A \mathbin { \rightharpoonup } B and g \colon B \mathbin { \rightharpoonup } C, the relational composite g \circ f \colon A \mathbin { \nrightarrow } C is functional.
3565Proof#410unstable-410.xml2024122Jon Sterlingjms-00K8
First, we prove that each identity relation \mathsf {id}_{ A } \colon A \mathbin { \nrightarrow } A is functional. Fixing a,a_1,a_2 \in A such that a=a_1 and a=a_2, we must show that a_1=a_2; this follows from transitivity and symmetry of equality.
Next, we fix partial functions f \colon A \mathbin { \rightharpoonup } B and g \colon B \mathbin { \rightharpoonup } C to check that g \circ f \colon A \mathbin { \nrightarrow } C is functional. Fixing a \in A and c_1,c_2 \in C such that a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c_1 and a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c_2, we must check that c_1=c_2. By definition of the relational composite, we have b_1,b_2 \in B such that a \mathrel {f}b_1 \mathrel {g}c_1 and a \mathrel {f}b_2 \mathrel {g}c_2. Because f \colon A \mathbin { \rightharpoonup } B is functional, we note that b_1 = b_2. Writing b=b_1=b_2, we therefore have b \mathrel {g}c_1 and b \mathrel {g}c_2. Because g \colon B \mathbin { \rightharpoonup } C is functional, we conclude that c_1=c_2.
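A brief Python sketch of the definition and of the closure theorem, assuming relations are finite sets of pairs; the names `is_functional`, `identity_relation`, and `compose` are hypothetical.

```python
def is_functional(R):
    """True when no element is related by R to two different elements."""
    seen = {}
    for a, b in R:
        if seen.setdefault(a, b) != b:
            return False
    return True


def identity_relation(A):
    """The identity relation on A."""
    return {(a, a) for a in A}


def compose(f, g):
    """Relational composite 'first f, then g', written g o f in the notes."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}


f = {(0, "x"), (1, "y")}     # a partial function from {0, 1, 2} to {"x", "y"}
g = {("x", True)}            # a partial function from {"x", "y"} to booleans
not_functional = {(0, "x"), (0, "y")}

composite = compose(f, g)
```

Here `composite` is the single pair `(0, True)`: it is undefined at 1 because `g` is undefined at `"y"`, and undefined at 2 because `f` is. It is functional, as the theorem guarantees.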
3570Notationjms-00KAjms-00KA.xmlValues of partial functions2024122Jon SterlingMarcelo FioreLet f \colon A \mathbin { \rightharpoonup } B be a partial function. Given a \in A, we will write f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } to mean that f is defined at a, i.e. there exists some (necessarily unique) b \in B such that a \mathrel {f}b. When f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } holds, we may write f { \mathopen {} \left ( a \right ) \mathclose {}} to mean the unique b such that a \mathrel {f}b, which we shall refer to as the value of f at a.3573Definitionjms-00KBjms-00KB.xmlDomain of definition of a partial function2024122Jon SterlingThe domain of definition of a partial function f \colon A \mathbin { \rightharpoonup } B is the subset { \mathopen {} \left \{ a \in A \, \middle \vert \, f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right \} \mathclose {}} \subseteq A spanned by elements of the domain on which the partial function is defined.3577Lemmajms-00K9jms-00K9.xmlPartial functional extensionality2024122Jon SterlingMarcelo FioreLet f,g \colon A \mathbin { \rightharpoonup } B be two partial functions. We have f=g if and only if for all a \in A, f is defined at a if and only if g is defined at a and, moreover, the value of f at a is the same as the value of g at a: f=g \Longleftrightarrow \forall a \in A \mathpunct {.} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \Leftrightarrow g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right ) \mathclose {}} \land f { \mathopen {} \left ( a \right ) \mathclose {}} = g { \mathopen {} \left ( a \right ) \mathclose {}}
3575Proof#409unstable-409.xml2024122Jon Sterlingjms-00K9
One direction is trivial. In the other direction, we assume
\forall a \in A \mathpunct {.} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \Leftrightarrow g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } \right ) \mathclose {}} \land f { \mathopen {} \left ( a \right ) \mathclose {}} = g { \mathopen {} \left ( a \right ) \mathclose {}}
to deduce f=g. By relational extensionality, it suffices to show
\forall a \in A \mathpunct {.} \forall b \in B \mathpunct {.} a \mathrel {f}b \Longleftrightarrow a \mathrel {g}b.
Fixing a \in A and b \in B, we assume a \mathrel {f} b to prove a \mathrel {g}b. Since a \mathrel {f} b, we have f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } and b = f { \mathopen {} \left ( a \right ) \mathclose {}}. By assumption, we deduce g { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } from f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow }, so there exists b'=g { \mathopen {} \left ( a \right ) \mathclose {}} such that a \mathrel {g}b'. By our other assumption, we know that f { \mathopen {} \left ( a \right ) \mathclose {}} =g { \mathopen {} \left ( a \right ) \mathclose {}}, so b = b' and we conclude a \mathrel {g}b. The other direction works the same way.
3580Examplejms-00KDjms-00KD.xmlExamples of partial functions2024122Jon SterlingThe following are examples of partial functions: rational division { \div } \colon \mathbb {Q} \times \mathbb {Q} \mathbin { \rightharpoonup } \mathbb {Q} defined by { \mathopen {} \left \{ { \mathopen {} \left ( { \mathopen {} \left ( r,s \right ) \mathclose {}} ,t \right ) \mathclose {}} \in { \mathopen {} \left ( \mathbb {Q} \times \mathbb {Q} \right ) \mathclose {}} \times \mathbb {Q} \, \middle \vert \, s \not =0 \land r=s \cdot t \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ { \mathopen {} \left ( r,s \right ) \mathclose {}} \in \mathbb {Q} \times \mathbb {Q} \, \middle \vert \, s \not =0 \right \} \mathclose {}};
integer square root \sqrt {-} \colon \mathbb {Z} \mathbin { \rightharpoonup } \mathbb {Z} defined by { \mathopen {} \left \{ { \mathopen {} \left ( m,n \right ) \mathclose {}} \in \mathbb {Z} \times \mathbb {Z} \, \middle \vert \, n \geq 0 \land m=n^2 \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ m \in \mathbb {Z} \, \middle \vert \, \exists n \in \mathbb {Z} \mathpunct {.} m=n^2 \right \} \mathclose {}};
real square root \sqrt {-} \colon \mathbb {R} \mathbin { \rightharpoonup } \mathbb {R} defined by { \mathopen {} \left \{ { \mathopen {} \left ( x,y \right ) \mathclose {}} \in \mathbb {R} \times \mathbb {R} \, \middle \vert \, y \geq 0 \land x=y^2 \right \} \mathclose {}} with domain of definition { \mathopen {} \left \{ x \in \mathbb {R} \, \middle \vert \, x \geq 0 \right \} \mathclose {}}.3584Lemmajms-00KEjms-00KE.xmlCardinality of the set of partial functions2024122Jon SterlingMarcelo FioreFor all finite sets A and B, we have \mathord { \# } { \mathopen {} \left ( A \mathbin { \rightharpoonup } B \right ) \mathclose {}} = { \mathopen {} \left ( \mathord { \# } {B}+1 \right ) \mathclose {}} ^{ \mathord { \# } {A}}, recalling that \mathord { \# } {S} is the cardinality of a set S.
3582Proof#408unstable-408.xml2024122Jon Sterlingjms-00KE
A partial function f \colon A \mathbin { \rightharpoonup } B associates to each element a \in A either a unique element f { \mathopen {} \left ( a \right ) \mathclose {}} \in B when f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow }, or it associates nothing. Therefore, for each of the \mathord { \# } {A} many elements of A we independently choose among \mathord { \# } {B}+1 many possibilities, which gives { \mathopen {} \left ( \mathord { \# } {B}+1 \right ) \mathclose {}} ^{ \mathord { \# } {A}} partial functions in total.
3587Definitionjms-00KFjms-00KF.xmlFunction2024122Jon SterlingMarcelo FioreA partial function f \colon A \mathbin { \rightharpoonup } B is called total whenever its domain of definition coincides with its domain (source), i.e. we have f { \mathopen {} \left ( a \right ) \mathclose {}} { \downarrow } for all a \in A. In this case, we will write f \colon A \to B and refer to it as a function or a map. Sometimes, we redundantly refer to it as a total function. Just as we write \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} and { \mathopen {} \left ( A \mathbin { \rightharpoonup } B \right ) \mathclose {}} for the sets of relations and partial functions from A to B respectively, we shall write { \mathopen {} \left ( A \to B \right ) \mathclose {}} for the set of functions from A to B.3592Lemmajms-00KGjms-00KG.xmlFunctions are uniquely-valued relations2024122Jon SterlingMarcelo FioreA relation R \colon A \mathbin { \nrightarrow } B is a function if and only if for all a \in A, there exists some unique b \in B such that a \mathrel {R}b.
3590Proof#407unstable-407.xml2024122Jon Sterlingjms-00KG
Symbolically, our second criterion is \forall a \in A \mathpunct {.} \exists ! b \in B \mathpunct {.} a \mathrel {R}b. Unfolding the definition of unique existence, this is equivalent to \forall a \in A \mathpunct {.} { \mathopen {} \left ( \forall b,b' \in B \mathpunct {.} a \mathrel {R}b \land a \mathrel {R}b' \implies b=b' \right ) \mathclose {}} \land { \mathopen {} \left ( \exists b \in B \mathpunct {.} a \mathrel {R}b \right ) \mathclose {}} which is, by the distributivity of universal quantification over conjunction, equivalent to the claim that R is both functional and total.
3597Lemmajms-00KHjms-00KH.xmlCardinality of the set of functions2024122Jon SterlingMarcelo FioreFor all finite sets A and B, we have \# { \mathopen {} \left ( A \to B \right ) \mathclose {}} = \# B^{ \# A}.
3595Proof#406unstable-406.xml2024122Jon Sterlingjms-00KH
A function f \colon A \to B associates to each element a \in A a single element f { \mathopen {} \left ( a \right ) \mathclose {}} \in B. Therefore, for each of the \# A elements of A we independently choose among \# B possibilities, which gives \# B^{ \# A} functions in total.
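Both counting lemmas can be checked by enumeration on small sets. The Python sketch below represents a partial function as a dictionary whose keys form its domain of definition; this representation and the names `partial_functions` and `total_functions` are assumptions made for illustration.

```python
from itertools import product


def partial_functions(A, B):
    """Yield every partial function from A to B as a dict.

    Each element of A is either sent to an element of B or left undefined,
    which is modelled here by a private marker value.
    """
    A = list(A)
    undefined = object()
    choices = list(B) + [undefined]
    for values in product(choices, repeat=len(A)):
        yield {a: v for a, v in zip(A, values) if v is not undefined}


def total_functions(A, B):
    """Yield the partial functions from A to B that are defined on all of A."""
    A = list(A)
    for f in partial_functions(A, B):
        if len(f) == len(A):
            yield f


A = [0, 1, 2]
B = ["x", "y"]
number_of_partial = sum(1 for _ in partial_functions(A, B))
number_of_total = sum(1 for _ in total_functions(A, B))
```

With three elements in `A` and two in `B`, the enumeration finds (2+1)^3 = 27 partial functions, of which 2^3 = 8 are total.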
3602Theoremjms-00KIjms-00KI.xmlClosure of functions under identity and composition2024122Jon SterlingMarcelo FioreFunctions are closed under identity and composition.
3600Proof#405unstable-405.xml2024122Jon Sterlingjms-00KI
We have already seen that partial functions are closed under identity and composition. Therefore, it suffices to argue that the identity partial function is total, and that the composition of two total partial functions is total.
We must show that for all a \in A, there exists a' \in A such that a \mathrel { \mathsf {id}_{ A } } a'; unfolding the definition of the identity relation we may choose a' = a.
Given f \colon A \to B and g \colon B \to C, we must check that for all a \in A, there exists some c \in C such that a \mathrel { { \mathopen {} \left ( g \circ f \right ) \mathclose {}} }c. By the definition of relational composition, we must show that there exist b \in B and c \in C such that a \mathrel {f}b and b \mathrel {g}c. We choose b = f { \mathopen {} \left ( a \right ) \mathclose {}} and c = g { \mathopen {} \left ( b \right ) \mathclose {}} =g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}}.
3608jms-00KKjms-00KK.xmlInductive definitions2024123Jon SterlingMarcelo FioreWe have already seen some inductive definitions: defines n-fold iterations of endo-relations by induction on n.
computes the adjacency matrix of the reflexive-transitive closure of a finite directed graph by induction on the number of vertices in the graph.In this section, we will study this concept in more detail.3233Examplejms-00KLjms-00KL.xmlInformal examples of inductively defined functions2024123Jon SterlingMarcelo FioreAn inductive definition on the naturals defines a structure for each number n \in \mathbb {N} by first defining it for n=0 and then specifying how to augment a structure for n=m to a structure for n=m+1. From the point of view of computer programming, inductive definitions proceed by structural recursion. For example, addition of natural numbers can be defined by induction on the second argument (or the first, if you prefer!): \begin {aligned} \mathsf {add} & \colon \mathbb {N}^2 \to \mathbb {N} \\ \mathsf {add} { \mathopen {} \left ( m,0 \right ) \mathclose {}} &= m \\ \mathsf {add} { \mathopen {} \left ( m,n+1 \right ) \mathclose {}} &= \mathsf {add} { \mathopen {} \left ( m,n \right ) \mathclose {}} +1 \end {aligned} Likewise, the function \Sigma \colon \mathbb {N} \to \mathbb {N} that takes n \in \mathbb {N} to the sum \sum _{0 \leq i < n}i can be defined inductively: \begin {aligned} \Sigma & \colon \mathbb {N} \to \mathbb {N} \\ \Sigma { \mathopen {} \left ( 0 \right ) \mathclose {}} &= 0 \\ \Sigma { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= \mathsf {add} { \mathopen {} \left ( n, \Sigma { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}} \end {aligned} In order to make precise the description of , , and as “inductive definitions”, we must give an actual formal definition of what an inductive definition is.3236Definitionjms-00KMjms-00KM.xmlInductively defined function2024123Jon SterlingMarcelo FioreLet A be a set, and fix a \in A and f \colon \mathbb {N} \times A \to A. 
The function inductively defined from a and f is defined to be the unique function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A satisfying the following equations: \begin {aligned} \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} &= a \\ \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}} \end {aligned} Note that our specification of inductively defined functions in asserts, without proof, that there does in fact exist a unique function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A satisfying the described equations. It is not blindingly obvious that this claim is true — and, after seeing a few examples to motivate , we will explicitly justify by means of an existence theorem.3239Examplejms-00KNjms-00KN.xmlFormal examples of inductively defined functions2024123Jon SterlingMarcelo FioreWe have seen two informal examples of inductive definitions in ; we will now update these examples in light of :
For every m \in \mathbb {N}, the function \mathsf {add} { \mathopen {} \left ( m,- \right ) \mathclose {}} \colon \mathbb {N} \to \mathbb {N} is the function \mathbf { \rho }_{ m , { \mathopen {} \left ( i,n \right ) \mathclose {}} \mapsto n+1 } inductively defined by m \in \mathbb {N} and f { \mathopen {} \left ( i,n \right ) \mathclose {}} = n+1.
The function \Sigma \colon \mathbb {N} \to \mathbb {N} taking n \in \mathbb {N} to \sum _{0 \leq i<n}i is the function \mathbf { \rho }_{ 0 , \mathsf {add} } inductively defined by 0 \in \mathbb {N} and \mathsf {add} \colon \mathbb {N} \times \mathbb {N} \to \mathbb {N}.
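From the programming point of view, the two formal examples above can be sketched as follows. This is only an illustration under the assumption that natural numbers are modelled by non-negative Python integers; the name rho mirrors the notation \mathbf { \rho }_{ a , f }, and the loop is one possible way to compute a function satisfying the two defining equations.

```python
def rho(a, f):
    """Return a function N -> A satisfying rho(0) = a and rho(n+1) = f(n, rho(n))."""
    def run(n):
        value = a                 # rho(0) = a
        for k in range(n):
            value = f(k, value)   # rho(k+1) = f(k, rho(k))
        return value
    return run

def add(m, n):
    # add(m, -) is the function inductively defined by m and (i, x) |-> x + 1.
    return rho(m, lambda i, x: x + 1)(n)

# Sigma is the function inductively defined by 0 and add.
sigma = rho(0, add)
```

For instance, sigma(5) unfolds to add(4, add(3, add(2, add(1, add(0, 0))))), the sum of all i with 0 \leq i < 5.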
3243Definitionjms-00KOjms-00KO.xml{ \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation2024123Jon SterlingMarcelo FioreGiven an element a \in A and a function f \colon \mathbb {N} \times A \to A, we call a relation R \colon \mathbb {N} \mathbin { \nrightarrow } A { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed whenever we have both 0 \mathrel {R}a and n \mathrel {R}x \implies { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}f { \mathopen {} \left ( n,x \right ) \mathclose {}} for all n \in \mathbb {N} and x \in A.In light of , we see that is defining \mathbf { \rho }_{ a , f } to be the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function. We still have to show that this exists.3265Theoremjms-00KPjms-00KP.xmlExistence of inductively defined functions2024123Jon SterlingMarcelo FioreGiven an element a \in A and a function f \colon \mathbb {N} \times A \to A, now let \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A be the intersection of all the { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relations R \colon \mathbb {N} \mathbin { \nrightarrow } A.The relation \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is functional and total, and therefore a function.
The function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A is the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function, i.e. it is the unique function satisfying both \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} =a and \forall n \in \mathbb {N} \mathpunct {.} \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} =f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}}.
3263Proof#404unstable-404.xml2024123Jon Sterlingjms-00KP
We first notice that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is itself { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed, which can be seen by observing that the intersection of a set of { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relations is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. With this in hand, we proceed.
We must show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is a function.
To show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \nrightarrow } A is functional, we must show that for all n \in \mathbb {N} and x,y \in A, if n \mathrel { \mathbf { \rho }_{ a , f } } x and n \mathrel { \mathbf { \rho }_{ a , f } }y, then x=y. We proceed to prove P { \mathopen {} \left ( n \right ) \mathclose {}} = \forall x,y \in A \mathpunct {.} n \mathrel { \mathbf { \rho }_{ a , f } } x \land n \mathrel { \mathbf { \rho }_{ a , f } }y \implies x=y by induction on n \in \mathbb {N}.
For the base case, we fix x,y \in A such that 0 \mathrel { \mathbf { \rho }_{ a , f } }x and 0 \mathrel { \mathbf { \rho }_{ a , f } } y to check that x=y. By definition, we know that for any { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation R \colon \mathbb {N} \mathbin { \nrightarrow } A we have both 0 \mathrel {R}x and 0 \mathrel {R}y.
We choose R to be the union of { \mathopen {} \left \{ { \mathopen {} \left ( 0,a \right ) \mathclose {}} \right \} \mathclose {}} and { \mathopen {} \left \{ { \mathopen {} \left ( n,x \right ) \mathclose {}} \, \middle \vert \, n > 0 \land n \mathrel { \mathbf { \rho }_{ a , f } } x \right \} \mathclose {}}, noting that R is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed because \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. As this union is disjoint, we can see that 0 \mathrel {R}z if and only if z=a, so we have x=a=y.
In the inductive step, we assume P { \mathopen {} \left ( n \right ) \mathclose {}} to prove P { \mathopen {} \left ( n+1 \right ) \mathclose {}}. Fixing x,y \in A such that { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }x and { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }y, we must show that x=y.
We know that for any { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation R \colon \mathbb {N} \mathbin { \nrightarrow } A we have both { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}x and { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}y.
We choose R to be the union of { \mathopen {} \left \{ { \mathopen {} \left ( m+1,f { \mathopen {} \left ( m,z \right ) \mathclose {}} \right ) \mathclose {}} \, \middle \vert \, m=n \land m \mathrel { \mathbf { \rho }_{ a , f } }{z} \right \} \mathclose {}} with { \mathopen {} \left \{ { \mathopen {} \left ( m,z \right ) \mathclose {}} \, \middle \vert \, m \mathrel { \mathbf { \rho }_{ a , f } }z \land m \not =n+1 \right \} \mathclose {}}, noting that R is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed because \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed. As the union defining R is disjoint, for any z \in A we know that { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel {R}z if and only if z=f { \mathopen {} \left ( n,z' \right ) \mathclose {}} for some z' with n \mathrel { \mathbf { \rho }_{ a , f } } z'.
Therefore, we have some x',y' \in A such that x=f { \mathopen {} \left ( n,x' \right ) \mathclose {}} and y=f { \mathopen {} \left ( n,y' \right ) \mathclose {}} and n \mathrel { \mathbf { \rho }_{ a , f } } x' and n \mathrel { \mathbf { \rho }_{ a , f } }y'. By our inductive hypothesis, we know that x'=y', and hence x=y.
To show that \mathbf { \rho }_{ a , f } \colon \mathbb {N} \mathbin { \rightharpoonup } A is total, we must show that for every n \in \mathbb {N}, we have \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} { \downarrow }. This is proved by induction on n \in \mathbb {N}:
In the base case, we deduce \mathbf { \rho }_{ a , f } { \mathopen {} \left ( 0 \right ) \mathclose {}} { \downarrow } from the fact that 0 \mathrel { \mathbf { \rho }_{ a , f } }a, which holds because \mathbf { \rho }_{ a , f } is { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed.
In the inductive step, we assume that \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} { \downarrow } and must show that \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n+1 \right ) \mathclose {}} { \downarrow }. By our assumption, we have n \mathrel { \mathbf { \rho }_{ a , f } } \mathbf { \rho }_{ a , f } { { \mathopen {} \left ( n \right ) \mathclose {}} }; because \mathbf { \rho }_{ a , f } is an { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed relation, we may conclude { \mathopen {} \left ( n+1 \right ) \mathclose {}} \mathrel { \mathbf { \rho }_{ a , f } }f { \mathopen {} \left ( n, \mathbf { \rho }_{ a , f } { \mathopen {} \left ( n \right ) \mathclose {}} \right ) \mathclose {}}.
It remains to show that the function \mathbf { \rho }_{ a , f } \colon \mathbb {N} \to A is the unique { \mathopen {} \left ( a,f \right ) \mathclose {}}-closed function. This too can be proved by induction, as functionality and { \mathopen {} \left ( a,f \right ) \mathclose {}}-closure fully specify the values of \mathbf { \rho }_{ a , f } on all inputs.
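The construction of \mathbf { \rho }_{ a , f } as an intersection can be explored by brute force on a finite truncation. The sketch below makes several simplifying assumptions that are not in the notes: only the naturals 0, 1, 2 are considered, closure is only required where n+1 stays within that range, A is the two-element set {0, 1}, and the step function f (which flips a bit) is a hypothetical choice.

```python
from itertools import chain, combinations

N = 3            # truncation: only the naturals 0, 1, 2 are considered
A = [0, 1]
a = 0

def f(n, x):
    """Hypothetical step function N x A -> A: flip the bit."""
    return 1 - x

pairs = [(n, x) for n in range(N) for x in A]

def is_closed(R):
    """(a,f)-closure, restricted to the truncated range of naturals."""
    if (0, a) not in R:
        return False
    return all((n + 1, f(n, x)) in R for (n, x) in R if n + 1 < N)

def subsets(xs):
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

# All closed relations on the truncation, and their intersection.
closed = [set(R) for R in subsets(pairs) if is_closed(set(R))]
rho = set.intersection(*closed)
```

In this small instance the intersection relates each n to exactly one element of A, as the theorem predicts for the untruncated relation.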
3675jms-00KXjms-00KX.xmlLecture 16: bijections, equivalence relations, and partitions2024126Jon SterlingMarcelo Fiore
3614jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3628jms-00LDjms-00LD.xmlSections and retractions2024125Jon Sterling3616Definitionjms-00KSjms-00KS.xmlSection-retraction pair2024123Jon SterlingA section-retraction pair is defined to be a pair of functions s \colon B \to A and r \colon A \to B such that r \circ s = \mathsf {id}_{ B }. In this case, s is referred to as a section of r, and r is referred to as a retraction of s.3618Definitionjms-00LAjms-00LA.xmlIdempotent function2024125Jon SterlingA function f \colon A \to A is called idempotent when we have f \circ f = f.3622Lemmajms-00LBjms-00LB.xmlThe idempotent determined by a section-retraction pair2024125Jon SterlingEvery section-retraction pair determines an idempotent. In particular, when s \colon B \to A is a section of r \colon A \to B, the composite s \circ r \colon A \to A is idempotent.
3620Proof#399unstable-399.xml2024125Jon Sterlingjms-00LB
We proceed by calculation:
\begin {aligned} { \mathopen {} \left ( s \circ r \right ) \mathclose {}} \circ { \mathopen {} \left ( s \circ r \right ) \mathclose {}} &= s \circ { \mathopen {} \left ( r \circ s \right ) \mathclose {}} \circ r \\ &= s \circ \mathsf {id}_{ B } \circ r \\ &= s \circ r \end {aligned}
3626Lemmajms-00LCjms-00LC.xmlSplitting idempotents2024125Jon SterlingIt happens that every idempotent arises from a section-retraction pair in the manner of the preceding lemma.
3624Proof#398unstable-398.xml2024125Jon Sterlingjms-00LC
Let f \colon A \to A be an idempotent on a set A. Let B be the subset { \mathopen {} \left \{ f { \mathopen {} \left ( a \right ) \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \subseteq A, sometimes called the direct image of f in A. Then f \colon A \to A factors, by definition, through some unique r \colon A \to B, where r { \mathopen {} \left ( x \right ) \mathclose {}} = f { \mathopen {} \left ( x \right ) \mathclose {}} for all x \in A. We define s \colon B \to A to be the subset inclusion.
We must check that r \circ s = \mathsf {id}_{ B }; in other words, we fix b \in B to check that r { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} = b. As b lies in the direct image of f, we have b = f { \mathopen {} \left ( a \right ) \mathclose {}} for some a \in A, and the subset inclusion s \colon B \to A takes b to the same element s { \mathopen {} \left ( b \right ) \mathclose {}} = f { \mathopen {} \left ( a \right ) \mathclose {}} \in A. Because f is idempotent, we have f { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} = f { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} = f { \mathopen {} \left ( a \right ) \mathclose {}} = b. By definition of r \colon A \to B, we therefore have r { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =f { \mathopen {} \left ( s { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b.
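The splitting can be made concrete on a small finite set. This is a minimal sketch under assumed data: the set A and the idempotent f (rounding down to a multiple of 3) are hypothetical examples chosen for illustration.

```python
A = list(range(10))

def f(x):
    """A hypothetical idempotent on A: round down to the nearest multiple of 3."""
    return x - x % 3

# B is the direct image of f in A.
B = sorted({f(x) for x in A})

def r(x):
    """Retraction A -> B: f with its codomain restricted to B."""
    return f(x)

def s(b):
    """Section B -> A: the subset inclusion."""
    return b
```

Here r after s is the identity on B, while s after r recovers f on A.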
3656jms-00KQjms-00KQ.xmlBijections2024123Jon SterlingMarcelo Fiore3632Definitionjms-00KRjms-00KR.xmlBijection2024123Jon SterlingMarcelo FioreA function f \colon A \to B is said to be bijective or a bijection whenever there exists a (necessarily unique) function g \colon B \to A (referred to as the inverse of f) that is simultaneously a retraction and a section of f \colon A \to B in the sense that g \circ f= \mathsf {id}_{ A } and f \circ g = \mathsf {id}_{ B }.We will often write f ^{-1} \colon B \to A for the inverse to a bijection f \colon A \to B.
3630Proof#403unstable-403.xml2024123Jon Sterlingjms-00KR
To see that g \colon B \to A is unique, we fix any other g' that is simultaneously a retraction and a section of f and show that g=g':
\begin {aligned} g &= \mathsf {id}_{ A } \circ g \\ &= { \mathopen {} \left ( g' \circ f \right ) \mathclose {}} \circ g \\ &= g' \circ { \mathopen {} \left ( f \circ g \right ) \mathclose {}} \\ &= g' \circ \mathsf {id}_{ B } \\ &= g' \end {aligned}
3640Propositionjms-00KTjms-00KT.xmlCounting bijections between finite sets2024123Jon SterlingMarcelo FioreFor all finite sets A and B, we have: \mathord { \# } { \mathrm {Bij} { \mathopen {} \left ( A , B \right ) \mathclose {}} } = \begin {cases} 0 & \text {if } \mathord { \# } {A} \not = \mathord { \# } {B} \\ n!& \text {if } \mathord { \# } {A}= \mathord { \# } {B}=n \end {cases}
3638Proof#402unstable-402.xml2024123Jon Sterlingjms-00KT
A bijection between two sets is a way of associating precisely one element of the first set with precisely one element of the second set. These associations are embodied in the functions f \colon A \to B and g \colon B \to A, and the uniqueness of the associations is expressed by the condition that g be simultaneously a section and a retraction of f. Two sets of different cardinality cannot be associated in this way, because there would always be leftover elements that are not associated to any corresponding element. In this way, we see that a bijection exists between two sets if and only if they have the same cardinality.
Assuming two finite sets A,B with cardinality \mathord { \# } {A}= \mathord { \# } {B}=n, we must now count precisely how many bijections there are from A to B. To construct such a bijection, we put all the elements of B into a pile. Then, for each element of A we must successively choose and then discard an element of B; we have to discard the element because we are not allowed to choose it again (for, if we did, the association would not be bijective). At the beginning (“round k=0”), there are n many choices we could make; in the next round k=1, there are n-1 many choices that we could make; in the kth round, there are n-k many choices that we could make. At the very end (after the (n-1)th round), we shall have constructed a bijection, and there are n! = \prod _{0 \leq k<n} { \mathopen {} \left ( n-k \right ) \mathclose {}} different ways in which the choices could have been made. Thus the number of distinct bijections from A to B is n!.
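The "choose and discard" procedure is exactly how permutations are enumerated, so the count can be checked mechanically. The following is a sketch assuming bijections are represented as dictionaries and that the inputs contain no repeated elements; the helper name bijections is illustrative.

```python
from itertools import permutations
from math import factorial

def bijections(A, B):
    """Yield every bijection A -> B as a dict (assumes no repeated elements)."""
    A, B = list(A), list(B)
    if len(A) != len(B):
        return  # no bijection exists between sets of different cardinality
    # Each ordering of B assigns its k-th element to the k-th element of A.
    for image in permutations(B):
        yield dict(zip(A, image))

same_size = list(bijections("abc", [1, 2, 3]))
different_size = list(bijections("ab", [1, 2, 3]))
```

With three elements on each side the enumeration produces factorial(3) bijections, and none at all when the cardinalities differ.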
3645Theoremjms-00KZjms-00KZ.xmlClosure of bijections under identity and composition2024123Jon SterlingMarcelo FioreThe identity function is a bijection, and the composition of bijections yields a bijection.
3643Proof#401unstable-401.xml2024123Jon Sterlingjms-00KZ
The inverse to the identity function \mathsf {id}_{ A } \colon A \to A is simply itself.
Fix bijections f \colon A \to B and g \colon B \to C, so that we have inverses f ^{-1} \colon B \to A and g ^{-1} \colon C \to B. The inverse to the composite function g \circ f \colon A \to C is the composite f ^{-1} \circ g ^{-1} \colon C \to A. The section/retraction conditions follow from those for f and g using the associative and unit laws of composition.
3648Definitionjms-00L0jms-00L0.xmlIsomorphic sets2024123Jon SterlingMarcelo FioreTwo sets A and B are said to be isomorphic (and have the same cardinality) whenever there is a bijection between them. In this case, we write A \cong B or \mathord { \# } {A}= \mathord { \# } {B}3651Examplejms-00L1jms-00L1.xmlExamples of isomorphic sets2024123Jon SterlingMarcelo FioreA bijection between finite sets is just a “relabeling” of its elements: for example, we have { \mathopen {} \left \{ 0,1 \right \} \mathclose {}} \cong { \mathopen {} \left \{ \mathsf {false} , \mathsf {true} \right \} \mathclose {}}. Isomorphism of infinite sets can be more confusing: for example, we have \mathbb {N} \cong { \mathopen {} \left \{ n: \mathbb {N} \, \middle \vert \, n>0 \right \} \mathclose {}}, \mathbb {N} \cong \mathbb {Z}, \mathbb {N} \cong \mathbb {N} \times \mathbb {N}, \mathbb {N} \cong \mathbb {Q}, and even { \mathopen {} \left ( \mathbb {N} \to \mathbb {N} \right ) \mathclose {}} \cong \mathbb {R}.3654Remarkjms-00L2jms-00L2.xmlWhich bijection?2024123Jon SterlingAlthough we are speaking of “being isomorphic” as the property of there being a bijection between them, in practice, it is almost never useful to know that there exists some undetermined bijection between given sets; it is always important to know which bijection. 
Indeed, we have seen that there can be many distinct bijections between two sets.3672jms-00L3jms-00L3.xmlEquivalence relations and set partitions2024123Jon SterlingMarcelo Fiore3659Definitionjms-00L4jms-00L4.xmlEquivalence relation2024123Jon SterlingMarcelo FioreA relation E \colon A \mathbin { \nrightarrow } A is called an equivalence relation when it is reflexive, transitive, and symmetric: \forall x \in A \mathpunct {.} x \mathrel {E}x; \forall x,y,z \in A \mathpunct {.} x \mathrel {E} y \land y \mathrel {E} z \implies x \mathrel {E} z; and \forall x,y \in A \mathpunct {.} x \mathrel {E} y \implies y \mathrel {E}x. We will write \mathrm {EqRel} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of all equivalence relations on A.3662Definitionjms-00L6jms-00L6.xmlEquivalence class2024123Jon SterlingLet E \colon A \mathbin { \nrightarrow } A be an equivalence relation on a set A. The equivalence class of a given element a \in A in E is the subset { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \subseteq A spanned by elements related to a in E, i.e. { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } = { \mathopen {} \left \{ x \in A \, \middle \vert \, x \mathrel {E}a \right \} \mathclose {}}.You may hear some people say “an equivalence” instead of “an equivalence relation”. This usage is wrong and leads to extremely confused thinking: you must not repeat it!3664Definitionjms-00L5jms-00L5.xmlSet partition2024123Jon SterlingMarcelo FioreA partition of a set A is a set P \subseteq \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} of non-empty subsets of A whose elements are referred to as blocks, satisfying the following conditions:the union of all blocks is all of A, i.e. \bigcup P = A;
the blocks are pairwise disjoint, i.e. for all b_1 \not = b_2 \in P we have b_1 \cap b_2 = \varnothing.We will write \mathrm {Part} { \mathopen {} \left ( A \right ) \mathclose {}} for the set of partitions of A.3670Theoremjms-00L7jms-00L7.xmlBijection between equivalence relations and partitions2024123Jon SterlingFor every set A, we can define a bijection \Phi \colon \mathrm {EqRel} { \mathopen {} \left ( A \right ) \mathclose {}} \to \mathrm {Part} { \mathopen {} \left ( A \right ) \mathclose {}} sending every equivalence relation E on A to the partition \Phi { \mathopen {} \left ( E \right ) \mathclose {}} = { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \, \middle \vert \, a \in A \right \} \mathclose {}} whose blocks consist of the equivalence classes of each element of A.
3668Proof#400unstable-400.xml2024123Jon Sterlingjms-00L7
We must first argue that { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \, \middle \vert \, a \in A \right \} \mathclose {}} in fact constitutes a partition. Every equivalence class { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } contains a by reflexivity, so each block is non-empty and the union of all equivalence classes is indeed all of A. Moreover, any two distinct equivalence classes have null intersection: if they shared an element, then by symmetry and transitivity the two equivalence classes would be identical.
The inverse \Phi ^{-1} sends a partition P to the following equivalence relation:
x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} } y \Longleftrightarrow \exists b \in P \mathpunct {.} x \in b \land y \in b
To check that \Phi ^{-1} is a section of \Phi, we fix a partition P to check that \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} = P.
\begin {aligned} \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} &= { \mathopen {} \left \{ { \mathopen {} \left [ a \right ] \mathclose {}} _{ \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} } \, \middle \vert \, a \in A \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ { \mathopen {} \left \{ x \in A \, \middle \vert \, x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} } a \right \} \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ { \mathopen {} \left \{ x \in A \, \middle \vert \, \exists b \in P \mathpunct {.} x \in b \land a \in b \right \} \mathclose {}} \, \middle \vert \, a \in A \right \} \mathclose {}} \end {aligned}
As P is a partition, there is exactly one block containing a given element a \in A — no more than one by disjointness, and no fewer than one because the union of a partition must be the entire set. Therefore, the set { \mathopen {} \left \{ x \in A \, \middle \vert \, \exists b \in P \mathpunct {.} x \in b \land a \in b \right \} \mathclose {}} is precisely the unique block of P containing a. Thus we have \Phi { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( P \right ) \mathclose {}} \right ) \mathclose {}} = P.
Conversely, we must show that \Phi ^{-1} is a retraction of \Phi. Fixing an equivalence relation E \colon A \mathbin { \nrightarrow } A, we must check that \Phi ^{-1} { \mathopen {} \left ( \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \right ) \mathclose {}} = E. Fixing x,y \in A, we proceed using relational extensionality:
\begin {aligned} x \mathrel { { \mathopen {} \left ( \Phi ^{-1} { \mathopen {} \left ( \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \right ) \mathclose {}} \right ) \mathclose {}} }y & \Longleftrightarrow \exists b \in \Phi { \mathopen {} \left ( E \right ) \mathclose {}} \mathpunct {.} x \in b \land y \in b \\ & \Longleftrightarrow \exists a \in A \mathpunct {.} x \in { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \land y \in { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } \\ & \Longleftrightarrow \exists a \in A \mathpunct {.} x \mathrel {E}a \land y \mathrel {E}a \\ & \Longleftrightarrow x \mathrel {E} y \end {aligned}
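Both round trips can be tested on a finite example. The sketch below assumes a relation is represented as a set of pairs and a partition as a set of frozensets; the names phi and phi_inv mirror \Phi and \Phi ^{-1}, and the sample relation (congruence modulo 3 on a six-element set) is a hypothetical choice.

```python
def phi(A, E):
    """Send an equivalence relation E on A to its set of equivalence classes."""
    return {frozenset(x for x in A if (x, a) in E) for a in A}

def phi_inv(P):
    """Send a partition P to the relation 'x and y lie in a common block'."""
    return {(x, y) for b in P for x in b for y in b}

A = range(6)
# Hypothetical example: congruence modulo 3.
E = {(x, y) for x in A for y in A if x % 3 == y % 3}
P = phi(A, E)
```

Applying phi_inv to P gives back E, and applying phi to the result gives back P, matching the retraction and section conditions checked in the proof.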
3733jms-00LHjms-00LH.xmlLecture 17: bijections, indicators, finite cardinality, infinity axiom2024129Jon SterlingMarcelo Fiore
3687jms-00LIjms-00LI.xmlCalculus of bijections, I2024125Jon SterlingMarcelo Fiore3680Lemmajms-00LJjms-00LJ.xmlProperties of isomorphism2024125Jon SterlingThe concept of isomorphism between two sets satisfies laws like those of an equivalence relation. In particular:reflexivity: A \cong A;
transitivity: A \cong B \land B \cong C \implies A \cong C;
symmetry: A \cong B \implies B \cong A.3684Examplejms-00LKjms-00LK.xmlInvariance under isomorphism2024125Jon SterlingMarcelo FioreFor all sets A,B,X,Y, we have A \cong X \land B \cong Y \implies \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} \cong \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}}.
3682Proof#397unstable-397.xml2024125Jon Sterlingjms-00LK
Fix bijections f \colon A \to X and g \colon B \to Y. We will define a bijection H \colon \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}}.
Given R \colon A \mathbin { \nrightarrow } B, we define H { \mathopen {} \left ( R \right ) \mathclose {}} \colon X \mathbin { \nrightarrow } Y to be the relational composite g \circ R \circ f ^{-1}. The inverse mapping H ^{-1} \colon \mathrm {Rel} { \mathopen {} \left ( X,Y \right ) \mathclose {}} \to \mathrm {Rel} { \mathopen {} \left ( A,B \right ) \mathclose {}} is defined to take S \colon X \mathbin { \nrightarrow } Y to the relational composite g ^{-1} \circ S \circ f.
Next, we check that H ^{-1} is simultaneously a retraction and a section of H.
\begin {aligned} H ^{-1} { \mathopen {} \left ( H { \mathopen {} \left ( R \right ) \mathclose {}} \right ) \mathclose {}} &= g ^{-1} \circ H { \mathopen {} \left ( R \right ) \mathclose {}} \circ f \\ &= g ^{-1} \circ { \mathopen {} \left ( g \circ R \circ f ^{-1} \right ) \mathclose {}} \circ f \\ &= { \mathopen {} \left ( g ^{-1} \circ g \right ) \mathclose {}} \circ R \circ { \mathopen {} \left ( f ^{-1} \circ f \right ) \mathclose {}} \\ &= \mathsf {id}_{ B } \circ R \circ \mathsf {id}_{ A } \\ &= R \\ H { \mathopen {} \left ( H ^{-1} { \mathopen {} \left ( S \right ) \mathclose {}} \right ) \mathclose {}} &= g \circ H ^{-1} { \mathopen {} \left ( S \right ) \mathclose {}} \circ f ^{-1} \\ &= g \circ { \mathopen {} \left ( g ^{-1} \circ S \circ f \right ) \mathclose {}} \circ f ^{-1} \\ &= { \mathopen {} \left ( g \circ g ^{-1} \right ) \mathclose {}} \circ S \circ { \mathopen {} \left ( f \circ f ^{-1} \right ) \mathclose {}} \\ &= \mathsf {id}_{ Y } \circ S \circ \mathsf {id}_{ X } \\ &= S \end {aligned}
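The mapping H amounts to relabelling both components of every pair in a relation. The following is a sketch under assumed representations: relations are sets of pairs, bijections are dictionaries, and the names transport and invert as well as the sample data are illustrative.

```python
def transport(R, f, g):
    """H(R) = g . R . f^{-1}: x is related to y iff x = f(a), y = g(b) with a R b."""
    return {(f[a], g[b]) for (a, b) in R}

def invert(h):
    """Inverse of a bijection given as a dict."""
    return {v: k for k, v in h.items()}

# Hypothetical bijections f : A -> X and g : B -> Y.
f = {"p": 0, "q": 1}
g = {True: "yes", False: "no"}

R = {("p", True), ("q", True), ("q", False)}
S = transport(R, f, g)
R_back = transport(S, invert(f), invert(g))  # H^{-1} applied to H(R)
```

Transporting along the inverse bijections undoes the relabelling, which is the content of the calculation above.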
3706jms-00LLjms-00LL.xmlIndicator functions and comprehension2024125Jon SterlingMarcelo Fiore3690Definitionjms-00LOjms-00LO.xmlPredicate2024125Jon SterlingA predicate on a set A is defined to be a function \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} from A to the two-element set { \mathopen {} \left [ 2 \right ] \mathclose {}} = { \mathopen {} \left \{ 0,1 \right \} \mathclose {}}. We say that a predicate \phi holds of a given element a \in A when \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1.3692Definitionjms-00LMjms-00LM.xmlThe indicator function of a subset2024125Jon SterlingMarcelo FioreThe indicator function or characteristic function of S \subseteq A is defined to be the predicate \chi _S \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} defined piecewise below: \chi _S { \mathopen {} \left ( a \right ) \mathclose {}} = \begin {cases} 1 & \text {if } a \in S \\ 0 & \text {if } a \not \in S \end {cases} 3695Definitionjms-00LPjms-00LP.xmlThe comprehension of a predicate2024125Jon SterlingLet \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} be a predicate on a set A. 
The comprehension of \phi is defined to be the following subset { \mathopen {} \left [ \phi \right ] \mathclose {}} \subseteq A spanned by elements at which \phi holds, as specified below: { \mathopen {} \left [ \phi \right ] \mathclose {}} = { \mathopen {} \left \{ a \in A \, \middle \vert \, \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \right \} \mathclose {}} 3699Theoremjms-00LNjms-00LN.xmlUniversal property of indicator functions2024125Jon SterlingMarcelo FioreThe mappings \chi _{ { \mathopen {} \left ( - \right ) \mathclose {}} } \colon \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} \to { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} and { \mathopen {} \left [ - \right ] \mathclose {}} \colon { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} \to \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} given by indicator functions and comprehension are mutually inverse. Thus we have \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} \cong { \mathopen {} \left ( A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}}.
3697Proof#396unstable-396.xml2024125Jon Sterlingjms-00LN
Fixing S \subseteq A, we compute:
\begin {aligned} { \mathopen {} \left [ \chi _{S} \right ] \mathclose {}} &= { \mathopen {} \left \{ a \in A \, \middle \vert \, \chi _S { \mathopen {} \left ( a \right ) \mathclose {}} =1 \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ a \in A \, \middle \vert \, a \in S \right \} \mathclose {}} \\ &= S \end {aligned}
Conversely, we fix \phi \colon A \to { \mathopen {} \left [ 2 \right ] \mathclose {}} and compute:
\begin {aligned} \chi _{ { \mathopen {} \left [ \phi \right ] \mathclose {}} } { \mathopen {} \left ( a \right ) \mathclose {}} &= \begin {cases} 1 & \text {if } a \in { \mathopen {} \left [ \phi \right ] \mathclose {}} \\ 0 & \text {if } a \not \in { \mathopen {} \left [ \phi \right ] \mathclose {}} \end {cases} \\ &= \begin {cases} 1 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \\ 0 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} \not =1 \end {cases} \\ &= \begin {cases} 1 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =1 \\ 0 & \text {if } \phi { \mathopen {} \left ( a \right ) \mathclose {}} =0 \end {cases} \\ &= \phi { \mathopen {} \left ( a \right ) \mathclose {}} \end {aligned}
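The two computations can be mirrored directly on a finite set. This is a minimal sketch assuming subsets are Python sets and predicates are functions returning 0 or 1; the names chi and comprehension follow the notation of the notes, while the sample set, subset, and predicate are hypothetical.

```python
def chi(S):
    """Indicator function of the subset S."""
    return lambda a: 1 if a in S else 0

def comprehension(A, phi):
    """The subset of A spanned by the elements at which phi holds."""
    return {a for a in A if phi(a) == 1}

A = set(range(8))
S = {1, 2, 5}

def is_odd(a):
    """A hypothetical predicate on A."""
    return a % 2

S_back = comprehension(A, chi(S))           # should recover S
is_odd_back = chi(comprehension(A, is_odd))  # should agree with is_odd on A
```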
3704Examplejms-00LQjms-00LQ.xmlAn identity involving indicator functions2024125Jon SterlingFor any set X, we have \mathcal {P} { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \cong \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}}, where + denotes disjoint union.
3702Proof#395unstable-395.xml2024125Jon Sterlingjms-00LQ
We first note that \mathcal {P} { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \cong { \mathopen {} \left ( { \mathopen {} \left ( X+ { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} by the universal property of indicator functions. A map out of a disjoint union is given by one map for each side, so this is isomorphic to { \mathopen {} \left ( X \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}} \times { \mathopen {} \left ( { \mathopen {} \left [ 1 \right ] \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} \right ) \mathclose {}}. Applying the universal property of indicator functions again in the other direction, and noting that a function { \mathopen {} \left [ 1 \right ] \mathclose {}} \to { \mathopen {} \left [ 2 \right ] \mathclose {}} is determined by its single value, this is isomorphic to \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 2 \right ] \mathclose {}}. We finally compute:
\begin {aligned} \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 2 \right ] \mathclose {}} &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left ( { \mathopen {} \left [ 1 \right ] \mathclose {}} + { \mathopen {} \left [ 1 \right ] \mathclose {}} \right ) \mathclose {}} \\ &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 1 \right ] \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \times { \mathopen {} \left [ 1 \right ] \mathclose {}} \\ &= \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} + \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \end {aligned}
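As a sanity check on the identity above, the following Python sketch verifies it exhaustively for a small finite X. The encoding is an assumption made for illustration and is not part of the notes: elements of the disjoint union X + [1] are written (0, x) for x in X and (1, 0) for the extra point, and the names powerset, split and check_identity are hypothetical.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of the finite collection xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def split(subset):
    """Send U ⊆ X + [1] to a tagged subset of X.

    The tag records whether the extra point (1, 0) belongs to U; the second
    component is the part of U that lies in X.
    """
    tag = 1 if (1, 0) in subset else 0
    return (tag, frozenset(x for (side, x) in subset if side == 0))


def check_identity(X):
    """Check that split is a bijection from P(X + [1]) to P(X) + P(X)."""
    X = list(X)
    disjoint_union = [(0, x) for x in X] + [(1, 0)]
    domain = powerset(disjoint_union)
    codomain = {(tag, s) for tag in (0, 1) for s in powerset(X)}
    image = {split(u) for u in domain}
    # No two subsets collide, and every tagged subset of X is reached.
    return len(image) == len(domain) and image == codomain
```

Calling check_identity(range(3)) compares the 16 subsets of X + [1] against the 8 + 8 tagged subsets of X.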
3730jms-00LWjms-00LW.xmlFinite and infinite sets2024128Jon SterlingMarcelo FioreNow that we have learned about bijections, we are ready to replace our intuitive/informal understanding of finite sets and finite cardinality with a formal one.3710Definitionjms-00LXjms-00LX.xmlFinite set2024128Jon SterlingMarcelo FioreA set A is said to be finite or have finite cardinality when it is isomorphic to a set of the form { \mathopen {} \left [ n \right ] \mathclose {}} for some n \in \mathbb {N}; in this case, we therefore have \mathord { \# } {A}=n. (Recall that { \mathopen {} \left [ n \right ] \mathclose {}} is the standard n-element set { \mathopen {} \left \{ i \in \mathbb {N} \, \middle \vert \, 0 \leq i < n \right \} \mathclose {}}.)There are many identities that relate set-theoretic operations on finite sets to arithmetic operations on their cardinalities. We will illustrate a few of them here.3717Examplejms-00LYjms-00LY.xmlCardinality of the cartesian product2024128Jon SterlingFor all m,n \in \mathbb {N}, we have { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} \cong { \mathopen {} \left [ m \cdot n \right ] \mathclose {}}.We can prove this from a combinatoric point of view.
3713Proof#393unstable-393.xml2024128Jon Sterlingjms-00LY
To choose a pair { \mathopen {} \left ( i,j \right ) \mathclose {}} \in { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}}, we first choose an element of { \mathopen {} \left [ m \right ] \mathclose {}} and we second choose an element of { \mathopen {} \left [ n \right ] \mathclose {}}. There are m possibilities for the first choice and n possibilities for the second choice; as these choices are independent, we simply have m \cdot n choices in total.
We can also prove this isomorphism more formally by means of a specific bijection.
3715Proof#394unstable-394.xml2024128Jon Sterlingjms-00LY
We can think of the cartesian product { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} as the set of cells in a table or matrix with m columns and n rows; an element { \mathopen {} \left ( i,j \right ) \mathclose {}} with i \in { \mathopen {} \left [ m \right ] \mathclose {}} and j \in { \mathopen {} \left [ n \right ] \mathclose {}} is then the coordinate of a given cell. Naturally, the total number of cells would be m \cdot n, but we can prove this formally by constructing the function that takes a coordinate { \mathopen {} \left ( i,j \right ) \mathclose {}} to its absolute index in the counting of cells from left to right and top to bottom:
\begin {aligned} I & \colon { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m \cdot n \right ] \mathclose {}} \\ I { \mathopen {} \left ( i,j \right ) \mathclose {}} &= m \cdot j + i \end {aligned}
We do need to check that I { \mathopen {} \left ( i,j \right ) \mathclose {}} < m \cdot n for all i<m and j<n. The maximum value of I is naturally I { \mathopen {} \left ( m-1,n-1 \right ) \mathclose {}}: \begin {aligned} I { \mathopen {} \left ( m-1,n-1 \right ) \mathclose {}} &= m \cdot { \mathopen {} \left ( n-1 \right ) \mathclose {}} + { \mathopen {} \left ( m-1 \right ) \mathclose {}} \\ &= m \cdot n - m + m - 1 \\ &= m \cdot n - 1 \\ &< m \cdot n \end {aligned} The inverse function I ^{-1} \colon { \mathopen {} \left [ m \cdot n \right ] \mathclose {}} \to { \mathopen {} \left [ m \right ] \mathclose {}} \times { \mathopen {} \left [ n \right ] \mathclose {}} takes i<m \cdot n to the pair { \mathopen {} \left ( \operatorname {rem} { \mathopen {} \left ( i,m \right ) \mathclose {}} , \operatorname {quo} { \mathopen {} \left ( i,m \right ) \mathclose {}} \right ) \mathclose {}}.3721Examplejms-00LZjms-00LZ.xmlCardinality of the disjoint sum2024128Jon SterlingMarcelo FioreFor all m,n \in \mathbb {N}, we have { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \cong { \mathopen {} \left [ m+n \right ] \mathclose {}}.
3719Proof#392unstable-392.xml2024128Jon Sterlingjms-00LZ
We can think of { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} as the set of coordinates into two strips of cells, the first of which has m cells and the second of which has n cells. Conversely, we can think of { \mathopen {} \left [ m+n \right ] \mathclose {}} as the set of coordinates into a single strip of m+n cells. A function I \colon { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m+n \right ] \mathclose {}} could then be thought of as witnessing a method to lay out the two strips in serial; we will define such a function and show that it is a bijection.
\begin {aligned} I & \colon { \mathopen {} \left [ m \right ] \mathclose {}} + { \mathopen {} \left [ n \right ] \mathclose {}} \to { \mathopen {} \left [ m+n \right ] \mathclose {}} \\ I { \mathopen {} \left ( 0,i<m \right ) \mathclose {}} &= i \\ I { \mathopen {} \left ( 1,j<n \right ) \mathclose {}} &= m+j \end {aligned}
The function I keeps the cells of the first strip in place and lays the cells of the second strip sequentially after the first m cells of the serialised strip. The inverse can be defined as follows:
I ^{-1} { \mathopen {} \left ( k \right ) \mathclose {}} = \begin {cases} { \mathopen {} \left ( 0,k \right ) \mathclose {}} & \text {when } k < m \\ { \mathopen {} \left ( 1,k-m \right ) \mathclose {}} & \text {when } k \geq m \end {cases}
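Both indexing functions, the one for the cartesian product and the one for the disjoint sum, can be checked mechanically for small m and n. The Python sketch below is only an illustration; the function names are hypothetical, rem and quo are rendered with Python's % and // operators, and an element of the disjoint sum is encoded as a pair (tag, index) with tag 0 for the first strip and tag 1 for the second.

```python
def prod_to_index(m, i, j):
    """I(i, j) = m * j + i for the table with m columns."""
    return m * j + i


def index_to_prod(m, k):
    """Inverse of prod_to_index: (rem(k, m), quo(k, m))."""
    return (k % m, k // m)


def sum_to_index(m, tag, i):
    """Lay the second strip (tag 1) after the m cells of the first (tag 0)."""
    return i if tag == 0 else m + i


def index_to_sum(m, k):
    """Inverse of sum_to_index, defined by cases on whether k < m."""
    return (0, k) if k < m else (1, k - m)


def check(m, n):
    """Check range and both round trips for the product and the sum."""
    pairs = [(i, j) for i in range(m) for j in range(n)]
    ok_prod = (
        all(0 <= prod_to_index(m, i, j) < m * n for (i, j) in pairs)
        and all(index_to_prod(m, prod_to_index(m, i, j)) == (i, j) for (i, j) in pairs)
        and all(prod_to_index(m, *index_to_prod(m, k)) == k for k in range(m * n))
    )
    tagged = [(0, i) for i in range(m)] + [(1, j) for j in range(n)]
    ok_sum = (
        all(0 <= sum_to_index(m, t, i) < m + n for (t, i) in tagged)
        and all(index_to_sum(m, sum_to_index(m, t, i)) == (t, i) for (t, i) in tagged)
        and all(sum_to_index(m, *index_to_sum(m, k)) == k for k in range(m + n))
    )
    return ok_prod and ok_sum
```

When m = 0 the set [m · n] is empty, so index_to_prod is never called and no division by zero arises.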
3724Definitionjms-00M1jms-00M1.xmlInfinite set2024128Jon SterlingMarcelo FioreA set A is called infinite when it is not finite in the sense of .The basic axioms of set theory that we have learned in this course so far do not in fact guarantee the existence of any infinite sets. For this reason, one usually adds the axiom of infinity below to set theory.3727Axiomjms-00M0jms-00M0.xmlThe axiom of infinity2024128Jon SterlingMarcelo FioreThe natural numbers form a set.3830jms-00M6jms-00M6.xmlLecture 18: surjections, injections, choice, and enumerability2024131Jon SterlingMarcelo Fiore
3736jms-00K0jms-00K0.xmlAuthorship statement2024121Jon Sterlingjms-00JBThese lecture notes were prepared by Jon Sterling using Marcelo Fiore’s lectures as source material. Any mistakes were introduced by Jon Sterling.
3740Theoremjms-00M7jms-00M7.xmlLogical characterisation of bijections2024129Jon SterlingMarcelo FioreFor a function f \colon A \to B, the following are equivalent:the function f \colon A \to B is bijective;
we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b;
we have both \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b and \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2.
3738Proof#391unstable-391.xml2024129Jon Sterlingjms-00M7
We first prove that f \colon A \to B is bijective if and only if we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b.
Indeed, suppose that f is a bijection and fix b \in B to check that there exists some unique a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} =b. We let a = f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} and we do indeed have f { \mathopen {} \left ( a \right ) \mathclose {}} =f { \mathopen {} \left ( f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b; to see that a is unique with this property, we fix a_1,a_2 \in A such that f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =b and f { \mathopen {} \left ( a_2 \right ) \mathclose {}} =b. We have a_1=f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a_1 \right ) \mathclose {}} \right ) \mathclose {}} =f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} =f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \right ) \mathclose {}} =a_2.
Conversely, suppose that \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b; then we may define an inverse function f ^{-1} \colon B \to A sending b \in B to the unique element a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} =b. We therefore have f { \mathopen {} \left ( f ^{-1} { \mathopen {} \left ( b \right ) \mathclose {}} \right ) \mathclose {}} =b by definition and f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} is defined to be the unique element a' \in A such that f { \mathopen {} \left ( a' \right ) \mathclose {}} = f { \mathopen {} \left ( a \right ) \mathclose {}}; as a' is assumed unique with this property and we clearly have f { \mathopen {} \left ( a \right ) \mathclose {}} =f { \mathopen {} \left ( a \right ) \mathclose {}}, we may conclude that a'=a and so f ^{-1} { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} =a.
Next, we show that we have \forall b \in B \mathpunct {.} \exists ! a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b if and only if we have both \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b and \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2. In fact, this holds immediately by unfolding the definition of unique existence and the distribution of universal quantification over conjunction.
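For functions between small finite sets, the equivalence of the three conditions can be confirmed by brute force. The sketch below is an illustration under assumed encodings, not part of the original notes: a function is a Python dict, and the helper names are hypothetical.

```python
from itertools import product


def functions(A, B):
    """All functions A → B between finite sets, encoded as dicts."""
    A = list(A)
    return [dict(zip(A, values)) for values in product(list(B), repeat=len(A))]


def has_inverse(f, A, B):
    """Condition 1: some g: B → A is a two-sided inverse of f."""
    return any(
        all(g[f[a]] == a for a in A) and all(f[g[b]] == b for b in B)
        for g in functions(B, A)
    )


def unique_preimages(f, A, B):
    """Condition 2: every b in B has exactly one a in A with f(a) = b."""
    return all(sum(1 for a in A if f[a] == b) == 1 for b in B)


def injective_and_surjective(f, A, B):
    """Condition 3: existence of preimages together with their uniqueness."""
    surjective = all(any(f[a] == b for a in A) for b in B)
    injective = all(a1 == a2 for a1 in A for a2 in A if f[a1] == f[a2])
    return surjective and injective


def conditions_agree(m, n):
    """Do the three conditions coincide for every function [m] → [n]?"""
    A, B = range(m), range(n)
    return all(
        has_inverse(f, A, B)
        == unique_preimages(f, A, B)
        == injective_and_surjective(f, A, B)
        for f in functions(A, B)
    )
```

The search in has_inverse ranges over every candidate g, so this is only practical for very small sets; it is a check of the statement, not a replacement for the proof.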
3772jms-00MFjms-00MF.xmlSurjective functions2024129Jon SterlingMarcelo Fiore3743Definitionjms-00M8jms-00M8.xmlSurjection2024129Jon SterlingMarcelo FioreA function f \colon A \to B is said to be surjective or a surjection when we have \forall b \in B \mathpunct {.} \exists a \in A \mathpunct {.} f { \mathopen {} \left ( a \right ) \mathclose {}} =b. A function satisfying this condition is written f \colon A \twoheadrightarrow B.3748Corollaryjms-00M9jms-00M9.xmlBijections are surjective2024129Jon SterlingMarcelo FioreAny bijection f \colon A \to B is surjective.
3746Proof#390unstable-390.xml2024129Jon Sterlingjms-00M9
By .
3753Examplejms-00MAjms-00MA.xmlNon-empty sets and surjections2024129Jon SterlingMarcelo FioreThe unique function !_A \colon A \to { \mathopen {} \left [ 1 \right ] \mathclose {}} is surjective if and only if A \not = \varnothing.
3751Proof#389unstable-389.xml2024129Jon Sterlingjms-00MA
Suppose that A is nonempty, so that we have some a \in A. We must show that for every i \in { \mathopen {} \left [ 1 \right ] \mathclose {}}, there exists some x \in A such that !_A { \mathopen {} \left ( x \right ) \mathclose {}} =i. Letting i=0 be the unique element of { \mathopen {} \left [ 1 \right ] \mathclose {}}, we may set x:= a.
Conversely, if !_A \colon A \to { \mathopen {} \left [ 1 \right ] \mathclose {}} is surjective, we know that there exists some a \in A such that !_A { \mathopen {} \left ( a \right ) \mathclose {}} =0; therefore, A is nonempty.
3758Examplejms-00MBjms-00MB.xmlQuotients and surjections2024129Jon SterlingMarcelo FioreLet E be an equivalence relation on a set A, and let q \colon A \to A/E be the quotient function that sends a \in A to its equivalence class { \mathopen {} \left [ a \right ] \mathclose {}} _{ E }. Then q \colon A \to A/E is surjective.
3756Proof#388unstable-388.xml2024129Jon Sterlingjms-00MB
We must show that for every equivalence class u \in A/E there exists an element a \in A such that q { \mathopen {} \left ( a \right ) \mathclose {}} =u. By definition u= { \mathopen {} \left [ a \right ] \mathclose {}} _{ E } =q { \mathopen {} \left ( a \right ) \mathclose {}} for some a \in A.
3764Examplejms-00MCjms-00MC.xmlProjection functions and surjections2024129Jon SterlingMarcelo FioreThe projection function \pi _1 \colon A \times B \to A sending { \mathopen {} \left ( a,b \right ) \mathclose {}} to a is surjective if and only if either B \not = \varnothing or A= \varnothing.
3762Proof#387unstable-387.xml2024129Jon Sterlingjms-00MC
Suppose that \pi _1 \colon A \times B \to A is surjective. We proceed by cases on whether A is empty:
If A is empty, we are done.
If A is inhabited by some a \in A, then by assumption there exists u \in A \times B such that \pi _1 { \mathopen {} \left ( u \right ) \mathclose {}} =a. Thus B is inhabited by \pi _2 { \mathopen {} \left ( u \right ) \mathclose {}}.
Conversely, suppose that either B \not = \varnothing or A= \varnothing. Fixing a \in A, we must show that there exists u \in A \times B such that \pi _1 { \mathopen {} \left ( u \right ) \mathclose {}} =a. If A= \varnothing, then we have a contradiction already; on the other hand, if there exists any b \in B, we may define u:= { \mathopen {} \left ( a,b \right ) \mathclose {}} and we are done.
3769Theoremjms-00MDjms-00MD.xmlClosure of surjections under identity and composition2024129Jon SterlingMarcelo FioreThe identity function is a surjection and the composition of surjections yields a surjection.
3767Proof#386unstable-386.xml2024129Jon Sterlingjms-00MD
For the identity function on a set A, we must check that for all a \in A, there exists an element a' \in A such that a'=a. Of course, we choose a' := a.
Let f \colon A \twoheadrightarrow B and g \colon B \twoheadrightarrow C be two surjections. To show that g \circ f \colon A \to C is a surjection, we fix c \in C to exhibit some a \in A such that g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} = c. Because g \colon B \twoheadrightarrow C is surjective, there exists b \in B such that g { \mathopen {} \left ( b \right ) \mathclose {}} = c. Because f \colon A \twoheadrightarrow B is surjective, there exists a \in A such that f { \mathopen {} \left ( a \right ) \mathclose {}} = b. Thus we have g { \mathopen {} \left ( f { \mathopen {} \left ( a \right ) \mathclose {}} \right ) \mathclose {}} =g { \mathopen {} \left ( b \right ) \mathclose {}} =c.
3798jms-00MGjms-00MG.xmlEnumerability and countability2024129Jon SterlingMarcelo Fiore3775Definitionjms-00MEjms-00ME.xmlEnumerable set2024129Jon SterlingMarcelo FioreA set A is said to be enumerable whenever there exists a surjection \mathbb {N} \twoheadrightarrow A, referred to as an enumeration.In an enumeration e \colon \mathbb {N} \twoheadrightarrow A of a set A, we think of n \in \mathbb {N} as a “code” for a \in A when e { \mathopen {} \left ( n \right ) \mathclose {}} =a; these “codes” are not unique unless e is a bijection. By virtue of the viewpoint of quotients as surjections developed in , enumerability expresses when a given set can be built up from the natural numbers by quotienting (i.e. identifying codes that are taken to the same element by the enumeration).3778Definitionjms-00MHjms-00MH.xmlCountable set2024129Jon SterlingMarcelo FioreA countable set is one that is either empty or enumerable.3783Lemmajms-00MIjms-00MI.xmlAlternative definition of countability2024129Jon SterlingA set A is countable if and only if the disjoint union A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} is enumerable.
3781Proof#385unstable-385.xml2024129Jon Sterlingjms-00MI
Suppose that A is countable. We must find a surjection \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}}; if A is empty, this is the same as to find a surjection \mathbb {N} \to { \mathopen {} \left [ 1 \right ] \mathclose {}}, which we have by . On the other hand, if we have an enumeration e \colon \mathbb {N} \twoheadrightarrow {A}, we may define an enumeration e' \colon \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} as follows:
\begin {aligned} e' { \mathopen {} \left ( 0 \right ) \mathclose {}} &= 0 \\ e' { \mathopen {} \left ( n+1 \right ) \mathclose {}} &= e { \mathopen {} \left ( n \right ) \mathclose {}} \end {aligned}
Conversely, suppose that A+ { \mathopen {} \left [ 1 \right ] \mathclose {}} is enumerable by e \colon \mathbb {N} \twoheadrightarrow A+ { \mathopen {} \left [ 1 \right ] \mathclose {}}. If A is empty, we are done. If to the contrary A is inhabited by some a_0, we define e' \colon \mathbb {N} \twoheadrightarrow A as follows:
e' { \mathopen {} \left ( i \right ) \mathclose {}} = \begin {cases} e { \mathopen {} \left ( i \right ) \mathclose {}} & \text {when } e { \mathopen {} \left ( i \right ) \mathclose {}} \in A \\ a_0 & \text {otherwise} \end {cases}
3785Examplejms-00MJjms-00MJ.xmlA bijective enumeration of the integers2024129Jon SterlingMarcelo FioreWe can define a bijective enumeration of the integers by “zigzagging” between the positive and the negative following the pattern{ \mathopen {} \left ( 0, -1, 1,-2,2,-3,3, \ldots \right ) \mathclose {}}: \begin {aligned} e& \colon \mathbb {N} \twoheadrightarrow \mathbb {Z} \\ e { \mathopen {} \left ( i \right ) \mathclose {}} &= \begin {cases} 0 & \text {when } i=0 \\ i \div 2 & \text {when } i > 0 \land 2 \mid i \\ - { \mathopen {} \left ( i+1 \right ) \mathclose {}} \div 2 & \text {when } i > 0 \land \lnot { \mathopen {} \left ( 2 \mid i \right ) \mathclose {}} \end {cases} \end {aligned} 3790Lemmajms-00MKjms-00MK.xmlNon-empty subsets of enumerable sets are enumerable2024129Jon SterlingMarcelo FioreAny non-empty subset of an enumerable set is enumerable.
3788Proof#384unstable-384.xml2024129Jon Sterlingjms-00MK
We will use the same method as in our proof of .
Let S \subseteq A be a non-empty subset of an enumerable set A, so that we have some s \in S and a surjection e \colon \mathbb {N} \twoheadrightarrow A. We define e' \colon \mathbb {N} \twoheadrightarrow {S} to send i \in \mathbb {N} to e { \mathopen {} \left ( i \right ) \mathclose {}} if e { \mathopen {} \left ( i \right ) \mathclose {}} \in S and to s otherwise.
3795Lemmajms-00MLjms-00ML.xmlCartesian product of countable sets2024129Jon SterlingMarcelo FioreThe cartesian product of countable sets is countable.
3793Proof#383unstable-383.xml2024129Jon Sterlingjms-00ML
Let A and B be countable sets. If either A or B is empty, then A \times B is empty and thus countable. Otherwise, let e_A \colon \mathbb {N} \twoheadrightarrow A and e_B \colon \mathbb {N} \twoheadrightarrow B be enumerations of A and B respectively. Then the function e_A \times e_B \colon \mathbb {N} \times \mathbb {N} \to A \times B can be seen to be surjective; letting f \colon \mathbb {N} \twoheadrightarrow \mathbb {N} \times \mathbb {N} be any enumeration of \mathbb {N} \times \mathbb {N}, we can compose to obtain a surjection { \mathopen {} \left ( e_A \times e_B \right ) \mathclose {}} \circ f \colon \mathbb {N} \twoheadrightarrow A \times B by .
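To make the composite enumeration concrete, the following Python sketch enumerates the product of the integers with themselves. It reuses the zigzag enumeration of the integers from the earlier example; for the enumeration f of pairs of natural numbers it uses the inverse of the Cantor pairing function, which is our own choice (the proof only asks for some enumeration). The names zigzag, unpair and enumerate_pairs are hypothetical.

```python
from math import isqrt


def zigzag(i):
    """The enumeration 0, -1, 1, -2, 2, ... of the integers."""
    if i == 0:
        return 0
    if i % 2 == 0:
        return i // 2
    return -((i + 1) // 2)


def unpair(k):
    """Inverse of the Cantor pairing: walk the diagonals x + y = w in order."""
    w = (isqrt(8 * k + 1) - 1) // 2  # index of the diagonal containing code k
    first = w * (w + 1) // 2         # first code on that diagonal
    y = k - first
    x = w - y
    return (x, y)


def enumerate_pairs(k):
    """The composite (e × e) ∘ f sending a code k to a pair of integers."""
    x, y = unpair(k)
    return (zigzag(x), zigzag(y))
```

Since zigzag reaches every integer in the interval from -3 to 3 using codes 0 to 6, every pair in that square has a code on one of the first thirteen diagonals, that is, a code below 91.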
3801Axiomjms-00MMjms-00MM.xmlThe axiom of choice2024129Jon SterlingMarcelo FioreEvery surjection has a section.3827jms-00MNjms-00MN.xmlInjective functions2024129Jon SterlingMarcelo Fiore3804Definitionjms-00MOjms-00MO.xmlInjection2024129Jon SterlingMarcelo FioreA function f \colon A \to B is said to be injective, or an injection, whenever we have \forall a_1,a_2 \in A \mathpunct {.} f { \mathopen {} \left ( a_1 \right ) \mathclose {}} =f { \mathopen {} \left ( a_2 \right ) \mathclose {}} \implies a_1=a_2. Such a function is written f \colon A \rightarrowtail B.3809Examplejms-00N1jms-00N1.xmlSections are injective2024130Jon SterlingMarcelo FioreEvery section is injective.
3807Proof#382unstable-382.xml2024130Jon Sterlingjms-00N1
Let s \colon B \to A be a section of r \colon A \to B. Fixing b_1,b_2 \in B such that s { \mathopen {} \left ( b_1 \right ) \mathclose {}} =s { \mathopen {} \left ( b_2 \right ) \mathclose {}}, we must show that b_1=b_2. We have r { \mathopen {} \left ( s { \mathopen {} \left ( b_1 \right ) \mathclose {}} \right ) \mathclose {}} =r { \mathopen {} \left ( s { \mathopen {} \left ( b_2 \right ) \mathclose {}} \right ) \mathclose {}} by assumption, and thus b_1=r { \mathopen {} \left ( s { \mathopen {} \left ( b_1 \right ) \mathclose {}} \right ) \mathclose {}} =r { \mathopen {} \left ( s { \mathopen {} \left ( b_2 \right ) \mathclose {}} \right ) \mathclose {}} =b_2.
3814Examplejms-00N2jms-00N2.xmlSubset inclusions are injective2024130Jon SterlingMarcelo FioreThe function i_S \colon S \to A including a subset S \subseteq A into A is injective.
3812Proof#381unstable-381.xml2024130Jon Sterlingjms-00N2
Immediate: the inclusion maps a given element to itself!
3819Theoremjms-00N3jms-00N3.xmlClosure of injections under identity and composition2024130Jon SterlingMarcelo FioreThe identity function is an injection and the composition of injections yields an injection.
3817Proof#380unstable-380.xml2024130Jon Sterlingjms-00N3
The identity function is clearly injective, as a_0=a_1 implies a_0=a_1.
Let f \colon A \rightarrowtail B and g \colon B \rightarrowtail C be injections. To show that the composite function g \circ f \colon A \to C is injective, we fix a_0,a_1 \in A to check that g { \mathopen {} \left ( f { \mathopen {} \left ( a_0 \right ) \mathclose {}} \right ) \mathclose {}} = g { \mathopen {} \left ( f { \mathopen {} \left ( a_1 \right ) \mathclose {}} \right ) \mathclose {}} implies a_0=a_1. Because g is injective, we have f { \mathopen {} \left ( a_0 \right ) \mathclose {}} =f { \mathopen {} \left ( a_1 \right ) \mathclose {}}; therefore, because f is injective, we have a_0=a_1 as desired.
Now that we have enough definitions in hand to see it, the import of is then to state that a function is bijective if and only if it is both injective and surjective.3824Propositionjms-00N4jms-00N4.xmlCounting injections between finite sets2024130Jon SterlingMarcelo FioreFor finite sets A and B, we can describe the number of injections from A to B as follows: \mathord { \# } { \mathrm {Inj} { \mathopen {} \left ( A , B \right ) \mathclose {}} } = \begin {cases} { \mathord { \# } {B} \choose \mathord { \# } {A}} \cdot { \mathopen {} \left ( \mathord { \# } {A} \right ) \mathclose {}} ! & \text {when } \mathord { \# } {A} \leq \mathord { \# } {B} \\ 0 & \text {otherwise} \end {cases}
3822Proof#379unstable-379.xml2024130Jon Sterlingjms-00N4
We will argue from a combinatoric perspective.
An injection f from A to B associates no more than one a \in A to the same b \in B. Thus if we consider the subset U \subseteq B spanned by elements of the form f { \mathopen {} \left ( a \right ) \mathclose {}}, we should then get a bijection from A to U, as each b \in U comes from a unique a \in A.
Therefore, choosing an injection amounts to making two independent choices: first we choose a subset of B that is in bijection with A, and then we choose a specific bijection from A to that subset. The number of such subsets is precisely \mathord { \# } {B} \choose \mathord { \# } {A}, and we have already seen in that the number of such bijections is the factorial { \mathopen {} \left ( \mathord { \# } {A}! \right ) \mathclose {}}.
Of course, if there is no subset of B equinumerous with A, then there can be no injection.
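The counting formula can be compared against a direct enumeration for small cardinalities. In the Python sketch below (an illustration with hypothetical function names), an injection from [a] to [b] is identified with an a-tuple of pairwise distinct elements of [b], which is exactly what itertools.permutations produces.

```python
from itertools import permutations
from math import comb, factorial


def count_injections_brute(a, b):
    """Count injections [a] → [b] by listing tuples of distinct values."""
    return sum(1 for _ in permutations(range(b), a))


def count_injections_formula(a, b):
    """The closed form: choose the image, then a bijection onto it."""
    return comb(b, a) * factorial(a) if a <= b else 0
```

For example, both functions give 60 for a = 3 and b = 5, since there are 10 three-element subsets of [5] and 6 bijections onto each.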
3909jms-00N5jms-00N5.xmlLecture 19: relational images, families, diagonalisation, well-foundedness202422Jon SterlingMarcelo Fiore
3841jms-00NCjms-00NC.xmlRelational images2024131Jon SterlingMarcelo Fiore3835Definitionjms-00N6jms-00N6.xmlDirect image2024131Jon SterlingMarcelo FioreLet R \colon A \mathbin { \nrightarrow } B be a relation. The direct image of X \subseteq A under R is the set R _* X \subseteq B defined below: R _* X = { \mathopen {} \left \{ b \in B \, \middle \vert \, \exists x \in X \mathpunct {.} x \mathrel {R}b \right \} \mathclose {}} 3838Definitionjms-00N7jms-00N7.xmlInverse image2024131Jon SterlingMarcelo FioreLet R \colon A \mathbin { \nrightarrow } B be a relation. The inverse image of Y \subseteq B under R is the set R ^* Y \subseteq A defined below: R ^* Y = { \mathopen {} \left \{ a \in A \, \middle \vert \, \forall b \in B \mathpunct {.} a \mathrel {R}b \implies b \in Y \right \} \mathclose {}} 3857jms-00NDjms-00ND.xmlFamilies of sets and replacement2024131Jon SterlingMarcelo Fiore3844Definitionjms-00N9jms-00N9.xmlFamily of sets2024131Jon SterlingA family of sets indexed in a set I is defined to be a set S equipped with a function \pi _S \colon S \to I. 
For i \in I, we will write S_i \subseteq S for the inverse image \pi _S ^* { \mathopen {} \left \{ i \right \} \mathclose {}} \subseteq S: \begin {aligned} S_i &= \pi _S ^* { \mathopen {} \left \{ i \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \forall j \in I \mathpunct {.} s \mathrel { \pi _S} j \implies j \in { \mathopen {} \left \{ i \right \} \mathclose {}} \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \forall j \in I \mathpunct {.} \pi _S { \mathopen {} \left ( s \right ) \mathclose {}} =j \implies j=i \right \} \mathclose {}} \\ &= { \mathopen {} \left \{ s \in S \, \middle \vert \, \pi _S { \mathopen {} \left ( s \right ) \mathclose {}} =i \right \} \mathclose {}} \end {aligned} Thus S is the disjoint union of all the S_i.3846Axiomjms-00N8jms-00N8.xmlAxiom scheme of replacement2024131Jon SterlingMarcelo FioreLet I be a set, and let \mathtt {P} { \mathopen {} \left ( \mathtt {x}, \mathtt {y} \right ) \mathclose {}} be a formula in the language of set theory such that \forall i \in I \mathpunct {.} \exists ! S \mathpunct {.} \mathtt {P} { \mathopen {} \left ( i,S \right ) \mathclose {}} holds. Then the collection of sets S such that \mathtt {P} { \mathopen {} \left ( i,S \right ) \mathclose {}} holds for some i \in I forms a set.This is almost saying that the direct image of I under \mathtt {P} is a set, but this doesn’t actually make sense because we do not have a way to speak about a “relation” between a set and the class of all sets. This strange situation is clarified by other foundational systems that go beyond the limitations of the Zermelo-Fraenkel set theory that we have studied in this course.3849Examplejms-00NAjms-00NA.xmlThe set of iterated powersets2024131Jon Sterling may seem a bit obscure (and it is!). 
But an example of a set that we need replacement to define is the set of iterated powersets { \mathopen {} \left \{ \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} , \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \right ) \mathclose {}} , \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( \mathcal {P} { \mathopen {} \left ( X \right ) \mathclose {}} \right ) \mathclose {}} \right ) \mathclose {}} , \ldots \right \} \mathclose {}}.3854Propositionjms-00NBjms-00NB.xmlDisjoint unions of enumerable sets2024131Jon SterlingMarcelo FioreLet I be an enumerable set and let A be a family of sets indexed in I such that each A_i is enumerable. Then the disjoint union A = \coprod _{i \in I}A_i is enumerable.
3852Proof#378unstable-378.xml2024131Jon Sterlingjms-00NB
By assumption, we have an enumeration e_I \colon \mathbb {N} \twoheadrightarrow I; as each A_i is enumerable, we may use the axiom of choice to assign to each i \in I a specific e_i \colon \mathbb {N} \twoheadrightarrow A_i.
Let h \colon \mathbb {N} \twoheadrightarrow \mathbb {N} \times \mathbb {N} be any enumeration of \mathbb {N} \times \mathbb {N}. Define k \colon \mathbb {N} \times \mathbb {N} \to A as follows:
k { \mathopen {} \left ( m,n \right ) \mathclose {}} = e_{e_I { \mathopen {} \left ( m \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}}
The function k \colon \mathbb {N} \times \mathbb {N} \to A is surjective: fixing a \in A, we must find { \mathopen {} \left ( m,n \right ) \mathclose {}} such that k { \mathopen {} \left ( m,n \right ) \mathclose {}} =a. Let i= \pi _A { \mathopen {} \left ( a \right ) \mathclose {}}; because e_I is surjective, we may find m \in \mathbb {N} such that e_I { \mathopen {} \left ( m \right ) \mathclose {}} = \pi _A { \mathopen {} \left ( a \right ) \mathclose {}}, and because e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } is surjective, we may find n \in \mathbb {N} such that e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} = a. Therefore, we have k { \mathopen {} \left ( m,n \right ) \mathclose {}} = e_{e_I { \mathopen {} \left ( m \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} = e_{ \pi _A { \mathopen {} \left ( a \right ) \mathclose {}} } { \mathopen {} \left ( n \right ) \mathclose {}} =a.
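The construction of k can be tried out on a small finite family. Everything specific in the Python sketch below is an assumption made for illustration: the family, the cyclic enumerations of its (non-empty, finite) members, and the use of the inverse Cantor pairing for h. In this finite setting the enumerations e_i are written down explicitly, so no appeal to the axiom of choice is needed.

```python
from math import isqrt


def unpair(j):
    """An enumeration h of pairs of naturals (inverse Cantor pairing)."""
    w = (isqrt(8 * j + 1) - 1) // 2
    y = j - w * (w + 1) // 2
    return (w - y, y)


def cyclic(xs):
    """Enumerate a non-empty finite collection by cycling through it."""
    xs = list(xs)
    return lambda n: xs[n % len(xs)]


# A hypothetical family indexed by I = {"a", "b", "c"}.
family = {"a": [1, 2, 3], "b": [4], "c": [5, 6]}
e_I = cyclic(family)                              # enumeration of the index set
e = {i: cyclic(members) for i, members in family.items()}


def k(m, n):
    """k(m, n) = e_{e_I(m)}(n), tagged with its index in the disjoint union."""
    i = e_I(m)
    return (i, e[i](n))


def enumeration(j):
    """The composite k ∘ h, an enumeration of the disjoint union."""
    return k(*unpair(j))
```

Every element of this family is reached by some k(m, n) with m and n at most 2, so the codes 0 to 14 (the first five diagonals) already cover the whole disjoint union.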
3880jms-00NEjms-00NE.xmlDiagonalisation and fixed point theorems2024131Jon SterlingMarcelo Fiore3862Theoremjms-00NFjms-00NF.xmlCantor’s theorem2024131Jon SterlingMarcelo FioreGiven any set A, there can be no surjection from A to \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}}.
3860Proof#377unstable-377.xml2024131Jon Sterlingjms-00NF
Assume that we do have a surjection e \colon A \twoheadrightarrow \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}}. Consider the subset U \in \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} defined below:
U = { \mathopen {} \left \{ x \in A \, \middle \vert \, x \not \in e { \mathopen {} \left ( x \right ) \mathclose {}} \right \} \mathclose {}}
Because e \colon A \twoheadrightarrow \mathcal {P} { \mathopen {} \left ( A \right ) \mathclose {}} is surjective, we may find some a \in A such that e { \mathopen {} \left ( a \right ) \mathclose {}} =U.
Thus for each x \in A, we have
x \in e { \mathopen {} \left ( a \right ) \mathclose {}} \Longleftrightarrow x \in U \Longleftrightarrow x \not \in e { \mathopen {} \left ( x \right ) \mathclose {}} . Setting x:= a, we have a \in e { \mathopen {} \left ( a \right ) \mathclose {}} \Longleftrightarrow a \not \in e { \mathopen {} \left ( a \right ) \mathclose {}}, a contradiction.
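For a small finite set A, the diagonal argument can be replayed against every function from A to its powerset. The Python sketch below is an illustration with hypothetical names; it checks that the diagonal set U is never in the image of e, which is the reason e cannot be surjective.

```python
from itertools import combinations, product


def powerset(xs):
    """All subsets of the finite collection xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal(A, e):
    """The set U of all x in A with x not in e(x)."""
    return frozenset(x for x in A if x not in e[x])


def check_cantor(A):
    """Check that U is missed by every function e: A → P(A)."""
    A = list(A)
    subsets = powerset(A)
    for values in product(subsets, repeat=len(A)):
        e = dict(zip(A, values))
        if diagonal(A, e) in set(e.values()):
            return False
    return True
```

For a three-element set this inspects all 8 · 8 · 8 = 512 candidate functions.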
3865Definitionjms-00NGjms-00NG.xml