nLab

## Searching the nLab

There are two methods of searching the nLab:

1. The built-in search, available through the search box at the top of every page. Its distinguishing characteristics are:

    1. It uses regular expressions.
    2. It searches the source of each page.

2. An external search engine. Most search engines allow you to restrict the search to a single site; for the nLab, the best site to use is http://ncatlab.org/nlab. The distinguishing characteristics of an external search are:

    1. Most search engines match only alphanumeric characters.
    2. It searches the rendered version of each page.

## Regular Expressions

Regular expressions are a powerful way of extending search capabilities to take into account that one often wants to search for more than just a set phrase. In a regular expression, certain characters are declared to be “special” and have a particular interpretation (somewhat like TeX with its special catcodes). A special character can always be “escaped” to interpret it as an ordinary character. Thus . means “match any single character” but \. means “match a period”.
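The effect of escaping can be checked directly in Ruby. The following is a minimal sketch: the sample strings, including the one-line page source, are invented for illustration and are not taken from an actual nLab page. Because the built-in search runs over the page source, wiki markup such as double square brackets and TeX backslashes has to be escaped in the same way.

```ruby
# An unescaped period matches any single character;
# an escaped period matches only a literal period.
any_char = /a.c/
literal  = /a\.c/

puts any_char.match?("abc")   # true
puts any_char.match?("a.c")   # true
puts literal.match?("abc")    # false
puts literal.match?("a.c")    # true

# Hypothetical page source (single quotes keep the backslash as written).
source = 'An [[adjunction]] is written $F \dashv G$.'

# Square brackets and the backslash are special, so each one is escaped.
link_pattern = /\[\[adjunction\]\]/
tex_pattern  = /\\dashv/

puts link_pattern.match?(source)  # true
puts tex_pattern.match?(source)   # true

# Regexp.escape adds the backslashes for you.
escaped = Regexp.escape("(co)limit?")
puts escaped                      # \(co\)limit\?
```

`Regexp.escape` is convenient when the phrase you want to find happens to contain special characters and you do not want any of them interpreted.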

As Instiki is written in Ruby, it uses Ruby's version of regular expressions (each language has its own version; the differences are usually minor). The following is based on the list at ruby-doc, condensed slightly to those aspects likely to be of use here:

Special characters
The special characters are: ., |, (, ), [, \, ^, {, +, $, *, and ?. To match one of these characters, precede it with a backslash. All other characters simply match themselves unless one of the special characters gives them a special meaning.
\b, \B
Match word boundaries and nonword boundaries respectively. Thus cat matches against both category and cat, but cat\b matches only cat (and scat), not category.
[]
Matches a single character from a list. The characters |, (, ), [, ^, $, *, ? are treated as regular characters in such a list. You can specify a range using -: thus, a-z. To include a ] or - it must come at the start of the list. A ^ at the start negates the list.
\d, \s, \w
These match, respectively, digits, whitespace characters (spaces, tabs, newlines), and word characters (letters, digits, and the underscore).
\D, \S, \W
These are the negations of the lowercase versions.
. (period)
Matches any character except a newline.
()
Parentheses group pieces of the regular expression. This is important for the following modifications. In these, $x$ stands for a sub-expression which can be a single character, a [], or a ().
$x$*
Matches zero or more occurrences of $x$. Thus ab* matches a, ab, abb, and so forth. Similarly, (cat)* matches the empty string, cat, catcat, catcatcat, and so forth. This will try to match as much as possible; use $x$*? to make it match as little as possible.
$x$+
Matches one or more occurrences of $x$. Thus ab+ matches ab, abb, and so forth, but not a. This will try to match as much as possible; use $x$+? to make it match as little as possible.
$x${$m$,$n$}
Matches at least $m$ and at most $n$ occurrences of $x$. This will try to match as much as possible; use $x${$m$,$n$}? to make it match as little as possible.
$x$?
Matches zero or one occurrence of $x$.
$x_1$|$x_2$
Matches either $x_1$ or $x_2$.
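
Several of the entries above can be tried out in Ruby itself. The following is a minimal sketch: all of the sample strings are invented for illustration, and the patterns are evaluated by a current Ruby interpreter rather than by the nLab search box, so small differences in behaviour are possible.

```ruby
# Word boundaries: \b requires the match to end at the edge of a word.
words = %w[cat scat category]
unbounded = words.grep(/cat/)      # ["cat", "scat", "category"]
bounded   = words.grep(/cat\b/)    # ["cat", "scat"]

# Character lists and the shorthand classes.
lowercase = "nLab".scan(/[a-z]/)   # ["n", "a", "b"]
others    = "nLab".scan(/[^a-z]/)  # ["L"], since ^ negates the list
digits    = "page 42".scan(/\d/)   # ["4", "2"]

# Greedy versus lazy repetition.
greedy = "<a><b>"[/<.+>/]          # "<a><b>"
lazy   = "<a><b>"[/<.+?>/]         # "<a>"
counted_greedy = "aaaa"[/a{2,3}/]  # "aaa"
counted_lazy   = "aaaa"[/a{2,3}?/] # "aa"

# * accepts zero occurrences (an empty match); + needs at least one.
zero_ok   = "dog"[/(cat)*/]        # ""
needs_one = "a"[/ab+/]             # nil, because there is no b

# Optional items and alternatives.
spellings = "color colour".scan(/colou?r/)         # ["color", "colour"]
either    = "limit colimit".scan(/colimit|limit/)  # ["limit", "colimit"]

p unbounded, bounded, lowercase, others, digits
p greedy, lazy, counted_greedy, counted_lazy
p zero_ok, needs_one, spellings, either
```

The greedy and lazy pair is the one most likely to matter in practice: a greedy pattern such as <.+> runs to the last > on the line, which is rarely what is wanted when searching page source.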

category: meta

Revised on January 22, 2012 05:53:50 by Toby Bartels (71.29.67.53)