In addition to the operators described in the advanced search section, DocGoblin lets you use wildcards and regular expressions to create more flexible search patterns.

Wildcards

Wildcards are special characters that stand in for other characters in a word. The two available wildcards are ? and *.

The ? wildcard replaces a single character. For example, the search te?t will find test, text, and any other four-letter word starting with te and ending with t.

The * wildcard replaces zero or more characters. For example, the search test* will find test, tests, testing, and any other word starting with test.

You can combine both wildcards in a single search. For example, the search t?st* will find test, testing, tastes, and any other word matching this pattern.
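The wildcard rules above can be tried outside DocGoblin with a minimal Python sketch. It assumes that Python's fnmatch module, whose ? and * behave the same way, is a fair stand-in for DocGoblin's wildcard matching; the word list and the wildcard_search helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for indexed terms.
words = ["test", "text", "tent", "toast", "tests", "testing", "tastes", "contest"]

def wildcard_search(pattern, candidates):
    """Return the candidates that match a ?/* wildcard pattern in full."""
    return [w for w in candidates if fnmatchcase(w, pattern)]

single = wildcard_search("te?t", words)     # ? stands for exactly one character
prefix = wildcard_search("test*", words)    # * stands for zero or more characters
combined = wildcard_search("t?st*", words)  # both wildcards in one pattern

print(single)    # ['test', 'text', 'tent']
print(prefix)    # ['test', 'tests', 'testing']
print(combined)  # ['test', 'tests', 'testing', 'tastes']
```

Note that fnmatch also gives square brackets a special meaning, so this sketch is only a reasonable approximation for patterns built from ? and *.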

Regular expressions

Regular expressions (regexes) are a powerful way to describe text patterns. DocGoblin supports a subset of regex syntax through Apache Lucene.

To use a regex in your search, place the pattern between forward slashes: /pattern/. The following sections describe the most useful regex constructs.

Any character

The dot . matches any single character. It is the regex equivalent of the ? wildcard.

Examples:
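For instance, /te.t/ matches test and text, but not tet, because the dot must consume exactly one character. The sketch below checks this with Python's re module, which is not the Lucene engine but treats the dot the same way for ordinary letters. Lucene matches a pattern against a whole term, which re.fullmatch approximates here; the term list is hypothetical.

```python
import re

# Hypothetical terms; re.fullmatch requires the whole term to match.
terms = ["test", "text", "tet", "teapot"]

dot_matches = [t for t in terms if re.fullmatch(r"te.t", t)]

print(dot_matches)  # ['test', 'text']
```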

One or more occurrences

The plus sign + matches one or more occurrences of the preceding character.

Example (matches der, deer, deeer, etc.):

/de+r/

Zero or more occurrences

The asterisk * matches zero or more occurrences of the preceding character.

Example (matches wd, wed, weed, etc.):

/we*d/

Zero or one occurrence

The question mark ? matches zero or one occurrence of the preceding character, making it optional.

Example (matches wed and weed):

/wee?d/
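The three quantifiers +, *, and ? can be compared side by side with a short Python sketch. Python's re module is used as an approximation of the Lucene engine, on the assumption that these quantifiers behave identically in both; the term list and the full_matches helper are hypothetical.

```python
import re

# Hypothetical terms covering zero, one, two, and three occurrences of "e".
terms = ["dr", "der", "deer", "deeer", "wd", "wed", "weed", "weeed"]

def full_matches(pattern):
    """Return the terms that the pattern matches from start to end."""
    return [t for t in terms if re.fullmatch(pattern, t)]

plus_matches = full_matches(r"de+r")       # one or more "e"
star_matches = full_matches(r"we*d")       # zero or more "e"
optional_matches = full_matches(r"wee?d")  # the second "e" is optional

print(plus_matches)      # ['der', 'deer', 'deeer']
print(star_matches)      # ['wd', 'wed', 'weed', 'weeed']
print(optional_matches)  # ['wed', 'weed']
```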

Minimum and maximum occurrences

Curly braces {} let you specify an exact number or a range of occurrences of the preceding character.

Examples:
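As an illustration, {2} means exactly two occurrences, {2,} means two or more, and {2,3} means between two and three. The Python sketch below applies each form to the letter e; it uses Python's re module as a stand-in for the Lucene engine, and the patterns and term list are assumptions chosen for this example.

```python
import re

# Hypothetical terms with zero to four occurrences of "e".
terms = ["wd", "wed", "weed", "weeed", "weeeed"]

def full_matches(pattern):
    """Return the terms that the pattern matches from start to end."""
    return [t for t in terms if re.fullmatch(pattern, t)]

exactly_two = full_matches(r"we{2}d")     # exactly two "e"
two_or_more = full_matches(r"we{2,}d")    # at least two "e"
two_to_three = full_matches(r"we{2,3}d")  # two or three "e"

print(exactly_two)   # ['weed']
print(two_or_more)   # ['weed', 'weeed', 'weeeed']
print(two_to_three)  # ['weed', 'weeed']
```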

Grouping

Parentheses () let you group characters together so that quantifiers apply to the whole group rather than a single character.

Examples:
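For instance, in /(ab)+/ the plus sign repeats the whole group ab, whereas in /ab+/ it repeats only the b. The Python sketch below shows the difference, using Python's re module as an approximation of the Lucene engine; the patterns and term list are assumptions chosen for this example.

```python
import re

# Hypothetical terms that separate the two readings of "+".
terms = ["ab", "abab", "ababab", "abb", "aab"]

grouped = [t for t in terms if re.fullmatch(r"(ab)+", t)]   # repeats "ab"
ungrouped = [t for t in terms if re.fullmatch(r"ab+", t)]   # repeats "b" only

print(grouped)    # ['ab', 'abab', 'ababab']
print(ungrouped)  # ['ab', 'abb']
```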

Alternation

The pipe | matches either the expression on its left or the expression on its right. It is typically used inside a group.

Example (matches preparations and proportions):
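One pattern consistent with this description is /pr(epara|opor)tions/, where the group holds the two alternative middle parts. This pattern is an assumption made for illustration, and the Python sketch below checks it with Python's re module rather than the Lucene engine; the term list is hypothetical.

```python
import re

# Hypothetical terms; only the first two fit either alternative.
terms = ["preparations", "proportions", "prescriptions", "portions"]

# "pr", then either "epara" or "opor", then "tions".
pattern = r"pr(epara|opor)tions"

alternation_matches = [t for t in terms if re.fullmatch(pattern, t)]

print(alternation_matches)  # ['preparations', 'proportions']
```

Keeping the shared prefix and suffix outside the group avoids repeating them in each alternative.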

Character classes

Character classes let you match a single character from a set of characters.

You define a character class by placing the accepted characters between square brackets []. You can also specify a range of characters by using a hyphen -.

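Available forms:

[abc] matches a, b, or c
[a-c] matches any character in the range a to c
[^abc] matches any character except a, b, or c
[^a-c] matches any character outside the range a to c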

As shown above, the caret ^ at the start of a character class negates it, matching any character not in the set.

Character classes can be combined with all other regex constructs to build complex search patterns.

To retrieve the string “weed”, for example, you could use the following expression:
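One such expression, given here as an illustrative assumption, is /w[aeiou]{2}[^a-c]/: the letter w, then two vowels, then any character other than a, b, or c. The Python sketch below tests it with Python's re module as a stand-in for the Lucene engine; the term list is hypothetical.

```python
import re

# Hypothetical terms used to probe each part of the pattern.
terms = ["weed", "wood", "weeb", "wend", "weeds"]

# "w", two characters from the vowel class, then one character not in a-c.
pattern = r"w[aeiou]{2}[^a-c]"

class_matches = [t for t in terms if re.fullmatch(pattern, t)]

print(class_matches)  # ['weed', 'wood']
```

The pattern also matches wood, which shows that character classes broaden a search; narrow the sets if only weed should be found.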

Learn more

DocGoblin uses the Apache Lucene regex engine. For the complete syntax reference, see the official Lucene documentation.

If you want to experiment with regexes in a sandbox environment, try regex101.com, a free online regex tester.