+## Custom query preprocessors
+
+A query preprocessor must export a single factory function `create` with
+the following signature:
+
+``` python
+create(self, config: QueryConfig) -> Callable[[list[Phrase]], list[Phrase]]
+```
+
+The function receives the custom configuration for the preprocessor and
+returns a callable (function or class) with the actual preprocessing
+code. When a query comes in, then the callable gets a list of phrases
+and needs to return the transformed list of phrases. The list and phrases
+may be changed in place or a completely new list may be generated.
+
+The `QueryConfig` is a simple dictionary which contains all configuration
+options given in the yaml configuration of the ICU tokenizer. It is up to
+the function to interpret the values.
+
+A `nominatim_api.search.Phrase` describes a part of the query that contains one or more independent
+search terms. Breaking a query into phrases helps reducing the number of
+possible tokens Nominatim has to take into account. However a phrase break
+is definitive: a multi-term search word cannot go over a phrase break.
+A Phrase object has two fields:
+
+ * `ptype` further refines the type of phrase (see list below)
+ * `text` contains the query text for the phrase
+
+The order of phrases matters to Nominatim when doing further processing.
+Thus, while you may split or join phrases, you should not reorder them
+unless you really know what you are doing.
+
+Phrase types (`nominatim_api.search.PhraseType`) can further help narrowing
+down how the tokens in the phrase are interpreted. The following phrase types
+are known:
+
+::: nominatim_api.search.PhraseType
+ options:
+ heading_level: 6
+
+