
Introduction

Our search mechanism is based on Elasticsearch. We store the gallery, video, and tag information in JSON format.

There are three ways to search content in the Web UI.

The HTML input element in the top navigation is the entry point for the text search. It supports a special search grammar, for example including or excluding a field value under a specific field name.

Field name and field value

For example, we have a JSON file:

{
    ...
    "tags": {
        "language": ["Chinese", "English"]
    }
    ...
}

Here tags is a field name, and the nested key language is addressed with the dotted field name tags.language. Chinese and English are the field values.
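As a minimal sketch of how a nested key maps to the dotted field names used by the search grammar, the snippet below flattens the example document. The flatten helper is hypothetical and only illustrates the naming; it is not part of ZetsuBou.

```python
def flatten(doc, prefix=""):
    """Yield (dotted field name, value) pairs for a nested document."""
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects and extend the dotted name.
            yield from flatten(value, name)
        else:
            yield name, value


doc = {"tags": {"language": ["Chinese", "English"]}}
print(dict(flatten(doc)))  # {'tags.language': ['Chinese', 'English']}
```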

Web screenshot


Grammar

Special character

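Operator  Description
-         Excludes a field name, or a field value under a specific field name.
"         Everything between the quotes after the = character is treated as a field value, so it may contain spaces. A field name may be quoted as well.
=         Separates the field name from the field value. Only the first = acts as the separator.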

Examples

Developer notes

This section is generated by python cli.py build docs print-search-grammar-examples.

  • Excludes all documents that have a tags.language field.
-tags.language
{
    "bool": {
        "should": [],
        "must_not": [
            {
                "exists": {
                    "field": "tags.language"
                }
            }
        ]
    }
}
  • Excludes documents that have an english value in a tags.language field.
-tags.language=english

or

-tags.language="english"

or

-"tags.language"=english

or

-"tags.language"="english"
{
    "bool": {
        "should": [],
        "must_not": [
            {
                "term": {
                    "tags.language.keyword": {
                        "value": "english"
                    }
                }
            }
        ]
    }
}
  • Includes documents that have a 中文 value in a tags.language field.
tags.language=中文

or

tags.language="中文"

or

"tags.language"=中文

or

"tags.language"="中文"
{
    "bool": {
        "should": [
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "中文",
                            "fuzziness": 0,
                            "fields": "tags.language.*"
                        }
                    }
                }
            }
        ],
        "must_not": []
    }
}
  • Includes documents that have a =中文 value in a tags.language field.
tags.language==中文
{
    "bool": {
        "should": [
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "=中文",
                            "fuzziness": 0,
                            "fields": "tags.language.*"
                        }
                    }
                }
            }
        ],
        "must_not": []
    }
}
  • Includes documents that have a =中=文= value in a tags.language field.
tags.language==中=文=
{
    "bool": {
        "should": [
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "=中=文=",
                            "fuzziness": 0,
                            "fields": "tags.language.*"
                        }
                    }
                }
            }
        ],
        "must_not": []
    }
}
  • Includes documents that have an English (UK) value in a tags.language field.
tags.language="English (UK)"
{
    "bool": {
        "should": [
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "English (UK)",
                            "fuzziness": 0,
                            "fields": "tags.language.*"
                        }
                    }
                }
            }
        ],
        "must_not": []
    }
}
  • If there is an odd number of quotes, it is treated as a normal keyword string.
"["社會" (歷史)] 今天天氣真好=(三國演義) [Chn]
{
    "bool": {
        "should": [
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "\"[\"社會\"",
                            "fuzziness": 0,
                            "fields": [
                                "tags.*"
                            ]
                        }
                    }
                }
            },
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "(歷史)]",
                            "fuzziness": 0,
                            "fields": [
                                "tags.*"
                            ]
                        }
                    }
                }
            },
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "今天天氣真好=(三國演義)",
                            "fuzziness": 0,
                            "fields": [
                                "tags.*"
                            ]
                        }
                    }
                }
            },
            {
                "constant_score": {
                    "filter": {
                        "multi_match": {
                            "query": "[Chn]",
                            "fuzziness": 0,
                            "fields": [
                                "tags.*"
                            ]
                        }
                    }
                }
            }
        ],
        "must_not": []
    }
}
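
The examples above can be reproduced with a small parser. The following is a minimal sketch inferred only from those examples, not ZetsuBou's actual implementation: the function names and the DEFAULT_FIELDS constant are hypothetical, and details such as how several tokens are combined are assumptions.

```python
# Hypothetical default field list for plain keywords; the last example uses "tags.*".
DEFAULT_FIELDS = ["tags.*"]


def split_tokens(text):
    """Split on whitespace outside quotes; return None for an odd number of quotes."""
    if text.count('"') % 2 == 1:
        return None
    tokens, current, in_quotes = [], "", False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            current += ch
        elif ch.isspace() and not in_quotes:
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


def unquote(s):
    """Strip one pair of surrounding double quotes, if present."""
    return s[1:-1] if len(s) >= 2 and s[0] == '"' and s[-1] == '"' else s


def split_field(body):
    """Split at the first = outside quotes into (field name, field value or None)."""
    in_quotes = False
    for i, ch in enumerate(body):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "=" and not in_quotes:
            return unquote(body[:i]), unquote(body[i + 1:])
    return unquote(body), None


def match_clause(query, fields):
    return {"constant_score": {"filter": {
        "multi_match": {"query": query, "fuzziness": 0, "fields": fields}}}}


def build_query(text):
    should, must_not = [], []
    tokens = split_tokens(text)
    if tokens is None:
        # Odd number of quotes: every whitespace-separated piece is a plain keyword.
        should = [match_clause(kw, DEFAULT_FIELDS) for kw in text.split()]
        return {"bool": {"should": should, "must_not": must_not}}
    for token in tokens:
        exclude = token.startswith("-")
        name, value = split_field(token[1:] if exclude else token)
        if exclude and value is None:
            must_not.append({"exists": {"field": name}})
        elif exclude:
            must_not.append({"term": {f"{name}.keyword": {"value": value}}})
        elif value is None:
            should.append(match_clause(name, DEFAULT_FIELDS))
        else:
            should.append(match_clause(value, f"{name}.*"))
    return {"bool": {"should": should, "must_not": must_not}}
```

For instance, build_query('-tags.language="english"') yields a must_not term clause on tags.language.keyword, matching the second example.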

With additional parameters

At the end of the search input in the top navigation, next to the magnifier icon, there is a rounded-expand-more icon. Click it to expand the dropdown.

Web screenshot


Here are the parameters.

Parameter  Description
Base       The order in which results are returned.
Analyzer   The algorithm used to convert the text into tokens.
Fuzziness  The fuzziness^[1] of the text search.
Boolean    The relationship between keywords.
Custom     A custom Elasticsearch query.

Additional parameters

Base

Search performs the default search: it splits the input text on spaces and searches for the resulting keywords in multiple fields.

Random returns the same results as Search, but in random order.
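
One way to express the two Base options as an Elasticsearch request body is sketched below. This is an assumption for illustration, not ZetsuBou's actual query: the search_body function is hypothetical, and wrapping the query in function_score with random_score is simply a standard Elasticsearch technique for random ordering.

```python
def search_body(text, fields, random_order=False):
    """Build a request body that matches any space-separated keyword in `fields`."""
    clauses = [
        {"multi_match": {"query": keyword, "fields": fields}}
        for keyword in text.split()
    ]
    query = {"bool": {"should": clauses}}
    if random_order:
        # Same matching documents, but the score is replaced by a random number,
        # so the order of the results is random.
        query = {
            "function_score": {
                "query": query,
                "random_score": {},
                "boost_mode": "replace",
            }
        }
    return {"query": query}
```

Both variants match the same documents; only the scoring, and therefore the order, differs.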

Analyzer

The word Analyzer has two meanings in ZetsuBou. One is the Elasticsearch Analyzer and the other is the Web Search Analyzer.

The Elasticsearch Analyzer is the suffix of a field name of the form *.<analyzer>. For example, raw_name.default means the field raw_name analyzed with the default analyzer.

The Web Search Analyzer is the option chosen from a dropdown list in the Web UI. Each option is a combination of Elasticsearch Analyzers applied to a set of fields.

Developer notes

The following tabs can be printed in JSON format with python cli.py build docs print-web-search-analyzer.

The tables below list the fields covered by each Web Search Analyzer.

Default

Full field name      Field name           Elasticsearch analyzer
path.url             path                 url
name.default         name                 default
raw_name.default     raw_name             default
src.url              src                  url
attributes.category  attributes.category  keyword
attributes.uploader  attributes.uploader  keyword
labels               labels               keyword
tags.*               tags.*               keyword

Keyword

Full field name      Field name           Elasticsearch analyzer
path.keyword         path                 keyword
name.keyword         name                 keyword
raw_name.keyword     raw_name             keyword
src.keyword          src                  keyword
attributes.category  attributes.category  keyword
attributes.uploader  attributes.uploader  keyword
labels               labels               keyword
tags.*               tags.*               keyword

Ngram

Full field name      Field name           Elasticsearch analyzer
path.ngram           path                 ngram
name.ngram           name                 ngram
raw_name.ngram       raw_name             ngram
src.ngram            src                  ngram
attributes.category  attributes.category  keyword
attributes.uploader  attributes.uploader  keyword
labels               labels               keyword
tags.*               tags.*               keyword

Standard

Full field name      Field name           Elasticsearch analyzer
path.standard        path                 standard
name.standard        name                 standard
raw_name.standard    raw_name             standard
src.standard         src                  standard
attributes.category  attributes.category  keyword
attributes.uploader  attributes.uploader  keyword
labels               labels               keyword
tags.*               tags.*               keyword

URL

Full field name      Field name           Elasticsearch analyzer
path.url             path                 url
src.url              src                  url
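
The tables can be summarized as a lookup from a Web Search Analyzer to the full field names it searches. The sketch below is illustrative only: fields_for, COMMON_FIELDS, and TEXT_FIELDS are hypothetical names, and the lowercase option names are an assumption; only the field names themselves come from the tables.

```python
# Fields searched with the keyword analyzer by every option except "url".
COMMON_FIELDS = ["attributes.category", "attributes.uploader", "labels", "tags.*"]
# Fields whose analyzer suffix changes with the selected option.
TEXT_FIELDS = ["path", "name", "raw_name", "src"]


def fields_for(web_analyzer):
    """Return the full field names searched for a Web Search Analyzer option."""
    if web_analyzer == "default":
        return ["path.url", "name.default", "raw_name.default", "src.url"] + COMMON_FIELDS
    if web_analyzer == "url":
        return ["path.url", "src.url"]
    if web_analyzer in ("keyword", "ngram", "standard"):
        return [f"{field}.{web_analyzer}" for field in TEXT_FIELDS] + COMMON_FIELDS
    raise ValueError(f"unknown Web Search Analyzer: {web_analyzer!r}")
```

The returned list could be passed as the fields value of a multi_match query.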

Fuzziness

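The fuzziness value is the maximum edit distance allowed between a keyword and a matching term, where the edit distance is the minimum number of single-character changes needed to turn one text sequence into another.^[2]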
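
To make the notion of edit distance concrete, here is a minimal sketch of the classic Levenshtein distance. It is not the code Elasticsearch runs; in particular, Elasticsearch by default also counts swapping two adjacent characters as a single edit, which this plain version does not. The fuzzy_match helper is hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(keyword, term, fuzziness):
    """True if term is within `fuzziness` edits of keyword."""
    return levenshtein(keyword, term) <= fuzziness


print(levenshtein("kitten", "sitting"))       # 3
print(fuzzy_match("english", "englich", 1))   # True
print(fuzzy_match("english", "englich", 0))   # False
```

A fuzziness of 0, as in the grammar examples, therefore only accepts exact term matches.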

Boolean

The default value is Should, which returns results that match any of the keywords under any of the field names.

Must means that every keyword must match.
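
In Elasticsearch terms, the two options correspond to the should and must clauses of a bool query. The following sketch shows the difference under that assumption; boolean_query is a hypothetical helper, not ZetsuBou's code.

```python
def boolean_query(text, fields, mode="should"):
    """Combine one multi_match clause per keyword with the given bool occurrence."""
    if mode not in ("should", "must"):
        raise ValueError(f"unknown boolean mode: {mode!r}")
    clauses = [
        {"multi_match": {"query": keyword, "fields": fields}}
        for keyword in text.split()
    ]
    # "should": a document matches if at least one clause matches.
    # "must": a document matches only if every clause matches.
    return {"bool": {mode: clauses}}


print(boolean_query("cat dog", ["tags.*"], mode="must"))
```

With Should, a document tagged only cat is returned for the input cat dog; with Must, it is not.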

Custom

Users can create their own Elasticsearch query at http://localhost:3000/settings/elasticsearch-search.

Web screenshot



The third way to search is similar to searching with additional parameters, but more detailed.

Web screenshot



Last update: September 12, 2023