`, from the input text and renders plain text. The filter can be configured to preserve certain tags or decode specific HTML entities, such as `&nbsp;`, into spaces.
+
+## Example: HTML analyzer
+
+```json
+GET /_analyze
+{
+  "tokenizer": "keyword",
+  "char_filter": [
+    "html_strip"
+  ],
+  "text": "<p>Commonly used calculus symbols include &alpha;, &beta; and &theta;</p>"
+}
+```
+{% include copy-curl.html %}
+
+The `html_strip` character filter converts the HTML character entity references into their corresponding symbols. The processed text reads as follows:
+
+```
+Commonly used calculus symbols include α, β and θ
+```
+
+## Example: Custom analyzer with lowercase filter
+
+The following example request creates a custom analyzer that strips HTML tags and converts the plain text to lowercase by using the `html_strip` character filter and the `lowercase` token filter:
+
+```json
+PUT /html_strip_and_lowercase_analyzer
+{
+  "settings": {
+    "analysis": {
+      "char_filter": {
+        "html_filter": {
+          "type": "html_strip"
+        }
+      },
+      "analyzer": {
+        "html_strip_analyzer": {
+          "type": "custom",
+          "char_filter": ["html_filter"],
+          "tokenizer": "standard",
+          "filter": ["lowercase"]
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+### Testing `html_strip_and_lowercase_analyzer`
+
+You can run the following request to test the analyzer:
+
+```json
+GET /html_strip_and_lowercase_analyzer/_analyze
+{
+  "analyzer": "html_strip_analyzer",
+  "text": "<h1>Welcome to <strong>OpenSearch</strong>!</h1>"
+}
+```
+{% include copy-curl.html %}
+
+In the response, the HTML tags have been removed and the plain text has been converted to lowercase. Because the `standard` tokenizer also discards punctuation, the resulting tokens are as follows:
+
+```
+welcome to opensearch
+```
+
+## Example: Custom analyzer that preserves HTML tags
+
+The following example request creates a custom analyzer that strips HTML tags but preserves the `<b>` and `<i>` tags listed in `escaped_tags`:
+
+```json
+PUT /html_strip_preserve_analyzer
+{
+  "settings": {
+    "analysis": {
+      "char_filter": {
+        "html_filter": {
+          "type": "html_strip",
+          "escaped_tags": ["b", "i"]
+        }
+      },
+      "analyzer": {
+        "html_strip_analyzer": {
+          "type": "custom",
+          "char_filter": ["html_filter"],
+          "tokenizer": "keyword"
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+### Testing `html_strip_preserve_analyzer`
+
+You can run the following request to test the analyzer:
+
+```json
+GET /html_strip_preserve_analyzer/_analyze
+{
+  "analyzer": "html_strip_analyzer",
+  "text": "<p>This is a <b>bold</b> and <i>italic</i> text.</p>"
+}
+```
+{% include copy-curl.html %}
+
+In the response, the `<i>` (italic) and `<b>` (bold) tags have been retained, as specified in the custom analyzer request:
+
+```
+This is a <b>bold</b> and <i>italic</i> text.
+```
diff --git a/_analyzers/character-filters/index.md b/_analyzers/character-filters/index.md
new file mode 100644
index 0000000000..0e2ce01b8c
--- /dev/null
+++ b/_analyzers/character-filters/index.md
@@ -0,0 +1,19 @@
+---
+layout: default
+title: Character filters
+nav_order: 90
+has_children: true
+has_toc: false
+---
+
+# Character filters
+
+Character filters process text before tokenization to prepare it for further analysis.
+
+Unlike token filters, which operate on tokens (words or terms), character filters operate on the raw character stream. They are especially useful for cleaning or transforming structured text containing unwanted characters, such as HTML tags or special symbols. Character filters strip or replace these elements so that text is properly formatted for analysis.
+
+Use cases for character filters include:
+
+- **HTML stripping:** Removes HTML tags from content so that only the plain text is indexed.
+- **Pattern replacement:** Replaces or removes unwanted characters or patterns in text, for example, converting hyphens to spaces.
+- **Custom mappings:** Substitutes specific characters or sequences with other values, for example, to convert currency symbols into their textual equivalents.
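+
+As a rough sketch of the pattern replacement and custom mapping use cases, the following request defines two inline character filters in a single `_analyze` call. The sample text, the `-` pattern, and the `$ => USD` mapping are assumptions chosen for illustration only:
+
+```json
+GET /_analyze
+{
+  "tokenizer": "standard",
+  "char_filter": [
+    {
+      "type": "pattern_replace",
+      "pattern": "-",
+      "replacement": " "
+    },
+    {
+      "type": "mapping",
+      "mappings": ["$ => USD"]
+    }
+  ],
+  "text": "A state-of-the-art device for $100"
+}
+```
+{% include copy-curl.html %}
+
+Both filters run on the raw text before the tokenizer, so the tokens in the response reflect the replaced characters rather than the original hyphens and currency symbol.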