diff --git a/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt index aa04726e42..11ff53efe6 100644 --- a/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt +++ b/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt @@ -127,6 +127,7 @@ stdout [Ss]ubvector [Ss]ubwords? [Ss]uperset +[Ss]uperadmins? [Ss]yslog tebibyte [Tt]emplated diff --git a/_about/index.md b/_about/index.md index d2cc011b55..041197eeba 100644 --- a/_about/index.md +++ b/_about/index.md @@ -22,16 +22,21 @@ This section contains documentation for OpenSearch and OpenSearch Dashboards. ## Getting started -- [Intro to OpenSearch]({{site.url}}{{site.baseurl}}/intro/) -- [Quickstart]({{site.url}}{{site.baseurl}}/quickstart/) +To get started, explore the following documentation: + +- [Getting started guide]({{site.url}}{{site.baseurl}}/getting-started/): + - [Intro to OpenSearch]({{site.url}}{{site.baseurl}}/getting-started/intro/) + - [Installation quickstart]({{site.url}}{{site.baseurl}}/getting-started/quickstart/) + - [Communicate with OpenSearch]({{site.url}}{{site.baseurl}}/getting-started/communicate/) + - [Ingest data]({{site.url}}{{site.baseurl}}/getting-started/ingest-data/) + - [Search data]({{site.url}}{{site.baseurl}}/getting-started/search-data/) + - [Getting started with OpenSearch security]({{site.url}}{{site.baseurl}}/getting-started/security/) - [Install OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/install-opensearch/index/) - [Install OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/install-and-configure/install-dashboards/index/) -- [See the FAQ](https://opensearch.org/faq) +- [FAQ](https://opensearch.org/faq) ## Why use OpenSearch? -With OpenSearch, you can perform the following use cases: - @@ -41,35 +46,38 @@ With OpenSearch, you can perform the following use cases: - - - - + + + + - + - +
Operational health tracking
-Fast, Scalable Full-text Search
-Application and Infrastructure Monitoring
-Security and Event Information Management
-Operational Health Tracking
+Fast, scalable full-text search
+Application and infrastructure monitoring
+Security and event information management
+Operational health tracking
 Help users find the right information within your application, website, or data lake catalog.
-Easily store and analyze log data, and set automated alerts for underperformance.
+Easily store and analyze log data, and set automated alerts for performance issues.
 Centralize logs to enable real-time security monitoring and forensic analysis.
-Use observability logs, metrics, and traces to monitor your applications and business in real time.
+Use observability logs, metrics, and traces to monitor your applications in real time.
-**Additional features and plugins:** +## Key features + +OpenSearch provides several features to help index, secure, monitor, and analyze your data: -OpenSearch has several features and plugins to help index, secure, monitor, and analyze your data. Most OpenSearch plugins have corresponding OpenSearch Dashboards plugins that provide a convenient, unified user interface. -- [Anomaly detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/) - Identify atypical data and receive automatic notifications -- [KNN]({{site.url}}{{site.baseurl}}/search-plugins/knn/) - Find “nearest neighbors” in your vector data -- [Performance Analyzer]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/) - Monitor and optimize your cluster -- [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) - Use SQL or a piped processing language to query your data -- [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/) - Automate index operations -- [ML Commons plugin]({{site.url}}{{site.baseurl}}/ml-commons-plugin/index/) - Train and execute machine-learning models -- [Asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/) - Run search requests in the background -- [Cross-cluster replication]({{site.url}}{{site.baseurl}}/replication-plugin/index/) - Replicate your data across multiple OpenSearch clusters +- [Anomaly detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/) -- Identify atypical data and receive automatic notifications. +- [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) -- Use SQL or a Piped Processing Language (PPL) to query your data. +- [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/) -- Automate index operations. +- [Search methods]({{site.url}}{{site.baseurl}}/search-plugins/knn/) -- From traditional lexical search to advanced vector and hybrid search, discover the optimal search method for your use case. 
+- [Machine learning]({{site.url}}{{site.baseurl}}/ml-commons-plugin/index/) -- Integrate machine learning models into your workloads. +- [Workflow automation]({{site.url}}{{site.baseurl}}/automating-configurations/index/) -- Automate complex OpenSearch setup and preprocessing tasks. +- [Performance evaluation]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/) -- Monitor and optimize your cluster. +- [Asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/) -- Run search requests in the background. +- [Cross-cluster replication]({{site.url}}{{site.baseurl}}/replication-plugin/index/) -- Replicate your data across multiple OpenSearch clusters. ## The secure path forward -OpenSearch includes a demo configuration so that you can get up and running quickly, but before using OpenSearch in a production environment, you must [configure the Security plugin manually]({{site.url}}{{site.baseurl}}/security/configuration/index/) with your own certificates, authentication method, users, and passwords. + +OpenSearch includes a demo configuration so that you can get up and running quickly, but before using OpenSearch in a production environment, you must [configure the Security plugin manually]({{site.url}}{{site.baseurl}}/security/configuration/index/) with your own certificates, authentication method, users, and passwords. To get started, see [Getting started with OpenSearch security]({{site.url}}{{site.baseurl}}/getting-started/security/). ## Looking for the Javadoc? 
diff --git a/_automating-configurations/api/index.md b/_automating-configurations/api/index.md index 716e19c41f..78bc4eaede 100644 --- a/_automating-configurations/api/index.md +++ b/_automating-configurations/api/index.md @@ -18,4 +18,6 @@ OpenSearch supports the following workflow APIs: * [Search workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/search-workflow/) * [Search workflow state]({{site.url}}{{site.baseurl}}/automating-configurations/api/search-workflow-state/) * [Deprovision workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/deprovision-workflow/) -* [Delete workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/delete-workflow/) \ No newline at end of file +* [Delete workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/delete-workflow/) + +For information about workflow access control, see [Workflow template security]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-security/). \ No newline at end of file diff --git a/_automating-configurations/index.md b/_automating-configurations/index.md index 144ad445c8..68742f6149 100644 --- a/_automating-configurations/index.md +++ b/_automating-configurations/index.md @@ -44,3 +44,4 @@ Workflow automation provides the following benefits: - For the workflow step syntax, see [Workflow steps]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-steps/). - For a complete example, see [Workflow tutorial]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-tutorial/). - For configurable settings, see [Workflow settings]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-settings/). +- For information about workflow access control, see [Workflow template security]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-security/). 
\ No newline at end of file diff --git a/_automating-configurations/workflow-security.md b/_automating-configurations/workflow-security.md new file mode 100644 index 0000000000..f3a3d7eeb9 --- /dev/null +++ b/_automating-configurations/workflow-security.md @@ -0,0 +1,93 @@ +--- +layout: default +title: Workflow template security +nav_order: 50 +--- + +# Workflow template security + +In OpenSearch, automated workflow configurations are provided by the Flow Framework plugin. You can use the Security plugin together with the Flow Framework plugin to limit non-admin users to specific actions. For example, you might want some users to only be able to create, update, or delete workflows, while others may only be able to view workflows. + +All Flow Framework indexes are protected as system indexes. Only a superadmin user or an admin user with a TLS certificate can access system indexes. For more information, see [System indexes]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/). + +Security for Flow Framework is set up similarly to [security for anomaly detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/security/). + +## Basic permissions + +As an admin user, you can use the Security plugin to assign specific permissions to users based on the APIs they need to access. For a list of supported Flow Framework APIs, see [Workflow APIs]({{site.url}}{{site.baseurl}}/automating-configurations/api/index/). + +The Security plugin has two built-in roles that cover most Flow Framework use cases: `flow_framework_full_access` and `flow_framework_read_access`. For descriptions of each, see [Predefined roles]({{site.url}}{{site.baseurl}}/security/access-control/users-roles#predefined-roles). + +If these roles don't meet your needs, you can assign users individual Flow Framework [permissions]({{site.url}}{{site.baseurl}}/security/access-control/permissions/) to suit your use case. Each action corresponds to an operation in the REST API. 
For example, the `cluster:admin/opensearch/flow_framework/workflow/search` permission lets you search workflows. + +### Fine-grained access control + +To reduce the chances of unintended users viewing metadata that describes an index, we recommend that administrators enable role-based access control when assigning permissions to the intended user group. For more information, see [Limit access by backend role](#advanced-limit-access-by-backend-role). + +## (Advanced) Limit access by backend role + +Use backend roles to configure fine-grained access to individual workflows based on roles. For example, users in different departments of an organization can view workflows owned by their own department. + +First, make sure your users have the appropriate [backend roles]({{site.url}}{{site.baseurl}}/security/access-control/index/). Backend roles usually come from an [LDAP server]({{site.url}}{{site.baseurl}}/security/configuration/ldap/) or [SAML provider]({{site.url}}{{site.baseurl}}/security/configuration/saml/), but if you use an internal user database, you can [create users manually using the API]({{site.url}}{{site.baseurl}}/security/access-control/api#create-user). + +Next, enable the following setting: + +```json +PUT _cluster/settings +{ + "transient": { + "plugins.flow_framework.filter_by_backend_roles": "true" + } +} +``` +{% include copy-curl.html %} + +Now when users view workflow resources in OpenSearch Dashboards (or make REST API calls), they only see workflows created by users who share at least one backend role. + +For example, consider two users: `alice` and `bob`. 
+ +`alice` has an `analyst` backend role: + +```json +PUT _plugins/_security/api/internalusers/alice +{ + "password": "alice", + "backend_roles": [ + "analyst" + ], + "attributes": {} +} +``` + +`bob` has a `human-resources` backend role: + +```json +PUT _plugins/_security/api/internalusers/bob +{ + "password": "bob", + "backend_roles": [ + "human-resources" + ], + "attributes": {} +} +``` + +Both `alice` and `bob` have full access to the Flow Framework APIs: + +```json +PUT _plugins/_security/api/rolesmapping/flow_framework_full_access +{ + "backend_roles": [], + "hosts": [], + "users": [ + "alice", + "bob" + ] +} +``` + +Because they have different backend roles, `alice` and `bob` cannot view each other's workflows or their results. + +Users without backend roles can still view other users' workflow results if they have `flow_framework_read_access`. This also applies to users who have `flow_framework_full_access` because this permission includes all of the permissions of `flow_framework_read_access`. + +Administrators should inform users that the `flow_framework_read_access` permission allows them to view the results of any workflow in a cluster, including data not directly accessible to them. To limit access to the results of a specific workflow, administrators should apply backend role filters when creating the workflow. This ensures that only users with matching backend roles can access that workflow's results. 
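The filtering rule described above (users only see workflows created by users who share at least one backend role) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the Flow Framework implementation: the `User` class, the `can_view` function, and the third user `carol` are hypothetical names introduced for the example. The sketch also deliberately ignores users who have no backend roles, whose behavior is described separately above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for an internal user and its backend roles."""
    name: str
    backend_roles: frozenset = field(default_factory=frozenset)


def can_view(viewer: User, owner: User) -> bool:
    """Return True if the viewer shares at least one backend role with the workflow owner."""
    # Set intersection is non-empty only when a backend role is shared.
    return bool(viewer.backend_roles & owner.backend_roles)


alice = User("alice", frozenset({"analyst"}))
bob = User("bob", frozenset({"human-resources"}))
# Hypothetical third user who holds both backend roles.
carol = User("carol", frozenset({"analyst", "human-resources"}))

print(can_view(alice, bob))    # False: no backend role in common
print(can_view(carol, alice))  # True: both have the "analyst" role
```

Under this reading, a user who holds backend roles from several departments (like the hypothetical `carol`) would see workflows from each of them, which is worth keeping in mind when assigning backend roles.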
\ No newline at end of file diff --git a/_dashboards/management/connect-prometheus.md b/_dashboards/management/connect-prometheus.md new file mode 100644 index 0000000000..ed5545bc56 --- /dev/null +++ b/_dashboards/management/connect-prometheus.md @@ -0,0 +1,53 @@ +--- +layout: default +title: Connecting Prometheus to OpenSearch +parent: Data sources +nav_order: 20 +--- + +# Connecting Prometheus to OpenSearch +Introduced 2.16 +{: .label .label-purple } + +This documentation covers the key steps to connect Prometheus to OpenSearch using the OpenSearch Dashboards interface, including setting up the data source connection, modifying the connection details, and creating an index pattern for the Prometheus data. + +## Prerequisites and permissions + +Before connecting a data source, ensure you have met the [Prerequisites]({{site.url}}{{site.baseurl}}/dashboards/management/data-sources/#prerequisites) and have the necessary [Permissions]({{site.url}}{{site.baseurl}}/dashboards/management/data-sources/#permissions). + +## Create a Prometheus data source connection + +A data source connection specifies the parameters needed to connect to a data source. These parameters form a connection string for the data source. Using OpenSearch Dashboards, you can add new **Prometheus** data source connections or manage existing ones. + +Follow these steps to connect your data source: + +1. From the OpenSearch Dashboards main menu, go to **Management** > **Data sources** > **New data source** > **Prometheus**. + +2. From the **Configure Prometheus data source** section: + + - Under **Data source details**, provide a title and optional description. + - Under **Prometheus data location**, enter the Prometheus URI. + - Under **Authentication details**, select the appropriate authentication method from the dropdown list and enter the required details: + - **Basic authentication**: Enter a username and password. 
+ - **AWS Signature Version 4**: Specify the **Region**, select the OpenSearch service from the **Service Name** list (**Amazon OpenSearch Service** or **Amazon OpenSearch Serverless**), and enter the **Access Key** and **Secret Key**. + - Under **Query permissions**, choose the role needed to search and index data. If you select **Restricted**, an additional field will become available to configure the required role. + +3. Select **Review Configuration** > **Connect to Prometheus** to save your settings. The new connection will appear in the list of data sources. + +## Modify a data source connection + +To modify a data source connection, follow these steps: + +1. Select the desired connection from the list on the **Data sources** main page. This will open the **Connection Details** window. +2. Within the **Connection Details** window, edit the **Title** and **Description** fields. Select the **Save changes** button to apply the changes. +3. To update the **Authentication Method**, choose the method from the dropdown list and enter any necessary credentials. Select **Save changes** to apply the changes. + - To update the **Basic authentication** authentication method, select the **Update stored password** button. Within the pop-up window, enter the updated password and confirm it and select **Update stored password** to save the changes. To test the connection, select the **Test connection** button. + - To update the **AWS Signature Version 4** authentication method, select the **Update stored AWS credential** button. Within the pop-up window, enter the updated access and secret keys and select **Update stored AWS credential** to save the changes. To test the connection, select the **Test connection** button. + +## Delete a data source connection + +To delete the data source connection, select the {::nomarkdown}delete icon{:/} icon. 
+ +## Create an index pattern + +After creating a data source connection, the next step is to create an index pattern for that data source. For more information and a tutorial on index patterns, refer to [Index patterns]({{site.url}}{{site.baseurl}}/dashboards/management/index-patterns/). diff --git a/_dashboards/management/data-sources.md b/_dashboards/management/data-sources.md index fdd4edc150..62d3a5aab2 100644 --- a/_dashboards/management/data-sources.md +++ b/_dashboards/management/data-sources.md @@ -13,67 +13,37 @@ This documentation focuses on using the OpenSeach Dashboards interface to connec ## Prerequisites -The first step in connecting your data sources to OpenSearch is to install OpenSearch and OpenSearch Dashboards on your system. You can follow the installation instructions in the [OpenSearch documentation]({{site.url}}{{site.baseurl}}/install-and-configure/index/) to install these tools. +The first step in connecting your data sources to OpenSearch is to install OpenSearch and OpenSearch Dashboards on your system. Refer to the [installation instructions]({{site.url}}{{site.baseurl}}/install-and-configure/index/) for information. Once you have installed OpenSearch and OpenSearch Dashboards, you can use Dashboards to connect your data sources to OpenSearch and then use Dashboards to manage data sources, create index patterns based on those data sources, run queries against a specific data source, and combine visualizations in one dashboard. Configuration of the [YAML files]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/#configuration-file) and installation of the `dashboards-observability` and `opensearch-sql` plugins is necessary. For more information, see [OpenSearch plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/). 
-## Permissions - -To work with data sources in OpenSearch Dashboards, make sure that the user has been assigned the correct cluster-level [data source permission]({{site.url}}{{site.baseurl}}/security/access-control/permissions#data-source-permissions). - - - -## Create a data source connection - -A data source connection specifies the parameters needed to connect to a data source. These parameters form a connection string for the data source. Using Dashboards, you can add new data source connections or manage existing ones. - -The following steps guide you through the basics of creating a data source connection: - -1. From the OpenSearch Dashboards main menu, select **Management** > **Data sources** > **Create data source connection**. The UI is shown in the following image. +To securely store and encrypt data source connections in OpenSearch, you must add the following configuration to the `opensearch.yml` file on all the nodes: - Connecting a data source UI +`plugins.query.datasources.encryption.masterkey: "YOUR_GENERATED_MASTER_KEY_HERE"` -2. Create the data source connection by entering the appropriate information into the **Connection Details** and **Authentication Method** fields. - - - Under **Connection Details**, enter a title and endpoint URL. For this tutorial, use the URL `http://localhost:5601/app/management/opensearch-dashboards/dataSources`. Entering a description is optional. +The key must be 16, 24, or 32 characters. You can use the following command to generate a 24-character key: - - Under **Authentication Method**, select an authentication method from the dropdown list. Once an authentication method is selected, the applicable fields for that method appear. You can then enter the required details. The authentication method options are: - - **No authentication**: No authentication is used to connect to the data source. - - **Username & Password**: A basic username and password are used to connect to the data source. 
- - **AWS SigV4**: An AWS Signature Version 4 authenticating request is used to connect to the data source. AWS Signature Version 4 requires an access key and a secret key. - - For AWS Signature Version 4 authentication, first specify the **Region**. Next, select the OpenSearch service in the **Service Name** list. The options are **Amazon OpenSearch Service** and **Amazon OpenSearch Serverless**. Lastly, enter the **Access Key** and **Secret Key** for authorization. +`openssl rand -hex 12` - After you have populated the required fields, the **Test connection** and **Create data source** buttons become active. You can select **Test connection** to confirm that the connection is valid. +Generating 12 bytes results in a hexadecimal string that is 12 * 2 = 24 characters. +{: .note} -3. Select **Create data source** to save your settings. The connection is created. The active window returns to the **Data sources** main page, and the new connection appears in the list of data sources. - -4. To delete a data source connection, select the checkbox to the left of the data source **Title** and then select the **Delete 1 connection** button. Selecting multiple checkboxes for multiple connections is supported. An example UI is shown in the following image. - - Deleting a data source UI - -### Modify a data source connection - -To make changes to a data source connection, select a connection in the list on the **Data sources** main page. The **Connection Details** window opens. - -To make changes to **Connection Details**, edit one or both of the **Title** and **Description** fields and select **Save changes** in the lower-right corner of the screen. You can also cancel changes here. To change the **Authentication Method**, choose a different authentication method, enter your credentials (if applicable), and then select **Save changes** in the lower-right corner of the screen. The changes are saved. 
- -When **Username & Password** is the selected authentication method, you can update the password by choosing **Update stored password** next to the **Password** field. In the pop-up window, enter a new password in the first field and then enter it again in the second field to confirm. Select **Update stored password** in the pop-up window. The new password is saved. Select **Test connection** to confirm that the connection is valid. +## Permissions -When **AWS SigV4** is the selected authentication method, you can update the credentials by selecting **Update stored AWS credential**. In the pop-up window, enter a new access key in the first field and a new secret key in the second field. Select **Update stored AWS credential** in the pop-up window. The new credentials are saved. Select **Test connection** in the upper-right corner of the screen to confirm that the connection is valid. +To work with data sources in OpenSearch Dashboards, you must be assigned the correct cluster-level [data source permissions]({{site.url}}{{site.baseurl}}/security/access-control/permissions#data-source-permissions). -To delete the data source connection, select the delete icon ({::nomarkdown}delete icon{:/}). +## Types of data streams -## Create an index pattern +To configure data sources through OpenSearch Dashboards, go to **Management** > **Dashboards Management** > **Data sources**. This flow can be used for OpenSearch data stream connections. See [Configuring and using multiple data sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/). -Once you've created a data source connection, you can create an index pattern for the data source. An _index pattern_ is a template that OpenSearch uses to create indexes for data from the data source. See [Index patterns]({{site.url}}{{site.baseurl}}/dashboards/management/index-patterns/) for more information and a tutorial. 
+Alternatively, if you are running OpenSearch Dashboards 2.16 or later, go to **Management** > **Data sources**. This flow can be used to connect Amazon Simple Storage Service (Amazon S3) and Prometheus. See [Connecting Amazon S3 to OpenSearch]({{site.url}}{{site.baseurl}}/dashboards/management/S3-data-source/) and [Connecting Prometheus to OpenSearch]({{site.url}}{{site.baseurl}}/dashboards/management/connect-prometheus/) for more information. ## Next steps - Learn about [managing index patterns]({{site.url}}{{site.baseurl}}/dashboards/management/index-patterns/) through OpenSearch Dashboards. - Learn about [indexing data using Index Management]({{site.url}}{{site.baseurl}}/dashboards/im-dashboards/index/) through OpenSearch Dashboards. - Learn about how to connect [multiple data sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/). -- Learn about how to [connect OpenSearch and Amazon S3 through OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/management/S3-data-source/). -- Learn about the [Integrations]({{site.url}}{{site.baseurl}}/integrations/index/) tool, which gives you the flexibility to use various data ingestion methods and connect data from the Dashboards UI. - +- Learn about how to connect [OpenSearch and Amazon S3]({{site.url}}{{site.baseurl}}/dashboards/management/S3-data-source/) and [OpenSearch and Prometheus]({{site.url}}{{site.baseurl}}/dashboards/management/connect-prometheus/) using the OpenSearch Dashboards interface. +- Learn about the [Integrations]({{site.url}}{{site.baseurl}}/integrations/index/) plugin, which gives you the flexibility to use various data ingestion methods and connect data to OpenSearch Dashboards. 
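The master key arithmetic from the prerequisites above (12 random bytes produce a 24-character hexadecimal string, and the key must be 16, 24, or 32 characters) can be checked with a short script. The following is a minimal sketch that assumes Python's standard-library `secrets` module as a stand-in for the `openssl rand -hex 12` command; the helper name `generate_master_key` is hypothetical and is not part of OpenSearch.

```python
import secrets

# Key lengths stated in the prerequisites for the master key setting.
VALID_KEY_LENGTHS = (16, 24, 32)


def generate_master_key(length: int = 24) -> str:
    """Return a random hexadecimal string with the requested number of characters."""
    if length not in VALID_KEY_LENGTHS:
        raise ValueError(f"key length must be one of {VALID_KEY_LENGTHS}")
    # Each random byte is rendered as two hexadecimal characters.
    return secrets.token_hex(length // 2)


key = generate_master_key()
print(len(key))  # 24
```

The generated value would then be placed in `opensearch.yml` as the value of `plugins.query.datasources.encryption.masterkey` on every node, as described in the prerequisites.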
diff --git a/_field-types/index.md b/_field-types/index.md index 7a7e816ada..e9250f409d 100644 --- a/_field-types/index.md +++ b/_field-types/index.md @@ -12,43 +12,77 @@ redirect_from: # Mappings and field types -You can define how documents and their fields are stored and indexed by creating a _mapping_. The mapping specifies the list of fields for a document. Every field in the document has a _field type_, which defines the type of data the field contains. For example, you may want to specify that the `year` field should be of type `date`. To learn more, see [Supported field types]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/). +Mappings tell OpenSearch how to store and index your documents and their fields. You can specify the data type for each field (for example, `year` as `date`) to make storage and querying more efficient. -If you're just starting to build out your cluster and data, you may not know exactly how your data should be stored. In those cases, you can use dynamic mappings, which tell OpenSearch to dynamically add data and its fields. However, if you know exactly what types your data falls under and want to enforce that standard, then you can use explicit mappings. +While [dynamic mappings](#dynamic-mapping) automatically add new data and fields, using explicit mappings is recommended. Explicit mappings let you define the exact structure and data types upfront. This helps to maintain data consistency and optimize performance, especially for large datasets or high-volume indexing operations. -For example, if you want to indicate that `year` should be of type `text` instead of an `integer`, and `age` should be an `integer`, you can do so with explicit mappings. By using dynamic mapping, OpenSearch might interpret both `year` and `age` as integers. +For example, with explicit mappings, you can ensure that `year` is treated as text and `age` as an integer instead of both being interpreted as integers by dynamic mapping. 
-This section provides an example for how to create an index mapping and how to add a document to it that will get ip_range validated. - -#### Table of contents -1. TOC -{:toc} - - ---- ## Dynamic mapping When you index a document, OpenSearch adds fields automatically with dynamic mapping. You can also explicitly add fields to an index mapping. -#### Dynamic mapping types +### Dynamic mapping types Type | Description :--- | :--- -null | A `null` field can't be indexed or searched. When a field is set to null, OpenSearch behaves as if that field has no values. -boolean | OpenSearch accepts `true` and `false` as boolean values. An empty string is equal to `false.` -float | A single-precision 32-bit floating point number. -double | A double-precision 64-bit floating point number. -integer | A signed 32-bit number. -object | Objects are standard JSON objects, which can have fields and mappings of their own. For example, a `movies` object can have additional properties such as `title`, `year`, and `director`. -array | Arrays in OpenSearch can only store values of one type, such as an array of just integers or strings. Empty arrays are treated as though they are fields with no values. -text | A string sequence of characters that represent full-text values. -keyword | A string sequence of structured characters, such as an email address or ZIP code. +`null` | A `null` field can't be indexed or searched. When a field is set to null, OpenSearch behaves as if the field has no value. +`boolean` | OpenSearch accepts `true` and `false` as Boolean values. An empty string is equal to `false.` +`float` | A single-precision, 32-bit floating-point number. +`double` | A double-precision, 64-bit floating-point number. +`integer` | A signed 32-bit number. +`object` | Objects are standard JSON objects, which can have fields and mappings of their own. For example, a `movies` object can have additional properties such as `title`, `year`, and `director`. 
+`array` | OpenSearch does not have a specific array data type. Arrays are represented as a set of values of the same data type (for example, integers or strings) associated with a field. When indexing, you can pass multiple values for a field, and OpenSearch will treat it as an array. Empty arrays are valid and recognized as array fields with zero elements---not as fields with no values. OpenSearch supports querying and filtering arrays, including checking for values, range queries, and array operations like concatenation and intersection. Nested arrays, which may contain complex objects or other arrays, can also be used for advanced data modeling. +`text` | A string sequence of characters that represent full-text values. +`keyword` | A string sequence of structured characters, such as an email address or ZIP code. date detection string | Enabled by default, if new string fields match a date's format, then the string is processed as a `date` field. For example, `date: "2012/03/11"` is processed as a date. numeric detection string | If disabled, OpenSearch may automatically process numeric values as strings when they should be processed as numbers. When enabled, OpenSearch can process strings into `long`, `integer`, `short`, `byte`, `double`, `float`, `half_float`, `scaled_float`, and `unsigned_long`. Default is disabled. +### Dynamic templates + +Dynamic templates are used to define custom mappings for dynamically added fields based on the data type, field name, or field path. They allow you to define a flexible schema for your data that can automatically adapt to changes in the structure or format of the input data. 
+ +You can use the following syntax to define a dynamic mapping template: + +```json +PUT index +{ + "mappings": { + "dynamic_templates": [ + { + "fields": { + "mapping": { + "type": "short" + }, + "match_mapping_type": "string", + "path_match": "status*" + } + } + ] + } +} +``` +{% include copy-curl.html %} + +This mapping configuration dynamically maps any field with a name starting with `status` (for example, `status_code`) to the `short` data type if the initial value provided during indexing is a string. + +### Dynamic mapping parameters + +The `dynamic_templates` setting supports the following parameters for matching conditions and mapping rules. The default value is `null`. + +Parameter | Description | +----------|-------------| +`match_mapping_type` | Specifies the JSON data type (for example, string, long, double, object, binary, Boolean, date) that triggers the mapping. +`match` | A pattern used to match field names and apply the mapping. The pattern is treated as a simple wildcard pattern unless `match_pattern` is set to `regex`. +`unmatch` | A pattern used to exclude field names from the mapping. It is interpreted in the same way as `match`. +`match_pattern` | Determines the pattern matching behavior, either `regex` or `simple`. Default is `simple`. +`path_match` | Allows you to match nested field paths using a pattern (for example, `status*`). +`path_unmatch` | Excludes nested field paths from the mapping using a pattern. +`mapping` | The mapping configuration to apply. + ## Explicit mapping -If you know exactly what your field data types need to be, you can specify them in your request body when creating your index. 
+If you know exactly which field data types you need to use, then you can specify them in your request body when creating your index, as shown in the following example request: ```json PUT sample-index1 @@ -62,8 +96,9 @@ PUT sample-index1 } } ``` +{% include copy-curl.html %} -### Response +#### Response ```json { "acknowledged": true, @@ -71,8 +106,9 @@ PUT sample-index1 "index": "sample-index1" } ``` +{% include copy-curl.html %} -To add mappings to an existing index or data stream, you can send a request to the `_mapping` endpoint using the `PUT` or `POST` HTTP method: +To add mappings to an existing index or data stream, you can send a request to the `_mapping` endpoint using the `PUT` or `POST` HTTP method, as shown in the following example request: ```json POST sample-index1/_mapping @@ -84,84 +120,29 @@ POST sample-index1/_mapping } } ``` +{% include copy-curl.html %} You cannot change the mapping of an existing field, you can only modify the field's mapping parameters. {: .note} ---- -## Mapping example usage +## Mapping parameters -The following example shows how to create a mapping to specify that OpenSearch should ignore any documents with malformed IP addresses that do not conform to the [`ip`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/ip/) data type. You accomplish this by setting the `ignore_malformed` parameter to `true`. +Mapping parameters are used to configure the behavior of index fields. See [Mappings and field types]({{site.url}}{{site.baseurl}}/field-types/) for more information. -### Create an index with an `ip` mapping +## Mapping limit settings -To create an index, use a PUT request: +OpenSearch has certain mapping limits and settings, such as the settings listed in the following table. Settings can be configured based on your requirements. 
-```json -PUT /test-index -{ - "mappings" : { - "properties" : { - "ip_address" : { - "type" : "ip", - "ignore_malformed": true - } - } - } -} -``` - -You can add a document that has a malformed IP address to your index: - -```json -PUT /test-index/_doc/1 -{ - "ip_address" : "malformed ip address" -} -``` - -This indexed IP address does not throw an error because `ignore_malformed` is set to true. - -You can query the index using the following request: - -```json -GET /test-index/_search -``` +| Setting | Default value | Allowed value | Type | Description | +|-|-|-|-|-| +| `index.mapping.nested_fields.limit` | 50 | [0,) | Dynamic | Limits the maximum number of nested fields that can be defined in an index mapping. | +| `index.mapping.nested_objects.limit` | 10,000 | [0,) | Dynamic | Limits the maximum number of nested objects that can be created in a single document. | +| `index.mapping.total_fields.limit` | 1,000 | [0,) | Dynamic | Limits the maximum number of fields that can be defined in an index mapping. | +| `index.mapping.depth.limit` | 20 | [1,100] | Dynamic | Limits the maximum depth of nested objects and nested fields that can be defined in an index mapping. | +| `index.mapping.field_name_length.limit` | 50,000 | [1,50000] | Dynamic | Limits the maximum length of field names that can be defined in an index mapping. | +| `index.mapper.dynamic` | true | {true,false} | Dynamic | Determines whether new fields should be dynamically added to a mapping. 
| -The response shows that the `ip_address` field is ignored in the indexed document: - -```json -{ - "took": 14, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 1, - "hits": [ - { - "_index": "test-index", - "_id": "1", - "_score": 1, - "_ignored": [ - "ip_address" - ], - "_source": { - "ip_address": "malformed ip address" - } - } - ] - } -} -``` +--- ## Get a mapping @@ -170,14 +151,16 @@ To get all mappings for one or more indexes, use the following request: ```json GET /_mapping ``` +{% include copy-curl.html %} -In the above request, `` may be an index name or a comma-separated list of index names. +In the previous request, `` may be an index name or a comma-separated list of index names. To get all mappings for all indexes, use the following request: ```json GET _mapping ``` +{% include copy-curl.html %} To get a mapping for a specific field, provide the index name and the field name: @@ -185,14 +168,14 @@ To get a mapping for a specific field, provide the index name and the field name GET _mapping/field/ GET //_mapping/field/ ``` +{% include copy-curl.html %} -Both `` and `` can be specified as one value or a comma-separated list. - -For example, the following request retrieves the mapping for the `year` and `age` fields in `sample-index1`: +Both `` and `` can be specified as either one value or a comma-separated list. 
For example, the following request retrieves the mapping for the `year` and `age` fields in `sample-index1`: ```json GET sample-index1/_mapping/field/year,age ``` +{% include copy-curl.html %} The response contains the specified fields: @@ -220,3 +203,8 @@ The response contains the specified fields: } } ``` +{% include copy-curl.html %} + +## Mappings use cases + +See [Mappings use cases]({{site.url}}{{site.baseurl}}/field-types/mappings-use-cases/) for use case examples, including examples of mapping string fields and ignoring malformed IP addresses. diff --git a/_field-types/mappings-use-cases.md b/_field-types/mappings-use-cases.md new file mode 100644 index 0000000000..835e030bab --- /dev/null +++ b/_field-types/mappings-use-cases.md @@ -0,0 +1,122 @@ +--- +layout: default +title: Mappings use cases +parent: Mappings and field types +nav_order: 5 +nav_exclude: true +--- + +# Mappings use cases + +Mappings provide control over how data is indexed and queried, enabling optimized performance and efficient storage for a range of use cases. + +--- + +## Example: Ignoring malformed IP addresses + +The following example shows you how to create a mapping specifying that OpenSearch should ignore malformed IP addresses that do not conform to the [`ip`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/ip/) data type instead of rejecting the documents that contain them. You can accomplish this by setting the `ignore_malformed` parameter to `true`. + +### Create an index with an `ip` mapping + +To create an index with an `ip` mapping, use a PUT request: + +```json +PUT /test-index +{ + "mappings" : { + "properties" : { + "ip_address" : { + "type" : "ip", + "ignore_malformed": true + } + } + } +} +``` +{% include copy-curl.html %} + +Then add a document with a malformed IP address: + +```json +PUT /test-index/_doc/1 +{ + "ip_address" : "malformed ip address" +} +``` +{% include copy-curl.html %} + +When you query the index, the `ip_address` field will be ignored. 
You can query the index using the following request: + +```json +GET /test-index/_search +``` +{% include copy-curl.html %} + +#### Response + +```json +{ + "took": 14, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 1, + "hits": [ + { + "_index": "test-index", + "_id": "1", + "_score": 1, + "_ignored": [ + "ip_address" + ], + "_source": { + "ip_address": "malformed ip address" + } + } + ] + } +} +``` +{% include copy-curl.html %} + +--- + +## Mapping string fields to `text` and `keyword` types + +To create an index named `movies1` with a dynamic template that maps all string fields to both the `text` and `keyword` types, you can use the following request: + +```json +PUT movies1 +{ + "mappings": { + "dynamic_templates": [ + { + "strings": { + "match_mapping_type": "string", + "mapping": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + } + ] + } +} +``` +{% include copy-curl.html %} + +This dynamic template ensures that any string fields in your documents will be indexed as both a full-text `text` type and a `keyword` type. diff --git a/_field-types/metadata-fields/field-names.md b/_field-types/metadata-fields/field-names.md new file mode 100644 index 0000000000..b17e94fbb4 --- /dev/null +++ b/_field-types/metadata-fields/field-names.md @@ -0,0 +1,43 @@ +--- +layout: default +title: Field names +nav_order: 10 +parent: Metadata fields +--- + +# Field names + +The `_field_names` field indexes field names that contain non-null values. This enables the use of the `exists` query, which can identify documents that either have or do not have non-null values for a specified field. + +However, `_field_names` only indexes field names when both `doc_values` and `norms` are disabled. 
If either `doc_values` or `norms` is enabled, then the `exists` query still functions but will not rely on the `_field_names` field. + +## Mapping example + +```json +{ + "mappings": { + "_field_names": { + "enabled": "true" + }, + "properties": { + "title": { + "type": "text", + "doc_values": false, + "norms": false + }, + "description": { + "type": "text", + "doc_values": true, + "norms": false + }, + "price": { + "type": "float", + "doc_values": false, + "norms": true + } + } + } +} +``` +{% include copy-curl.html %} diff --git a/_field-types/metadata-fields/id.md b/_field-types/metadata-fields/id.md new file mode 100644 index 0000000000..f66f4b8e13 --- /dev/null +++ b/_field-types/metadata-fields/id.md @@ -0,0 +1,86 @@ +--- +layout: default +title: ID +nav_order: 20 +parent: Metadata fields +--- + +# ID + +Each document in OpenSearch has a unique `_id` field. This field is indexed, allowing you to retrieve documents using the GET API or the [`ids` query]({{site.url}}{{site.baseurl}}/query-dsl/term/ids/). + +If you do not provide an `_id` value, then OpenSearch automatically generates one for the document. 
+{: .note} + +The following example request creates an index named `test-index1` and adds two documents with different `_id` values: + +```json +PUT test-index1/_doc/1 +{ + "text": "Document with ID 1" +} + +PUT test-index1/_doc/2?refresh=true +{ + "text": "Document with ID 2" +} +``` +{% include copy-curl.html %} + +You can then query the documents using the `_id` field, as shown in the following example request: + +```json +GET test-index1/_search +{ + "query": { + "terms": { + "_id": ["1", "2"] + } + } +} +``` +{% include copy-curl.html %} + +The response returns both documents with `_id` values of `1` and `2`: + +```json +{ + "took": 10, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 2, + "relation": "eq" + }, + "max_score": 1, + "hits": [ + { + "_index": "test-index1", + "_id": "1", + "_score": 1, + "_source": { + "text": "Document with ID 1" + } + }, + { + "_index": "test-index1", + "_id": "2", + "_score": 1, + "_source": { + "text": "Document with ID 2" + } + } + ] + } +} +``` +{% include copy-curl.html %} + +## Limitations of the `_id` field + +While the `_id` field can be used in various queries, it is restricted from use in aggregations, sorting, and scripting. If you need to sort or aggregate on the `_id` field, it is recommended to duplicate the `_id` content into another field with `doc_values` enabled. Refer to [IDs query]({{site.url}}{{site.baseurl}}/query-dsl/term/ids/) for an example. diff --git a/_field-types/metadata-fields/ignored.md b/_field-types/metadata-fields/ignored.md new file mode 100644 index 0000000000..e867cfc754 --- /dev/null +++ b/_field-types/metadata-fields/ignored.md @@ -0,0 +1,147 @@ +--- +layout: default +title: Ignored +nav_order: 25 +parent: Metadata fields +--- + +# Ignored + +The `_ignored` field helps you manage issues related to malformed data in your documents. 
This field is used to index and store field names that were ignored during the indexing process as a result of the `ignore_malformed` setting being enabled in the [index mapping]({{site.url}}{{site.baseurl}}/field-types/). + +The `_ignored` field allows you to find documents containing ignored fields and to identify the specific field names that were ignored. This can be useful for troubleshooting. + +You can query the `_ignored` field using the `term`, `terms`, and `exists` queries, and the results will be included in the search hits. + +The `_ignored` field is only populated when the `ignore_malformed` setting is enabled in your index mapping. If `ignore_malformed` is set to `false` (the default value), then malformed fields will cause the entire document to be rejected, and the `_ignored` field will not be populated. +{: .note} + +The following example request shows you how to use the `_ignored` field: + +```json +GET _search +{ + "query": { + "exists": { + "field": "_ignored" + } + } +} +``` +{% include copy-curl.html %} + +--- + +#### Example indexing request with the `_ignored` field + +The following example requests create the `test-ignored` index with `ignore_malformed` set to `true` for the `length` field, add a document containing a malformed `length` value so that no error is thrown during indexing, and then search for documents containing ignored fields: + +```json +PUT test-ignored +{ + "mappings": { + "properties": { + "title": { + "type": "text" + }, + "length": { + "type": "long", + "ignore_malformed": true + } + } + } +} + +POST test-ignored/_doc +{ + "title": "correct text", + "length": "not a number" +} + +GET test-ignored/_search +{ + "query": { + "exists": { + "field": "_ignored" + } + } +} +``` +{% include copy-curl.html %} + +#### Example response + +```json +{ + "took": 42, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 1, + "hits": [ + { + "_index": "test-ignored", + "_id": "qcf0wZABpEYH7Rw9OT7F", 
+ "_score": 1, + "_ignored": [ + "length" + ], + "_source": { + "title": "correct text", + "length": "not a number" + } + } + ] + } +} +``` + +--- + +## Ignoring a specified field + +You can use a `term` query to find documents in which a specific field was ignored, as shown in the following example request: + +```json +GET _search +{ + "query": { + "term": { + "_ignored": "created_at" + } + } +} +``` +{% include copy-curl.html %} + +#### Response + +```json +{ + "took": 51, + "timed_out": false, + "_shards": { + "total": 45, + "successful": 45, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 0, + "relation": "eq" + }, + "max_score": null, + "hits": [] + } +} +``` diff --git a/_field-types/metadata-fields/index-metadata.md b/_field-types/metadata-fields/index-metadata.md new file mode 100644 index 0000000000..657f7d62a5 --- /dev/null +++ b/_field-types/metadata-fields/index-metadata.md @@ -0,0 +1,86 @@ +--- +layout: default +title: Index +nav_order: 25 +parent: Metadata fields +--- + +# Index + +When querying across multiple indexes, you may need to filter results based on the index into which a document was indexed. The `_index` field matches documents based on their index. 
+ +The following example request creates two indexes, `products` and `customers`, and adds a document to each index: + +```json +PUT products/_doc/1 +{ + "name": "Widget X" +} + +PUT customers/_doc/2 +{ + "name": "John Doe" +} +``` +{% include copy-curl.html %} + +You can then query both indexes and filter the results using the `_index` field, as shown in the following example request: + +```json +GET products,customers/_search +{ + "query": { + "terms": { + "_index": ["products", "customers"] + } + }, + "aggs": { + "index_groups": { + "terms": { + "field": "_index", + "size": 10 + } + } + }, + "sort": [ + { + "_index": { + "order": "desc" + } + } + ], + "script_fields": { + "index_name": { + "script": { + "lang": "painless", + "source": "doc['_index'].value" + } + } + } +} +``` +{% include copy-curl.html %} + +In this example: + +- The `query` section uses a `terms` query to match documents from the `products` and `customers` indexes. +- The `aggs` section performs a `terms` aggregation on the `_index` field, grouping the results by index. +- The `sort` section sorts the results by the `_index` field in descending order. +- The `script_fields` section adds a new field called `index_name` to the search results containing the `_index` field value for each document. + +## Querying on the `_index` field + +The `_index` field represents the index into which a document was indexed. You can use this field in your queries to filter, aggregate, sort, or retrieve index information for your search results. + +Because the `_index` field is automatically added to every document, you can use it in your queries like any other field. For example, you can use the `terms` query to match documents from multiple indexes. 
The following example query returns all documents from the `products` and `customers` indexes: + +```json + { + "query": { + "terms": { + "_index": ["products", "customers"] + } + } +} +``` +{% include copy-curl.html %} diff --git a/_field-types/metadata-fields/index.md b/_field-types/metadata-fields/index.md new file mode 100644 index 0000000000..cdc079e1e5 --- /dev/null +++ b/_field-types/metadata-fields/index.md @@ -0,0 +1,21 @@ +--- +layout: default +title: Metadata fields +nav_order: 90 +has_children: true +has_toc: false +--- + +# Metadata fields + +OpenSearch provides built-in metadata fields that allow you to access information about the documents in an index. These fields can be used in your queries as needed. + +Metadata field | Description +:--- | :--- +`_field_names` | The document fields with non-empty or non-null values. +`_ignored` | The document fields that were ignored during the indexing process due to the presence of malformed data, as specified by the `ignore_malformed` setting. +`_id` | The unique identifier assigned to each document. +`_index` | The index in which the document is stored. +`_meta` | Stores custom metadata or additional information specific to the application or use case. +`_routing` | Allows you to specify a custom value that determines the shard assignment for a document in an OpenSearch cluster. +`_source` | Contains the original JSON representation of the document data. diff --git a/_field-types/metadata-fields/meta.md b/_field-types/metadata-fields/meta.md new file mode 100644 index 0000000000..220d58f106 --- /dev/null +++ b/_field-types/metadata-fields/meta.md @@ -0,0 +1,87 @@ +--- +layout: default +title: Meta +nav_order: 30 +parent: Metadata fields +--- + +# Meta + +The `_meta` field is a mapping property that allows you to attach custom metadata to your index mappings. This metadata can be used by your application to store information relevant to your use case, such as versioning, ownership, categorization, or auditing. 
+ +## Usage + +You can define the `_meta` field when creating a new index or updating an existing index's mapping, as shown in the following example request: + +```json +PUT my-index +{ + "mappings": { + "_meta": { + "application": "MyApp", + "version": "1.2.3", + "author": "John Doe" + }, + "properties": { + "title": { + "type": "text" + }, + "description": { + "type": "text" + } + } + } +} + +``` +{% include copy-curl.html %} + +In this example, three custom metadata fields are added: `application`, `version`, and `author`. These fields can be used by your application to store any relevant information about the index, such as the application it belongs to, the application version, or the author of the index. + +You can update the `_meta` field using the [Put Mapping API]({{site.url}}{{site.baseurl}}/api-reference/index-apis/put-mapping/) operation, as shown in the following example request: + +```json +PUT my-index/_mapping +{ + "_meta": { + "application": "MyApp", + "version": "1.3.0", + "author": "Jane Smith" + } +} +``` +{% include copy-curl.html %} + +## Retrieving `meta` information + +You can retrieve the `_meta` information for an index using the [Get Mapping API]({{site.url}}{{site.baseurl}}/field-types/#get-a-mapping) operation, as shown in the following example request: + +```json +GET my-index/_mapping +``` +{% include copy-curl.html %} + +The response returns the full index mapping, including the `_meta` field: + +```json +{ + "my-index": { + "mappings": { + "_meta": { + "application": "MyApp", + "version": "1.3.0", + "author": "Jane Smith" + }, + "properties": { + "description": { + "type": "text" + }, + "title": { + "type": "text" + } + } + } + } +} +``` +{% include copy-curl.html %} diff --git a/_field-types/metadata-fields/routing.md b/_field-types/metadata-fields/routing.md new file mode 100644 index 0000000000..9064e20c49 --- /dev/null +++ b/_field-types/metadata-fields/routing.md @@ -0,0 +1,92 @@ +--- +layout: default +title: Routing 
+nav_order: 35 +parent: Metadata fields +--- + +# Routing + +OpenSearch uses a hashing algorithm to route documents to specific shards in an index. By default, the document's `_id` field is used as the routing value, but you can also specify a custom routing value for each document. + +## Default routing + +The following is the default OpenSearch routing formula. The `_routing` value is the document's `_id`. + +```json +shard_num = hash(_routing) % num_primary_shards +``` + +## Custom routing + +You can specify a custom routing value when indexing a document, as shown in the following example request: + +```json +PUT sample-index1/_doc/1?routing=JohnDoe1 +{ + "title": "This is a document" +} +``` +{% include copy-curl.html %} + +In this example, the document is routed using the value `JohnDoe1` instead of the default `_id`. + +You must provide the same routing value when retrieving, deleting, or updating the document, as shown in the following example request: + +```json +GET sample-index1/_doc/1?routing=JohnDoe1 +``` +{% include copy-curl.html %} + +## Querying by routing + +You can query documents based on their routing value by using the `_routing` field, as shown in the following example. This query only searches the shard(s) associated with the `JohnDoe1` routing value: + +```json +GET sample-index1/_search +{ + "query": { + "terms": { + "_routing": [ "JohnDoe1" ] + } + } +} +``` +{% include copy-curl.html %} + +## Required routing + +You can make custom routing required for all CRUD operations on an index, as shown in the following example request. If you try to index a document without providing a routing value, OpenSearch will throw an exception. + +```json +PUT sample-index2 +{ + "mappings": { + "_routing": { + "required": true + } + } +} +``` +{% include copy-curl.html %} + +## Routing to specific shards + +You can configure an index to route custom values to a subset of shards rather than a single shard. 
This is done by setting `index.routing_partition_size` at the time of index creation. The formula for calculating the shard is `shard_num = (hash(_routing) + hash(_id) % routing_partition_size) % num_primary_shards`. + +The following example request routes documents to one of four shards in the index: + +```json +PUT sample-index3 +{ + "settings": { + "index.routing_partition_size": 4 + }, + "mappings": { + "_routing": { + "required": true + } + } +} +``` +{% include copy-curl.html %} diff --git a/_field-types/metadata-fields/source.md b/_field-types/metadata-fields/source.md new file mode 100644 index 0000000000..c9e714f43c --- /dev/null +++ b/_field-types/metadata-fields/source.md @@ -0,0 +1,54 @@ +--- +layout: default +title: Source +nav_order: 40 +parent: Metadata fields +--- + +# Source + +The `_source` field contains the original JSON document body that was indexed. While this field is not searchable, it is stored so that the full document can be returned when executing fetch requests, such as `get` and `search`. + +## Disabling the field + +You can disable the `_source` field by setting the `enabled` parameter to `false`, as shown in the following example request: + +```json +PUT sample-index1 +{ + "mappings": { + "_source": { + "enabled": false + } + } +} +``` +{% include copy-curl.html %} + +Disabling the `_source` field can impact the availability of certain features, such as the `update`, `update_by_query`, and `reindex` APIs, as well as the ability to debug queries or aggregations using the original indexed document. +{: .warning} + +## Including or excluding fields + +You can selectively control the contents of the `_source` field by using the `includes` and `excludes` parameters. 
This allows you to prune the stored `_source` field after it is indexed but before it is saved, as shown in the following example request: + +```json +PUT logs +{ + "mappings": { + "_source": { + "includes": [ + "*.count", + "meta.*" + ], + "excludes": [ + "meta.description", + "meta.other.*" + ] + } + } +} +``` +{% include copy-curl.html %} + +These fields are not stored in the `_source`, but you can still search them because the data remains indexed. diff --git a/_getting-started/intro.md b/_getting-started/intro.md index edd178a23f..f5eb24ba2b 100644 --- a/_getting-started/intro.md +++ b/_getting-started/intro.md @@ -56,6 +56,7 @@ ID | Name | GPA | Graduation year 1 | John Doe | 3.89 | 2022 2 | Jonathan Powers | 3.85 | 2025 3 | Jane Doe | 3.52 | 2024 +... | | | ## Clusters and nodes diff --git a/_ml-commons-plugin/index.md b/_ml-commons-plugin/index.md index f0355b6be3..50d637379e 100644 --- a/_ml-commons-plugin/index.md +++ b/_ml-commons-plugin/index.md @@ -32,6 +32,10 @@ ML Commons supports various algorithms to help train ML models and make predicti ML Commons provides its own set of REST APIs. For more information, see [ML Commons API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/index/). +## ML-powered search + +For information about available ML-powered search types, see [ML-powered search]({{site.url}}{{site.baseurl}}/search-plugins/index/#ml-powered-search). + ## Tutorials Using the OpenSearch ML framework, you can build various applications, from implementing conversational search to building your own chatbot. For more information, see [Tutorials]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/index/). 
\ No newline at end of file diff --git a/_query-dsl/joining/index.md b/_query-dsl/joining/index.md new file mode 100644 index 0000000000..20f48c0b16 --- /dev/null +++ b/_query-dsl/joining/index.md @@ -0,0 +1,18 @@ +--- +layout: default +title: Joining queries +has_children: true +nav_order: 55 +--- + +# Joining queries + +OpenSearch is a distributed system in which data is spread across multiple nodes. Thus, running a SQL-like JOIN operation in OpenSearch is resource intensive. As an alternative, OpenSearch provides the following queries that perform join operations and are optimized for scaling across multiple nodes: + +- `nested` queries: Act as wrappers for other queries to search [nested]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/nested/) fields. The nested field objects are searched as though they were indexed as separate documents. +- `has_child` queries: Search for parent documents whose child documents match the query. +- `has_parent` queries: Search for child documents whose parent documents match the query. +- `parent_id` queries: A [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/) field type establishes a parent/child relationship between documents in the same index. `parent_id` queries search for child documents that are joined to a specific parent document. + +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then joining queries are not executed. +{: .important} \ No newline at end of file diff --git a/_query-dsl/term/fuzzy.md b/_query-dsl/term/fuzzy.md index 7a426fd794..0e448bbbda 100644 --- a/_query-dsl/term/fuzzy.md +++ b/_query-dsl/term/fuzzy.md @@ -89,5 +89,5 @@ Parameter | Data type | Description Specifying a large value in `max_expansions` can lead to poor performance, especially if `prefix_length` is set to `0`, because of the large number of variations of the word that OpenSearch tries to match. 
{: .warning} -If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, fuzzy queries are not run. +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then fuzzy queries are not executed. {: .important} diff --git a/_query-dsl/term/prefix.md b/_query-dsl/term/prefix.md index 2a429c9f0e..087c26cc30 100644 --- a/_query-dsl/term/prefix.md +++ b/_query-dsl/term/prefix.md @@ -66,5 +66,5 @@ Parameter | Data type | Description `case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping). `rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`. -If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, prefix queries are not run. If `index_prefixes` is enabled, the `search.allow_expensive_queries` setting is ignored and an optimized query is built and run. +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then prefix queries are not executed. If `index_prefixes` is enabled, then the `search.allow_expensive_queries` setting is ignored and an optimized query is built and run. {: .important} diff --git a/_query-dsl/term/range.md b/_query-dsl/term/range.md index ceb264db76..1e6fece218 100644 --- a/_query-dsl/term/range.md +++ b/_query-dsl/term/range.md @@ -217,5 +217,5 @@ Parameter | Data type | Description `boost` | Floating-point | A floating-point value that specifies the weight of this field toward the relevance score. Values above 1.0 increase the field’s relevance. 
Values between 0.0 and 1.0 decrease the field’s relevance. Default is 1.0. `time_zone` | String | The time zone used to convert [`date`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/) values to UTC in the query. Valid values are ISO 8601 [UTC offsets](https://en.wikipedia.org/wiki/List_of_UTC_offsets) and [IANA time zone IDs](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). For more information, see [Time zone](#time-zone). -If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, range queries on [`text`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) and [`keyword`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) fields are not run. +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then range queries on [`text`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) and [`keyword`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) fields are not executed. {: .important} diff --git a/_query-dsl/term/regexp.md b/_query-dsl/term/regexp.md index 4a038729c0..34a0c916ce 100644 --- a/_query-dsl/term/regexp.md +++ b/_query-dsl/term/regexp.md @@ -61,5 +61,5 @@ Parameter | Data type | Description `max_determinized_states` | Integer | Lucene converts a regular expression to an automaton with a number of determinized states. This parameter specifies the maximum number of automaton states the query requires. Use this parameter to prevent high resource consumption. To run complex regular expressions, you may need to increase the value of this parameter. Default is 10,000. `rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. 
Default is `constant_score`. -If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, `regexp` queries are not run. +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then `regexp` queries are not executed. {: .important} diff --git a/_query-dsl/term/wildcard.md b/_query-dsl/term/wildcard.md index b2d7238758..c6e0499517 100644 --- a/_query-dsl/term/wildcard.md +++ b/_query-dsl/term/wildcard.md @@ -64,5 +64,5 @@ Parameter | Data type | Description `case_insensitive` | Boolean | If `true`, allows case-insensitive matching of the value with the indexed field values. Default is `false` (case sensitivity is determined by the field's mapping). `rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`. -If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, wildcard queries are not run. +If [`search.allow_expensive_queries`]({{site.url}}{{site.baseurl}}/query-dsl/index/#expensive-queries) is set to `false`, then wildcard queries are not executed. {: .important} diff --git a/_search-plugins/index.md b/_search-plugins/index.md index 79e0e715d0..3604245f11 100644 --- a/_search-plugins/index.md +++ b/_search-plugins/index.md @@ -16,29 +16,31 @@ OpenSearch provides many features for customizing your search use cases and impr ## Search methods -OpenSearch supports the following search methods: +OpenSearch supports the following search methods. -- **Traditional lexical search** +### Traditional lexical search - - [Keyword (BM25) search]({{site.url}}{{site.baseurl}}/search-plugins/keyword-search/): Searches the document corpus for words that appear in the query. 
+OpenSearch supports [keyword (BM25) search]({{site.url}}{{site.baseurl}}/search-plugins/keyword-search/), which searches the document corpus for words that appear in the query. -- **Machine learning (ML)-powered search** +### ML-powered search - - **Vector search** +OpenSearch supports the following machine learning (ML)-powered search methods: - - [k-NN search]({{site.url}}{{site.baseurl}}/search-plugins/knn/): Searches for k-nearest neighbors to a search term across an index of vectors. +- **Vector search** - - **Neural search**: [Neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) facilitates generating vector embeddings at ingestion time and searching them at search time. Neural search lets you integrate ML models into your search and serves as a framework for implementing other search methods. The following search methods are built on top of neural search: + - [k-NN search]({{site.url}}{{site.baseurl}}/search-plugins/knn/): Searches for the k-nearest neighbors to a search term across an index of vectors. - - [Semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): Considers the meaning of the words in the search context. Uses dense retrieval based on text embedding models to search text data. +- **Neural search**: [Neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) facilitates generating vector embeddings at ingestion time and searching them at search time. Neural search lets you integrate ML models into your search and serves as a framework for implementing other search methods. The following search methods are built on top of neural search: - - [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/): Uses multimodal embedding models to search text and image data. + - [Semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): Considers the meaning of the words in the search context. 
Uses dense retrieval based on text embedding models to search text data. - - [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Uses sparse retrieval based on sparse embedding models to search text data. + - [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/): Uses multimodal embedding models to search text and image data. - - [Hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): Combines traditional search and vector search to improve search relevance. + - [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Uses sparse retrieval based on sparse embedding models to search text data. - - [Conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/): Implements a retrieval-augmented generative search. + - [Hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): Combines traditional search and vector search to improve search relevance. + + - [Conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/): Implements a retrieval-augmented generative search. ## Query languages diff --git a/_security/configuration/yaml.md b/_security/configuration/yaml.md index 3aabce53d5..4bcb8b0460 100644 --- a/_security/configuration/yaml.md +++ b/_security/configuration/yaml.md @@ -15,6 +15,80 @@ Before running [`securityadmin.sh`]({{site.url}}{{site.baseurl}}/security/config The approach we recommend for using the YAML files is to first configure [reserved and hidden resources]({{site.url}}{{site.baseurl}}/security/access-control/api#reserved-and-hidden-resources), such as the `admin` and `kibanaserver` users. Thereafter you can create other users, roles, mappings, action groups, and tenants using OpenSearch Dashboards or the REST API. +## action_groups.yml + +This file contains any initial action groups that you want to add to the Security plugin. 
+ +Aside from some metadata, the default file is empty, because the Security plugin has a number of static action groups that it adds automatically. These static action groups cover a wide variety of use cases and are a great way to get started with the plugin. + +```yml +--- +my-action-group: + reserved: false + hidden: false + allowed_actions: + - "indices:data/write/index*" + - "indices:data/write/update*" + - "indices:admin/mapping/put" + - "indices:data/write/bulk*" + - "read" + - "write" + static: false +_meta: + type: "actiongroups" + config_version: 2 +``` + +## allowlist.yml + +You can use `allowlist.yml` to add any endpoints and HTTP requests to a list of allowed endpoints and requests. If enabled, all users except the super admin are allowed access to only the specified endpoints and HTTP requests, and all other HTTP requests associated with the endpoint are denied. For example, if GET `_cluster/settings` is added to the allow list, users cannot submit PUT requests to `_cluster/settings` to update cluster settings. + +Note that while you can configure access to endpoints this way, for most cases, it is still best to configure permissions using the Security plugin's users and roles, which have more granular settings. + +```yml +--- +_meta: + type: "allowlist" + config_version: 2 + +# Description: +# enabled - feature flag. +# if enabled is false, all endpoints are accessible. +# if enabled is true, all users except the SuperAdmin can only submit the allowed requests to the specified endpoints. +# SuperAdmin can access all APIs. +# SuperAdmin is defined by the SuperAdmin certificate, which is configured with the opensearch.yml setting plugins.security.authcz.admin_dn: +# Refer to the example setting in opensearch.yml to learn more about configuring SuperAdmin. 
+# +# requests - map of allow listed endpoints and HTTP requests + +#this name must be config +config: + enabled: true + requests: + /_cluster/settings: + - GET + /_cat/nodes: + - GET +``` + +To enable PUT requests to cluster settings, add PUT to the list of allowed operations under `/_cluster/settings`. + +```yml +requests: + /_cluster/settings: + - GET + - PUT +``` + +You can also add custom indexes to the allow list. `allowlist.yml` doesn't support wildcards, so you must manually specify all of the indexes you want to add. + +```yml +requests: # Only allow GET requests to /sample-index1/_doc/1 and /sample-index2/_doc/1 + /sample-index1/_doc/1: + - GET + /sample-index2/_doc/1: + - GET +``` ## internal_users.yml @@ -92,196 +166,24 @@ snapshotrestore: description: "Demo snapshotrestore user" ``` -## opensearch.yml - -In addition to many OpenSearch settings, this file contains paths to TLS certificates and their attributes, such as distinguished names and trusted certificate authorities. 
- -```yml -plugins.security.ssl.transport.pemcert_filepath: esnode.pem -plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem -plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem -plugins.security.ssl.transport.enforce_hostname_verification: false -plugins.security.ssl.http.enabled: true -plugins.security.ssl.http.pemcert_filepath: esnode.pem -plugins.security.ssl.http.pemkey_filepath: esnode-key.pem -plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem -plugins.security.allow_unsafe_democertificates: true -plugins.security.allow_default_init_securityindex: true -plugins.security.authcz.admin_dn: - - CN=kirk,OU=client,O=client,L=test, C=de - -plugins.security.audit.type: internal_opensearch -plugins.security.enable_snapshot_restore_privilege: true -plugins.security.check_snapshot_restore_write_privileges: true -plugins.security.cache.ttl_minutes: 60 -plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"] -plugins.security.system_indices.enabled: true -plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opendistro-asynchronous-search-response*"] -node.max_local_storage_nodes: 3 -``` - -For a full list of `opensearch.yml` Security plugin settings, see [Security settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/security-settings/). -{: .note} - -### Refining your configuration - -The `plugins.security.allow_default_init_securityindex` setting, when set to `true`, sets the Security plugin to its default security settings if an attempt to create the security index fails when OpenSearch launches. 
Default security settings are stored in YAML files contained in the `opensearch-project/security/config` directory. By default, this setting is `false`. - -```yml -plugins.security.allow_default_init_securityindex: true -``` - -An authentication cache for the Security plugin exists to help speed up authentication by temporarily storing user objects returned from the backend so that the Security plugin is not required to make repeated requests for them. To determine how long it takes for caching to time out, you can use the `plugins.security.cache.ttl_minutes` property to set a value in minutes. The default is `60`. You can disable caching by setting the value to `0`. - -```yml -plugins.security.cache.ttl_minutes: 60 -``` - -### Enabling user access to system indexes - -Mapping a system index permission to a user allows that user to modify the system index specified in the permission's name (the one exception is the Security plugin's [system index]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/)). The `plugins.security.system_indices.permission.enabled` setting provides a way for administrators to make this permission available for or hidden from role mapping. - -When set to `true`, the feature is enabled and users with permission to modify roles can create roles that include permissions that grant access to system indexes: - -```yml -plugins.security.system_indices.permission.enabled: true -``` - -When set to `false`, the permission is disabled and only admins with an admin certificate can make changes to system indexes. By default, the permission is set to `false` in a new cluster. - -To learn more about system index permissions, see [System index permissions]({{site.url}}{{site.baseurl}}/security/access-control/permissions/#system-index-permissions). - - -### Password settings - -If you want to run your users' passwords against some validation, specify a regular expression (regex) in this file. 
You can also include an error message that loads when passwords don't pass validation. The following example demonstrates how to include a regex so OpenSearch requires new passwords to be a minimum of eight characters with at least one uppercase, one lowercase, one digit, and one special character. - -Note that OpenSearch validates only users and passwords created through OpenSearch Dashboards or the REST API. - -```yml -plugins.security.restapi.password_validation_regex: '(?=.*[A-Z])(?=.*[^a-zA-Z\d])(?=.*[0-9])(?=.*[a-z]).{8,}' -plugins.security.restapi.password_validation_error_message: "Password must be minimum 8 characters long and must contain at least one uppercase letter, one lowercase letter, one digit, and one special character." -``` - -In addition, a score-based password strength estimator allows you to set a threshold for password strength when creating a new internal user or updating a user's password. This feature makes use of the [zxcvbn library](https://github.com/dropbox/zxcvbn) to apply a policy that emphasizes a password's complexity rather than its capacity to meet traditional criteria such as uppercase keys, numerals, and special characters. - -For information about defining users, see [Defining users]({{site.url}}{{site.baseurl}}/security/access-control/users-roles/#defining-users). - -This feature is not compatible with users specified as reserved. For information about reserved resources, see [Reserved and hidden resources]({{site.url}}{{site.baseurl}}/security/access-control/api#reserved-and-hidden-resources). -{: .important } - -Score-based password strength requires two settings to configure the feature. The following table describes the two settings. - -| Setting | Description | -| :--- | :--- | -| `plugins.security.restapi.password_min_length` | Sets the minimum number of characters for the password length. The default is `8`. This is also the minimum. 
| -| `plugins.security.restapi.password_score_based_validation_strength` | Sets a threshold to determine whether the password is strong or weak. There are four values that represent a threshold's increasing complexity.
`fair`--A very "guessable" password: provides protection from throttled online attacks.
`good`--A somewhat guessable password: provides protection from unthrottled online attacks.
`strong`--A safely "unguessable" password: provides moderate protection from an offline, slow-hash scenario.
`very_strong`--A very unguessable password: provides strong protection from an offline, slow-hash scenario. | - -The following example shows the settings configured for the `opensearch.yml` file and enabling a password with a minimum of 10 characters and a threshold requiring the highest strength: - -```yml -plugins.security.restapi.password_min_length: 10 -plugins.security.restapi.password_score_based_validation_strength: very_strong -``` - -When you try to create a user with a password that doesn't reach the specified threshold, the system generates a "weak password" warning, indicating that the password needs to be modified before you can save the user. - -The following example shows the response from the [Create user]({{site.url}}{{site.baseurl}}/security/access-control/api/#create-user) API when the password is weak: - -```json -{ - "status": "error", - "reason": "Weak password" -} -``` - -## allowlist.yml +## nodes_dn.yml -You can use `allowlist.yml` to add any endpoints and HTTP requests to a list of allowed endpoints and requests. If enabled, all users except the super admin are allowed access to only the specified endpoints and HTTP requests, and all other HTTP requests associated with the endpoint are denied. For example, if GET `_cluster/settings` is added to the allow list, users cannot submit PUT requests to `_cluster/settings` to update cluster settings. +`nodes_dn.yml` lets you add certificates' [distinguished names (DNs)]({{site.url}}{{site.baseurl}}/security/configuration/generate-certificates/#add-distinguished-names-to-opensearchyml) to an allow list to enable communication between any number of nodes or clusters. For example, a node that has the DN `CN=node1.example.com` in its allow list accepts communication from any other node or certificate that uses that DN. 
-Note that while you can configure access to endpoints this way, for most cases, it is still best to configure permissions using the Security plugin's users and roles, which have more granular settings. +The DNs get indexed into a [system index]({{site.url}}{{site.baseurl}}/security/configuration/system-indices) that only a super admin or an admin with a Transport Layer Security (TLS) certificate can access. If you want to programmatically add DNs to your allow lists, use the [REST API]({{site.url}}{{site.baseurl}}/security/access-control/api/#distinguished-names). ```yml --- _meta: - type: "allowlist" + type: "nodesdn" config_version: 2 -# Description: -# enabled - feature flag. -# if enabled is false, all endpoints are accessible. -# if enabled is true, all users except the SuperAdmin can only submit the allowed requests to the specified endpoints. -# SuperAdmin can access all APIs. -# SuperAdmin is defined by the SuperAdmin certificate, which is configured with the opensearch.yml setting plugins.security.authcz.admin_dn: -# Refer to the example setting in opensearch.yml to learn more about configuring SuperAdmin. -# -# requests - map of allow listed endpoints and HTTP requests - -#this name must be config -config: - enabled: true - requests: - /_cluster/settings: - - GET - /_cat/nodes: - - GET -``` - -To enable PUT requests to cluster settings, add PUT to the list of allowed operations under `/_cluster/settings`. - -```yml -requests: - /_cluster/settings: - - GET - - PUT -``` - -You can also add custom indexes to the allow list. `allowlist.yml` doesn't support wildcards, so you must manually specify all of the indexes you want to add. - -```yml -requests: # Only allow GET requests to /sample-index1/_doc/1 and /sample-index2/_doc/1 - /sample-index1/_doc/1: - - GET - /sample-index2/_doc/1: - - GET -``` - - -## roles.yml - -This file contains any initial roles that you want to add to the Security plugin. 
Aside from some metadata, the default file is empty, because the Security plugin has a number of static roles that it adds automatically. - -```yml ---- -complex-role: - reserved: false - hidden: false - cluster_permissions: - - "read" - - "cluster:monitor/nodes/stats" - - "cluster:monitor/task/get" - index_permissions: - - index_patterns: - - "opensearch_dashboards_sample_data_*" - dls: "{\"match\": {\"FlightDelay\": true}}" - fls: - - "~FlightNum" - masked_fields: - - "Carrier" - allowed_actions: - - "read" - tenant_permissions: - - tenant_patterns: - - "analyst_*" - allowed_actions: - - "kibana_all_write" - static: false -_meta: - type: "roles" - config_version: 2 +# Define nodesdn mapping name and corresponding values +# cluster1: +# nodes_dn: +# - CN=*.example.com ``` - ## roles_mapping.yml ```yml @@ -359,28 +261,37 @@ kibana_server: and_backend_roles: [] ``` +## roles.yml -## action_groups.yml - -This file contains any initial action groups that you want to add to the Security plugin. - -Aside from some metadata, the default file is empty, because the Security plugin has a number of static action groups that it adds automatically. These static action groups cover a wide variety of use cases and are a great way to get started with the plugin. +This file contains any initial roles that you want to add to the Security plugin. Aside from some metadata, the default file is empty, because the Security plugin has a number of static roles that it adds automatically. 
```yml --- -my-action-group: +complex-role: reserved: false hidden: false - allowed_actions: - - "indices:data/write/index*" - - "indices:data/write/update*" - - "indices:admin/mapping/put" - - "indices:data/write/bulk*" + cluster_permissions: - "read" - - "write" + - "cluster:monitor/nodes/stats" + - "cluster:monitor/task/get" + index_permissions: + - index_patterns: + - "opensearch_dashboards_sample_data_*" + dls: "{\"match\": {\"FlightDelay\": true}}" + fls: + - "~FlightNum" + masked_fields: + - "Carrier" + allowed_actions: + - "read" + tenant_permissions: + - tenant_patterns: + - "analyst_*" + allowed_actions: + - "kibana_all_write" static: false _meta: - type: "actiongroups" + type: "roles" config_version: 2 ``` @@ -400,20 +311,105 @@ admin_tenant: description: "Demo tenant for admin user" ``` -## nodes_dn.yml +## opensearch.yml -`nodes_dn.yml` lets you add certificates' [distinguished names (DNs)]({{site.url}}{{site.baseurl}}/security/configuration/generate-certificates/#add-distinguished-names-to-opensearchyml) an allow list to enable communication between any number of nodes and/or clusters. For example, a node that has the DN `CN=node1.example.com` in its allow list accepts communication from any other node or certificate that uses that DN. +In addition to many OpenSearch settings, this file contains paths to TLS certificates and their attributes, such as distinguished names and trusted certificate authorities. -The DNs get indexed into a [system index]({{site.url}}{{site.baseurl}}/security/configuration/system-indices) that only a super admin or an admin with a Transport Layer Security (TLS) certificate can access. If you want to programmatically add DNs to your allow lists, use the [REST API]({{site.url}}{{site.baseurl}}/security/access-control/api/#distinguished-names). 
+```yml +plugins.security.ssl.transport.pemcert_filepath: esnode.pem +plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem +plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem +plugins.security.ssl.transport.enforce_hostname_verification: false +plugins.security.ssl.http.enabled: true +plugins.security.ssl.http.pemcert_filepath: esnode.pem +plugins.security.ssl.http.pemkey_filepath: esnode-key.pem +plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem +plugins.security.allow_unsafe_democertificates: true +plugins.security.allow_default_init_securityindex: true +plugins.security.authcz.admin_dn: + - CN=kirk,OU=client,O=client,L=test, C=de + +plugins.security.audit.type: internal_opensearch +plugins.security.enable_snapshot_restore_privilege: true +plugins.security.check_snapshot_restore_write_privileges: true +plugins.security.cache.ttl_minutes: 60 +plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"] +plugins.security.system_indices.enabled: true +plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opendistro-asynchronous-search-response*"] +node.max_local_storage_nodes: 3 +``` + +For a full list of `opensearch.yml` Security plugin settings, see [Security settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/security-settings/). +{: .note} + +### Refining your configuration + +The `plugins.security.allow_default_init_securityindex` setting, when set to `true`, sets the Security plugin to its default security settings if an attempt to create the security index fails when OpenSearch launches. Default security settings are stored in YAML files contained in the `opensearch-project/security/config` directory. 
By default, this setting is `false`. ```yml ---- -_meta: - type: "nodesdn" - config_version: 2 +plugins.security.allow_default_init_securityindex: true +``` -# Define nodesdn mapping name and corresponding values -# cluster1: -# nodes_dn: -# - CN=*.example.com +An authentication cache for the Security plugin exists to help speed up authentication by temporarily storing user objects returned from the backend so that the Security plugin is not required to make repeated requests for them. To determine how long it takes for caching to time out, you can use the `plugins.security.cache.ttl_minutes` property to set a value in minutes. The default is `60`. You can disable caching by setting the value to `0`. + +```yml +plugins.security.cache.ttl_minutes: 60 +``` + +### Enabling user access to system indexes + +Mapping a system index permission to a user allows that user to modify the system index specified in the permission's name (the one exception is the Security plugin's [system index]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/)). The `plugins.security.system_indices.permission.enabled` setting provides a way for administrators to make this permission available for or hidden from role mapping. + +When set to `true`, the feature is enabled and users with permission to modify roles can create roles that include permissions that grant access to system indexes: + +```yml +plugins.security.system_indices.permission.enabled: true +``` + +When set to `false`, the permission is disabled and only admins with an admin certificate can make changes to system indexes. By default, the permission is set to `false` in a new cluster. + +To learn more about system index permissions, see [System index permissions]({{site.url}}{{site.baseurl}}/security/access-control/permissions/#system-index-permissions). + + +### Password settings + +If you want to run your users' passwords against some validation, specify a regular expression (regex) in this file. 
You can also include an error message that is displayed when passwords don't pass validation. The following example demonstrates how to include a regex so OpenSearch requires new passwords to be a minimum of eight characters with at least one uppercase letter, one lowercase letter, one digit, and one special character. + +Note that OpenSearch validates only users and passwords created through OpenSearch Dashboards or the REST API. + +```yml +plugins.security.restapi.password_validation_regex: '(?=.*[A-Z])(?=.*[^a-zA-Z\d])(?=.*[0-9])(?=.*[a-z]).{8,}' +plugins.security.restapi.password_validation_error_message: "Password must be minimum 8 characters long and must contain at least one uppercase letter, one lowercase letter, one digit, and one special character." +``` + +In addition, a score-based password strength estimator allows you to set a threshold for password strength when creating a new internal user or updating a user's password. This feature makes use of the [zxcvbn library](https://github.com/dropbox/zxcvbn) to apply a policy that emphasizes a password's complexity rather than its capacity to meet traditional criteria such as uppercase letters, numerals, and special characters. + +For information about defining users, see [Defining users]({{site.url}}{{site.baseurl}}/security/access-control/users-roles/#defining-users). + +This feature is not compatible with users specified as reserved. For information about reserved resources, see [Reserved and hidden resources]({{site.url}}{{site.baseurl}}/security/access-control/api#reserved-and-hidden-resources). +{: .important } + +Score-based password strength validation is configured using two settings. The following table describes the two settings. + +| Setting | Description | +| :--- | :--- | +| `plugins.security.restapi.password_min_length` | Sets the minimum number of characters for the password length. The default is `8`. This is also the minimum allowed value. 
| +| `plugins.security.restapi.password_score_based_validation_strength` | Sets a threshold to determine whether the password is strong or weak. There are four values that represent a threshold's increasing complexity.
`fair`--A very guessable password: provides protection from throttled online attacks.
`good`--A somewhat guessable password: provides protection from unthrottled online attacks.
`strong`--A safely unguessable password: provides moderate protection from an offline, slow-hash scenario.
`very_strong`--A very unguessable password: provides strong protection from an offline, slow-hash scenario. | + +The following example shows `opensearch.yml` settings that require a minimum password length of 10 characters and the highest strength threshold: + +```yml +plugins.security.restapi.password_min_length: 10 +plugins.security.restapi.password_score_based_validation_strength: very_strong +``` + +When you try to create a user with a password that doesn't reach the specified threshold, the system generates a "weak password" warning, indicating that the password needs to be modified before you can save the user. + +The following example shows the response from the [Create user]({{site.url}}{{site.baseurl}}/security/access-control/api/#create-user) API when the password is weak: + +```json +{ + "status": "error", + "reason": "Weak password" +} ``` diff --git a/images/dashboards/data-source-UI.png b/images/dashboards/data-source-UI.png deleted file mode 100644 index bc07237847..0000000000 Binary files a/images/dashboards/data-source-UI.png and /dev/null differ diff --git a/images/dashboards/delete-data-source.png b/images/dashboards/delete-data-source.png deleted file mode 100644 index 2d0337a92b..0000000000 Binary files a/images/dashboards/delete-data-source.png and /dev/null differ