Revise Amazon S3 data source #8109

Merged
merged 11 commits into from
Aug 29, 2024
46 changes: 18 additions & 28 deletions _dashboards/management/S3-data-source.md
@@ -10,51 +10,41 @@ has_children: true
Introduced 2.11
{: .label .label-purple }

Starting with OpenSearch 2.11, you can connect OpenSearch to your Amazon Simple Storage Service (Amazon S3) data source using the OpenSearch Dashboards UI. You can then query that data, optimize query performance, define tables, and integrate your S3 data within a single UI.
You can connect OpenSearch to your Amazon Simple Storage Service (Amazon S3) data source using the OpenSearch Dashboards interface and then query that data, optimize query performance, define tables, and integrate your S3 data.

## Prerequisites

To connect data from Amazon S3 to OpenSearch using OpenSearch Dashboards, you must have:
Before connecting a data source, verify the following requirements are met:

- Access to Amazon S3 and the [AWS Glue Data Catalog](https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/connectors/s3glue_connector.rst#id2).
- Access to OpenSearch and OpenSearch Dashboards.
- An understanding of OpenSearch data source and connector concepts. See the [developer documentation](https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/datasources.rst#introduction) for information about these concepts.
- You have access to Amazon S3 and the [AWS Glue Data Catalog](https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/connectors/s3glue_connector.rst#id2).
- You have access to OpenSearch and OpenSearch Dashboards.
- You have an understanding of OpenSearch data source and connector concepts. See the [developer documentation](https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/datasources.rst#introduction) for more information.

## Connect your Amazon S3 data source
## Connect your data source

To connect your Amazon S3 data source, follow these steps:
To connect your data source, follow these steps:

1. From the OpenSearch Dashboards main menu, select **Management** > **Data sources**.
2. On the **Data sources** page, select **New data source** > **S3**. An example UI is shown in the following image.
1. From the OpenSearch Dashboards main menu, go to **Management** > **Dashboards Management** > **Data sources**.
2. On the **Data sources** page, select **Create data source connection** > **Amazon S3**.
3. On the **Configure Amazon S3 data source** page, enter the data source and authentication details and permissions.
4. Select the **Review Configuration** button to verify the connection details.
5. Select the **Connect to Amazon S3** button to establish a connection.
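
The same connection can in principle be scripted instead of configured through the UI. The following is a minimal sketch, assuming the SQL plugin's data source API (`POST _plugins/_query/_datasources`) and the `s3glue` connector property names described in the connector documentation linked under Prerequisites. The helper name, data source name, role ARN, and index store URI are hypothetical placeholders, and the property keys should be checked against that documentation before use.

```python
import json

# Assumed endpoint of the SQL plugin's data source API; verify against the
# connector documentation for your OpenSearch version.
DATASOURCES_ENDPOINT = "/_plugins/_query/_datasources"


def build_s3_glue_datasource(name, role_arn, index_store_uri):
    """Build a request body for an Amazon S3 (AWS Glue) data source.

    Hypothetical helper: the property keys mirror the s3glue connector
    settings as an assumption, not a confirmed contract.
    """
    return {
        "name": name,
        "connector": "s3glue",
        "properties": {
            # AWS Glue authentication details
            "glue.auth.type": "iam_role",
            "glue.auth.role_arn": role_arn,
            # AWS Glue index store details
            "glue.indexstore.opensearch.uri": index_store_uri,
            "glue.indexstore.opensearch.auth": "noauth",
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    payload = build_s3_glue_datasource(
        name="my_glue",
        role_arn="arn:aws:iam::123456789012:role/example-glue-role",
        index_store_uri="http://localhost:9200",
    )
    print("POST", DATASOURCES_ENDPOINT)
    print(json.dumps(payload, indent=2))
```

The sketch only builds and prints the request body, so it can be reviewed in the same way as the **Review Configuration** step before anything is sent to a cluster.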

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/data-sources-UI.png" alt="Amazon S3 data sources UI" width="700"/>
## Manage your data source

3. On the **Configure Amazon S3 data source** page, enter the required **Data source details**, **AWS Glue authentication details**, **AWS Glue index store details**, and **Query permissions**. An example UI is shown in the following image.

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/S3-config-UI.png" alt="Amazon S3 configuration UI" width="700"/>

4. Select the **Review Configuration** button and verify the details.
5. Select the **Connect to Amazon S3** button.

## Manage your Amazon S3 data source

Once you've connected your Amazon S3 data source, you can explore your data through the **Manage data sources** tab. The following steps guide you through using this functionality:
To manage your data source, follow these steps:

1. On the **Manage data sources** tab, choose a data source from the list.
2. On that data source's page, you can manage the data source, choose a use case, and manage access controls and configurations. An example UI is shown in the following image.

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/manage-data-source-UI.png" alt="Manage data sources UI" width="700"/>

3. (Optional) Explore the Amazon S3 use cases, including querying your data and optimizing query performance. Go to **Next steps** to learn more about each use case.
2. On the data source's page, you can manage the data source, choose a use case, and configure access controls.
Collaborator review comment: "On the page for the data source"?

3. (Optional) Explore the Amazon S3 use cases, including querying your data and optimizing query performance. Refer to the [**Next steps**](#next-steps) section to learn more about each use case.

## Limitations

This feature is still under development, including the data integration functionality. For real-time updates, see the [developer documentation on GitHub](https://github.com/opensearch-project/opensearch-spark/blob/main/docs/index.md#limitations).
This feature is under development, including the data integration functionality. For up-to-date information, refer to the [developer documentation on GitHub](https://github.com/opensearch-project/opensearch-spark/blob/main/docs/index.md#limitations).

## Next steps

- Learn about [querying your data in Data Explorer]({{site.url}}{{site.baseurl}}/dashboards/management/query-data-source/) through OpenSearch Dashboards.
- Learn about ways to [optimize the query performance of your external data sources]({{site.url}}{{site.baseurl}}/dashboards/management/accelerate-external-data/), such as Amazon S3, through Query Workbench.
- Learn about [optimizing the query performance of your external data sources]({{site.url}}{{site.baseurl}}/dashboards/management/accelerate-external-data/), such as Amazon S3, through Query Workbench.
- Learn about [Amazon S3 and AWS Glue Data Catalog](https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/connectors/s3glue_connector.rst) and the APIs used with Amazon S3 data sources, including configuration settings and query examples.
- Learn about [managing your indexes]({{site.url}}{{site.baseurl}}/dashboards/im-dashboards/index/) through OpenSearch Dashboards.
