Skip to content

Commit

Permalink
Merge branch 'opensearch-project:main' into jira-source
Browse files Browse the repository at this point in the history
  • Loading branch information
san81 authored Oct 30, 2024
2 parents 2cee7bc + c5c2fa6 commit dd90582
Show file tree
Hide file tree
Showing 32 changed files with 1,290 additions and 93 deletions.
14 changes: 7 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,24 +1,24 @@
<img src="https://raw.githubusercontent.com/opensearch-project/data-prepper/main/docs/images/DataPrepper_auto.svg" height="64px" alt="Data Prepper">
<img src="https://raw.githubusercontent.com/opensearch-project/data-prepper/main/docs/images/DataPrepper_auto.svg" height="64px" alt="OpenSearch Data Prepper">

[![codecov](https://codecov.io/gh/opensearch-project/data-prepper/branch/main/graph/badge.svg?token=IS7GOIY622)](https://codecov.io/gh/opensearch-project/data-prepper)
# Data Prepper
# OpenSearch Data Prepper

We envision Data Prepper as an open source data collector for observability data (trace, logs, metrics) that can filter, enrich, transform, normalize, and aggregate data for downstream analysis and visualization. It will support stateful processing across multiple instances of data pipelines for observability use cases such as distributed tracing and multi-line log events (e.g. stack traces, aggregations, and log-to-metric transformations). Currently Data Prepper supports processing of distributed trace data and log ingestion with plans to support metric data in the future.
We envision OpenSearch Data Prepper as an open source data collector for observability data (trace, logs, metrics) that can filter, enrich, transform, normalize, and aggregate data for downstream analysis and visualization. It will support stateful processing across multiple instances of data pipelines for observability use cases such as distributed tracing and multi-line log events (e.g. stack traces, aggregations, and log-to-metric transformations). Currently OpenSearch Data Prepper supports processing of distributed trace data and log ingestion with plans to support metric data in the future.

Please read the [Overview](docs/overview.md) to understand what Data Prepper is and how it works.
Please read the [Overview](docs/overview.md) to understand what OpenSearch Data Prepper is and how it works.

## Getting Started

Our [Getting Started](docs/getting_started.md) guide is the best starting point for anybody who wants to run Data Prepper.
Our [Getting Started](docs/getting_started.md) guide is the best starting point for anybody who wants to run OpenSearch Data Prepper.

Please read the [Trace Analytics](docs/trace_analytics.md) guide or [Log Analytics](docs/log_analytics.md) to get started with using Data Prepper for trace or log analytics use cases.
Please read the [Trace Analytics](docs/trace_analytics.md) guide or [Log Analytics](docs/log_analytics.md) to get started with using OpenSearch Data Prepper for trace or log analytics use cases.

## Project Resources

* [Downloads](https://opensearch.org/downloads.html)
* [Documentation](https://opensearch.org/docs/latest/clients/data-prepper/index/)
* [Configuration Reference](https://opensearch.org/docs/latest/clients/data-prepper/data-prepper-reference/)
* Need help? Try the [Data Prepper category](https://discuss.opendistrocommunity.dev/c/data-prepper/61) in the OpenSearch forums
* Need help? Try the [OpenSearch Data Prepper category](https://discuss.opendistrocommunity.dev/c/data-prepper/61) in the OpenSearch forums
* [Project Principles](https://opensearch.org/#principles)

## Contribute
Expand Down
22 changes: 11 additions & 11 deletions RELEASING.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Releasing

This document outlines the process for releasing Data Prepper.
It is a guide for maintainers of the Data Prepper project to release a new version.
This document outlines the process for releasing OpenSearch Data Prepper.
It is a guide for maintainers of the OpenSearch Data Prepper project to release a new version.

## Overview

Expand All @@ -16,13 +16,13 @@ This document has three broad categories of steps to follow:

### Branch setup

Data Prepper uses a release branch for releasing.
OpenSearch Data Prepper uses a release branch for releasing.
The [Developer Guide](docs/developer_guide.md#backporting) discusses this in detail.

The repository has a release branch for a major/minor version.
Patch versions will continue on the same branch.
For example, Data Prepper `2.6.0` was released from the `2.6` branch.
Additionally, Data Prepper `2.6.1` and `2.6.2` were also released from the `2.6` branch.
For example, OpenSearch Data Prepper `2.6.0` was released from the `2.6` branch.
Additionally, OpenSearch Data Prepper `2.6.1` and `2.6.2` were also released from the `2.6` branch.

If you are creating a new major/minor release, then you will need to create the release branch.
Use GitHub to create a new branch.
Expand All @@ -42,7 +42,7 @@ Steps:

### Update versions

The Data Prepper version is defined in the [`gradle.properties`](https://github.com/opensearch-project/data-prepper/blob/main/gradle.properties) file.
The OpenSearch Data Prepper version is defined in the [`gradle.properties`](https://github.com/opensearch-project/data-prepper/blob/main/gradle.properties) file.
We must update this whenever we create a new release.
We will need two PRs to update it.

Expand Down Expand Up @@ -76,7 +76,7 @@ Note: This step can be automated through [#4877](https://github.com/opensearch-p
### Update the THIRD-PARTY file

We should update the `THIRD-PARTY` file for every release.
Data Prepper has a GitHub action that will generate this and create a PR with the updated file.
OpenSearch Data Prepper has a GitHub action that will generate this and create a PR with the updated file.

Steps:
* Go the [Third Party Generate action](https://github.com/opensearch-project/data-prepper/actions/workflows/third-party-generate.yml)
Expand Down Expand Up @@ -118,10 +118,10 @@ Also tag this with the `backport {major}.{minor}` to create a PR that you can me

## <a name="performing-a-release">Performing a release</a>

This section outlines how to perform a Data Prepper release using GitHub Actions and the OpenSearch build infrastructure.
The audience for this section are Data Prepper maintainers.
This section outlines how to perform a OpenSearch Data Prepper release using GitHub Actions and the OpenSearch build infrastructure.
The audience for this section are OpenSearch Data Prepper maintainers.

### Start the release Data Prepper action
### Start the release OpenSearch Data Prepper action

To run the release, go to the [Release Artifacts](https://github.com/opensearch-project/data-prepper/actions/workflows/release.yml)
GitHub Action.
Expand Down Expand Up @@ -159,7 +159,7 @@ This includes building the artifacts, testing, drafting a GitHub release, and pr

The release build will create a new GitHub issue requesting to release the project.
This needs two maintainers to approve.
To approve, load [Data Prepper issues](https://github.com/opensearch-project/data-prepper/issues).
To approve, load [OpenSearch Data Prepper issues](https://github.com/opensearch-project/data-prepper/issues).
Look for and open a new issue starting with _Manual approval required for workflow_.
Verify that the metadata looks correct and that we want to release.
Add a new comment on the issue with the word _approve_ or _approved_ in it.
Expand Down
20 changes: 10 additions & 10 deletions TRIAGING.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
<img src="https://raw.githubusercontent.com/opensearch-project/data-prepper/main/docs/images/DataPrepper_auto.svg" height="64px" alt="Data Prepper">
<img src="https://raw.githubusercontent.com/opensearch-project/data-prepper/main/docs/images/DataPrepper_auto.svg" height="64px" alt="OpenSearch Data Prepper">

The Data Prepper maintainers seek to promote an inclusive and engaged community of contributors.
The OpenSearch Data Prepper maintainers seek to promote an inclusive and engaged community of contributors.
In order to facilitate this, weekly triage meetings are open to all and attendance is encouraged for anyone who hopes to contribute, discuss an issue, or learn more about the project.
To learn more about contributing to the Data Prepper project visit the [Contributing](./CONTRIBUTING.md) documentation.
To learn more about contributing to the OpenSearch Data Prepper project visit the [Contributing](./CONTRIBUTING.md) documentation.

### Do I need to attend for my issue to be addressed/triaged?

Expand All @@ -17,7 +17,7 @@ However, should we run out of time before your issue is discussed, you are alway
### How do I join the triage meeting?

Meetings are hosted regularly Tuesdays at 2:30 PM US Central Time (12:30 PM Pacific Time) and can be joined via the links posted on the [OpenSearch Meetup Group](https://www.meetup.com/opensearch/events/) list of events.
The event will be titled `Data Prepper Triage Meeting`.
The event will be titled `OpenSearch Data Prepper Triage Meeting`.

After joining the video meeting, you can enable your video / voice to join the discussion.
If you do not have a webcam or microphone available, you can still join in via the text chat.
Expand All @@ -28,9 +28,9 @@ If you have an issue you'd like to bring forth please consider getting a link to

Meetings are 30 minutes and structured as follows:

1. Initial Gathering: As we gather, feel free to turn on video and engage in informal and open-to-all conversation. A volunteer Data Prepper maintainer will share the [Data Prepper Tracking Board](https://github.com/orgs/opensearch-project/projects/82/) and proceed.
1. Initial Gathering: As we gather, feel free to turn on video and engage in informal and open-to-all conversation. A volunteer OpenSearch Data Prepper maintainer will share the [Data Prepper Tracking Board](https://github.com/orgs/opensearch-project/projects/82/) and proceed.
2. Announcements: We will make any announcements at the beginning, if necessary.
3. Untriaged issues: We will review all untriaged [issues](https://github.com/orgs/opensearch-project/projects/82/views/6) for the Data Prepper repository. If you have an item here, you may spend a few minutes to explain your request.
3. Untriaged issues: We will review all untriaged [issues](https://github.com/orgs/opensearch-project/projects/82/views/6) for the OpenSearch Data Prepper repository. If you have an item here, you may spend a few minutes to explain your request.
4. Member Requests: Opportunity for any meeting member to ask for consideration of an issue or pull request.
5. Release review: If time permits, and we find it necessary, we will review [items for the current release](https://github.com/orgs/opensearch-project/projects/82/views/14).
6. Follow-up issues: If time permits, we will review the [follow up items](https://github.com/orgs/opensearch-project/projects/82/views/18).
Expand All @@ -46,15 +46,15 @@ Attending the triage meetings is a great way for a new contributor to learn abou
If you have an existing issue you would like to discuss, you can always comment on the issue itself.
Alternatively, you are welcome to come to the triage meeting to discuss.

### Is this meeting a good place to get help using Data Prepper?
### Is this meeting a good place to get help using OpenSearch Data Prepper?

While we are always happy to help the community, the best resource for usage questions is the [the Data Prepper discussion forum](https://github.com/opensearch-project/data-prepper/discussions) on GitHub.
While we are always happy to help the community, the best resource for usage questions is the [OpenSearch Data Prepper discussion forum](https://github.com/opensearch-project/data-prepper/discussions) on GitHub.

There you can find answers to many common questions as well as speak with implementation experts and Data Prepper maintainers.
There you can find answers to many common questions as well as speak with implementation experts and OpenSearch Data Prepper maintainers.

### What are the issue labels associated with triaging?

There are several labels that are particularly important for triaging in Data Prepper:
There are several labels that are particularly important for triaging in OpenSearch Data Prepper:

| Label | When applied | Meaning |
| ----- | ------------ | ------- |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,10 @@ public String toString() {
}

private String checkAndTrimKey(final String key) {
checkKey(key);
if(!supportedActions.equals(Collections.singleton(EventKeyFactory.EventAction.DELETE)))
{
checkKey(key);
}
return trimTrailingSlashInKey(key);
}

Expand Down Expand Up @@ -158,10 +161,12 @@ private static boolean isValidKey(final String key) {
|| c == '.'
|| c == '-'
|| c == '_'
|| c == '~'
|| c == '@'
|| c == '/'
|| c == '['
|| c == ']')) {
|| c == ']'
)) {

return false;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,8 @@ void constructor_throws_with_invalid_key(final String key) {
"key-with-hyphen",
"key_with_underscore",
"key@with@at",
"key[with]brackets"
"key[with]brackets",
"key~1withtilda"
})
void getKey_returns_expected_result(final String key) {
assertThat(new JacksonEventKey(key).getKey(), equalTo(key));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -526,7 +526,6 @@ public void testKey_withNullKey_throwsNullPointerException() {
private <T extends Throwable> void assertThrowsForKeyCheck(final Class<T> expectedThrowable, final String key) {
assertThrows(expectedThrowable, () -> event.put(key, UUID.randomUUID()));
assertThrows(expectedThrowable, () -> event.get(key, String.class));
assertThrows(expectedThrowable, () -> event.delete(key));
}

@Test
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,10 @@ private void buildPipelineFromConfiguration(
final List<Processor> processors = processorComponentList.stream().map(IdentifiedComponent::getComponent).collect(Collectors.toList());
if (!processors.isEmpty() && processors.get(0) instanceof RequiresPeerForwarding) {
return PeerForwardingProcessorDecorator.decorateProcessors(
processors, peerForwarderProvider, pipelineName, processorComponentList.get(0).getName(), pipelineConfiguration.getWorkers()
processors, peerForwarderProvider, pipelineName, processorComponentList.get(0).getName(),
dataPrepperConfiguration.getPeerForwarderConfiguration() != null ?
dataPrepperConfiguration.getPeerForwarderConfiguration().getExcludeIdentificationKeys() : null,
pipelineConfiguration.getWorkers()
);
}
return processors;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,10 @@
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.stream.Collectors;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
* Class to hold configuration for Core Peer Forwarder in {@link DataPrepperConfiguration},
Expand Down Expand Up @@ -65,6 +67,7 @@ public class PeerForwarderConfiguration {
private Integer forwardingBatchQueueDepth = 1;
private Duration forwardingBatchTimeout = DEFAULT_FORWARDING_BATCH_TIMEOUT;
private boolean binaryCodec = true;
private List<Set<String>> excludeIdentificationKeys;

public PeerForwarderConfiguration() {}

Expand All @@ -76,6 +79,7 @@ public PeerForwarderConfiguration (
@JsonProperty("server_thread_count") final Integer serverThreadCount,
@JsonProperty("max_connection_count") final Integer maxConnectionCount,
@JsonProperty("max_pending_requests") final Integer maxPendingRequests,
@JsonProperty("exclude_identification_keys") final List<Set<String>> excludeIdentificationKeys,
@JsonProperty("ssl") final Boolean ssl,
@JsonProperty("ssl_certificate_file") final String sslCertificateFile,
@JsonProperty("ssl_key_file") final String sslKeyFile,
Expand Down Expand Up @@ -139,6 +143,7 @@ public PeerForwarderConfiguration (
setBinaryCodec(binaryCodec == null || binaryCodec);
checkForCertAndKeyFileInS3();
validateSslAndAuthentication();
this.excludeIdentificationKeys = excludeIdentificationKeys;
}

public int getServerPort() {
Expand Down Expand Up @@ -169,6 +174,13 @@ public boolean isSsl() {
return ssl;
}

public Set<Set<String>> getExcludeIdentificationKeys() {
if (excludeIdentificationKeys == null) {
return null;
}
return excludeIdentificationKeys.stream().collect(Collectors.toSet());
}

public String getSslCertificateFile() {
return sslCertificateFile;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,12 +24,14 @@
public class PeerForwardingProcessorDecorator implements Processor<Record<Event>, Record<Event>> {
private final PeerForwarder peerForwarder;
private final Processor innerProcessor;
private final boolean peerForwardingDisabled;

public static List<Processor> decorateProcessors(
final List<Processor> processors,
final PeerForwarderProvider peerForwarderProvider,
final String pipelineName,
final String pluginId,
final Set<Set<String>> excludeIdentificationKeys,
final Integer pipelineWorkerThreads) {

Set<String> identificationKeys;
Expand Down Expand Up @@ -69,13 +71,27 @@ public static List<Processor> decorateProcessors(

final PeerForwarder peerForwarder = peerForwarderProvider.register(pipelineName, firstInnerProcessor, pluginId, identificationKeys, pipelineWorkerThreads);

return processors.stream().map(processor -> new PeerForwardingProcessorDecorator(peerForwarder, processor))
.collect(Collectors.toList());
return processors.stream().map(processor ->
new PeerForwardingProcessorDecorator(peerForwarder, processor, isPeerForwardingDisabled(processor, excludeIdentificationKeys))
).collect(Collectors.toList());
}

private PeerForwardingProcessorDecorator(final PeerForwarder peerForwarder, final Processor innerProcessor) {
private static boolean isPeerForwardingDisabled(Processor processor, Set<Set<String>> excludeIdentificationKeysSet) {
if (processor instanceof RequiresPeerForwarding && excludeIdentificationKeysSet != null && excludeIdentificationKeysSet.size() > 0) {
Set<String> identificationKeys = new HashSet<String>(((RequiresPeerForwarding) processor).getIdentificationKeys());
return excludeIdentificationKeysSet.contains(identificationKeys);
}
return false;
}

private PeerForwardingProcessorDecorator(final PeerForwarder peerForwarder, final Processor innerProcessor, final boolean peerForwardingDisabled) {
this.peerForwarder = peerForwarder;
this.innerProcessor = innerProcessor;
this.peerForwardingDisabled = peerForwardingDisabled;
}

boolean isPeerForwardingDisabled() {
return peerForwardingDisabled;
}

@Override
Expand Down
Loading

0 comments on commit dd90582

Please sign in to comment.