[pkg/ottl] Add MurmurHash3 converter #34155

Closed
Changes from 12 commits
27 changes: 27 additions & 0 deletions .chloggen/ottl_murmurhash3_func.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg/ottl

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add the `MurmurHash3` converter, which returns the MurmurHash3 hash/digest of `target` as a hexadecimal string"

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [34077]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
1 change: 1 addition & 0 deletions cmd/oteltestbedcol/go.mod
@@ -221,6 +221,7 @@ require (
github.com/tinylib/msgp v1.2.0 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
github.com/valyala/fastjson v1.6.4 // indirect
github.com/vultr/govultr/v2 v2.17.2 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
2 changes: 2 additions & 0 deletions cmd/oteltestbedcol/go.sum

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions connector/countconnector/go.mod
@@ -49,6 +49,7 @@ require (
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/featuregate v1.12.1-0.20240716231837-5753a58f712b // indirect
2 changes: 2 additions & 0 deletions connector/countconnector/go.sum

1 change: 1 addition & 0 deletions connector/datadogconnector/go.mod
@@ -209,6 +209,7 @@ require (
github.com/tinylib/msgp v1.1.9 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
2 changes: 2 additions & 0 deletions connector/datadogconnector/go.sum

1 change: 1 addition & 0 deletions connector/routingconnector/go.mod
@@ -44,6 +44,7 @@ require (
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/featuregate v1.12.1-0.20240716231837-5753a58f712b // indirect
2 changes: 2 additions & 0 deletions connector/routingconnector/go.sum

1 change: 1 addition & 0 deletions connector/sumconnector/go.mod
@@ -43,6 +43,7 @@ require (
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/featuregate v1.12.1-0.20240716231837-5753a58f712b // indirect
2 changes: 2 additions & 0 deletions connector/sumconnector/go.sum

1 change: 1 addition & 0 deletions exporter/datadogexporter/go.mod
@@ -298,6 +298,7 @@ require (
github.com/tinylib/msgp v1.1.9 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
github.com/valyala/fastjson v1.6.4 // indirect
github.com/vultr/govultr/v2 v2.17.2 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
2 changes: 2 additions & 0 deletions exporter/datadogexporter/go.sum

1 change: 1 addition & 0 deletions exporter/datadogexporter/integrationtest/go.mod
@@ -209,6 +209,7 @@ require (
github.com/stretchr/objx v0.5.2 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
2 changes: 2 additions & 0 deletions exporter/datadogexporter/integrationtest/go.sum

1 change: 1 addition & 0 deletions exporter/elasticsearchexporter/integrationtest/go.mod
@@ -117,6 +117,7 @@ require (
github.com/tilinna/clock v1.1.0 // indirect
github.com/tklauser/go-sysconf v0.3.13 // indirect
github.com/tklauser/numcpus v0.7.0 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
github.com/valyala/fastjson v1.6.4 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
go.elastic.co/apm/module/apmelasticsearch/v2 v2.6.0 // indirect
2 changes: 2 additions & 0 deletions exporter/elasticsearchexporter/integrationtest/go.sum

1 change: 1 addition & 0 deletions exporter/honeycombmarkerexporter/go.mod
@@ -55,6 +55,7 @@ require (
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/rs/cors v1.11.0 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
go.opentelemetry.io/collector v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/config/configauth v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/config/configcompression v1.12.1-0.20240716231837-5753a58f712b // indirect
2 changes: 2 additions & 0 deletions exporter/honeycombmarkerexporter/go.sum

1 change: 1 addition & 0 deletions internal/filter/go.mod
@@ -45,6 +45,7 @@ require (
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/twmb/murmur3 v1.1.8 // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/collector/internal/globalgates v0.105.1-0.20240717163034-43ed6184f9fe // indirect
go.opentelemetry.io/otel v1.28.0 // indirect
2 changes: 2 additions & 0 deletions internal/filter/go.sum

12 changes: 12 additions & 0 deletions pkg/ottl/e2e/e2e_test.go
@@ -510,6 +510,18 @@ func Test_e2e_converters(t *testing.T) {
				tCtx.GetLogRecord().Attributes().PutDouble("test", 60)
			},
		},
		{
			statement: `set(attributes["test"], MurmurHash3("Hello World"))`,
			want: func(tCtx ottllog.TransformContext) {
				tCtx.GetLogRecord().Attributes().PutStr("test", "dbc2a0c1ab26631a27b4c09fcf1fe683")
			},
		},
		{
			statement: `set(attributes["test"], MurmurHash3("Hello World", version="32"))`,
			want: func(tCtx ottllog.TransformContext) {
				tCtx.GetLogRecord().Attributes().PutStr("test", "ce837619")
			},
		},
		{
			statement: `set(attributes["test"], Nanoseconds(Duration("1ms")))`,
			want: func(tCtx ottllog.TransformContext) {
1 change: 1 addition & 0 deletions pkg/ottl/go.mod
@@ -11,6 +11,7 @@ require (
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.105.0
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatatest v0.105.0
github.com/stretchr/testify v1.9.0
github.com/twmb/murmur3 v1.1.8
go.opentelemetry.io/collector/component v0.105.1-0.20240717163034-43ed6184f9fe
go.opentelemetry.io/collector/pdata v1.12.1-0.20240716231837-5753a58f712b
go.opentelemetry.io/collector/semconv v0.105.1-0.20240717163034-43ed6184f9fe
2 changes: 2 additions & 0 deletions pkg/ottl/go.sum

18 changes: 18 additions & 0 deletions pkg/ottl/ottlfuncs/README.md
@@ -436,6 +436,7 @@ Available Converters:
- [Minute](#minute)
- [Minutes](#minutes)
- [Month](#month)
- [MurmurHash3](#murmurhash3)
- [Nanoseconds](#nanoseconds)
- [Now](#now)
- [ParseCSV](#parsecsv)
@@ -947,6 +948,23 @@ Examples:

- `Month(Now())`

### MurmurHash3

`MurmurHash3(target, Optional[version])`

The `MurmurHash3` Converter converts the `target` to a hexadecimal string of its MurmurHash3 hash/digest.

`target` is a Getter that returns a string.

`version` is an optional string that selects the hash size. MurmurHash3 has 32-bit and 128-bit versions. Valid values are `"32"` and `"128"`; the default is `"128"`.

The returned type is `string`.

Examples:

- `MurmurHash3(attributes["device.name"])`
- `MurmurHash3("sometext", version="32")`
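As a rough illustration of what the `version="32"` path computes, the sketch below reimplements 32-bit MurmurHash3 (the x86_32 variant with seed 0) using only the Go standard library and hex-encodes the result in little-endian byte order. The names `murmur3Sum32` and `murmurHash3Hex32` are hypothetical and exist only for this example; the converter itself relies on `github.com/twmb/murmur3` rather than a hand-written hash.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"math/bits"
)

// murmur3Sum32 is a hypothetical stand-in for murmur3.Sum32:
// MurmurHash3 x86_32 with seed 0.
func murmur3Sum32(data []byte) uint32 {
	const (
		c1 = 0xcc9e2d51
		c2 = 0x1b873593
	)
	var h uint32 // seed 0
	n := uint32(len(data))

	// Body: mix in one little-endian 4-byte block at a time.
	for ; len(data) >= 4; data = data[4:] {
		k := binary.LittleEndian.Uint32(data)
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
		h = bits.RotateLeft32(h, 13)
		h = h*5 + 0xe6546b64
	}

	// Tail: mix in the remaining 1 to 3 bytes, if any.
	var k uint32
	switch len(data) {
	case 3:
		k ^= uint32(data[2]) << 16
		fallthrough
	case 2:
		k ^= uint32(data[1]) << 8
		fallthrough
	case 1:
		k ^= uint32(data[0])
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
	}

	// Finalization: fold in the length and avalanche the bits.
	h ^= n
	h ^= h >> 16
	h *= 0x85ebca6b
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

// murmurHash3Hex32 hex-encodes the 32-bit hash in little-endian byte order.
func murmurHash3Hex32(s string) string {
	b := make([]byte, 4)
	binary.LittleEndian.PutUint32(b, murmur3Sum32([]byte(s)))
	return hex.EncodeToString(b)
}

func main() {
	fmt.Println(murmurHash3Hex32("Hello World")) // prints ce837619
}
```

For `"Hello World"` this yields `ce837619`, the same value the e2e test in this change expects for `MurmurHash3("Hello World", version="32")`.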

### Nanoseconds

`Nanoseconds(value)`
81 changes: 81 additions & 0 deletions pkg/ottl/ottlfuncs/func_murmurhash3.go
@@ -0,0 +1,81 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package ottlfuncs // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/ottlfuncs"

import (
	"context"
	"encoding/binary"
	"encoding/hex"
	"fmt"

	"github.com/twmb/murmur3"

	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl"
)

const (
	v32  = "32"
	v128 = "128" // default
Member

Could we make these enum values? (To go along with my other nit about a switch statement)

Contributor Author

Using an enum means the syntax to call the 32-bit version will be MurmurHash3("something", version=0). I think this is more confusing than MurmurHash3("something", version="32").

)

type MurmurHash3Arguments[K any] struct {
	Target  ottl.StringGetter[K]
	Version ottl.Optional[string] // 32-bit or 128-bit
}

func NewMurmurHash3Factory[K any]() ottl.Factory[K] {
	return ottl.NewFactory("MurmurHash3", &MurmurHash3Arguments[K]{}, createMurmurHash3Function[K])
}

func createMurmurHash3Function[K any](_ ottl.FunctionContext, oArgs ottl.Arguments) (ottl.ExprFunc[K], error) {
	args, ok := oArgs.(*MurmurHash3Arguments[K])

	if !ok {
		return nil, fmt.Errorf("MurmurHash3Factory args must be of type *MurmurHash3Arguments[K]")
	}

	version := v128
	if !args.Version.IsEmpty() {
		v := args.Version.Get()

		switch v {
		case v32, v128:
			version = v
		default:
			return nil, fmt.Errorf("invalid arguments: %s. Version should be either \"32\" or \"128\"", v)
		}
	}

	return MurmurHash3HexString(args.Target, version)
}
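
The defaulting and validation logic above can be exercised in isolation. The following is a minimal sketch, not the PR's code: `resolveVersion` is a hypothetical helper, and a nil `*string` stands in for an unset `ottl.Optional[string]`.

```go
package main

import "fmt"

const (
	version32  = "32"
	version128 = "128" // default
)

// resolveVersion is a hypothetical helper mirroring the switch above.
// A nil pointer plays the role of an empty optional argument.
func resolveVersion(v *string) (string, error) {
	if v == nil {
		return version128, nil
	}
	switch *v {
	case version32, version128:
		return *v, nil
	default:
		return "", fmt.Errorf("invalid version %q: must be \"32\" or \"128\"", *v)
	}
}

func main() {
	def, _ := resolveVersion(nil)
	fmt.Println(def) // prints 128

	bad := "64"
	_, err := resolveVersion(&bad)
	fmt.Println(err != nil) // prints true
}
```

Because the check lives in the factory function, an unsupported value such as `"64"` is rejected when the function is created rather than each time a record is processed.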

// MurmurHash3HexString returns the hexadecimal representation of the hash, encoded in little-endian byte order.
// MurmurHash3, developed by Austin Appleby, is sensitive to endianness. Unlike implementations in some other
// languages (for example Python), which use little-endian on all architectures, the Go library `spaolacci/murmur3` has some open issues
Member

It might be helpful to directly reference the open issues being referenced. That way in the future we could remove this if they're ever fixed, and just to be able to quickly see the underlying issue.

// related to endianness compatibility across languages. This function ensures consistency by using
// little-endian and returns the hash value as a hexadecimal string.
func MurmurHash3HexString[K any](target ottl.StringGetter[K], version string) (ottl.ExprFunc[K], error) {
	return func(ctx context.Context, tCtx K) (any, error) {
		val, err := target.Get(ctx, tCtx)
		if err != nil {
			return nil, err
		}

		switch version {
		case v32:
			h := murmur3.Sum32([]byte(val))
			b := make([]byte, 4)
			binary.LittleEndian.PutUint32(b, h)
			return hex.EncodeToString(b), nil
		case v128:
			h1, h2 := murmur3.Sum128([]byte(val))
			b := make([]byte, 16)
			binary.LittleEndian.PutUint64(b[:8], h1)
			binary.LittleEndian.PutUint64(b[8:], h2)
			return hex.EncodeToString(b), nil
		default:
			return nil, fmt.Errorf("invalid argument: %s", version)
		}
	}, nil
}
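
To see why the byte order matters for the returned string, the sketch below hex-encodes one 32-bit value both ways. `hexLittleEndian` and `hexBigEndian` are hypothetical helper names used only here; `0x197683ce` is the integer whose little-endian bytes spell the `ce837619` string expected by the e2e test.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// hexLittleEndian encodes h least-significant byte first,
// as the 32-bit branch above does.
func hexLittleEndian(h uint32) string {
	b := make([]byte, 4)
	binary.LittleEndian.PutUint32(b, h)
	return hex.EncodeToString(b)
}

// hexBigEndian encodes h most-significant byte first, for comparison.
func hexBigEndian(h uint32) string {
	b := make([]byte, 4)
	binary.BigEndian.PutUint32(b, h)
	return hex.EncodeToString(b)
}

func main() {
	const h uint32 = 0x197683ce
	fmt.Println(hexLittleEndian(h)) // prints ce837619
	fmt.Println(hexBigEndian(h))    // prints 197683ce
}
```

The same integer produces two different strings, so a consumer comparing digests across languages needs to know which byte order the producer used.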