Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add meta_struct support on traces and spans #4334

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

vpellan
Copy link
Contributor

@vpellan vpellan commented Jan 31, 2025

What does this PR do?

This PR adds meta_struct support to traces and spans. meta_struct is a way to send complex structures to the backend. It is sent as a Map<String, ByteArray> with the ByteArray being the structure encoded in messagepack.

Motivation:

meta_struct is required for stack trace collection, which is required for exploit prevention. But this can also be used by other products.

Change log entry

None.

Additional Notes:

How to test the change?

@github-actions github-actions bot added appsec Application Security monitoring product tracing labels Jan 31, 2025
@vpellan vpellan force-pushed the vpellan/meta-struct branch from b9e8d5d to b3dc5c2 Compare January 31, 2025 12:39
@vpellan vpellan force-pushed the vpellan/meta-struct branch from b3dc5c2 to 0836e11 Compare January 31, 2025 12:48
@datadog-datadog-prod-us1
Copy link
Contributor

datadog-datadog-prod-us1 bot commented Jan 31, 2025

Datadog Report

Branch report: vpellan/meta-struct
Commit report: fcda7ca
Test service: dd-trace-rb

✅ 0 Failed, 22126 Passed, 1476 Skipped, 5m 20.88s Total Time

@vpellan vpellan removed the appsec Application Security monitoring product label Jan 31, 2025
@pr-commenter
Copy link

pr-commenter bot commented Jan 31, 2025

Benchmarks

Benchmark execution time: 2025-02-03 14:51:41

Comparing candidate commit fcda7ca in PR branch vpellan/meta-struct with baseline commit cefebdf in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 31 metrics, 2 unstable metrics.

@codecov-commenter
Copy link

codecov-commenter commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 89.58333% with 10 lines in your changes missing coverage. Please review.

Project coverage is 97.71%. Comparing base (cefebdf) to head (fcda7ca).
Report is 34 commits behind head on master.

Files with missing lines Patch % Lines
lib/datadog/tracing/span_operation.rb 20.00% 4 Missing ⚠️
lib/datadog/tracing/trace_operation.rb 63.63% 4 Missing ⚠️
lib/datadog/tracing/metadata/meta_struct.rb 84.61% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4334      +/-   ##
==========================================
- Coverage   97.72%   97.71%   -0.01%     
==========================================
  Files        1368     1370       +2     
  Lines       82998    83090      +92     
  Branches     4220     4223       +3     
==========================================
+ Hits        81108    81190      +82     
- Misses       1890     1900      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@vpellan vpellan force-pushed the vpellan/meta-struct branch from faf7819 to fcda7ca Compare February 3, 2025 14:27
@vpellan vpellan marked this pull request as ready for review February 3, 2025 14:30
@vpellan vpellan requested a review from a team as a code owner February 3, 2025 14:30
Copy link
Member

@Strech Strech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work 👏🏼

I think we can run a round of light improvements over the naming and interfaces, rest looks solid!

# @public_api
module MetaStruct
def set_meta_struct(meta_struct)
self.meta_struct.merge!(meta_struct)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name prefix set_ implies that any data prior to that call will be overwritten.

@@ -113,6 +113,9 @@ def to_msgpack(packer = nil)
packer.write(span.meta)
packer.write('metrics')
packer.write(span.metrics)
packer.write('meta_struct')
# We encapsulate the resulting msgpack in a binary msgpack
packer.write(span.meta_struct.transform_values(&:to_msgpack))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to encapsulate, then the logic of the values transformation should not be exposed, instead to_msgpack should be called on entire meta_struct

packer.write(span.meta_struct.to_msgpack)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, to_msgpack is implicitly called by packer.write, if present.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But packer.write(span.meta_struct.transform_values(&:to_msgpack)) is ok.

@@ -6,13 +6,14 @@ module Datadog
attr_accessor end_time: (Time | nil)
attr_accessor id: Integer
attr_accessor meta: Hash[String, String]
attr_accessor meta_struct: Hash[String, untyped]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be a type alias inside the MetaStruct rbs instead

@@ -23,7 +23,7 @@ module Datadog
attr_writer sampled: untyped
attr_writer service: untyped

def initialize: (?agent_sample_rate: untyped?, ?events: untyped?, ?hostname: untyped?, ?id: untyped?, ?max_length: untyped, ?name: untyped?, ?origin: untyped?, ?parent_span_id: untyped?, ?rate_limiter_rate: untyped?, ?resource: untyped?, ?rule_sample_rate: untyped?, ?sample_rate: untyped?, ?sampled: untyped?, ?sampling_priority: untyped?, ?service: untyped?, ?tags: untyped?, ?metrics: untyped?, ?remote_parent: untyped?) -> void
def initialize: (?agent_sample_rate: untyped?, ?events: untyped?, ?hostname: untyped?, ?id: untyped?, ?max_length: untyped, ?name: untyped?, ?origin: untyped?, ?parent_span_id: untyped?, ?rate_limiter_rate: untyped?, ?resource: untyped?, ?rule_sample_rate: untyped?, ?sample_rate: untyped?, ?sampled: untyped?, ?sampling_priority: untyped?, ?service: untyped?, ?tags: untyped?, ?meta_struct: untyped?, ?metrics: untyped?, ?remote_parent: untyped?) -> void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the new class you can use a type alias from MetaStruct

self.meta_struct.merge!(meta_struct)
end

def get_meta_struct(key)
Copy link
Member

@Strech Strech Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now sure about naming, but if I call get_meta_struct I would expect the whole meta-struct object in return. I think we should somehow highlight that it's a part inside the meta-struct.

Maybe names like

get_meta_struct_value
get_meta_struct_field

or something way better (see ⬇️)

module Metadata
# Adds complex structures tagging behavior through meta_struct
# @public_api
module MetaStruct
Copy link
Member

@Strech Strech Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about naming, we call Metadata as a single word, but MetaStruct as two words. Does it make sense if we call it Metastruct? We can avoid long names on reading too

get_metastruct_value("key")

Copy link
Member

@p-datadog p-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes due to mismatch between no change log entry claimed in PR description and the new API being marked public.

module Tracing
module Metadata
# Adds complex structures tagging behavior through meta_struct
# @public_api
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this module is part of the public API I would expect at least a brief description of the methods in it added as comments, otherwise it seems difficult to use and potentially users could misuse the API (which then we would need to support, since the API is public).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the PR says there is no change log entry needed, which is inconsistent with adding a public API.

# Adds complex structures tagging behavior through meta_struct
# @public_api
module MetaStruct
def set_meta_struct(meta_struct)
Copy link
Member

@marcotc marcotc Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Strech why not def meta_struct= here and def meta_struct(key) for reading?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, because it's merge eh... this is in interesting api

Copy link
Member

@marcotc marcotc Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, how is this API going to be used? By key? In batch? Is the "read" ever invoked?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants