feat: add dependency processor using Apache Beam #6560
Signed-off-by: yunmaoQu <[email protected]>
af5f794 to 60fb334
plugin/storage/memory/memory.go
perTenant: make(map[string]*Tenant),
defaultConfig: cfg,
perTenant: make(map[string]*Tenant),
useNewDependencies: false, // 添加初始化 (add initialization)
please use English in comments
ok
- where is it hooked up to anything?
- what would be the e2e testing for this component?
@yurishkuro I have fixed it
Signed-off-by: yunmaoQu <[email protected]>
@mahadzaryab1 interesting direction here
type Config struct {
	AggregationInterval time.Duration `yaml:"aggregation_interval"`
	InactivityTimeout time.Duration `yaml:"inactivity_timeout"`
	Store *memory.Store `yaml:"-"`
Storage is not part of the config. The processor itself can retrieve storage from storage extension - see adaptive sampling processor for example.
require.NoError(t, err)

// Wait for the processor to process the trace
time.Sleep(cfg.AggregationInterval + 100*time.Millisecond)
We don't use sleep in tests; use assert.Eventually instead.
func DefaultConfig() Config {
	return Config{
		AggregationInterval: 5 * time.Second, // Default dependency aggregation interval: 5 seconds
This is way too frequent. We used to flush every 15min at Uber, since the topology does not change that often, but I can see a need for more frequent flushing if we want to capture movement of metrics.
Suggested change:
- AggregationInterval: 5 * time.Second, // Default dependency aggregation interval: 5 seconds
+ AggregationInterval: 10 * time.Minute,
The comment is redundant.
It would be nice to flush every minute to support investigating metrics movement, but it will create too much data. We may have to consider integrating with Prometheus instead, and only focus on capturing topology here, which definitely does not need 1min frequency.
Signed-off-by: yunmaoQu <[email protected]>
@yurishkuro Except for this, I have updated everything based on your review.
config *Config
aggregator *dependencyAggregator // Define the aggregator below.
telset component.TelemetrySettings
dependencyWriter *memory.Store
as I mentioned, you cannot have concrete store dependency here. The processor needs to work with any storage supported by Jaeger, as long as they implement WriteDependencies.
Example:
f, err := jaegerstorage.GetStorageFactory(storageName, host)
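The point about avoiding a concrete store can be sketched as follows. Everything here is a simplified assumption made for illustration: `DependencyLink`, `DependencyWriter`, `memoryWriter`, and `flush` are hypothetical names, not Jaeger's actual types, and the real processor would obtain its writer from the storage extension rather than constructing one directly.

```go
package main

import (
	"fmt"
	"time"
)

// DependencyLink is a simplified, hypothetical stand-in for a dependency link.
type DependencyLink struct {
	Parent, Child string
	CallCount     uint64
}

// DependencyWriter is the only capability the processor needs from storage.
type DependencyWriter interface {
	WriteDependencies(ts time.Time, links []DependencyLink) error
}

// memoryWriter is one possible backend; any type with the same method works.
type memoryWriter struct{ links []DependencyLink }

func (m *memoryWriter) WriteDependencies(_ time.Time, links []DependencyLink) error {
	m.links = append(m.links, links...)
	return nil
}

// dependencyProcessor holds the interface, not a concrete store type.
type dependencyProcessor struct{ writer DependencyWriter }

// flush hands the accumulated links to whichever backend was injected.
func (p *dependencyProcessor) flush(links []DependencyLink) error {
	return p.writer.WriteDependencies(time.Now(), links)
}

func main() {
	w := &memoryWriter{}
	p := &dependencyProcessor{writer: w}
	links := []DependencyLink{{Parent: "frontend", Child: "cart", CallCount: 3}}
	if err := p.flush(links); err != nil {
		fmt.Println("flush failed:", err)
		return
	}
	fmt.Println("stored links:", len(w.links))
}
```

With this shape, swapping the in-memory backend for another storage only changes what is injected, not the processor code.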
func (tp *dependencyProcessor) Shutdown(ctx context.Context) error {
	close(tp.closeChan)
	if tp.aggregator != nil {
		if err := tp.aggregator.Close(); err != nil {
if aggregator has a Close() function why does it need to be passed closeChan?
type Config struct {
	// AggregationInterval defines how often the processor aggregates dependencies.
	// This controls the frequency of flushing dependency data to storage.
	// Default dependency aggregation interval: 10 seconds
no need to mention default, it's defined elsewhere and we don't have to keep two places in sync.
// AggregationInterval defines how often the processor aggregates dependencies.
// This controls the frequency of flushing dependency data to storage.
Suggested change:
- // AggregationInterval defines how often the processor aggregates dependencies.
- // This controls the frequency of flushing dependency data to storage.
+ // AggregationInterval defines the length of aggregation window after
+ // which the accumulated dependencies are flushed into storage.
cfg.AggregationInterval = 1 * time.Second
Suggested change:
- cfg.AggregationInterval = 1 * time.Second
+ cfg.AggregationInterval = 1 * time.Second
	// is considered complete and ready for dependency aggregation.
	// Default trace completion timeout: 2 seconds of inactivity
	InactivityTimeout time.Duration `yaml:"inactivity_timeout"`
}
please add Validate method and use valid: notations in the field tags.
ok
"github.com/jaegertracing/jaeger/plugin/storage/memory"
)

type dependencyAggregator struct {
there is a lot of code here. How can it be unit tested with beam?
ok
AggregationInterval: 10 * time.Second,
InactivityTimeout: 2 * time.Second,
Suggested change:
- AggregationInterval: 10 * time.Second,
- InactivityTimeout: 2 * time.Second,
+ AggregationInterval: 10 * time.Minute,
+ InactivityTimeout: 2 * time.Minute,
}

func (dp *dependencyProcessor) ConsumeTraces(ctx context.Context, td ptrace.Traces) error {
	batches := v1adapter.ProtoFromTraces(td)
why do we need to fall back to the old Jaeger model? This is new code, let's keep it in the ptrace domain.
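The aggregation itself needs very little from each span: its ID, its parent's ID, and the service that emitted it. The sketch below shows that logic on a hypothetical flattened `span` struct, which stands in for values that would be read directly from `ptrace` spans and their resource attributes. Skipping same-service calls is an assumption of this example, not something stated in the PR.

```go
package main

import "fmt"

// span is a hypothetical, flattened stand-in for the few fields the
// aggregation needs from a ptrace span and its resource.
type span struct {
	SpanID       string
	ParentSpanID string
	Service      string
}

// edge identifies a caller -> callee relationship between two services.
type edge struct{ Parent, Child string }

// dependencyCounts counts cross-service parent -> child calls within one trace.
func dependencyCounts(spans []span) map[edge]uint64 {
	serviceBySpan := make(map[string]string, len(spans))
	for _, s := range spans {
		serviceBySpan[s.SpanID] = s.Service
	}
	counts := make(map[edge]uint64)
	for _, s := range spans {
		parentSvc, ok := serviceBySpan[s.ParentSpanID]
		if !ok || parentSvc == s.Service {
			continue // root span, missing parent, or same-service call
		}
		counts[edge{Parent: parentSvc, Child: s.Service}]++
	}
	return counts
}

func main() {
	trace := []span{
		{SpanID: "1", Service: "frontend"},
		{SpanID: "2", ParentSpanID: "1", Service: "cart"},
		{SpanID: "3", ParentSpanID: "2", Service: "db"},
	}
	for e, n := range dependencyCounts(trace) {
		fmt.Printf("%s -> %s: %d\n", e.Parent, e.Child, n)
	}
}
```

Because only these three values are used, the same loop could operate on `ptrace` data without converting to the older model first.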
}

func newDependencyAggregator(cfg Config, telset component.TelemetrySettings, dependencyWriter *memory.Store) *dependencyAggregator {
	beam.Init()
can this be called more than once?
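If repeated initialization turns out to be a problem, one common guard is `sync.Once`. The sketch below uses a hypothetical `initRuntime` placeholder instead of the real `beam.Init()`, because whether that call tolerates being invoked more than once should be confirmed against the Beam Go SDK documentation. The constructor signature is also simplified for the example.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	initOnce  sync.Once
	initCalls int // counts how often the initializer actually ran
)

// initRuntime is a hypothetical placeholder for a process-wide initializer
// such as the beam.Init() call in the constructor above.
func initRuntime() { initCalls++ }

// ensureInit runs initRuntime at most once per process, no matter how many
// aggregators are constructed.
func ensureInit() { initOnce.Do(initRuntime) }

type dependencyAggregator struct{}

func newDependencyAggregator() *dependencyAggregator {
	ensureInit()
	return &dependencyAggregator{}
}

func main() {
	for i := 0; i < 3; i++ {
		_ = newDependencyAggregator()
	}
	fmt.Println("initializer ran", initCalls, "time(s)")
}
```

This matters in practice because tests and multi-pipeline configurations may construct the component several times in one process.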
"github.com/jaegertracing/jaeger/plugin/storage/memory"
)

func TestDependencyProcessorEndToEnd(t *testing.T) {
note: this is a unit test, not an e2e test. By e2e I mean running the new processor inside a real Jaeger binary and validating the correct processing through the API calls. See cmd/jaeger/internal/integration/.
thanks a lot. I will fix it
Which problem is this PR solving?
Resolves #5911
Description of the changes
How was this change tested?
Checklist
- for jaeger: make lint test
- for jaeger-ui: npm run lint and npm run test