Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra attributes for log_msg #2595

Closed
wants to merge 41 commits into from
Closed

Conversation

bachittle
Copy link
Contributor

@bachittle bachittle commented Jan 4, 2023

This addresses the issue: #617. It is experimental and should be more tested and reviewed before merging into the main branch. Perhaps it could be a separate branch in your repository, or part of v2.x?

spdlog already has a set amount of pre-defined attributes built into the log msg struct, such as the level, time, logger name, thread id, etc. This code simply adds additional parameters that can be passed to a log message that will be appended to the end of the message.

Checklist:

  • attributes object built into log_msg
  • endpoint via custom logger methods (push_context, pop_context, clear_context)
  • printing the attributes is possible
  • custom patterns can print attributes in whatever manner the user chooses
  • whitespace scrambling of keys and values
  • special restriction on keys to prevent whitespace delimeter (perhaps convert to underscore?)
  • an API that is thread-safe and exception-safe implicitly for users.

using the API:
Refer to example.cpp and its new attribute_example() for a concrete example of attributes in action.

@bachittle
Copy link
Contributor Author

build is failing due to lack of C++17 support, I think due to my custom string_view usage. Will try integrating with the built-in string_view_t and see if I can revert back to having C++11 support

@gabime
Copy link
Owner

gabime commented Jan 5, 2023

Yes, I was wondering why the 17 requirement.

@bachittle
Copy link
Contributor Author

I was wondering, for further comment, how we should go about printing key-value pairs?

Currently I am using a logfmt-esque approach as the default, but perhaps the user can specify in the pattern matcher, like for example if they would like to do JSON instead to print pairs in the form of: {"key": "value"}. Either a new pattern would have to be set (logger->set_attr_pattern()), or it could be appended to the end of any given pattern (something like "{ ..., \"%k\": \"%v\", }") but then I'm not sure how the pattern matcher will know what patterns to repeat. I personally think the first option is more feasible.

* init fixed attribute, currently having issues with the factory

* using the much simpler api already defined: no factory, just define the sinks yourself

* async logger fixed attribute support

* fixed missing fixed attributes in default log calls

* forgot some spots, all cases were found?

* potential fix for illegal vector iterators

* kept up to date with attributes branch

* cleaner fixed logger api

* less function overloads, just add and remove context when needed

* fixed example to use new api

* added append function for nested context

* pushing and popping contexts for nested loggers

* fixed error due to implicit conversion

* fixed sign conversion error

* better attribute example with optional json example

Co-authored-by: Bailey Chittle <[email protected]>
@bachittle
Copy link
Contributor Author

I made a attribute stack of sorts so that context can be pushed/pulled when required. This can allow nested context, which would be nice for something like a stack trace. The API is as follows:
logger->push_context(): pushes an attribute_list onto the stack
logger->pop_context(): pops the last attribute_list added to the stack
logger->clear_context(): clears the entire stack and essentially resets logger to default

@gabime
Copy link
Owner

gabime commented Jan 12, 2023

Sounds good👍

@lowdesertpunk
Copy link
Contributor

I think having attributes on logger level or on log message level support two different uses cases. The first case is useful for persistent contextual attributes, while the second when doing structured logging in every log message. In both cases the attributes should reach sinks separately from the actual payload to allow converting to JSON, or supplying them to a a different system where attributes are first-class citizens (mostly what the original issue #617 is about).

Contextual attributes

// done once
logger->push_context({"user_id", get_user_id()});

// later
logger->info(msg1);
// ...
logger->info(msg2);

Structured messages

logger->info("backup done", {{"src", backup_path}, {"count", num_backup_items}});

I also did a change (living in a fork) which support the second case. Attributes and attribute names are living on the stack (via initializer_list / span and string_view, while attribute values stored in strings created by fmt::to_string to allow using user defined types. Also planning to extend attribute values to a variant supporting a few basic types so that sinks can handle them appropriately. In case of async logging log_msg_buffer copies stack allocated object as necessary.

Does this sound like something which would be useful to merge upstream? Happy to open a PR later on, but currently there are C++17 dependencies (span and variant) which need to be resolved.

@lowdesertpunk
Copy link
Contributor

Hmm, I somehow missed the for of @bobhansen in the original issue which does something very similar as indicated, probably making my changes obsolete.

@bachittle
Copy link
Contributor Author

bachittle commented Jan 13, 2023

I originally had overloads as well, but I think the main issue with overloading is just how many spots would have to be overloaded in order to get this to work. This is because there are a lot of helper functions (ex: info(), warn(), etc.), that would all need an extra overload just to pass attributes for the desired use case.

The contextual logger makes things simple, and even possible to just do one message at a time. For example:

logger->push_context({"user_id", get_user_id()});
logger->info(msg1);
logger->pop_context();

Sure, it's a few extra lines, but lessens the amount of overloads to account for every use case. In my opinion it is better to have a minimum viable product which could work for every use case, such as this, and add the overloads if there is a demand, since those are more so optional sugar.

bachittle and others added 4 commits January 13, 2023 11:27
* initial testing of attributes

* updating test cases for new api

* more testing, fixing bugs found

* fixed unused error

* errors might be caused due to passing pointers by reference, these simple pointer passes should be by value

* my only other idea for getting this to work. will experiment

* fixing tests before merge (#4)

* my only other idea for getting this to work. will experiment

* I think I fixed the thread issue, lets see...

* unused lambda capture acting funny

* see if its the context that is causing issues

* unused warning... again

* reverting to basic, see where the issue is in github actions (testing most of it locally)

* this works again?

* some diff cleanup before squash merging

* cleaner search for stop iterator
@lowdesertpunk
Copy link
Contributor

The contextual logger makes things simple, and even possible to just do one message at a time.

Possible, although in the current form it's neither exception, nor thread-safe. Thread safety could be solved by thread local variables, but exception safety would need some interface change.

@bachittle
Copy link
Contributor Author

Could you elaborate as to how it's not exception safe? Perhaps with an example? I do understand the thread-local issues and my test cases deal with this accordingly, using thread-local variables as you suggested.

There is a benefit to calling log methods with overloads as that would indeed be thread-safe, it could perhaps be an addition to the codebase as a new pull request, but they both require the same root features I have implemented, just one requires more overloads and is a bit more of a hassle to test. I am also experimenting with using RAII patterns to generate logger contexts as that might have a better chance of being thread-safe.

@bachittle
Copy link
Contributor Author

Maybe I will restrict the library user from using contexts unless they do so in a thread-safe manner by making the public API only for instantiating new logger objects, although this will break some other functionalities that involve using a logger in a global context, like backtracing.

@lowdesertpunk
Copy link
Contributor

lowdesertpunk commented Jan 25, 2023

Could you elaborate as to how it's not exception safe? Perhaps with an example?

logger->push_context({"user_id", get_user_id()});
logger->info("some other formatted message {}", map.at(key));
logger->pop_context();

If key is non-existent the context, meant for a single message, will remain active for further logs. Handling contexts via RAII objects as you indicated would be a possible solution.

@bachittle
Copy link
Contributor Author

bachittle commented Jan 25, 2023

I guess depending on the use case someone might be happy to have added context to their error messages, but that seems like a feature that arises from a bug lol. There are a few possible API solutions:

1: nested logger objects

{
auto lg = logger->clone("nested_logger");
lg->push_context({{"user_id", get_user_id()}});
lg>log("message {}", map.at(key));
}

again, a similar solution for the thread-safe option, nested logger objects will follow RAII and on exception will be cleaned up. Loses the global logger context and some features associated with that.

2: logger lambda

logger->context({{"user_id", get_user_id()}}, [&](){
  logger->log("message {}", map.at(key))
});

the lambda would allow for extra code to be appended at the beginning and end (like push/pop).

3: context lifetime object

{
auto ctx = logger::context({{"user_id", get_user_id()}});
logger::info("has context");
}
logger::info("doesn't have context");

the destructor for ctx will pop context.

Either way the easiest way for this to work is to use thread-local logger objects. Both 2 and 3 still suffer from thread-safety due to sharing data. If a logger is shared amongst threads then it needs to do something else, so overloads might be needed for that use case.

@pparuzel
Copy link

Does this allow for thread-local context and if so, does the thread-local context work with asynchronous logging?

@bobhansen
Copy link

bobhansen commented May 31, 2023

3: context lifetime object

{
auto ctx = logger::context({{"user_id", get_user_id()}});
logger::info("has context");
}
logger::info("doesn't have context");

the destructor for ctx will pop context.

We used this solution to very good effect. It is can be exception-safe, prevents users from forgetting to pop context, and is fairly ergonomic.

We solved the thread-safety in https://github.com/bobhansen/structured_spdlog by having the contexts stored in a thread-local linked-list of shared_ptr objects (heap allocations! booo!). This was a performance trade; in the first implementation, we had to continuously copy the thread-locals that were at the top of the stack into each async logging message.

We also ran into the need for an API to snapshot contexts and share them between threads. e.g. taking the current thread context and pushing into a thread pool to go with some execution, while being able to preserve the context of the request.

@falbrechtskirchinger
Copy link

We solved the thread-safety in https://github.com/bobhansen/structured_spdlog by having the contexts stored in a thread-local linked-list of shared_ptr objects (heap allocations! booo!). This was a performance trade; in the first implementation, we had to continuously copy the thread-locals that were at the top of the stack into each async logging message.

I'm in the planning stages of what sounds like a re-implementation of what you've done or maybe a hybrid of yours and this PR.

The shared_ptr covers the case of the thread terminating before the message is logged, right? I had resigned myself to always copying…

@bachittle
Copy link
Contributor Author

closing this as the PR is stale, and it seems like this MDC PR does something very similar to what I was attempting to accomplish: #2907

@bachittle bachittle closed this Apr 26, 2024
@gabime
Copy link
Owner

gabime commented Apr 26, 2024

@bachittle Thanks for the effort anyway. The new mdc impl was chosen because it is super simple and doesnt require any breaking changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants