-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch result logging to DEBUG level #417
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agreed with the performance implications of serializing, truncating and writing the output of an API e.g. TableI.slice
in the server log files especially as we are scaling up the usage of the tabular services with larger number of columns/rows.
Possibly the biggest impact of the current proposal is that all remoted methods will now log their output at the debug level. Again this might be desirable but this will modify how logs can be used during troubleshooting. Incidentally, this raises the wider question of whether similar changes would be useful in the server Java processes which also log execution results INFO level (https://github.com/ome/omero-server/blob/e2ef2b9c6c395badbfa5fb3b1d20040434151b8c/src/main/java/ome/services/util/ServiceHandler.java#L122-L134).
Keeping a parallelism with the Java processes was also going to be my primary comment. Otherwise, 👍 |
Maybe as a compromise we keep The Java logging stack is pretty good at abridging/truncating large objects and object hierarchies and avoids calling |
I'm not overly worried, but I have definitely written parsers in the past which look for |
👍 At INFO, the default level, you now get something like:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the last commit using the default properties, table logs now display as:
2024-06-26 09:29:58,447 INFO [ omero.remote] (Dummy-5 ) Meth: TableI.getNumberOfRows
2024-06-26 09:29:58,447 INFO [ omero.tables.TableI] (Dummy-5 ) Table-00955375-BC38-4FC6-B753-A130BBC6F454.getNumberOfRows() => 96
2024-06-26 09:29:58,447 INFO [ omero.remote] (Dummy-5 ) Rslt: <class 'int'>
2024-06-26 09:29:58,448 INFO [ omero.remote] (Dummy-6 ) Meth: TableI.getWhereList
2024-06-26 09:29:58,450 INFO [ omero.tables.TableI] (Dummy-6 ) Table-00955375-BC38-4FC6-B753-A130BBC6F454.getWhereList((Well==55), {}, 0, 96, 1) => size=1
2024-06-26 09:29:58,450 INFO [ omero.remote] (Dummy-6 ) Rslt: <class 'list'>
2024-06-26 09:29:58,451 INFO [ omero.remote] (Dummy-7 ) Meth: TableI.slice
2024-06-26 09:29:58,451 INFO [ omero.tables.TableI] (Dummy-7 ) Table-00955375-BC38-4FC6-B753-A130BBC6F454.slice(size=111, size=1)
2024-06-26 09:29:58,463 INFO [ omero.remote] (Dummy-7 ) Rslt: <class 'omero.grid.Data'>
After setting omero.logging.level
to 10
via configuration and restarting the server, the
2024-06-26 10:03:14,361 INFO [ omero.remote] (Dummy-5 ) Meth: TableI.getNumberOfRows
2024-06-26 10:03:14,361 INFO [ omero.tables.TableI] (Dummy-5 ) Table-7020D1AB-9AFD-40BA-8C67-581096217CAB.getNumberOfRows() => 96
2024-06-26 10:03:14,361 DEBUG [ omero.perf] (Dummy-5 ) start[1719396194361] time[0] tag[omero.tables.TableI.getNumberOfRows]
2024-06-26 10:03:14,362 DEBUG [ omero.remote] (Dummy-5 ) Rslt: 96
2024-06-26 10:03:14,363 INFO [ omero.remote] (Dummy-6 ) Meth: TableI.getWhereList
2024-06-26 10:03:14,365 INFO [ omero.tables.TableI] (Dummy-6 ) Table-7020D1AB-9AFD-40BA-8C67-581096217CAB.getWhereList((Well==54), {}, 0, 96, 1) => size=1
2024-06-26 10:03:14,365 DEBUG [ omero.perf] (Dummy-6 ) start[1719396194363] time[1] tag[omero.tables.TableI.getWhereList]
2024-06-26 10:03:14,365 DEBUG [ omero.remote] (Dummy-6 ) Rslt: [53]
2024-06-26 10:03:14,367 INFO [ omero.remote] (Dummy-7 ) Meth: TableI.slice
2024-06-26 10:03:14,367 INFO [ omero.tables.TableI] (Dummy-7 ) Table-7020D1AB-9AFD-40BA-8C67-581096217CAB.slice(size=111, size=1)
2024-06-26 10:03:14,379 DEBUG [ omero.perf] (Dummy-7 ) start[1719396194367] time[12] tag[omero.tables.TableI.slice]
2024-06-26 10:03:14,379 DEBUG [ omero.remote] (Dummy-7 ) Rslt: object #0 (::omero::grid::Data)
{
lastModification = 1719396194283
rowNumbers =
{
[0] = 53
}
If there was a desire to go further as suggested in #417 (comment), I suspect a getResultsString
logic could be introduced similar to what exists in Java. From my side, what is proposed here immediately addressing the immediate performance issue while retaining the current logging structure by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having read the discussion above, I don't have anything else to add. Looks good.
Converting every remoted result value to a string can be an incredibly expensive operation. This is especially true if hundreds of thousands of table cells, along with their row indices are passing through the OMERO.tables API. In such a scenario the
omero.grid.Data
object will be fully converted to a string and then the first 120 characters logged which can take longer than the query, slice or read operation itself.Rather than remove the functionality entirely, which might be justified (the first 120 characters is rarely that useful), I've opted to raise the level to debug. This impacts OMERO.tables and the script Processor logging.
The change improves performance and memory usage when retrieving even moderate amounts of data quite substantially. We could look to adjusting OMERO.web's
MAX_TABLE_DOWNLOAD_ROWS
up accordingly./cc @erindiel, @kkoz, @DavidStirling, @emilroz