A few days ago, Microsoft announced that it has released the source code of its tool "GCToolkit", a set of libraries for parsing Java garbage collection (GC) log files. All of the GCToolkit code is available on GitHub under the MIT license.
GCToolkit consists of three Java modules: an API, the GC log file parsers, and a message backplane based on Vert.x, a toolkit for building reactive applications on the JVM. With this utility, users can create arbitrary and complex analyses of the state of managed memory in the JVM.
As the name suggests, GCToolkit parses Java GC log files into discrete events. It also exposes an API for working with the toolkit and for aggregating the resulting data.
According to the team, the API is the user's entry point into GCToolkit and hides the details of the internal modules behind a few method calls. In addition to the API, there are two other modules: the parser module and the Vert.x module. The parser module is built from a collection of regular expressions and supporting code, and the team describes it as arguably the most robust GC log parser available.
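To illustrate what regular-expression-based parsing of a GC log line can look like, here is a minimal sketch in plain Java (16 or later, because it uses records). It is not GCToolkit code: the class PauseLineParser, the record PauseEvent, and the pattern itself are hypothetical names invented for this example, and the sample line is only an illustration of the shape produced by unified JVM logging. A real parser has to handle many more line formats than this one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of GCToolkit.
final class PauseLineParser {

    // Hypothetical event type holding the fields extracted from one pause line.
    record PauseEvent(double uptimeSeconds, String label,
                      int heapBeforeMb, int heapAfterMb, int heapSizeMb,
                      double durationMs) {}

    // Matches lines of the shape:
    // [0.524s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 24M->4M(256M) 5.123ms
    private static final Pattern PAUSE = Pattern.compile(
            "^\\[(\\d+\\.\\d+)s\\].*?GC\\(\\d+\\) (Pause .+?) "
            + "(\\d+)M->(\\d+)M\\((\\d+)M\\) (\\d+\\.\\d+)ms$");

    // Returns an event if the line is a pause record, otherwise an empty Optional.
    static Optional<PauseEvent> parse(String line) {
        Matcher m = PAUSE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new PauseEvent(
                Double.parseDouble(m.group(1)),
                m.group(2),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)),
                Double.parseDouble(m.group(6))));
    }

    public static void main(String[] args) {
        String line = "[0.524s][info][gc] GC(0) Pause Young (Normal) "
                + "(G1 Evacuation Pause) 24M->4M(256M) 5.123ms";
        System.out.println(parse(line));
        System.out.println(parse("[0.530s][info][gc] Using G1"));
    }
}
```

The point of the sketch is the conversion step: one line of text goes in, and either a typed event or nothing comes out, which is what lets later stages work with events rather than strings.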
The messaging backplane based on Vert.x uses two message buses. The first carries data from a data source; in the current implementation, it carries log lines from the GC log file. The consumers of this bus are the parsers, which convert the data from the data source into events that represent a GC cycle or a safepoint. These events are published on the second message bus, the event bus. Subscribers to the event bus are then notified and can process the events that interest them.
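The two-bus flow described above can be sketched with a tiny in-memory publish/subscribe class. This is only an illustration of the data flow, written in plain Java (16 or later) without Vert.x: the Bus class, the GcEvent record, and the substring checks that stand in for a parser are all assumptions made for this example and do not reflect how GCToolkit is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the two-bus design; not GCToolkit or Vert.x code.
final class TwoBusSketch {

    // Minimal synchronous publish/subscribe channel standing in for a message bus.
    static final class Bus<T> {
        private final List<Consumer<T>> subscribers = new ArrayList<>();

        void subscribe(Consumer<T> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(T message) {
            subscribers.forEach(s -> s.accept(message));
        }
    }

    // Hypothetical event type: the kind of event plus the line it came from.
    record GcEvent(String kind, String rawLine) {}

    // Pushes log lines through both buses and returns the pause events
    // that the final subscriber chose to keep.
    static List<GcEvent> run(List<String> logLines) {
        Bus<String> dataSourceBus = new Bus<>(); // first bus: raw log lines
        Bus<GcEvent> eventBus = new Bus<>();     // second bus: parsed events
        List<GcEvent> received = new ArrayList<>();

        // "Parser": consumes raw lines and publishes events on the event bus.
        dataSourceBus.subscribe(line -> {
            if (line.contains("Pause")) {
                eventBus.publish(new GcEvent("pause", line));
            } else if (line.contains("Safepoint")) {
                eventBus.publish(new GcEvent("safepoint", line));
            }
        });

        // Subscriber that is only interested in pause events.
        eventBus.subscribe(event -> {
            if (event.kind().equals("pause")) {
                received.add(event);
            }
        });

        // The data source feeds lines into the first bus.
        logLines.forEach(dataSourceBus::publish);
        return received;
    }

    public static void main(String[] args) {
        List<GcEvent> pauses = run(List.of(
                "GC(0) Pause Young (Normal) 24M->4M(256M) 5.123ms",
                "GC(1) Concurrent Mark Cycle",
                "Safepoint \"G1CollectForAllocation\""));
        pauses.forEach(System.out::println);
    }
}
```

Separating the two buses keeps the data source, the parsers, and the event consumers independent of one another: a new data source or a new subscriber can be added without touching the other stages.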
The parser emits discrete JVM events, so users can write code to capture and analyze the data these events carry. To make capturing and analyzing data from GC log files easier, GCToolkit provides a simple aggregation framework. Which data to capture and which analysis to perform is left to the user. For example, to analyze heap occupancy from pause events, an aggregator captures each event, extracts the relevant data, and passes that data to an aggregation.
The aggregation combines the data into a meaningful analysis, for example total heap occupancy after garbage collection. The resulting data can be presented as a graph, a table, or another user-friendly format. More importantly, according to the team, a suboptimal collector configuration results in an application that needs more CPU and memory while degrading the end-user experience. In other words, a poorly tuned collector often means higher runtime costs and dissatisfied users.
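The split between an aggregator, which captures events and extracts data, and an aggregation, which collates that data, can be sketched as follows. This is a plain Java (16 or later) illustration under stated assumptions, not GCToolkit's aggregation framework: PauseEvent, HeapOccupancyAggregator, HeapOccupancyAggregation, and all method names are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the aggregator/aggregation roles; not GCToolkit code.
final class HeapAfterGcSketch {

    // Hypothetical pause event: when it happened and heap occupancy afterwards.
    record PauseEvent(double uptimeSeconds, long heapAfterKb) {}

    // Aggregation: stores the extracted data points and summarizes them.
    static final class HeapOccupancyAggregation {
        private final List<Long> samplesKb = new ArrayList<>();

        void addDataPoint(double uptimeSeconds, long heapAfterKb) {
            // The timestamp is unused here; a chart would plot it on the x-axis.
            samplesKb.add(heapAfterKb);
        }

        int sampleCount() {
            return samplesKb.size();
        }

        long maxHeapAfterGcKb() {
            return samplesKb.stream().mapToLong(Long::longValue).max().orElse(0L);
        }

        double averageHeapAfterGcKb() {
            return samplesKb.stream().mapToLong(Long::longValue).average().orElse(0.0);
        }
    }

    // Aggregator: receives events, extracts the relevant fields, forwards them.
    static final class HeapOccupancyAggregator {
        private final HeapOccupancyAggregation aggregation;

        HeapOccupancyAggregator(HeapOccupancyAggregation aggregation) {
            this.aggregation = aggregation;
        }

        void onPause(PauseEvent event) {
            aggregation.addDataPoint(event.uptimeSeconds(), event.heapAfterKb());
        }
    }

    public static void main(String[] args) {
        HeapOccupancyAggregation aggregation = new HeapOccupancyAggregation();
        HeapOccupancyAggregator aggregator = new HeapOccupancyAggregator(aggregation);

        aggregator.onPause(new PauseEvent(0.5, 4096));
        aggregator.onPause(new PauseEvent(1.2, 6144));
        aggregator.onPause(new PauseEvent(2.0, 5120));

        System.out.println("samples = " + aggregation.sampleCount());
        System.out.println("max KB  = " + aggregation.maxHeapAfterGcKb());
        System.out.println("avg KB  = " + aggregation.averageHeapAfterGcKb());
    }
}
```

Keeping extraction and collation in separate classes means the same aggregation could feed a table, a chart, or a plain text report without changing how events are captured.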
As Microsoft's interest in the Java platform grows, its focus on open source is also bringing more benefits to the Java community. After making significant contributions to the OpenJDK ports for macOS on Apple M1 and for Windows on Arm, Microsoft reaffirmed its commitment to OpenJDK by introducing its own build of OpenJDK and joining the Eclipse Adoptium working group (formerly known as AdoptOpenJDK).
By making GCToolkit open source, Microsoft aims to provide better insight into how the JVM handles garbage collection and memory allocation. Better visibility allows better configuration, which benefits both the end users of an application and the technical staff responsible for running it.
The simple API and easy-to-use output mechanisms promise to make reading GC logs easier by providing several ways to analyze, extract, and visualize the data.