TL;DR An implementation of logging as a whole product concern in Spring Boot and Logback.
Logging as a First Class Citizen describes what I think modern applications/services should be logging and why. Now, let's look at “how”. I struggled with this for a while with Log4j, writing custom encoders to add key-value pairs to log entries and aspects to extract API access event information - tedious and ugly.
I switched from Splunk to ELK and was introduced to Logstash. Most of what I read was about cleaning up log entries with Logstash filters and routing versatility with plugins. What I really wanted to know was what ELK preferred for efficient ingestion and Kibana queries. While searching for logging guidance and poring over the documentation, I found that Logback and the logstash-logback-encoder solved all of my previous challenges while replacing custom code with configuration.
The logstash-logback-encoder README.md is a great resource for setup and evolution of your logging scheme. It took me a couple of days of joining information from different sites and experimenting to get what I wanted. What I offer here is an opinionated solution and why each element of the strategy is important. You could probably just start with this setup and then evolve it to suit your situation.
While the implementation is Spring Boot specific, the ideas readily apply to other situations.
NOTE Code and log examples below are from this implementation.
Logback is the most versatile, configurable, and easily extensible logging system I have seen. It allows great control over your log entry format, and specifically addresses our two logging concerns: issue remediation and API historical trending. LoggingEvents provide the narrative messages that tell us what is going on in our code. AccessEvents provide rich detail about external callers and provide for historical trending.
The logstash-logback-encoder renders log entries that are most easily and efficiently ingested by the ELK stack. It produces a JSON log entry stream (as most logging systems will) and allows simple key-value additions to the log entry, which is critical to this scheme.
I have used Log4j in the past and ended up having to write aspects to capture API call entries and exits for call count, duration, and status. Using Log4j's JSON output would have required me to write things like logstash-logback-encoder's StructuredArguments and Markers myself. With Logback, I get that for free at the cost of a little configuration research.
Application logging is best suited to supporting issue remediation and API call metrics and their trending.
Logging can provide narrative entries that tell you what happened and why; they allow you to trace call execution in diary form. These log entries describe every milestone in the call and each significant branching decision. As an example:
Contact change requested for bob.martin
Account bmc.helpdesk is authorized to make this change
Contact change request is valid
Contact change request is different than existing contact record
Contact record updated
Contact email has changed, sending email address verification to bob@bobmartin.com
Successful call requests are boring, but you get the idea. If the user submits a ticket about his contact details, you can see that they were properly updated and he should have a verification email in his inbox.
The API call metrics call count, duration, and response status can provide very basic, but essential, context around how your application fares in production.
Your base statistics are from production, but, as the wise man says: If you never test in development then everything in production is a test. These production stats become the baseline for your load testing and these metrics also provide a way to measure your load testing. All of this depends on your final product: Kibana dashboards built on top of these metrics.
TIP: When you need deeper insight into what pieces of code are causing your performance problems, Pivotal's tc Server Insight plugin is in a class by itself. It is a drop-in replacement for Apache Tomcat and can be built into Spring Boot. The Insight plugin is exposed through a webapp whose main dashboard highlights performance problems. You can focus on a single, poorly performing API call and drill down into all internal calls made to accomplish the API task. You can quickly identify the problem source and frequently have enough context to understand the root problem. The license agreement is free for development use.
This logging implementation is a sort of middleware for the final product: Kibana searches and queries. Optimizing for efficient log entry ingestion and queries requires a machine-optimized format. But that final product should not get in the way of development or make developers' lives tedious by forcing them to struggle through reading that machine-optimized format.
There are three basic use cases:
Unit tests will use a /src/test/resources/logback-test.xml if provided. We can configure a simple STDOUT appender optimized for human reading (a sketch follows below).
During local development, we can use the same human-readable unit test STDOUT appender configuration in /src/main/resources/logback.xml. Developers can manually enable/disable it or, depending on your deployment configuration, have it automatically enabled for the developer's environment. You can also provide a file appender that uses the same encoding scheme that deployed code uses so developers can verify that what they want sent to ELK is actually written.
Deployed code will use a separate appender that encodes log entries tailored to log ingestion and searching.
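For the human-readable case, here is a minimal sketch of what /src/test/resources/logback-test.xml might look like; the pattern is illustrative, not prescriptive - tune it to taste.

<!-- A plain console appender for unit tests and local development -->
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} %-5level [%thread] %logger{32} - %msg%n</pattern>
        </encoder>
    </appender>
    <root level="debug">
        <appender-ref ref="STDOUT" />
    </root>
</configuration>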
Frequently, developers force the logging system to adapt to their old habits - ingesting plain text as files, inefficiently parsing it, and not providing parsable fields. This simply doesn't scale.
When we think of logging as a product with ELK as a consumer, and operations and product management as the end user, this all changes. We can optimize log format and delivery and significantly reduce each of these problems.
Our delivery system will use something like Logstash or Fluentd to route from our application/service to the logging pipeline. That pipeline begins with intermediate storage like Redis or Kafka so log messages are not dropped should the backend be overloaded or down. Then log entries are moved to the log management system’s durable storage.
Our logging format changes from plain text to a regular set of key-value pairs: JSON. In addition, everything that we want to search on, especially large metrics-type searches, must be in a key-value pair. The basic Logback configuration, like that of all other logging frameworks, forces you to put all of your parameters in the log message field. This means the log management system must regex the message field, and the typical inconsistency in log messages makes that practically impossible. logstash-logback-encoder's StructuredArguments let you append key-value pairs to the log entry, effectively turning a message field regex into a select by column. It also helps with consistency, especially if you use a common dictionary of key names containing one definition of userId.
StructuredArguments provide one more tremendous benefit: recording complete method arguments. When a parameter is encoded as a StructuredArgument, the object is serialized as JSON, including all arrays and parent classes. When you have a production issue, you can find the log entry and paste the entire request body into something like Postman to recreate the call. Without this feature, recreating the problem is a significant time hole.
When we can separate request variables and narrative entries into their own fields, we can really clean up the narrative entries. A couple of Kibana clicks will show only those narrative entries, providing the diary view shown above.
Enough concept and motivation, let’s get to the implementation.
ref: HTTP-access logs with logback-access, Jetty and Tomcat
LoggingEvents are used with traditional application logging calls like log.info(). They directly address our need for issue remediation via narrative log entries and API call parameters. The available field Providers are listed here.
AccessEvents are focused on the area between HTTP request receipt and application code delivery. They provide fields like HTTP method, URI, status code, and elapsed time, and directly address our need for historical trending. The complete list of AccessEvent field Providers is listed here.
ref: logstash-logback-encoder README.md
Logback 1.1.6 adds support for easily reading logback-access.xml from your jar. Below we include logback-classic, logback-core, and logback-access, overriding the Logback dependencies included by spring-boot-starter-web and logstash-logback-encoder. Note that Logback also requires slf4j, which is already included in Spring Boot.
<!-- Logging -->
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>4.6</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.1.7</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-core</artifactId>
    <version>1.1.7</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-access</artifactId>
    <version>1.1.7</version>
</dependency>
Since Tomcat is embedded, we need to programmatically inject a Valve to send access events to Logback. Our config file src/main/resources/logback-access.xml will be automatically loaded from the jar through the class loader.
@Configuration
public class LogbackAccessEventConfiguration {

    @Bean
    public EmbeddedServletContainerCustomizer containerCustomizer() {
        return new EmbeddedServletContainerCustomizer() {
            @Override
            public void customize(ConfigurableEmbeddedServletContainer container) {
                if (container instanceof TomcatEmbeddedServletContainerFactory) {
                    ((TomcatEmbeddedServletContainerFactory) container)
                            .addContextCustomizers(new TomcatContextCustomizer() {
                                @Override
                                public void customize(Context context) {
                                    LogbackValve logbackValve = new LogbackValve();
                                    logbackValve.setFilename("logback-access.xml");
                                    context.getPipeline().addValve(logbackValve);
                                }
                            });
                }
            }
        };
    }
}
NOTE: If you are using a dedicated Tomcat/Jetty to host your app, your logback-access.xml goes in $TOMCAT_HOME/conf/ (where server.xml is).
ref: Providers for LoggingEvents
Our LoggingEvent configuration uses the LoggingEventCompositeJsonEncoder to allow StructuredArguments and Markers to insert key-value pairs into the log entry; this de-clutters our narrative message.
We will rename the log entry fields to have a log prefix to avoid key name collisions with fields injected by application code.
This encoder configuration defines what the log framework will log for you. Note the addition of the static logContext field that holds your application or service name. This really helps disambiguate log entries in searches: you can focus on code from only one service and avoid unintentional overlap in query terms.
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <!-- https://github.com/logstash/logstash-logback-encoder#composite_encoder -->
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp/>
                <version/>
                <!-- https://github.com/logstash/logstash-logback-encoder#provider_pattern -->
                <pattern>
                    <pattern>
                        {
                            "logMessage": "%message",
                            "logLevel": "%level",
                            "logThreadId": "%thread",
                            "logClass": "%class{32}",
                            "logMethod": "%method",
                            "logContext": "myServiceName"
                        }
                    </pattern>
                </pattern>
                <!-- log guid support -->
                <mdc/>
                <!-- StructuredArgument and Marker support -->
                <arguments/>
                <logstashMarkers/>
                <stackTrace>
                    <throwableConverter class="net.logstash.logback.stacktrace.ShortenedThrowableConverter">
                        <maxDepthPerThrowable>10</maxDepthPerThrowable>
                        <maxLength>2048</maxLength>
                        <shortenedClassNameLength>32</shortenedClassNameLength>
                        <exclude>sun\.reflect\..*\.invoke.*</exclude>
                        <exclude>net\.sf\.cglib\.proxy\.MethodProxy\.invoke</exclude>
                        <rootCauseFirst>true</rootCauseFirst>
                    </throwableConverter>
                </stackTrace>
            </providers>
        </encoder>
    </appender>

    <root level="info">
        <appender-ref ref="STDOUT" />
    </root>
</configuration>
ref: Providers for AccessEvents
Logback integrates into Tomcat/Jetty to read Servlet Container access events. This is a completely separate logging system from LoggingEvents, but we will combine both on STDOUT.
Although it is possible, we won't include request bodies in the access log. I think this information is better captured in application code so that everything needed to reproduce the request is available in the API request log entry. LoggingEvents and AccessEvents can be tedious to tie together in a busy log management system.
Above, we renamed fields with a 'log' prefix. Since we are not injecting our own key names into the access event, we don't need to do that here. I did rename the fields from their standard forms, because the standard names are not as intuitive as those listed below.
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <!-- https://github.com/logstash/logstash-logback-encoder#accessevent-patterns -->
        <!-- https://github.com/logstash/logstash-logback-encoder#providers-for-accessevents -->
        <!-- http://logback.qos.ch/xref/ch/qos/logback/access/PatternLayout.html -->
        <encoder class="net.logstash.logback.encoder.AccessEventCompositeJsonEncoder">
            <providers>
                <timestamp/>
                <version/>
                <method>
                    <fieldName>method</fieldName>
                </method>
                <requestedUri>
                    <fieldName>uri</fieldName>
                </requestedUri>
                <statusCode>
                    <fieldName>statusCode</fieldName>
                </statusCode>
                <elapsedTime>
                    <fieldName>elapsedTime</fieldName>
                </elapsedTime>
                <contentLength>
                    <fieldName>contentLength</fieldName>
                </contentLength>
                <remoteHost>
                    <fieldName>remoteHost</fieldName>
                </remoteHost>
                <pattern>
                    <pattern>
                        {
                            "queryString": "%q"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>

    <appender-ref ref="STDOUT" />
</configuration>
LoggingEvents allow the possibility of injecting additional fields into the log entry. This is a very good thing because it de-clutters the narrative message, provides additional fields for robust Kibana queries, and allows you to provide enough information to reproduce the API call when debugging.
StructuredArguments are the preferred method, but you may use LogstashMarkers to suppress static analyzers that complain about unused arguments (a Marker sketch follows the examples below).
Here are some common uses:
// Use slf4j's parameterized message formatting
log.info("log message {}", localVarName);
// Add "name":"value" to the JSON output,
// but only add the value to the formatted message.
log.info("log message {}", value("name", "value"));
// Add "name":"value" to the JSON output,
// and add name=value to the formatted message.
log.info("log message {}", keyValue("name", "value"));
// Add "name":"value" ONLY to the JSON output.
log.info("log message", keyValue("name", "value"));
// Add multiple key-value pairs to both JSON and formatted message
log.info("log message {} {}", keyValue("name1", "value1"), keyValue("name2", "value2"));
StructuredArguments have shorthand method names: v() for value() and kv() for keyValue().
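For the Marker form mentioned above, here is a minimal sketch using logstash-logback-encoder's Markers.append, which the <logstashMarkers/> provider in our encoder configuration renders; the field names are illustrative.

import static net.logstash.logback.marker.Markers.append;

// Add "name":"value" ONLY to the JSON output via a Marker;
// the message has no placeholder arguments, so static
// analyzers have nothing to complain about.
log.info(append("name", "value"), "log message");

// Markers chain with and() for multiple pairs
log.info(append("name1", "value1").and(append("name2", "value2")), "log message");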
Let's start tying this all together with some REST API code examples. Our goals are to provide a narrative log message for the diary view AND enough information to reproduce the API call for debugging.
Logback uses the slf4j facade, and a static import of StructuredArguments augments those logging methods. I am using the most excellent Lombok @Slf4j annotation to replace the boilerplate
private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(LogExample.class);
with @Slf4j
import static net.logstash.logback.argument.StructuredArguments.*;

// ...
@Slf4j
@RestController
@RequestMapping(value = "/contacts", produces = MediaType.APPLICATION_JSON_UTF8_VALUE)
public class ContactsController {
    // ...
    @RequestMapping(value = "/{id}", method = RequestMethod.GET)
    public ResponseEntity get(
            @PathVariable long id,
            @RequestHeader HttpHeaders headers) {
        log.info("Get Contact {} requested", v("id", id), kv("headers", headers));
        return ResponseEntity.ok(contactsService.get(id));
    }
produces the LoggingEvent
{
  "@timestamp": "2016-03-13T22:36:44.067-06:00",
  "@version": 1,
  "logMessage": "Get Contact 5 requested",
  "logLevel": "INFO",
  "logThreadId": "http-nio-8080-exec-1",
  "logClass": "c.g.s.c.ContactsController",
  "logMethod": "get",
  "id": 5,
  "headers": {
    "host": [
      "localhost:8080"
    ],
    "connection": [
      "keep-alive"
    ],
    "cache-control": [
      "max-age=0"
    ],
    "accept": [
      "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
    ],
    "upgrade-insecure-requests": [
      "1"
    ],
    "user-agent": [
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"
    ],
    "accept-encoding": [
      "gzip, deflate, sdch"
    ],
    "accept-language": [
      "en-US,en;q=0.8"
    ],
    "cookie": [
      "_ga=GA1.1.1621264663.1446705863"
    ]
  }
}
There are a lot of headers here, which may be useful to a UI component. API components may want to be more restrictive and pluck only the interesting headers; a sketch follows below. Both will want to avoid authentication headers and other sensitive information, which is more elegantly done in a Logstash grok filter.
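As a sketch of header plucking, a hypothetical helper like this (the class name and whitelist are assumptions, not part of the reference implementation) keeps only the headers you care about:

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.springframework.http.HttpHeaders;

public final class HeaderPlucker {
    // Hypothetical whitelist - keep only the headers your component cares about
    private static final String[] INTERESTING = {"host", "user-agent", "accept"};

    public static Map<String, List<String>> pluck(HttpHeaders headers) {
        Map<String, List<String>> plucked = new LinkedHashMap<>();
        for (String name : INTERESTING) {
            List<String> values = headers.get(name);
            if (values != null) {
                plucked.put(name, values);
            }
        }
        return plucked;
    }
}

The controller would then log kv("headers", HeaderPlucker.pluck(headers)) instead of the full header map.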
This AccessEvent is also produced:
{
  "@timestamp": "2016-03-13T22:36:44.440-06:00",
  "@version": 1,
  "method": "GET",
  "uri": "/contacts/5",
  "statusCode": 200,
  "elapsedTime": 30,
  "contentLength": -1,
  "remoteHost": "0:0:0:0:0:0:0:1",
  "queryString": ""
}
The create method
@RequestMapping(method = RequestMethod.POST, consumes = MediaType.APPLICATION_JSON_UTF8_VALUE)
public ResponseEntity<Contact> create(
        @RequestBody @Validated(Contact.Post.class) Contact contact) {
    log.info("Create Contact {} requested", contact.getEmail(), kv("contact", contact));
    return ResponseEntity.ok(contactsService.create(contact));
}
produces the LoggingEvent
{
  "@timestamp": "2016-03-13T22:57:39.111-06:00",
  "@version": 1,
  "logMessage": "Create Contact bob@thebuilder.com requested",
  "logLevel": "INFO",
  "logThreadId": "http-nio-8080-exec-1",
  "logClass": "c.g.s.c.ContactsController",
  "logMethod": "create",
  "contact": {
    "id": null,
    "firstName": "Bob",
    "lastName": "The Builder",
    "companyName": "Bob's Construction",
    "address": "1234 Beaver lane",
    "city": "Austin",
    "county": "Pitkin",
    "state": "TX",
    "zip": "81615",
    "phone1": "3136364482",
    "phone2": "3136369982",
    "email": "bob@thebuilder.com",
    "website": "http://bobthebuilder.com"
  }
}
We can easily copy the contact JSON into Postman to reproduce this call.
The AccessEvent looks like:
{
  "@timestamp": "2016-03-13T22:57:39.492-06:00",
  "@version": 1,
  "method": "POST",
  "uri": "/contacts",
  "statusCode": 200,
  "elapsedTime": 487,
  "contentLength": -1,
  "remoteHost": "0:0:0:0:0:0:0:1",
  "queryString": ""
}
ref: Logback MDC
Let’s say you get a customer ticket describing what the customer did and how it failed. How do you tie your UI and possibly several API components together to get a complete picture of the problem?
While something like Zipkin is awesome, it is separate from the logging system - we want to see all narrative entries from every component: from the UI, through all API components, to datastore access.
In a thread-per-request language like Java, there is a pretty simple solution using Logback's MDC component. We can use a tracing token (logguid) that follows the API chain wherever it goes. Each component follows the same strategy: (1) on each inbound request, reuse the logguid header if the caller provided one, otherwise generate a new one, and store it in the MDC; (2) add the logguid from the MDC as a header on every outbound request.
You can create a global request interceptor for item one and a base class to handle outbound requests for item two. A sketch of item one follows.
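Here is a minimal sketch of item one; the class name and the logguid header/MDC key are assumptions, not a prescription. The <mdc/> provider in our encoder configuration then renders the token on every log entry.

import java.io.IOException;
import java.util.UUID;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class LogGuidFilter extends OncePerRequestFilter {

    static final String LOG_GUID = "logguid"; // hypothetical header and MDC key

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        // Reuse the caller's token so the trace continues across services;
        // otherwise mint a new one at the edge of the system.
        String guid = request.getHeader(LOG_GUID);
        if (guid == null || guid.isEmpty()) {
            guid = UUID.randomUUID().toString();
        }
        MDC.put(LOG_GUID, guid);
        try {
            chain.doFilter(request, response);
        } finally {
            MDC.remove(LOG_GUID); // don't leak the token to pooled threads
        }
    }
}

For item two, your outbound HTTP client base class would read the logguid from the MDC and set it as a request header on every call.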
How is this life changing? You get a customer ticket stating that their contact update didn't work. Trying to get more information from the customer is usually a waste of time, so you are left searching through log errors and trying to tie them back to a log entry containing the user's id. Really frustrating and always time consuming.
With the addition of the logguid, you can search for the user’s id and then search by logguid to see which interaction failed.
Of course, if this is a frequent scenario, you could add userId to MDC and follow the same procedure as with the logguid.
You now have the building blocks for making your life, and your operators' lives, easier. When you connect your log stream to ELK, or any other log management system, you can begin to create great dashboards that help you understand production performance, quantify load test results, and simplify issue remediation.
If logging as a whole product concern is new to you, outstanding! If not, I hope I provided a couple of new ideas; and if not, outstanding as well, because you are ahead of the herd.