9 Essential Laws of MuleSoft Logging Success
Logging is integral to maintaining the health of any application; without it, troubleshooting can become very complicated. In a middleware solution, logs quickly become a lifeline for discovering issues within the application and gaining insight into its behavior. Logging even has a lifecycle, like code, and effective logging evolves as an application moves from development to testing to production.
The logging functionality provided by MuleSoft can encourage good logging practices. But the tools can’t do it alone - proper logging across applications requires a disciplined approach and standards agreed upon by the developers.
The following nine best practices can help your team build a “culture of logging” and make the best use of the log data to extract valuable information using third-party tools such as Splunk and ELK.
1. Log events with meaningful context
Context is critical, whether you’re being asked a question or looking at a log. All too often, log messages are written without meaningful context. Vague logs slow down debugging, measurement, and improvement of a system.
The log messages below are an example. Without meaningful context, log events become cryptic and challenging to understand:
Encountered some error
Failed to process file
These log statements don’t give the reader any clue about what’s going on with the application. Context, such as what operation the application was executing when the error occurred or the name of the resource under consideration, would add valuable information for the reader. Increasing the usefulness of the log messages is as simple as re-writing them as below:
The system encountered an error while retrieving order details for order id: 34553
Failed to process file: order_in.csv due to missing shipping data
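In a Mule 4 flow, the Logger component can pull that context from the event itself. A minimal sketch (the category name and variable below are hypothetical):

```xml
<!-- Hypothetical example: include the failing resource in the message -->
<logger level="ERROR" category="com.example.orders"
        message='#["The system encountered an error while retrieving order details for order id: $(vars.orderId default "unknown")"]'/>
```

The DataWeave interpolation `$( … )` fills in the runtime value, so the same statement produces a specific, contextual message for every event.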
2. Use categories to turn on/off logging during runtime
MuleSoft’s logging functionality allows for categories. Categories are an effective way of managing logging statements dynamically.
As the application progresses from Dev to QA and then to the Production environment, it becomes essential to tune the log statements. For example, when an application is in the Dev environment, it can be helpful to log a lot more information than when that application is in production.
This fine-tuning of log statements can be made simpler using log categories. Log categories can be specified in the log connector and used to group a set of messages. The level of those messages can be changed dynamically in the Runtime Manager.
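As a sketch (category names are hypothetical), each Logger component can be assigned a category, and the level of each category can then be raised or lowered in Runtime Manager without redeploying:

```xml
<!-- Verbose diagnostics grouped under a dedicated category -->
<logger level="DEBUG" category="com.example.orders.debug"
        message="#['Payload before transform: ' ++ write(payload, 'application/json')]"/>

<!-- Business-relevant messages under a separate category -->
<logger level="INFO" category="com.example.orders.flow"
        message="Order flow completed"/>
```

In production, `com.example.orders.debug` could be set to WARN or OFF in Runtime Manager while `com.example.orders.flow` stays at INFO.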
3. Log everything
Unfortunately, there is no magic rule on what to log or how much to log. During development and testing, start by logging as much information as possible. Additional messages, such as which sub-flow is under execution or which route is being processed, can help tremendously with debugging.
As discussed above, categories can help manage the level of logging that occurs at runtime. When the application is ready to deploy to production, you can use categories to dynamically turn off messages that only provide debugging information.
The logs should be treated like code. A tight feedback loop should be incorporated in tuning the log statements to be scaled up or down as appropriate.
4. Choose the correct logging level
MuleSoft provides five log levels. One of the more difficult tasks in creating practical and useful logging standards is identifying at which level a statement should be logged. The below description can be used as a guideline:
● TRACE - TRACE should only be used in development to track bugs and should never make it into production.
● DEBUG - At this level, debugging information such as payload should be logged. Expect this level to write a lot of data. Because of this, it should be sparingly used in production and used only to debug an ongoing issue.
● INFO - All user-driven logs and system-specific messages (begin and end flows, choice routes, conditional statements, etc.) should be at this level. INFO should be the default log level of every application.
● WARN - Log all messages at this level that potentially could become an error. For example, WARN level logging should be done if an SLA is not met or the system is very close to the threshold, such as database response is beyond a certain level.
● ERROR - All internal and external errors should be logged at this level.
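Applied to a Mule flow, the guideline above might look like this sketch (category and variable names are hypothetical, and the ERROR line assumes it sits inside an error handler):

```xml
<logger level="INFO"  category="com.example.orders" message="Order flow started"/>
<logger level="DEBUG" category="com.example.orders"
        message="#[write(payload, 'application/json')]"/>
<logger level="WARN"  category="com.example.orders"
        message='#["Database responded in $(vars.elapsedMs) ms, close to the SLA threshold"]'/>
<logger level="ERROR" category="com.example.orders"
        message='#["Order lookup failed: $(error.description)"]'/>
```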
5. Use correlation identifiers to trace system calls across application layers
Distributed systems are great and are more versatile than monolithic systems. They also present a challenge in getting a holistic view of system interactions.
The best practice for addressing this is to use the integration pattern of a correlation identifier. This pattern connects system calls using a universal identifier, which can then be used to dig through the logs and reconstruct the system flow. More about this pattern can be found HERE.
We also discuss how to leverage correlation identifiers on the Big Compass blog.
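In Mule 4, the event’s correlation id is available through the `correlationId` expression, so it can be both logged and propagated downstream. A sketch (the endpoint, config name, and header name are illustrative):

```xml
<!-- Include the correlation id in every log entry for this flow -->
<logger level="INFO" category="com.example.orders"
        message='#["[correlationId: $(correlationId)] Calling shipping service"]'/>

<!-- Pass the same id to the downstream system so its logs can be correlated too -->
<http:request method="GET" config-ref="Shipping_HTTP_Config" path="/status">
  <http:headers>#[{"X-Correlation-ID": correlationId}]</http:headers>
</http:request>
```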
6. Log in a machine-readable format
Well-written log entries are great for humans trying to track down a problem or analyze performance, but they are not machine friendly. Log management tools such as Splunk and ELK work most efficiently when log files are ingested in JSON format.
MuleSoft’s log4j2 framework can write logs in JSON, which log management tools can easily ingest. This enables easier search and analysis.
MuleSoft also provides a JSON logger connector that automatically converts a standard log message into JSON while also masking sensitive data such as personally identifiable information (PII). The connector code is available HERE.
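As a sketch, a `log4j2.xml` fragment using log4j2’s built-in `JsonLayout` (the appender name and file paths are hypothetical) could look like:

```xml
<Appenders>
  <RollingFile name="json-file"
               fileName="logs/my-app.log"
               filePattern="logs/my-app-%i.log">
    <!-- One compact JSON object per log event, newline-delimited -->
    <JsonLayout compact="true" eventEol="true" properties="true"/>
    <SizeBasedTriggeringPolicy size="10 MB"/>
  </RollingFile>
</Appenders>
<Loggers>
  <Root level="INFO">
    <AppenderRef ref="json-file"/>
  </Root>
</Loggers>
```

Each event then lands in the file as a single JSON line, which Splunk or ELK can index without custom parsing.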
7. Avoid logging sensitive information
Logging can present a security problem if standards aren’t established around what can and can’t be logged. You should avoid logging security data such as passwords, credit card information, social security numbers, and other PII.
Logging should be subject to code reviews and feedback loops to help identify vulnerabilities and assist with remediation. As noted above, JSON loggers should be used whenever possible to mask sensitive information.
8. Log beyond troubleshooting
Log management tools make logs even more useful by providing valuable information beyond troubleshooting. They can be effectively used for auditing, profiling, and analysis.
Logs can be used to extract information about user interactions with the system and help identify non-functional interactions such as time taken to get a response from external systems. This information can help with improving user experience, speeding applications, and identifying other areas for improvement.
The Anypoint Monitoring platform enables DevOps teams to set alerts based on defined log metrics and present them in visual dashboards. Further information can be found HERE.
9. Use log4j framework efficiently
MuleSoft relies on the log4j2 framework to write its log files. The framework can also stream logs to multiple destinations and separate them based on logging categories. A sample use case might be the need to write all database interactions to a separate log so that an audit service can ingest it, while all other logs are sent to Splunk using log4j2’s appenders, as the project’s needs dictate.
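A sketch of that routing in `log4j2.xml` (the category name, file paths, and Splunk endpoint are hypothetical; a dedicated Splunk appender library could replace the generic `Http` appender shown here):

```xml
<Appenders>
  <RollingFile name="audit-file" fileName="logs/audit.log"
               filePattern="logs/audit-%i.log">
    <PatternLayout pattern="%d{ISO8601} [%t] %-5p %c - %m%n"/>
    <SizeBasedTriggeringPolicy size="10 MB"/>
  </RollingFile>
  <Http name="splunk-hec" url="https://splunk.example.com:8088/services/collector/raw">
    <Property name="Authorization" value="Splunk ${sys:splunk.hec.token}"/>
    <JsonLayout compact="true" eventEol="true"/>
  </Http>
</Appenders>
<Loggers>
  <!-- Database audit messages go only to the audit file (additivity off) -->
  <Logger name="com.example.audit.db" level="INFO" additivity="false">
    <AppenderRef ref="audit-file"/>
  </Logger>
  <Root level="INFO">
    <AppenderRef ref="splunk-hec"/>
  </Root>
</Loggers>
```

Setting `additivity="false"` on the audit category keeps those entries out of the root appenders, so the two destinations stay cleanly separated.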
Need help with logging?
MuleSoft understands how powerful logging can be for troubleshooting, improvement, and management of integrations. That’s why they offer tools to facilitate fine-grained handling of logs and the ability to stream logging data to other log tools and dashboards. However, to fully take advantage of your logs, they should be treated as code, with well-defined uses and standards. If you’d like to improve how your MuleSoft applications utilize logging, contact Big Compass. We’d be happy to work with you to define and clarify your logging strategies.