Java Diagnostic Tool — Introduction to the Use and Principles of Arthas

This article provides a detailed introduction to the usage and implementation principles of Arthas, the Java diagnostic tool open-sourced by Alibaba, including core functions such as online problem troubleshooting, performance analysis, class loading analysis, and the underlying implementation mechanism based on the JVM Instrumentation API.

Arthas Basic Introduction

Arthas is a Java diagnostic tool open-sourced by Alibaba's middleware team in September 2018. With its excellent performance and rich features, Arthas quickly gained widespread recognition in the developer community, currently boasting nearly 20,000 stars on GitHub. As a "production-grade" diagnostic tool, it enables real-time observation and problem localization of online applications without restarting the JVM.

GitHub Address: https://github.com/alibaba/arthas

Core Problems Solved by Arthas

  1. Global Monitoring of System Status: Provides an overall view to quickly understand the health status of the application.
  2. Real-time Diagnosis of Online Issues: No need for local reproduction; directly locate problems in the production environment.
  3. Precise Identification of Performance Bottlenecks: Quickly pinpoint time-consuming operations when service response slows down.
  4. Verification of Code Execution Status: Confirm whether the online code version and class loading status meet expectations.
  5. Thread Blocking Issue Investigation: Identify and resolve thread deadlock or blocking issues.

Arthas Usage Guide

Environment Deployment

Linux System:

bash
# One-click installation
curl -L https://alibaba.github.io/arthas/install.sh | sh

# Or manually download and start
wget https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar

Windows System:

bash
wget https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar

Real-time Monitoring of System Status

The Dashboard command provides a real-time view of key system metrics, including JVM memory usage, thread state distribution, GC activity, and other core runtime data, allowing developers to quickly grasp the overall health status of the system.

dashboard

Online Issue Diagnosis

When difficult-to-reproduce issues occur in the production environment, Arthas provides a series of precise diagnostic tools:

1. watch Command: Method Execution Observation

This command allows for in-depth observation of the complete lifecycle of method calls, including parameters, return values, and exception information. Combined with OGNL expressions, it can also achieve conditional filtering and complex data extraction.

bash
watch demo.MathGame primeFactors returnObj

Tip: Run options json-format true beforehand to print observed objects as JSON, which is easier to read.

2. tt Command: Time Tunnel

The Time Tunnel feature acts like a "method execution recorder," recording all calls within a specified time period for later analysis:

bash
tt -t demo.MathGame primeFactors

3. stack Command: Call Stack Analysis

Quickly obtain the complete call chain for a specified method to help understand the context of method calls:

bash
stack demo.MathGame primeFactors

Precise Location of Performance Issues

In the face of system performance degradation, Arthas provides professional performance analysis tools:

trace Command: Method Time Consumption Analysis

bash
# Trace method calls and their time consumption
trace demo.MathGame run

# Skip JDK internal calls, focusing on business code (-j in older releases; newer releases use --skipJDKMethod, enabled by default)
trace -j demo.MathGame run

# Only focus on calls that exceed the time threshold
trace demo.MathGame run '#cost > 10'

# Multi-target tracing
trace -E com.test.ClassA|org.test.ClassB method1|method2|method3

Note: The trace command itself incurs a small performance overhead, and the result data may have slight deviations, but it does not affect problem localization.

Combining with the monitor command allows for continuous monitoring of method execution statistics:

bash
monitor -c 5 demo.MathGame primeFactors

Code and Class Loading Analysis

When application behavior does not meet expectations, it is often necessary to check the runtime code status:

1. sc Command: Class Loading Analysis

bash
# Search loaded classes by wildcard
sc demo.*

# View detailed information about a class
sc -d demo.MathGame

# Check class field structure
sc -d -f demo.MathGame
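
To make the class search more concrete, here is a minimal Java sketch of wildcard matching over a set of loaded classes, followed by a reflective field dump. It is not Arthas source code: the class ClassSearchSketch and its methods are hypothetical names, and the sketch assumes the class array would be supplied by Instrumentation.getAllLoadedClasses() inside an agent.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Arthas.
class ClassSearchSketch {

    // Turn a wildcard such as "demo.*" into a regex; only '*' is special.
    static Pattern wildcardToRegex(String wildcard) {
        return Pattern.compile(Pattern.quote(wildcard).replace("*", "\\E.*\\Q"));
    }

    // Filter loaded classes by name. In an agent, the array would come
    // from Instrumentation.getAllLoadedClasses() (assumption of this sketch).
    static List<Class<?>> search(Class<?>[] loaded, String wildcard) {
        Pattern pattern = wildcardToRegex(wildcard);
        List<Class<?>> matches = new ArrayList<>();
        for (Class<?> c : loaded) {
            if (pattern.matcher(c.getName()).matches()) {
                matches.add(c);
            }
        }
        return matches;
    }

    // Describe declared fields as "modifiers type name" using reflection.
    static List<String> describeFields(Class<?> c) {
        List<String> lines = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) {
            lines.add(Modifier.toString(f.getModifiers()) + " "
                    + f.getType().getName() + " " + f.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        Class<?>[] loaded = {String.class, Integer.class, List.class};
        for (Class<?> c : search(loaded, "java.lang.*")) {
            System.out.println(c.getName() + " loaded by " + c.getClassLoader());
        }
        describeFields(Integer.class).forEach(System.out::println);
    }
}
```

Passing the class array in as a parameter keeps the sketch runnable without an agent; only the source of the array changes in a real diagnostic tool.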

2. jad Command: Instant Decompilation

Verify whether the running code is the expected version to resolve code deployment issues:

bash
jad demo.MathGame

Thread Issue Diagnosis

The thread command provides powerful thread analysis capabilities:

bash
# Locate blocking threads
thread -b

# Identify the busiest threads
thread -n 3

# Sample CPU usage over a given interval (in milliseconds)
thread -n 3 -i 1000
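
The idea behind ranking the busiest threads can be sketched with the standard ThreadMXBean: read each thread's CPU time twice, a sampling interval apart, and sort by the difference. The class BusyThreadsSketch and its Usage type are hypothetical names for illustration; this is a simplified sketch under those assumptions, not how Arthas itself is implemented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of Arthas.
class BusyThreadsSketch {

    // CPU time consumed by one thread during the sampling interval.
    static final class Usage {
        final String threadName;
        final long cpuNanos;

        Usage(String threadName, long cpuNanos) {
            this.threadName = threadName;
            this.cpuNanos = cpuNanos;
        }
    }

    // Return up to n threads that used the most CPU time during the interval.
    static List<Usage> busiest(int n, long intervalMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Usage> usages = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return usages; // CPU time measurement unavailable on this JVM
        }
        long[] ids = mx.getAllThreadIds();
        long[] before = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            before[i] = mx.getThreadCpuTime(ids[i]); // -1 if the thread has died
        }
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        for (int i = 0; i < ids.length; i++) {
            long after = mx.getThreadCpuTime(ids[i]);
            ThreadInfo info = mx.getThreadInfo(ids[i]);
            if (before[i] < 0 || after < 0 || info == null) {
                continue; // thread ended between the two samples
            }
            usages.add(new Usage(info.getThreadName(), after - before[i]));
        }
        usages.sort(Comparator.comparingLong((Usage u) -> u.cpuNanos).reversed());
        int limit = Math.min(Math.max(n, 0), usages.size());
        return new ArrayList<>(usages.subList(0, limit));
    }

    public static void main(String[] args) {
        for (Usage u : busiest(3, 200)) {
            System.out.println(u.threadName + " cpu(ns)=" + u.cpuNanos);
        }
    }
}
```

Threads that start after the first sample are ignored here, which is acceptable for a short interval but is a simplification.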

Arthas Principle Introduction

Core Technical Foundation

The powerful features of Arthas are built on two key technologies:

1. Java Instrumentation Mechanism

The Instrumentation API (java.lang.instrument), introduced in Java 5 and extended in Java 6 with dynamic attach and retransformation of already-loaded classes, is the core foundation of Arthas. It lets an agent modify loaded classes in a running JVM. Arthas uses this mechanism, via Java Agent technology, to attach to the target JVM process dynamically without restarting the application.

Its key workflow includes:

  1. Using the JVM's attach mechanism to connect to the target process.
  2. Injecting a custom Java Agent.
  3. Obtaining the Instrumentation instance within the Agent.
  4. Using Instrumentation.redefineClasses() or retransformClasses() methods to achieve dynamic modification of class definitions.

This allows Arthas to add monitoring logic to target methods without intruding on the original code.
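
The agent side of this workflow can be sketched in a few lines of Java. The sketch below is an illustration under stated assumptions, not Arthas code: DiagnosticAgentSketch and TargetTransformer are hypothetical names, the agent argument is assumed to carry the target class name, and the transformer is a pass-through that only notes that the target class was offered to it (a real tool would return rewritten bytecode instead).

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;

// Hypothetical agent for illustration. In a real agent JAR this class would
// be public and named by the Agent-Class manifest attribute, alongside
// Can-Retransform-Classes: true. The attaching process would load it with
// com.sun.tools.attach.VirtualMachine.attach(pid).loadAgent(jarPath, args).
class DiagnosticAgentSketch {

    // Pass-through transformer: records that the target class was seen.
    static final class TargetTransformer implements ClassFileTransformer {
        private final String internalName;
        volatile boolean sawTarget;

        TargetTransformer(String className) {
            // The JVM passes names in internal form, e.g. demo/MathGame.
            this.internalName = className.replace('.', '/');
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (internalName.equals(className)) {
                sawTarget = true;
                // A real tool would return modified bytecode here.
            }
            return null; // null tells the JVM to keep the class bytes unchanged
        }
    }

    // Entry point the JVM invokes when the agent is loaded into a running process.
    public static void agentmain(String targetClass, Instrumentation inst)
            throws UnmodifiableClassException {
        TargetTransformer transformer = new TargetTransformer(targetClass);
        inst.addTransformer(transformer, true); // true = may retransform
        try {
            for (Class<?> c : inst.getAllLoadedClasses()) {
                if (c.getName().equals(targetClass) && inst.isModifiableClass(c)) {
                    inst.retransformClasses(c); // re-runs transformers on the loaded class
                }
            }
        } finally {
            inst.removeTransformer(transformer);
        }
    }
}
```

Returning null from transform is the documented way to leave a class untouched, so the sketch is safe to load even though it changes nothing.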

2. ASM Bytecode Manipulation Framework

Arthas uses the lightweight ASM framework for bytecode manipulation, achieving:

  1. Efficient Class Structure Analysis: Quickly analyze the fields and methods of a class.
  2. Precise Bytecode Enhancement: Insert monitoring code at method entry, exit, and exception handling points.
  3. Low-overhead Runtime Transformation: Ensure that the modification process minimizes performance impact on the target application.

The visitor-oriented API design of ASM allows Arthas to precisely control every detail of bytecode transformation while keeping performance overhead at the lowest level.

System Architecture and Workflow

Arthas adopts a layered design, primarily consisting of the following core components:

Command Processing Engine

The command processing engine is responsible for parsing user input and routing it to the corresponding command handler. It employs a pluggable modular design and supports command extension through the SPI mechanism, allowing third-party developers to easily add custom commands.

Bytecode Enhancer

The bytecode enhancer is the core of Arthas, working through the following steps:

  1. Class Filtering: Filter the classes that need enhancement based on user-specified class names and method names.
  2. Bytecode Analysis: Parse the structure of the target class to determine injection points.
  3. Code Injection: Insert monitoring code before method execution, after execution, and at exception handling points.
  4. Class Redefinition: Apply the modified bytecode using the Instrumentation API.

Data Collection and Presentation

When the enhanced method is executed, the injected code collects relevant data (such as method parameters, return values, execution time, etc.) and sends it back to the Arthas console through an internal communication mechanism, ultimately presenting it in a user-friendly manner.
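
Written as ordinary source code rather than bytecode, the effect of such injected collection logic looks roughly like the wrapper below. This is a minimal sketch: AdviceSketch, Record, and invokeWithAdvice are hypothetical names, and the field names params, returnObj, and throwExp are chosen to echo the expressions used with the watch command. The real enhancement happens at the bytecode level and is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical source-level equivalent of an enhanced method; not Arthas code.
class AdviceSketch {

    // Data captured for one invocation.
    static final class Record {
        final Object[] params;
        final Object returnObj;
        final Throwable throwExp;
        final long costNanos;

        Record(Object[] params, Object returnObj, Throwable throwExp, long costNanos) {
            this.params = params;
            this.returnObj = returnObj;
            this.throwExp = throwExp;
            this.costNanos = costNanos;
        }
    }

    final List<Record> records = new ArrayList<>();

    // Calls the target and records its argument, result or exception, and cost.
    <A, R> R invokeWithAdvice(Function<A, R> target, A arg) {
        long start = System.nanoTime();              // "before" hook at method entry
        try {
            R result = target.apply(arg);
            records.add(new Record(new Object[] {arg}, result, null,
                    System.nanoTime() - start));     // "after returning" hook
            return result;
        } catch (RuntimeException e) {
            records.add(new Record(new Object[] {arg}, null, e,
                    System.nanoTime() - start));     // "after throwing" hook
            throw e;                                 // original behavior is preserved
        }
    }

    public static void main(String[] args) {
        AdviceSketch sketch = new AdviceSketch();
        sketch.invokeWithAdvice((Integer x) -> x * 2, 21);
        Record r = sketch.records.get(0);
        System.out.println("returnObj=" + r.returnObj + " cost(ns)=" + r.costNanos);
    }
}
```

The key property is that the wrapper never changes what the caller observes: results are returned and exceptions rethrown exactly as before.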

Detailed Explanation of Command Implementation Principles

Arthas commands can be divided into two main categories based on their implementation methods:

1. Non-bytecode Enhancement Class Commands

These commands primarily utilize standard APIs provided by the JVM to obtain information:

  • dashboard Command: Comprehensive use of MXBean interfaces in the java.lang.management package to obtain JVM runtime status.
  • sc Command: Calls Instrumentation.getAllLoadedClasses() to enumerate loaded classes and analyzes class structure using the reflection API.
  • thread Command: Combines Thread.getAllStackTraces() and JMX interfaces to obtain thread status and stack traces.
  • jvm Command: Access various JVM runtime MXBeans to obtain memory, GC, class loading, and other metrics.

The advantage of these commands lies in their simplicity of implementation, requiring no modification of target code, thus incurring minimal performance overhead, making them suitable for frequent use.
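
As an illustration of how little is needed for this category of command, the following sketch reads a handful of dashboard-style metrics from the standard MXBeans in java.lang.management. The class name JvmSnapshotSketch and the map keys are hypothetical; Arthas collects and renders far more than this.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Arthas.
class JvmSnapshotSketch {

    // Collect a few runtime metrics of the current JVM into an ordered map.
    static Map<String, Long> snapshot() {
        Map<String, Long> metrics = new LinkedHashMap<>();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        metrics.put("heap.used", heap.getUsed());
        metrics.put("heap.max", heap.getMax()); // -1 when no maximum is defined

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        metrics.put("threads.live", (long) threads.getThreadCount());
        metrics.put("threads.daemon", (long) threads.getDaemonThreadCount());

        long gcCount = 0;
        long gcTimeMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collectors report -1 when a value is undefined; treat that as 0.
            gcCount += Math.max(0, gc.getCollectionCount());
            gcTimeMillis += Math.max(0, gc.getCollectionTime());
        }
        metrics.put("gc.count", gcCount);
        metrics.put("gc.time.ms", gcTimeMillis);

        metrics.put("classes.loaded",
                (long) ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
        metrics.put("uptime.ms", ManagementFactory.getRuntimeMXBean().getUptime());
        return metrics;
    }

    public static void main(String[] args) {
        snapshot().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Because these calls only read counters the JVM already maintains, polling them periodically is cheap compared with bytecode enhancement.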

2. Bytecode Enhancement Class Commands

These commands achieve advanced monitoring and analysis functions through dynamic modification of class definitions:

  • watch Command: Injects code before and after the target method to capture method parameters, return values, and exception information.
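  • trace Command: Enhances the matched method and instruments the call sites inside it, recording the time spent in each directly invoked method to build a call tree. Only the first level of calls is traced at a time; going deeper requires widening the class and method match (for example with -E).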
  • tt Command: Injects recording code at method call points to save the call context and provide playback functionality.
  • monitor Command: Inserts counting and timing code to report, per statistics cycle, the call count, success and failure counts, average RT, and failure rate.

The typical process for implementing bytecode enhancement includes:

  1. Locating Classes and Methods: Determine the target based on user input.
  2. Building Enhancers: Create method access adapters and define enhancement logic.
  3. Bytecode Transformation: Modify the original method bytecode and insert monitoring points.
  4. Redefining Classes: Apply the modified bytecode to the JVM in real-time.
  5. Data Processing: Collect, process, and display the runtime captured data.

This approach enables extremely powerful functionality but requires careful handling to avoid introducing unnecessary performance overhead or unexpected behavior.
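
For the data processing step, a monitor-style command needs little more than a few thread-safe counters that are drained once per statistics cycle. The sketch below shows one way to do this; MonitorStatsSketch and Summary are hypothetical names, and the design (LongAdder counters reset at each cycle) is an assumption for illustration rather than a description of Arthas internals.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-method statistics aggregator; not Arthas code.
class MonitorStatsSketch {

    // Figures reported for one statistics cycle.
    static final class Summary {
        final long total;
        final long success;
        final long fail;
        final double avgRtMillis;
        final double failRate;

        Summary(long total, long success, long fail, double avgRtMillis, double failRate) {
            this.total = total;
            this.success = success;
            this.fail = fail;
            this.avgRtMillis = avgRtMillis;
            this.failRate = failRate;
        }
    }

    private final LongAdder total = new LongAdder();
    private final LongAdder failed = new LongAdder();
    private final LongAdder costNanos = new LongAdder();

    // Called by the injected code when an invocation finishes.
    void record(long nanos, boolean success) {
        total.increment();
        if (!success) {
            failed.increment();
        }
        costNanos.add(nanos);
    }

    // Called once per cycle: report the counters and reset them.
    // The three resets are not atomic as a group, so an invocation finishing
    // mid-drain may be split across two cycles; acceptable for a sketch.
    Summary drain() {
        long t = total.sumThenReset();
        long f = failed.sumThenReset();
        long c = costNanos.sumThenReset();
        double avgRtMillis = t == 0 ? 0.0 : c / 1_000_000.0 / t;
        double failRate = t == 0 ? 0.0 : (double) f / t;
        return new Summary(t, t - f, f, avgRtMillis, failRate);
    }

    public static void main(String[] args) {
        MonitorStatsSketch stats = new MonitorStatsSketch();
        stats.record(2_000_000L, true);
        stats.record(4_000_000L, false);
        Summary s = stats.drain();
        System.out.println("total=" + s.total + " fail=" + s.fail
                + " avgRt(ms)=" + s.avgRtMillis + " failRate=" + s.failRate);
    }
}
```

LongAdder is used instead of AtomicLong because the counters are written from many application threads but read only once per cycle.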

Security and Performance Considerations

Arthas is designed with the special needs of online environments in mind:

  1. Isolation of Enhanced Code: Uses independent class loaders to load Arthas core classes, avoiding interference with the target application.
  2. Minimizing Performance Impact: Employs dynamic sampling, conditional filtering, and other techniques to reduce the overhead of data collection.
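  3. Graceful Exit Mechanism: The stop command (named shutdown in older releases) shuts down the Arthas server and restores all enhanced classes, returning the application to its original state; the reset command restores enhanced classes without exiting.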
  4. Access Control: Supports access control to prevent unauthorized use.
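
The class loader isolation described in item 1 can be pictured with a small sketch: a URLClassLoader whose parent is the platform (or extension) class loader can see JDK classes but not classes on the application classpath, so tool classes and application classes cannot clash. IsolationSketch and its methods are hypothetical names, and the choice of parent loader is an assumption made for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration of class loader isolation; not Arthas code.
class IsolationSketch {

    // Create a loader for the tool's own JARs that bypasses the application
    // class loader. The parent of the system class loader is the platform
    // loader on Java 9+ and the extension loader on Java 8.
    static URLClassLoader isolatedLoader(URL[] toolJars) {
        return new URLClassLoader(toolJars, ClassLoader.getSystemClassLoader().getParent());
    }

    // Check whether a class name is resolvable through the given loader.
    static boolean canSee(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (URLClassLoader loader = isolatedLoader(new URL[0])) {
            System.out.println("JDK class visible: " + canSee(loader, "java.lang.String"));
            System.out.println("App class visible: "
                    + canSee(loader, IsolationSketch.class.getName()));
        }
    }
}
```

The same property works in the other direction: libraries bundled with the tool stay invisible to the application, so differing versions of a shared dependency do not conflict.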

Usage Precautions

  1. Execute the stop command (shutdown in older releases) after completing diagnosis: Ensure the removal of all dynamically injected code to avoid long-term performance impact.
  2. Handle JDK core classes with caution: Modifying classes under the java.* package may lead to JVM instability and should be avoided whenever possible.
  3. Remote connection configuration: Ensure that IP and port settings are correct, especially when used in container or cloud environments.

Summary

As a comprehensive Java diagnostic tool, Arthas greatly simplifies the troubleshooting process for online issues. By understanding its implementation principles, developers can utilize this tool more efficiently, enhancing system stability and maintainability. Whether for daily monitoring or emergency fault handling, Arthas is an indispensable assistant for Java developers.
