A Sip of Elastic RUM (Real User Monitoring)

sorangutan

Sorry if I lured you into the mood of having a sip of a wonderful cocktail made with rum and you realized that the RUM I’m talking about is not the rum you are craving. But, be assured that Elastic RUM is equally wonderful! Let’s take a sip! I do want to warn you that it will take a bit of time to go through the amount of detail I will cover in this blog!

What is RUM?

Elastic real user monitoring captures user interactions with the web browser and provides a detailed view of the “real user experience” of your web applications from a performance perspective. Elastic’s RUM Agent is a JavaScript Agent, which means it supports any JavaScript-based application. RUM can provide valuable insight into your applications. Some of the common benefits of RUM include:

RUM performance data can help you identify bottlenecks and discover how site performance issues affect your visitors’ experience
User agent information captured by RUM enables you to identify the browsers, devices, and platforms most used by your customers so that you can make informed optimizations to your application
Together with location information, individual user performance data from RUM helps you understand regional performance of your website worldwide
RUM provides insight and measurement for your application’s service level agreements (SLA)

Getting started with RUM using Elastic APM

In this blog, I will take you through the complete process of instrumenting a simple web application made of a React frontend and a Spring Boot backend, step by step. You will see how easy it is to use the RUM agent. As a bonus, you will also see how Elastic APM ties the frontend and the backend performance information together with a holistic, distributed trace view. Please see my previous blog for an overview of Elastic APM and distributed tracing if you are interested in knowing more details.

To use Elastic APM real user monitoring, you need the Elastic Stack with APM server installed. You can of course download and install the latest Elastic Stack with APM server locally on your computer. However, the easiest approach is to create an Elastic Cloud trial account and have your cluster ready in a few minutes. APM is enabled for the default I/O Optimized template. From now on, I’ll assume you have a cluster ready to go.

Sample application

The application we are going to instrument is a simple car database application made of a React frontend and a Spring Boot backend that provides API access to an in-memory car database. The application is purposely kept simple. The idea is to show you detailed instrumentation steps starting from zero so that you can instrument your own applications following the same steps.

A simple application with a React frontend and Spring backend

Create a directory called CarApp anywhere on your laptop. Then clone both the frontend and the backend application into that directory.

git clone https://github.com/adamquan/carfront.git
git clone https://github.com/adamquan/cardatabase.git

As you can see, the application is extremely simple. There are only a couple of components in the React frontend and a few classes in the backend Spring Boot application. Build and run the application following the instructions in GitHub for both the frontend and backend. You should see something like this. You can browse and filter cars, and perform CRUD operations on them.

The simple React user interface

Amazed by how much information is captured by the RUM agent by default? Pay special attention to the markers like timeToFirstByte, domInteractive, domComplete and firstContentfulPaint. Mouse over the black dots to see the names. They provide you with great details about content retrieval and browser rendering of these contents. Also pay attention to all the performance data about resource loading from the browser. By just initializing your RUM agent, without any custom instrumentation, you get all these detailed performance metrics, out of the box! When there is a performance issue, these metrics enable you to easily decide whether the issue is due to slow backend services, a slow network, or simply a slow client browser. That is very impressive!

For those of you who need a refresher, here is a quick explanation of the web performance metrics. Do keep in mind that for modern web application frameworks like React, these metrics might only represent the “static” part of the web page, due to the async nature of React. For example, dynamic contents might still be loading after domInteractive, as you will see later.

timeToFirstByte is the amount of time a browser waits to receive the first piece of information from the web server after requesting it. It represents a combination of network and server-side processing speed.
domInteractive is the time immediately before the user agent sets the current document readiness to “interactive,” which means the browser has finished parsing all of the HTML and DOM construction is complete.
domComplete is the time immediately before the user agent sets the current document readiness to “complete,” which means the page and all of its subresources like images have finished downloading and are ready. The loading spinner has stopped spinning.
firstContentfulPaint is the time the browser renders the first bit of content from the DOM. This is an important milestone for users because it provides feedback that the page is actually loading.
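As a rough illustration of where these numbers come from, the sketch below derives three of the markers from a Navigation Timing entry. The helper name pageLoadMarks and the sample values are made up for this example, and the agent’s own calculation may differ (for instance, in which timestamp it uses as the baseline).

```javascript
// Derive page-load markers (in milliseconds) from a Navigation Timing entry.
// `nav` only needs the five numeric fields used below.
function pageLoadMarks(nav) {
  return {
    // Time spent waiting between sending the request and the first response byte.
    timeToFirstByte: nav.responseStart - nav.requestStart,
    // Time from the start of navigation until the document became "interactive".
    domInteractive: nav.domInteractive - nav.startTime,
    // Time from the start of navigation until the document became "complete".
    domComplete: nav.domComplete - nav.startTime,
  };
}

// Hypothetical sample values, shaped like a PerformanceNavigationTiming entry.
const sampleEntry = {
  startTime: 0,
  requestStart: 12,
  responseStart: 87,
  domInteractive: 193,
  domComplete: 410,
};

console.log(pageLoadMarks(sampleEntry));
```

In a browser, you could pass the real entry instead: pageLoadMarks(performance.getEntriesByType('navigation')[0]). The first contentful paint is exposed separately, as a paint entry named first-contentful-paint.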

Flexible custom instrumentation

The RUM agent provides detailed instrumentation for your browser interaction out of the box, as you just saw. You can also add custom instrumentation when needed. For example, because the React application is a single-page application and deleting a car will not trigger a “page load,” RUM does not by default capture the performance data of deleting a car. We can use custom transactions for something like that.

With our current release (APM Real User Monitoring JavaScript Agent 4.x), users have to manually create transactions for AJAX calls and Single-Page-Application (SPA) calls that do not trigger a page load. In some frameworks, like JSF, you have very little control over the JavaScript. So, manually creating transactions for button clicks which initiate AJAX requests is not viable. Even if the developer has direct control over the AJAX requests, instrumenting a big application would be a lot of effort. We are planning to enhance the RUM agent so that it will automatically create a transaction for these requests in case there is not an active one currently. This would make the auto-instrumentation cover a lot more of the application without developers having to programmatically add tracing logic to their applications.

The "New Car" button in our frontend application allows you to add a new car to the database. We will instrument the code to capture the performance of adding a new car. Open the file Carlist.js in the components directory. You will see the following code:

// Add new car
addCar(car) {
    var transaction = apm.startTransaction("Add Car", "Car");
    var httpSpan = transaction.startSpan('Add Car', 'Car')
    apm.addTags(car);
    fetch(SERVER_URL + 'api/cars',
        {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify(car)
        })
        .then(res => this.fetchCars())
        .catch(err => console.error(err))
    httpSpan.end()
    transaction.end();   
}

The code creates a new transaction and a new span, both named “Add Car” and of type “Car”. It then tags the transaction with the car to provide contextual information. The span and transaction are ended at the end of the method. Note that fetch is asynchronous, so httpSpan.end() and transaction.end() are called before the response arrives.
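If you want a manually created span to stay open until the request settles, one option is to end it from the promise chain instead of at the end of the method. The sketch below is a minimal illustration under stated assumptions, not the code from the sample repository: addCarTraced, makeFakeApm, and makeFakeFetch are hypothetical names, and the fake objects stand in for the real agent and for fetch so the call order can be observed without a browser. Only startTransaction, startSpan, and end are taken from the agent API used above.

```javascript
// Trace an async call so the span and transaction end after it settles.
// `apm` exposes startTransaction(name, type); `doFetch` returns a Promise
// (for example, a thin wrapper around fetch).
function addCarTraced(apm, doFetch, car) {
  const transaction = apm.startTransaction('Add Car', 'Car');
  const httpSpan = transaction.startSpan('Add Car', 'Car');
  return doFetch(car).finally(() => {
    // Runs once the request has settled, whether it succeeded or failed.
    httpSpan.end();
    transaction.end();
  });
}

// Hypothetical stand-in for the agent: records each call in `log`.
function makeFakeApm(log) {
  return {
    startTransaction(name) {
      log.push('start transaction ' + name);
      return {
        startSpan(spanName) {
          log.push('start span ' + spanName);
          return { end: () => log.push('end span') };
        },
        end: () => log.push('end transaction'),
      };
    },
  };
}

// Hypothetical stand-in for fetch: resolves asynchronously and logs the response.
function makeFakeFetch(log) {
  return () =>
    Promise.resolve({ ok: true }).then((res) => {
      log.push('response received');
      return res;
    });
}

const demoLog = [];
addCarTraced(makeFakeApm(demoLog), makeFakeFetch(demoLog), { brand: 'Ford' })
  .then(() => console.log(demoLog.join('\n')));
```

With this shape, the “end span” and “end transaction” entries are logged after “response received”, which is the ordering you would want the manual span to reflect.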

Add a new car from the application web UI. Click on the APM UI in Kibana. You should see an “Add Car” transaction listed. Make sure you select “Car” in the “Filter by type” dropdown. By default, it displays “page-load” transactions.

Filtering by type in Elastic APM

Click on the “Tags” tab. You will see the tags we added. Tags and logs add valuable contextual information to your APM traces.

Exploring by type tag in Elastic APM

See the big picture with distributed tracing

As a bonus point, we are going to also instrument our backend Spring Boot application so that you have a complete view of the overall transaction from the web browser all the way to the backend database, all in one view. Elastic APM distributed tracing enables you to do so.

Configuring distributed tracing in RUM agents

Distributed tracing is enabled by default in the RUM agent. However, it only includes requests made to the same origin. In order to include cross-origin requests you must set the distributedTracingOrigins configuration option. You will also have to set the CORS policy in the backend application, as we will discuss in the next section.

For our application, the frontend is served from http://localhost:3000. To include requests made to http://localhost:8080, we need to add the distributedTracingOrigins configuration to our React application. This is done inside rum.js. The code is already there. Simply uncommenting the line will do.

var apm = initApm({
  ...
  distributedTracingOrigins: ['http://localhost:8080']
})

This effectively tells the agent to add the distributed tracing HTTP header (elastic-apm-traceparent) to requests made to http://localhost:8080.
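To make the rule concrete, here is a small sketch of the decision described above: the header is added for same-origin requests and for origins listed in distributedTracingOrigins. It illustrates the rule only and is not the agent’s actual implementation; the function name shouldAddTraceHeader is made up for this example.

```javascript
// Decide whether a request should carry the distributed tracing header.
// Illustration of the rule only; the real agent's logic may differ.
function shouldAddTraceHeader(requestUrl, pageOrigin, distributedTracingOrigins = []) {
  // Normalize the page origin (drops any path or trailing slash).
  const page = new URL(pageOrigin).origin;
  // Resolve relative URLs such as "/api/cars" against the page origin.
  const target = new URL(requestUrl, page).origin;
  return target === page || distributedTracingOrigins.includes(target);
}

const page = 'http://localhost:3000';

// Same origin: included by default.
console.log(shouldAddTraceHeader('/static/app.js', page));
// Cross-origin and not listed: excluded.
console.log(shouldAddTraceHeader('http://localhost:8080/api/cars', page));
// Cross-origin and listed: included.
console.log(shouldAddTraceHeader('http://localhost:8080/api/cars', page, ['http://localhost:8080']));
```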

To use the default instrumentation out of the box on the server side, you need to download the Java agent and start your application with it. Here is how I configured my Eclipse project to run the backend Spring Boot application. You will have to configure this using your own APM URL and APM token.

-javaagent:/Users/aquan/Downloads/elastic-apm-agent-1.4.0.jar 
-Delastic.apm.service_name=cardatabase 
-Delastic.apm.application_packages=com.packt.cardatabase
-Delastic.apm.server_urls=https://aba7c3d90b0b4820b05b0a9df44c096d.apm.us-central1.gcp.cloud.es.io:443 
-Delastic.apm.secret_token=jeUWQhFtU9e5Jv836F

Configuring Eclipse to send backend data to Elastic APM

If you were paying close attention to the timeline visualization above, you might be wondering why the “Car List” page-load transaction ends at 193 ms, which is the domInteractive time, while data is still being served from the backend. Great question! This is because fetch calls are asynchronous. The browser considers HTML parsing and DOM construction complete at 193 ms because it has loaded all the “static” HTML content served from the web server. Meanwhile, React is still loading data from the backend server asynchronously.

Cross-origin resource sharing (CORS)

The RUM agent is only one piece of the puzzle in a distributed trace. In order to use distributed tracing, we need to properly configure other components too. One of the things that you will normally have to configure is cross-origin resource sharing, the “notorious” CORS! This is because the frontend and the backend services are typically deployed separately. With the same-origin policy, your frontend requests from a different origin to the backend will fail without properly configured CORS. Basically, CORS is a way for the server side to check if requests coming in from a different origin are allowed. To read more about cross-origin requests and why this process is necessary, please see the MDN page on Cross-Origin Resource Sharing.

What does that mean for us? It means two things:

We must set the distributedTracingOrigins configuration option, as we have done.
With that configuration, the RUM agent adds a custom header to cross-origin requests, so the browser sends an HTTP OPTIONS (preflight) request before the real HTTP request to make sure all the headers and HTTP methods are supported and the origin is allowed. Specifically, http://localhost:8080 will receive an OPTIONS request with the following headers:
```
Access-Control-Request-Headers: elastic-apm-traceparent
Access-Control-Request-Method: [request-method]
Origin: [request-origin]
```
And the backend server (not the APM server) should respond to it with these headers and a 200 status code:
```
Access-Control-Allow-Headers: elastic-apm-traceparent
Access-Control-Allow-Methods: [allowed-methods]
Access-Control-Allow-Origin: [request-origin]
```

The MyCorsConfiguration class in our Spring Boot application does exactly that. There are different ways of configuring Spring Boot to do this, but here we are using a filter based approach. It’s configuring our server-side Spring Boot application to allow requests from any origin with any HTTP headers and any HTTP methods. You may not want to be this open with your production applications.

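import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.cors.CorsConfiguration;
import org.springframework.web.cors.UrlBasedCorsConfigurationSource;
import org.springframework.web.filter.CorsFilter;

@Configuration
public class MyCorsConfiguration {
    @Bean
    public FilterRegistrationBean<CorsFilter> corsFilter() {
        UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
        CorsConfiguration config = new CorsConfiguration();
        config.setAllowCredentials(true);
        // Wide open for the demo: any origin, any header, any method.
        config.addAllowedOrigin("*");
        config.addAllowedHeader("*");
        config.addAllowedMethod("*");
        // Apply the policy to every path.
        source.registerCorsConfiguration("/**", config);
        FilterRegistrationBean<CorsFilter> bean = new FilterRegistrationBean<CorsFilter>(new CorsFilter(source));
        // Run the CORS filter ahead of other filters.
        bean.setOrder(0);
        return bean;
    }
}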
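To make the preflight exchange more tangible, the sketch below models how a server with a narrower policy might answer the OPTIONS request. It is a plain JavaScript illustration of the handshake, not Spring code and not part of the sample application; answerPreflight and the policy object are hypothetical names chosen for this example.

```javascript
// Model a server's answer to a CORS preflight request.
// Returns the response headers, or null when the request is not allowed.
function answerPreflight(request, policy) {
  const origin = request['Origin'];
  const method = request['Access-Control-Request-Method'];
  // The requested headers arrive as a comma-separated list.
  const wanted = (request['Access-Control-Request-Headers'] || '')
    .split(',')
    .map((h) => h.trim().toLowerCase())
    .filter(Boolean);

  const originOk = policy.origins.includes(origin);
  const methodOk = policy.methods.includes(method);
  const headersOk = wanted.every((h) => policy.headers.includes(h));
  if (!originOk || !methodOk || !headersOk) return null;

  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Methods': policy.methods.join(', '),
    'Access-Control-Allow-Headers': policy.headers.join(', '),
  };
}

// Hypothetical policy: only the React dev server may call the backend.
const policy = {
  origins: ['http://localhost:3000'],
  methods: ['GET', 'POST', 'PUT', 'DELETE'],
  headers: ['content-type', 'elastic-apm-traceparent'],
};

console.log(answerPreflight({
  'Origin': 'http://localhost:3000',
  'Access-Control-Request-Method': 'POST',
  'Access-Control-Request-Headers': 'content-type, elastic-apm-traceparent',
}, policy));
```

If the tracing header were missing from the allowed headers, the function would return null, which corresponds to the browser blocking the real request.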

Let’s close the blog with one more powerful feature of the RUM agent: source maps. Source maps make debugging errors with your application much easier by telling you exactly where the error happened in your original source code, instead of the cryptic minified code.

Easy debugging with source maps

It is common practice to minify JavaScript bundles for production deployment, for performance reasons. However, debugging minified code is inherently difficult. The screenshot below shows an error message captured by the RUM agent for a production build. As you can see, the exception stack trace does not make much sense, because it’s minified code. All the error lines show JavaScript files like 2.5e9f7401.chunk.js, and they always point to “line 1” because of the way minification was done. Wouldn’t it be nice if you could see your source code here exactly as you wrote it?

Minified code in Elastic APM can be resolved with a source map

Summary

Hopefully this blog made it clear that instrumenting your applications with Elastic RUM is simple, yet extremely powerful. Together with other APM agents for backend services, RUM gives you a holistic view of application performance from an end user perspective through distributed tracing.

Once again, to get started with Elastic APM, you can download Elastic APM server to run it locally, or create an Elastic Cloud trial account and have a cluster ready in a few minutes.

As always, reach out on the Elastic APM forum if you want to open up a discussion or have any questions. Happy RUMing!

https://www.elastic.co/blog/performing-real-user-monitoring-rum-with-elastic-apm