Tools enabling Real User Monitoring, or RUM (also known as End User Experience, or EUE, among other similar names), have proliferated hugely over the last 10 years, to the point where countless hopeful vendors are now jostling for position.
RUM certainly adds a powerful additional perspective to external monitoring. Whilst it does not replace "traditional" active (synthetic) testing (the two are complementary), it can dramatically improve some aspects of end-user performance visibility, particularly for companies with a broad international reach.
As always, a crowded marketplace can contain sheep as well as goats. I thought that it might be useful to share some musings on RUM product distinctions. The relative importance of each will depend on particular circumstances, and there is a degree of functional convergence occurring, at least among the players who are carving out significant market share (and who clearly wish to be around for the long term). Hopefully these will be helpful to the uninitiated who are considering a purchase decision.
Ten points to consider about your potential RUM tool (in no particular order):
1. Sophistication/Coverage
Many RUM products are based on the standard W3C Navigation Timing metrics. Although the availability of these standard page load metrics is one of the drivers of growth in tooling options, be aware that they are not supported in all browsers; the main gaps are older browser versions and Safari. In certain cases, products collect basic performance data from such browsers to supplement the core W3C metrics and offer more complete coverage.
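As a rough illustration of what "standard page load metrics" means in practice, the sketch below derives three common durations from a single beacon payload. The attribute names follow the W3C Navigation Timing (Level 1) interface; the function name `page_metrics`, the choice of three metrics, and the idea that the payload arrives as a plain dictionary are assumptions made for the example, not a description of any particular product.

```python
from typing import Optional


def page_metrics(timing: dict) -> Optional[dict]:
    """Derive page-level durations (ms) from one beacon's timing payload.

    Keys follow the W3C Navigation Timing (Level 1) attribute names and hold
    epoch milliseconds. Returns None when the browser supplied no Navigation
    Timing data, so the caller can fall back to whatever basic timing the
    tag recorded itself.
    """
    start = timing.get("navigationStart")
    if not start:
        return None

    def since_start(name: str) -> Optional[int]:
        value = timing.get(name)
        # Attributes that have not been reached yet are reported as 0.
        return value - start if value else None

    return {
        "time_to_first_byte": since_start("responseStart"),
        "dom_content_loaded": since_start("domContentLoadedEventEnd"),
        "page_load": since_start("loadEventEnd"),
    }
```

A product that supplements the W3C metrics for unsupported browsers would branch on the `None` case here and report its own, coarser measurements instead.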
Key aspects of sophistication include:
- Ability to record user journeys (logical transactions). Less evolved products act at individual page level only.
- Ability to capture and report individual session-level data – reporting on business relevant metrics such as transaction abandonment and shopping cart conversion by different categories of user.
- Detailed reporting – bounce rate, stickiness (time on site), etc. A tabular comparison between candidate products may be useful here.
- Ability to record "above the fold" performance (browser fill, aka perceived render time). This metric is rarely estimated by active monitoring tools (the only one of which I am aware is WebPageTest). As such, RUM tooling supporting it gives a useful additional perspective on end-user satisfaction. Beware of such metrics as "time to paint" which, whilst conceptually similar, are of much less value: the key measure is the point at which a given user regards the page as having loaded (filled the browser viewport).
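To make the session-level point concrete, here is a minimal sketch of the kind of calculation a more evolved product performs: transaction abandonment broken down by category of user. The record fields (`category`, `started_checkout`, `completed_checkout`) are hypothetical names chosen for the example; real products define their own session schema.

```python
from collections import defaultdict


def abandonment_by_category(sessions):
    """Return {category: abandonment rate} over sessions that began the transaction."""
    started = defaultdict(int)
    abandoned = defaultdict(int)
    for session in sessions:
        if not session["started_checkout"]:
            continue  # never entered the transaction, so cannot abandon it
        started[session["category"]] += 1
        if not session["completed_checkout"]:
            abandoned[session["category"]] += 1
    return {category: abandoned[category] / started[category] for category in started}
```

A page-level-only product cannot produce this figure, because it has no way of tying the first and last pages of a journey to the same visitor.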
2. Standalone or Integrated
RUM tooling operates by instrumenting site pages with a JavaScript beacon that writes back collected data following the unload step. The JavaScript may be deployed in one of two ways:
- By manual instrumentation of the site. This requires insertion of the JavaScript as high in the page headers as possible (unlike behavioral tags, a performance tag must be triggered early in the page download in order to time subsequent steps accurately). Depending upon the number of pages to be instrumented, it may be more practical to insert via a simple cut and paste operation or via an include statement. Manual instrumentation has the disadvantage of introducing an ongoing maintenance overhead for upgrading versions etc.
- By dynamic injection. Some tools offer the option of dynamically injecting the tag from the application server.
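Dynamic injection can be pictured as a small filter on the application server that rewrites outgoing HTML. The sketch below is one possible shape for such a filter, not how any specific vendor does it: `inject_rum_tag` and the script URL are hypothetical, and a production implementation would hook into the server's response pipeline rather than operate on a string.

```python
import html
import re

# Matches the opening <head> tag (with or without attributes), but not <header>.
_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def inject_rum_tag(page: str, script_url: str) -> str:
    """Insert the RUM script tag immediately after the opening <head> tag.

    Placing the tag first in the head reflects the point made above: a
    performance tag has to run early to time the subsequent steps.
    Pages without a <head>, or already carrying the tag, are returned unchanged.
    """
    tag = '<script src="{}" async></script>'.format(html.escape(script_url, quote=True))
    match = _HEAD_OPEN.search(page)
    if match is None or tag in page:
        return page
    return page[: match.end()] + tag + page[match.end():]
```

The attraction over manual instrumentation is that a version upgrade becomes a single change on the server rather than an edit to every instrumented page.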
Integrated products offer RUM functionality as part of "end to end" visibility of traffic as a component of an Application Performance Management (APM) tool set. While typically involving greater investment, such products often offer better overall value through their enhanced "root cause" isolation ability.
3. Real-Time Reporting
Tools vary in two principal ways with regard to data handling:
- The duration of storage of captured data. As with other monitoring, the problem for vendors storing customer data is that they rapidly become data storage rather than monitoring companies. However, the ability to view trend data over extended periods is extremely useful, so individual vendor strategies to manage that requirement are relevant. This problem is exacerbated (for the vendor) if object level metrics are captured. Understand the options in this area.
- The frequency of update of customer data. This can vary between 24 hours and less than 5 minutes. Near real-time updates are relevant to active operations management, whereas daily updates are of historical value only.
4. All Traffic or Traffic Sampling
As RUM data is inferential in nature, it is important to capture all visitor traffic rather than a sample. Some tooling offers the option of user-defined sampling, often to reduce license costs. This is unlikely to be good practice except possibly in the case of extremely high traffic sites. Within Europe, coverage is already reduced by EU legislation enabling individual users to opt for "Do Not Track" headers, which restrict the transmission of tag-based data; sampling limits overall coverage further.
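Where sampling is unavoidable, it helps to understand how the decision is made. The sketch below shows one common approach, hashing a session identifier so that a given session is either captured in full or not at all; this keeps user journeys intact rather than dropping random pages from them. The function name `in_sample` and the use of SHA-256 are my own assumptions for illustration; vendors may well sample differently.

```python
import hashlib


def in_sample(session_id: str, rate: float) -> bool:
    """Deterministically decide whether a session is captured at the given rate (0 to 1).

    The same session_id always yields the same answer, so every page view
    in a captured session is recorded.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < rate * 2**64
```

At a rate of 0.1, roughly one session in ten is recorded, which is worth bearing in mind when reading percentile figures for low-traffic pages or regions.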
5. API
RUM tooling will always provide some output charting, etc. Additional value can be derived from using an API to integrate RUM data with outputs from other tooling. This is particularly so for those RUM products that do not report on session-level data such as conversion and abandonment rates. In such cases, it is beneficial to combine that data from web analytics with RUM-based performance metrics.
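As a simple picture of that combination, the sketch below joins per-page performance figures from a RUM export with per-page conversion rates from a web analytics export. Both inputs are assumed to be lists of dictionaries already retrieved through each tool's API; the field names (`page`, `median_load_ms`, `conversion_rate`) are hypothetical.

```python
def join_by_page(rum_rows, analytics_rows):
    """Attach each page's conversion rate to its RUM performance row.

    Pages present in only one of the two exports are left out, since
    there is nothing to compare them against.
    """
    conversions = {row["page"]: row["conversion_rate"] for row in analytics_rows}
    return [
        {**row, "conversion_rate": conversions[row["page"]]}
        for row in rum_rows
        if row["page"] in conversions
    ]
```

The joined rows can then be charted or fed to a dashboard to show, for example, whether slower pages convert less well.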
6. Page or Object Level
Although, theoretically, all products could be extended to capture object level rather than page delivery metrics, in practice this tends to be restricted to the capture of specific individual objects (often for reasons associated with data handling as mentioned above).
7. User Event Capture
The ability to record the time between two events (e.g. mouse clicks). Such sub-page instrumentation is of value in supporting design and development decisions.
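The underlying calculation is straightforward once the events have been captured. The sketch below assumes the tool delivers an ordered list of `(event_name, timestamp_ms)` pairs for a page view; that format, the event names in the usage example, and the function name `time_between` are all assumptions made for illustration.

```python
def time_between(events, start_name, end_name):
    """Return ms from the first start_name event to the next end_name event.

    Returns None if the start event never occurs, or if no end event follows it.
    """
    start_ts = None
    for name, ts in events:
        if start_ts is None:
            if name == start_name:
                start_ts = ts
        elif name == end_name:
            return ts - start_ts
    return None
```

For example, given a click on a search button at 1200 ms and the results rendering at 1850 ms, `time_between(events, "click_search", "results_rendered")` returns 650.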
8. Extensibility
The ability to capture and integrate non-performance user data. Examples include: associating user login details with session performance, and collecting details of originating application or database server.
9. Reporting
Extent, type and customizability of in-built reporting. Various aspects include:
- Standard, inbuilt reports – extent and ease of use / comprehensibility.
- Custom reporting functions.
- Ease of data export – for manipulation / integration / display by external (e.g. dashboard) tools.
10. Investment
License models vary. Most are based on some combination of extent of visitor traffic (monthly or annual) and number of domains to be monitored. Some APM vendors include the cost of the RUM component within the overall agent license investment.
Although the importance of any particular aspect will vary depending upon precise use case and the nature of the application, hopefully the above will provide a useful checklist for initial engagement. Happy hunting!
Larry Haig is Senior Consultant at Intechnica.