UI Performance Rendering

Understanding the Android rendering process is essential for identifying areas of improvement and optimization. The rendering process relies on multiple system parts, and each task can introduce rendering issues that point to where optimization is needed.

Android UI Rendering

To optimize Android UI rendering performance, it's crucial to understand the different tasks involved in the rendering process, as illustrated in the diagram below:

[Diagram: the Android UI Rendering Process, from VSync through to the graphics buffers]

As the UI Rendering Process graph shows, a lot of work happens between the VSync signal firing and the frame reaching the graphics buffers, and each step contributes to how optimal and efficient rendering can be. Let's dive deeper into each of these tasks.

VSync: A VSync signal fires many times per second, at the display's refresh rate. At the Choreographer level, each VSync pulse kicks off the preparation of the information needed to render the next frame, between two consecutive frames, and dispatches the pending UI events so they can be handled in the "Java SDK land."
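
As a side note, the Choreographer that drives this VSync step is also reachable from app code. Below is a minimal sketch, not part of the pipeline itself, that registers a frame callback to log frames taking noticeably longer than one VSync interval; the FrameLogger class and the 16.7 ms threshold are illustrative assumptions for a 60 Hz display.

```kotlin
import android.util.Log
import android.view.Choreographer

// Illustrative helper: logs frames whose VSync-to-VSync time exceeds ~16.7 ms (60 Hz).
class FrameLogger : Choreographer.FrameCallback {
    private var lastFrameTimeNanos = 0L

    override fun doFrame(frameTimeNanos: Long) {
        if (lastFrameTimeNanos != 0L) {
            val frameMillis = (frameTimeNanos - lastFrameTimeNanos) / 1_000_000.0
            if (frameMillis > 16.7) {
                Log.w("FrameLogger", "Slow frame: %.1f ms".format(frameMillis))
            }
        }
        lastFrameTimeNanos = frameTimeNanos
        // Re-register so the callback fires on the next VSync pulse as well.
        Choreographer.getInstance().postFrameCallback(this)
    }

    fun start() = Choreographer.getInstance().postFrameCallback(this)
}
```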

Input: After the VSync operation, the UI Thread processes input events, which can trigger changes in properties like background, text, and text color.

Animation: The UI Thread also handles animations, which might trigger changes in properties.

After input and animation are handled, the rendering process goes through the traversal pass, which includes, as shown in the graph, the next three tasks: Measure, Layout, and Draw.

Measure: In this step, the UI thread measures the frame to be rendered, which essentially means determining the width and height of each view.

Layout: This step lays out the frame to be rendered, dealing with the positioning of views and the management of margins and paddings.

Draw: At this level the UI thread processes the drawing of the frame, which does not mean drawing it on the screen but rather preparing how the frame is going to be drawn.

To do so, the UI thread retrieves a DisplayList for the frame that needs to be redrawn. A DisplayList is a structure that stores the rendering information of a view. If we take a TextView, for example, and look at how its code is written, we find a series of graphics commands such as DrawBackground, DrawLine, and DrawDrawable (drawables being icons or images in Android parlance), and these graphics commands end up recorded in a display list.

The DisplayList is then retrieved for essentially the entire hierarchy: not only the view itself, but the whole view hierarchy is reproduced as a hierarchy of DisplayLists, all the way down to the view being updated.

Sync: Once the UI thread has gone through the traversal and prepared this information, it syncs it to the render thread, which essentially means sending the DisplayList hierarchy to the Render Thread.

Execute: In this step, the render thread executes the received information, essentially converting it into a native version of the "Java" version produced on the UI thread.

Get Buffer: In this step, the render thread simply fetches a buffer from the graphics layer.

Issue: After obtaining a buffer, the render thread starts issuing OpenGL commands to draw the new frame.

Swap Buffer: Finally, the render thread swaps the buffer and hands it over to the GPU.

These are the steps of the rendering process, happening across the different layers visible in the Android UI Rendering Process graph.

Android UI Rendering Benchmarking

Using the Profile GPU Rendering Tool

The Profile GPU Rendering tool in Android helps you understand where your app's rendering performance can be optimized. By generating a visual representation of rendering times, it allows you to pinpoint exactly where improvements are needed.

Enabling Profile GPU Rendering

To enable the Profile GPU Rendering tool:

  1. Open Developer Options:
    • On your device, go to Settings > System > Developer options. If Developer options is not visible, go to Settings > About phone and tap Build number seven times to enable it.
  2. Enable Profile GPU Rendering:
    • In Developer options, scroll down to Profile GPU rendering and select either On screen as bars or In adb shell dumpsys gfxinfo.

When the Profile GPU Rendering tool is enabled, a bar chart is displayed on your screen, representing the rendering time for each frame. Each bar is made up of different colors, corresponding to different stages of the rendering pipeline.

Identifying Bottlenecks

Understanding each stage of the rendering pipeline is crucial for diagnosing performance issues. Each stage corresponds to a specific color in the GPU rendering profile, providing insights into where your app spends time during each frame.

Miscellaneous

Description: Represents non-rendering work occurring on the main thread between two consecutive frames.

When this segment is large:

  • Cause: Callbacks, intents, or other non-rendering tasks running on the main thread.
  • Solutions: Offload work to background threads where possible. Use tools like Method tracing or Systrace to identify and optimize tasks running on the main thread.
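As a rough illustration of that offloading, here is a minimal sketch that moves CPU-heavy work off the main thread with Kotlin coroutines (assuming the coroutines and lifecycle-viewmodel-ktx libraries are available); the ReportViewModel, Report type, and parseLargeReport() helper are hypothetical and stand in for whatever non-rendering work is inflating this segment.

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Hypothetical model used only for illustration.
data class Report(val title: String)

class ReportViewModel : ViewModel() {

    fun refreshReport(onReady: (Report) -> Unit) {
        viewModelScope.launch {
            // Heavy parsing runs on a background dispatcher, keeping the main
            // thread free between frames.
            val report = withContext(Dispatchers.Default) { parseLargeReport() }
            onReady(report) // back on the main thread
        }
    }

    private fun parseLargeReport(): Report {
        // Placeholder for CPU-heavy work that should never run on the main thread.
        return Report(title = "Quarterly summary")
    }
}
```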

Input Handling

Description: Measures the time spent handling input events, including code execution resulting from input event callbacks.

When this segment is large:

  • Cause: Too much or too complex work inside input-handler event callbacks. Since these callbacks run on the main thread, they can cause performance issues.
  • Solutions: Optimize the work within these callbacks or offload work to a different thread. Additionally, RecyclerView scrolling can appear here; ensure operations like view inflation or population are efficient. Profiling tools like Traceview or Systrace can help investigate further.
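As a minimal sketch of keeping input callbacks light, the click listener below only toggles UI state and schedules the heavy work on a background executor; the exportCurrentScreen() helper is hypothetical.

```kotlin
import android.view.View
import java.util.concurrent.Executors

private val backgroundExecutor = Executors.newSingleThreadExecutor()

fun bindExportButton(button: View) {
    button.setOnClickListener { clicked ->
        clicked.isEnabled = false              // cheap UI feedback stays on the main thread
        backgroundExecutor.execute {
            exportCurrentScreen()              // expensive work runs off the main thread
            clicked.post { clicked.isEnabled = true } // hop back to the main thread when done
        }
    }
}

// Placeholder for work that would otherwise inflate the Input handling segment.
fun exportCurrentScreen() { Thread.sleep(200) }
```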

Animation

Description: Shows how long it took to evaluate all animators running in the frame, such as ObjectAnimator, ViewPropertyAnimator, and Transitions.

When this segment is large:

  • Cause: Property changes due to animations, like fling animations in ListView or RecyclerView, which cause significant view inflation and population.
  • Solutions: Simplify animations or reduce the number of simultaneous animations. Investigate using tools like Traceview or Systrace to identify specific issues.
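One way to keep this segment small, sketched below for a simple fade-and-slide effect, is to let ViewPropertyAnimator run the animation on a hardware layer so the view is rasterized once and composited by the GPU instead of being redrawn on every frame.

```kotlin
import android.view.View

// Minimal sketch: animate alpha and translation on a hardware layer.
fun fadeAndSlideOut(view: View) {
    view.animate()
        .alpha(0f)
        .translationX(view.width.toFloat())
        .setDuration(300L)
        .withLayer() // promote to a hardware layer only for the duration of the animation
        .withEndAction { view.visibility = View.GONE }
        .start()
}
```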

Measurement/Layout

Description: Involves measuring the size of view items and laying them out on the screen. This stage calculates sizes and positions for all views in the hierarchy.

When this segment is large:

  • Cause: A large number of views to layout, or inefficiencies in the view hierarchy. Issues can also arise from code in onLayout(boolean, int, int, int, int) or onMeasure(int, int).
  • Solutions: Simplify and optimize view hierarchies, and improve the performance of layout and measurement code. Profiling tools like Traceview and Systrace can help identify bottlenecks.
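One common way to cut measure/layout cost, shown in the sketch below, is to declare rarely shown views as a ViewStub so they do not take part in traversal until they are first used; the R.id.error_stub and R.id.error_panel ids are hypothetical and would come from your own layout.

```kotlin
import android.app.Activity
import android.view.View
import android.view.ViewStub

// Hypothetical ids: the error panel is declared as a <ViewStub> in the layout, so it
// adds no measure/layout work until the first time it is actually shown.
fun showErrorPanel(activity: Activity) {
    val stub = activity.findViewById<ViewStub?>(R.id.error_stub)
    val panel: View = stub?.inflate()                 // first call: inflate the real layout
        ?: activity.findViewById(R.id.error_panel)    // later calls: the stub has been replaced
    panel.visibility = View.VISIBLE
}
```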

Draw

Description: Translates view rendering operations (e.g., drawing a background or text) into native drawing commands, recorded in a display list.

When this segment is large:

  • Cause: High complexity in onDraw() methods, or many views becoming invalidated at once, requiring regeneration of display lists.
  • Solutions: Simplify custom views, optimize onDraw() logic, and minimize unnecessary view invalidations. Investigate using profiling tools to find specific issues.
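As a minimal sketch of keeping onDraw() cheap, the hypothetical BadgeView below allocates its Paint once, issues only simple drawing commands (which end up in its display list), and invalidates itself only when its state actually changes.

```kotlin
import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.util.AttributeSet
import android.view.View

class BadgeView @JvmOverloads constructor(
    context: Context,
    attrs: AttributeSet? = null
) : View(context, attrs) {

    // Allocated once, never inside onDraw().
    private val paint = Paint(Paint.ANTI_ALIAS_FLAG).apply { color = Color.RED }

    var count: Int = 0
        set(value) {
            if (field != value) {
                field = value
                invalidate() // regenerate this view's display list only when needed
            }
        }

    override fun onDraw(canvas: Canvas) {
        super.onDraw(canvas)
        // Only drawing commands here; no per-frame object allocation.
        canvas.drawCircle(width / 2f, height / 2f, width / 2f, paint)
    }
}
```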

Sync/Upload

Description: Represents the time taken to transfer bitmap objects from CPU memory to GPU memory.

When this segment is large:

  • Cause: Large bitmaps or a high number of resource loads. This stage is critical for ensuring resources are available in GPU memory before rendering.
  • Solutions: Optimize bitmap resolutions to match display sizes, and use prepareToDraw() to pre-upload bitmaps asynchronously.
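A minimal sketch of the prepareToDraw() suggestion follows; the R.drawable.hero resource is hypothetical, and the idea is simply to decode the bitmap and request its GPU upload off the main thread before it is handed to an ImageView.

```kotlin
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.widget.ImageView
import java.util.concurrent.Executors

private val decodeExecutor = Executors.newSingleThreadExecutor()

fun loadHeroImage(imageView: ImageView) {
    decodeExecutor.execute {
        val bitmap: Bitmap =
            BitmapFactory.decodeResource(imageView.resources, R.drawable.hero)
        bitmap.prepareToDraw() // ask for the texture upload ahead of the frame that shows it
        imageView.post { imageView.setImageBitmap(bitmap) } // back to the main thread
    }
}
```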

Issue Commands

Description: Measures the time to issue all commands necessary for drawing display lists to the screen, typically via OpenGL ES.

When this segment is large:

  • Cause: High complexity and quantity of display lists. For instance, many draw operations or inefficient commands.
  • Solutions: Reduce the number of draw operations and optimize command batching. Investigate using profiling tools to identify command inefficiencies.

Process/Swap Buffers

Description: Represents the time taken for the graphics driver to present the updated image to the screen.

When this segment is large:

  • Cause: The GPU is overburdened with commands, causing the CPU to wait for space in the command queue.
  • Solutions: Reduce the complexity of GPU work by optimizing shaders, reducing overdraw, and simplifying rendering logic. Profiling tools can help identify specific bottlenecks.
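One small, GPU-friendly change in this area, sketched below, is dropping the default window background when the root layout already paints an opaque full-screen background, which removes a full layer of overdraw; R.layout.activity_main stands in for such a layout.

```kotlin
import android.app.Activity
import android.os.Bundle

class MainActivity : Activity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Hypothetical layout whose root view already draws an opaque background.
        setContentView(R.layout.activity_main)
        // The theme's window background would only be painted underneath it,
        // so clearing it removes one full-screen layer of GPU work.
        window.setBackgroundDrawable(null)
    }
}
```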

Understanding these stages and their potential bottlenecks allows you to target specific areas for optimization, improving overall rendering performance and providing a smoother user experience.

Android Studio Profiler

Fixing performance problems involves identifying areas in which your app makes inefficient use of resources such as the CPU, memory, graphics, network, or the device battery.

To find and fix these problems, we use the Android Profiler tools that provide real-time data to help you to understand how your app uses CPU, memory, network, and battery resources.

The Android Profiler lets you inspect three different areas:

  • Profile CPU activity
  • Profile memory usage
  • Profile energy use

Profiling is essential for identifying performance bottlenecks and optimizing your app's UI rendering, which directly impacts user experience and app quality. To open the Profiler window, choose View > Tool Windows > Profiler or click Profile in the toolbar.

Profile CPU activity

Optimizing your app’s CPU usage has many advantages, such as providing a faster and smoother user experience and preserving device battery life.

For example:

  • Reduced jank and smoother animations
  • Faster app responsiveness
  • Improved overall app performance

You can use the CPU Profiler to inspect your app’s CPU usage and thread activity in real time while interacting with your app, or you can inspect the details in recorded traces.

To open the CPU Profiler, follow these steps:

  1. Open Profiler
  2. Click anywhere in the CPU timeline to open the CPU Profiler.

When you open the CPU Profiler, it immediately starts displaying your app’s CPU usage and thread activity.

  1. Event timeline: Shows the activities in your app as they transition through different states in their lifecycle, and indicates user interactions with the device, including screen rotation events.
  2. CPU timeline: Shows real-time CPU usage of your app as a percentage of total available CPU time and the total number of threads your app is using. The timeline also shows the CPU usage of other processes (such as system processes or other apps), so you can compare it to your app’s usage. You can inspect historical CPU usage data by moving your mouse along the horizontal axis of the timeline.
  3. Thread activity timeline: Lists each thread that belongs to your app process and indicates its activity along a timeline using 3 colors. After you record a trace, you can select a thread from this timeline to inspect its data in the trace pane.
    • Green: The thread is active or is ready to use the CPU. It's in a running or runnable state.
    • Yellow: The thread is active, but it’s waiting on an I/O operation (for example, disk or network I/O) before it can complete its work.
    • Gray: The thread is sleeping and is not consuming any CPU time. This sometimes occurs when the thread requires access to a resource that is not yet available. Either the thread goes into voluntary sleep, or the kernel puts the thread to sleep until the required resource becomes available.

Record trace

The CPU Profiler gives you the tool to record a trace for a period of time and inspect the details later.

To start recording a trace, click Record while in the CPU Profiler. Interact with your app, and when you're done, click Stop.

The profiler automatically displays its tracing information in the trace pane.

  1. Selected range: Determines the portion of the recorded time to inspect in the trace pane. When you first record a trace, the CPU Profiler automatically selects the entire length of your recording in the CPU timeline. To inspect trace data for only a portion of the recorded time range, drag the edges of the highlighted region.
  2. Interaction section: Displays user interaction and app lifecycle events along a timeline.
  3. Threads section: Displays thread state activity (such as running, sleeping, etc.) and Call Chart (or trace event chart in System Trace) for every thread along a timeline.
    • Use your mouse to navigate the timeline.
    • Double-click the thread name or press Enter while a thread is selected to expand or collapse a thread.
    • Select a thread to see additional information in the Analysis pane. Hold Shift or Command to select multiple threads.
    • Select a method call to see additional information in the Analysis pane.
  4. Analysis pane: Displays trace data for the time range and thread or method call you have selected. In this pane, you can select how to view each stack trace (using the analysis tabs) and how to measure execution time (using the time reference dropdown menu).
  5. Analysis pane tabs: Choose how to display trace details.
  6. Time reference menu: Select one of the following to determine how timing information for each call is measured:
    • Wall clock time: Timing information represents actual elapsed time.
    • Thread time: Timing information represents actual elapsed time minus any portion of that time when the thread is not consuming CPU resources. For any given call, its thread time is always less than or equal to its wall clock time. Using thread time gives you a better understanding of how much of a thread’s actual CPU usage is consumed by a given method or function.
  7. Filter: Filters trace data by function, method, class, or package name.

Profile memory usage

The Memory Profiler helps you identify memory leaks that can lead to freezes, and even app crashes. It shows a realtime graph of your app's memory use and lets you capture a heap dump, force garbage collections, and track memory allocations.

The HEAP

Before profiling memory usage, we need to understand how memory works on the Android system.

An Android device has a fixed amount of memory that can be used at any given time.

Anything the device runs can consume that memory: objects, threads, or any other process, such as starting a new application or instantiating objects.

This fixed amount of memory is known as the heap.

The heap on an Android device essentially denotes the amount of free memory available at any given time.

You can think of the heap as a pile of memory blocks: every time something is done that requires memory, a block is removed from the heap and used for something (such as a new object).

To prevent the Android system from being overwhelmed under heavy load, when a lot of memory is in use (and the heap is running low), the system includes a garbage collector (GC). Its job is to go through each application, find objects, threads, or processes that are no longer needed, and return their memory blocks to the heap.

The garbage collector works independently of everything else and is part of the Android system.
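To make this concrete, here is a minimal sketch of the kind of leak the Memory Profiler surfaces: a process-wide singleton keeps a strong reference to a View (and, through it, to its whole Activity), so the garbage collector can never return that memory to the heap. The ToolbarCache object is purely illustrative.

```kotlin
import android.content.Context
import android.view.View

object ToolbarCache {
    // Leak: the View holds its Activity's Context, so the whole Activity stays reachable.
    var lastToolbar: View? = null

    fun remember(toolbar: View) {
        lastToolbar = toolbar
    }

    // Safer alternative when only a Context is needed: keep the application context.
    fun appContext(view: View): Context = view.context.applicationContext
}
```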

Memory profiler

When you first open the Memory Profiler, you'll see a detailed timeline of your app's memory use and access tools to force garbage collection, capture a heap dump and record memory allocations.

  1. A button to force a garbage collection event.
  2. A button to capture a heap dump.
  3. A dropdown menu to specify how frequently the profiler captures memory allocations.
  4. Buttons to zoom in/out of the timeline.
  5. A button to jump forward to the live memory data.
  6. The event timeline, which shows the activity states, user input events and screen rotation events.
  7. The memory use timeline, which includes the following:
    • A stacked graph of how much memory is being used by each memory category.
    • A dashed line indicating the number of allocated objects.
    • An icon for each garbage collection event.

How memory is counted

The numbers you see when you hover the mouse over the graph are based on all the private memory pages that your app has committed, according to the Android system.

The categories in the memory count are as follows:

  • Java: Memory from objects allocated from Java or Kotlin code.
  • Native: Memory from objects allocated from C or C++ code.
  • Graphics: Memory used for graphics buffer queues to display pixels to the screen, including GL surfaces, GL textures, and so on. (Note that this is memory shared with the CPU, not dedicated GPU memory.)
  • Stack: Memory used by both native and Java stacks in your app. This usually relates to how many threads your app is running.
  • Code: Memory that your app uses for code and resources, such as dex bytecode, optimized or compiled dex code, .so libraries, and fonts.
  • Others: Memory used by your app that the system isn't sure how to categorize.
  • Allocated: The number of Java/Kotlin objects allocated by your app. This does not count objects allocated in C or C++.

Memory allocations

You can select a specific time range in the profiler graph and see a list of allocated objects grouped by class name and sorted by their heap count.

Memory allocations show you how each Java object in your memory was allocated. Specifically, the Memory Profiler can show you the following about object allocations:

  • What types of objects were allocated and how much space they use.
  • The stack trace of each allocation, including in which thread.
  • When the objects were deallocated (only when using a device with Android 8.0 or higher).
  1. List of objects allocated grouped by type.
  2. List of object instances.
  3. Where the object instance was allocated and in which thread.

Profile energy use

The Energy Profiler helps you to find where your app uses more energy than necessary.

The Energy Profiler monitors the use of the CPU, network radio, and GPS sensor, and it displays a visualization of how much energy each of these components uses.

The Energy Profiler does not directly measure energy consumption; rather, it uses a model that estimates the energy consumption of each resource on the device.

  1. Event timeline: Shows the activities in your app as they transition through different states in their lifecycle. This timeline also indicates user interactions with the device, including screen rotation events.
  2. Energy timeline: Shows estimated energy consumption of your app.
  3. System timeline: Indicates "system events" that may affect energy consumption.

When you hover your mouse over the chart at a given point in time, the profiler shows you the use of the CPU, network radio, and GPS sensor, and how much energy each of these components uses.

System events

System events are represented with color-coded bars in the system timeline and can be categorized as follows:

  • Wake locks are a mechanism for keeping the CPU or screen on when the device would otherwise go to sleep in the absence of user interaction. Failing to release a wake lock can keep the screen on longer than necessary and drain the battery quickly, so check your app for any unnecessary wake locks (a minimal acquire/release sketch follows this list). Wake locks are represented with a red bar.
  • Alarms are background tasks that run outside the context of your app at regular intervals; they give you the ability to wake up the device at a specified time. If wakeup alarms are triggered excessively, they can drain the device's battery. Alarms are represented with a yellow bar.
  • Jobs are also background tasks that run under specified conditions, such as when the network becomes available. Jobs also are represented with a yellow bar.
  • Location requests use the GPS sensor, which can consume significant amounts of energy. It’s represented with a purple bar.
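Here is the wake-lock sketch mentioned above: the lock is acquired with a timeout and released in a finally block, so a failure in the work cannot leave the CPU awake indefinitely. The "myapp:sync" tag and syncInbox() helper are hypothetical.

```kotlin
import android.content.Context
import android.os.PowerManager

fun syncWithWakeLock(context: Context) {
    val powerManager = context.getSystemService(Context.POWER_SERVICE) as PowerManager
    val wakeLock = powerManager.newWakeLock(PowerManager.PARTIAL_WAKE_LOCK, "myapp:sync")
    wakeLock.acquire(10 * 60 * 1000L) // safety timeout: 10 minutes
    try {
        syncInbox()
    } finally {
        if (wakeLock.isHeld) wakeLock.release()
    }
}

// Placeholder for work that must finish even if the screen turns off.
fun syncInbox() { Thread.sleep(500) }
```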

Conclusion

In this article, we explored the critical aspects of UI performance in Android applications, focusing on overdraw and rendering. We examined what overdraw is, how to debug and analyze it, and provided strategies to fix it. Additionally, we delved into the Android rendering process, including the various stages involved, and how to benchmark rendering performance using the Profile GPU Rendering tool. Lastly, we covered the Android Studio Profiler's capabilities in diagnosing and optimizing CPU, memory, and energy usage.

By understanding and addressing these performance issues, you can create more efficient and responsive applications, ultimately leading to a better user experience and improved device performance. To get started with optimizing your app's UI performance, begin by profiling your app using Android Studio's built-in tools. Identify areas with high CPU usage or excessive rendering time, and apply the techniques discussed in this article to address these issues. Remember to test your optimizations on various devices to ensure consistent performance across different hardware configurations. Regularly profiling your app and applying these optimization techniques will help ensure that your app runs smoothly and efficiently on a wide range of devices.

References

The information and insights provided in this blog post were gathered from various sources to ensure accuracy and comprehensiveness. Below are the key references:

  1. Profiling GPU Rendering on Android - YouTube video detailing the process of profiling GPU rendering on Android.
  2. Profile GPU Rendering - Official documentation on profiling GPU rendering from the Android Developers site.
  3. Inspect GPU Rendering - Detailed guide on inspecting GPU rendering on Android from the Android Developers site.

These resources offer valuable information and practical guidance on optimizing and profiling GPU rendering in Android applications.
