[UPDATE: This blog has been updated for accuracy and comprehensiveness. Read our Comparison of ProGuard vs. R8: October 2019 edition.]
The new Android D8 compiler is now the default compiler for Dalvik bytecode. The optimizing compiler R8 is looming on the horizon, so we get a lot of questions about how it relates to ProGuard. In this blog, we'll have a closer look.
History and background of R8
Dalvik bytecode is a fundamental part of most apps, so it rightly receives a lot of attention in the build process. Developers write their code in Java or Kotlin, but Android devices expect Dalvik bytecode. The compilation process for optimized releases has long looked as follows:
The traditional Java compiler compiles source code to Java bytecode. ProGuard can optionally optimize this code, producing Java bytecode that is smaller and faster. The dx compiler finally converts this Java bytecode to Dalvik bytecode. The Dalvik bytecode is packaged in the apk file and eventually installed on the device. Depending on the version of Android, the bytecode is interpreted, compiled just in time (Dalvik VM), ahead of time (ART), or a combination of both (Android P). There's a wide range of Android devices with many constraints in processing power, memory, and bandwidth, so even in this day and age, the size and efficiency of the bytecode remain important. ProGuard typically reduces the bytecode size by 20-50% and improves the bytecode performance by anything up to 20%.
For developers, the performance of the build process is at least as important, so around 2015, the Android team introduced the compilers Jack and Jill. They integrated the functionality of the Java compiler, ProGuard, and the Dalvik compiler in a single step:
This greatly streamlined the build process, but didn't play very nicely with the ecosystem of languages and tools that work with Java bytecode. The Android team abandoned it in 2017. With the new D8 compiler, they take a step back, simply replacing the dx compiler with a fresh implementation:
This setup provides a more gentle evolution, accommodating external tools. If nothing else, it still allows the Kotlin language to grow successfully. Moreover, D8 already produces better-quality bytecode than dx, with fewer instructions and better register allocation.
That still leaves room for optimizing the build process: R8 is a spin-off of D8 that aims to integrate the functionality of ProGuard and D8:
R8 is in an early stage and still under heavy development, but let's already have a peek.
Goals
ProGuard and R8 have three important functions:
- Shrinking or tree shaking: removes unused classes, fields and methods from the application.
- Code optimization: makes the code smaller and more efficient at the instruction level.
- Name obfuscation: renames the remaining classes, fields and methods with short meaningless names. At this point, it mostly reduces the size of the code.
Shrinking and name obfuscation are fairly standard techniques. We did notice a few differences between ProGuard and R8 obfuscation at this time:
- R8 doesn't yet automatically recognize and handle simple cases of reflection.
- R8 can't yet rename class names in strings, in resource files and in resource file names.
- R8 doesn't yet rename inner classes according to the Java standard, which can trigger subtle compatibility issues.
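To make the first point concrete, here is a minimal sketch of what a simple case of reflection can look like. The class and method names (`ReflectionDemo`, `Plugin`, `greet`) are hypothetical and only serve as illustration. The class is looked up through a constant string, so a tool that renames the class has to either update that string or keep the original name.

```java
// Hypothetical example: all names are made up for illustration.
class ReflectionDemo {
    // A class that is only ever referenced through its name in a string.
    static class Plugin {
        String greet() {
            return "hello";
        }
    }

    // Simple reflection: the class name is a constant string (the binary
    // name of a nested class uses '$'). If an obfuscator renames Plugin
    // without rewriting this string or keeping the name, Class.forName
    // fails with a ClassNotFoundException at run time.
    static String loadAndGreet() {
        try {
            Class<?> clazz = Class.forName("ReflectionDemo$Plugin");
            Object plugin = clazz.getDeclaredConstructor().newInstance();
            return (String) clazz.getDeclaredMethod("greet").invoke(plugin);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndGreet());
    }
}
```

The same concern applies to the method name `"greet"`: it is also just a string as far as the bytecode is concerned.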
Comparing ProGuard and R8 optimization features
Bytecode optimizations are the most complex part. These are global optimizations: the optimizer looks at the application in its entirety and can change the code to its heart's content -- as long as it preserves the app's external behavior. We've compared the features that we found in the source code of R8 with the results of our own suite of 2000+ practical tests:
Optimization | ProGuard | R8 |
Remove unused classes/fields/methods | x | x |
Inline constants | x | x |
Propagate constants | x | x |
Remove unused code | x | x |
Propagate constant arguments | x | |
Propagate constant fields | x | x |
Remove write-only fields | x | x |
Make classes/fields/methods final/... | x | |
Simplify plain enum types | x | |
Simplify basic container classes | x | |
Merge interfaces with single implementations | x | x |
Merge classes | x | |
Remove unused parameters | x | |
Propagate constant return values | x | x |
Make methods private | x | |
Make methods static | x | |
Desynchronize methods | x | |
Simplify tail recursion | x | |
Inline methods | x | x |
Outline common code into new methods | x | |
Merge code | x | |
Peephole optimizations | x | |
Merge Kotlin lambda constructs | x | |
Optimize Kotlin lambda constructs | x | |
Remove logging calls | x | x |
Remove logging code | x | |
Some optimizations seem to be supported in the source code of R8 but weren't visible in the tests. We've marked the optimizations that we observed in at least some tests.
The levels of optimization varied. The optimizations of R8 seemed a bit shallower. R8 converts code to an intermediate representation and performs a static single assignment (SSA) analysis, which is a standard approach to analyzing method bodies. ProGuard performs partial evaluations across the code base, which looks farther into the dynamic behavior of the code. ProGuard can, for example, compute and propagate constants more effectively. ProGuard also performs escape analysis, which it currently uses to remove unnecessary synchronization from methods.
Peephole optimizations are still missing from R8. These optimizations scan instructions as short sequences (hence the peephole) and perform tiny optimizations. In our experience, nibbling at the code, a few instructions at a time, often mitigates the performance death by a thousand papercuts, and it triggers other optimizations.
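As a rough illustration of the idea, the sketch below applies two peephole rules to a toy stack-machine program. The instruction names (`push`, `pop`, `add`, `load`, `store`) are invented for this example and are not real Dalvik or Java opcodes, and the code is not taken from ProGuard or R8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy peephole optimizer over a made-up stack-machine instruction set.
class PeepholeDemo {
    static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            int last = out.size() - 1;
            String prev = last >= 0 ? out.get(last) : null;
            if (prev != null && prev.startsWith("push ") && insn.equals("pop")) {
                // "push x; pop" has no net effect: drop both instructions.
                out.remove(last);
            } else if (prev != null && prev.equals("push 0") && insn.equals("add")) {
                // "push 0; add" leaves the value below unchanged: drop both.
                out.remove(last);
            } else {
                out.add(insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "load a", "push 0", "add", "push 7", "pop", "store b");
        System.out.println(optimize(code));
    }
}
```

Each rule only looks at two adjacent instructions. Because the window compares against the already optimized output, removing one pair can expose another pair to the same rules.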
In this context, R8 only performs a single optimization pass, while ProGuard performs multiple passes -- 5 by default in Android builds. Since optimizations often trigger other optimizations, applying at least a few passes is useful, obviously with a trade-off of diminishing returns and processing time.
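The effect of repeated passes can be sketched with two toy optimizations that feed each other. Everything here is an assumption made for illustration (the instruction names, the two passes, and the `optimize` driver); it doesn't reflect how ProGuard or R8 schedule their passes internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy multi-pass optimizer: one pass can create work for another.
class MultiPassDemo {
    static boolean isPush(String insn) {
        return insn.matches("push -?\\d+");
    }

    static int value(String push) {
        return Integer.parseInt(push.substring("push ".length()));
    }

    // Step 1: remove "push x; pop" pairs.
    static List<String> removePushPop(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            if (i + 1 < code.size() && isPush(code.get(i))
                    && code.get(i + 1).equals("pop")) {
                i++; // skip both instructions
            } else {
                out.add(code.get(i));
            }
        }
        return out;
    }

    // Step 2: fold "push a; push b; add" into a single push of a + b.
    static List<String> foldAdd(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            if (i + 2 < code.size() && isPush(code.get(i))
                    && isPush(code.get(i + 1)) && code.get(i + 2).equals("add")) {
                out.add("push " + (value(code.get(i)) + value(code.get(i + 1))));
                i += 2; // skip the three folded instructions
            } else {
                out.add(code.get(i));
            }
        }
        return out;
    }

    // Runs both steps per pass, stopping early once nothing changes.
    static List<String> optimize(List<String> code, int maxPasses) {
        List<String> current = new ArrayList<>(code);
        for (int pass = 0; pass < maxPasses; pass++) {
            List<String> next = foldAdd(removePushPop(current));
            if (next.equals(current)) {
                break; // fixed point reached
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("push 1", "push 2", "add", "pop", "load a");
        System.out.println(optimize(code, 1));
        System.out.println(optimize(code, 5));
    }
}
```

With a single pass, the folded constant is still followed by a `pop`. Only a second pass can remove that pair, because it didn't exist before the folding step ran.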
Removing logging code is an important example of the above observations. Just like ProGuard, R8 can remove logging calls with the option -assumenosideeffects, but it doesn't yet clean up related code as effectively.
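The difference between removing a logging call and removing the logging code can be sketched as follows. The `Log` class below is a stand-in for `android.util.Log` so that the example is self-contained, and the two "after" methods are hand-written approximations of what an optimizer might leave behind, not actual output of ProGuard or R8. The rule in the comment is the commonly used form for this option.

```java
// A commonly used rule to remove debug logging calls:
//
//   -assumenosideeffects class android.util.Log {
//       public static int d(...);
//   }
//
class LoggingDemo {
    // Stand-in for android.util.Log (assumption, to keep this runnable).
    static final class Log {
        static int d(String tag, String msg) {
            System.out.println(tag + ": " + msg);
            return 0;
        }
    }

    // Original code: builds a message and logs it.
    static int original(int a, int b) {
        int sum = a + b;
        Log.d("Demo", "sum of " + a + " and " + b + " is " + sum);
        return sum;
    }

    // Hand-written approximation: the call is gone, but the code that
    // builds the message string is still executed.
    static int callRemovedOnly(int a, int b) {
        int sum = a + b;
        String unused = "sum of " + a + " and " + b + " is " + sum;
        return sum;
    }

    // Hand-written approximation: the related string building is gone too.
    static int fullyCleaned(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(original(2, 3));
        System.out.println(callRemovedOnly(2, 3));
        System.out.println(fullyCleaned(2, 3));
    }
}
```

All three variants return the same result. They differ only in how much work and code remains for a message that nobody will ever see.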
Performance at build time
Comparing performance of the optimizations at build time is tricky, since ProGuard and R8 have somewhat different purposes. ProGuard reads and writes Java bytecode. R8 primarily reads Java bytecode and writes Dalvik bytecode -- writing Java bytecode is not functional yet. Comparing them as such yields the following results for our test suite:
ProGuard (1 pass) | 40 seconds |
ProGuard (5 passes) | 1 minute |
R8 | 1 minute |
R8's optimization doesn't seem faster at this time. On the plus side, it's integrated with D8, so it doesn't need a separate non-trivial compilation step to get Dalvik bytecode. R8's processing time will likely increase as more optimizations are added.
Compatibility of ProGuard and R8
The good news for developers is that R8 is backward compatible with ProGuard. If you have a working ProGuard configuration (maybe eclectically copied from Stack Overflow), you can carry that over to R8. It currently still ignores some options. Notably, R8 doesn't implement the options -whyareyoukeeping and -addconfigurationdebugging, which we consider essential to quickly get to a working configuration, as we've explained in a previous blog.
R8 also doesn't have the advanced processing and filtering of nested input and output archives, but the Android build process is pretty standardized anyway.
Interestingly, R8 does add a new option -assumevalues. It takes -assumenosideeffects a step further: it allows you to fix or change the primitive return values of methods. We've been reluctant to add such a feature to ProGuard, since it offers developers another way to shoot themselves in the foot, but this is probably a good moment to consider it again.
Conclusions
Whole-program optimization is hard! ProGuard has seen 15 years of development and testing. Millions of developers have applied it to millions of apps, which has yielded a wealth of feedback on common code and unusual code. The Android R8 is much younger, with a team actively improving its stability and extending its functionality. Developers can only benefit, with Google recognizing the importance of app size and performance, and investing heavily in it.
In a future blog, we'll discuss how our Android-specific DexGuard relates to R8. In short, DexGuard is in a league of its own. It focuses on hardening apps. At the same time, it optimizes not just bytecode, but also Android resources, resource files, assets and native libraries. Have a look at our DexGuard page if you want to learn more about its functionality.