JarOpt: Jar file optimizer.
Every .jar file has extra classes that can be stripped out: test classes, obsolete classes, unused library classes, non-class files.
This Ant Task analyzes the dependencies in .jar files and discards unused classes.
This makes for smaller distributions, faster load times, etc.
Although not yet version 1.0, JarOpt is working and can be easily integrated into your build process.
It is Open Source; free as in freedom and free as in beer.
Contents
Step 1: Download the binary (below).
Step 2: Add JarOpt to your build script.
Step 3: Watch the Ant console output for warnings
about dynamic references that can't be analyzed.
Resolve these by adding "include patterns."
Step-by-step Instructions
Step 1: Download the binary (below).
Step 2: Copy the sample usage (below) into your Ant "build.xml" script.
Step 3: Modify the "taskdef" Task Definition to point to the jaropt.jar binary.
Step 4: Modify the "src", "dst", and "includePatterns" attributes.
-
Set "src" to your source .jar file, e.g. src.jar.
-
Set "dst" to your target .jar file, e.g. dst.jar.
-
Add an includePattern for each entry point to your .jar (i.e. each class with a main() method),
e.g. com.your.package.Application.
-
Add an includePattern for each class or package that is dynamically loaded
(e.g. via Class.forName() or ClassLoader.loadClass()),
such as Swing Look & Feels, database drivers, etc.
Step 5: It may take more than one try to discover all of the necessary includePatterns.
You'll likely need to add includePatterns to protect classes that are loaded dynamically.
Set the "verbose" attribute to "true" to print a list of all of the classes
that use dynamic class references (i.e. Class.forName() or ClassLoader.loadClass()).
Version 0.77 (August 24, 2006)
Source distribution includes binary, source and this document.
This project has one public dependency: Jakarta BCEL (tested up to version 5.2).
Jakarta BCEL is available under the same license: the ASF (Apache) License.
It also depends on private libraries that I haven't yet published, but will soon.
Sample extract from an Ant "build.xml" script:
<taskdef name="JarOpt"
classname="org.cmc.jaroptimizer.JarOptTask"
classpath="${jarfile}" />
<JarOpt src="${jarfile}" dst="${tempfile}"
stripNonClassFiles="false" verbose="true"
printDependencies="true" >
<includePattern>com.digitprop.tonic.*</includePattern>
<includePattern>ch.randelshofer.quaqua.*</includePattern>
<includePattern>org.cmc.jaroptimizer.JarOptTask</includePattern>
</JarOpt >
<!-- Now replace the original jar -->
<copy file="${tempfile}" tofile="${jarfile}" />
This example is from JarOpt's own build.xml file, so here we see JarOpt optimizing itself.
-
taskdef: point Ant to the jaropt.jar binary with the "classpath" attribute.
If you've added "jaropt.jar" to Ant's classpath (if, perhaps, you are invoking Ant from an IDE), then you may omit the "classpath" attribute.
-
JarOpt: invokes JarOpt.
-
src: the jar to optimize. This can be a) a .jar or .zip file, b) a folder, or c) a semicolon-delimited list of a) and/or b).
Folders are expected to contain classfiles in a traditional folder hierarchy corresponding to the classes' packages.
That is, it should look like the output of a javac (compiler) invocation.
required. value: File path(s).
-
dst: the jar to write.
required. value: File path.
-
stripNonClassFiles: Should non-class files be stripped? This includes images, Manifest.mf files, Java source files - anything except .class files.
optional. value: true, false. default: false.
-
verbose: if true, JarOpt describes its arguments and results.
optional. value: true, false. default: false.
-
printDependencies: if true, JarOpt describes the static dependencies that it finds.
optional. value: true, false. default: false.
-
includePattern(s): a semicolon-delimited list of patterns describing classfiles to always include.
You must include one pattern that describes your "main" class (in the sample, "org.cmc.jaroptimizer.JarOptTask").
You should also include patterns for dynamically-loaded classes, e.g. those loaded with Class.forName("com.some.class").
Common examples include database drivers and Swing "Look and Feels".
required. value: package patterns, e.g. org.jakarta.*
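The exact matching rules for include patterns are not spelled out here. As a rough illustration only, the sketch below assumes that a pattern ending in ".*" matches every class in that package and its subpackages, and that any other pattern must equal the fully qualified class name. The class IncludePatternSketch, its methods, and the class name com.digitprop.tonic.SomeClass are hypothetical and are not part of JarOpt.

```java
import java.util.Arrays;
import java.util.List;

final class IncludePatternSketch {

    /** Assumed semantics: "pkg.*" matches anything under pkg; otherwise exact match. */
    static boolean matches(String pattern, String className) {
        if (pattern.endsWith(".*")) {
            // Keep the trailing dot so "a.b.*" does not match "a.bc.Foo".
            String prefix = pattern.substring(0, pattern.length() - 1);
            return className.startsWith(prefix);
        }
        return pattern.equals(className);
    }

    /** True if any pattern in the list matches the class name. */
    static boolean isIncluded(List<String> patterns, String className) {
        for (String pattern : patterns) {
            if (matches(pattern, className)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Patterns taken from the sample build.xml extract.
        List<String> patterns = Arrays.asList(
                "com.digitprop.tonic.*",
                "ch.randelshofer.quaqua.*",
                "org.cmc.jaroptimizer.JarOptTask");

        System.out.println(isIncluded(patterns, "com.digitprop.tonic.SomeClass"));   // true
        System.out.println(isIncluded(patterns, "org.cmc.jaroptimizer.JarOptTask")); // true
        System.out.println(isIncluded(patterns, "org.example.Unused"));              // false
    }
}
```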
Buildfile: E:\development\code\workspace\JarOptimizer\build.xml
Initialize:
[delete] Deleting directory E:\development\code\workspace\JarOptimizer\bin-ant
Compile:
[javac] Compiling 95 source files to E:\development\code\workspace\JarOptimizer\bin-ant
[javac] depend attribute is not supported by the modern compiler
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
Jar:
[jar] Building jar: E:\development\code\workspace\JarOptimizer\dist\jaropt.jar
[jar] Building jar: E:\development\code\workspace\JarOptimizer\dist\jaropt.jar
[JarOpt] log file: E:\development\code\workspace\JarOptimizer\jaropt.log
[JarOpt] is_interactive: false
[JarOpt] <other attributes missing here.>
[JarOpt] srcs: 1
[JarOpt] dst: E:\development\code\workspace\JarOptimizer\dist\jaropt.jaropt.jar
[JarOpt] 1228 classes found (96% of 1277)
[JarOpt] 49 non-classes found (4% of 1277)
[JarOpt] filters:
[JarOpt] 0 {include: com.digitprop.tonic.*}
[JarOpt] 1 {include: ch.randelshofer.quaqua.*}
[JarOpt] 2 {include: org.cmc.jaroptimizer.Explorer}
[JarOpt] 3 {include: org.cmc.jaroptimizer.JarOptTask}
[JarOpt] Missing dependency org.apache.tools.ant.BuildException (first seen: org.cmc.jaroptimizer.JarOptTask: Constant)
[JarOpt] Missing dependency org.apache.tools.ant.FileScanner (first seen: org.cmc.jaroptimizer.JarOptTask: Constant)
[JarOpt] Missing dependency org.apache.tools.ant.taskdefs.MatchingTask (first seen: org.cmc.jaroptimizer.JarOptTask: Constant)
[JarOpt] Missing dependency org.apache.tools.ant.types.FileSet (first seen: org.cmc.jaroptimizer.JarOptTask: Constant)
[JarOpt] Missing dependency org.apache.tools.ant.types.ZipFileSet (first seen: org.cmc.jaroptimizer.JarOptTask: ArgumentTypes)
[JarOpt] Missing dependency ch.randelshofer.quaqua.QuaquaManager (first seen: com.cmc.shared.porting.mac.MacPorting: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJApplicationUtils (first seen: com.cmc.shared.porting.mac.MacPorting: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJOpenApplicationHandler (first seen: com.cmc.shared.porting.mac.MacPorting$4: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJOpenDocumentHandler (first seen: com.cmc.shared.porting.mac.MacPorting$5: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJAboutHandler (first seen: com.cmc.shared.porting.mac.MacPorting$6: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJPrintDocumentHandler (first seen: com.cmc.shared.porting.mac.MacPorting$7: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJQuitHandler (first seen: com.cmc.shared.porting.mac.MacPorting$8: Constant)
[JarOpt] Missing dependency com.apple.mrj.MRJAboutHandler (first seen: com.cmc.shared.porting.mac.MacPorting$9: Constant)
[JarOpt] 560 classes kept (46% of 1228)
[JarOpt] 668 classes discarded (54% of 1228)
[JarOpt] 0 non-classes kept (0% of 49)
[JarOpt] 49 non-classes discarded (100% of 49)
[JarOpt] 560 files kept (44% of 1277)
[JarOpt] 717 files discarded (56% of 1277)
[JarOpt] 1,768,433 bytes kept (48% of 3,664,141)
[JarOpt] 1,895,708 bytes discarded (52% of 3,664,141)
[JarOpt] JarOpt.complete
[JarOpt] elapsed 1969 milliseconds
Main Build:
[echo] Ant at work!
BUILD SUCCESSFUL
Total time: 17 seconds
Comments
Note that JarOpt optimized a 1.5MB .jar file in about two seconds (1969 milliseconds in the log above).
JarOpt generates warnings for missing dependencies.
It will not generate warnings for classes whose package begins with "javax.", "java." or "sun.".
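The prefix rule above can be pictured as a simple filter. This is a minimal sketch of the stated rule only; the class and method names are hypothetical and JarOpt's own code may look quite different.

```java
final class MissingDependencyFilterSketch {

    // Package prefixes for which, per the text above, no warning is generated.
    private static final String[] SILENT_PREFIXES = { "java.", "javax.", "sun." };

    /** Returns true if a missing class with this name should be reported. */
    static boolean shouldWarn(String className) {
        for (String prefix : SILENT_PREFIXES) {
            if (className.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Reported in the sample log:
        System.out.println(shouldWarn("org.apache.tools.ant.BuildException")); // true
        // Platform classes are not reported:
        System.out.println(shouldWarn("java.lang.String"));                    // false
        System.out.println(shouldWarn("javax.swing.UIManager"));               // false
    }
}
```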
Version 0.77 released August 24, 2006. Added dynamic class usage warnings,
with simple resolution suggestions. Reduced many of the dependencies.
Finally published source for the first time.
Version 0.76 released August 15, 2006.
Version 0.75 released August 12, 2006.
To do list:
-
Remove dependencies on private "shared libraries."
-
Smaller distribution
-
Support Ant FileSets.
-
Better handling of "bad" arguments, e.g. warn if given classes whose path does not correspond to their package.
-
Better handling of Manifests and other .jar-specific non-class files.
-
Release source
-
Three dependency analysis modes: simple, per-method and dynamic.
-
Ant-style path patterns.
-
Multiple degrees of "verbosity".
Ant is the most popular Java build tool. JarOpt is an Ant plugin
that optimizes jars. That is, it removes unnecessary code from jars (Java binaries).
This makes applications load & run faster,
and reduces file sizes and thus download times. Jars can usually be
reduced by 50% or more.
As an Ant Task, JarOpt is easily added to your build process.
JarOpt determines which classes can be discarded by performing dependency analysis.
That is, it determines how class files depend upon each other.
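One way to picture this step, assuming the dependencies of each class are already known: start from the classes named by the include patterns, follow dependencies transitively, and keep whatever is reached. Everything else can be discarded. The sketch below is illustrative only; the class names in it (app.Main, app.Util, app.MainTest) are made up, and JarOpt's actual implementation may differ.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReachabilitySketch {

    /**
     * dependsOn maps each class in the jar to the classes it references.
     * roots are the entry points (the classes matched by include patterns).
     */
    static Set<String> classesToKeep(Map<String, List<String>> dependsOn,
                                     Collection<String> roots) {
        Set<String> kept = new TreeSet<String>();
        Deque<String> pending = new ArrayDeque<String>(roots);
        while (!pending.isEmpty()) {
            String name = pending.pop();
            if (!dependsOn.containsKey(name)) {
                continue; // not in the jar, e.g. a JDK class
            }
            if (!kept.add(name)) {
                continue; // already visited
            }
            pending.addAll(dependsOn.get(name));
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new HashMap<String, List<String>>();
        dependsOn.put("app.Main", Arrays.asList("app.Util", "java.lang.String"));
        dependsOn.put("app.Util", Arrays.asList("java.lang.String"));
        dependsOn.put("app.MainTest", Arrays.asList("app.Main")); // nothing depends on it

        Set<String> kept = classesToKeep(dependsOn, Collections.singletonList("app.Main"));
        System.out.println(kept); // prints [app.Main, app.Util]
    }
}
```

Here app.MainTest is discarded because no kept class refers to it, even though it refers to a kept class itself.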
There are two kinds of dependency analysis: static and dynamic.
Examples of static dependencies:
-
A class extends a superclass
-
A class instantiates another class.
-
A class method takes another class as an argument.
-
etc.
Java describes the static dependencies of a class in its
classfile format.
This makes static dependency analysis trivial.
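To give a feel for why this is straightforward: every class that a classfile refers to by name appears as a CONSTANT_Class entry in its constant pool. JarOpt itself depends on Jakarta BCEL; the sketch below instead uses only the JDK, so that it is self-contained, and it is not JarOpt's code. It covers class constants only (the dependencies marked "Constant" in the log above) and ignores class names that appear solely inside method or field descriptors. The class ConstantPoolSketch and its methods are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Set;
import java.util.TreeSet;

final class ConstantPoolSketch {

    /** Returns the names found in the CONSTANT_Class entries of a class file. */
    static Set<String> referencedClasses(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort(); // minor version
        data.readUnsignedShort(); // major version
        int count = data.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] nameIndex = new int[count]; // set for CONSTANT_Class entries only
        for (int i = 1; i < count; i++) {
            int tag = data.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = data.readUTF(); break;                // Utf8
                case 7: nameIndex[i] = data.readUnsignedShort(); break; // Class
                case 5: case 6: data.readLong(); i++; break;            // Long, Double: two slots
                case 8: case 16: case 19: case 20:                      // two-byte entries
                    data.readUnsignedShort(); break;
                case 15:                                                // MethodHandle
                    data.readUnsignedByte(); data.readUnsignedShort(); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    data.readInt(); break;                              // four-byte entries
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> result = new TreeSet<String>();
        for (int i = 1; i < count; i++) {
            if (nameIndex[i] != 0) {
                // Internal names use '/'; array types keep their descriptor form.
                result.add(utf8[nameIndex[i]].replace('/', '.'));
            }
        }
        return result;
    }

    /** Convenience wrapper for a top-level class visible to the class loader. */
    static Set<String> referencedClassesOf(Class<?> type) {
        try (InputStream in = type.getResourceAsStream(type.getSimpleName() + ".class")) {
            if (in == null) {
                throw new IllegalStateException("class file not found for " + type);
            }
            return referencedClasses(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Lists the classes that java.util.ArrayList refers to by name.
        System.out.println(referencedClassesOf(ArrayList.class));
    }
}
```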
Examples of dynamic dependencies:
-
A class instantiates another class
using Class.forName(...)
-
A class instantiates another class
using ClassLoader.loadClass(...);
One doesn't know at build time what classes will be dynamically loaded at runtime.
Many dynamic class references can be analyzed by decompiling the code.
Some simple examples:
-
Class.forName("com.mysql.jdbc.Driver").newInstance();
-
UIManager.setLookAndFeel(
"com.sun.java.swing.plaf.gtk.GTKLookAndFeel");
Java bytecode uses a stack-based virtual machine.
To determine the classname of a given dynamic reference, we need to
know the state of the stack when Class.forName or ClassLoader.loadClass is invoked.
Dynamic references are either simple to analyze or impossible.
The classname passed to Class.forName is almost always a local constant
or an external variable (e.g. read from a properties file, or passed as an argument).
Therefore, we don't need to fully decompile the code around a dynamic reference
as it is unlikely to yield any benefit.
However, we'd like to be able to successfully analyze code that is this complicated:
private void method(boolean b)
    throws Exception
{
    String classname = b ? "org.package.class1" : "org.package.class2";
    // some other code here.
    Class my_class = Class.forName(classname);
    ...
}
In this example, there are two distinct code paths that yield constant classnames.
It is sufficient to partially decompile the code.
We step through the instructions,
maintaining a "mock stack." We follow every branch
(loop or conditional), but "terminate execution" whenever we cross code we've
"already executed," i.e. a loop.
When we come upon a dynamic class reference, we examine the stack for constants.
Put another way, rather than a brute force, totalizing approach, we
stage a rehearsal of the code.
Dynamic references that can't be analyzed are brought to the user's attention with warnings.
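The rehearsal can be sketched on a toy instruction set. This is an illustration of the idea under stated assumptions, not JarOpt's implementation: the six opcodes are invented (real JVM bytecode has many more), "already executed" is tracked per code path, and an unresolvable name is recorded as the marker "<unknown>" where JarOpt would issue a warning. The class MockStackSketch and all of its members are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class MockStackSketch {
    // Invented opcodes, just enough to model constants, branches and the call.
    enum Op { PUSH_CONST, PUSH_UNKNOWN, BRANCH, GOTO, FOR_NAME, RETURN }

    static final String UNKNOWN = "<unknown>";

    static final class Insn {
        final Op op;
        final String constant; // used by PUSH_CONST
        final int target;      // used by BRANCH and GOTO

        private Insn(Op op, String constant, int target) {
            this.op = op;
            this.constant = constant;
            this.target = target;
        }
        static Insn push(String s)  { return new Insn(Op.PUSH_CONST, s, -1); }
        static Insn unknown()       { return new Insn(Op.PUSH_UNKNOWN, null, -1); }
        static Insn branch(int t)   { return new Insn(Op.BRANCH, null, t); }
        static Insn jump(int t)     { return new Insn(Op.GOTO, null, t); }
        static Insn forName()       { return new Insn(Op.FOR_NAME, null, -1); }
        static Insn ret()           { return new Insn(Op.RETURN, null, -1); }
    }

    /** Returns every value that could be on top of the stack at a FOR_NAME. */
    static Set<String> rehearse(List<Insn> code) {
        Set<String> found = new TreeSet<String>();
        walk(code, 0, new ArrayDeque<String>(), new BitSet(), found);
        return found;
    }

    private static void walk(List<Insn> code, int pc, Deque<String> stack,
                             BitSet seen, Set<String> found) {
        while (pc >= 0 && pc < code.size()) {
            if (seen.get(pc)) {
                return; // this path crossed code it already executed: a loop
            }
            seen.set(pc);
            Insn insn = code.get(pc);
            switch (insn.op) {
                case PUSH_CONST:   stack.push(insn.constant); pc++; break;
                case PUSH_UNKNOWN: stack.push(UNKNOWN); pc++; break;
                case GOTO:         pc = insn.target; break;
                case BRANCH:
                    // Follow the jump on a copy of the state, then fall through.
                    walk(code, insn.target, new ArrayDeque<String>(stack),
                         (BitSet) seen.clone(), found);
                    pc++;
                    break;
                case FOR_NAME:
                    found.add(stack.isEmpty() ? UNKNOWN : stack.pop());
                    pc++;
                    break;
                case RETURN:
                    return;
            }
        }
    }

    public static void main(String[] args) {
        // Models: classname = b ? "org.package.class1" : "org.package.class2";
        List<Insn> ternary = Arrays.asList(
                Insn.branch(3),                  // 0: conditional jump to 3
                Insn.push("org.package.class1"), // 1
                Insn.jump(4),                    // 2
                Insn.push("org.package.class2"), // 3
                Insn.forName(),                  // 4: Class.forName(classname)
                Insn.ret());                     // 5
        System.out.println(rehearse(ternary));
        // prints [org.package.class1, org.package.class2]
    }
}
```

Copying the stack and the "seen" set at each branch keeps the sketch short but can be expensive on heavily branching code; a real analyzer would likely need to bound or merge paths.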
I initially wrote a Java class file
decompiler. This approach turned out to be tricky and inefficient.
What's worse, Java bytecode is not isomorphic with Java: you can
express things in bytecode that simply cannot be decompiled into Java.
There is no complete solution to dynamic
dependency analysis. We're looking for low-hanging fruit, which can
be easily gathered by staging the most likely outcome.