-XX:+UseCompressedOopsIt sounds like this largely solves the bloat problem that happens when you go from 32-bit to 64-bit with Java as this option tells Hotspot to allow heap sizes up to 32 gig, use the eight extra registers that are available with AMD64 and to use 32-bit pointers/references whenever possible to limit bloat. From what I've heard, tests indicate that the bloat is much smaller (~10% compared to the 50%+ you get with vanilla 64-bit) and performance is comparable. Here's a longer post by Ismael Juma to fill you in on the details: Note that, as of recently, this option was only available in the Hotspot "performance" release, but is slated to be added to the main release.
Friday, April 17, 2009
UseCompressedOops
Tuesday, August 12, 2008
Garbage Collection, or an OutOfMemoryError Guide
I've had my share of struggling with the Java Hotspot GC and I'm sure I'm not alone. I see semi-regular postings about OutOfMemoryErrors (OOMEs) on the solr-user list. Many of these are as simple as someone not realizing that they need to pass a max heap parameter (-Xmx) to the jvm to run any kind of serious application, but occasionally the issues are more subtle. Here are some tips and notes that I've put together while building applications using Sun's Hotspot JVM.
- If you haven't already, turn on GC logging by passing the following options to the jvm: "-XX:+PrintGC -XX:+PrintGCDetails"
- It is tremendously easier to debug GC/OOME problems if you can see what the GC is doing.
- For each "Full GC", you'll see three sizes for each of the three generations (young, old, perm): (1) space occupied before the GC, (2) space occupied after the GC, and (3) size of the generation. If #2 is close to #3 for old or perm, you likely need to increase the maximum size for that generation.
- The Permanent Generation stores class information and interned strings. If you need to change the max size, use the option -XX:MaxPermSize (e.g. -XX:MaxPermSize=150m).
- The New and Old generations share the heap space. If you get a heap space OOME or your old gen space is getting low, you need to do one of two things: (1) increase the max heap size using -Xmx (e.g. -Xmx1g), or (2) increase the new ratio -XX:NewRatio (e.g. -XX:NewRatio=8), which determines how the old and new generations split up the heap. If you're on 32-bit, try #1; if you're on 64-bit try both.
- 64-bit JVM objects take up more space than 32-bit JVM objects. The size difference varies by object, but don't be surprised if you need 50% more heap when moving from a 32-bit JVM to a 64-bit one. Some size comparisons.
Additional resources:
Friday, March 14, 2008
It gets worse
32-bit | 64-bit | |
---|---|---|
Object | 8 | 16 |
Integer | 16 | 24 |
Long | 16 | 24 |
Float | 16 | 24 |
Double | 16 | 24 |
String | 40 | 64 |
Date | 24 | 32 |
Calendar | 432 | 544 |
byte[0] | 16 | 24 |
byte[32] | 48 | 56 |
byte[128][0] | 2576 | 4120 |
ArrayList<Integer>(1) | 56 | 96 |
ArrayList<Integer>(2) | 80 | 128 |
Thursday, March 13, 2008
The size of a java.util.Calendar is 432 bytes
Calendar's are cool. They're one of the few objects (in any language) which I consider to have solved the datetime problem. So many datetime objects require extra work from the programmer for (seemlingly) simple queries such determining whether x occurred before or after a month ago. You typically can't simply add a year/month/date/hour/etc. and you always have to be careful of whether an index value is 0, 1, or -1900 based. Calendar lets you focus on the date operations you are performing rather than devising tricks to get around other people's laziness. You can add just about any date/time quantity, there are static fields for month values, and you can compare two Calendar's via the before and after methods.
But, recently, my love of Calendar's took a big hit. This was shortly after I stumbled upon Java Tip 130: Do you know your data size? I was tempted to discount this article due to it's age---5.5 years---so much has changed since then in the Java world. But, after running the SizeOf code and discovering that memory consumption for the various basic objects Vladimir Roubtsov tested hadn't changed, I re-read his article carefully and took every word to heart. That an empty String consumes 40 bytes took me by surprise. But, it wasn't until I tested Calendar that I was truly shocked. In the current system I'm working on, I am storing hundreds-of-thousands of objects with Calendar's as fields. No wonder they consumed so much memory! Now I understand whence that outlandish consumption came.
Going forward, I won't stop using Calendar's, but I'll be much more careful about my usage of them. No more throwing a Calendar into an object that I'll allocate thousands of times...
Other useful articles on memory usage:
- Playing with the Tiger: Measuring the size of your objects by Michael Nascimento Santos
- Determining Memory Usage in Java by Dr. Heinz M. Kabutz
Monday, January 28, 2008
The Subtle NullPointerException
Wednesday, January 23, 2008
MySQL Streaming Result Set, Part II
Wednesday, December 5, 2007
MySQL Streaming Result Set
By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.Ouch! But, that last sentence provides hope. On my plate was a project to write a tool to analyze site activity which was stored in a table with tens of millions of rows. Would be a bit clunky if I couldn't stream/iterate through the data. Here's the magic incantation:
To enable this functionality, you need to create a Statement instance in the following manner:In fact, you don't need to specify forward-only and read-only as they are the defaults, but the Integer.MIN_VALUE fetch size is essential. This signals the driver to stream the data rather than send it all-at-once. There are caveats to this approach. Be sure to read the ResultSet section of the MySQL JDBC API Implementation Notes before using this for serious work. Also, note that if there are delays in processing the streaming data that your network timeout values are sufficiently large. The mysqld net_write_timeout variable should be set at least as large as the longest possible delay. Also, there appear to be undocumented issues with streaming from one table and using a separate connection to read data from that same table. I've observed odd behavior in this case, such as the ResultSet being set to null. Best to ask for all the required data in the initial streaming query.stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY); stmt.setFetchSize(Integer.MIN_VALUE);The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this any result sets created with the statement will be retrieved row-by-row.