java.lang.Object
  java.io.InputStream
    org.archive.io.MappedByteBufferInputStream
public class MappedByteBufferInputStream extends java.io.InputStream implements it.unimi.dsi.mg4j.io.RepositionableStream
An inputstream perspective on a MappedByteBuffer.
This class is effectively a random-access input stream. Use position() to get the current location, then mark and reset to move about in the stream.
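The position/mark/reset pattern described above can be sketched with a small stand-in class. This is a minimal illustration only, not the org.archive.io implementation: the class name `ByteBufferStreamSketch` and its internals are assumptions, and it wraps any `java.nio.ByteBuffer` (a `MappedByteBuffer` would work the same way).

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical stand-in for illustration; NOT the org.archive.io class.
class ByteBufferStreamSketch extends InputStream {
    private final ByteBuffer buf;
    private int markedPosition = 0;

    ByteBufferStreamSketch(ByteBuffer buf) {
        this.buf = buf;
    }

    @Override
    public int read() {
        // Mask to 0..255 so that -1 stays reserved for end of stream.
        return buf.hasRemaining() ? (buf.get() & 0xff) : -1;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int readLimit) {
        // The whole buffer stays addressable, so readLimit is ignored here.
        markedPosition = buf.position();
    }

    @Override
    public void reset() {
        buf.position(markedPosition);
    }

    /** Current offset into the underlying buffer. */
    long position() {
        return buf.position();
    }

    /** Jumps to an absolute offset; must fit in an int because buffers are int-indexed. */
    void position(long newPosition) {
        buf.position(Math.toIntExact(newPosition));
    }
}
```

A caller would typically record `position()` at a record boundary, read ahead, and call `reset()` or `position(long)` to return, which is what makes the stream behave like random access.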
This class is no longer used, but it's kept around because it documents our experience using nio for ARCReader. In summary: a minor performance improvement when iterating over ARC records. It was replaced by a RandomAccess io implementation because each instance, being the size of an ARC file, took up too much system memory, which prevented us from opening tens of instances concurrently. Maybe JVM 1.5 will make big improvements in nio and we'll then use this class again.
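For contrast, a random-access read that keeps only a small buffer on the heap might look like the sketch below. It uses the standard `java.io.RandomAccessFile`; the class and method names (`RandomAccessReadSketch`, `readAt`) are hypothetical and are not taken from the replacement ARCReader code.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical sketch; not the actual replacement implementation.
class RandomAccessReadSketch {
    /** Reads exactly len bytes starting at offset; only len bytes are held in memory. */
    static byte[] readAt(Path file, long offset, int len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] out = new byte[len];
            // readFully throws EOFException if fewer than len bytes remain.
            raf.readFully(out);
            return out;
        }
    }
}
```

The memory cost per open file here is the size of the caller's buffer rather than the size of the file, which is the trade-off the paragraph above describes.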
This class was written because I wanted to use java.nio memory-mapped files, rather than old-school java.io, for reading ARCs: "Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap. Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping...." (from O'Reilly's Java NIO by Ron Hitchens).
Using a ByteBuffer that holds the whole ARC file certainly makes the code simpler, and the nice thing about using memory-mapped buffers for reading is that the memory used is allocated in the OS, not in the JVM. I played around with this on a machine with 512M of physical memory and 1G of swap (/sbin/swapon -s). I wrote a simple program that used file channel memory-mapped buffers to read a file. I was able to read a 1.5G file using the default JVM heap (64M on Linux, IIRC); that is, I was able to allocate a 1.5G buffer inside my small-heap program. With anything bigger, I got complaints about being unable to allocate the memory. So a channel-based reader would be limited only by the memory characteristics of the machine it's running on (swap and physical memory, not JVM heap size). Only, I then discovered the following caveats. Note that a spin on the 'unable to allocate the memory' complaint was that I was unable to keep tens of ARC instances open concurrently because each was using 100M plus of RAM.
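The kind of test program described above can be approximated as follows. This is a sketch under assumptions, not the original program: `MappedReadSketch` and `sumBytes` are made-up names, and the byte sum stands in for whatever the original did with the data. It maps the whole file read-only through `FileChannel.map`, so the file contents live in OS-managed pages rather than on the JVM heap.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a memory-mapped read; names are illustrative.
class MappedReadSketch {
    /** Maps the whole file read-only and sums its bytes as unsigned values. */
    static long sumBytes(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping is backed by the OS page cache, not the Java heap.
            MappedByteBuffer mbb = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (mbb.hasRemaining()) {
                sum += mbb.get() & 0xff;
            }
            return sum;
        }
    }
}
```

Note that closing the channel does not unmap the buffer; the mapping stays valid until the buffer is garbage collected, which is one reason many concurrently open mappings can hold on to a lot of address space.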
Really big files generated a complaint from FileChannel.map saying the size parameter was > Integer.MAX_VALUE, which is odd considering the parameter's type is long. At the time this looked like an nio bug, though the FileChannel.map documentation does state that size must be no greater than Integer.MAX_VALUE. It means there is an upper bound of Integer.MAX_VALUE bytes (about 2.1G) on a single mapping. This is unfortunate, particularly as the C tools for ARC manipulation (see alexa/common/a_arcio.c) support > 2.1G, but it's good enough for now (ARC files are usually 100M).
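The size limit can be reproduced with a few lines. The sketch below is illustrative only (`MapLimitSketch` and its methods are hypothetical names): it asks `FileChannel.map` for one byte more than Integer.MAX_VALUE and reports whether the call was rejected with an IllegalArgumentException, then shows the arithmetic for covering a larger file with several mappings, which is one possible workaround rather than anything this class does.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; demonstrates the per-mapping size limit.
class MapLimitSketch {
    /** Returns true if FileChannel.map rejects a size above Integer.MAX_VALUE. */
    static boolean rejectsOversizedMap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ch.map(FileChannel.MapMode.READ_ONLY, 0, Integer.MAX_VALUE + 1L);
            return false;
        } catch (IllegalArgumentException expected) {
            // The size argument is a long, but a single mapping is int-indexed.
            return true;
        }
    }

    /** Number of mappings of at most Integer.MAX_VALUE bytes needed to cover fileSize. */
    static long windowsNeeded(long fileSize) {
        long max = Integer.MAX_VALUE;
        return (fileSize + max - 1) / max;
    }
}
```

A reader for files above the limit would map one window at a time at increasing positions; that design is an assumption here, since the original text settles for the limit instead.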
The jury still seems to be out on general nio performance. See "NIO ByteBuffer slower than BufferedInputStream". It can be 4 times slower than java.io or 40% faster. It is certainly 3x to 4x slower than reading from a buffer: http://jroller.com/page/cpurdy/20040405#raw_nio_performance. Tests done reading ARCs show the difference to be small in the scheme of things.
Constructor Summary | |
---|---|
MappedByteBufferInputStream(java.nio.MappedByteBuffer mbb) | Constructor. |
Method Summary | |
---|---|
int | available() |
protected void | checkClosed() |
void | close() |
void | mark(int markAmount) |
boolean | markSupported() |
long | position() |
void | position(long position) |
int | read() |
int | read(byte[] b, int off, int len) |
void | reset() |
Methods inherited from class java.io.InputStream |
---|
read, skip |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MappedByteBufferInputStream(java.nio.MappedByteBuffer mbb)
Parameters: mbb - MappedByteBuffer to use.
Method Detail |
---|
public int read() throws java.io.IOException
Specified by: read in class java.io.InputStream
Throws: java.io.IOException

public int read(byte[] b, int off, int len) throws java.io.IOException
Overrides: read in class java.io.InputStream
Throws: java.io.IOException

public void close() throws java.io.IOException
Specified by: close in interface java.io.Closeable
Overrides: close in class java.io.InputStream
Throws: java.io.IOException

protected void checkClosed() throws java.io.IOException
Throws: java.io.IOException

public boolean markSupported()
Overrides: markSupported in class java.io.InputStream

public void mark(int markAmount)
Overrides: mark in class java.io.InputStream

public void reset() throws java.io.IOException
Overrides: reset in class java.io.InputStream
Throws: java.io.IOException

public int available() throws java.io.IOException
Overrides: available in class java.io.InputStream
Throws: java.io.IOException

public long position()
Specified by: position in interface it.unimi.dsi.mg4j.io.RepositionableStream

public void position(long position)
Specified by: position in interface it.unimi.dsi.mg4j.io.RepositionableStream