Class NRTCachingDirectory

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, Accountable

    public class NRTCachingDirectory
    extends FilterDirectory
    implements Accountable
    Wraps a RAMDirectory around any provided delegate directory, for use during near-real-time (NRT) search.

    This class is likely only useful in a near-real-time context, where the indexing rate is fairly low but the reopen rate is fairly high, resulting in many tiny files being written. This directory keeps such segments in RAM, along with the segments produced by merging them, as long as those are small enough.

    This is safe to use: when your app calls IndexWriter.commit(), all cached files are flushed from the cache to the delegate and sync'd.

    Here's a simple example usage:

       // Wrap an on-disk directory: cache merges up to 5 MB, with at most 60 MB cached in total.
       Directory fsDir = FSDirectory.open(Paths.get("/path/to/index"));
       NRTCachingDirectory cachedFSDir = new NRTCachingDirectory(fsDir, 5.0, 60.0);
       IndexWriterConfig conf = new IndexWriterConfig(analyzer);  // analyzer: any Analyzer instance
       IndexWriter writer = new IndexWriter(cachedFSDir, conf);
     

    This caches all newly flushed segments and all merges whose expected segment size is <= 5 MB, unless the net cached bytes exceed 60 MB, at which point no writes are cached (until the net cached bytes fall back below 60 MB).
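
    The caching rule described above can be modelled without any Lucene classes. The following is a minimal sketch, not the library's implementation: the class name CachePolicySketch, the WriteKind enum and the shouldCache method are hypothetical, and the conversion of 1 MB to 1024 * 1024 bytes is an assumption.

```java
// Hypothetical model of the caching rule; none of these names come from Lucene.
final class CachePolicySketch {
    enum WriteKind { FLUSH, MERGE, OTHER }

    private final long maxMergeSizeBytes;
    private final long maxCachedBytes;

    CachePolicySketch(double maxMergeSizeMB, double maxCachedMB) {
        // Assumption: 1 MB is 1024 * 1024 bytes. Convert once, up front.
        this.maxMergeSizeBytes = (long) (maxMergeSizeMB * 1024 * 1024);
        this.maxCachedBytes = (long) (maxCachedMB * 1024 * 1024);
    }

    /** Returns true if a new output should be written to RAM rather than to the delegate. */
    boolean shouldCache(WriteKind kind, long estimatedMergeBytes, long cachedBytes) {
        // Condition 1: a flush, or a merge whose estimated size is within the merge limit.
        boolean candidate = kind == WriteKind.FLUSH
            || (kind == WriteKind.MERGE && estimatedMergeBytes <= maxMergeSizeBytes);
        // Condition 2: the cache as a whole is still within its budget.
        return candidate && cachedBytes <= maxCachedBytes;
    }

    public static void main(String[] args) {
        CachePolicySketch policy = new CachePolicySketch(5.0, 60.0);
        long mb = 1024L * 1024L;
        System.out.println(policy.shouldCache(WriteKind.FLUSH, 0, 0));       // true
        System.out.println(policy.shouldCache(WriteKind.MERGE, 6 * mb, 0));  // false: merge too large
        System.out.println(policy.shouldCache(WriteKind.FLUSH, 0, 61 * mb)); // false: cache over budget
    }
}
```

    With the limits from the example usage (5.0 and 60.0), a 6 MB merge goes straight to the delegate, and nothing is cached while more than 60 MB is already held in RAM.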

    • Constructor Summary

      Constructors 
      Constructor Description
      NRTCachingDirectory​(Directory delegate, double maxMergeSizeMB, double maxCachedMB)
      A newly created output is cached if 1) it is a flush, or a merge whose estimated merged-segment size is <= maxMergeSizeMB, and 2) the total cached bytes is <= maxCachedMB.
    • Method Summary

      Modifier and Type Method Description
      void close()
      Close this directory, which flushes any cached files to the delegate and then closes the delegate.
      IndexOutput createOutput​(java.lang.String name, IOContext context)
      Creates a new, empty file in the directory and returns an IndexOutput instance for appending data to this file.
      IndexOutput createTempOutput​(java.lang.String prefix, java.lang.String suffix, IOContext context)
      Creates a new, empty, temporary file in the directory and returns an IndexOutput instance for appending data to this file.
      void deleteFile​(java.lang.String name)
      Removes an existing file in the directory.
      protected boolean doCacheWrite​(java.lang.String name, IOContext context)
      Subclasses can override this to customize the caching logic; return true if this file should be written to the RAMDirectory.
      long fileLength​(java.lang.String name)
      Returns the byte length of a file in the directory.
      java.util.Collection<Accountable> getChildResources()
      Returns nested resources of this class.
      java.lang.String[] listAll()
      Returns names of all files stored in this directory.
      java.lang.String[] listCachedFiles()  
      IndexInput openInput​(java.lang.String name, IOContext context)
      Opens a stream for reading an existing file.
      long ramBytesUsed()
      Return the memory usage of this object in bytes.
      void rename​(java.lang.String source, java.lang.String dest)
      Renames source file to dest file where dest must not already exist in the directory.
      (package private) static boolean slowFileExists​(Directory dir, java.lang.String fileName)
      Returns true if the file exists (can be opened), false if it cannot be opened, and (unlike Java's File.exists) throws IOException if there's some unexpected error.
      void sync​(java.util.Collection<java.lang.String> fileNames)
      Ensures that any writes to these files are moved to stable storage (made durable).
      java.lang.String toString()  
      private void unCache​(java.lang.String fileName)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • maxMergeSizeBytes

        private final long maxMergeSizeBytes
      • maxCachedBytes

        private final long maxCachedBytes
      • uncacheLock

        private final java.lang.Object uncacheLock
    • Constructor Detail

      • NRTCachingDirectory

        public NRTCachingDirectory​(Directory delegate,
                                   double maxMergeSizeMB,
                                   double maxCachedMB)
        A newly created output is cached if 1) it is a flush, or a merge whose estimated merged-segment size is <= maxMergeSizeMB, and 2) the total cached bytes is <= maxCachedMB.
    • Method Detail

      • listAll

        public java.lang.String[] listAll()
                                   throws java.io.IOException
        Description copied from class: Directory
        Returns names of all files stored in this directory. The output must be in sorted (UTF-16, Java's String.compareTo(java.lang.String)) order.
        Overrides:
        listAll in class FilterDirectory
        Throws:
        java.io.IOException - in case of I/O error
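
        For a caching wrapper, one way to meet the sorted-order requirement is to take the union of the cached names and the delegate's names and sort it. The sketch below assumes exactly that; the class ListAllSketch and the method mergeListings are hypothetical and work on plain string arrays rather than on Lucene directories.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: merges two file listings into one sorted, duplicate-free listing.
final class ListAllSketch {
    static String[] mergeListings(String[] cachedNames, String[] delegateNames) {
        // TreeSet uses the natural String order, i.e. String.compareTo (UTF-16 code units).
        Set<String> merged = new TreeSet<>();
        merged.addAll(Arrays.asList(cachedNames));
        merged.addAll(Arrays.asList(delegateNames));
        return merged.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] cached = {"_1.cfs", "_0.si"};
        String[] onDisk = {"segments_1", "_0.si"};
        System.out.println(Arrays.toString(mergeListings(cached, onDisk)));
        // prints [_0.si, _1.cfs, segments_1]
    }
}
```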
      • deleteFile

        public void deleteFile​(java.lang.String name)
                        throws java.io.IOException
        Description copied from class: Directory
        Removes an existing file in the directory. This method must throw either NoSuchFileException or FileNotFoundException if name points to a non-existing file.
        Overrides:
        deleteFile in class FilterDirectory
        Parameters:
        name - the name of an existing file.
        Throws:
        java.io.IOException - in case of I/O error
      • fileLength

        public long fileLength​(java.lang.String name)
                        throws java.io.IOException
        Description copied from class: Directory
        Returns the byte length of a file in the directory. This method must throw either NoSuchFileException or FileNotFoundException if name points to a non-existing file.
        Overrides:
        fileLength in class FilterDirectory
        Parameters:
        name - the name of an existing file.
        Throws:
        java.io.IOException - in case of I/O error
      • listCachedFiles

        public java.lang.String[] listCachedFiles()
      • createOutput

        public IndexOutput createOutput​(java.lang.String name,
                                        IOContext context)
                                 throws java.io.IOException
        Description copied from class: Directory
        Creates a new, empty file in the directory and returns an IndexOutput instance for appending data to this file. This method must throw FileAlreadyExistsException if the file already exists.
        Overrides:
        createOutput in class FilterDirectory
        Parameters:
        name - the name of the file to create.
        Throws:
        java.io.IOException - in case of I/O error
      • sync

        public void sync​(java.util.Collection<java.lang.String> fileNames)
                  throws java.io.IOException
        Description copied from class: Directory
        Ensures that any writes to these files are moved to stable storage (made durable). Lucene uses this to properly commit changes to the index, to prevent a machine/OS crash from corrupting the index.
        Overrides:
        sync in class FilterDirectory
        Throws:
        java.io.IOException
        See Also:
        Directory.syncMetaData()
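
        The class description states that cached files are flushed from the cache before they are sync'd. The sketch below models that step under loud assumptions: the cache and the delegate are both plain in-memory maps, the class UncacheOnSyncSketch is hypothetical, and the actual durable write is left as a comment. Only the field name uncacheLock and the method name unCache are taken from this page.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model: cache and delegate are maps from file name to file bytes.
final class UncacheOnSyncSketch {
    final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    final Map<String, byte[]> delegate = new ConcurrentHashMap<>();
    private final Object uncacheLock = new Object();

    /** Moves every named file that is still cached into the delegate. */
    void sync(Collection<String> fileNames) {
        for (String name : fileNames) {
            unCache(name);
        }
        // A real directory would now ask the delegate to make these files durable.
    }

    private void unCache(String fileName) {
        synchronized (uncacheLock) {
            byte[] data = cache.get(fileName);
            if (data == null) {
                return; // never cached, or already moved by an earlier call
            }
            // Write to the delegate first so the file is visible somewhere at all times.
            delegate.put(fileName, data);
            cache.remove(fileName);
        }
    }

    public static void main(String[] args) {
        UncacheOnSyncSketch dir = new UncacheOnSyncSketch();
        dir.cache.put("_0.cfs", new byte[] {1, 2});
        dir.cache.put("_1.cfs", new byte[] {3});
        dir.sync(Arrays.asList("_0.cfs"));
        System.out.println(dir.delegate.keySet()); // [_0.cfs]
        System.out.println(dir.cache.keySet());    // [_1.cfs]
    }
}
```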
      • rename

        public void rename​(java.lang.String source,
                           java.lang.String dest)
                    throws java.io.IOException
        Description copied from class: Directory
        Renames source file to dest file where dest must not already exist in the directory. It is permitted for this operation to not be truly atomic, for example both source and dest can be visible temporarily in Directory.listAll(). However, the implementation of this method must ensure the content of dest appears as the entire source atomically. So once dest is visible for readers, the entire content of previous source is visible. This method is used by IndexWriter to publish commits.
        Overrides:
        rename in class FilterDirectory
        Throws:
        java.io.IOException
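
        On a plain file system, the contract above (dest must not exist, and dest appears with its full content at once) can be approximated with java.nio.file. This is a sketch on file-system paths, not on Lucene's Directory; the class AtomicRenameSketch is hypothetical, and the existence check followed by the move is not itself atomic.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper that renames a file inside one directory.
final class AtomicRenameSketch {
    static void rename(Path dir, String source, String dest) throws IOException {
        Path target = dir.resolve(dest);
        // Reject an existing destination; note that this check can race with other writers.
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }
        // ATOMIC_MOVE makes the destination appear with its complete content in one step.
        Files.move(dir.resolve(source), target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-sketch");
        Files.write(dir.resolve("pending_segments_1"), new byte[] {7, 8, 9});
        rename(dir, "pending_segments_1", "segments_1");
        System.out.println(Files.exists(dir.resolve("segments_1")));         // true
        System.out.println(Files.exists(dir.resolve("pending_segments_1"))); // false
    }
}
```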
      • openInput

        public IndexInput openInput​(java.lang.String name,
                                    IOContext context)
                             throws java.io.IOException
        Description copied from class: Directory
        Opens a stream for reading an existing file. This method must throw either NoSuchFileException or FileNotFoundException if name points to a non-existing file.
        Overrides:
        openInput in class FilterDirectory
        Parameters:
        name - the name of an existing file.
        Throws:
        java.io.IOException - in case of I/O error
      • close

        public void close()
                   throws java.io.IOException
        Close this directory, which flushes any cached files to the delegate and then closes the delegate.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Overrides:
        close in class FilterDirectory
        Throws:
        java.io.IOException
      • doCacheWrite

        protected boolean doCacheWrite​(java.lang.String name,
                                       IOContext context)
        Subclasses can override this to customize the caching logic; return true if this file should be written to the RAMDirectory.
      • createTempOutput

        public IndexOutput createTempOutput​(java.lang.String prefix,
                                            java.lang.String suffix,
                                            IOContext context)
                                     throws java.io.IOException
        Description copied from class: Directory
        Creates a new, empty, temporary file in the directory and returns an IndexOutput instance for appending data to this file. The temporary file name (accessible via IndexOutput.getName()) will start with prefix, end with suffix and have a reserved file extension .tmp.
        Overrides:
        createTempOutput in class FilterDirectory
        Throws:
        java.io.IOException
      • slowFileExists

        static boolean slowFileExists​(Directory dir,
                                      java.lang.String fileName)
                               throws java.io.IOException
        Returns true if the file exists (can be opened), false if it cannot be opened, and (unlike Java's File.exists) throws IOException if there's some unexpected error.
        Throws:
        java.io.IOException
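
        The same three-way behaviour (true, false, or IOException) can be illustrated on ordinary files. The sketch below is an analogue built on java.nio.file and takes a Path where the method above takes a Directory; the class FileExistsSketch is hypothetical, and treating only NoSuchFileException and FileNotFoundException as "does not exist" is an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical file-system analogue of an "exists" check that does not swallow real errors.
final class FileExistsSketch {
    static boolean slowFileExists(Path dir, String fileName) throws IOException {
        try (InputStream in = Files.newInputStream(dir.resolve(fileName))) {
            return true; // the file could be opened
        } catch (NoSuchFileException | FileNotFoundException e) {
            return false; // the file is missing, which is an expected outcome
        }
        // Any other IOException (permissions, I/O failure) propagates to the caller.
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("exists-sketch");
        Files.write(dir.resolve("_0.cfs"), new byte[] {1, 2, 3});
        System.out.println(slowFileExists(dir, "_0.cfs"));  // true
        System.out.println(slowFileExists(dir, "_9.cfs"));  // false
    }
}
```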
      • unCache

        private void unCache​(java.lang.String fileName)
                      throws java.io.IOException
        Throws:
        java.io.IOException
      • ramBytesUsed

        public long ramBytesUsed()
        Description copied from interface: Accountable
        Return the memory usage of this object in bytes. Negative values are illegal.
        Specified by:
        ramBytesUsed in interface Accountable
      • getChildResources

        public java.util.Collection<Accountable> getChildResources()
        Description copied from interface: Accountable
        Returns nested resources of this class. The result should be a point-in-time snapshot (to avoid race conditions).
        Specified by:
        getChildResources in interface Accountable
        See Also:
        Accountables
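
        The two accounting methods above can be sketched together: the total is the sum over the children, and the child list is copied so that later changes do not affect a caller's view. Everything here is hypothetical, including the class AccountingSketch and the Sized interface, which stands in for Accountable.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical model of memory accounting over a set of cached files.
final class AccountingSketch {
    /** Stand-in for Accountable: anything that can report its size in bytes. */
    interface Sized {
        long ramBytesUsed();
    }

    private final List<Sized> cachedFiles = new ArrayList<>();

    synchronized void add(Sized file) {
        cachedFiles.add(file);
    }

    /** Sum of the children's sizes; never negative as long as no child reports a negative size. */
    synchronized long ramBytesUsed() {
        long total = 0;
        for (Sized file : cachedFiles) {
            total += file.ramBytesUsed();
        }
        return total;
    }

    /** Point-in-time snapshot: a read-only copy, so later add() calls do not change it. */
    synchronized Collection<Sized> getChildResources() {
        return Collections.unmodifiableList(new ArrayList<>(cachedFiles));
    }

    public static void main(String[] args) {
        AccountingSketch acc = new AccountingSketch();
        acc.add(() -> 100L);
        acc.add(() -> 28L);
        Collection<Sized> snapshot = acc.getChildResources();
        acc.add(() -> 1L);
        System.out.println(acc.ramBytesUsed()); // 129
        System.out.println(snapshot.size());    // 2
    }
}
```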