org.greenstone.gatherer.msm.parsers
Class GreenstoneMetadataParser

java.lang.Object
  extended by java.util.AbstractMap
      extended by java.util.HashMap
          extended by java.util.LinkedHashMap
              extended by org.greenstone.gatherer.msm.parsers.GreenstoneMetadataParser
All Implemented Interfaces:
java.lang.Cloneable, java.util.Map, org.greenstone.gatherer.msm.MetadataParser, java.io.Serializable

public class GreenstoneMetadataParser
extends java.util.LinkedHashMap
implements org.greenstone.gatherer.msm.MetadataParser

Provides a metadata parser implementation that knows how to locate, prepare for, and then import metadata from a previous Greenstone collection. It is aware of factors such as the presence of Metadata Set files and hierarchy files. Where possible, it updates the profiler so that subsequent imports from the same collection are faster. All information about encountered collections is held in CollectCFG objects, which are softly cached (i.e. they are cached, but can be reclaimed before an OutOfMemoryError would be thrown).
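
The soft caching mentioned above can be illustrated with java.lang.ref.SoftReference. The following is only a minimal sketch of the general technique: the class name SoftCache and its methods are hypothetical, and the real CollectCFGCache source is not shown on this page and may work differently.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical soft cache; an illustration, not the CollectCFGCache source. */
class SoftCache<K, V> {
    // Values are held only through soft references, so the garbage
    // collector may clear them when memory runs low.
    private final Map<K, SoftReference<V>> entries = new HashMap<>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if it is absent or was reclaimed. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The referent was reclaimed; drop the stale map entry.
            entries.remove(key);
        }
        return value;
    }
}
```

A caller treats a null result as a cache miss and parses the collect.cfg file again, so reclamation costs time but never correctness.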

Version:
2.3

Nested Class Summary
private  class GreenstoneMetadataParser.BasicGDMDocument
          A 'basic' version of the more complete GDMDocument used elsewhere, this object provides the same functionality except that it doesn't use Metadata objects.
private  class GreenstoneMetadataParser.BasicMetadata
          A simplistic version of metadata, with no live references.
private  class GreenstoneMetadataParser.CollectCFG
          The CollectCFG object encapsulates important metadata information extracted from a collect.cfg file, such as required metadata sets, and hfile associations.
private  class GreenstoneMetadataParser.CollectCFGCache
          This class provides a cache for the instances of parsed collect.cfg files and their associated data.
private  class GreenstoneMetadataParser.HFile
          The HFile object provides a container for the mappings from indexes, of the form 1.1.1, to alias-value pairs.
private  class GreenstoneMetadataParser.MetadataXMLFileSearch
           
 
Nested classes inherited from class java.util.LinkedHashMap
 
Nested classes inherited from class java.util.HashMap
 
Nested classes inherited from class java.util.AbstractMap
 
Nested classes inherited from class java.util.Map
java.util.Map.Entry
 
Field Summary
private  GreenstoneMetadataParser.CollectCFGCache cfg_cache
          A cache of previously parsed collection configuration files.
private static java.lang.String CONFIG_FILENAME
          The default name and location for a collection configuration file (presuming that a collection file prefix will be added).
private static java.lang.String DESCRIPTION_ELEMENT
           
private  boolean dialog_cancelled
          Whether this process has been cancelled.
private static java.lang.String DIRECTORY_FILENAME
          The pattern to match when searching for directory level assignments.
private static java.lang.String DIRECTORY_FILENAME_SUFFIX
           
private static java.lang.String FILENAME_ELEMENT
           
private static java.lang.String FILESET_ELEMENT
           
private static java.lang.String GIMPORT
          The name of a gdm file.
private  java.util.ArrayList ignore_list
          A list of the collect.cfg paths that we should ignore.
private static java.lang.String IMPORT
           
private static int MAX_CFG_CACHE_SIZE
           
private static int MAX_GDM_CACHE_SIZE
           
private static java.lang.String METADATA_ELEMENT
           
private static java.lang.String METADATA_XML_FILENAME
           
private static java.lang.String MODE_ATTRIBUTE
           
private static java.lang.String NAME_ATTRIBUTE
           
private static java.lang.String SEPARATOR
           
private  java.util.HashMap transform
          A mapping from BasicMetadata to their fully enabled Metadata incarnation.
 
Fields inherited from class java.util.LinkedHashMap
 
Fields inherited from class java.util.HashMap
 
Fields inherited from class java.util.AbstractMap
 
Constructor Summary
GreenstoneMetadataParser()
          Default constructor needed for dynamic class loading.
 
Method Summary
private  void addMetadata(org.greenstone.gatherer.file.FileNode origin, org.greenstone.gatherer.file.FileNode destination, java.util.ArrayList metadata, java.io.File collection_dir, GreenstoneMetadataParser.CollectCFG collect_cfg, boolean dummy_run)
           
private  java.lang.String diff(java.lang.String base_str, java.lang.String target_str)
          Determine the differing suffix between two strings.
private  GreenstoneMetadataParser.BasicGDMDocument getDocument(java.io.File file)
          Retrieve the BasicGDMDocument found at the given file, or null if there is no such file or if it isn't a valid BasicGDMDocument.
 boolean process(org.greenstone.gatherer.file.FileNode destination, org.greenstone.gatherer.file.FileNode origin, boolean folder_level, boolean dummy_run)
          Locate and import any metadata parsed by this metadata parser given the file involved and its previous incarnation.
protected  boolean removeEldestEntry(java.util.Map.Entry entry)
          Returns true if this map should remove the eldest entry.
private  org.greenstone.gatherer.msm.ElementWrapper selectElement(java.lang.String element_name)
          Display a prompt allowing the user to choose the metadata element into which an unmatched element should be added or merged, or to ignore that element.
 
Methods inherited from class java.util.LinkedHashMap
clear, containsValue, get
 
Methods inherited from class java.util.HashMap
clone, containsKey, entrySet, isEmpty, keySet, put, putAll, remove, size, values
 
Methods inherited from class java.util.AbstractMap
equals, hashCode, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Map
equals, hashCode
 

Field Detail

MAX_CFG_CACHE_SIZE

private static final int MAX_CFG_CACHE_SIZE
See Also:
Constant Field Values

MAX_GDM_CACHE_SIZE

private static final int MAX_GDM_CACHE_SIZE
See Also:
Constant Field Values

CONFIG_FILENAME

private static final java.lang.String CONFIG_FILENAME
The default name and location for a collection configuration file (presuming that a collection file prefix will be added).


DIRECTORY_FILENAME

private static final java.lang.String DIRECTORY_FILENAME
The pattern to match when searching for directory level assignments.

See Also:
Constant Field Values

DIRECTORY_FILENAME_SUFFIX

private static final java.lang.String DIRECTORY_FILENAME_SUFFIX
See Also:
Constant Field Values

DESCRIPTION_ELEMENT

private static final java.lang.String DESCRIPTION_ELEMENT
See Also:
Constant Field Values

FILENAME_ELEMENT

private static final java.lang.String FILENAME_ELEMENT
See Also:
Constant Field Values

FILESET_ELEMENT

private static final java.lang.String FILESET_ELEMENT
See Also:
Constant Field Values

GIMPORT

private static final java.lang.String GIMPORT
The name of a gdm file.

See Also:
Constant Field Values

IMPORT

private static final java.lang.String IMPORT
See Also:
Constant Field Values

METADATA_ELEMENT

private static final java.lang.String METADATA_ELEMENT
See Also:
Constant Field Values

METADATA_XML_FILENAME

private static final java.lang.String METADATA_XML_FILENAME
See Also:
Constant Field Values

MODE_ATTRIBUTE

private static final java.lang.String MODE_ATTRIBUTE
See Also:
Constant Field Values

NAME_ATTRIBUTE

private static final java.lang.String NAME_ATTRIBUTE
See Also:
Constant Field Values

SEPARATOR

private static final java.lang.String SEPARATOR
See Also:
Constant Field Values

ignore_list

private java.util.ArrayList ignore_list
A list of the collect.cfg paths that we should ignore.


dialog_cancelled

private boolean dialog_cancelled
Whether this process has been cancelled.


cfg_cache

private GreenstoneMetadataParser.CollectCFGCache cfg_cache
A cache of previously parsed collection configuration files.


transform

private java.util.HashMap transform
A mapping from BasicMetadata to their fully enabled Metadata incarnation.

Constructor Detail

GreenstoneMetadataParser

public GreenstoneMetadataParser()
Default constructor needed for dynamic class loading.

Method Detail

process

public boolean process(org.greenstone.gatherer.file.FileNode destination,
                       org.greenstone.gatherer.file.FileNode origin,
                       boolean folder_level,
                       boolean dummy_run)
Locate and import any metadata parsed by this metadata parser given the file involved and its previous incarnation.

Specified by:
process in interface org.greenstone.gatherer.msm.MetadataParser

removeEldestEntry

protected boolean removeEldestEntry(java.util.Map.Entry entry)
Description copied from class: java.util.LinkedHashMap
Returns true if this map should remove the eldest entry. This method is invoked by all calls to put and putAll which place a new entry in the map, providing the implementer an opportunity to remove the eldest entry any time a new one is added. This can be used to save memory usage of the hashtable, as well as emulating a cache, by deleting stale entries.

For example, to keep the Map limited to 100 entries, override as follows:

 private static final int MAX_ENTRIES = 100;
 protected boolean removeEldestEntry(Map.Entry eldest)
 {
   return size() > MAX_ENTRIES;
 }
 

Typically, this method does not modify the map, but just uses the return value as an indication to put whether to proceed. However, if you override it to modify the map, you must return false (indicating that put should leave the modified map alone), or you face unspecified behavior. Remember that in access-order mode, even calling get is a structural modification, but using the collections views (such as keySet) is not.

This method is called after the eldest entry has been inserted, so if put was called on a previously empty map, the eldest entry is the one you just put in! The default implementation just returns false, so that this map always behaves like a normal one with unbounded growth.
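
The override described above can be seen end to end in a small, self-contained sketch. The class name BoundedCache and the limit used in the usage note are illustrative assumptions; the limit actually applied by GreenstoneMetadataParser is not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache built on removeEldestEntry. */
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // The third argument selects access order, so get() refreshes an entry.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

With a limit of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because the read made "a" more recently used than "b".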


addMetadata

private void addMetadata(org.greenstone.gatherer.file.FileNode origin,
                         org.greenstone.gatherer.file.FileNode destination,
                         java.util.ArrayList metadata,
                         java.io.File collection_dir,
                         GreenstoneMetadataParser.CollectCFG collect_cfg,
                         boolean dummy_run)

diff

private java.lang.String diff(java.lang.String base_str,
                              java.lang.String target_str)
Determine the differing suffix between two strings.
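
The page gives no further detail about this private method. One plausible reading, shown here purely as an assumption, is that it strips the longest common prefix of the two strings and returns what remains of the target. The class name SuffixDiff and the example paths are hypothetical.

```java
/** Hypothetical illustration of a "differing suffix" computation. */
class SuffixDiff {
    /**
     * Returns the part of target that follows the longest prefix it
     * shares with base (assumed behaviour, not confirmed by the source).
     */
    static String diff(String base, String target) {
        int limit = Math.min(base.length(), target.length());
        int i = 0;
        // Advance while both strings agree character by character.
        while (i < limit && base.charAt(i) == target.charAt(i)) {
            i++;
        }
        return target.substring(i);
    }
}
```

Under this reading, diff("/collect/demo", "/collect/demo/import/a.txt") yields "/import/a.txt", the path of a file relative to its collection directory.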


getDocument

private GreenstoneMetadataParser.BasicGDMDocument getDocument(java.io.File file)
Retrieve the BasicGDMDocument found at the given file, or null if there is no such file or if it isn't a valid BasicGDMDocument.


selectElement

private org.greenstone.gatherer.msm.ElementWrapper selectElement(java.lang.String element_name)
Display a prompt allowing the user to choose the metadata element into which an unmatched element should be added or merged, or to ignore that element. For instance, an old version of a metadata.xml from the DLS collection might have an assigned metadata value "Publisher=EC Courier", but Publisher won't automatically match any metadata set. In that case this prompt is displayed, and some effort is made to systematically locate the appropriate set. Here that should be the DLS metadata set, as dls.Publisher should be the closest match. In every case, the selected element is returned.
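
How the closest match is located is not documented here. As a rough sketch of one possible approach, an unqualified name such as Publisher could be compared against the part of each known element name that follows the namespace prefix. The class ElementMatcher, the method closestMatch and the matching rule are all assumptions made for illustration.

```java
import java.util.List;

/** Hypothetical matcher for unqualified metadata element names. */
class ElementMatcher {
    /**
     * Returns the first candidate whose name after the namespace prefix
     * equals elementName (ignoring case), or null if none matches.
     */
    static String closestMatch(String elementName, List<String> candidates) {
        for (String candidate : candidates) {
            int dot = candidate.indexOf('.');
            // "dls.Publisher" has the local name "Publisher".
            String local = dot >= 0 ? candidate.substring(dot + 1) : candidate;
            if (local.equalsIgnoreCase(elementName)) {
                return candidate;
            }
        }
        return null;
    }
}
```

A null result corresponds to the case where no set offers a match and the user must choose an element or ignore the value.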