The external data manager is a CyFlex software framework based on the ASSET tranMan system. Various processes will generate files of data which need to be buffered to disk and processed asynchronously. A record of the data processed needs to be retained and stored in a manner to indicate success or failure in processing each file. As a result the external data manager is a combination of software and a directory structure to support this processing.
Since the external data manager is a framework and not an actual program, code needs to be written for each specific type of data file which needs to be processed. The base framework code is written in Java, and this document is aimed at Java programmers who need to implement a new file type or support existing implementations. It therefore assumes that the reader understands the object-oriented concepts of base classes and sub-classes.
One aspect of the external data manager is a standard directory structure which all implementations use. Say there is a file of type "file_type_1" which needs an external data manager implementation. Examples of a file type might be PAM data files, emissions composite files and MSU state change files.
The basic directory structure is:
.../file_type_1/
    test_cell_1/
        complete/
        hold/
        ready/
    test_cell_2/
        complete/
        hold/
        ready/
    ...
.../file_type_2/
    ...
Note that this structure supports more than one file type. In general there will only be one file type for a given implementation. However PAM data, the first implementation of the external data manager, has three different data types (headers, data points and MSU data), which CyFlex was set up to send to different directories. Because the processing for each type of PAM file is exactly the same, only one instance of the external data manager is needed even though there are three file types.
Also note that the structure supports more than one test cell with data to be processed. This structure came from the ASSET tranMan system. It allows the external data manager to be used on a central server supporting many test cells. Each test cell transfers its files to be processed to the appropriate location within this structure. The same structure may also be used at individual test cells, in which case the test cell directory is still needed even though there is only one. Actual implementations have used both approaches: PAM external data is managed through a central server, while MSU external data is managed on individual test cells.
The ready directory holds all files to be processed. The external data manager is not responsible for putting files under this directory. Other tasks and processes are responsible for this. Once a file under the ready directory is processed it will be moved to either the complete or the hold directory. If the processing of a file in the ready directory is successful, the file will be moved to the complete directory. If the processing of a file is unsuccessful the file will be moved to the hold directory and an error message file will be created with a name based on the data file name but with ".msg" appended.
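The move out of the ready directory can be pictured with a small sketch. This is not the framework's code: the class and method names are hypothetical, and it assumes the ".msg" file is written into the hold directory beside the data file, which the description above does not state explicitly.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the framework: illustrates the
// ready -> complete / hold convention described above.
class ReadyFileDisposition {

    // Moves a processed file out of ready/ into complete/ or hold/.
    // On failure it also writes "<data file name>.msg" into hold/
    // (the location of the message file is an assumption).
    static Path dispose(Path readyFile, Path completeDir, Path holdDir,
                        boolean success, String errorText) throws IOException {
        Path targetDir = success ? completeDir : holdDir;
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(readyFile.getFileName());
        Files.move(readyFile, target, StandardCopyOption.REPLACE_EXISTING);
        if (!success) {
            Path msg = holdDir.resolve(readyFile.getFileName() + ".msg");
            Files.write(msg, errorText.getBytes(StandardCharsets.UTF_8));
        }
        return target;
    }
}
```

A failed file named run42.dat would end up as hold/run42.dat with hold/run42.dat.msg next to it, so an operator can see both the data and the reason it was rejected.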
There are two main classes involved when developing external data manager applications. The first is com.cybermetrix.cyflex.extdatman.ExternalDataManager. This class is the main program. Its main function is to read a properties file and extract properties which are common to almost all of its implementations. It also contains the main processing loop. This loop watches for files showing up in the ready directories, processes them and then moves them to the hold or complete directories.
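One pass of such a loop might look roughly like the following. This is only an illustration of scanning the directory layout described earlier; the class and method names are hypothetical, and the real loop in ExternalDataManager may be organized quite differently.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a single polling pass over one file type.
class ReadyScanSketch {

    // Collects the regular files found in <fileTypeRoot>/<test cell>/ready.
    // Files under complete/ and hold/ are ignored.
    static List<Path> scanReady(Path fileTypeRoot) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> cells = Files.newDirectoryStream(fileTypeRoot)) {
            for (Path cell : cells) {
                Path ready = cell.resolve("ready");
                if (!Files.isDirectory(ready)) {
                    continue; // not a test cell directory, or no ready/ yet
                }
                try (DirectoryStream<Path> files = Files.newDirectoryStream(ready)) {
                    for (Path f : files) {
                        if (Files.isRegularFile(f)) {
                            found.add(f);
                        }
                    }
                }
            }
        }
        Collections.sort(found); // stable order for repeatable processing
        return found;
    }
}
```

Each path returned would then be checked for completeness, processed, and moved to complete or hold before the next pass.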
There is one method in this class which might need to be overridden:
protected boolean fileComplete(File file)
This method checks whether a file to be processed has been completely written. At times, large files being created will show up under the ready directory before they have finished being copied or transferred. The default implementation uses the class string variable terminatorLine to see if the file ends with the expected terminator. For example, PAM files must end with the line "*DONE". In some cases files do not have terminator lines. In that case this method should be overridden to simply return true, indicating that the file is complete, since no checking can realistically be done.
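The idea of a terminator check, and of overriding it, can be sketched as follows. The two classes are stand-ins written for this illustration only; the actual default implementation lives in ExternalDataManager and may differ, for instance in how it treats trailing blank lines, which this sketch chooses to ignore.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Stand-in for illustration; not the framework's ExternalDataManager.
class TerminatorCheckSketch {
    protected String terminatorLine = "*DONE";

    // Returns true when the last non-blank line equals terminatorLine.
    protected boolean fileComplete(File file) {
        String last = null;
        // ISO-8859-1 decodes any byte sequence, so odd bytes cannot abort the read.
        try (BufferedReader in = Files.newBufferedReader(
                file.toPath(), StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    last = line.trim();
                }
            }
        } catch (IOException e) {
            return false; // unreadable for now: treat as incomplete
        }
        return terminatorLine.equals(last);
    }
}

// Override for a file type that has no terminator line.
class NoTerminatorSketch extends TerminatorCheckSketch {
    @Override
    protected boolean fileComplete(File file) {
        return true; // nothing to check, so accept the file as complete
    }
}
```

Reading the whole file is the simplest approach; for very large files an implementation might instead inspect only the tail of the file.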
The other class is com.cybermetrix.cyflex.DataType.
This class holds the main processing for a single file which showed up in a ready directory. It is abstract and the process and initialize methods must be implemented. See the JavaDoc in the source code for details.
public abstract int process(File file, File holdDirectory);
This is where the actual processing of a file under the ready directory is performed. Since each implementation of the external data manager can apply wildly different processing to its files, this method must be implemented.
public abstract String initialize(File[][] sourceDirectories);
This is invoked at startup to initialize this instance. For example, this might obtain a database connection or connect to a remote socket server.
public void terminate();
This is invoked when a graceful termination has been requested. The default implementation of this method is empty and performs no action. However, instances may need to close sockets or database connections.
Common to both classes are integers defined in the DataType class which specify the result of the process method. The variables are SUCCESSFUL, DATA_ERROR and CONNECTION_ERROR.
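A skeleton implementation might look like the sketch below. So that it compiles on its own, it extends a stand-in class (DataTypeStandIn) that copies the three method signatures quoted above; real code would extend the framework's DataType instead. The numeric values of the result constants and the meaning of the String returned by initialize are not given here, so the values used are placeholders and the sketch simply returns null; the JavaDoc should be checked for both.

```java
import java.io.File;

// Stand-in mirroring the signatures quoted above; NOT the real DataType.
abstract class DataTypeStandIn {
    // Placeholder values: the real constants are defined in DataType.
    public static final int SUCCESSFUL = 0;
    public static final int DATA_ERROR = 1;
    public static final int CONNECTION_ERROR = 2;

    public abstract int process(File file, File holdDirectory);
    public abstract String initialize(File[][] sourceDirectories);
    public void terminate() { } // default: no action
}

// Hypothetical file type used only to show the shape of an implementation.
class ExampleFileType extends DataTypeStandIn {
    private boolean connected; // stands in for a database or socket connection

    @Override
    public String initialize(File[][] sourceDirectories) {
        connected = true; // a real implementation would open its connection here
        return null;      // assumption: return value semantics not documented here
    }

    @Override
    public int process(File file, File holdDirectory) {
        // holdDirectory is unused in this sketch.
        if (!connected) {
            return CONNECTION_ERROR; // no connection available for this file
        }
        if (!file.isFile() || file.length() == 0) {
            return DATA_ERROR; // missing or empty file counts as bad data here
        }
        return SUCCESSFUL;
    }

    @Override
    public void terminate() {
        connected = false; // a real implementation would close its connection
    }
}
```

Keeping connection problems and data problems as separate return codes lets the caller distinguish between the two kinds of failure.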
The code for these base classes and implementations is stored in a Subversion repository:
http://cmxaps.cybermetrix.com/svn/cyflex_specsgui/trunk
After checking out this repository, the base source code is under specfilegui.src/com/cybermetrix/cyflex/extdatman. Test cases may be found under specfilegui.src/tests/tests.extdatman.
Further programming details can be found in the source code, which has fairly complete JavaDoc information. Searching for code which extends either of these classes should turn up the existing implementations based on them. There should also be test cases for each of the located applications, which should further illustrate the configuration and operation of these applications.