Runtime aspects of Resoa persistence

Security issues

Caching strategies

Persistence causes many locks to be acquired and released at runtime. Locking happens in memory only; the JSON values are cached in maps (Resoa uses the performance-optimized map implementations from www.javolution.org).
Besides the LockManager's in-memory caching, Resoa uses ehcache, also in memory only.

Read requests are always handled by the LockManager and follow this sequence:

  1. first, ask the cache whether the required key is known
  2. second, check whether the key exists in the LOCK cache
  3. finally, send a read request to the storage and cache the value in ehcache
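
The read sequence above can be sketched roughly as follows. This is only an illustration, not Resoa's actual API: the class TieredReader, its fields and its methods are hypothetical, plain maps stand in for ehcache and for the LockManager's LOCK cache, and it assumes that the "cache" in step 1 is the ehcache instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of the three-step read sequence; not Resoa code. */
class TieredReader {
    // Stand-in for ehcache (assumption: this is the cache asked in step 1).
    private final Map<String, String> readCache = new ConcurrentHashMap<>();
    // Stand-in for the LockManager's in-memory LOCK cache.
    private final Map<String, String> lockCache = new ConcurrentHashMap<>();
    // Stand-in for the storage backend: returns the JSON string, or null if unknown.
    private final Function<String, String> storage;

    TieredReader(Function<String, String> storage) {
        this.storage = storage;
    }

    /** Simulates a lock that holds the JSON value of a key in memory. */
    void lock(String key, String json) {
        lockCache.put(key, json);
    }

    /** Returns the JSON string for the key, or null if no tier knows it. */
    String read(String key) {
        // Step 1: ask the cache whether the key is known.
        String json = readCache.get(key);
        if (json != null) {
            return json;
        }
        // Step 2: check the LOCK cache.
        json = lockCache.get(key);
        if (json != null) {
            return json;
        }
        // Step 3: read from storage and remember the value in the cache.
        json = storage.apply(key);
        if (json != null) {
            readCache.put(key, json);
        }
        return json;
    }
}
```

With this layout the storage backend is only contacted when neither in-memory tier knows the key, and a repeated read of the same key is answered from memory.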

Both the LockManager and ehcache cache values in their JSON string representation. Object references are therefore resolved anew on every read request, so a returned object cannot contain invalid data.
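
A minimal sketch of why caching the JSON string helps, under stated assumptions: JsonStringCache is a hypothetical class, and the deserializer is passed in as a function because the JSON library Resoa uses is not named here. Each read builds a new object from the cached string, so changes a caller makes to one returned object never leak into the cache or into other readers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: the cache holds JSON strings, never live objects. */
class JsonStringCache<T> {
    private final Map<String, String> jsonByKey = new ConcurrentHashMap<>();
    // Turns a JSON string into an object; supplied by the caller (assumption).
    private final Function<String, T> deserializer;

    JsonStringCache(Function<String, T> deserializer) {
        this.deserializer = deserializer;
    }

    void put(String key, String json) {
        jsonByKey.put(key, json);
    }

    /** Deserializes a fresh object on every call; returns null for unknown keys. */
    T read(String key) {
        String json = jsonByKey.get(key);
        return json == null ? null : deserializer.apply(json);
    }
}
```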

A Resoa node crashes, what happens?

Because Resoa nodes run within a grid, a crash does not affect the whole system. If a node goes down, all its TCP connections to other Resoa components (nodes, controller, rest) are cut immediately, and the service requests that the node was serving are routed to other nodes.

The Resoa Controllers know the MASTER/SLAVE configuration. If the crashed node was acting as a domain MASTER, the "MasterController" immediately sends a RuntimeInfo to another node serving the same domain.

On receiving a RuntimeInfo of type MASTER, a node processes all remaining journals from its local disk into the storage backend and informs the connected SLAVES about all outstanding COMMITs. As long as you run your Business Service Domains on at least two nodes, a crash will not seriously affect your application. Data consistency is assured because only one MASTER is in play at any time.
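
The promotion step might look roughly like the sketch below. All names (FailoverNode, JournalEntry, onRuntimeInfo) are hypothetical, the journal and the storage backend are simplified to in-memory collections, and the returned list merely stands for the COMMITs that would be announced to the connected SLAVES.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a node being promoted to domain MASTER. */
class FailoverNode {
    enum Role { MASTER, SLAVE }

    /** One journaled transaction that has not reached the storage yet. */
    static final class JournalEntry {
        final long txId;
        final String key;
        final String json;

        JournalEntry(long txId, String key, String json) {
            this.txId = txId;
            this.key = key;
            this.json = json;
        }
    }

    private Role role = Role.SLAVE;
    // Stand-in for the journals on local disk.
    private final List<JournalEntry> localJournal = new ArrayList<>();
    // Stand-in for the storage backend.
    private final Map<String, String> storage = new LinkedHashMap<>();

    void journal(long txId, String key, String json) {
        localJournal.add(new JournalEntry(txId, key, json));
    }

    /**
     * Handles a role change. On promotion to MASTER, the remaining journal is
     * written to storage in order, and the ids of the outstanding COMMITs are
     * returned so that the caller can announce them to the SLAVES.
     */
    List<Long> onRuntimeInfo(Role newRole) {
        role = newRole;
        List<Long> outstanding = new ArrayList<>();
        if (newRole == Role.MASTER) {
            for (JournalEntry entry : localJournal) {
                storage.put(entry.key, entry.json);
                outstanding.add(entry.txId);
            }
            localJournal.clear();
        }
        return outstanding;
    }

    Role role() {
        return role;
    }

    String stored(String key) {
        return storage.get(key);
    }
}
```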

A controller node crash should not affect the stability of the grid as long as a second controller is still in play. If no controller is available, the services stay up, but without Rest session synchronization and persistence MASTER/SLAVE monitoring. New Rest or service nodes also cannot join the grid.

How to synchronize a node that joins the grid?

As long as your node does not serve persistent domains, it can join without further action. In all other cases you must ensure that the local storage is synchronized before the node takes part in request execution again.

Resoa Persistence offers two mechanisms that support storage synchronization at runtime:

The “BACKUP” of the storage

A hot backup function is available on nodes; it mainly relies on the Berkeley backup services.

The EXECUTION LOG

The storage implementation writes a logfile for each service domain, in which all transactions are recorded in strict sequence. This logfile is deleted after a BACKUP has completed successfully.
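
The life cycle of such a log can be illustrated with a small sketch. ExecutionLog and its methods are hypothetical names, and an in-memory list stands in for the per-domain logfile; the real file format is not described here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the execution log of one service domain. */
class ExecutionLog {
    private final List<String> entries = new ArrayList<>();
    private long nextSequence = 1;

    /** Records a transaction in strict sequence and returns its sequence number. */
    synchronized long append(String transactionJson) {
        long sequence = nextSequence++;
        entries.add(sequence + ":" + transactionJson);
        return sequence;
    }

    /** Copy of the current log, for example to hand to a joining node. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(entries);
    }

    /** The log is only discarded when the BACKUP succeeded. */
    synchronized void onBackupCompleted(boolean success) {
        if (success) {
            entries.clear();
        }
    }

    synchronized int size() {
        return entries.size();
    }
}
```

In this sketch the sequence counter keeps running after the log is cleared, so entries written after a backup can still be ordered against earlier ones.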

You should invoke the backup service periodically for all domains and nodes to keep the execution log files small. Ideally, do this daily at a time of low activity.

Before setting up a new node within the grid, perform a backup on the domain MASTER node. Then copy the newly generated backup files of the storage to the local storage directory of the new node.

When the node finally starts and joins the grid, it requests the current execution log file from the domain MASTER (using the ExecLogService). This way it receives all transactions that are missing from the backup files. While it processes the execution log, the node silently listens to all tasks and COMMITs, but it does not execute them until log processing has finished.
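
The "listen but do not execute yet" behaviour can be sketched as a simple buffer. JoiningNode and its methods are hypothetical, log entries and COMMITs are reduced to key/JSON pairs, and the sketch assumes that applying the same write twice is harmless.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch of a node catching up after it was restored from a backup. */
class JoiningNode {
    // Local storage as restored from the backup files: key -> JSON string.
    private final Map<String, String> storage = new LinkedHashMap<>();
    // COMMITs received while the execution log is still being processed.
    private final Queue<String[]> pending = new ArrayDeque<>();
    private boolean replaying = true;

    /** Live COMMIT from the grid: buffered during replay, applied directly afterwards. */
    synchronized void onCommit(String key, String json) {
        if (replaying) {
            pending.add(new String[] {key, json});
        } else {
            storage.put(key, json);
        }
    }

    /** Applies the MASTER's execution log in order, then the buffered COMMITs. */
    synchronized void replayExecutionLog(List<String[]> logEntries) {
        for (String[] entry : logEntries) {
            storage.put(entry[0], entry[1]);
        }
        String[] queued;
        while ((queued = pending.poll()) != null) {
            storage.put(queued[0], queued[1]);
        }
        replaying = false;
    }

    synchronized String get(String key) {
        return storage.get(key);
    }

    synchronized boolean isReadyAsSlave() {
        return !replaying;
    }
}
```

Because the buffered COMMITs are applied only after the whole log, a write that arrived during the catch-up phase always ends up on top of the older log entries for the same key.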

This guarantees 100% synchronization with the MASTER’s storage, and the node is now ready to act as a SLAVE.