Object Storage Architecture

Cascade is an object storage platform. Its architecture makes it more efficient, easier to use, and capable of handling much more data than traditional file storage solutions. Cascade automates day-to-day IT operations and can readily adapt to changes in scale, scope, applications, storage, server and cloud technologies over the life of the data.




Object Container Structure

A Cascade object is composed of fixed-content data (a user’s file) and electronic “sticky notes” called metadata. Metadata describes the fixed-content data, including its properties. All the metadata for an object is viewable, but only some of it can be user-modified. The way metadata can be viewed and modified depends on the namespace configuration, the data access protocol and the type of metadata.

 

The Cascade metadata types are described in the following table:

 

Metadata type

Description

Fixed content data

An exact digital copy of a written file, which is “fingerprinted” upon ingest using a hashing algorithm: MD5, SHA-1, SHA-256 (default), SHA-384, SHA-512 or RIPEMD-160. These files become immutable after being successfully stored in a virtual storage pool. If the object is under retention, it cannot be deleted before its retention period expires. See Compliance Mode. If versioning is enabled, multiple versions of a file can be retained.

System metadata

Composed of 28 properties that include:

  • The date and time the object was added to the namespace (ingest time).

  • The date and time the object was last changed (change time).

  • The cryptographic hash value of the object along with the namespace hash algorithm used to generate that value.

  • The protocol through which the object was ingested.

It also includes the object's policy settings, such as number of redundant copies, retention, shredding, indexing and versioning. POSIX metadata includes a user ID and group ID, a POSIX permissions value and POSIX time attributes.

Custom metadata

Optional, user-supplied descriptive information about a data object that is usually provided as well-formed XML. It is utilised to add more descriptive details about the object. This metadata can be utilised by future users and applications to understand and repurpose the object content. Cascade supports multiple custom metadata fields for each object.

Object ACL (access control list)

Optional, user-supplied metadata containing a set of permissions granted to users or user groups to perform operations on an object. ACLs control data access at the individual object level and are the most granular data access mechanism.
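
The object structure described in the table can be pictured as a small data model. The sketch below is an illustration only: the class, field and function names (CascadeObject, SystemMetadata, fingerprint, ingest) are hypothetical and do not represent Cascade's internal format or API. It fingerprints the fixed-content data on ingest with SHA-256, the default named above, and later re-checks the data against the stored hash.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of the fixed-content data."""
    return hashlib.new(algorithm, data).hexdigest()


@dataclass
class SystemMetadata:
    # A few of the system properties listed above; field names are hypothetical.
    ingest_time: datetime
    change_time: datetime
    hash_algorithm: str
    hash_value: str
    ingest_protocol: str


@dataclass
class CascadeObject:
    data: bytes                                           # fixed-content data
    system: SystemMetadata
    custom_metadata: list = field(default_factory=list)   # XML annotations as strings
    acl: dict = field(default_factory=dict)               # user -> set of permissions

    def is_intact(self) -> bool:
        """Recompute the fingerprint and compare it with the stored hash value."""
        current = fingerprint(self.data, self.system.hash_algorithm)
        return current == self.system.hash_value


def ingest(data: bytes, protocol: str, algorithm: str = "sha256") -> CascadeObject:
    """Build an object and record its fingerprint at ingest time."""
    now = datetime.now(timezone.utc)
    meta = SystemMetadata(now, now, algorithm, fingerprint(data, algorithm), protocol)
    return CascadeObject(data=data, system=meta)
```

Calling ingest(b"report", "REST").is_intact() returns True until the stored bytes no longer match the recorded hash, which is the kind of mismatch a verification pass would look for.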


Store Objects

Cascade access nodes share responsibility for knowing where content is stored. Cascade stores fixed content file data separately from its metadata, placing them in separate parallel data structures.

 

For scaling purposes, Cascade data nodes also maintain a hash index. This is a sharded database that is distributed among all Cascade data nodes. The hash index provides the content addressable lookup function to find data. Each node is responsible for tracking a subset of the index called a region, which tells it where to find data and metadata.

 

Upon receiving a new file, any receiving node is able to write the fixed content file portion to storage it owns, as directed by the assigned service plan. It then computes a hash of the pathname, adds it to the object’s system metadata along with the object’s location, and forwards it to the node responsible for tracking the hash index region. Cascade protects its index with a metadata protection level of 2 (MDPL2), which means it stores two copies on different nodes: one authoritative copy and one backup copy. A write is not considered complete until all MDPL copies are saved. The actual file is stored in a storage pool defined by the tenant administrator. Storage pools can be constructed with disks inside a Cascade data node or Cascade Storage.
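
The write path described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a fixed number of hash index regions, a simple modulo mapping from pathname hash to region, and round-robin selection of owner nodes. The names (Cluster, region_for, owners, write) are hypothetical. Cascade's actual region assignment and storage placement are not described in this document.

```python
import hashlib

MDPL = 2           # metadata protection level: copies of each index entry
NUM_REGIONS = 8    # assumed region count, for illustration only


def region_for(pathname: str) -> int:
    """Map a pathname to a hash index region using a hash of the pathname."""
    digest = hashlib.sha256(pathname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_REGIONS


class Cluster:
    def __init__(self, node_names):
        self.nodes = list(node_names)
        if len(self.nodes) < MDPL:
            raise ValueError("need at least MDPL nodes to keep copies apart")
        # Each node holds its slice of the index: node -> {pathname: location}.
        self.index = {name: {} for name in self.nodes}
        # File data is kept separately from the index: node -> {pathname: bytes}.
        self.storage = {name: {} for name in self.nodes}

    def owners(self, region: int):
        """Return MDPL distinct nodes: the authoritative one first, then the backup."""
        start = region % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(MDPL)]

    def write(self, receiving_node: str, pathname: str, data: bytes):
        # 1. The receiving node writes the file data to storage it owns.
        self.storage[receiving_node][pathname] = data
        # 2. It hashes the pathname to find the responsible index region.
        region = region_for(pathname)
        # 3. The index entry is saved on every MDPL owner before the write
        #    is reported as complete.
        owners = self.owners(region)
        for node in owners:
            self.index[node][pathname] = receiving_node
        return owners
```

Keeping the index entry on two different nodes is what allows a lookup to continue when the authoritative node is unavailable.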


Read Objects

Upon receiving a read request for a file, the Cascade node computes a hash using the object's pathname. What happens next depends on the scenario:

  • If the Cascade node manages the particular hash index region, it looks up the object’s location and fulfils the request.
  • If the Cascade node doesn’t manage the hash index region, it queries the owner node for the file’s location.
  • If the owner node has failed, it queries the node with the backup hash index.
  • If the whole site fails, the DNS directs the request to any surviving cluster participating in namespace replication.
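
The lookup scenarios above can be sketched as a small decision function. This is an illustration only: the function name, its parameters and the returned labels are hypothetical, and the whole-site failover through DNS is not modelled beyond reporting that no local index copy is reachable.

```python
from typing import Optional, Tuple


def find_location(pathname: str, receiving: str, owner: str, backup: str,
                  index: dict, alive: set) -> Tuple[str, Optional[str]]:
    """Return (source, location) for a read request.

    index maps node name -> {pathname: location}; alive is the set of nodes
    that are currently reachable. owner and backup are the nodes holding the
    authoritative and backup copies of the relevant hash index region.
    """
    if receiving == owner:
        # The receiving node manages the region: local lookup.
        return "local", index[owner].get(pathname)
    if owner in alive:
        # Another node manages the region: ask the owner node.
        return "owner", index[owner].get(pathname)
    if backup in alive:
        # The owner has failed: use the node holding the backup index copy.
        return "backup", index[backup].get(pathname)
    # No index copy is reachable at this site. In a replicated deployment the
    # request would be redirected to another cluster (not modelled here).
    return "unavailable", None
```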


Protocols and SDK

While Cascade supports 100% of the S3 and Swift API operations needed for CRUD programming (create, read, update and delete), many new cloud applications being developed still favour its native REST Protocol for Cascade. This protocol is more full-featured than S3 and Swift, providing insight into Cascade’s physical infrastructure, retention, multi-tenancy, search and metadata capabilities.

 

To ease the transition to REST, developers can choose the Java SDK for Cascade, complete with libraries, builder patterns and sample code. The SDK provides a fast-track path to new applications that operate at a global scale, with users who expect access from virtually any device that supports a network connection. For those not quite ready to shed their old file access methods, Cascade supports four legacy protocols: NFS v3, CIFS (SMB 3.1.1), SMTP and WebDAV. Anything written with these protocols can also be accessed with any of the REST APIs, with directories and filenames intact.


Cascade Background Services

Cascade implements 13 background services, which are described in the table below. These services improve the overall health of the Cascade system, optimise efficiency and maintain the integrity and availability of stored object data. Services run either continuously, periodically (on a specific schedule), or in response to certain events.

 

The system-level administrator can enable, disable, start or stop any service and control the priority or schedule of each service. These controls include running a service for longer periods, running it alone, or assigning it a higher priority. The administrator can also control runtime system load by limiting the number of threads that a service can spawn, using simple high, medium and low designations. All scheduled services run concurrently but independently of each other, so each service may be working on a different region of the metadata database at the same time. Each service iterates over stored content and eventually examines the metadata of every stored object.

 

On a new Cascade system, each service is scheduled to run on certain days during certain hours. If a particular service completes a full scan in the allotted period, the service stops. If it does not finish, the service resumes where it left off at its next scheduled time slot. After completing a scheduled scan interval, the service posts a summary message in the Cascade system event log.
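
The stop-and-resume behaviour described above can be sketched as a scan that keeps a cursor between time slots. This is a minimal illustration, not Cascade's scheduler: the class name ScanService, the run_slot method and the injectable clock parameter are all hypothetical.

```python
import time


class ScanService:
    """A background service that scans objects in order and remembers its position."""

    def __init__(self, name, object_ids):
        self.name = name
        self.object_ids = list(object_ids)
        self.cursor = 0          # index of the next object to examine
        self.event_log = []      # stands in for the system event log

    def run_slot(self, budget_seconds, examine, clock=time.monotonic):
        """Run for at most budget_seconds, then stop and keep the cursor."""
        deadline = clock() + budget_seconds
        examined = 0
        while self.cursor < len(self.object_ids) and clock() < deadline:
            examine(self.object_ids[self.cursor])
            self.cursor += 1
            examined += 1
        finished = self.cursor >= len(self.object_ids)
        status = "scan complete" if finished else "will resume next slot"
        # Post a summary message at the end of the scheduled interval.
        self.event_log.append(f"{self.name}: examined {examined} objects, {status}")
        if finished:
            self.cursor = 0      # the next slot starts a fresh scan
        return finished
```

A slot that runs out of time returns False and leaves the cursor where it stopped, so the next call to run_slot continues with the first object that was not yet examined.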

 

The table below describes the Cascade background services.

 

Service

Description

Capacity Balancing

Attempts to keep the usable storage capacity balanced (roughly equivalent) across all storage nodes in the system. If storage utilisation for the nodes differs by a wide margin, the service moves objects around to bring the nodes closer to a balanced state.

Compression

Compresses object data to make more efficient use of physical storage space.

Content Verification

Guarantees data integrity of repository objects by ensuring that a file matches its digital hash signature. Cascade repairs the object if the hash does not match. Also detects and repairs metadata discrepancies.

Deduplication

Identifies and eliminates redundant objects in the repository, and merges duplicate data to free space.

Disposition

Automatic clean-up of expired objects. A namespace configuration policy authorises Cascade to automatically delete objects after their retention period expires.

Garbage Collection

Reclaims storage space by purging hidden data and metadata for objects marked for deletion, or left behind by incomplete transactions (unclosed NFS or CIFS files).

Scavenging

Ensures that all objects in the repository have valid metadata, and reconstructs metadata in case the metadata is lost or corrupted.

Migration

Migrates data off selected nodes or storage arrays so they can be retired.

Protection

Enforces data protection level (DPL) policy compliance, to ensure that the proper number of copies of each object exists in the system.

Replication

Copies one or more tenants from one Cascade system to another to ensure data availability and enable disaster recovery.

Shredding (secured deletion)

For security reasons, overwrites the storage locations where copies of a deleted object were stored, so that none of its data or metadata can be reconstructed. The default Cascade shredding algorithm uses three passes to overwrite an object.

Storage Tiering

Determines which storage tiering strategy applies to an object, and evaluates where the copies of the object should reside, based on the rules in the applied service plan.

Geodistributed Erasure Coding

Geodistributed erasure coding (Geo-EC) can be applied when Cascade spans three or more sites. This technology keeps data available even during the outage of a whole site. Geo-EC deployments consume 25-40% less storage than systems deployed with simple mirror replication.
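
The 25-40% storage figure is consistent with simple overhead arithmetic under two assumptions that this document does not state: a mirrored deployment keeps two full copies of each object, and Geo-EC splits each object into data chunks plus a single parity chunk, with one chunk per site. The sketch below works through that assumed model only; the function names are hypothetical and the real chunk layout may differ.

```python
def geo_ec_overhead(sites: int, parity: int = 1) -> float:
    """Raw bytes stored per byte of user data, with one chunk per site."""
    data_chunks = sites - parity
    if data_chunks < 1:
        raise ValueError("need at least one data chunk")
    return sites / data_chunks


def savings_vs_mirror(sites: int, parity: int = 1, mirror_copies: int = 2) -> float:
    """Fraction of raw storage saved compared with keeping mirror_copies full copies."""
    return 1 - geo_ec_overhead(sites, parity) / mirror_copies


for sites in range(3, 7):
    print(f"{sites} sites: overhead {geo_ec_overhead(sites):.2f}x, "
          f"saving {savings_vs_mirror(sites):.1%}")
```

Under these assumptions, three sites store 1.5 times the user data (a 25% saving over a two-copy mirror) and six sites store 1.2 times the user data (a 40% saving).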

 


Replication Topologies and Content Fencing

Cascade offers multisite replication technology called global access topology. With these bi-directional, active-active replication links, globally distributed Cascade systems are synchronised to allow users and applications to access data from the closest Cascade site. The results are improved collaboration, performance and availability. This is shown in the diagram below.

 

Organisations can replicate entire objects or just object metadata. A metadata-only strategy lets all clusters know about all objects while placing object payloads only where they are needed, which saves on WAN costs.

 

One practical use case for metadata-only replication is to create data fences, which allow organisations to share data, but ensure it stays hosted within a specific country or continent boundary. In this model, Cascade replicates metadata, but withholds mass movement of data files. Applications at the remote end are able to see files and directory structures, search metadata fields and even write content. In all cases, the final permanent resting place for the object is at the source. Global access topology supports flexible replication topologies that include chain, star and mesh configurations.
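
The data fence idea can be sketched as a replication step that always copies metadata but copies the payload only to sites inside the fence. This is a simplified illustration: the Site class, its fields and the replicate function are hypothetical names, and real replication is asynchronous and policy-driven rather than a single function call.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    region: str                                     # e.g. a country or continent code
    metadata: dict = field(default_factory=dict)    # pathname -> metadata dict
    payloads: dict = field(default_factory=dict)    # pathname -> file bytes


def replicate(pathname: str, source: Site, target: Site, fence_region: str) -> bool:
    """Replicate one object; return True if the payload was copied as well."""
    # Metadata is always replicated, so the remote site can see and search it.
    target.metadata[pathname] = dict(source.metadata[pathname])
    if target.region == fence_region:
        # The target is inside the fence: the file data may be copied too.
        target.payloads[pathname] = source.payloads[pathname]
        return True
    # Outside the fence: the payload stays at the source.
    return False
```

For example, with fence_region set to "EU", a site in the "US" region would list and search the object but would not hold a copy of its data.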

 

The replication process is object-based and asynchronous. The Cascade system in which the objects are initially created is called the primary system. The second system is called the replica. Typically, the primary system and the replica are in separate geographic locations and connected by a high-speed wide area network. The replication service copies one or more tenants, or namespaces, from one Cascade system to another, propagating object creations, object deletions and metadata changes. Cascade also replicates the following:

  • Tenant and namespace configuration
  • Tenant-level user accounts
  • Compliance and tenant log messages, and
  • Retention classes.

Search

Cascade provides access to metadata and content search tools that enable more elegant and automated queries for faster, more accurate results. Through these features you can gain a better understanding of the content of stored files, how content is used and how objects may be related to one another. This understanding can help you to enable more intelligent automation, along with big data analytics based on best-in-class metadata architecture.

 

Cascade software includes comprehensive built-in search capabilities that enable users to search for objects in namespaces, analyse a namespace based on metadata, and manipulate groups of objects to support e-discovery for audits and litigation. The search engine (Apache Lucene) executes on Cascade access nodes and can be enabled at both the tenant and namespace levels. Cascade supports two search facilities:

 

  • A web-based user interface, called the search console, provides an interactive way to create and execute search queries with “AND” and “OR” logic. Templates with dropdown input fields prompt users for various selection criteria, such as objects stored before a certain date, or larger than a specified size. Clickable query results are displayed on-screen. From the search console, search users can open objects, perform bulk operations on objects (hold, release, delete, purge, privileged delete and purge, change owner, set ACL), and export search results in standard file formats for use as input to other applications.
  • The metadata query API enables REST clients to search Cascade programmatically. As with the search console, the response to a query is metadata for the objects that meet the query criteria, in XML or JSON format.

 

In either case, two types of queries are supported:

  • An object-based query locates objects that currently exist in the repository based on their metadata, including:
    • system metadata
    • custom metadata
    • ACLs
    • object location (namespace or directory).

Note: Multiple, robust metadata criteria can be specified in object-based queries. Objects must be indexed to support this type of query.

 

  • An operation-based query provides time-based retrieval of object transactions. It searches for objects based on operations performed on them during specified time periods. It retrieves records of object creation, deletion and purge (user-initiated actions), and of disposition and pruning (system-initiated actions). Operation-based queries return not only objects currently in the repository but also deleted, disposed, purged or pruned objects.
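
The difference between the two query types can be sketched with a small in-memory model. This is an illustration only and does not show the metadata query API: the Operation record and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    timestamp: int   # seconds since an arbitrary epoch
    kind: str        # "create", "delete", "purge", "disposition" or "prune"
    pathname: str


def object_query(objects: dict, predicate):
    """Object-based: match objects that currently exist, by their metadata."""
    return sorted(path for path, meta in objects.items() if predicate(meta))


def operation_query(log, start: int, end: int, kinds=None):
    """Operation-based: match operation records with start <= timestamp < end."""
    return [op for op in log
            if start <= op.timestamp < end and (kinds is None or op.kind in kinds)]


# Current (indexed) objects and the history of operations on the namespace.
objects = {"/a": {"size": 10}, "/b": {"size": 500}}
log = [
    Operation(1, "create", "/a"),
    Operation(2, "create", "/c"),
    Operation(5, "delete", "/c"),
    Operation(9, "create", "/b"),
]

large = object_query(objects, lambda meta: meta["size"] > 100)
deleted = operation_query(log, 0, 6, kinds={"delete"})
```

Here large contains only objects that still exist, while deleted includes the record for "/c" even though that object is no longer in the repository.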

 

Each Cascade object supports up to 10 free-form XML metadata annotations, up to 1 GB in total. This gives separate teams the freedom to work and search independently. An analytics team may add annotations specific to its applications, which differ from those added by billing applications. XML annotations can provide a significant advantage over simple key-value pairs because the search engine can return more relevant results with XML.
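
To illustrate why structured XML annotations can be queried more precisely than flat key-value pairs, the sketch below stores two independent annotations for one object and looks up values by element path using Python's standard xml.etree.ElementTree module. The annotation names, element names and values are invented for the example, and the helper function is hypothetical. It does not show how Cascade indexes or searches annotations.

```python
import xml.etree.ElementTree as ET

# Two teams attach separate annotations to the same object.
annotations = {
    "analytics": "<analysis><model>churn</model><score>0.82</score></analysis>",
    "billing": ("<invoice><customer>ACME</customer>"
                "<amount currency='EUR'>120.50</amount></invoice>"),
}


def find_text(annotation_xml: str, path: str):
    """Return the text of the first element matching path, or None."""
    root = ET.fromstring(annotation_xml)
    node = root.find(path)
    return None if node is None else node.text


# Each team queries only its own annotation.
customer = find_text(annotations["billing"], "customer")
model = find_text(annotations["analytics"], "model")
# Attributes carry extra context that a flat key-value pair would lose.
currency = ET.fromstring(annotations["billing"]).find("amount").get("currency")
```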


 
