You are viewing documentation for Immuta version 2021.5.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Immuta Architecture

Audience: All Immuta Users

Content Summary: This page details the major components, installation, scalability, availability, and security of the Immuta platform.

Immuta Components

Immuta's server-side software comprises the following major components:

  • The Immuta Web Service: This service is responsible for all web-based user interaction with Immuta, metadata ingest, data fingerprinting, and backing the Query Engine, Spark partition server, and NameNode plugin. Although it is notionally a single web service, its fingerprinting functionality runs as a separate service internally and can be scaled independently.

  • The Immuta Metadata Catalog: This internal database maintains a small amount of data on each object you register with Immuta so that Immuta can provide responsive access to objects to your users while enabling you to dynamically create access policies on the objects.

  • The Immuta SQL Query Engine: This service tracks queryable data source configuration and exposes a SQL connection to the Immuta web service and to any SQL client, such as a SQL library in Python or R or a BI / Data Science tool. This service interprets client SQL queries, pushes queries to your connected business databases, applies policies, and returns query responses to your SQL clients.
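
As a rough illustration of the "any SQL client" point above, the sketch below runs a query through a generic Python DB-API 2.0 connection. It is a sketch under stated assumptions, not Immuta's documented client setup: the run_query helper and the patients table are hypothetical, and the standard-library sqlite3 module stands in for whichever driver your Query Engine connection actually uses, so that the example runs without an Immuta instance.

```python
import sqlite3
from contextlib import closing


def run_query(conn, sql, params=()):
    """Run one SQL statement on a DB-API 2.0 connection and return all rows.

    Hypothetical helper: it relies only on the cursor/execute/fetchall
    methods that PEP 249 drivers provide.
    """
    with closing(conn.cursor()) as cur:
        cur.execute(sql, params)
        return cur.fetchall()


# Stand-in for a connection to the Query Engine (assumption: in a real
# deployment this object would come from your SQL driver's connect() call).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "MD"), (2, "VA"), (3, "MD")],
)

# sqlite3 uses "?" placeholders; other drivers may use a different style.
rows = run_query(conn, "SELECT id FROM patients WHERE state = ? ORDER BY id", ("MD",))
print(rows)
```

Passing the connection in, rather than creating it inside the helper, keeps the query code independent of the driver and of any connection details.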

Installation

Immuta's standard installation is a Helm installation to a Kubernetes cluster. This could be a Kubernetes cluster you manage or a hosted solution such as AKS, EKS, or GKE. This is the preferred deployment because of the minimal administration needed to achieve scale and availability. If Kubernetes is not available, Immuta can be installed using Docker on a Linux server. Please see the Immuta Installation Guide for details on all deployment options.

Immuta's optional SparkSQL and Hadoop capabilities install as plugins on your cluster. Please see the Native Access Patterns Installation Guide for full details.

Scalability

Immuta is designed to be scalable in several dimensions. For the standard Immuta deployment, scaling requires minimal administrative effort beyond adding nodes to the Immuta system. Non-standard deployments can also scale, but doing so requires the time of skilled systems administrators.

  • The Immuta web service is stateless and horizontally scalable.
  • By keeping a metadata catalog rather than maintaining separate copies of data, Immuta's database is designed to remain small and responsive. By running replicated instances of this internal database, the catalog can scale in support of the web service.
  • The Immuta SQL Query Engine can scale horizontally with user load. When a query cannot be fully pushed down to your business databases, it is limited by the memory allocated to the individual Query Engine instance that runs it.
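
The memory limit in the last bullet can be pictured with a toy comparison. This is only an illustration of the general pushdown idea, not Immuta's implementation: sqlite3 stands in for a business database, and the events table and both helper functions are hypothetical. When the filter is pushed down, only matching rows leave the database; when it is not, every row has to be held by the querying process before it can be filtered.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (id INTEGER, level TEXT)")
db.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, "error" if i % 100 == 0 else "info") for i in range(10_000)],
)


def pushed_down(conn):
    """Filter inside the database: only matching rows are transferred."""
    rows = conn.execute("SELECT id FROM events WHERE level = 'error'").fetchall()
    return rows, len(rows)


def not_pushed_down(conn):
    """Fetch everything, then filter locally: all rows sit in memory first."""
    all_rows = conn.execute("SELECT id, level FROM events").fetchall()
    matching = [(row_id,) for row_id, level in all_rows if level == "error"]
    return matching, len(all_rows)


fast_rows, fast_held = pushed_down(db)
slow_rows, slow_held = not_pushed_down(db)

# Same answer either way, but very different numbers of rows held in memory.
print(fast_held, slow_held)
```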

High Availability

Because each component of Immuta is designed to be horizontally scalable, Immuta can be configured for high availability. Upgrades and major configuration changes may require scheduled downtime. Unplanned failures are handled differently: even if Immuta's master internal database fails, recovery happens within seconds. With the addition of an external load balancer, Immuta's standard deployment comes preconfigured with these availability features.
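
Because recovery from an internal database failure is described as taking seconds, a client that talks to Immuta may be able to ride out a failover by retrying briefly rather than failing at once. The sketch below shows one generic way to do that. The names retry_with_backoff and flaky_call are hypothetical, the failure is simulated, and the attempt count and delays are illustrative values rather than Immuta recommendations.

```python
import time


def retry_with_backoff(func, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call func(); on ConnectionError wait base_delay * 2**n, then retry.

    Re-raises the last error once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Simulated service that is unavailable for the first two calls.
calls = {"count": 0}


def flaky_call():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("service temporarily unavailable")
    return "ok"


# Record the delays instead of actually sleeping, so the demo runs instantly.
delays = []
result = retry_with_backoff(flaky_call, sleep=delays.append)
print(result, delays)
```

Only connection errors are retried here; other exceptions propagate immediately so that genuine query or permission errors are not masked.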

Security

Immuta’s core function of policy enforcement and management is designed to improve your data security. Beyond this primary feature, Immuta protects your data in several other ways.

  • Immuta is designed to leverage your existing identity management system when desired. This design allows Immuta to benefit from the work your security team has already done to validate users, protect credentials, and define roles and attributes.

  • By default, all network communications with Immuta and within Immuta are encrypted via TLS. This practice ensures your data is protected while in transit.

  • Immuta does not make any persistent copies of data.

  • Immuta does not store raw customer data. However, it may temporarily cache samples of that data for sensitive data discovery (SDD) and fingerprinting. These samples are stored in the metadata database and cache containers.
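
For the TLS point in the list above, a client written in Python can rely on the standard library's default TLS settings, which verify the server certificate and hostname. The sketch below only builds and inspects such a context. It does not connect anywhere, and the minimum-version setting is an illustrative choice rather than an Immuta requirement.

```python
import ssl

# Default client-side context: uses the system's trusted CA certificates
# and enables certificate and hostname verification.
ctx = ssl.create_default_context()

# Illustrative hardening choice (assumption, not an Immuta requirement).
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A context built this way can be handed to standard-library or third-party clients that accept an ssl.SSLContext, so verification is not silently disabled.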