Cluster Setup

A Helion Stackato cluster is made of a number of Helion Stackato VM instances linked together, each taking on a particular role for the platform. Each Helion Stackato VM contains all of the software necessary to take on any role, and boots by default as a "micro cloud" with enough roles enabled to function as a self-contained system.

After designating one of the instances as a "Core" node, other nodes are connected and are given roles in the cluster, distributing the computational tasks of the platform as best suits the particular deployment.

Roles

A Helion Stackato node can take on one or more platform roles.

The command line tool used to configure Helion Stackato servers is called kato. You can see a list of the available roles at the command line by running the kato info command.

Setup of cluster nodes is done using the kato node setup core and kato node attach sub-commands.

The kato info command will show:

  • assigned roles: roles currently configured to run on the node
  • available roles: roles which can be added with kato role add

Preparing the Core Node

In a Helion Stackato cluster, one node is dedicated as the core node. This node runs the controller, primary, base, and router roles, and can also take on additional roles.

Boot a Helion Stackato VM and set up the core node as described below, then add the other nodes and assign roles.

MBUS_IP

A static IP address is necessary to provide a consistent network interface for other nodes to connect to. This address is called the MBUS IP. If your IaaS or cloud orchestration software provides IP addresses which persist indefinitely and are not reset on reboot, you may not have to set this explicitly.

Take note of the internal IP address of the core node. It will be required when configuring additional nodes in the following steps, so that they can attach to the core node.

Make sure that the eth0 interface is reporting the correct IP address; this may not be the case if you have set a static IP but not yet rebooted or restarted networking. To check the IP address, run:

$ ifconfig eth0

If necessary, set the static IP address:

$ kato op static_ip

Note

If the IP address of the core node changes, you must reconfigure the cluster to use the new MBUS IP address. Run kato node migrate (or kato op static_ip) on the core node, then kato node attach on all other cluster nodes.

Hostname

Next, set the fully qualified hostname of the core node. This is required so that Helion Stackato's internal configuration matches the DNS record created for this system.

To set the hostname, run:

$ kato node rename hostname.example.com

Tip

The kato node rename command automatically restarts the affected roles (such as the router) on the core node. To defer this and restart the roles manually at a later time, add --no-restart to the command.

This hostname will become the basename of the "API endpoint" address used by clients (e.g. https://api.hostname.example.com).

Note

If you are building a cluster with multiple Routers separate from the core node, the load balancer or gateway router must take on the API endpoint address. Consult the Load Balancer and Multiple Routers section below.

Wildcard DNS

A wildcard DNS record is necessary to resolve not only the API endpoint, but all applications which will subsequently be deployed on the PaaS. Create a wildcard DNS record for the Core node (or Load Balancer/Router).
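For example, a minimal BIND-style zone entry, assuming 203.0.113.10 is the public IP address of the Core node (or Load Balancer); substitute your own hostname and address:

*.hostname.example.com.    IN    A    203.0.113.10

You can confirm that the wildcard resolves before proceeding:

$ dig +short app1.hostname.example.com
203.0.113.10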

Core Node

On the core node, execute the following command:

$ kato node setup core api.hostname.example.com

This sets up the core node with just the implicit controller, primary, and router roles. The router role is required on this node even if there are other routers in the cluster and the node is not exposed to the internet.

If you intend to set up the rest of the cluster immediately, continue by enabling the roles you ultimately intend to run on the Core node. For example, to set up a core node with the controller, primary, router, and dea roles:

$ kato node setup core api.hostname.example.com
$ kato role add dea

Then proceed to configure the other VMs by attaching them to the Core node and assigning their particular roles. Register Helion Stackato and add the license key before adding nodes to the cluster.

Attaching Nodes and Enabling Roles

Adding nodes to the cluster involves attaching the new VMs to the Core node's IP address using the kato node attach command. This command will check that the new node has a version number compatible with the core node before attaching it.

Roles can be added (or removed) on the new node after attaching using the kato role command, but it is generally preferable to enable roles during the kato node attach step using the -e (enable) option, as described below for each of the node types.
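For example, to enable or disable the dea role on a node that is already attached (a minimal sketch; kato role remove is the counterpart of the kato role add command shown elsewhere in this document):

$ kato role add dea
$ kato role remove dea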

Note

Setup and maintenance operations can be simplified if Passwordless SSH Authentication has been set up between the Core node and the other nodes in the cluster.
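A minimal sketch of setting this up, run from each non-Core node (assuming the default stackato user and that no keypair exists yet):

$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa    # generate a keypair with no passphrase
$ ssh-copy-id stackato@<CORE_IP>              # install the public key on the Core node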

Router Nodes

In smaller clusters, the Router enabled on the core node should be sufficient, and the core node can function as the gateway. To attach a node enabling just the router role:

$ kato node attach -e router <CORE_IP>

If a Router-only node is the gateway to Helion Stackato, the public DNS entry for the API endpoint must point to that node. For larger clusters requiring multiple gateway Routers, see the Load Balancer and Multiple Routers section below.

Data Services Nodes

Data services can share a single node (small clusters) or run on separate nodes (recommended for production clusters). To set up all available data services on a single node and attach it to the core node, run the following command on the data services node:

$ kato node attach -e data-services <CORE_IP>

Note

The Harbor port service needs a publicly routable IP and exposed port range if you want to provide externally accessible TCP and UDP ports for user applications. See the Harbor Requirements & Setup documentation for details.

DEA Nodes

Nodes which stage application code and run application containers are called Droplet Execution Agents (DEAs). Once the controller node is running, you can begin to add some of these nodes with the kato node attach command. To turn a generic Helion Stackato VM into a DEA and connect it to the core node:

$ kato node attach -e dea <CORE_IP>

Continue this process until you have added all the desired DEA nodes. You will be prompted to enter the password for the core node during the attach process. You can avoid this prompt by configuring passwordless authentication from the DEA node to the core node before attaching:

$ ssh-copy-id stackato@<CORE_IP>

Verification

To verify that all the cluster nodes are configured as expected, run the following command on the core node:

$ kato status --all

Removing Nodes

Use the kato node remove command to remove a node from the cluster. Run the following command on the core node:

$ kato node remove <NODE_IP>

Role Configuration using the Management Console

Once cluster nodes are connected to the core node, roles can be enabled or disabled using the Cluster Admin interface in the Management Console.

Example Clusters

Single-Node

This is a configuration (not actually a cluster) which you would not generally deploy in production, but it helps to illustrate the role architecture in Helion Stackato. A node in this configuration will function much like a micro cloud, but can be used as the starting point for building a cluster later.

All that is required here is to enable all roles except for mdns (not used in a clustered or cloud-hosted environment):

$ kato node setup core api.hostname.example.com
$ kato role add --all-but mdns

Three-Node

This is the smallest viable cluster deployment, but it lacks the fault tolerance of larger configurations:

  • 1 core node consisting of primary, controller, and router (and supporting processes)
  • 1 data-services node running the database, messaging and filesystem services
  • 1 DEA (Droplet Execution Agent) node

This configuration can support more users and applications than a single node, but the failure of any single node will impact hosted applications.

Five-Node

A typical small Helion Stackato cluster deployment might look like this:

  • 1 core node consisting of primary, controller, and router (and supporting processes)
  • 1 data-services node running the database, messaging and filesystem services
  • 3 DEA (Droplet Execution Agent) nodes

In this configuration, fault tolerance (and limited scalability) is introduced in the pool of DEA nodes. If any single DEA node fails, application instances will be automatically redeployed to the remaining DEA nodes with little or no application down time.

20-Node

A larger cluster requires more separation and duplication of roles for scalability and fault tolerance. For example:

  • 1 core node running the primary and controller roles (with supporting processes)
  • 1 supplemental Controller node (sharing a filesystem and PostgreSQL database with the core node)
  • 1 Load Balancer (Helion Stackato VM or hardware)
  • 2 Router nodes
  • 1 Filesystem service node
  • 1 PostgreSQL + MySQL data service node
  • 1 MongoDB, Redis, RabbitMQ + other data service node
  • 12 DEA (Droplet Execution Agent) nodes

In this configuration:

  • application instances span a larger group of DEA nodes so applications can be easily scaled to meet increasing demand
  • web requests are evenly distributed between two Router nodes, either of which can fail without any interruption of service
  • any data service node failure will be localized, not affecting data services on other nodes
  • the auxiliary controller balances the load on the Management Console and system management tasks

Roles Requiring Persistent or Shared Storage

Though all roles can run using the VM's default filesystem, in production clusters some roles should always be backed by a persistent filesystem (block storage or EBS volumes) to provide scalable storage space and easy snapshotting. Nodes with the following roles should have their /var/stackato/services directory on persistent storage:

  • Data Services: MySQL, PostgreSQL, MongoDB, Redis
  • Filesystem Service
  • Memcache
  • RabbitMQ
  • Harbor

Note

Although Memcache and Redis are in-memory data stores, system service info data is stored on disk, so backing them with a persistent filesystem is recommended.

In clusters with multiple Cloud Controllers, the nodes must share a common /home/stackato/stackato/data mount point as described below in order to work together properly.

See the Persistent Storage documentation for instructions on relocating service data, application droplets, and containers.
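As an illustration only (see the Persistent Storage documentation for the supported procedure), relocating /var/stackato/services onto a dedicated volume might look like the following sketch, assuming an empty block device at /dev/vdb and that the affected services are stopped first:

$ sudo mkfs.ext4 /dev/vdb                           # format the new volume (destroys any existing data)
$ sudo mount /dev/vdb /mnt                          # mount it temporarily
$ sudo rsync -a /var/stackato/services/ /mnt/       # copy the existing service data
$ sudo umount /mnt
$ echo '/dev/vdb /var/stackato/services ext4 defaults 0 2' | sudo tee -a /etc/fstab
$ sudo mount /var/stackato/services                 # remount the data on its original path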

Port Configuration

The Helion Stackato micro cloud runs with the following ports exposed:

Port   Type  Service
22     tcp   ssh
25     tcp   smtp
80     tcp   http
111    tcp   portmapper
111    udp   portmapper
443    tcp   https
3306   tcp   mysql
5432   tcp   postgresql
5678   tcp   DEA directory server
8181   tcp   upload server
9001   tcp   supervisord

On a production cluster, or a micro cloud running on a cloud hosting provider, only ports 22 (SSH), 80 (HTTP) and 443 (HTTPS) need to be exposed externally (e.g. for the Router / core node).

Within the cluster (i.e. behind the firewall), it is advisable to allow communication between the cluster nodes on all ports. This can be done safely by using the security group / security policy tools provided by your hypervisor.
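For example, on an OpenStack-based IaaS, allowing all TCP and UDP traffic between nodes that share a security group might look like this (a sketch assuming a group named stackato-cluster):

$ openstack security group rule create --protocol tcp --dst-port 1:65535 \
  --remote-group stackato-cluster stackato-cluster
$ openstack security group rule create --protocol udp --dst-port 1:65535 \
  --remote-group stackato-cluster stackato-cluster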

If you wish to restrict ports between some nodes (e.g. if you do not have the option to use security groups), the following summary describes which ports are used by which components. Source nodes initiate the communication, Destination nodes need to listen on the specified port.

Port Range     Type     Source           Destination        Required by
22             tcp      all nodes        all nodes          ssh/scp/sshfs
4222           tcp      all nodes        controller         NATS
3306           tcp      dea,controller   mysql nodes        MySQL
5432           tcp      dea,controller   postgresql nodes   PostgreSQL
5454           tcp      all nodes        controller         redis
6464           tcp      all nodes        all nodes          applog (redis)
7000 - 7999    tcp      all nodes        all nodes          kato log tail
7474           tcp      all nodes        all nodes          config (redis)
8181           tcp      dea,router       controller         upload server
9001           tcp      controller       all nodes          supervisord
9022           tcp      dea              controller         droplets
9022           tcp      controller       dea                droplets
9025           tcp      controller       router             stackato-rest
9026           tcp      router           controller         stackato-rest
41000 - 61000  tcp,udp  dea,controller   service nodes      service gateways
41000 - 61000  tcp,udp  router           dea                router,harbor

Each node can be internally firewalled using iptables to apply the above rules.
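For example, restricting NATS (port 4222) on the core node to the cluster subnet might look like the following sketch, assuming the cluster occupies 10.0.0.0/24:

$ sudo iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 4222 -j ACCEPT   # allow cluster nodes
$ sudo iptables -A INPUT -p tcp --dport 4222 -j DROP                    # drop everything else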

Comments:

  • Ports 80 and 443 need only be open to the world on router nodes.
  • Port 4222 should be open on all nodes for NATS communication with the MBUS IP (core Cloud Controller).
  • Port 9022 should be open to allow the transfer of droplets to and from the DEAs and Cloud Controllers.
  • Port 7845 is required if you plan to stream logs from all nodes in a cluster using the kato log tail command.
  • External access on port 22 can be restricted if necessary to the subnet you expect to connect from. If you are providing the stackato ssh feature to your users (recommended), define a distinct security group for the public-facing Cloud Controller node that is the same as a generic Helion Stackato group, but has the additional policy of allowing SSH (Port 22) from hosts external to the cluster.
  • Within the cluster, port 22 should be open on all hosts to allow administrative access over SSH. Port 22 is also used to mount Filesystem service partitions in application containers on the DEA nodes (via SSHFS).
  • The optional Harbor port service has a configurable port range (default 41000 - 61000) which can be exposed externally if required.

Service Nodes

In addition to the ports listed above for service nodes and gateways, several service nodes assign a port for each individual user-requested service instance. These ranges should be kept open between DEA nodes and their respective service nodes. The default ranges are:

  • harbor: 35000 - 40000
  • memcached: 45001 - 50000
  • mongodb: 15001 - 25000
  • rabbit: 35001 - 40000
  • rabbit3: 25001 - 30000
  • redis: 5000 - 15000

Note

You can check the currently configured port range for each service with kato config (e.g. kato config get redis_node port_range).

Container Allowed Hosts & Ports

For security reasons, Docker application containers restrict access to hosts on the eth0 subnet. By default, only ports and hosts for built-in services and components (for example, service instances bound to an application) are explicitly allowed.

To configure a cluster for host and port access, you must determine the IP address of each DEA node using the kato node list command and then ssh to each DEA node, for example:

$ ssh -i myClusterPublicKey stackato@198.51.100.0

The following commands display the current configuration:

  • fence docker/allowed_subnet_ips: Display a list of all allowed IP addresses, for example:

    $ kato config get fence docker/allowed_subnet_ips
    - 198.51.100.0
    - 198.51.100.1
    - 198.51.100.2
    
  • fence docker/allowed_host_ports: Display a list of all allowed ports, for example:

    $ kato config get fence docker/allowed_host_ports
    - 80
    - 443
    - 8123
    - 3306
    - 6379
    

The following commands modify the current configuration:

  • fence docker/allowed_subnet_ips <ip-address:port>: Delete an IP address from the allowed_subnet_ips list, for example:

    $ kato config pop fence docker/allowed_subnet_ips 192.0.2.24:6379
    

    Note

    You can allow an individual IP address or an entire CIDR block; port settings apply only to individual IP addresses, not to CIDR blocks (see the example after this list).

  • fence docker/allowed_host_ports <port>: Delete a port from the allowed_host_ports list, for example:

    $ kato config pop fence docker/allowed_host_ports 6379
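
As noted above, an entire CIDR block can be allowed (without a port setting). A brief example, assuming your services live on the 198.51.100.0/24 subnet:

$ kato config push fence docker/allowed_subnet_ips 198.51.100.0/24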
    

The following settings allow or restrict access from application containers:

  • fence docker/allowed_host_ports: If applications need access to custom services on a specific port, but the IP address changes or is not known ahead of time, add the port to this list, for example:

    $ kato config push fence docker/allowed_host_ports 25
    

    Warning

    Because this action opens the port to all IP addresses, do not perform it on production systems.

  • fence docker/allowed_subnet_ips: If the specific IP address for a service is static and known, add the IP address with or without the port specification, for example:

    $ kato config push fence docker/allowed_subnet_ips 198.51.100.0
    $ kato config push fence docker/allowed_subnet_ips 198.51.100.1:9001
    
  • fence docker/block_network_ips: To explicitly block access to a specific IP address (internal or external):

    $ kato config push fence docker/block_network_ips 203.0.113.0
    

To apply these changes to new application containers, restart the DEA role. To apply them to applications that are already deployed, restart those applications so their containers pick up the new settings.
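For example, restarting the DEA role on a DEA node (a sketch using the kato stop/start commands shown elsewhere in this document):

$ kato stop dea
$ kato start dea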

Warning

Two additional settings are exposed in kato config but in most cases should not be modified:

  • fence docker/exposed_container_ports: Container ports to be accessed over the subnet (internal services).
  • fence docker/network_interface: The docker bridge interface.

Harbor (Port Service) Node Configuration

To enable port forwarding for user applications, you must configure the optional Harbor TCP/UDP port service on a node with a public network interface. The security group or firewall settings for this node should make the configured port range publicly accessible. For complete configuration instructions, see Harbor Setup.

UTM / Gateway Firewalls

Helion Stackato upgrades, patches, and the staging of user applications require the download of packages from external sources. This is most often done over HTTP (port 80) or HTTPS (port 443).

If a UTM (unified threat management) device or other network gateway software is inspecting traffic between the Helion Stackato nodes and upstream package repositories, then user application staging, patching, and upgrades can fail because of timeouts or dropped packets.

Gateway devices and software should be configured to whitelist the upstream sources listed in the Proxy Settings for Upgrades as well as any package repositories or download sources required by buildpacks.

If packet inspection of all inbound traffic is a requirement in your network, consider setting up offline buildpacks and internal package repository mirrors (for example, nexus, debmirror, or rubygems-mirror).

Multiple Controllers

To improve performance, you can have a Helion Stackato cluster run multiple controller nodes on separate VMs. To do this, all controller nodes must share the following two important directories on a high-availability filesystem server:

/home/stackato/stackato/data
/var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads

Warning

These directories are not empty. After creating the symlinks, you must copy their original contents into the new shared directory.

Note

You may use any appropriate type of filesystem, storage server, and network mount. However, when following the recommended workflow that uses sshfs, ensure that the following options are set:

  • idmap=user
  • reconnect
  • allow_other

  1. Create a shared filesystem on a Network Attached Storage device.

  2. Stop the controller process on the core node:

    $ kato stop controller
    
  3. Perform the following actions on the core node and on each additional controller node:

  1. Create a mount point:

    $ sudo mkdir /mnt/controller
    
  2. Give stackato ownership of the mount point:

    $ sudo chown stackato:stackato /mnt/controller
    
  3. Allow non-root users to specify mount options:

    $ sudo sed -i "s/#user_allow_other/user_allow_other/" /etc/fuse.conf
    
  4. Mount the shared filesystem on the mount point (only on secondary controllers):

    $ sshfs -o idmap=user -o reconnect -o allow_other -o ServerAliveInterval=15 \
    stackato@10.0.0.3:/mnt/controller /mnt/controller
    
  5. Move aside the original /home/stackato/stackato/data directory:

    $ mv /home/stackato/stackato/data /home/stackato/stackato/data.old
    
  6. Create a symlink from /home/stackato/stackato/data to the mount point:

    $ ln -s /mnt/controller /home/stackato/stackato/data
    
  7. Move the original /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads directory:

    $ mv /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads \
    /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads.old
    
  8. Create a symlink from /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads to the mount point:

    $ ln -s /mnt/controller /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads
    
  4. On the core node, copy the controller data and the droplets into the shared directory:

    $ cp -r /home/stackato/stackato/data.old/* \
    /var/stackato/data/cloud_controller_ng/tmp/staged_droplet_uploads.old/* /mnt/controller
    
  5. Start the controller process:

    $ kato start controller
    
  6. To enable only the controller process, run the following command on the additional controller nodes:

    $ kato node attach -e controller <CORE_IP>
    

Note

Although the router role must be enabled on the core node, it is not required for the additional controller nodes.
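Once the controllers are attached, a quick sanity check on each controller node confirms that the symlink and the shared mount are in place (a minimal sketch):

$ readlink /home/stackato/stackato/data    # should print /mnt/controller
$ mount | grep /mnt/controller             # secondary controllers should show the sshfs (fuse) mount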

Load Balancer and Multiple Routers

For large-scale deployments that require multiple router nodes, you must configure a load balancer that will distribute connections between the routers.

Load Balancers and SSL Termination

You can configure load balancers to either terminate SSL connections or to pass them through to the routers. These configurations require different router settings.

  • If the load balancer terminates SSL:

    • Use HTTP or HTTPS listeners, or ensure that the load balancer adds the appropriate X-Forwarded headers.

    • In the router configuration, set prevent_x_spoofing to false.

    • Backends (connections to router nodes) should forward using HTTP.

      Note

      Using HTTPS is possible, but this requires chaining your certificates on the router nodes.

  • If the load balancer does not terminate SSL:

    • Use TCP listeners.
    • Set prevent_x_spoofing to true.
    • Ensure that listeners on port 443 forward to port 443 on the backend.
    • The certificates should be identical on all router nodes.
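
The prevent_x_spoofing setting mentioned above can be changed with kato config on the router nodes. The component name below is an assumption (in some releases the router component is named router2g); verify the key location with kato config get before setting it:

$ kato config set router2g prevent_x_spoofing false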

Helion Stackato Router as a Load Balancer

Though most users will prefer to use a hardware load balancer or elastic load balancing service provided by the cloud hosting provider, a Helion Stackato VM can be configured to take on this role.

The kato node setup load_balancer command retrieves IP addresses of every router in the cluster and configures an nginx process to distribute load (via round-robin) among a pool of Routers and handle SSL termination.
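Conceptually, the generated configuration resembles the following nginx snippet. This is an illustrative sketch only, not the file kato writes; the router IP addresses and certificate paths are placeholders:

upstream stackato_routers {
    server 10.5.31.140:80;                       # Router node (placeholder address)
    server 10.5.31.145:80;                       # Router node (placeholder address)
}
server {
    listen 443 ssl;
    ssl_certificate     /etc/ssl/stackato.crt;   # placeholder path
    ssl_certificate_key /etc/ssl/stackato.key;   # placeholder path
    location / {
        proxy_pass http://stackato_routers;      # nginx balances round-robin by default
    }
}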

For example, to set up a cluster with a Helion Stackato Load Balancer and multiple Routers:

Rename the Load Balancer

The Load Balancer is the primary point of entry to the cluster. It must have a public-facing IP address and take on the primary hostname for the system as configured in DNS. Run the following on the Load Balancer node:

$ kato node rename hostname.example.com

Set up the Core Node

The core node will need to temporarily take on the API endpoint hostname of the Helion Stackato system (i.e. the same name as the Load Balancer above). Run the following on the core node:

$ kato node rename hostname.example.com

If it is not already configured as the core node, do so now:

$ kato node setup core api.hostname.example.com

The kato node rename command above is used to set internal Helion Stackato parameters, but all hosts on a network should ultimately have unique hostnames. After setup, rename the core node manually by editing /etc/hostname and /etc/hosts, then run sudo service hostname restart.
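A brief sketch of the manual rename, assuming stackato-core as the new unique hostname (adjust the sed pattern to match your /etc/hosts entries):

$ echo stackato-core | sudo tee /etc/hostname
$ sudo sed -i 's/hostname\.example\.com/stackato-core/g' /etc/hosts
$ sudo service hostname restart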

Set up Supplemental Routers

As with the core node, you will need to run kato node rename on each router with the same API endpoint hostname. Run the following on each Router:

$ kato node rename hostname.example.com

Then enable the 'router' role and attach the node to the cluster:

$ kato node attach -e router <MBUS_IP>

As above, rename each host manually after configuration to give them unique hostnames. The MBUS_IP is the IP address of the Core node's cluster-facing network interface (usually eth0).

Configure the Helion Stackato Load Balancer

Note

A Helion Stackato node configured as a Load Balancer cannot have any other roles enabled.

Attach the Helion Stackato VM to the core node:

$ kato node attach <MBUS_IP>

To set up the node as a Load Balancer automatically:

$ kato node setup load_balancer --force

This command fetches the IP addresses of all configured routers in the cluster.

To set up the Load Balancer manually, specify the IP addresses of the Router nodes. For example:

$ kato node setup load_balancer 10.5.31.140 10.5.31.145

Helion Stackato Load Balancer SSL Certificates

The Helion Stackato load balancer terminates SSL connections. SSL certificates must be set up and maintained on this node and on the backend Router nodes in order for application SSO and AOK to work correctly.

See the Using your own SSL certificate and CA Certificate Chaining sections for Helion Stackato Load Balancer instructions.