Troubleshooting

Server Log Files

To troubleshoot Helion Stackato or monitor its logs with a third-party tool, look in the ~/stackato/logs/ directory on the Helion Stackato server.

These logs are rotated daily by logrotate. Up to three days' worth of compressed logs are kept; after that, the oldest archive is deleted.

To modify the log rotation, edit the /etc/logrotate.d/stackato file as needed. To disable Helion Stackato log rotation, delete the file or move it to another directory.
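
As an illustration of such an edit, the stanza below keeps seven days of compressed archives instead of three. It is a sketch only: the absolute log path and the exact set of directives are assumptions, not the contents of the file shipped with Helion Stackato, so adapt the values in your existing file rather than copying this verbatim.

```
# Hypothetical stanza: the path and directive set are assumptions,
# not the file shipped with Helion Stackato.
/home/stackato/stackato/logs/*.log {
    daily           # rotate once per day
    rotate 7        # keep seven archives before deleting the oldest
    compress        # gzip rotated logs
    missingok       # do not report an error if a log file is absent
    notifempty      # skip rotation for empty logs
}
```

Running logrotate -d /etc/logrotate.d/stackato performs a dry run that prints what would be rotated without changing any files, which is a safe way to check an edit.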

health_manager.log

The health_manager process is responsible for monitoring containers and making sure they are relaunched if there is a problem. The health_manager.log file contains information on all application instances running on the system.

Sometimes you may see CRASHED notifications such as this:

[2013-04-07 22:42:01.329571] hm - pid=2701 tid=5b3b fid=5cbf  DEBUG -- healthmanager.status: {"droplet":119,"state":"CRASHED"}

A CRASHED status means that the app crashed within the container and the health_manager is no longer able to find a running process that looks like that app (for example, for a Node app, the node process is not running; for a Java app, there is no Java process). Most of the time this is a problem with the app within the container.

Cross-reference the droplet ID in the dea.log or stager.log files to find the application name, then check the logs for the application (for example, stackato crashlogs). By far the most common cause of crashing apps is a lack of memory, so allocating more memory to an app is a good first step to see if this fixes the problem.
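
The cross-referencing step can be sketched with a pair of small shell helpers. Both function names are hypothetical, and the first assumes status lines shaped like the sample shown earlier; the second searches for the ID as a whole word and makes no assumption about the layout of dea.log or stager.log.

```shell
# Print the unique droplet IDs that a health_manager log reports as CRASHED.
# Assumes status lines shaped like:
#   ... healthmanager.status: {"droplet":119,"state":"CRASHED"}
crashed_droplets() {
    grep '"state":"CRASHED"' "$1" |
        sed -n 's/.*"droplet":\([0-9][0-9]*\).*/\1/p' |
        sort -un
}

# Print every line (with file name and line number) that mentions a droplet
# ID as a whole word. Hypothetical helper: it only assumes that the ID
# appears somewhere in the text of the files it is given.
find_droplet() {
    id=$1
    shift
    grep -Hnw -- "$id" "$@"
}
```

For example, crashed_droplets ~/stackato/logs/health_manager.log prints one ID per line, and find_droplet 119 ~/stackato/logs/dea.log ~/stackato/logs/stager.log lists the lines in which to look for the application name.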

Inspecting User Apps as an Admin

Helion Stackato Admin accounts have root-like privileges. They can inspect all user applications and service instances running on the system.

Admin accounts can use the stackato group command to inspect applications and service instances for any group or user. For example:

$ stackato group jane.doe@example.com

This sets the scope of subsequent operations to the specified user. Use stackato group --reset to return to the scope of the logged-in admin user.
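
Because the scope change persists until it is reset, it can help to wrap the sequence so that the reset is never forgotten. The function below is a minimal sketch: its name is hypothetical, and it assumes the stackato client is installed and already logged in as an admin.

```shell
# List a user's applications from an admin session, then restore the
# admin's own scope. Hypothetical wrapper around the commands shown above.
inspect_user_apps() {
    user=$1
    stackato group "$user" || return 1   # narrow the scope to the given user
    stackato apps                        # list that user's applications
    status=$?
    stackato group --reset               # always return to the admin's scope
    return "$status"
}
```

Calling inspect_user_apps jane.doe@example.com runs the three commands in order and returns the exit status of stackato apps.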

System Diagnosis

There may be cases where resolving an issue requires a complete view of the system metrics. This functionality is provided by the stackato admin report command. It generates a single file (by default named <target>-report.tgz) that can be provided to the Helion Stackato support team for analysis:

$ stackato admin report

The file is several megabytes in size and will take a few seconds to generate. Send it, along with a detailed description of your problem, to stackato-support@hpe.com.

Specific Cases

  • When pushing an app, the Helion Stackato client reports OK but the app is not running:

    The final output from pushing an app should look like:

    Staging Application: OK
    Starting Application: OK
    

    If the app is being pushed to multiple instances, the client waits until at least one instance is running, and exits at that point (it does not wait until all instances are active). If afterwards you run stackato apps and find the Health status at 0%, it is because the app crashed after starting successfully, not because the Helion Stackato client reported incorrectly.

  • DNS queries returning connection refused:

    This error is reported when the Helion Stackato server does not have an IP address. To investigate and resolve the problem, try the following:

    • Verify the ARP tables on the hypervisor host, and on the Helion Stackato server through its tty console:

      $ arp -n
      
    • Check that the DHCP client is running:

      $ pgrep dhclient
      $ grep dhclient /var/log/syslog
      
    • Connect to the DHCP server and verify that it is receiving client requests from the Helion Stackato server.

    • If your network is statically configured, assign an IP address on the Helion Stackato server by editing the interfaces file:

      /etc/network/interfaces
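
For the statically configured case, an entry in /etc/network/interfaces might look like the following sketch. It uses the standard ifupdown syntax for that file; the interface name eth0 and every address (taken from the 192.0.2.0/24 documentation range) are assumptions that must be replaced with values from your own network.

```
# Hypothetical values: replace the interface name and all addresses.
auto eth0
iface eth0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    gateway 192.0.2.1
```

The change takes effect once the interface is brought down and up again (for example, with ifdown and ifup) or the server is rebooted.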