XAP on Cloudify Part 1

Introduction

GigaSpaces XAP has lacked an official Cloudify recipe for some time now, and this series of posts describes my efforts to create one. Automating the deployment and management of the potentially complex suite of services that XAP represents is a challenge in itself, and marrying that with the sometimes overlapping constructs of Cloudify is even more challenging. This post covers the rationale for the effort, a bit of its history, and a description of the recipes as they exist today. This is a work in progress, and anyone interested is encouraged to contribute.

Rationale

There are many use cases for automating the deployment and management of XAP using Cloudify constructs. A few:

  • Repeatable deployments. This is one of the standard Cloudify (and devops) virtues. Repeatable, automated deployments are less error-prone and more convenient.
  • Platform independence. A Cloudify recipe serves as a platform-independent (in the infrastructure sense) description of deployment and configuration steps. A XAP recipe for Cloudify provides repeatable deployment on any Cloud and on non-virtual (or statically created virtual) hardware.
  • Composition. While it is conceivable to have a complete solution implemented entirely in XAP, in reality this is rare. XAP is generally married at least to a database, and is almost always part of a stack where it takes the position of traditional middleware. With a Cloudify recipe, the deployment and management of an entire stack that includes XAP can be automated.
  • Continuous Delivery. A stack including XAP that is automated via Cloudify can expose remote commands that are meaningful for CI/CD. These can provide stack-wide delivery functionality by integrating with popular CI build tools via a common interface.
  • Custom scaling behavior. By scaling using Cloudify rather than the XAP ESM, XAP scaling and rebalancing can be fully customized.
  • Improved dynamism. By describing a complete stack that includes XAP, the ability to scale dynamically in a holistic fashion is greatly increased. A classic example is exploiting public cloud capacity during volume spikes. With a XAP recipe, entire XAP-centric stacks can be spun up on demand, or in anticipation of demand.
  • Derived works. By describing XAP in a recipe, the door is opened a bit to create derived cloud services based on XAP, such as Grid as a Service and Replication as a Service, among others. The recipes are designed to be easy to inherit from to support such development.

Using The Recipes

Currently, you can grab the recipes from my fork of the cloudify-recipes repo here. The service recipes are under services/xap9x. A simple application recipe is also available that starts a single container at apps/xap9x-tiny.

The xap9x-tiny application will run without modification or reconfiguration. It uses a XAP 9.6 b9500 distribution and only deploys a single container, so no license is required (at least as currently configured). If you want to experiment with more nodes, or larger containers, you’ll have to provide a license key. This key will need to be set in the license property in both the management and container recipes’ properties files. If you have your own XAP distribution you want to use, you’ll need to put it on an HTTP server in a place accessible from your target infrastructure. Then you’ll need to modify the related properties in both the management and container recipes.

Besides the license key, there are settings that may need to be tweaked to run a larger grid. For larger containers, you’ll possibly need to use a different machine template (perhaps 8GB of system memory or more). To do this, you’ll need to create a machine template that refers to the proper cloud machine id, or IP address in the case of BYON. The easiest way to do this is to copy the default “SMALL_LINUX” or “SMALL_UBUNTU” template into a separate file, modify the hardwareId to use the larger machine, and then add it to Cloudify via the “add-templates” CLI command. Of course, you can instead add this template permanently to your *-cloud.groovy script. Once you have your larger machine, you’ll need to utilize it. The container recipe has a property named gsc_jvm_options, whose value is passed to the associated GSC in the GSC_JAVA_OPTIONS environment variable when the container starts (described here).
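As a minimal sketch of that pass-through (written in Python for illustration; it is not the recipe’s actual code), the following shows a property value being copied unchanged into the environment a GSC would be started with. The names gsc_jvm_options and GSC_JAVA_OPTIONS come from the recipe as described above; the function name and the heap sizes are assumptions made up for the example.

```python
def gsc_environment(properties, base_env=None):
    """Return a copy of base_env with GSC_JAVA_OPTIONS set from the recipe property."""
    env = dict(base_env or {})
    # The property value is passed through unchanged; an unset property
    # results in an empty option string.
    env["GSC_JAVA_OPTIONS"] = properties.get("gsc_jvm_options", "")
    return env


# Hypothetical heap settings for a container on a larger machine template.
example_properties = {"gsc_jvm_options": "-Xms4g -Xmx4g"}
example_env = gsc_environment(example_properties, {"PATH": "/usr/bin"})
```

Whatever heap size you choose has to fit within the memory of the machine template the container runs on.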

When a cluster becomes large enough, the management machine may also need to be beefed up.  The management recipe has three settings for tailoring the various JVMs, and their function is equivalent to the gsc_jvm_options. They are:

  • gsm_jvm_options
  • lus_jvm_options
  • webui_jvm_options

To access the XAP Web UI after launching, use the link shown in the “Details” area of the management service.

Custom Commands

Some basic custom commands are included:

deploy-pu – deploy a processing unit
Parameters:

  • name – the displayed name of the processing unit
  • url – the URL the pu can be downloaded from (the recipe uses an HTTP GET)
  • schema – the space schema for the pu. Defaults to partitioned-sync2backup.
  • partitions – primary partition count
  • backups – number of backups per partition. Can be 0.
  • maxpervm – maximum instances per container VM (not Cloud VM).
  • maxpermachine – maximum instances per physical machine or Cloud VM.

deploy-grid – deploy a space
Parameters:

  • name – the name of the space
  • schema – the space schema. Defaults to partitioned-sync2backup.
  • partitions – primary partition count
  • backups – number of backups per partition. Can be 0.
  • maxpervm – maximum instances per container VM (not Cloud VM).
  • maxpermachine – maximum instances per physical machine or Cloud VM.

undeploy-grid – undeploy a space
Parameters:

  • name – the name of the space

Note: This command can be used to undeploy an arbitrary pu.
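The partitions and backups parameters of deploy-pu and deploy-grid together determine how many instances a deployment asks the grid to host: one primary per partition, plus its backups. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not part of the recipes.

```python
def total_instances(partitions, backups):
    """Number of instances requested: each primary partition plus its backups."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    if backups < 0:
        raise ValueError("backups cannot be negative")
    return partitions * (1 + backups)


# Example: 4 primary partitions with 1 backup each.
example_total = total_instances(4, 1)
```

This is a quick way to judge whether a deployment will fit the containers you have running before invoking the command.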

A Little History

In my previous work on Storm integration, I developed a simple recipe based on a single-node XAP deployment. This was largely to avoid the potential complexities of a full-blown XAP recipe, and a basic Storm integration doesn’t really require more than a single node anyway.

Nevertheless, much of that initial recipe was useful as a starting point for the full recipe. Initially, I wanted a single service recipe to represent XAP in Cloudify. This approach would be the simplest from the user perspective, but it wasn’t doable in an elegant way. In the end, I chose a two-recipe approach: a xap-management recipe and a xap-container recipe. xap-management orchestrates the management services (GSM, LUS, and Web UI), and xap-container manages GSCs (containers). The management recipe serves as the hub for custom commands that perform cluster-wide operations such as deployment.

The Recipes: Design

Since a XAP cluster (excluding GSAs) consists of a few management processes and many containers, it is logical to have (at least) two recipes: one for management and one for containers. One approach might be to simply create a recipe for every type of grid process: ESM, GSM, GSA, LUS, and GSC. This would supply some additional flexibility, but since at most two of each are recommended in grids of any size, the flexibility gained from having many recipes is minimal. Also, since Cloudify itself provides GSA (process watchdog) and ESM (dynamic scaling) functionality, there is no need to start additional instances of these. This leaves the GSM (grid management/deployment) and the LUS (lookup service). For the container part, the recipe only needs to start a GSC. At this point in the development process, dynamic rebalancing after an autoscaling event is not addressed.

The Recipes: xap-management

xap-management starts the management services of XAP, the GSM and LUS. It also starts the Web UI. Currently it simply starts the gsm.sh script and uses the embedded LUS. One enhancement target would be to make the starting of a separate LUS a configuration option. The recipe is elastic with a maximum of two instances. Critical configuration items (in the xap-management-service.properties files):

  • template – the cloud template name for the recipe to use.
  • gsm_jvm_options – java options passed to the GSM JVM
  • lus_jvm_options – java options passed to the LUS JVM
  • webui_jvm_options – java options passed to the Web UI JVM
  • uiport – the HTTP port for the Web UI to listen on.
  • lusport – the LUS port. This port essentially defines a grid and separates grids from each other and from Cloudify itself. It must be the same in both the management and container recipes, and different from the LUS port that Cloudify uses.
  • license – if running more than the free license allows, this property must be set to a valid XAP license string.
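The lusport constraint is easy to get wrong when editing two properties files by hand. The following Python sketch shows one way to check it before deploying. It assumes the properties files contain simple key = value lines, and it takes Cloudify’s own LUS port as a parameter rather than assuming its value; the function names and the port numbers in the example are hypothetical.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' or '//' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Values may be quoted strings; drop surrounding quotes.
            props[key.strip()] = value.strip().strip('"')
    return props


def check_lusport(management_text, container_text, cloudify_lus_port):
    """Return a list of problems with the lusport settings (empty if none)."""
    mgmt = parse_properties(management_text).get("lusport")
    cont = parse_properties(container_text).get("lusport")
    if mgmt is None or cont is None:
        return ["lusport is missing from one of the properties files"]
    if mgmt != cont:
        return ["lusport differs between management and container recipes"]
    if mgmt == str(cloudify_lus_port):
        return ["lusport collides with the LUS port used by Cloudify"]
    return []


# Hypothetical values: both recipes on 4242, Cloudify's LUS on 4174.
example_problems = check_lusport("lusport = 4242", "lusport = 4242", 4174)
```

An empty list means the two recipes agree on the port and it does not clash with the value you supplied for Cloudify.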

The Recipes: xap-container

xap-container starts GSCs in the cluster. The recipe is elastic and doesn’t practically limit the number of containers that can be started. This recipe is much simpler than the management recipe, and has no custom commands for public consumption. It does have a custom command for editing the node’s hosts file. This is used for providing a symbolic name for the lookup service, and for reacting to management service relocations. Except for the JVM options, the configuration parameters are the same as those of the management recipe. Since only a single process is started (a GSC), there is only one property to hold JVM options: gsc_jvm_options (which maps to the standard GSC_JAVA_OPTIONS environment variable).

What’s Missing and What’s Next

There are significant omissions and plans for these recipes.  An obvious omission is partition rebalancing on a scaling event.  Another is the need for far more metrics, and a replication gateway recipe.  But this is a start. In future posts, I’ll describe in more detail what’s missing and why, some of the implementation war stories, and future plans.

Conclusion

In my opinion, XAP has needed a general purpose Cloudify recipe, and this is a start in that direction. This is a work in progress. It has been tested on EC2, HPCS, and a BYON cluster. Comments and contributions are welcome, as this will be an ongoing effort. Again, the code is here, at least for now.


