Introduction
At Instaclustr, we host some of our customers’ nodes on the Amazon Web Services (AWS) platform. Depending on the customer’s system requirements, we deploy one of two types of instances on AWS:
- EBS-backed nodes, where data is stored externally to the instance on Amazon Elastic Block Store (EBS); and
- Instance-store (ephemeral storage) nodes where data is stored on a local SSD.
See our blog post ‘Cassandra on AWS EBS Infrastructure’ for more information on our EBS-based nodes. In short, EBS offers better price/performance for most scenarios, but there are still some cases where local SSDs win. EBS volumes also survive instance stop/start cycles, which allows you to resize instances, for example, but this is less significant with Cassandra, where the system can rebuild a node from replicas of the data.
The Problem
We use CoreOS as our base operating system, which lets us run Docker-containerised applications easily without having to constantly maintain the underlying operating system. Unfortunately, the official AMIs provided by the CoreOS team are only EBS-based, so using them for instance-store nodes results in the wrong storage configuration on our systems.
Previously, we built our own custom CoreOS AMI, which was pre-formatted, included some of our utilities and was configured to mount instance storage properly. But maintaining a custom image has several potential issues:
- Using a custom image, we may experience unexpected behaviour on the nodes.
- Using a custom image for some systems, and vanilla images for others may lead to inconsistent behaviour.
- If we’re not using the default CoreOS update mechanism, we need to build a new custom image for every new release we decide to deploy. Even with automatic tooling, this may require additional work.
- Additional maintenance requirements for our software engineers.
So, to make our environment more efficient, we decided to find a way to run all our systems with the official CoreOS images.
The Solution
The solution to the problem was simpler than we had imagined. First, to test that the official images can work with instance-store nodes, let’s manually create an instance with added ephemeral storage devices:
    ec2-run-instances ami-cbfdb2a1 --aws-access-key --aws-secret-key -t m3.xlarge -k -g -b "/dev/xvdb=ephemeral0" -b "/dev/xvdc=ephemeral1"
When the instance starts, we can check that it has the root device (/dev/xvda) and two instance-store volumes (/dev/xvdb, /dev/xvdc). This confirms that the instance-store volumes are available to an instance launched from the official image.
Note: for generic mounting of the above volumes on the CoreOS node, one can use the following cloud-config example:
    coreos:
      units:
        - name: media-ephemeral.mount
          command: start
          content: |
            [Mount]
            What=/dev/xvdb
            Where=/media/ephemeral
            Type=ext3
Read more about mounting storage on CoreOS at https://coreos.com/os/docs/latest/mounting-storage.html
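For an instance with more than one ephemeral volume, the same unit can be repeated once per device. The following is a minimal Java sketch of how such a cloud-config fragment might be rendered. The class `EphemeralMountUnits`, its methods and the `/media/ephemeralN` mount points are hypothetical names chosen for illustration, and the sketch assumes each device already carries an ext3 filesystem, as in the example above. It relies on the systemd rule that a mount unit must be named after its mount point.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: renders one CoreOS cloud-config mount unit per ephemeral device. */
final class EphemeralMountUnits {

    /**
     * systemd expects a mount unit to be named after its mount point,
     * e.g. /media/ephemeral0 -> media-ephemeral0.mount.
     * This simple conversion only handles paths without '-' or other
     * characters that would need systemd escaping.
     */
    static String unitName(String mountPoint) {
        return mountPoint.substring(1).replace('/', '-') + ".mount";
    }

    /** Builds the "coreos: units:" fragment for the given block devices. */
    static String render(List<String> devices) {
        StringBuilder yaml = new StringBuilder("coreos:\n  units:\n");
        for (int i = 0; i < devices.size(); i++) {
            String mountPoint = "/media/ephemeral" + i; // assumed naming scheme
            yaml.append("    - name: ").append(unitName(mountPoint)).append('\n')
                .append("      command: start\n")
                .append("      content: |\n")
                .append("        [Mount]\n")
                .append("        What=").append(devices.get(i)).append('\n')
                .append("        Where=").append(mountPoint).append('\n')
                .append("        Type=ext3\n");
        }
        return yaml.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList("/dev/xvdb", "/dev/xvdc")));
    }
}
```

Generating the fragment from a device list keeps the unit name and the `Where=` path in step, which is easy to get wrong when the units are written by hand.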
Having verified that, all we need to do is update our provisioning module so that it attaches the ephemeral devices when launching instance-store nodes. In Java, this would look like:
    final RunInstancesRequest runInstancesRequest = new RunInstancesRequest()
            .withSubnetId(subnetID)
            .withPrivateIpAddress(existingPrivateAddress)
            .withImageId(imageId)
            .withInstanceType(InstanceType.fromValue(instanceType))
            .withMinCount(1).withMaxCount(1)
            .withKeyName(sshKeyName)
            .withUserData(Base64.encodeBase64String(userdata.getBytes()))
            .withIamInstanceProfile(new IamInstanceProfileSpecification()
                    .withName(instanceProfileName(nodeName, useCompositeKeyNaming)));

    // Collect the block device mappings to attach to the instance.
    final ImmutableSet.Builder<BlockDeviceMapping> deviceMappingBuilder = ImmutableSet.builder();
    // Predefined ephemeral devices; /dev/xvda is the root device.
    final BlockDeviceMapping[] internalDevices = new BlockDeviceMapping[]{
            new BlockDeviceMapping().withVirtualName("ephemeral0").withDeviceName("/dev/xvdb"),
            new BlockDeviceMapping().withVirtualName("ephemeral1").withDeviceName("/dev/xvdc"),
            new BlockDeviceMapping().withVirtualName("ephemeral2").withDeviceName("/dev/xvdd"),
            new BlockDeviceMapping().withVirtualName("ephemeral3").withDeviceName("/dev/xvde")
    };
    for (int i = 0; i < instanceStorageDiskCount; i++) {
        if (i >= internalDevices.length) {
            logger.warn("Number of internal devices requested is larger than predefined set");
            break;
        }
        deviceMappingBuilder.add(internalDevices[i]);
        logger.info("Added internal device " + internalDevices[i].getDeviceName());
    }
    runInstancesRequest.setBlockDeviceMappings(deviceMappingBuilder.build());
This code attaches the appropriate number of ephemeral volumes to the instance. In our case, we keep a configuration for every type of instance that we launch, so the number of ephemeral volumes is simply read from that configuration. Following that, our platform utilities format and mount the volumes as expected.
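As a rough illustration of the configuration lookup, the sketch below maps an instance type to its number of instance-store disks and derives the device mappings from it. The class `InstanceStoreConfig` and its methods are hypothetical and are not our actual provisioning code, and the disk counts listed are illustrative values that should be checked against the AWS instance type documentation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-instance-type configuration, for illustration only. */
final class InstanceStoreConfig {

    // Illustrative values: verify against the AWS instance type documentation.
    private static final Map<String, Integer> DISK_COUNTS = new HashMap<>();
    static {
        DISK_COUNTS.put("m3.medium", 1);
        DISK_COUNTS.put("m3.large", 1);
        DISK_COUNTS.put("m3.xlarge", 2);
        DISK_COUNTS.put("m3.2xlarge", 2);
    }

    /** Returns the configured number of instance-store disks, or 0 for unknown types. */
    static int diskCount(String instanceType) {
        return DISK_COUNTS.getOrDefault(instanceType, 0);
    }

    /** Builds "device=virtualName" pairs such as /dev/xvdb=ephemeral0. */
    static List<String> deviceMappings(String instanceType) {
        List<String> mappings = new ArrayList<>();
        for (int i = 0; i < diskCount(instanceType); i++) {
            char letter = (char) ('b' + i); // xvda is the root device, so start at xvdb
            mappings.add("/dev/xvd" + letter + "=ephemeral" + i);
        }
        return mappings;
    }

    public static void main(String[] args) {
        System.out.println(deviceMappings("m3.xlarge"));
    }
}
```

Deriving the device names from the disk count avoids a fixed-size array of predefined mappings, at the cost of assuming that devices are always named sequentially from /dev/xvdb.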
As a result, we simplified our platform and switched to using the official CoreOS images for all our systems.