
Nomad Demo

Tested Env

  • Terraform > 0.12.0
  • Nomad >= 1.1.0
  • aws-iam-authenticator >= 0.5.0
  • docker
  • java
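Before starting, it can be handy to confirm the required tools are on the PATH. A minimal sketch (the helper name is ours, not part of the repo):

```shell
#!/bin/sh
# Report any required tools that are missing from PATH.
check_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
  done
}

check_tools terraform nomad aws-iam-authenticator docker java
```

Note this only checks presence, not versions; verify the versions above manually (e.g. `terraform version`, `nomad version`).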

Architecture

Diagram: ./diagram


  • Nomad : External Nomad Server
  • EC2 : Nomad Client
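For orientation, a client node enables the ECS remote driver through Nomad's plugin stanza. The sketch below is illustrative only — the plugin name and config fields are assumptions; the actual values live in ./nomad/client-1.hcl and ./nomad/client-2.hcl:

```hcl
# Hypothetical client config; see ./nomad/client-1.hcl for the real stanza.
client {
  enabled = true
}

plugin "nomad-ecs-driver" {
  config {
    enabled = true
    cluster = "nomad-remote-driver-demo"  # ECS cluster created by Terraform
    region  = "us-east-1"                 # the demo's default region
  }
}
```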

Assumptions / Rough Edges

The demo makes some assumptions due to its quick, local nature.

  • AWS access credentials will be available via environment variables or default profile. This is needed by Terraform and Nomad. Please see the AWS documentation for more details on setting this up.
  • The ECS cluster is running within us-east-1. If you need to change the AWS region, update the ./nomad/{client-1,client-2,server}.hcl files.
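For the environment-variable route, exporting credentials in the shell that will run Terraform and Nomad might look like the following (the values are obviously placeholders):

```shell
#!/bin/sh
# Placeholder credentials; substitute your own, or rely on the default AWS profile instead.
export AWS_ACCESS_KEY_ID="AKIAXXXXXXXXXXXXXXXX"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_DEFAULT_REGION="us-east-1"  # the region assumed by the demo
```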

Build Out

When running this demo, a small number of AWS resources will be created. Most of these do not incur direct costs; however, the ECS task does. For more information, please see the Fargate pricing documentation. Also be aware of AWS data transfer costs.

  1. Change directory into the Terraform directory:

    $ cd ./terraform
    
  2. Modify the Terraform variables file with any custom configuration. The file is located at ./terraform/variables.tf.

  3. Perform the Terraform initialisation:

    $ terraform init
    
  4. Apply the Terraform plan to build out the AWS resources:

    $ terraform apply -auto-approve
    
  5. The Terraform output will contain demo_subnet_id and demo_security_group_id values; note these for later use.

  6. Start the Nomad server and clients. Ideally, run each command in a separate terminal so the logs are easy to follow:

    $ cd ../nomad
    $ nomad agent -config=server.hcl
    $ nomad agent -config=client-1.hcl -plugin-dir=$(pwd)/plugins
    $ nomad agent -config=client-2.hcl -plugin-dir=$(pwd)/plugins
    
  7. Check the ECS driver status on a client node:

    $ nomad node status # To see Node IDs
    $ nomad node status <node-id> | grep "Driver Status"
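The subnet and security-group values from step 5 are later substituted into the demo job file. A small helper like this (hypothetical, not part of the repo) can apply them mechanically; it assumes the job file contains exactly one sg-* and one subnet-* ID to replace:

```shell
#!/bin/sh
# Replace the first security-group and subnet IDs found in a Nomad job file.
# Writes a .bak backup; -i.bak works on both GNU and BSD sed.
patch_job() {
  sg="$1"; subnet="$2"; file="$3"
  sed -i.bak \
    -e "s/sg-[0-9a-f]*/${sg}/" \
    -e "s/subnet-[0-9a-f]*/${subnet}/" \
    "$file"
}
```

With Terraform 0.14+, `terraform output -raw demo_subnet_id` prints the bare value suitable for passing straight into such a helper; older versions quote the output.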
    

Demo

The following steps demonstrate how Nomad and the remote driver handle several situations that operators will likely come across during day-to-day cluster management. Notably, how Nomad attempts to minimise the impact on task availability even when its own availability is degraded.

  1. Using the Terraform output from before, update the nomad/demo-ecs.nomad file to reflect these details. In particular, these two parameters need updating:
    security_groups  = ["sg-0d647d4c7ce15034f"]
    subnets          = ["subnet-010b03f1a021887ff"]
    
  2. Submit the remote task driver job to the cluster:
    $ nomad run demo-ecs.nomad
    
  3. Check the allocation status, and the logs to show the client is remotely monitoring the task:
    $ nomad status nomad-ecs-demo
    $ nomad logs -f <alloc-id>
    
  4. Navigate to the AWS ECS console and check the running tasks on the cluster. The URL will look like https://console.aws.amazon.com/ecs/home?region=us-east-1#/clusters/nomad-remote-driver-demo/tasks, but be sure to change the region if needed.
  5. Drain the node from which the remote task is currently being monitored. This will cause Nomad to create a new allocation, but will not impact the remote task:
    $ nomad node drain -enable <node-id> 
    
  6. Again check the logs of the new allocation and the AWS console to verify the status of the ECS task. You should notice the remote task remains running, and the new allocation attaches to and monitors the same task as the previous allocation.
  7. Remove the drain status from the previously drained node so that it is available for scheduling again:
    $ nomad node drain -disable <node-id>
    
  8. Kill the Nomad client that is currently monitoring the task to simulate a lost-node situation. This can be done either with control-c of the process, or using kill -9.
  9. Check the logs of the new allocation and the AWS console to verify the status of the ECS task. You should notice the remote task remains running, and the new allocation attaches to and monitors the same task as the previous allocation.
  10. Now update the ECS task definition. This process has been wrapped in Terraform using variables:
    $ cd ../terraform
    $ terraform apply -var 'ecs_task_definition_file=./files/updated-demo.json' -auto-approve
    
  11. Update the job specification in order to deploy the new, updated task definition:
    $ cd ../nomad
    $ sed -i.bak "s/nomad-remote-driver-demo:1/nomad-remote-driver-demo:2/g" demo-ecs.nomad
    
  12. Register the updated job on the Nomad cluster:
    $ nomad run demo-ecs.nomad
    
  13. Observing the AWS console, there will be a new task provisioning. Filtering by the Stopped status shows the previous task stopping. Nomad has successfully deployed the new version of the task.
  14. Stop the Nomad job and observe the task stopping within AWS:
    $ nomad stop nomad-ecs-demo
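Several of the steps above involve waiting for an allocation or the ECS task to settle. A tiny polling helper (again hypothetical, not part of the repo) avoids manual refresh loops; it reruns a command until its output matches a pattern or the attempt budget runs out:

```shell
#!/bin/sh
# Poll a command until its output contains a pattern; give up after N attempts.
wait_for() {
  pattern="$1"; attempts="$2"; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" 2>/dev/null | grep -q "$pattern"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for running 30 nomad status nomad-ecs-demo` returns once the job status shows a running allocation, or fails after roughly 30 seconds.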
    

Tear Down

  1. Stop the Nomad clients and server processes, either by control-c or killing the process IDs.

  2. Destroy the created AWS resources, first running a destroy plan and checking it targets only the expected resources:

    $ cd ../terraform
    $ terraform plan -destroy
    $ terraform destroy -auto-approve
    
