Cattle, not pets.
Any fellow system administrators using Amazon’s container service have probably hit the same “Agent Connected false” issue with the Amazon ECS agent that we have.
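To see which instances are affected, something like the following may help. It is only a sketch: the function name disconnected_instances is made up for illustration, it uses the aws-cli's built-in --query filtering rather than jq, and it assumes your credentials and region are already configured and that the cluster has fewer than 100 container instances.

```shell
#!/bin/bash
# Sketch: print the EC2 instance IDs in a cluster whose ECS agent is disconnected.
# disconnected_instances is a hypothetical helper name, not part of any AWS tool.
disconnected_instances() {
    local cluster="$1"
    local arns

    # All container instance ARNs in the cluster, whitespace separated
    arns=$(aws ecs list-container-instances --cluster "$cluster" \
        --query 'containerInstanceArns[]' --output text)

    # Nothing registered, so nothing to report
    if [ -z "$arns" ] || [ "$arns" = "None" ]; then
        return 0
    fi

    # Keep only instances reporting agentConnected == false.
    # $arns is deliberately unquoted so each ARN becomes its own argument.
    aws ecs describe-container-instances --cluster "$cluster" \
        --container-instances $arns \
        --query 'containerInstances[?agentConnected==`false`].ec2InstanceId' \
        --output text | tr '\t' '\n'
}
```

Calling disconnected_instances with a cluster name or ARN should print one EC2 instance ID per line, or nothing at all if every agent is connected.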
AWS’s advice is to log in to your ECS EC2 instances and restart the agent manually!
For us this is not a very easy or scalable thing to do, so our current workaround is to scale up and then back down. For example:
ecs-cli scale --size 3 --capability-iam
ecs-cli compose service scale 3
ecs-cli compose service scale 2
ecs-cli scale --size 2 --capability-iam
This should effectively kill the errant ECS instance and its AWOL agent, and bootstrap a fresh one. Another tip: always make sure you are running the latest ecs-cli, since the ECS AMIs are annoyingly hard-coded into this client! Furthermore, you might want to check that your EC2 instances are not old by inspecting their “Launch time”; keep them fresh by terminating old ones.
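The “Launch time” check can also be done from the command line. Here is a rough sketch; the helper names list_launch_times and instances_older_than are invented for this example, and it lists every instance in the current region rather than only those in your ECS cluster, so review the output before terminating anything.

```shell
#!/bin/bash
# Sketch: find EC2 instances launched before a cutoff timestamp.
# Both helper names below are hypothetical.

# Print "<instance-id> <launch-time>" for every instance in the current region
list_launch_times() {
    aws ec2 describe-instances \
        --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' \
        --output text
}

# Read "<instance-id> <launch-time>" lines on stdin and print the IDs whose
# launch time sorts before the cutoff given as $1.
# ISO 8601 UTC timestamps order correctly when compared as plain strings;
# appending "" forces awk to compare strings rather than numbers.
instances_older_than() {
    awk -v cutoff="$1" '($2 "") < (cutoff "") { print $1 }'
}
```

With GNU date available, list_launch_times | instances_older_than "$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S)" should print the instances that are more than 30 days old. On macOS the date invocation would need adjusting.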
Hopefully AWS will fix this issue, as they fixed the silly one where old images were not being cleared out. Or perhaps we have to start using Lambda? ;) For further tips on getting started with ECS, check out this guide.
If you need to ssh into your cluster, you might find this aws-cli/jq/shell script handy:
#!/bin/bash
# Choose your cluster so you can figure out the IPs and ssh in to inspect
select cluster in $(aws ecs list-clusters | jq -r '.clusterArns[]')
do
    break
done
echo Cluster: "$cluster"

# For each container instance, look up its EC2 instance ID, then its private IP
aws ecs list-container-instances --cluster "$cluster" | jq -r '.containerInstanceArns[]' |
while read -r instance
do
    aws ecs describe-container-instances --cluster "$cluster" --container-instances "$instance" |
    jq -r '.containerInstances[].ec2InstanceId' |
    while read -r instanceid
    do
        aws ec2 describe-instances --filters Name=instance-id,Values="$instanceid" |
        jq -r '.Reservations[].Instances[] | .PrivateIpAddress' |
        while read -r IP
        do
            # Print an ssh command for each host
            echo ssh -i ec2-user@$IP
        done
    done
done
Do please get in touch via email below if you know better.
Devops at Spuul. Any tips or suggestions? Reach out!