1. Overview

The SSM Agent service is running normally inside the OS, but the instance is not registered under AWS Console > SSM > Instances & Nodes > Managed Instance.


2. Causes and Solutions

2.1. Metadata unreachable due to a routing problem

Check whether the instance can retrieve metadata.

C:\>curl http://169.254.169.254/latest/meta-data/
curl: (7) Failed to connect to 169.254.169.254 port 80: Timed out ## metadata could not be retrieved
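
The same reachability check can be scripted so it runs identically on Windows and Linux. The following is a minimal sketch, not part of the original procedure: the function name `metadata_reachable` and the 3-second timeout are assumptions for illustration. It deliberately bypasses any configured proxy so that the result reflects routing only, and it treats an HTTP error status (for example, when the instance requires IMDSv2 tokens) as proof that the endpoint answered.

```python
import urllib.error
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def metadata_reachable(url: str = METADATA_URL, timeout: float = 3.0) -> bool:
    """Return True if the metadata endpoint answers at the network level."""
    # An empty ProxyHandler ignores http_proxy/https_proxy settings,
    # so the request goes straight to the link-local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The endpoint replied with an HTTP error status; the route itself works.
        return True
    except (urllib.error.URLError, OSError):
        # Timeout, connection refused, or no route to host.
        return False


if __name__ == "__main__":
    print("metadata reachable:", metadata_reachable())
```

A `False` result here corresponds to the curl timeout shown above and points at the routing table rather than at a proxy setting.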

Check the default gateway and the routing table.

C:\>ipconfig /all        ## check the gateway
C:\>route print -4       ## check that the gateway for 169.254.169.** entries in the IPv4 route table is correct
                            if there are no such entries, traffic uses the default route
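
Reading a long `route print -4` listing by eye is error-prone, so the comparison can be sketched in code. This is an illustrative helper under stated assumptions: the function names and the sample addresses (`10.0.0.1`, `10.0.9.1`, `10.0.0.25`) are hypothetical, and the parser relies only on column order (destination, netmask, gateway), not on the header text.

```python
import re
from typing import List, Tuple

IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def metadata_routes(route_print_output: str) -> List[Tuple[str, str]]:
    """Return (destination, gateway) pairs for 169.254.169.x route entries."""
    routes = []
    for line in route_print_output.splitlines():
        fields = line.split()
        # A route row starts with destination, netmask, gateway.
        if (
            len(fields) >= 4
            and fields[0].startswith("169.254.169.")
            and IPV4.fullmatch(fields[1])
        ):
            routes.append((fields[0], fields[2]))
    return routes


def wrong_gateway_routes(
    route_print_output: str, expected_gateway: str
) -> List[Tuple[str, str]]:
    """Return metadata routes whose gateway differs from the expected one."""
    return [
        (dest, gw)
        for dest, gw in metadata_routes(route_print_output)
        if gw != expected_gateway
    ]


# Hypothetical excerpt of `route print -4` output, for illustration only.
SAMPLE = """
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0         10.0.0.1       10.0.0.25      10
  169.254.169.254  255.255.255.255         10.0.9.1       10.0.0.25      25
"""

if __name__ == "__main__":
    print(wrong_gateway_routes(SAMPLE, "10.0.0.1"))
```

Any entry this reports is a candidate for deletion and re-registration with the correct gateway.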
Delete the routes and register them again. Do this when static routes for 169.254.169.* point to the wrong gateway IP.

## Delete the metadata-related routes
C:\>route -p delete 169.254.169.***
              ...
C:\>route -p delete 169.254.169.250
C:\>route -p delete 169.254.169.251
C:\>route -p delete 169.254.169.254

## Add the routes again with the correct gateway IP
C:\>route -p ADD 169.254.169.*** MASK 255.255.255.255 "Gateway IP"
              ...
C:\>route -p ADD 169.254.169.250 MASK 255.255.255.255 "Gateway IP"
C:\>route -p ADD 169.254.169.251 MASK 255.255.255.255 "Gateway IP"
C:\>route -p ADD 169.254.169.254 MASK 255.255.255.255 "Gateway IP"

Confirm that metadata is retrieved correctly, then restart the SSM Agent service.


2.2. Metadata unreachable because an HTTP proxy is configured

The following entries appear in amazon-ssm-agent.log:

2020-03-12 14:38:50 INFO Windows Only: Job object creation on SSM agent successful 
2020-03-12 14:35:09 INFO Getting IE proxy configuration for current user: The operation completed successfully. 
2020-03-12 14:35:09 INFO Getting WinHTTP proxy default configuration: The operation completed successfully. 
2020-03-12 14:35:09 INFO Proxy environment variables: 
2020-03-12 14:35:09 INFO http_proxy: http://proxy.aws.lgcns.internal:3128 
2020-03-12 14:35:09 INFO https_proxy: 
2020-03-12 14:35:09 INFO no_proxy: 169.254.169.254
2020-03-12 14:35:09 INFO ssm-user already exists. Resetting password.
2020-03-12 14:35:09 INFO Entering SSM Agent hibernate - RequestError: send request failed caused by: Post https://ssm.ap-northeast-2.amazonaws.com/: proxyconnect tcp: dial tcp: lookup proxy.aws.lgcns.internal: no such host
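
The two facts that matter in this log are the proxy variables the agent picked up and the proxy host it failed to resolve. The following sketch extracts both. The function names are hypothetical, and the patterns are based only on the log lines shown above, so other agent versions may format these lines differently.

```python
import re
from typing import Dict, Optional

PROXY_LINE = re.compile(r"INFO\s+(http_proxy|https_proxy|no_proxy):(.*)$")
LOOKUP_FAILURE = re.compile(
    r"proxyconnect tcp: dial tcp: lookup (\S+): no such host"
)


def proxy_settings(log_text: str) -> Dict[str, str]:
    """Collect the proxy variables that the agent logged at startup."""
    settings = {}
    for line in log_text.splitlines():
        match = PROXY_LINE.search(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def unresolved_proxy_host(log_text: str) -> Optional[str]:
    """Return the proxy host name that DNS could not resolve, if any."""
    match = LOOKUP_FAILURE.search(log_text)
    return match.group(1) if match else None


# Lines copied from the log excerpt in this section.
SAMPLE_LOG = """\
2020-03-12 14:35:09 INFO http_proxy: http://proxy.aws.lgcns.internal:3128
2020-03-12 14:35:09 INFO https_proxy:
2020-03-12 14:35:09 INFO no_proxy: 169.254.169.254
2020-03-12 14:35:09 INFO Entering SSM Agent hibernate - RequestError: send request failed caused by: Post https://ssm.ap-northeast-2.amazonaws.com/: proxyconnect tcp: dial tcp: lookup proxy.aws.lgcns.internal: no such host
"""

if __name__ == "__main__":
    print(proxy_settings(SAMPLE_LOG))
    print(unresolved_proxy_host(SAMPLE_LOG))
```

When `unresolved_proxy_host` returns a name, the agent is trying to reach the SSM endpoint through a proxy that the instance cannot resolve, which is the situation sections 2.2.1 and 2.2.2 address.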

2.2.1. Remove the HTTP proxy if it is not needed

C:\>netsh winhttp show proxy     ## check whether a proxy is configured

C:\>netsh winhttp reset proxy    ## remove the proxy setting

2.2.2. Configuration when traffic must go through a proxy

Windows PowerShell configuration

$serviceKey = "HKLM:\SYSTEM\CurrentControlSet\Services\AmazonSSMAgent"
$keyInfo = (Get-Item -Path $serviceKey).GetValue("Environment")
$proxyVariables = @("http_proxy=hostname:port", "no_proxy=169.254.169.254")

## Create the Environment value if it does not exist; otherwise overwrite it
If($null -eq $keyInfo)
{
    New-ItemProperty -Path $serviceKey -Name Environment -Value $proxyVariables -PropertyType MultiString -Force
} else {
    Set-ItemProperty -Path $serviceKey -Name Environment -Value $proxyVariables
}
Restart-Service AmazonSSMAgent

## Reset the SSM Agent proxy configuration
Remove-ItemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Services\AmazonSSMAgent -Name Environment
Restart-Service AmazonSSMAgent

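Linux systemd configuration

## Open /etc/systemd/system/amazon-ssm-agent.service in vi and add the following under the [Service] section
Environment="http_proxy=http://hostname:port"
Environment="https_proxy=http://hostname:port"
Environment="no_proxy=169.254.169.254"

## Reload systemd and restart SSM Agent
systemctl daemon-reload
systemctl restart amazon-ssm-agent.service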

2.3. OS time mismatch

The following entry appears in the Linux amazon-ssm-agent.log:

2020-11-25 22:49:21 INFO Entering SSM Agent hibernate - InvalidSignatureException: Signature not yet current: 20201125T134921Z is still later than 20201125T134631Z (20201125T134131Z + 5 min.) status code: 400, request id: d81445a5-15ba-4689-bbad-02b19b17b2aa
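
The error message itself shows how far the instance clock has drifted. The sketch below reads the first timestamp as the time the instance signed the request and the timestamp inside the parentheses as the service's own time. That reading is an assumption based on the wording of the message, and the function name is hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

SKEW_PATTERN = re.compile(
    r"Signature not yet current: (\d{8}T\d{6}Z) is still later than "
    r"\d{8}T\d{6}Z \((\d{8}T\d{6}Z) \+ 5 min\.\)"
)
TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def clock_ahead_seconds(log_line: str) -> Optional[float]:
    """Return how many seconds the instance clock is ahead of the service."""
    match = SKEW_PATTERN.search(log_line)
    if match is None:
        return None
    signed_at = datetime.strptime(match.group(1), TIMESTAMP_FORMAT)
    service_time = datetime.strptime(match.group(2), TIMESTAMP_FORMAT)
    return (signed_at - service_time).total_seconds()


# Message text copied from the log entry in this section.
SAMPLE_LINE = (
    "InvalidSignatureException: Signature not yet current: 20201125T134921Z "
    "is still later than 20201125T134631Z (20201125T134131Z + 5 min.)"
)

if __name__ == "__main__":
    print(clock_ahead_seconds(SAMPLE_LINE))
```

For the entry above this gives 470 seconds (7 min 50 s), which exceeds the 5-minute tolerance stated in the message and explains why the request was rejected.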

Synchronize the Linux clock, then restart the agent. On Amazon Linux 2, the default chrony configuration already uses the Amazon Time Sync Service IP address (169.254.169.123).

yum erase 'ntp*' ## remove unused ntp-related packages
yum install chrony

grep server /etc/chrony.conf ## confirm that "server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4" is present
service chronyd restart
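
A plain `grep server` also matches commented-out lines, so the configuration check can be made stricter. The following is a small sketch with a hypothetical function name. It only looks for an active `server` directive pointing at the Amazon Time Sync Service address and does not validate the remaining options.

```python
def uses_amazon_time_sync(conf_text: str, address: str = "169.254.169.123") -> bool:
    """Return True if chrony.conf has an active server line for the address."""
    for line in conf_text.splitlines():
        # Drop anything after '#', then split the directive into words.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "server" and fields[1] == address:
            return True
    return False


ACTIVE = "server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4\n"
COMMENTED = "# server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4\n"

if __name__ == "__main__":
    with open("/etc/chrony.conf", encoding="utf-8") as conf:
        print(uses_amazon_time_sync(conf.read()))
```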

## Restart the SSM Agent service
systemctl restart amazon-ssm-agent.service