Let's Encrypt SSL with Traefik on ECS Fargate

I've been trying to solve this for days, but without any luck:

Situation:

I have an ECS cluster on AWS using Fargate. The cluster contains an instance of Traefik 2.3.4 and other containers. I'm using Traefik as a reverse proxy to forward requests to the other containers. Over HTTP everything works fine, so I decided to add HTTPS to Traefik as well. I've tried everything I could find on the Internet, but nothing works. When I try to connect to the specified domain with curl, it returns:

curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number

Here are some of the tests I've done:

traefik.yml:

log:
  level: DEBUG

api:
  dashboard: true

entryPoints:
  web:
    address: :80
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"

providers:
  ecs:
    clusters:
      - tools-cluster
    region: eu-west-2
    exposedByDefault: false

certificatesResolvers:
  letsencrypt:
    acme:
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      email: #########################
      storage: acme.json
      httpchallenge:
        entrypoint: web

Labels:

"dockerLabels": {
        "traefik.enable": "true",
        "traefik.http.services.traefik.loadbalancer.server.port": "8080",
        "traefik.http.routers.traefik.rule": "Host(`${host}`)",
        "traefik.http.routers.traefik.entrypoints": "websecure",
        "traefik.http.routers.traefik.tls.certresolver": "letsencrypt",
        "traefik.http.routers.traefik.service": "api@internal"
      }

This version returns this error:

error: 400 :: urn:ietf:params:acme:error:connection :: Fetching https://traefik.baaluu.com/.well-known/acme-challenge/td8IdOvJ1_GkigY-jPYaA4YsgeiS5FUiuUS-avbpsuY: Error getting validation data, url

It tries to retrieve the challenge data but can't, because the request is redirected to HTTPS and HTTPS doesn't work. I also tried without the automatic redirect, and it returns a similar error: it can't retrieve the data.

But following this guide it should work correctly.

So I decided to move to the dnsChallenge with this configuration (traefik.yml):

log:
  level: DEBUG

api:
  dashboard: true

entryPoints:
  web:
    address: :80
  websecure:
    address: ":443"

providers:
  ecs:
    clusters:
      - tools-cluster
    region: eu-west-2
    exposedByDefault: false

certificatesResolvers:
  letsencrypt:
    acme:
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      email: ######################
      storage: acme.json
      dnsChallenge:
        provider: route53
        delayBeforeCheck: 3

And the same labels as before:

"dockerLabels": {
        "traefik.enable": "true",
        "traefik.http.services.traefik.loadbalancer.server.port": "8080",
        "traefik.http.routers.traefik.rule": "Host(`${host}`)",
        "traefik.http.routers.traefik.entrypoints": "websecure",
        "traefik.http.routers.traefik.tls.certresolver": "letsencrypt",
        "traefik.http.routers.traefik.service": "api@internal"
      }

Still nothing, and I have this in the logs: AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/170242259. That URL returns:

{
  "type": "urn:ietf:params:acme:error:malformed",
  "detail": "Method not allowed",
  "status": 405
}

The latest test I did was to remove the staging CA server:

log:
  level: DEBUG

api:
  dashboard: true

entryPoints:
  web:
    address: :80
  websecure:
    address: :443

providers:
  ecs:
    clusters:
      - tools-cluster
    region: eu-west-2
    exposedByDefault: false

certificatesResolvers:
  letsencrypt:
    acme:
      email: ###############
      storage: acme.json
      dnsChallenge:
        provider: route53
        delayBeforeCheck: 2

SSL still doesn't work, but I don't see any error message in the logs. This is the last message I get about a certificate:

Try to challenge certificate for domain [traefik.baaluu.com] found in HostSNI rule" providerName=letsencrypt.acme routerName=traefik@ecs rule="Host(`traefik.baaluu.com`)"

And there is not much more after that: [screenshot] (I'm sorry for the picture, but I couldn't find a way to extract the logs from ECS)

The other containers are still reachable on the http protocol.

If I try to connect to it using telnet, I can reach the service:

telnet traefik.baaluu.com 443
Trying 3.8.30.164...
Connected to traefik-1547500306.eu-west-2.elb.amazonaws.com.
Escape character is '^]'.

The same goes for port 80.

Looking more closely at the logs, I also found this:

retry due to: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ :: urn:ietf:params:acme:error:badNonce :: JWS has an invalid anti-replay nonce: \"0004cbkFTGjCALFGDYOmhruMl6_F_fRSj33cOMvdpx5Xd2M\", url: "
time="2020-12-10T13:08:21Z" level=debug msg="legolog: [INFO] retry due to: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ :: urn:ietf:params:acme:error:badNonce :: JWS has an invalid anti-replay nonce: \"0004cbkFTGjCALFGDYOmhruMl6_F_fRSj33cOMvdpx5Xd2M\", url: "

That log line references this URL, https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ, which returns:

{
  "type": "dns-01",
  "status": "valid",
  "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/9205340157/1Wh0tQ",
  "token": "44R4gD4_ZmemiCn5rtkqJyWOcjoj09sEgobUvZLH6yc",
  "validationRecord": [
    {
      "hostname": "traefik.baaluu.com"
    }
  ]
}

So I suppose the certificate has been generated correctly, but I'm not sure.

Any ideas or suggestions?

Thanks in advance.

H2K

Edit:

I've removed SSL from the dashboard and put it on another container. Now, in the dashboard, I can see this: [screenshot]

So I suppose SSL is working for that domain, but I still can't connect to it.

Edit 2:

With telnet, if I connect to that URL on port 443 and request the page, I can see the content:

telnet xxxxxxxxxxxxxxxxx 443
Trying 3.10.148.201...
Connected to traefik-1547500306.eu-west-2.elb.amazonaws.com.
Escape character is '^]'.
GET /index.html HTTP/1.1
Host: xxxxxxxxxxxxxxxxx

And the content of the page appears, so it is not a load balancer or routing problem. I can reach the container on port 443; the SSL simply is not there. It is like having two HTTP ports that behave in the same way: at the moment, port 443 acts like port 80.
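The symptom described above (port 443 answering in plain HTTP) can also be checked without telnet. The following is a minimal sketch, not part of the original setup: it attempts a TLS handshake and reports whether the endpoint speaks TLS at all. The function name `probe_tls` and the returned strings are hypothetical, chosen for illustration. Certificate verification is deliberately disabled, because the only question here is whether TLS is spoken, not whether the certificate is trusted.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Report whether host:port completes a TLS handshake.

    Returns a string starting with "tls", "not-tls", or "unreachable".
    """
    ctx = ssl.create_default_context()
    # Verification is switched off on purpose: this only tests whether
    # the port speaks TLS, not whether its certificate is valid.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return f"tls ({tls.version()})"
    except ssl.SSLError as exc:
        # A plain-HTTP server answering a ClientHello typically ends up
        # here, often with the reason WRONG_VERSION_NUMBER (the same
        # condition curl reports as "wrong version number").
        reason = getattr(exc, "reason", None) or str(exc)
        return f"not-tls ({reason})"
    except OSError as exc:
        # DNS failure, refused connection, timeout, reset, and so on.
        return f"unreachable ({exc})"
```

Calling `probe_tls("traefik.baaluu.com")` against the setup in this question would be expected to return a "not-tls" result if port 443 is in fact serving plain HTTP.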

I have also spent a number of days trying to work this out, so I feel your pain.

The error is misleading: the request doesn't even make it past the ALB, let alone Traefik.

There are two factors to this issue:

  • The first is that when you map port 443 through Docker Compose as "443:443", you would assume this creates an HTTPS listener; it actually creates a listener on port 443 using the HTTP protocol. In addition, the listener sent the data to the Fargate HTTP port and didn't redirect. I'm not sure if this is a bug, or because I hadn't specified the protocol as "x-aws-protocol: https" on the target port.

  • I also found some AWS documentation saying that if you use an HTTPS listener on an ALB, you need an SSL certificate in place at the ALB level. It kind of makes sense that you can't terminate the connection at the task level if you consider the swarm nature and security implications (better minds are welcome to explain).

With the above in mind, I created a certificate in ACM that covered all the domains I needed, changed the listener to the HTTPS protocol, and specified the certificate I created. At this point I was able to configure Traefik to accept traffic to the frontend.
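The listener change described above can be done in the console, but it can also be scripted. This is a rough sketch, assuming boto3 is installed and that the listener on port 443 and the ACM certificate already exist. The helper names and the ARNs passed to them are hypothetical placeholders; only the `modify_listener` call and its parameters belong to the ELBv2 API. The region default is the one from the question.

```python
from typing import Any, Dict


def https_listener_params(listener_arn: str, certificate_arn: str) -> Dict[str, Any]:
    """Build the keyword arguments for elbv2 modify_listener.

    Switches an existing listener to HTTPS on port 443 and attaches an
    ACM certificate. Both ARNs are supplied by the caller.
    """
    return {
        "ListenerArn": listener_arn,
        "Protocol": "HTTPS",
        "Port": 443,
        "Certificates": [{"CertificateArn": certificate_arn}],
    }


def apply_https_listener(params: Dict[str, Any], region: str = "eu-west-2") -> Dict[str, Any]:
    """Apply the change; requires AWS credentials with ELBv2 permissions."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("elbv2", region_name=region)
    return client.modify_listener(**params)
```

With this arrangement the ALB terminates TLS, so the target group protocol is left untouched and Traefik keeps receiving plain HTTP from the load balancer.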
