Riaan's SysAdmin Blog

My tips, howtos, gotchas, snippets and stuff. Use at your own risk!


Traefik Wildcard Certificate using Azure DNS


This post uses Traefik as an edge router (reverse proxy) in front of HTTP sites and enables a Let's Encrypt ACME v2 wildcard certificate on the Traefik Docker container. We verify domain ownership using DNS, specifically the dns-01 method, because DNS verification doesn't interrupt your web server, works even if your server is unreachable from the outside world, and is the only challenge type Let's Encrypt accepts for wildcard certificates. Our DNS provider is Azure DNS.

Azure Configuration

Prerequisites

  • Azure CLI installed and logged in
  • Wildcard DNS entry *.my.domain

Get subscription id

$ az account list | jq '.[] | .id'
"masked..."

Create role

NOTE: If you made a mistake and need to delete the role, run:
az role definition delete --name "DNS TXT Contributor"

Create a JSON file with the correct subscription, then create the role definition

$ cat role.json
  {
    "Name":"DNS TXT Contributor",
    "Id":"",
    "IsCustom":true,
    "Description":"Can manage DNS TXT records only.",
    "Actions":[
      "Microsoft.Network/dnsZones/TXT/*",
      "Microsoft.Network/dnsZones/read",
      "Microsoft.Authorization/*/read",
      "Microsoft.Insights/alertRules/*",
      "Microsoft.ResourceHealth/availabilityStatuses/read",
      "Microsoft.Resources/deployments/read",
      "Microsoft.Resources/subscriptions/resourceGroups/read"
    ],
    "NotActions":[

    ],
    "AssignableScopes":[
      "/subscriptions/masked..."
    ]
  }
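
To avoid hand-editing the subscription ID, the file above could be generated from a small shell function. This is only a sketch: the function name write_role_json and its arguments are made up for illustration, and the JSON body simply repeats the role definition shown above.

```shell
#!/bin/sh
# Hypothetical helper: write role.json for a given subscription ID.
# Usage: write_role_json <subscription-id> [output-file]
write_role_json() {
  sub_id="$1"
  out_file="${2:-role.json}"
  if [ -z "$sub_id" ]; then
    echo "usage: write_role_json <subscription-id> [output-file]" >&2
    return 1
  fi
  # Unquoted heredoc so ${sub_id} is expanded; nothing else in the body is.
  cat > "$out_file" <<EOF
{
  "Name": "DNS TXT Contributor",
  "Id": "",
  "IsCustom": true,
  "Description": "Can manage DNS TXT records only.",
  "Actions": [
    "Microsoft.Network/dnsZones/TXT/*",
    "Microsoft.Network/dnsZones/read",
    "Microsoft.Authorization/*/read",
    "Microsoft.Insights/alertRules/*",
    "Microsoft.ResourceHealth/availabilityStatuses/read",
    "Microsoft.Resources/deployments/read",
    "Microsoft.Resources/subscriptions/resourceGroups/read"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/${sub_id}"
  ]
}
EOF
}

# Usage (commented out; assumes you are logged in with `az login`):
# write_role_json "$(az account show --query id --output tsv)"
```

The usage line reads the ID of the currently active subscription, so check that the right subscription is selected before running it.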

  $ az role definition create --role-definition role.json 
  {
    "assignableScopes": [
      "/subscriptions/masked..."
    ],
    "description": "Can manage DNS TXT records only.",
    "id": "/subscriptions/masked.../providers/Microsoft.Authorization/roleDefinitions/masked...",
    "name": "masked...",
    "permissions": [
      {
        "actions": [
          "Microsoft.Network/dnsZones/TXT/*",
          "Microsoft.Network/dnsZones/read",
          "Microsoft.Authorization/*/read",
          "Microsoft.Insights/alertRules/*",
          "Microsoft.ResourceHealth/availabilityStatuses/read",
          "Microsoft.Resources/deployments/read",
          "Microsoft.Resources/subscriptions/resourceGroups/read"
        ],
        "dataActions": [],
        "notActions": [],
        "notDataActions": []
      }
    ],
    "roleName": "DNS TXT Contributor",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }

Checking DNS and resource group

$ az network dns zone list
  [
    {
      "etag": "masked...",
      "id": "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/iqonda.net",
      "location": "global",
      "maxNumberOfRecordSets": 10000,
      "name": "masked...",
      "nameServers": [
        "ns1-09.azure-dns.com.",
        "ns2-09.azure-dns.net.",
        "ns3-09.azure-dns.org.",
        "ns4-09.azure-dns.info."
      ],
      "numberOfRecordSets": 14,
      "registrationVirtualNetworks": null,
      "resolutionVirtualNetworks": null,
      "resourceGroup": "masked...",
      "tags": {},
      "type": "Microsoft.Network/dnszones",
      "zoneType": "Public"
    }
  ]

$ az network dns zone list --output table
  ZoneName    ResourceGroup    RecordSets    MaxRecordSets
  ----------  ---------------  ------------  ---------------
  masked...  masked...            14            10000

$ az group list --output table
  Name                                Location        Status
  ----------------------------------  --------------  ---------
  cloud-shell-storage-southcentralus  southcentralus  Succeeded
  masked...                    eastus          Succeeded
  masked...                    eastus          Succeeded
  masked...                    eastus          Succeeded

Assign role

  $ az ad sp create-for-rbac --name "Acme2DnsValidator" --role "DNS TXT Contributor" --scopes "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/masked..."
  Changing "Acme2DnsValidator" to a valid URI of "http://Acme2DnsValidator", which is the required format used for service principal names
  Found an existing application instance of "masked...". We will patch it
  Creating a role assignment under the scope of "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/masked..."
  {
    "appId": "masked...",
    "displayName": "Acme2DnsValidator",
    "name": "http://Acme2DnsValidator",
    "password": "masked...",
    "tenant": "masked..."
  }

  $ az ad sp create-for-rbac --name "Acme2DnsValidator" --role "DNS TXT Contributor" --scopes "/subscriptions/masked.../resourceGroups/masked..."
  Changing "Acme2DnsValidator" to a valid URI of "http://Acme2DnsValidator", which is the required format used for service principal names
  Found an existing application instance of "masked...". We will patch it
  Creating a role assignment under the scope of "/subscriptions/masked.../resourceGroups/masked..."
  {
    "appId": "masked...",
    "displayName": "Acme2DnsValidator",
    "name": "http://Acme2DnsValidator",
    "password": "masked...",
    "tenant": "masked..."
  }

  $ az role assignment list --all | jq -r '.[] | [.principalName,.roleDefinitionName,.scope]'
  [
    "http://Acme2DnsValidator",
    "DNS TXT Contributor",
    "/subscriptions/masked.../resourceGroups/masked..."
  ]
  [
    "masked...",
    "Owner",
    "/subscriptions/masked.../resourcegroups/masked.../providers/Microsoft.Storage/storageAccounts/masked..."
  ]
  [
    "http://Acme2DnsValidator",
    "DNS TXT Contributor",
    "/subscriptions/masked.../resourceGroups/masked.../providers/Microsoft.Network/dnszones/masked..."
  ]
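
The two create-for-rbac calls above differ only in scope: the first targets a single DNS zone, the second the whole resource group. As a sketch, the zone scope string can be assembled from its parts so it is not typed by hand. The function name dns_zone_scope and the variables in the usage lines are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the resource ID of an Azure DNS zone.
# Usage: dns_zone_scope <subscription-id> <resource-group> <zone-name>
dns_zone_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/dnszones/%s\n' \
    "$1" "$2" "$3"
}

# Usage (commented out; this creates a real service principal):
# az ad sp create-for-rbac --name "Acme2DnsValidator" \
#   --role "DNS TXT Contributor" \
#   --scopes "$(dns_zone_scope "$SUBSCRIPTION_ID" "$RESOURCE_GROUP" "$ZONE")"
```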

$ az ad sp list | jq -r '.[] | [.displayName,.appId]'
  The result is not complete. You can still use '--all' to get all of them with long latency expected, or provide a filter through command arguments
...

  [
    "AzureDnsFrontendApp",
    "masked..."
  ]

  [
    "Azure DNS",
    "masked..."
  ]

Traefik Configuration

reference

Azure Credentials in environment file

$ cat .env
    AZURE_CLIENT_ID=masked...
    AZURE_CLIENT_SECRET=masked...
    AZURE_SUBSCRIPTION_ID=masked...
    AZURE_TENANT_ID=masked...
    AZURE_RESOURCE_GROUP=masked...
    #AZURE_METADATA_ENDPOINT=
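
A quick sanity check of the env file can save a restart cycle. The resourceGroupName validation error shown in the logs further down reports the value with literal quotes around it, which suggests quotes in the env file were passed through as part of the value. The sketch below flags missing, empty, or quoted values. The function name check_azure_env is made up, and whether quotes are stripped depends on your Compose version, so treat this as a precaution rather than a rule.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check an env file for the Azure variables
# used in this post. Returns non-zero if a variable is missing, empty,
# or starts with a quote character. Values are never printed.
check_azure_env() {
  env_file="${1:-.env}"
  status=0
  for name in AZURE_CLIENT_ID AZURE_CLIENT_SECRET AZURE_SUBSCRIPTION_ID \
              AZURE_TENANT_ID AZURE_RESOURCE_GROUP; do
    # Last uncommented assignment of this variable, leading spaces ignored.
    value="$(sed -n "s/^[[:space:]]*${name}=//p" "$env_file" | tail -n 1)"
    case "$value" in
      "")      echo "missing or empty: $name" >&2; status=1 ;;
      \"*|\'*) echo "quoted value: $name" >&2; status=1 ;;
    esac
  done
  return "$status"
}

# Usage:
# check_azure_env .env && echo "env file looks sane"
```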

Traefik Files

    $ cat traefik.yml 
    ## STATIC CONFIGURATION
    log:
      level: INFO

    api:
      insecure: true
      dashboard: true

    entryPoints:
      web:
        address: ":80"
      websecure:
        address: ":443"

    providers:
      docker:
        endpoint: "unix:///var/run/docker.sock"
        exposedByDefault: false

    certificatesResolvers:
      lets-encr:
        acme:
          #caServer: https://acme-staging-v02.api.letsencrypt.org/directory
          storage: acme.json
          email: admin@my.domain
          dnsChallenge:
            provider: azure

        $ cat docker-compose.yml 
        version: "3.3"

        services:

            traefik:
              image: "traefik:v2.2"
              container_name: "traefik"
              restart: always
              env_file:
                - .env
              command:
                #- "--log.level=DEBUG"
                - "--api.insecure=true"
                - "--providers.docker=true"
                - "--providers.docker.exposedbydefault=false"
              labels:
                 ## DNS CHALLENGE
                 - "traefik.http.routers.traefik.tls.certresolver=lets-encr"
                 - "traefik.http.routers.traefik.tls.domains[0].main=*.iqonda.net"
                 - "traefik.http.routers.traefik.tls.domains[0].sans=iqonda.net"
                 ## HTTP REDIRECT
                 #- "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
                 #- "traefik.http.routers.redirect-https.rule=HostRegexp(`{host:.+}`)"
                 #- "traefik.http.routers.redirect-https.entrypoints=web"
                 #- "traefik.http.routers.redirect-https.middlewares=redirect-to-https"
              ports:
                - "80:80"
                - "8080:8080" #Web UI
                - "443:443"
              volumes:
                - "/var/run/docker.sock:/var/run/docker.sock:ro"
                - "./traefik.yml:/traefik.yml:ro"
                - "./acme.json:/acme.json"
              networks:
                - external_network

            whoami:
              image: "containous/whoami"
              container_name: "whoami"
              restart: always
              labels:
                - "traefik.enable=true"
                - "traefik.http.routers.whoami.entrypoints=web"
                - "traefik.http.routers.whoami.rule=Host(`whoami.iqonda.net`)"
                #- "traefik.http.routers.whoami.tls.certresolver=lets-encr"
                #- "traefik.http.routers.whoami.tls=true"
              networks:
                - external_network

            db:
              image: mariadb
              container_name: "db"
              volumes:
                - db_data:/var/lib/mysql
              restart: always
              environment:
                MYSQL_ROOT_PASSWORD: somewordpress
                MYSQL_DATABASE: wordpress
                MYSQL_USER: wordpress
                MYSQL_PASSWORD: wordpress
              networks:
                - internal_network

            wpsites:
              depends_on:
                - db
              ports:
                - 8002:80
              image: wordpress:latest
              container_name: "wpsites"
              volumes:
                - /d01/html/wpsites.my.domain:/var/www/html
              restart: always
              environment:
                WORDPRESS_DB_HOST: db:3306
                WORDPRESS_DB_USER: wpsites
                WORDPRESS_DB_NAME: wpsites
              labels:
                 - "traefik.enable=true"
                 - "traefik.http.routers.wpsites.rule=Host(`wpsites.my.domain`)"
                 - "traefik.http.routers.wpsites.entrypoints=websecure"
                 - "traefik.http.routers.wpsites.tls.certresolver=lets-encr"
                 - "traefik.http.routers.wpsites.service=wpsites-svc"
                 - "traefik.http.services.wpsites-svc.loadbalancer.server.port=80"
              networks:
                - external_network
                - internal_network

        volumes:
              db_data: {}

        networks:
          external_network:
          internal_network:
            internal: true

WARNING: If you are not using the Let's Encrypt staging endpoint, strongly consider doing so while working on this. You can get blocked for a week.
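
Before starting the containers, it helps to make sure acme.json exists on the host. The compose file bind-mounts ./acme.json, and if that path is missing Docker creates a directory in its place. Traefik is also known to refuse a storage file whose permissions are more open than 600. The following is a minimal sketch; the function name prepare_acme_storage is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: make sure the ACME storage file exists as a regular
# file with owner-only permissions before the containers start.
prepare_acme_storage() {
  acme_file="${1:-acme.json}"
  if [ -d "$acme_file" ]; then
    echo "$acme_file is a directory; remove it first" >&2
    return 1
  fi
  # touch keeps existing content, so stored certificates are not lost.
  touch "$acme_file" && chmod 600 "$acme_file"
}

# Usage:
# prepare_acme_storage ./acme.json
```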

Start Containers

$ docker-compose up -d --build
whoami is up-to-date
Recreating traefik ... 
db is up-to-date
...
Recreating traefik ... done
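
Once the containers are up, the Traefik log can be narrowed down to error lines only. A small sketch, assuming the default text log format shown in this post, where every line carries a level=... field. The function name traefik_errors is made up.

```shell
#!/bin/sh
# Hypothetical helper: keep only error-level lines from a Traefik log
# read on standard input.
traefik_errors() {
  grep 'level=error'
}

# Usage (commented out; needs the running container from this post):
# docker logs traefik 2>&1 | traefik_errors
```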

Some log errors you may see

$ docker logs traefik -f
    ...
    time="2020-05-17T21:17:40Z" level=info msg="Testing certificate renew..." providerName=lets-encr.acme
    ...
    time="2020-05-17T21:17:51Z" level=error msg="Unable to obtain ACME certificate for domains
    ..."AADSTS7000215: Invalid client secret is provided.

$ docker logs traefik -f
    ...
    \"keyType\":\"RSA4096\",\"dnsChallenge\":{\"provider\":\"azure\"},\"ResolverName\":\"lets-encr\",\"store\":{},\"ChallengeStore\":{}}"
     acme: error presenting token: azure: dns.ZonesClient#Get: Invalid input:     autorest/validation: validation failed: parameter=resourceGroupName constraint=Pattern value=\"\\\"sites\\\"\" details: value 

$ docker logs traefik -f
    ...
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *acme.Provider {\"email\":\"admin@iqonda.com\",\"caServer\":\"https://acme-staging-v02.api.letsencrypt.org/directory\",\"storage\":\"acme.json\",\"keyType\":\"RSA4096\",\"dnsChallenge\":{\"provider\":\"azure\"},\"ResolverName\":\"lets-encr\",\"store\":{},\"ChallengeStore\":{}}"
    time="2020-05-17T22:23:38Z" level=info msg="Testing certificate renew..." providerName=lets-encr.acme
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *traefik.Provider {}"
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *docker.Provider {\"watch\":true,\"endpoint\":\"unix:///var/run/docker.sock\",\"defaultRule\":\"Host(`{{ normalize .Name }}`)\",\"swarmModeRefreshSeconds\":15000000000}"
    time="2020-05-17T22:23:48Z" level=info msg=Register... providerName=lets-encr.acme

Looking at the certificate in a browser, the issuer CN=Fake LE Intermediate X1 means issuance is working but still against the staging URL.
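
The same check can be done from the command line with openssl instead of a browser. This is a sketch under assumptions: the function names are made up, and the staging test only looks for the "Fake LE" text seen above, which may not match the issuer names of newer staging chains.

```shell
#!/bin/sh
# Hypothetical helper: print the issuer of the certificate served for a host.
# Needs network access and the openssl CLI.
cert_issuer() {
  host="$1"
  port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer
}

# Hypothetical helper: succeed if an issuer string looks like the
# Let's Encrypt staging CA seen in this post ("Fake LE ...").
is_staging_issuer() {
  case "$1" in
    *"Fake LE"*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage (commented out; needs a reachable host):
# issuer="$(cert_issuer wpsites.my.domain)"
# is_staging_issuer "$issuer" && echo "still on the staging CA"
```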

NOTE: In the Azure DNS activity log I can see that the TXT record was created and then deleted. The record name will be something like this: _acme-challenge.my.domain
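
For a wildcard name, the dns-01 record sits under the base domain, so *.my.domain is validated through _acme-challenge.my.domain. A small sketch that derives the record name follows; the function name acme_challenge_name is made up. Since the record is removed after validation, a lookup only returns something while a challenge is in progress.

```shell
#!/bin/sh
# Hypothetical helper: derive the dns-01 TXT record name for a domain.
# A leading "*." (wildcard) is stripped first.
acme_challenge_name() {
  domain="${1#\*.}"
  printf '_acme-challenge.%s\n' "$domain"
}

# Usage (commented out; needs dig and network access):
# dig +short TXT "$(acme_challenge_name '*.my.domain')"
```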

If the browser still does not show the padlock, test the site with https://www.whynopadlock.com. In my case it was just a hardcoded image URL on the page that made it insecure.
