Category: Traefik

Aug 16

traefik in kubernetes with AWS ALB

This is a quick update. In post https://blog.ls-al.com/traefik-in-kubernetes-using-terraform-helm-and-aws-alb/ I had a TODO to work on the AWS ALB health check. This is a fix for that.

NOTE: compared to my previous values file:

  1. added --ping and --ping.entrypount=web
  2. added ports traefik healthchecksport
  3. added websecure false but not related to health check just something I did not need exposed with ssl offloading on LB

helm values ping entrypoint

additionalArguments:
- --providers.kubernetescrd.ingressclass=traefik-pub3
- --ping
- --ping.entrypoint=web

# READ THIS: https://blog.ttauveron.com/posts/traefik_behind_google_l7_load_balancer/
ports:
  traefik:
    healthchecksPort: 8000
  websecure:
    expose: false
  ...

annotations in helm values file

alb.ingress.kubernetes.io/healthcheck-path: "/ping"
alb.ingress.kubernetes.io/healthcheck-port: "traffic-port"

Comments Off on traefik in kubernetes with AWS ALB
comments

Aug 14

traefik in kubernetes using terraform + helm and AWS ALB

If you are using Traefik in kubernetes but you want to use an AWS ALB (application load balancer) this recipe may work for you. You will note a few important things:

  1. Traefik relies on the underlying kubernetes provider to create an Ingress. If not specified this will be a loadbalancer CLB (classic load balancer). There is a way to make this a NLB (network load balancer) but the AWS provider is not doing an ALB so Traefik can't do an ALB. This recipe therefore relies on a NodePort service and ties the Ingress (ALB) to the NodePort service via the ingressclass annotation. If you do not like or want to use NodePort this is not for you.
  2. Yet to confirm can this recipe work if you did not intentionally install the AWS LBC (load balancer controller). And does this work on non AWS EKS or self-managed kubernetes on AWS.
  3. Still looking at why the AWS Target group health check is not able to use the /ping or /dashboard. This may be an issue with my security groups but for now I just created manually a IngressRoute /<>-health on the Traefik web entrypoint and updated the Target Group health check either programatically or in the AWS console.
  4. I did not want to complicate this with kubernetes so I am using the simplest way for helm to communicate with the cluster and point to the environment kube config to get to the cluster.
  5. I did some minimal templating to change the helm release name and corresponding kubernetes objects but for this post I just hard coded for simplicity.
  6. I commented out deployment as Daemonset for my testing. You need to decide what is better in your environment Deployment or Daemonset.

providers.tf

provider "helm" {
  kubernetes {
    config_path = "~/.kube/config-eks"
  }
}

versions.tf

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.0.1"
    }
  }
  required_version = ">= 0.15"
}

helm values

additionalArguments:
- --providers.kubernetescrd.ingressclass=traefik-pub

#deployment:
#  kind: DaemonSet

service:
  enabled: true
  type: NodePort

service:
  enabled: true
  type: NodePort
extraObjects:
  - apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: traefik-pub
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/security-groups: sg-,
        #alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig":
        #  { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
        alb.ingress.kubernetes.io/backend-protocol: HTTP
        alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1::certificate/b6ead273-66e9-4768-ad25-0924dca35cdb
        alb.ingress.kubernetes.io/healthcheck-path: "/traefik-pub-health"
        alb.ingress.kubernetes.io/healthcheck-port: "traffic-port"
        #alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
        alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    spec:
      defaultBackend:
        service:
          name: traefik-pub
          port:
            number: 80  

ingressClass:
  enabled: true
  isDefaultClass: false

ingressRoute:
  dashboard:
    enabled: true
    # Additional ingressRoute annotations (e.g. for kubernetes.io/ingress.class)
    annotations:
      kubernetes.io/ingress.class: traefik-pub
    # Additional ingressRoute labels (e.g. for filtering IngressRoute by custom labels)
    labels: {}
    entryPoints:
    - traefik
    labels: {}
    matchRule: PathPrefix(/dashboard) || PathPrefix(/api)
    middlewares: []
    tls: {}

rollingUpdate:
  maxUnavailable: 1
  maxSurge: 1

variables.tf (shortened for documentation)

variable "traefik_name" {
  description = "helm release name"
  type        = string
  default     = "traefik-pub"
}

variable "namespace" {
  description = "Namespace to install traefik chart into"
  type        = string
  default     = "test"
}

variable "traefik_chart_version" {
  description = "Version of Traefik chart to install"
  type        = string
  default     = "21.2.1"
}

chart.tf

resource "helm_release" "traefik" {
  namespace        = var.namespace
  create_namespace = true
  name             = var.traefik_name
  repository       = "https://traefik.github.io/charts"
  chart            = "traefik"
  version          = var.traefik_chart_version
  timeout = var.timeout_seconds

  values = [
    file("values.yml")
  ]

  set {
    name  = "deployment.replicas"
    value = var.replica_count
  }

}

Comments Off on traefik in kubernetes using terraform + helm and AWS ALB
comments

Jul 07

OCI Network Load Balancer Source Header Preservation

In my case running Traefik on docker I was not getting real ip addresses. I changed the NLB option Source header (IP, port) preservation: Enabled

To change this you need to remove the targets from the backend set first.

At the same time suddenly the console did not allow me to add the VM.Standard.A1.Flex server back into the backend set. It required me to upgrade to a paid account. Which is nonsense since I used this server as a target for a long time and now suddenly they want to be sneaky with free options. At least the CLI did add the IP back in.

❯ ocicli nlb backend create --backend-set-name server01-443 --network-load-balancer-id  --port 443 --ip-address 10.0.10.226

Comments Off on OCI Network Load Balancer Source Header Preservation
comments

May 20

Traefik Wildcard Certificate using Azure DNS

dns challenge letsencrypt Azure DNS

Using Traefik as edge router(reverse proxy) to http sites and enabling a Lets Encrypt ACME v2 wildcard certificate on the docker Traefik container. Verify ourselves using DNS, specifically the dns-01 method, because DNS verification doesn’t interrupt your web server and it works even if your server is unreachable from the outside world. Our DNS provider is Azure DNS.

Azure Configuration

Pre-req

  • azure cli setup
  • Wildcard DNS entry *.my.domain

Get subscription id

$ az account list | jq '.[] | .id'
"masked..."

Create role

$ az role definition create --role-definition role.json 
  {
    "assignableScopes": [
      "/subscriptions/masked..."
    ],
    "description": "Can manage DNS TXT records only.",
    "id": "/subscriptions/masked.../providers/Microsoft.Authorization/roleDefinitions/masked...",
    "name": "masked...",
    "permissions": [
      {
        "actions": [
          "Microsoft.Network/dnsZones/TXT/*",
          "Microsoft.Network/dnsZones/read",
          "Microsoft.Authorization/*/read",
          "Microsoft.Insights/alertRules/*",
          "Microsoft.ResourceHealth/availabilityStatuses/read",
          "Microsoft.Resources/deployments/read",
          "Microsoft.Resources/subscriptions/resourceGroups/read"
        ],
        "dataActions": [],
        "notActions": [],
        "notDataActions": []
      }
    ],
    "roleName": "DNS TXT Contributor",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }

NOTE: If you screwed up and need to delete do like like this:
az role definition delete --name "DNS TXT Contributor"

Create json file with correct subscription and create role definition

$ cat role.json
  {
    "Name":"DNS TXT Contributor",
    "Id":"",
    "IsCustom":true,
    "Description":"Can manage DNS TXT records only.",
    "Actions":[
      "Microsoft.Network/dnsZones/TXT/*",
      "Microsoft.Network/dnsZones/read",
      "Microsoft.Authorization/*/read",
      "Microsoft.Insights/alertRules/*",
      "Microsoft.ResourceHealth/availabilityStatuses/read",
      "Microsoft.Resources/deployments/read",
      "Microsoft.Resources/subscriptions/resourceGroups/read"
    ],
    "NotActions":[

    ],
    "AssignableScopes":[
      "/subscriptions/masked..."
    ]
  }

  $ az role definition create --role-definition role.json 
  {
    "assignableScopes": [
      "/subscriptions/masked..."
    ],
    "description": "Can manage DNS TXT records only.",
    "id": "/subscriptions/masked.../providers/Microsoft.Authorization/roleDefinitions/masked...",
    "name": "masked...",
    "permissions": [
      {
        "actions": [
          "Microsoft.Network/dnsZones/TXT/*",
          "Microsoft.Network/dnsZones/read",
          "Microsoft.Authorization/*/read",
          "Microsoft.Insights/alertRules/*",
          "Microsoft.ResourceHealth/availabilityStatuses/read",
          "Microsoft.Resources/deployments/read",
          "Microsoft.Resources/subscriptions/resourceGroups/read"
        ],
        "dataActions": [],
        "notActions": [],
        "notDataActions": []
      }
    ],
    "roleName": "DNS TXT Contributor",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }

Checking DNS and resource group

$ az network dns zone list
  [
    {
      "etag": "masked...",
      "id": "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/iqonda.net",
      "location": "global",
      "maxNumberOfRecordSets": 10000,
      "name": "masked...",
      "nameServers": [
        "ns1-09.azure-dns.com.",
        "ns2-09.azure-dns.net.",
        "ns3-09.azure-dns.org.",
        "ns4-09.azure-dns.info."
      ],
      "numberOfRecordSets": 14,
      "registrationVirtualNetworks": null,
      "resolutionVirtualNetworks": null,
      "resourceGroup": "masked...",
      "tags": {},
      "type": "Microsoft.Network/dnszones",
      "zoneType": "Public"
    }
  ]

$ az network dns zone list --output table
  ZoneName    ResourceGroup    RecordSets    MaxRecordSets
  ----------  ---------------  ------------  ---------------
  masked...  masked...            14            10000

$ az group list --output table
  Name                                Location        Status
  ----------------------------------  --------------  ---------
  cloud-shell-storage-southcentralus  southcentralus  Succeeded
  masked...                    eastus          Succeeded
  masked...                    eastus          Succeeded
  masked...                    eastus          Succeeded

role assign

  $ az ad sp create-for-rbac --name "Acme2DnsValidator" --role "DNS TXT Contributor" --scopes "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/masked..."
  Changing "Acme2DnsValidator" to a valid URI of "http://Acme2DnsValidator", which is the required format used for service principal names
  Found an existing application instance of "masked...". We will patch it
  Creating a role assignment under the scope of "/subscriptions/masked.../resourceGroups/sites/providers/Microsoft.Network/dnszones/masked..."
  {
    "appId": "masked...",
    "displayName": "Acme2DnsValidator",
    "name": "http://Acme2DnsValidator",
    "password": "masked...",
    "tenant": "masked..."
  }

  $ az ad sp create-for-rbac --name "Acme2DnsValidator" --role "DNS TXT Contributor" --scopes "/subscriptions/masked.../resourceGroups/masked..."
  Changing "Acme2DnsValidator" to a valid URI of "http://Acme2DnsValidator", which is the required format used for service principal names
  Found an existing application instance of "masked...". We will patch it
  Creating a role assignment under the scope of "/subscriptions/masked.../resourceGroups/masked..."
  {
    "appId": "masked...",
    "displayName": "Acme2DnsValidator",
    "name": "http://Acme2DnsValidator",
    "password": "masked...",
    "tenant": "masked..."
  }

  $ az role assignment list --all | jq -r '.[] | [.principalName,.roleDefinitionName,.scope]'
  [
    "http://Acme2DnsValidator",
    "DNS TXT Contributor",
    "/subscriptions/masked.../resourceGroups/masked..."
  ]
  [
    "masked...",
    "Owner",
    "/subscriptions/masked.../resourcegroups/masked.../providers/Microsoft.Storage/storageAccounts/masked..."
  ]
  [
    "http://Acme2DnsValidator",
    "DNS TXT Contributor",
    "/subscriptions/masked.../resourceGroups/masked.../providers/Microsoft.Network/dnszones/masked..."
  ]

$ az ad sp list | jq -r '.[] | [.displayName,.appId]'
  The result is not complete. You can still use '--all' to get all of them with long latency expected, or provide a filter through command arguments
...

  [
    "AzureDnsFrontendApp",
    "masked..."
  ]

  [
    "Azure DNS",
    "masked..."
  ]

Traefik Configuration

reference

Azure Credentials in environment file

$ cat .env
    AZURE_CLIENT_ID=masked...
    AZURE_CLIENT_SECRET=masked...
    AZURE_SUBSCRIPTION_ID=masked...
    AZURE_TENANT_ID=masked...
    AZURE_RESOURCE_GROUP=masked...
    #AZURE_METADATA_ENDPOINT=

Traefik Files

    $ cat traefik.yml 
    ## STATIC CONFIGURATION
    log:
      level: INFO

    api:
      insecure: true
      dashboard: true

    entryPoints:
      web:
        address: ":80"
      websecure:
        address: ":443"

    providers:
      docker:
        endpoint: "unix:///var/run/docker.sock"
        exposedByDefault: false

    certificatesResolvers:
      lets-encr:
        acme:
          #caServer: https://acme-staging-v02.api.letsencrypt.org/directory
          storage: acme.json
          email: admin@my.doman
          dnsChallenge:
            provider: azure

        $ cat docker-compose.yml 
        version: "3.3"

        services:

            traefik:
              image: "traefik:v2.2"
              container_name: "traefik"
              restart: always
              env_file:
                - .env
              command:
                #- "--log.level=DEBUG"
                - "--api.insecure=true"
                - "--providers.docker=true"
                - "--providers.docker.exposedbydefault=false"
              labels:
                 ## DNS CHALLENGE
                 - "traefik.http.routers.traefik.tls.certresolver=lets-encr"
                 - "traefik.http.routers.traefik.tls.domains[0].main=*.iqonda.net"
                 - "traefik.http.routers.traefik.tls.domains[0].sans=iqonda.net"
                 ## HTTP REDIRECT
                 #- "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
                 #- "traefik.http.routers.redirect-https.rule=hostregexp({host:.+})"
                 #- "traefik.http.routers.redirect-https.entrypoints=web"
                 #- "traefik.http.routers.redirect-https.middlewares=redirect-to-https"
              ports:
                - "80:80"
                - "8080:8080" #Web UI
                - "443:443"
              volumes:
                - "/var/run/docker.sock:/var/run/docker.sock:ro"
                - "./traefik.yml:/traefik.yml:ro"
                - "./acme.json:/acme.json"
              networks:
                - external_network

            whoami:
              image: "containous/whoami"
              container_name: "whoami"
              restart: always
              labels:
                - "traefik.enable=true"
                - "traefik.http.routers.whoami.entrypoints=web"
                - "traefik.http.routers.whoami.rule=Host(whoami.iqonda.net)"
                #- "traefik.http.routers.whoami.tls.certresolver=lets-encr"
                #- "traefik.http.routers.whoami.tls=true"
              networks:
                - external_network

            db:
              image: mariadb
              container_name: "db"
              volumes:
                - db_data:/var/lib/mysql
              restart: always
              environment:
                MYSQL_ROOT_PASSWORD: somewordpress
                MYSQL_DATABASE: wordpress
                MYSQL_USER: wordpress
                MYSQL_PASSWORD: wordpress
              networks:
                - internal_network

            wpsites:
              depends_on:
                - db
              ports:
                - 8002:80
              image: wordpress:latest
              container_name: "wpsites"
              volumes:
                - /d01/html/wpsites.my.domain:/var/www/html
              restart: always
              environment:
                WORDPRESS_DB_HOST: db:3306
                WORDPRESS_DB_USER: wpsites
                WORDPRESS_DB_NAME: wpsites
              labels:
                 - "traefik.enable=true"
                 - "traefik.http.routers.wpsites.rule=Host(wpsites.my.domain)"
                 - "traefik.http.routers.wpsites.entrypoints=websecure"
                 - "traefik.http.routers.wpsites.tls.certresolver=lets-encr"
                 - "traefik.http.routers.wpsites.service=wpsites-svc"
                 - "traefik.http.services.wpsites-svc.loadbalancer.server.port=80"
              networks:
                - external_network
                - internal_network

        volumes:
              db_data: {}

        networks:
          external_network:
          internal_network:
            internal: true

WARNING: If you are not using the staging endpoint for LetsEncrypt strongly reconside doing that while working on this. You can get blocked for a week.

Start Containers

$ docker-compose up -d --build
whoami is up-to-date
Recreating traefik ... 
db is up-to-date
...
Recreating traefik ... done

Showing some log issues you may see

$ docker logs traefik -f
    ...
    time="2020-05-17T21:17:40Z" level=info msg="Testing certificate renew..." providerName=lets-encr.acme
    ...
    time="2020-05-17T21:17:51Z" level=error msg="Unable to obtain ACME certificate for domains
    ..."AADSTS7000215: Invalid client secret is provided.

$ docker logs traefik -f
    ...
    \"keyType\":\"RSA4096\",\"dnsChallenge\":{\"provider\":\"azure\"},\"ResolverName\":\"lets-encr\",\"store\":{},\"ChallengeStore\":{}}"
     acme: error presenting token: azure: dns.ZonesClient#Get: Invalid input:     autorest/validation: validation failed: parameter=resourceGroupName constraint=Pattern value=\"\\\"sites\\\"\" details: value 

$ docker logs traefik -f
    ...
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *acme.Provider {\"email\":\"admin@iqonda.com\",\"caServer\":\"https://acme-staging-v02.api.letsencrypt.org/   directory\",\"storage\":\"acme.json\",\"keyType\":\"RSA4096\",\"dnsChallenge\":{\"provider\":\"azure\"},\"ResolverName\":\"lets-encr\",\"store\":{},\"ChallengeStore\":{}}"
    time="2020-05-17T22:23:38Z" level=info msg="Testing certificate renew..." providerName=lets-encr.acme
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *traefik.Provider {}"
    time="2020-05-17T22:23:38Z" level=info msg="Starting provider *docker.Provider {\"watch\":true,\"endpoint\":\"unix:///var/run/docker.sock\",\"defaultRule\":\"Host({{  normalize .Name }})\",\"swarmModeRefreshSeconds\":15000000000}"
    time="2020-05-17T22:23:48Z" level=info msg=Register... providerName=lets-encr.acme

In a browser looking at cert this means working but still stage url: CN=Fake LE Intermediate X1

NOTE: In Azure DNS activity log i can see TXT record was created and deleted. Record will be something like this: _acme-challenge.my.domain

Browser still not showing lock. Test with https://www.whynopadlock.com and in my case was just a hardcoded image on the page making it insecure.

Comments Off on Traefik Wildcard Certificate using Azure DNS
comments