wiki

self-hosted infrastructure

TLDR: I use tailscale/zerotier to establish a smallish mesh network. I use nginx (previously envoy) as an edge router to forward L4 traffic. I mainly provision and manage services with nix, docker and sops. When it is absolutely required, I use k3s to deploy Kubernetes services. Traefik is used for routing, and authelia is used for blocking unauthorized access. To multiplex protocols on a single port, I use aioproxy. I use syncoid and restic to back up my inevitably accumulated state. For CI/CD, I use github actions, hashicorp vault, deploy-rs and cachix. I use the grafana stack (prometheus, alertmanager, grafana and loki) for observability.

Life of a Request

Principles

My principles are best described as cloud-nativeness. Cloud-native is an all-encompassing and vague term, so here are a few concrete points I have in mind.

  • Software-defined everything
  • Declarative
  • Infrastructure as code
  • Minimal state maintenance
  • Self-organization
  • Single source of truth

Networking

The first obstacle to self-hosting everything is that you don’t have a stable, publicly accessible IP. There are a few solutions.

The cloud

  • I am paranoid enough not to trust the cloud, aka other people’s computers.
  • This approach is not cost-efficient. Even my Raspberry Pi beats many VPSes in terms of computing power, not to mention that I can easily insert a 256G SD card.
  • Locality. There is no place like LAN. I see no benefit in downloading YouTube videos to a faraway VPS.

DDNS

This is the simplest option. I don’t use it mainly because it is not reliable in my setup. To name a few problems of DDNS:

  • Ports 80, 443 and 8080 are blocked
  • Router configuration is not portable. You need to set up port mapping or a DMZ host in your router, which is hard to codify, if not impossible
  • CGNAT
  • IPv6 is still yet to arrive

Port Forwarding

There is plenty of port-forwarding software. To name a few: autossh (my favorite), ngrok, frp, nps. You may also combine DDNS with your router’s port forwarding. Here was my attempt to do this. The biggest problem with port forwarding is that it does not scale and there is no generic inter-node connectivity. Port forwarding has the following weaknesses.

  • star topology, single point of failure
  • no inter-node connectivity
  • the number of ports is limited, you only have one 443 (you may use unix sockets instead, but not all proxies support forwarding to a unix socket)
  • hard to set up (authorization)
  • no hairpinning support
  • most port forwarding only supports TCP

Overlay Networks

Overlay networks are magic. What I mean is not the container-network-interface kind of overlay network, but solutions like zerotier, tailscale, innernet, nebula, n2n. They all have interesting aspects, but none of them is self-organizing: they all require a centralized coordination server. What I have in mind is something like matrix pinecone. I have been thinking about implementing a pinecone-like overlay network for a while, self-organizing and tunneling traffic with libp2p. For now I rely on tailscale and zerotier to establish peer-to-peer connectivity. This works great in the following respects.

  • inter-node connectivity
  • all ports are belong to you
  • easy to set up (implementation-dependent)
  • transparent hole punching
  • transparent multi-path
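
On NixOS, joining either network is only a couple of lines. A minimal sketch (the zerotier network id is a placeholder, and both still need the usual first-run authentication/authorization step):

services.tailscale.enable = true;

services.zerotierone = {
  enable = true;
  joinNetworks = [ "0123456789abcdef" ]; # placeholder network id
};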

Routing

L3

L3 routing is provided by the overlay network solutions.

L4

Considerations

For L4 routing, I care about transparency, protocol multiplexing and configuration complexity.

  • Transparency

    This means that the backend service does not need to know there is a middleman doing the heavy lifting. In particular, it means that the original requester’s address is preserved. Typical HTTP reverse proxies are not transparent; they pass the original requester’s information by injecting an X-Forwarded-For header.

  • Protocol Multiplexing

    L4 protocol multiplexing means that we can use the same TCP port for HTTP, TLS and SSH. An example is sslh. It normally works by peeking at the first few bytes to determine which protocol the connection speaks, and then handing the connection off to another application listening on some other port (see the sketch after this list).

  • Configuration Complexity

    Do we have to configure both the proxy and the backend services? What if we change a user-facing proxy address? Does the backend server need to adjust to this change? Is any special configuration needed for different user-facing proxies? What if an upstream server is down? Must I manually edit the configuration to reflect this change?
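
To make the multiplexing idea concrete, this is roughly how sslh is invoked (not part of my setup, and the exact flag spellings vary between sslh versions; treat it as a sketch):

# listen on one port and dispatch by protocol: SSH, TLS and plain HTTP
# each go to a different local backend
sslh --listen 0.0.0.0:443 \
     --ssh 127.0.0.1:22 \
     --tls 127.0.0.1:8443 \
     --http 127.0.0.1:8080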

Solutions

  • iptables

    This is just like NAT. It is transparent. I believe you can multiplex ports with some iptables extensions, but it is not pretty. A lethal problem is that the user-facing proxy must be in the return path of the connection (usually the proxy is the default gateway). To circumvent this, we need some modifications to the routing table and routing policies. When two proxies are connected to the same interface, there are multiple return paths; to select the correct one, we need policy-based routing.

  • ipvs

    Compared with iptables, ipvs is much more manageable and scalable, yet it is still too complicated.

  • userspace L4 proxy

    envoy/haproxy/nginx etc. can be used as L4 proxies. They accept an incoming downstream connection and establish a new upstream connection, just like a pipe. This is much more manageable; the downside is that the original client’s information is lost in translation. To ease this problem, haproxy designed a protocol called PROXY (I can haz a more searchable name?). In short, it prepends the original request’s source and destination addresses to the TCP connection or UDP stream. As stated in the above document, this also solves the multiple-return-path problem, because we are initiating a new TCP connection/UDP stream. Unfortunately, this solution is invasive, as it requires the backend service to support the PROXY protocol explicitly. Fortunately we have mmproxy. It accepts PROXY protocol traffic, unwraps it and then forwards it to the upstream. Moreover, it does so transparently. The original mmproxy does not support UDP, while the go implementation go-mmproxy does.
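
    For reference, a PROXY protocol v1 header is just one human-readable line, terminated by CRLF, sent before the application data (the addresses and ports below are made up):

    PROXY TCP4 192.168.0.1 10.0.0.1 56324 443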

  • aioproxy

    mmproxy is great when working with envoy. But it does not multiplex ports like sslh, cannot forward non-transparently, and does not work with non-PROXY-protocol traffic. A non-transparent proxy is useful when the original requester, the proxy and the backend server are all on the same host (see below).

    • How transparent proxy works

      Let cip be the client IP, pip the proxy IP and sip the backend server IP.

      • Client connection: cip:45678 -> sip:22. The client tries to connect to sip:22, but actually connects to the transparent proxy.
      • Transparent proxy downstream connection: cip:45678 -> pip:44443. The transparent proxy accepts traffic from cip:45678; the traffic originally targeting sip:22 is redirected to pip:44443 by netfilter.
      • Transparent proxy upstream connection: cip:45678 -> sip:22. The transparent proxy establishes a new connection to sip:22 and sets the socket source address to cip:45678 with the help of IP_TRANSPARENT (without it, the connection would be pip:ephemeral -> sip:22).
      • Backend server connection: cip:45678 -> sip:22. The backend server is fooled by the socket address; this connection is actually started by the transparent proxy. If the transparent proxy stands right in the middle of the return path from the backend server to the client, it can pick up the return packets on its upstream connection and send them back to the client on behalf of the backend server over its downstream connection.
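
      The redirection and return-path plumbing in the second step is typically done with TPROXY rules on the proxy host. A rough sketch following the standard kernel TPROXY recipe (the mark, routing table number and ports are placeholders):

      # mark packets that already belong to a socket owned by the transparent proxy
      iptables -t mangle -N DIVERT
      iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
      iptables -t mangle -A DIVERT -j MARK --set-mark 1
      iptables -t mangle -A DIVERT -j ACCEPT
      # deliver marked packets locally even though they are not addressed to this host
      ip rule add fwmark 1 lookup 100
      ip route add local 0.0.0.0/0 dev lo table 100
      # redirect new connections destined to sip:22 to the proxy listening on 44443
      iptables -t mangle -A PREROUTING -p tcp --dport 22 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 44443
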
    • What could go wrong when client and transparent proxy are on the same host

      If the client and the transparent proxy are on the same host (127.0.0.1), both of them will try to bind 127.0.0.1:45678, which fails with Address already in use.

    • What could go wrong when we chain more than one transparent proxy

      Similarly, if we use the chain client <-> envoy <-> mmproxy <-> sslh <-> ssh and both mmproxy and sslh are configured to proxy transparently, the same bind error should occur (I have not tried it, but I expect it to fail).

      So it is sometimes useful to proxy non-transparently, and it would be great to have an all-in-one proxy which intelligently unwraps PROXY protocol traffic (and when it fails to do so, simply treats it as normal traffic and forwards it), supports transparent proxying to the upstream and multiplexes ports for different protocols.

      Here is my take on this problem. Aioproxy has rudimentary solutions for all the above problems. There are a few things I intended to add. First, support for multiplexing more protocols, most notably peeking into the SNI and forwarding the connection accordingly. Second, as discussed above, things can go wrong when the client and the transparent proxy are on the same host. We need intelligent transparent forwarding, i.e. when the client and the transparent proxy are on the same host, do not reuse the client’s address tuple. At this point, aioproxy has been abandoned in favor of caddy-l4. Caddy-l4 is not mature yet, but it has much greater potential, as we can reuse everything caddy already provides.

  • envoy+traefik+aioproxy

    This is my current setup. Envoy, traefik and aioproxy are a great match. A client connection to my edge proxy is wrapped with the PROXY protocol by envoy and forwarded to traefik. Depending on the packet format, traefik forwards HTTP traffic to docker or Kubernetes and other TCP traffic to aioproxy (this works by setting an SNI rule that matches any host, see here); the PROXY protocol header is automatically peeled off when possible. It is not transparent to aioproxy, and I don’t intend to optimize that for now. In fact, it would be better to insert aioproxy in front of traefik, as this way every service would be ignorant of the proxy. But I hadn’t implemented the intelligent transparent proxying mentioned above (it is fairly easy, I was just fairly lazy). It’s now done. Otherwise there would be problems when the client and the transparent proxy are on the same host, which is a frequent use case for me.

Intermission: Split Horizon DNS

I have a few ways to access my services. When I use my own devices, I can just access my services over the overlay networks: my devices are part of the overlay network, so I can reach services via a stable address within 10.144.0.0/16. Overlay networks are magic. They automatically select paths for me, e.g. when two of my devices are in the same network, they connect over the LAN address, otherwise they connect over the WAN. Overlay networks can transparently do NAT-PMP/UPnP and punch holes; when one device is behind an impenetrable NAT, they automatically select a relay. I may also want to make part of my services available outside the overlay network. In that case, access to the services is proxied by two publicly accessible VPSes, which forward traffic as described above. The problem is that my VPSes live in Far Far Away. I don’t want to travel around the world when I am already inside the overlay network. Can my device be intelligent enough to try the overlay network first and fall back to the VPSes when that fails? This is the well-known problem of split-horizon DNS: I have a stable domain name service-a.example.com, and I want it to resolve to 10.2.3.4 when I am in the corporate network (or using a VPN), and to 1.2.3.4 otherwise. Here are a few solutions. By the way, this is a great read on this problem.

Hosts

The easiest and the most abominable solution. The downsides are

  • no wildcard support on Windows or Linux
  • no flexibility: you cannot gracefully fall back to another host or easily add another entry

Nsswitch

If you have ever used mDNS, you may wonder how abc.local is resolved to the host abc. The secret sauce lies in the following stanza of /etc/nsswitch.conf.

hosts:     files mdns_minimal [NOTFOUND=return] mymachines resolve [!UNAVAIL=return] dns mdns myhostname

Here, mdns_minimal and mymachines are dynamic libraries used by NSS to resolve hosts; they provide resolution of mDNS hosts and machinectl hosts respectively. Theoretically, I could just write another nsswitch plugin like mdns_minimal, but nsswitch is also an abomination. It is glibc-only, so musl-linked and statically linked binaries would fail. As a matter of fact, supporting mDNS on musl is a future idea, and golang falls back to the cgo (glibc) resolver when the hosts entry in nsswitch.conf is too complicated. So it is not worth the effort to fiddle with nsswitch.

Coredns

I found salvation in coredns. Here is how I resolve domain names with coredns, enriched by coredns-mdns and coredns-alternate. The source code of this coredns instance is here.

.:5355 {
    template IN A mydomain.tld {
      match ^(|[.])(?P<p>.*)\.(?P<s>(?P<h>.*?)\.(?P<d>mydomain.tld)[.])$
      answer "{{ .Name }} 60 IN CNAME {{ if eq .Group.h `hub` }}hub_hostname{{ else }}{{ .Group.h }}{{ end }}.{{ .Group.d }}."
      fallthrough
    }
    template IN AAAA mydomain.tld {
      match ^(|[.])(?P<p>.*)\.(?P<s>(?P<h>.*?)\.(?P<d>mydomain.tld)[.])$
      answer "{{ .Name }} 60 IN CNAME {{ if eq .Group.h `hub` }}hub_hostname{{ else }}{{ .Group.h }}{{ end }}.{{ .Group.d }}."
      fallthrough
    }
    mdns mydomain.tld
    alternate original NXDOMAIN,SERVFAIL,REFUSED . 1.0.0.1 8.8.4.4 9.9.9.9 180.76.76.76 223.5.5.5
}

The Corefile above does the following things.

  • CNAME *.hostname.mydomain.tld to hostname.mydomain.tld
  • Let hostname.mydomain.tld be resolved to hostname.local by coredns-mdns
  • Forward anything not matched or not resolved here to real-world DNS servers

To resolve hostname.local, I use avahi to announce the workstation hostname. This solution is particularly elegant, in the sense that every host only needs to configure itself. To use this DNS server for all applications, I configured systemd-resolved here. It is also possible to make other devices in the overlay network use this DNS server; I haven’t done that yet.

Multicasting

LLMNR and mDNS can be leveraged to resolve hosts if your VPN supports multicasting (zerotier does, tailscale doesn’t yet). The downside is that most resolvers only support single-label names for LLMNR, and the `.local` suffix is required for mDNS, so you cannot easily resolve a usual domain like `test.example.com` to the host `test`. The solution is to use coredns as described above.
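
Enabling both protocols in systemd-resolved is a small amount of configuration; a sketch (mDNS must additionally be turned on per link, and the interface name below is a placeholder):

# /etc/systemd/resolved.conf
[Resolve]
LLMNR=yes
MulticastDNS=yes

# then, per link (placeholder interface name):
#   resolvectl mdns ztabcdef01 yes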

L7

Now that we can resolve domains to the desired hosts, we can access services directly in the browser.

TLS Certificates and Termination

I use ACME with the DNS-01 challenge. My DNS service provider is cloudflare. From letsencrypt, I get free wildcard certificates for *.hostname.mydomain.tld and *.local.mydomain.tld, and optionally also some alias domains like *.hub.mydomain.tld. The certificates are obtained by setting the NixOS security.acme options and are shared between multiple applications. Currently, TLS is terminated by traefik using the above certificates.
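
A minimal sketch of what such a security.acme entry can look like (domain names, email and the credentials path are placeholders, not my actual configuration):

security.acme = {
  acceptTerms = true;
  defaults.email = "acme@mydomain.tld";
  certs."hostname.mydomain.tld" = {
    domain = "*.hostname.mydomain.tld";
    extraDomainNames = [ "*.local.mydomain.tld" ];
    dnsProvider = "cloudflare";
    credentialsFile = "/run/secrets/cloudflare-acme";
  };
};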

Edge Routing

Given that TLS termination is not handled by the edge routers, we can only do intelligent routing based on SNI. As far as I can tell, envoy’s SNI dynamic forward proxy relies heavily on the DNS server to find out which backend server to forward traffic to. This is less than ideal in my use case, because with the help of systemd-resolved’s LLMNR support (mDNS must be manually enabled for each interface, so LLMNR is easier to use), I can easily resolve hostnames. All I need is to obtain a new hostname from the original hostname with a simple regex. I chose nginx over envoy to do that. Here is my nginx configuration.

user  nginx;
worker_processes  auto;

error_log  /var/log/nginx/error.log notice;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}

stream {
    log_format format '$remote_addr [$time_iso8601] '
                     '$protocol $status $bytes_sent $bytes_received '
                     '$session_time "$upstream_addr" '
                     '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"';
    access_log  /var/log/nginx/access.log format;

    map $ssl_preread_server_name $ssl_backend {
        ~^([^.]+\.)*alias\.[^.]+\.[^.]+$   real-server:$server_port;
        ~^([^.]+\.)*(?P<my_hostname>[^.]+)\.[^.]+\.[^.]+$   $my_hostname:$server_port;
        default default-server:$server_port;
    }

    map $hostname $backend {
        ~^([^.]+\.)*alias\.[^.]+\.[^.]+$   real-server:$server_port;
        ~^([^.]+\.)*(?P<my_hostname>[^.]+)\.[^.]+\.[^.]+$   $my_hostname:$server_port;
        default $ssl_backend;
    }

    resolver 127.0.0.53 ipv6=off;

    server {
        listen 0.0.0.0:80 reuseport;
        listen 0.0.0.0:2022 reuseport;
        listen 0.0.0.0:2122 reuseport;
        listen 0.0.0.0:2222 reuseport;
        listen 0.0.0.0:443 reuseport;
        listen 0.0.0.0:4443 reuseport;
        listen 0.0.0.0:4000 reuseport;
        listen 0.0.0.0:5678 reuseport;
        listen 0.0.0.0:8080 reuseport;
        listen 0.0.0.0:80 udp reuseport;
        listen 0.0.0.0:2022 udp reuseport;
        listen 0.0.0.0:2122 udp reuseport;
        listen 0.0.0.0:2222 udp reuseport;
        listen 0.0.0.0:443 udp reuseport;
        listen 0.0.0.0:4443 udp reuseport;
        listen 0.0.0.0:4000 udp reuseport;
        listen 0.0.0.0:5678 udp reuseport;
        listen 0.0.0.0:8080 udp reuseport;
        proxy_pass $backend;
        proxy_protocol on;
        ssl_preread on;
    }
}

With this configuration, nginx can forward both HTTP and HTTPS traffic based on the TLS SNI or the HTTP hostname. $my_hostname here is resolved by systemd-resolved. I also added some aliases to simplify management of domain name prefixes which do not have a backing hostname. This is needed as I found no easy way to add an alias to an existing domain (c.f. this issue); besides, I didn’t find any good LLMNR responder with customizable aliases. Note that this is particularly easy to manage, as nginx need not know the domains it serves. I can add more edge proxies as needed; they all work the same way. I can also add more backend servers as needed; all they need to do is respond to LLMNR requests.

Service and Routing Registration

Service and router registration is done in a self-organizing way. I don’t use subpath routing rules, as they may require extra path-rewriting work. Routing is matched on Host only; all my services have dedicated domains. Cloudflare provides wildcard DNS resolution, and my coredns configuration above also resolves domain names in a wildcard-matching fashion.

  • Fixed Services and Routings

    Generated from nix expressions. I am obliged to praise how easily nix (a real programming language, albeit a weak one) eliminates boilerplate. Why is everyone trying to use some half-baked configuration format? Can we have a good language for general configuration? Spoiler alert: dhall-lang.

  • Docker

    This is managed by traefik with the docker provider. All I need to do is add a label to the container. Traefik automatically picks up the label and sets up a routing rule according to the defaultRule. My rule is to use the domainprefix label when applicable and otherwise fall back to the container name.

    providers = {
      docker = {
        defaultRule = getRule
          ''{{ (or (index .Labels "domainprefix") .Name) | normalize }}'';
      };
    }
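
    As an illustration (a hypothetical docker invocation, not how my containers are actually created; mine come from nix, see Deployment below), a container labeled like this would be routed by its domainprefix instead of its name:

    docker run -d --name wallabag-app \
      --label traefik.enable=true \
      --label domainprefix=wallabag \
      wallabag/wallabag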
    
  • Kubernetes

    Just the usual Kubernetes ingress. I passed the k3s kubeconfig to traefik via a systemd environment variable here. Traefik automatically applies Kubernetes ingress rules.
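
    A minimal ingress for such a service looks roughly like this (names and the domain are placeholders):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: whoami
    spec:
      rules:
        - host: whoami.hostname.mydomain.tld
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: whoami
                    port:
                      number: 80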

Deployment

I currently use nix to manage all my personal devices and ansible to manage the cloud resources. Most services are managed by nix; when nix becomes too unwieldy, I resort to Kubernetes. An ideal setup would use terraform to provision cloud resources and nix to manage all services, including the Kubernetes ones. This is currently not possible for me because, firstly, many resources I use have no terraform provider. Secondly, nix does not support ad hoc variable assignment the way terraform and ansible do; it is possible to pass variables from the command line, but it is not pleasant to use. Thirdly, Kubernetes requires a lot of dedication, and nix currently can’t manage Kubernetes efficiently.

Nix

Nix is a much more declarative, reliable and reproducible way to build infrastructure. Here is a short introduction. In short, building NixOS profiles is like building docker images: you build a new image and run a container with that image as its base. The image itself is immutable; when you change your code, you need to build a new image. Likewise, when you need a new operating system configuration, you build a new NixOS profile and switch to it. The best thing about NixOS is that nearly every aspect of the OS is tunable by NixOS options, and the knobs are expressed in the purely functional, lazy language nix.

Docker

I manage docker containers declaratively with nix. A typical docker container configuration is

mkContainer "wallabag" prefs.ociContainers.enableWallabag {
  dependsOn = [ "postgresql" ];
  environment = {
    "SYMFONY__ENV__DOMAIN_NAME" =
      "https://${prefs.getFullDomainName "wallabag"}";
  };
  traefikForwardingPort = 8978;
  middlewares = [ "authelia" ];
  volumes = [
    "/var/data/wallabag/data:/var/www/wallabag/data"
    "/var/data/wallabag/images:/var/www/wallabag/web/assets/images"
  ];
  environmentFiles = [ "/run/secrets/wallabag-env" ];
}

mkContainer is a function that makes a new container. If prefs.ociContainers.enableWallabag is true, nix will create a container named wallabag which depends on the postgresql container and has such and such volumes and environment variables. The environmentFiles are also read to set up environment variables; the file /run/secrets/wallabag-env is managed by sops-nix and is version-controlled. I also specified the authelia middleware for traefik, which means that not everyone is allowed to access this service.

Service Discovery

This is easy. Docker containers within the same bridge network can reach each other by container name.

Configmaps and Secrets

I use the docker command line flags --env and --env-file to pass my configuration as container environment variables. To mount secrets the way Kubernetes does, I use docker volumes. The secrets are managed by sops-nix, which generates secret files according to my sops.yaml file.
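
A sketch of how such a secret is declared with sops-nix (file names and the unit name are placeholders; by default the decrypted file ends up under /run/secrets):

sops = {
  defaultSopsFile = ./secrets/secrets.yaml;
  secrets."wallabag-env" = {
    # decrypted to /run/secrets/wallabag-env at activation time
    restartUnits = [ "docker-wallabag.service" ];
  };
};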

Init Containers and Jobs

Kubernetes init containers are sometimes used to manage pod/service dependencies. For this specific use case, init containers are ugly hacks; using systemd to manage container dependencies is much more elegant. I only need to specify dependsOn in my nix file, e.g. dependsOn = [ "postgresql" ]; above. I override the ExecStartPost option of the systemd unit to do initialization jobs. Kubernetes jobs are just more containers, while cronjobs are just containers with systemd timers.
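
A sketch of both tricks, assuming the container’s unit follows the docker-<name> naming convention (the unit names and commands are placeholders):

# run an initialization step once the wallabag container is up
systemd.services."docker-wallabag".serviceConfig.ExecStartPost = [
  "${pkgs.docker}/bin/docker exec wallabag bin/console --env=prod doctrine:migrations:migrate -n"
];

# a "cronjob" is just another container plus a systemd timer
systemd.services."wallabag-export" = {
  startAt = "daily"; # NixOS generates the matching timer unit
  script = "${pkgs.docker}/bin/docker run --rm my-export-image";
};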

Ingress

See routing.

Ansible

As much as I love NixOS, I don’t use nix for everything; nix does not play well with some technologies. I use ansible for two purposes: first, setting up cloud resources (like setting up tailscale and envoy); second, managing Kubernetes. Kubernetes is declarative, but using the command line to manage Kubernetes is imperative, so I use community.kubernetes. One pleasant side effect of using ansible to manage Kubernetes is that what I did and what I still need to do are well-documented.

Kubernetes

My Kubernetes distribution is k3s (provisioned by nix). Each Kubernetes cluster contains exactly one node for the time being. There are a few edge cases where I can’t simply use nix and docker. Jupyterhub and eclipse che are the major ones, as they need to provision cluster resources dynamically, e.g. they need to spawn new containers on user request. This is doable with the vanilla docker spawner for jupyterhub, but I don’t think che supports it natively, so using Kubernetes is much preferable.

Security

Authentication and Authorization

Setup

I use authelia for authentication and authorization. I created a ForwardAuth middleware for traefik, which works like nginx’s auth_request. Upon receiving a client request, depending on the routing, traefik may initiate a subrequest to authelia, possibly with the necessary client credentials. If authelia is able to authenticate the user and authorize the request, the client request is forwarded to the backend service with some extra headers containing the user’s information. There is no such thing as authorization yet, though; it’s only me using my services.
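
The middleware itself is a few lines of traefik dynamic configuration; sketched here as the nix attrset I would generate it from (the authelia address and auth portal domain are placeholders):

http.middlewares.authelia.forwardAuth = {
  address = "http://authelia:9091/api/verify?rd=https://auth.mydomain.tld";
  trustForwardHeader = true;
  authResponseHeaders =
    [ "Remote-User" "Remote-Groups" "Remote-Name" "Remote-Email" ];
};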

Weakness

Authelia is not satisfactory in many respects. First, its policy engine is not flexible enough. Second, it requires a lot of boilerplate in the configuration, e.g. I need to specify many hard-coded full domains like hostname-a.mydomain.tld instead of just hostname-a. This is not desirable, as I have many suffixes and the configuration is shared (see the sketch below).
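
The kind of boilerplate I mean, sketched with placeholder domains:

access_control:
  default_policy: deny
  rules:
    # every host/suffix combination has to be spelled out in full
    - domain: "service-a.hostname-a.mydomain.tld"
      policy: one_factor
    - domain: "service-a.hostname-b.mydomain.tld"
      policy: one_factor
    - domain: "service-a.hostname-a.otherdomain.tld"
      policy: one_factor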

Strength

What I really like about authelia is its simplicity and easy integration with traefik.

Future

I want to use a beyondcorp-style identity-aware proxy with open policy agent support some day. The last time I checked pomerium, I found envoy hard to package and pomerium too oidc-centric; most of all, it did not support ldap or other local user databases.

SSO

Authelia just landed openid connect support; I haven’t tried it yet. One more thing about authelia: I currently use a single text file as the account backend. I have set up openldap on my machines, but I haven’t wired it up to authelia yet. I intend to use freeipa instead (I tried the container, but systemd within the container didn’t work), which is much more versatile.

Intrusion Prevention

Because of my distrust of other people’s computers, I intentionally made my edge proxy as dumb as possible. There ain’t no such thing as an intrusion detection system yet. Setting up fail2ban is easy, but I would need to integrate it with traefik and aioproxy.

Backup

syncoid

I use syncoid for on-site backups. Syncoid is basically a zfs send | zfs receive wrapper. With a naive rsync -avzh --progress, I can easily end up with database corruption; thanks to zfs’s hard work, I don’t have to worry about consistency. Syncoid also works incrementally. Another advantage of this method is that I can easily restore an entire zpool, but it requires a lot of free space and may take a while to finish. I attached an external disk to my main computer for this.
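
A typical invocation is a one-liner along these lines (pool and dataset names are placeholders):

# recursively replicate a dataset and its snapshots to the attached backup disk
syncoid --recursive tank/var-data backup/var-data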

restic

I use restic for off-site backups. Among the incremental backup tools, restic has two distinctive features: first, it supports all rclone backends; second, I can back up different directories from different hosts to the same endpoint. Here is my nix configuration.

restic = {
  backups = let
    go = name: conf: backend: {
      "${name}-${backend}" = {
        initialize = true;
        passwordFile = "/run/secrets/restic-password";
        repository = "rclone:${backend}:restic";
        rcloneConfigFile = "/run/secrets/rclone-config";
        timerConfig = {
          OnCalendar = "00:05";
          RandomizedDelaySec = "5h";
        };
        pruneOpts = [
          "--keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 75"
        ];
      } // conf;
    };
    mkBackup = name: conf:
      go name conf "backup-primary" // go name conf "backup-secondary";
  in mkBackup "vardata" {
    extraBackupArgs = [ "--exclude=postgresql" ];
    paths = [ "/var/data" ];
  };
};

I back up my data every day to two storage backends. For some files, I need to manually tune the backup process. For example, to back up a postgresql database, I need to run pg_dump first, which may lock the whole table.
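
A dump step along these lines runs before the backup, so restic picks up a consistent SQL file instead of the live data directory (the database name and paths are placeholders):

# dump the database into the backed-up directory; /var/data is what restic archives
pg_dump --username=wallabag wallabag > /var/data/dumps/wallabag.sql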

Observability

I use grafana, loki and prometheus for observability. I can’t praise this squad enough for how simple it is to set up: I basically just set up the components separately, and they just work. Also, it is a shared-nothing architecture, so to achieve high availability all I need to do is add a new remote-write target. For that, I use grafana cloud.

Metrics

Prometheus is pull-based. It is quite easy to obtain node data from the node exporter, and almost all services now expose prometheus metrics. I enabled quite a few prometheus exporters (e.g. systemd, node, postgresql), whose data are sent both to my local machine and to grafana cloud.
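
Enabling the exporters and scraping them looks roughly like this on NixOS (ports and the remote-write URL are placeholders, and the grafana cloud credentials are omitted):

services.prometheus = {
  enable = true;
  exporters.node.enable = true;
  exporters.systemd.enable = true;
  scrapeConfigs = [{
    job_name = "node";
    static_configs = [{ targets = [ "127.0.0.1:9100" ]; }];
  }];
  remoteWrite = [{
    # grafana cloud push endpoint; credentials omitted here
    url = "https://prometheus-us-central1.grafana.net/api/prom/push";
  }];
};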

Logs

Loki lives up to its promise: like prometheus, but for logs. The data are collected by promtail and sent to my local machine and to grafana cloud. Most of my logs are stored in the systemd journal, and it is quite easy to collect them with promtail.
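
The journal-scraping part of a promtail configuration boils down to something like this (a sketch with placeholder labels, written as the nix attrset NixOS turns into promtail’s YAML):

services.promtail.configuration.scrape_configs = [{
  job_name = "journal";
  journal = {
    max_age = "12h";
    labels = { job = "systemd-journal"; };
  };
  relabel_configs = [{
    source_labels = [ "__journal__systemd_unit" ];
    target_label = "unit";
  }];
}];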

Visualization

Grafana.

TODO Alerts

Alertmanager.

Continuous Integration/Continuous Delivery

Worker

Github is quite generous with its github actions offering. The free machines’ performance is quite good; no wonder many miners try to abuse them. As good as github actions is, there are two nuisances for my usage.

  • disk size. The closure size of my top-level system profile easily exceeds the size limit, so I need to clean up some packages to get more free disk space. Some of my machines’ profiles can be as large as 70G; there is no way for github actions to build a profile that large.

  • running time limit. Nix channel updates can invalidate many binary caches, and I need to build so many packages that the github actions workflow frequently times out. I need to manually rerun it and cache my build artifacts with cachix.

Artifacts store

Most nix builds are reproducible. The nix derivation output path depends on the hashes of the build inputs, so given the same inputs, we can easily check whether there is a valid binary cache for the output. I use cachix to cache my builds; think of cachix as a docker container registry for nix store paths. It is quite straightforward to use the cachix action. I also set up cachix on my local machines, so that I can use the build results of the github actions workers. It greatly reduces the build time on my local machines.
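
Locally, the whole workflow is roughly two commands (the cache name is a placeholder):

# trust the binary cache on consuming machines
cachix use my-cache

# push build results to the cache (this is what the CI job effectively does)
nix-build | cachix push my-cache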

Deployment

I use deploy-rs to deploy my nixos configuration to the target machines. deploy-rs reads my flake.nix and builds the profile on the machine running the deploy-rs command. It then copies the profile to the target machine via ssh. Depending on my configuration, the target may instead download binary caches from substituters first (which saves time by avoiding a possibly slow ssh connection). It should be noted that deploy-rs builds the profile on the local machine; this is important for me, as many of my machines are not powerful enough to build a profile quickly. deploy-rs also has elementary sanity checks, e.g. it automatically rolls back to the previous generation of the profile if the ssh connection does not come back after switching to the new profile. The only remaining complication is ssh connectivity.
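
A deploy-rs node definition in flake.nix looks roughly like this (hostnames and the ssh user are placeholders):

deploy.nodes.my-host = {
  hostname = "my-host.mydomain.tld";
  profiles.system = {
    user = "root";
    sshUser = "deploy";
    path = deploy-rs.lib.x86_64-linux.activate.nixos
      self.nixosConfigurations.my-host;
  };
};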

Node Connectivity

To establish connectivity from the github actions runner to my server, I use wstunnel. Well, this time I use a port-mapping solution. wstunnel digs tunnels over websocket, and I have described above at length how I can access my services over http, so setting up a tunnel is quite a no-brainer. All I need to do is run wstunnel in server mode, set up a routing rule for it, and then ssh -o ProxyCommand="wstunnel --upgradePathPrefix=some-superb-secret-path -L stdio:%h:%p wss://wstunnel.example.com" remote-machine. I keep the routing path some-superb-secret-path secret so that other people cannot establish a tunnel to my machine.

Secrets management

One more thing: how to make the github actions runner’s ssh connection to my machines more secure. I fully agree with the sentiment of this article: we should use ssh certificates as much as possible. The question now is how to use ssh certificates securely. I need a system to automatically issue short-lived certificates, and this system must be fully programmable. Smallstep certificates is not good in terms of programmability, so I use the Hashicorp Vault ssh secrets engine for this. Here is how I use vault to issue short-lived ssh certificates.
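
The vault side boils down to a few commands along these lines (the role name, TTL, user and key path are placeholders, not my actual policy):

# one-time setup: mount the engine and create a signing role
vault secrets enable -path=ssh-client-signer ssh
vault write ssh-client-signer/config/ca generate_signing_key=true
vault write ssh-client-signer/roles/ci-deploy \
    key_type=ca allow_user_certificates=true \
    allowed_users=deploy default_user=deploy ttl=5m

# in CI: obtain a short-lived certificate for the runner's public key
vault write -field=signed_key ssh-client-signer/sign/ci-deploy \
    public_key=@id_ed25519.pub > id_ed25519-cert.pub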

Server Management

wstunnel

Adding the following to my ssh config,

Host wstunnel.*
    CheckHostIP no
    ProxyCommand wstunnel --upgradePathPrefix=some-superb-secret-path -L stdio:$(echo %h | cut -d. -f2):%p wss://%h.example.com

I am now a road warrior who can access my servers anytime, anywhere.

ttyd

ttyd is a web-based terminal. I added a route for ttyd, so I can manage my servers through a web browser.

aioproxy

As stated above, aioproxy can multiplex ssh and https on the same port, so I only need to open one port on my VPS.

Next Step

Kubernetes after All?

I abandoned my plan of using Kubernetes for everything. Currently, I restrain my usage of Kubernetes because, first, I haven’t found a satisfactory workflow for nix and Kubernetes, and second, I am beginning to feel that Kubernetes is the new c++. I sincerely hope I can declaratively manage Kubernetes with nix the way I manage docker and traefik with nix. I find integrating kustomize and kubenix interesting, but it is not there yet. Both nix and Kubernetes are overwhelming; they require you to go all-in. Nix is my daily driver and is definitely here to stay. I do need some Kubernetes features, like node affinity (jupyterhub requires a faster node) and proxying traffic received from any node. As I said, Kubernetes is like c++: extremely powerful, but also extremely complex and easily misused. I partially agree with “Let’s use Kubernetes!” Now you have 8 problems. I also find kubevela interesting; I haven’t tried it yet, and I hope it lives up to its promise. Nomad looks interesting as well; it may well suit che and jupyterhub, but they do not support nomad.

Configuration Database

Nix is great, but it is hard for the outside world to learn my nix configuration.

Security Hardening

Federated Storage

Personal Data Warehouse

Accounts (ldap)

Links to this note