Gemini reverse-proxying with OpenBSD and `relayd(8)` (part 2)
The previous part of this post outlined how you can use OpenBSD's built-in `relayd` daemon to set up a Gemini TCP reverse proxy. This is useful when, say, you are self-hosting things and do not wish to expose your private IP address to the wild west of the internet. But `relayd(8)` is not just a relay: it also has very good load-balancing and failover capabilities.
As vigilia.cc is self-hosted, and since we moved, the constant wind on the hilltop we live on sometimes seems to blow away the 5G molecules in the air (either that, or I need a stronger antenna, whichever seems more plausible to you, dear Reader), the failover capability seemed like a very, very good idea. I wanted to create a "status" page, served up from the cloud, which can check on the home box and tell me and my readers if something is wrong.
The idea is that if the capsule goes down, because a _parliament_ of birds lands on the tree between the 5G router in the window and the closest mast and the 5G particles are suddenly scared away by their mean and greedy looks, then opening vigilia.cc would greet you with a status page informing you of the sad events.
But how to do it?
The Plan, as Inspired by Manpages and Mastodon
Most of the very handy `relayd` functionalities, such as filtering based on headers, hosts, or requests, are HTTP-specific. It can do basic TCP proxying, and it can even mix in TLS, but it will not "understand" custom protocols like Gemini. And, after reading through the manpages for `relayd.conf(5)` and summoning the #OpenBSD hive mind on Mastodon (shoutout to @morgant [^FN] and @solene [^FN] for their help, and to @thorstenzoeller [^FN] and @continue [^FN] for support), it also turned out that there is no way in `relayd` to route "raw TCP" traffic based on Server Name Indication / SNI (i.e. the "hostname" the request is sent to).
This poses a difficulty because I want _all_ Gemini traffic to be accessible over the default port for Gemini, 1965, because nobody ever remembers or bothers with servers running on different ports (it is why God gave us subdomains). But without SNI filtering there is no way for me to "split" the traffic arriving at port 1965 of the VPS; there is no way for me to route some of those requests to the `gmid(8)` instance running in the cloud on OpenBSD.amsterdam [^FN] and the rest to my home server.
This lack of support for SNI filtering did give me a day or two of headaches. In the process of "thinking out loud" over Mastodon I realised that I could run _two_ server blocks on the VPS, serving up exactly the same content: one would be the "standard" status page and the other would be the "fallback" page. I also remembered that `gmid(8)` has a proxying feature which, combined with the two-block server setup, could just do the trick.
So after giving it some thought and sleep, I came up with this:
VPS
┌──────────────────────────────────────┐
│relayd │ Home server
│┌─────────────────┬────────────────┐ │ ┌───────────────────────────────────────────┐
││Gemini-servers │all 1965 traffic┼──┼───────►gmid │
│├─────────────────┼────────────────┤ │ │┌──────────────────┬──────────────────────┐│
┌───┼┼Fallback │if check fails │ │ ││vigilia.cc │ home_ip:1965 ││
│ │└─────────────────┴────────────────┘ │ ││*.vigilia.cc │ home_ip:1965 ││
│ │ │ ││other gmi servers │ home_ip:1965 ││
│ │gmid │ │┌──────────────────┼──────────────────────┤│
│ │┌──────────────────┬─────────────────┐│ ┌────┼┼status.vigilia.cc │ proxies home_ip:1965 ││
└───┼►vigilia.cc │ public_ip:1967 ││ │ ││ │ to ││
│├──────────────────┼─────────────────┤│ │ ││ │ VPS-internal_ip:1966 ││
││status.vigilia.cc │ internal_ip:1966◄┼──┘ │└──────────────────┴──────────────────────┘│
│└──────────────────┴─────────────────┘│ └───────────────────────────────────────────┘
└──────────────────────────────────────┘
In more detail, the following happens:
- A request is made to "*.vigilia.cc", which is pointed to the OpenBSD VPS.
- VPS listens to all 1965 traffic with `relayd(8)`
- VPS checks if the home server is up over Tailscale by pinging home_ip (TODO: write a proper, gemini-specific checking script)
- If the check succeeds, port 1965 traffic is relayed to home_ip:1965
- If the check fails, port 1965 traffic is relayed to the status page running on the VPS, listening on public_ip:1967 (you can check this out by opening vigilia on port 1967). [^FN]
- VPS also serves "status.vigilia.cc" from the same status directory, listening only on the Tailscale IP (internal_ip) on port 1966 (so as not to confuse `relayd`), waiting for connections to be bounced back here
- Traffic arriving to public_ip:1965 on the VPS is forwarded to the home server listening over home_ip:1965
- Home server serves up whatever is asked of it
- Home server proxies "status.vigilia.cc" back to the VPS's internal_ip:1966 to serve up the status page.
The status page "round robin" solution is possibly a bit convoluted, but it serves its purpose really well. Most importantly, it allows seamless access to the page over port 1965.
The Config
On the VPS
First, we set up `gmid(8)` to serve the status page:
# ...snip...
vps_local_ip = "100.32.200.100"
public_ip = "46.23.93.41"

server "status.vigilia.cc" {
    listen on $vps_local_ip port 1966
    root "/status.vigilia.cc"
    cert $vigilia_pem
    key $vigilia_key
    log on

    location "/" {
        fastcgi {
            socket "/cgi.sock"
        }
    }

    location "*" {
        fastcgi {
            socket "/cgi.sock"
        }
    }
}
server "vigilia.cc" {
    listen on $public_ip port 1967
    cert $vigilia_pem
    key $vigilia_key
    root "/status.vigilia.cc"
    log on

    location "/" {
        fastcgi {
            socket "/cgi.sock"
        }
    }

    location "*" {
        fastcgi {
            socket "/cgi.sock"
        }
    }
}
So as you can see, `gmid(8)` basically serves the same content over two blocks, under different domain names: once over the public IP and once over the internal IP. This is because "status" will receive incoming connections from the home server over Tailscale, but when the home server fails I still need to serve up the status page as fallback. Speaking of fallback, here's the `relayd` config:
public_ip = "46.23.93.41"
home_server = "100.32.200.1"
geminiport = "1965"

log connection

tcp protocol "gemini" {
    tcp { nodelay, socket buffer 1024 }
}

table <geminiservers> {
    $home_server
}

table <fallback> {
    $public_ip
}

relay "gemini_with_fallback" {
    listen on $public_ip port $geminiport
    protocol "gemini"
    forward to <geminiservers> port 1965 check icmp
    forward to <fallback> port 1967
}
So as before in part 1, we define some macros, then define a "gemini" TCP protocol. Then comes the change:
table <geminiservers> { ...
Tables, according to the `relayd.conf(5)` man page, are usually used for load balancing. You can specify multiple hosts in a table, and `relayd(8)` will route to one of the servers in the table depending on the configured scheduling mode. Really useful if you have various servers scattered around the world.
Both for <geminiservers> and <fallback> we specify a single host. We could likely just have set them up as macros (like we did at the beginning with "home_server" or "geminiport"), but to be able to use the "check" directive later in the relay, we need to use tables. (I think.)
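For illustration, if I ever get a second mirror of the capsule, the same table could hold several hosts and `relayd` would balance between them. The addresses and the scheduling mode below are made up for the example:

```
# Hypothetical multi-host table; the IPs are examples only.
table <geminiservers> { 198.51.100.10 203.0.113.20 }

relay "gemini_balanced" {
    listen on $public_ip port 1965
    protocol "gemini"
    # pick the next live host in turn; "check icmp" skips dead ones
    forward to <geminiservers> port 1965 mode roundrobin check icmp
}
```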
relay "gemini_with_fallback" { ...
So this part does the actual work. Having set up the previous parameters, it pretty much does what it says! (OpenBSD configs are sooo good, they are so easy to read).
The order of the "forward" directives matters. It goes line by line: first it will try to forward to the entries found in <geminiservers>, on port 1965, performing some basic "icmp" checking, i.e. pinging the hosts in the table to see if they are up. If the host(s) in the table don't respond to pings, it will continue to "forward to <fallback>", i.e. to the `gmid(8)` instance running on the VPS -- the status page.
So now all traffic to port 1965 comes in to the VPS, and the VPS handles it appropriately.
On the Home Server
On this side we just need to make sure that `gmid(8)` does what it is supposed to. This basically means setting up a new server block and, without giving it any content to serve, telling it to proxy all requests to $vps_local_ip:1966.
There are three "gotchas" regarding this:
- First, `gmid` seems to support a protocol called `proxy-v1`, which seemed like a good thing to add, but whenever I added it I got a syntax error on the config file. I will ask the author, Omar Polo [^FN], how it could be done, but he's been away from the keyboard a bit.
- Second, I added "verifyname off" because, I think, leaving it on would cause trouble with self-signed certificates. I might be mistaken, but it doesn't seem to hurt for internal connections like this.
- Third, when I *thought* all the bits were working properly, and all the configs seemed to be pristine, I still got an "Error 59 - Bad/malformed host" error when I tried to open "status.vigilia.cc". The VPS didn't understand something when the connection was proxied back to it. After delving into some verbose output and some logs and keeping too many terminal windows open in parallel, I noticed that after the request arrived at $home_server the SNI was somehow scrubbed during proxying. Fortunately `gmid.conf(5)` has a setting where you can manually specify an SNI to forward, so I added that; it is reflected in the code snippet below.
I am only reproducing the new server block:
# ...snip...
server "status.vigilia.cc" {
    listen on $home_server
    cert $vigilia_crt
    key $vigilia_key

    proxy {
        verifyname off
        sni "status.vigilia.cc"
        relay-to $vps_local_ip port 1966
    }
}
So this server block makes sure that "status" is forwarded to the internal Tailscale IP address on port 1966, as we specified in the VPS's `gmid(8)` config file above.
And basically this is it.
Some Outstanding Cherry for the Top (aka Shortcomings to Fix)
I'm quite happy with where this has gone for now, but there is one final thing I really want to sort out, and that is "correct" host checking.
I want to know three things:
- Is the server up?
- Is the gemini server listening?
- Is the gemini server healthy and is it serving content?
This is basically what "status" checks now, via a Python script calling "subroutines" (shell commands). First, it pings the $home_server IP address, and if there are replies, it gets the average round-trip time. Then it calls `nc(1)` and initiates a TLS connection to `gemini://vigilia.cc` on port 1965, and checks the first line of the server's response. If it starts with "20 text/gemini", it expresses its delight over `stdout`.
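As a rough illustration, here is a minimal Python sketch of such a check. It uses the standard-library `ssl` module instead of shelling out to `nc(1)`; the function names and the structure are made up for the example, and certificate verification is switched off, matching the self-signed certs used elsewhere in this setup:

```python
#!/usr/bin/env python3
# Illustrative sketch of the three-step "status" check described above.
# Not the real script; names and structure are hypothetical.
import socket
import ssl
import subprocess
import sys

def host_is_up(host: str) -> bool:
    """Step 1: does the machine answer a single ping?"""
    return subprocess.run(["ping", "-c", "1", host],
                          capture_output=True).returncode == 0

def gemini_header(host: str, port: int = 1965, timeout: float = 5.0) -> str:
    """Steps 2-3: open a TLS connection, send a Gemini request, and
    return the first line of the response header ("" on any failure)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False      # self-signed certs, as noted in the post
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                tls.sendall(f"gemini://{host}/\r\n".encode())
                # A Gemini response header is at most ~1029 bytes.
                return tls.recv(1029).decode("utf-8", "replace").split("\r\n")[0]
    except OSError:
        return ""

def capsule_is_healthy(host: str) -> bool:
    """The capsule is healthy if the reply starts with '20 text/gemini'."""
    return host_is_up(host) and gemini_header(host).startswith("20 text/gemini")

if __name__ == "__main__":
    print("all good!" if capsule_is_healthy(sys.argv[1]) else "something is wrong")
```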
And this is what I also need to do in a shell script.
Currently, as you will see above, the `relayd.conf(5)` file has a
check icmp
directive. This is a shortcoming, as it can only tell me whether the machine supposedly serving up Gemini is up or not. This means that the status page will only be displayed in place of "vigilia" if the host itself is entirely unreachable; if "only" the Gemini server crapped out, `relayd` is still going to try to relay traffic to it.
However, there's also the option to add a shell script as a check:
check script _path_
Execute an external program to check the host state. The program will be executed for each host by specifying the hostname on the command line:
/usr/local/bin/checkload.pl front-www1.private.example.com
relayd(8) expects a positive return value on success and zero on failure. Note that the script will be executed with the privileges of the "_relayd" user and terminated after timeout milliseconds.
Porting over the "status" script could work very well indeed.
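A sketch of what that could look like in `relayd.conf(5)`, assuming the ported script lives at a made-up path like /usr/local/bin/gemini-check, and remembering the inverted convention quoted above (the script must return a positive value when the host is healthy, zero when it is not):

```
# Hypothetical: replace the icmp check with a script check.
# relayd runs the script as the "_relayd" user, passing the host
# from the table as the only argument.
forward to <geminiservers> port 1965 check script "/usr/local/bin/gemini-check"
forward to <fallback> port 1967
```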
Another good checking method could be the "check send data" route - albeit I think I would need to specify some certs as well for it to work:
check send _data_ expect _pattern_ [tls]
For each host in the table, a TCP connection is established on the port specified, then data is sent. Incoming data is then read and is expected to match against pattern using shell globbing rules. If data is an empty string or nothing then nothing is sent on the connection and data is immediately read. This can be useful with protocols that output a banner like SMTP, NNTP, and FTP. If the tls keyword is present, the transaction will occur in a TLS tunnel.
Using this "check" method I could check whether a "20 text/gemini" banner is present. But as I said, it might require me to specify some certfiles for the TLS bit to work. Or it might not. I will give it a go!
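If it does work, the relay's first forward line might look something like the sketch below. This is untested: I am not yet sure how (or whether) the trailing CRLF of the request line can be written in the config string, nor whether client certificates would be needed for the TLS tunnel:

```
# Hypothetical "check send" health check: send a Gemini request over
# TLS and expect a success banner back (pattern uses shell globbing).
forward to <geminiservers> port 1965 \
    check send "gemini://vigilia.cc/\r\n" expect "20 text/gemini*" tls
```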
I will do some further research on this and will report back with part 3. Until then, happy browsing.
---
Revised: 2024-09-06 22:03 BST [Corrected some spelling etc]