---
title: "Hosting a static site from S3 via Caddy"
date: 2024-08-06T16:30:26+02:00
---

I had set up [MinIO](https://min.io/) as a way to self-host S3 buckets for an unrelated project. As a way to force myself to test that setup regularly (and to make my VPS setup more enterprise-y), I opted to host my site(s) from there. Previously I just had the files on an XFS filesystem like a caveman and had [Caddy](https://caddyserver.com/) serve them.

Using Caddy is really nice with its automatic SSL, sane config files and whatnot. Thus I wanted to keep using it. So instead of simply serving static files via the `file_server` directive, it now reverse proxies public S3 buckets served by MinIO.

This, at first, seemed pretty straightforward. A basic MinIO setup provides public buckets via a URL like `minio.host/$BUCKET/$OBJECT`. `$OBJECT` can, of course, be an identifier that resembles a directory structure. So the initial hunch was to simply configure something like this in the _Caddyfile_:

```
rewrite * /$BUCKETNAME{uri}
reverse_proxy minio:9000
```

This works. Somewhat.

Obviously it doesn't serve `index.html` when the request points towards a directory. This is bad. All routes on this damn page rely on this to work...

Actually, it's even worse. By default MinIO serves a listing of all files in the bucket if you request "`/`". So all subdirectories are just broken, because the object does not actually exist, and the root of the page is an ugly XML listing of all files. Not good.

To prevent MinIO from providing a file listing for publicly readable buckets, simply remove the following _Actions_ from the access policy: `s3:ListBucket`, `s3:ListBucketMultipartUploads`. Put the other way around, the only permissions you want are `s3:GetObject` and `s3:GetBucketLocation`.

Now, MinIO will return 403 when trying to access "`/`" and 404 when trying to access a directory. We'll let _Caddy_ handle both errors by simply trying the same route again, with `/index.html` appended to it.

```
# Prefix the request path with the bucket name
rewrite * /$BUCKETNAME{uri}
reverse_proxy minio:9000 {
	# On 403 or 404, retry the same route with /index.html appended
	@error status 403 404
	handle_response @error {
		rewrite * {uri}/index.html
		reverse_proxy minio:9000 {
			# The retry missed as well: give up
			@nestedError status 404
			handle_response @nestedError {
				respond "not found" 404
			}
		}
	}
}
```

This retries the request when the first one returns 403 or 404. Only if the second attempt also returns 404 do we present "not found" to the end user.

So. Done? Not quite...

In S3 one just _pretends_ that object names are fully qualified paths. Right now, we always append `/index.html` to the request. This works fine for `https://janw.name/blog` but falls apart if the request URL is `https://janw.name/blog/`. That's because the second one ends up as a request for the object `blog//index.html`, which does not exist. Only `blog/index.html` exists.

We'll need to trim the trailing slash if it is present in the request. This can be done by adding the following to the configuration:

```
@pathWithSlash path_regexp dir (.+)/$
handle @pathWithSlash {
	rewrite @pathWithSlash {re.dir.1}
}
```

We can then wrap the whole thing in a nice template like so:

```
(s3page) {
	# Trim a trailing slash so we never ask for "dir//index.html"
	@pathWithSlash path_regexp dir (.+)/$
	handle @pathWithSlash {
		rewrite @pathWithSlash {re.dir.1}
	}

	# {args[0]} is the bucket name passed to the import
	rewrite * /{args[0]}{uri}
	reverse_proxy minio:9000 {
		@error status 403 404
		handle_response @error {
			rewrite * {uri}/index.html
			reverse_proxy minio:9000 {
				@nestedError status 404
				handle_response @nestedError {
					respond "not found" 404
				}
			}
		}
	}
}
```

And then use the template like so:

```
janw.name {
	import s3page "janw.name"
}
```

In my case I simply hardcoded the address of the MinIO host on my internal network into the template (`minio:9000`). But this could be made configurable like the bucket name if required.
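
As a rough, untested sketch of that: the upstream address could be passed as a second snippet argument and referenced as `{args[1]}`, the same way the bucket name uses `{args[0]}`. The second argument and the `"minio:9000"` value handed to `import` are assumptions for illustration, not what I actually run.

```
(s3page) {
	@pathWithSlash path_regexp dir (.+)/$
	handle @pathWithSlash {
		rewrite @pathWithSlash {re.dir.1}
	}

	# {args[0]}: bucket name
	# {args[1]}: MinIO upstream address (assumed second argument)
	rewrite * /{args[0]}{uri}
	reverse_proxy {args[1]} {
		@error status 403 404
		handle_response @error {
			rewrite * {uri}/index.html
			reverse_proxy {args[1]} {
				@nestedError status 404
				handle_response @nestedError {
					respond "not found" 404
				}
			}
		}
	}
}

janw.name {
	import s3page "janw.name" "minio:9000"
}
```

Every site block then names both its bucket and its backend, which only pays off once there is more than one MinIO instance to point at.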