Mattermost goes live — the first public face of homelab-2nd
The first service goes public 🚀
So in the last post I laid out the grand plan — k3s, Flux GitOps, SOPS secrets, OMV MinIO storage, the whole overkill architecture for a chat app used by five people.
This is the part where I actually try to put something on the internet. Spoiler: it took three attempts and a fair amount of coffee 😅
The goal was simple: get Mattermost running on homelab-2nd, store its files on MinIO (on the OMV box), back up its Postgres to S3, and expose it at https://chat.example.com through a Cloudflare Tunnel. No open router ports, no exposed home IP, just a clean public HTTPS endpoint that hides behind Cloudflare’s edge.
Sounds straightforward, right? Right?? 😎
Why three attempts?
Two previous agent sessions tried to land this and both stalled:
First attempt (GLM agent, 2026-06-19): Got through the whole bootstrap — passwordless sudo, k3s, Flux, SOPS, OMV Docker + MinIO — but hit rate limits as soon as it touched the Mattermost-specific bits. It generated the Mattermost DB and MinIO secrets and started the HelmRelease, but stopped mid-way with the repo half-updated and the cluster in an inconsistent state. It also used the old strategy of injecting the DB connection string via
MM_SQLSETTINGS_DATASOURCEinstead of the chart’s nativeexternalDB.externalConnectionString. Classic.Second attempt (Mistral agent, 2026-06-20): Picked up the broken secret file. Correctly diagnosed
CreateContainerConfigErrorand that the SOPS-encryptedmattermost-db-credentials.sops.yamlhad been overwritten with an empty file. But then it kept writing literal***into a Python script instead of using the decoded password variable, got stuck in a loop of writing the same broken file, and never produced a working secret. Textbook example of a model losing focus as context grows and the task gets repetitive.
Lesson learned: complex multi-step infrastructure work needs focused sessions, small verified steps, and a model that can keep the whole state in its head. Or, you know, a human who actually reads error messages 🤷
This third attempt picked up the pieces with a deliberate plan: inspect actual cluster state, fix the real blockers (not the imagined ones), keep everything GitOps-driven, and document every failure because — as the Supreme Leader keeps saying — this is blog material.
Starting state
When I sat down to fix this, the cluster was actually closer to working than expected:
- CNPG cluster
mattermost-db-1in namespacemattermostwas Ready ✅ - HelmRelease
mattermostwas Ready=True and the pod was 1/1 Running ✅ GET /api/v4/system/pingalready returned HTTP 200 ✅
So the Postgres side was fine. The remaining blocker was file storage. The logs kept screaming:
1
Problem with file storage settings ... unable to check if the S3 bucket exists: Error response code .
Every time someone uploaded a file in Mattermost, it would fail. Chat works, but no cat pictures? Unacceptable 😿
The file storage rabbit hole 🐇
Env vars vs. the database config
The Mattermost Helm chart exposes file-storage settings via environment variables. We had set all the usual suspects:
1
2
3
4
5
MM_FILESETTINGS_DRIVERNAME=amazons3
MM_FILESETTINGS_AMAZONS3ENDPOINT=nas.example.com:9000 # no http:// — learned that the hard way
MM_FILESETTINGS_AMAZONS3BUCKET=mattermost-files
MM_FILESETTINGS_AMAZONS3PATHSTYLE=true
MM_FILESETTINGS_AMAZONS3SSL=false
But the running config inside Postgres told a completely different story:
1
2
3
4
5
DriverName: local
AmazonS3Endpoint: s3.amazonaws.com
AmazonS3Bucket: <empty>
AmazonS3SSL: true
AmazonS3PathStyle: <empty>
What happened? Mattermost had migrated its first-startup config.json into the database as a row in the Configurations table. After that migration, the active row in Postgres takes precedence over the env vars. Several S3 fields (especially AmazonS3PathStyle) don’t even appear in the current model/config.go, suggesting they’re no longer first-class config fields in Mattermost v11.
This is one of those “gotchas” that’s obvious in retrospect but will cost you an evening. Here’s how I confirmed it:
1
2
3
4
5
6
7
8
9
# Get the active configuration row from Postgres
kubectl -n mattermost exec mattermost-db-1 -c postgres -- \
psql -U postgres -d mattermost -c \
"SELECT value::jsonb->'FileSettings' FROM configurations ORDER BY id DESC LIMIT 1;"
# Confirm the data type of the value column
kubectl -n mattermost exec mattermost-db-1 -c postgres -- \
psql -U postgres -d mattermost -c \
"SELECT pg_typeof(value) FROM configurations LIMIT 1;"
Result: value is a text column containing a single JSON object (not an array, despite what some newer Mattermost docs might suggest).
Break-glass: editing the config row directly
Because the env vars weren’t overriding the DB-stored config, and there’s no supported chart value or env var to force path-style S3 in this version, the only option left was to edit the active Configurations row directly. This is break-glass territory — not something you want in your normal workflow, but sometimes you gotta do what you gotta do.
First attempt: naive SQL update (it went poorly)
I tried to merge a JSON blob with value::jsonb || $json$...$json$. Because value is text, this silently cast it and produced an array instead of an object. On the next Mattermost restart, the pod crashed with:
1
failed to load configuration: ... json: cannot unmarshal array into Go value of type model.Config
Oops. 😅
I recovered by extracting the first (and only) array element back into an object:
1
2
3
UPDATE configurations
SET value = (value::jsonb->0)::text
WHERE id = (SELECT id FROM configurations ORDER BY id DESC LIMIT 1);
Second attempt: jsonb_set per field (this one worked)
Used jsonb_set on each nested FileSettings field individually, then cast the result back to text:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
UPDATE configurations
SET value = (
jsonb_set(
jsonb_set(
jsonb_set(
jsonb_set(
jsonb_set(
value::jsonb,
ARRAY['FileSettings','DriverName'],
'"amazons3"'::jsonb
),
ARRAY['FileSettings','AmazonS3AccessKeyId'],
'"<ACCESS_KEY>"'::jsonb
),
ARRAY['FileSettings','AmazonS3SecretAccessKey'],
'"<SECRET_KEY>"'::jsonb
),
ARRAY['FileSettings','AmazonS3Bucket'],
'"mattermost-files"'::jsonb
),
ARRAY['FileSettings','AmazonS3Endpoint'],
'"nas.example.com:9000"'::jsonb
)
)::text
WHERE id = (SELECT id FROM configurations ORDER BY id DESC LIMIT 1);
Subsequent updates added AmazonS3Region='us-east-1' and AmazonS3SSL=false.
⚠️ Credential handling: The actual access key and secret came from the existing Kubernetes secret
mattermost-minio-creds(created via Flux from SOPS-encrypted manifests). They never appeared in the repo or in any commit. SOPS + age means the public repo stays clean.
After the update, verification looked much better:
1
2
3
driver | endpoint | bucket | region | ssl | accesskey
----------+---------------------------+------------------+-----------+------+---------------------
amazons3 | nas.example.com:9000 | mattermost-files | us-east-1 | false | homelab-minio-admin
The second blocker: DNS inside the cluster 🕵️
After fixing the DB config and restarting Mattermost, the S3 bucket-check error came back. But this time the problem was different — DNS lookup from inside a pod failed for nas.example.com.
1
2
kubectl -n mattermost run --rm -i --restart=Never dns-test \
--image=busybox -- sh -c 'nslookup nas.example.com'
Result: name did not resolve from within the cluster. The node itself (homelab-2nd) resolved it fine via mDNS/Avahi, but k3s CoreDNS had no record of it. mDNS doesn’t propagate into the cluster’s DNS resolver.
Break-glass fix: patch CoreDNS NodeHosts
Quick and dirty — patch the CoreDNS ConfigMap directly:
1
2
3
4
kubectl -n kube-system patch configmap coredns --type=merge -p \
'{"data":{"NodeHosts":"10.0.0.1 homelab-2nd\n192.168.1.180 nas.example.com\n"}}'
kubectl -n kube-system rollout restart deployment/coredns
After the restart, pod DNS resolved nas.example.com → 10.0.0.2. 🎉
But this is a manual patch on a k3s-managed ConfigMap — it could get overwritten on node updates. Not good enough for a GitOps-purist setup.
The durable fix: GitOps-managed CoreDNS custom server
The proper solution is a Flux-managed ConfigMap providing a custom CoreDNS server file. The existing Corefile already imports from /etc/coredns/custom/*.server, so we just drop a new one in:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# infrastructure/coredns/coredns-custom-homelab.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom-homelab
namespace: kube-system
data:
homelab.server: |
nas.example.com:53 {
errors
cache 30
hosts {
10.0.0.2 nas.example.com
fallthrough
}
}
This is referenced in infrastructure/kustomization.yaml and reconciled by Flux. Now the DNS entry survives node updates and is version-controlled. The manual patch is gone, replaced by something declarative. That’s the GitOps way 😎
Here’s the full flow of what we fixed:
flowchart TB
subgraph problem["The two real blockers"]
db["Mattermost config in Postgres<br/>overrides env vars"]
dns["CoreDNS can't resolve<br/>nas.example.com"]
end
subgraph fix["What we did"]
sql["jsonb_set on<br/>Configurations row"]
coredns["CoreDNS custom server<br/>via Flux ConfigMap"]
end
subgraph result["After fixes"]
mm["Mattermost pod 1/1 Running"]
s3["S3 bucket check passes"]
public["chat.example.com live"]
end
db -->|break-glass SQL| sql
dns -->|GitOps ConfigMap| coredns
sql --> mm
coredns --> s3
mm --> public
s3 --> public
Public ingress: Cloudflare Tunnel 🌐
Now that Mattermost actually works internally, time to make it public. The ingress strategy from the architecture decisions was clear: one Cloudflare Tunnel per service. This isolates services and avoids a single point of failure for all homelab traffic.
What got deployed
apps/mattermost/cloudflared-chat-deployment.yaml—cloudflaredDeployment in themattermostnamespaceapps/mattermost/chat-tunnel-token.sops.yaml— SOPS-encrypted tunnel tokenmattermost-helm-release.yamlupdated withMM_SERVICESETTINGS_SITEURL=https://chat.example.com
The tunnel route in Cloudflare Zero Trust:
| Public hostname | Service (internal) |
|---|---|
chat.example.com | http://mattermost-team-edition.mattermost.svc.cluster.local:8065 |
Note: the internal hop uses plain HTTP. TLS is terminated at the Cloudflare edge; the tunnel itself is TLS-encrypted back to the cluster. Internal HTTP between
cloudflaredand the service is fine in this architecture. No cert-manager needed.
DNS cutover
Nameservers for example.com were changed in the OVH panel to Cloudflare:
elly.ns.cloudflare.competer.ns.cloudflare.com
Propagation took roughly 1–4 hours. After that, chat.example.com resolved to Cloudflare anycast IPs.
Final verification
1
2
3
4
5
6
7
8
9
10
11
12
13
# DNS
dig chat.example.com +short
# 172.67.130.38
# 104.21.3.31
# HTTPS health check
curl -s -o /dev/null -w "HTTP %{http_code}\n" https://chat.example.com/api/v4/system/ping
# HTTP 200
# Site URL reported by Mattermost API
curl -s https://chat.example.com/api/v4/config/client?format=old | \
python3 -c 'import sys,json; print(json.load(sys.stdin).get("SiteURL"))'
# https://chat.example.com
✅ Public HTTPS works. No router ports open. Origin IP not exposed. Boom.
The repo state after all this
Relevant files in github.com/gulasz101/homelab-2nd:
| File | Purpose |
|---|---|
apps/mattermost/mattermost-helm-release.yaml | HelmRelease for Mattermost Team Edition |
apps/mattermost/postgres-cluster.yaml | CNPG Cluster for Mattermost |
apps/mattermost/mattermost-db-credentials.sops.yaml | SOPS-encrypted DB credentials + connection string |
apps/mattermost/minio-backup-creds.sops.yaml | SOPS-encrypted CNPG backup MinIO credentials |
apps/mattermost/mattermost-minio-creds.sops.yaml | SOPS-encrypted Mattermost file-storage MinIO credentials |
apps/mattermost/chat-tunnel-token.sops.yaml | SOPS-encrypted Cloudflare Tunnel token |
apps/mattermost/cloudflared-chat-deployment.yaml | cloudflared Deployment for the chat tunnel |
apps/mattermost/scheduled-backup.yaml | CNPG daily backup + WAL archive schedule |
infrastructure/coredns/coredns-custom-homelab.yaml | CoreDNS custom server for nas.example.com |
apps/mattermost/mattermost-initial-admin-password.sops.yaml | SOPS-encrypted initial admin credentials |
omv/minio/docker-compose.yml | MinIO on OMV (no secrets) |
omv/minio/.env.sops | SOPS-encrypted OMV MinIO credentials backup |
.sops.yaml | age public key for SOPS |
Things that didn’t work (aka the valuable part)
Let me be honest about all the failures, because this is the part that’s actually useful to read:
Empty SOPS file in HEAD. The first secret was staged but not committed, so Flux reconciled an empty file and the secret never appeared in the cluster. Fix: always
git diff --cachedandgit statusbefore assuming a file is in the repo. Basic, but easy to forget when you’re tired.Helm chart overrides env vars. Setting
MM_SQLSETTINGS_DATASOURCEmanually was useless because the chart generatesMM_CONFIGfromexternalDB.externalConnectionString. Fix: use the chart’s nativeexternalDBvalues, not your own env var injection.S3 endpoint must not include scheme.
http://nas.example.com:9000causedEndpoint url cannot have fully qualified paths. Fix: usenas.example.com:9000without thehttp://prefix.Mattermost stores config in Postgres after first startup. Env vars alone do not override the active
Configurationsrow for file storage. Fix: update the DB row directly (break-glass, documented here).The
valuecolumn is a JSON object, not an array. A naivejsonb_aggupdate turned it into an array and crashed the server. Fix: unwrap the array element back to an object, then usejsonb_setper field.Cluster pods couldn’t resolve
nas.example.com. Node-level mDNS does not propagate into k3s CoreDNS. Fix: add a static host entry via a custom CoreDNS server file, GitOps-managed so it survives node updates.One Cloudflare Tunnel per service is cleaner. It isolates services and avoids a single point of failure for all homelab traffic. Worth the extra setup.
cert-manager is NOT needed for Cloudflare Tunnel ingress. Cloudflare terminates TLS at the edge; the tunnel itself is TLS-encrypted back to the cluster. Internal HTTP between
cloudflaredand the service is fine.Username/password sign-in must be explicitly enabled. Disabling email sign-up and open server is not enough — if
EnableSignInWithUsernameis not set, the login page shows “This server doesn’t have any sign-in methods enabled”. Fix: setMM_SERVICESETTINGS_ENABLESIGNINWITHUSERNAME=truewhile keeping email sign-up disabled.Previous agent attempts failed due to rate-limits / context loss. Complex, multi-step infrastructure work needs focused sessions, small verified steps, and a model that can keep the whole state in its head. Or, you know, just do it yourself 😅
What’s running now
After all the fixes:
1
2
kubectl -n mattermost rollout restart deployment/mattermost-mattermost-team-edition
kubectl -n mattermost rollout status deployment/mattermost-mattermost-team-edition --timeout=180s
Result:
- Pod:
1/1 Running✅ - HelmRelease:
Ready=True, Released=True✅ - HTTP ping:
HTTP 200✅ - No more
unable to check if the S3 bucket existserrors ✅
Remaining log entries are expected and non-blocking:
- Playbooks plugin refuses to activate (requires paid license) — whatever
- SMTP not configured (intentional — no email for this deployment)
ffmpegnot installed (transcriptions disabled in the AI plugin)- Harmless
healthzwrite reset during readiness probes
Initial admin user wojtek was created and added to team gulaszteam. Supreme Leader confirmed successful login on 2026-06-20. Username/password sign-in is enabled, email sign-up and open server are disabled, and the /signup_user_complete page is intentionally blocked because public registration is off.
What’s next
The pattern is now established: deploy a service via HelmRelease, add a CNPG Postgres cluster, wire S3 storage to MinIO, expose it through its own Cloudflare Tunnel. The next services will follow the same recipe.
flowchart LR
subgraph done["✅ Done"]
mm["Mattermost<br/>chat.example.com"]
end
subgraph next["🔜 Next"]
nc["Nextcloud Photos"]
obs["Observability<br/>Grafana + Loki"]
llmhub["LLM Hub<br/>LiteLLM + OpenWebUI"]
end
pattern["GitOps recipe:<br/>HelmRelease + CNPG +<br/>MinIO S3 + Cloudflare Tunnel"]
pattern --> mm
pattern --> nc
pattern --> obs
pattern --> llmhub
The next post will be about the observability stack — because if there’s one thing I learned from homelab.one, it’s that running services without monitoring is just flying blind 😎
Repo is here if you want to follow along: github.com/gulasz101/homelab-2nd
Cheers! 🎸


