We run a travel blog (joyofexploringtheworld.com) on Docker Compose with Traefik v3 and Cloudflare. One morning Google Search Console showed every page blocked from indexing with “Failed: Robots.txt unreachable”. The site was working fine in a browser, so what was going on?

Two separate issues were conspiring to break Googlebot’s ability to fetch /robots.txt. Here’s what we found and how we fixed both.

The setup

Our WordPress service sits on two Docker networks: app-network (shared with Traefik, Redis, imgproxy) and db-network (shared with MariaDB). We run two scaled WordPress containers behind Traefik’s load balancer.

wordpress:
  build: .
  scale: 2
  networks:
    - app-network
    - db-network
  labels:
    - 'traefik.enable=true'
    - 'traefik.http.routers.wordpress.entrypoints=websecure'
    # ...

Problem 1: Traefik routing to the wrong network

When a Docker service is attached to multiple networks, Traefik discovers an IP address on each of them. Our two WordPress containers each had one IP on app-network (reachable from Traefik) and one on db-network (not reachable from Traefik). Traefik was load-balancing across all four IPs, so any request that landed on a db-network address stalled and died as an intermittent 504 Gateway Timeout.

We confirmed this by querying Traefik’s API:

curl -s http://127.0.0.1:8081/api/http/services/wordpress@docker \
  | jq '.loadBalancer.servers'

The output showed four server entries — two on 172.18.x.x (app-network, reachable) and two on 172.19.x.x (db-network, unreachable). Every other request was timing out.
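You can split that server list by subnet to see the problem at a glance. The snippet below is a minimal sketch using a hand-written stand-in for the real API response; the subnets are illustrative (check `docker network inspect` for yours):

```shell
# Stand-in for the server URLs returned by
# /api/http/services/wordpress@docker — IPs are illustrative.
servers='http://172.18.0.5:80
http://172.18.0.6:80
http://172.19.0.4:80
http://172.19.0.5:80'

# In our case app-network was 172.18.0.0/16 and db-network 172.19.0.0/16.
echo "$servers" | grep -c '^http://172\.18\.'   # reachable app-network backends
echo "$servers" | grep -c '^http://172\.19\.'   # stray db-network backends
```

With half the backends unreachable and round-robin balancing, roughly every other request times out.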

The fix

Add a single label telling Traefik which Docker network to use:

wordpress:
  labels:
    - 'traefik.docker.network=wordpress_app-network'

The value is the Docker network name as it appears in docker network ls — typically <project>_<network>, so wordpress_app-network for a project directory called wordpress. After restarting the WordPress containers, Traefik’s API showed only the two correct app-network IPs.
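As a quick sketch of that naming convention (the values below are illustrative; Compose also honors `COMPOSE_PROJECT_NAME` and a top-level `name:`, so always verify against `docker network ls` rather than trusting the default):

```shell
# Compose's default runtime network name is <project>_<network>,
# where <project> defaults to the directory holding docker-compose.yml.
project=wordpress      # illustrative project directory name
network=app-network    # network key from docker-compose.yml
label="traefik.docker.network=${project}_${network}"
echo "$label"          # -> traefik.docker.network=wordpress_app-network
```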

Problem 2: Wrong Content-Type on robots.txt

With routing fixed, we expected Google Search Console’s Live Test to pass immediately. It didn’t. We dug into the response headers:

curl -sI https://joyofexploringtheworld.com/robots.txt | grep -i content-type

The response came back as text/html; charset=UTF-8 instead of text/plain. Googlebot may reject a robots.txt served with the wrong MIME type; the Robots Exclusion Protocol (RFC 9309) specifies text/plain as the media type for robots.txt.

WordPress generates /robots.txt dynamically via PHP rewrite rules, so our Apache FilesMatch "\.(html|htm|php)$" rule was stamping text/html headers onto it. Since there is no physical robots.txt file on disk, FilesMatch on the filename can never match; we needed to match the request URI instead.

The fix

We added an <If> directive to our Apache config that matches the original request line:

<IfModule mod_headers.c>
    <If "%{THE_REQUEST} =~ m#/robots\.txt#">
        Header set Content-Type "text/plain; charset=UTF-8"
    </If>
</IfModule>

We use %{THE_REQUEST} (the original HTTP request line) rather than %{REQUEST_URI} because WordPress rewrites the URI to /index.php before headers are finalised. THE_REQUEST preserves the original path.

While we were in this file, we also added proper caching for sitemaps. Our no-cache rule for .php files was preventing Cloudflare from caching Rank Math’s dynamic XML sitemaps:

<IfModule mod_headers.c>
    <If "%{THE_REQUEST} =~ /sitemap.*\.xml/">
        Header set Cache-Control "public, max-age=3600, s-maxage=3600"
        Header unset Pragma
        Header unset Expires
    </If>
</IfModule>

Verifying the fix

After restarting the containers and reloading Apache, we ran a quick stress test against the public URL, so that every request passed through Traefik's load balancer and could hit either container, to confirm there were no more intermittent failures:

for i in $(seq 1 20); do
  curl -so /dev/null -w '%{http_code} %{content_type}\n' \
    -H 'User-Agent: Googlebot' https://joyofexploringtheworld.com/robots.txt
done

All 20 requests returned 200 text/plain; charset=UTF-8. Google Search Console’s Live Test passed shortly after, and the “Robots.txt unreachable” warning cleared within 24 hours.

What you can do

  1. Always set traefik.docker.network when your service connects to multiple Docker networks. Without it, Traefik may route to an unreachable IP and cause intermittent 504s.
  2. Check Content-Type on /robots.txt — it must be text/plain. Dynamic robots.txt (generated by WordPress, Rank Math, Yoast, etc.) is served through PHP, so filename-based Apache rules won’t match.
  3. Use %{THE_REQUEST} in Apache <If> directives when WordPress rewrites are involved — %{REQUEST_URI} will show /index.php, not the original path.
  4. Don’t panic at GSC delays — even after a fix, Google’s Live Test can take minutes to hours to reflect reality. Verify with curl first.
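Point 2 is easy to fold into a pre-flight script. The helper below is a hypothetical sketch: feed it the value curl reports via `-w '%{content_type}'` and it accepts only text/plain:

```shell
# Hypothetical helper: accept only a text/plain Content-Type for robots.txt.
check_robots_content_type() {
  case "$1" in
    text/plain*) echo "ok: $1" ;;
    *)           echo "wrong MIME type: $1"; return 1 ;;
  esac
}

check_robots_content_type "text/plain; charset=UTF-8"          # passes
check_robots_content_type "text/html; charset=UTF-8" || true   # flagged
```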

See also: Running a WordPress Travel Blog on a Budget VPS: The Full Stack | Rank Math Sitemap Not Loading with Traefik | SEO Housekeeping: Focus Keywords and Sitemaps That Match


Built for a travel blog on a budget. This stack powers Joy of Exploring the World — curated travel itineraries, restaurant reviews, and destination guides. If you're planning your next trip, come explore with us.

All config files from this post are in the companion GitHub repo.