a small lesson learned in setting up a static web site with S3 and CloudFront
I created a static web site hosted in an S3 bucket named
www.example.com (not the real name) and enabled accessing it as a
website. I wanted delivery to be fast to everybody around the world,
so I created a CloudFront distribution in front of the S3 bucket.
I wanted S3 to automatically add “index.html” to URLs ending in a
slash (CloudFront can’t do this), so I configured the CloudFront
distribution to access the S3 bucket as a web site, using
www.example.com.s3-website-us-east-1.amazonaws.com as the origin
server.
Before sending all of the www.example.com traffic to the new setup,
I wanted to test it, so I added test.example.com to the list of
CNAMEs in the CloudFront distribution.
After setting up Route53 so that DNS lookups for test.example.com
would resolve to the new CloudFront endpoint, I loaded it in my
browser and got the following error:
404 Not Found
Code: NoSuchBucket
Message: The specified bucket does not exist
BucketName: test.example.com
RequestId: [short string]
HostId: [long string]
Why would AWS be trying to find an S3 bucket named test.example.com?
That hostname was pointing at the CloudFront distribution endpoint, and
CloudFront was configured to get the content from
www.example.com.s3-website-us-east-1.amazonaws.com.
After debugging, I found out that the problem was that I had configured the CloudFront distribution to forward “all” HTTP headers. I thought that this would be a sneaky way to turn off caching in CloudFront so that I could keep updating the content in S3 and not have to wait to see the latest changes.
However, this also meant that CloudFront was forwarding the HTTP Host
header from my browser to the S3 website handler. When S3 saw
that I was requesting the host test.example.com, it looked for a
bucket of the same name and didn’t find it, resulting in the above
error.
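The failure can be sketched as a small Python model. This is not AWS code, only a simplification of the behavior described above: the website endpoint derives the bucket name from the Host header, and CloudFront either passes the viewer's Host through or substitutes the origin's domain name. The function names, the `EXISTING_BUCKETS` set, and the response strings are made up for illustration.

```python
# Simplified model of the behavior described in this post; not AWS code.
WEBSITE_SUFFIX = ".s3-website-us-east-1.amazonaws.com"
ORIGIN = "www.example.com" + WEBSITE_SUFFIX
EXISTING_BUCKETS = {"www.example.com"}  # the only bucket in this story


def bucket_for_host(host: str) -> str:
    """Bucket name the website endpoint would look up for a Host header."""
    host = host.lower().split(":")[0]  # drop any port
    if host.endswith(WEBSITE_SUFFIX):
        # Endpoint-style hostname: the bucket is the part before the suffix.
        return host[: -len(WEBSITE_SUFFIX)]
    # Any other hostname is treated as the bucket name itself.
    return host


def host_sent_to_origin(viewer_host: str, origin_domain: str,
                        forward_all_headers: bool) -> str:
    """Host header the origin sees, depending on the forwarding setting."""
    return viewer_host if forward_all_headers else origin_domain


def fetch(viewer_host: str, origin_domain: str,
          forward_all_headers: bool) -> str:
    host = host_sent_to_origin(viewer_host, origin_domain, forward_all_headers)
    bucket = bucket_for_host(host)
    if bucket not in EXISTING_BUCKETS:
        return f"404 NoSuchBucket: {bucket}"
    return f"200 OK from bucket {bucket}"
```

Calling `fetch("test.example.com", ORIGIN, True)` reproduces the error in this model, while the same call with `False` resolves to the www.example.com bucket.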
When I turned off forwarding all HTTP headers in CloudFront, it then started sending through the correct header:
Host: www.example.com.s3-website-us-east-1.amazonaws.com
which S3 correctly interpreted as accessing the correct S3 bucket,
www.example.com, in website mode (adding index.html after
trailing slashes).
It makes sense for CloudFront to support forwarding the Host header
from the browser, especially when your origin server is a dynamic web
site that can act on the original hostname. You can set up a
wildcard *.example.com DNS entry pointing at your CloudFront
distribution, and have the back end server return different results
depending on what host the browser requested.
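As a rough illustration of that kind of back end, here is a minimal WSGI sketch in Python that picks its response from the forwarded Host header. The `BASE_DOMAIN` constant, the `site_for_host` and `call` helpers, and the response bodies are hypothetical; a real application would do something more useful with the site name.

```python
from wsgiref.util import setup_testing_defaults

BASE_DOMAIN = ".example.com"  # assumed wildcard domain


def site_for_host(host: str) -> str:
    """Return the label in front of .example.com, or "" if there is none."""
    host = host.lower().split(":")[0]  # drop any port
    if host.endswith(BASE_DOMAIN):
        return host[: -len(BASE_DOMAIN)]
    return ""


def app(environ, start_response):
    """WSGI app that varies its response by the requested hostname."""
    site = site_for_host(environ.get("HTTP_HOST", ""))
    if not site:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown host\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [f"content for {site}\n".encode()]


def call(host: str):
    """Invoke the app in-process with a given Host header (for testing)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_HOST"] = host
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

This only works if CloudFront forwards the Host header to the origin, which is exactly the setting that breaks an S3 website origin.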
However, passing the Host header doesn’t work so well for an origin
server S3 bucket in website mode. Lesson learned and lesson passed on.
Original article and comments: https://alestic.com/2017/03/cloudfront-s3-host-header/