If you’re using Amazon S3 to store your website’s files, and you want some of them to be private, only accessible to certain users, and particularly if you’re using Django, here’s how.
It took me the best part of a day to piece together the parts of this so I thought they should all be in one place for the next person. I can’t believe I found this so hard. I may have made mistakes, but it seems to work.
I’m going to assume you’ve already got your site set up to store static files on S3 (which is particularly useful if, say, your site is hosted on Heroku). You should already have:
- Installed django-storages.
- Installed boto.
- Set up an AWS account and created a bucket in S3.
Your media and static folders
This isn’t essential for the whole private-files-on-S3 gist of this post, but getting your media and static files to end up in separate folders on S3 is a little non-obvious, and very useful. So, a little aside to cover it… I’ve got this working nicely by using something like this Stackoverflow answer:
In your app create an s3utils.py file and put this in there:
from storages.backends.s3boto import S3BotoStorage

# Storages that put their files in 'static/' and 'media/' folders within the bucket.
StaticS3BotoStorage = lambda: S3BotoStorage(location='static')
MediaS3BotoStorage = lambda: S3BotoStorage(location='media')
And then, in your settings.py, you’ll need something like this:
DEFAULT_FILE_STORAGE = 'yourproject.yourapp.s3utils.MediaS3BotoStorage'
STATICFILES_STORAGE = 'yourproject.yourapp.s3utils.StaticS3BotoStorage'
AWS_ACCESS_KEY_ID = 'YOURACCESSKEY'
AWS_SECRET_ACCESS_KEY = 'YOURSECRETACCESSKEY'
AWS_STORAGE_BUCKET_NAME = 'your-bucket-name'
S3_URL = 'http://%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
STATIC_DIRECTORY = '/static/'
MEDIA_DIRECTORY = '/media/'
STATIC_URL = S3_URL + STATIC_DIRECTORY
MEDIA_URL = S3_URL + MEDIA_DIRECTORY
Those STATIC_DIRECTORY and MEDIA_DIRECTORY settings aren’t standard Django settings, but we need the MEDIA_DIRECTORY value when setting permissions on our private files a little later.
I think you’ll need to manually create the /media/ and /static/ directories in your S3 bucket. Then, if you run the collectstatic Django management command your static files should end up in http://your-bucket-name.s3.amazonaws.com/static/. And any files uploaded through FileField or ImageField attributes on your models should end up in http://your-bucket-name.s3.amazonaws.com/media/. If those model attributes specify upload_to paths, they will be relative to /media/.
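For instance (a made-up example; there’s no Invoice model anywhere above), a field like this would put its uploads under that media folder:

from django.db import models

class Invoice(models.Model):
    # With the settings above, a file uploaded here should end up at
    # http://your-bucket-name.s3.amazonaws.com/media/reports/<filename>
    pdf = models.FileField(upload_to='reports/')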
Making files private
By default, those media files are public: if you enter an uploaded file’s URL in your browser, you should be able to access it just fine.
Let’s assume that our model has two kinds of file, one public and one private. So in our models.py we have this:
from django.db import models

class MyModel(models.Model):
    ...
    public_file = models.FileField(blank=True, null=True, upload_to='open/')
    private_file = models.FileField(blank=True, null=True, upload_to='seekrit/')
    ...
Assuming you create the /media/open/ and /media/seekrit/ directories then the files should get uploaded there fine, though at this point they’re all still publicly accessible.
For our private files we need to set their permissions to private after upload. To do this, I’ve ended up with a custom save() method on the model:
from django.conf import settings
from django.db import models
from boto.s3.connection import S3Connection
from boto.s3.key import Key

class MyModel(models.Model):
    ...
    public_file = models.FileField(blank=True, null=True, upload_to='open/')
    private_file = models.FileField(blank=True, null=True, upload_to='seekrit/')
    ...

    def save(self, *args, **kwargs):
        super(MyModel, self).save(*args, **kwargs)
        if self.private_file:
            conn = S3Connection(settings.AWS_ACCESS_KEY_ID,
                                settings.AWS_SECRET_ACCESS_KEY)
            # If the bucket already exists, this finds that, rather than creating it.
            bucket = conn.create_bucket(settings.AWS_STORAGE_BUCKET_NAME)
            k = Key(bucket)
            # private_file.name is the path relative to /media/, e.g. 'seekrit/test_file.pdf'.
            k.key = settings.MEDIA_DIRECTORY + self.private_file.name
            k.set_acl('private')
If you upload a private file with that in place, then you should no longer be able to access it directly. e.g., if you upload a file called test_file.pdf, then visiting http://your-bucket-name.s3.amazonaws.com/media/seekrit/test_file.pdf should get you an XML document containing an AccessDenied error.
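If you’d rather check from Python than the browser, something like this in a Django shell (python manage.py shell) should list the grants on the key. It’s only a sketch, using the same key format as the save() method above, and it assumes you’ve uploaded a file at that path:

from django.conf import settings
from boto.s3.connection import S3Connection

conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(settings.AWS_STORAGE_BUCKET_NAME)
# Same key format as in the save() method above.
key = bucket.get_key(settings.MEDIA_DIRECTORY + 'seekrit/test_file.pdf')
# A private file should only show FULL_CONTROL for the bucket owner.
for grant in key.get_acl().acl.grants:
    print('%s %s' % (grant.permission, grant.display_name))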
UPDATE: (12 Feb 2015) Shamim Hasnath suggests creating a new field class to use for the private file, and (3 Oct 2017) Robert Rollins has kindly added an improvement:
from django.core.files.storage import get_storage_class
from django.db import models

class S3PrivateFileField(models.FileField):
    """
    A FileField that gives the 'private' ACL to the files it uploads to S3,
    instead of the default ACL.
    """
    def __init__(self, verbose_name=None, name=None, upload_to='', storage=None, **kwargs):
        if storage is None:
            storage = get_storage_class()(acl='private')
        super(S3PrivateFileField, self).__init__(verbose_name=verbose_name,
            name=name, upload_to=upload_to, storage=storage, **kwargs)
You can then use this for the private_file on your model instead of the FileField() we used above:
private_file = S3PrivateFileField(blank=True, null=True, upload_to='seekrit/')
Shamim suggests that once you’ve done this you don’t need the set_acl() call (in the save() method) to make the file private, because the storage the field uses is created with a private ACL instead of the default one.
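For what it’s worth, with that field class in place the model from earlier can be trimmed down to something like this. This is just a sketch, and it assumes you’ve put the field class in a fields.py in your app:

from django.db import models

from yourproject.yourapp.fields import S3PrivateFileField

class MyModel(models.Model):
    ...
    public_file = models.FileField(blank=True, null=True, upload_to='open/')
    private_file = S3PrivateFileField(blank=True, null=True, upload_to='seekrit/')
    ...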
Allowing access to certain users
Now that we can upload private files, how do we allow certain users to access them? We need to create temporary signed URLs that let users access the file.
First, in your urls.py add a URL for linking to the files:
from django.conf.urls.defaults import *

from yourproject.yourapp import views

urlpatterns = patterns('',
    ...
    url(r'^(?P<pk>[\d]+)/secretfile/$', views.SecretFileView.as_view(), name='secret_file'),
    ...
)
This will let us link to files with URLs something like http://yourdomain.com/42/secretfile/, referring to the secret file on an object with a pk of 42. This is what you should use when linking to the file, e.g., in a template:
<a href="{% url secret_file pk=object.pk %}">Download</a>
Then, in your app’s views.py create the SecretFileView:
from logging import getLogger

from django import http
from django.conf import settings
from django.shortcuts import get_object_or_404
from django.views.generic import RedirectView

from boto.s3.connection import S3Connection

from yourproject.yourapp.models import MyModel

logger = getLogger('django.request')


class SecretFileView(RedirectView):
    permanent = False

    def get_redirect_url(self, **kwargs):
        s3 = S3Connection(settings.AWS_ACCESS_KEY_ID,
                          settings.AWS_SECRET_ACCESS_KEY,
                          is_secure=True)
        # Create a URL valid for 60 seconds.
        return s3.generate_url(60, 'GET',
                               bucket=settings.AWS_STORAGE_BUCKET_NAME,
                               key=kwargs['filepath'],
                               force_http=True)

    def get(self, request, *args, **kwargs):
        m = get_object_or_404(MyModel, pk=kwargs['pk'])
        u = request.user

        if u.is_authenticated() and (u.get_profile().is_very_special() or u.is_staff):
            if m.private_file:
                filepath = settings.MEDIA_DIRECTORY + m.private_file.name
                url = self.get_redirect_url(filepath=filepath)
                # The below is taken straight from RedirectView.
                if url:
                    if self.permanent:
                        return http.HttpResponsePermanentRedirect(url)
                    else:
                        return http.HttpResponseRedirect(url)
                else:
                    logger.warning('Gone: %s', self.request.path,
                                   extra={
                                       'status_code': 410,
                                       'request': self.request
                                   })
                    return http.HttpResponseGone()
            else:
                raise http.Http404
        else:
            raise http.Http404
What does this all do? First we get the object from the pk in the URL. Then we make sure this user can access the file. The conditions are up to you; here we’re making sure the user is logged in, and either satisfies some condition set in an is_very_special() method on the user’s UserProfile, or is a staff member.
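That is_very_special() method is just a placeholder for whatever rule suits your site. A made-up sketch of that kind of UserProfile (assuming you’re using the old AUTH_PROFILE_MODULE-style profiles, since we’re calling get_profile()) might look like:

from django.contrib.auth.models import User
from django.db import models

class UserProfile(models.Model):
    user = models.OneToOneField(User)
    # A made-up field; substitute whatever your site actually cares about.
    has_paid = models.BooleanField(default=False)

    def is_very_special(self):
        return self.has_paid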
If that’s OK, and our object actually has a private_file uploaded, then we set the full filepath. This is why we had to create that MEDIA_DIRECTORY setting earlier on: the key needs to be an absolute path, not one relative to /media/.
We then generate the signed, temporary URL. Here the URL will be valid for 60 seconds — after that it will no longer function, so it can’t be passed on to anyone else. Once we’ve got this URL, we redirect to it, and the user should be able to access the file — it should appear in the browser, or start downloading, depending on the file type.
Except, that’s not quite all…
Setting the Bucket Policy
Just because we’ve got a signed URL, this doesn’t yet mean the file will download. The private permissions we set on it earlier still apply. We need to specify a policy that will let us bypass this with the signed URLs.
Go to your S3 console, select your Bucket, and right-click to show its Properties. You should see a link saying “Add bucket policy” (or “Edit bucket policy” if you already have one). A window should open, into which you should put something like this:
{
    "Version": "2008-10-17",
    "Id": "My Special Bucket Policy",
    "Statement": [
        {
            "Sid": "Allow Signed Downloads for Private Files",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::12345678901:root"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::your-bucket-name/media/seekrit/*"
        }
    ]
}
A few points on this…
- No, don’t change the Version date from 2008-10-17.
- I don’t know what the significance of the Id or Sid is.
- In the Principal value, the 12345678901 shown here should be replaced with your AWS Account Number. This should be visible on the AWS Manage Your Account page, currently shown at the top-right. Remove the hyphens and put it in here.
- In Resource put your actual Bucket Name in place of your-bucket-name, and set the path to point to the folder your private files are in. The asterisk on the end means this policy applies to all the files in that folder.
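If you’d rather set the policy from code than click around in the console, boto can do that too. This is only a sketch, mirroring the JSON above, and your AWS credentials will need permission to set bucket policies:

import json

from django.conf import settings
from boto.s3.connection import S3Connection

# The policy document from above, with your account number and bucket name filled in.
policy = {
    "Version": "2008-10-17",
    "Id": "My Special Bucket Policy",
    "Statement": [{
        "Sid": "Allow Signed Downloads for Private Files",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::12345678901:root"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::your-bucket-name/media/seekrit/*",
    }],
}

conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(settings.AWS_STORAGE_BUCKET_NAME)
bucket.set_policy(json.dumps(policy))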
This seems to work for me, although I can’t claim to understand it in great depth.
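Once the policy is in place, one way to convince yourself the whole thing works is to generate a signed URL by hand in a Django shell and paste it into a browser before it expires. Again a rough sketch, using an example key path:

from django.conf import settings
from boto.s3.connection import S3Connection

s3 = S3Connection(settings.AWS_ACCESS_KEY_ID,
                  settings.AWS_SECRET_ACCESS_KEY,
                  is_secure=True)
# Example key path; use a private file you've actually uploaded.
url = s3.generate_url(60, 'GET',
                      bucket=settings.AWS_STORAGE_BUCKET_NAME,
                      key=settings.MEDIA_DIRECTORY + 'seekrit/test_file.pdf',
                      force_http=True)
print(url)  # Valid for 60 seconds; after that it should stop working.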
That’s it
And there we go. I found most of that via Googling, but none of it was all in one place and it took way too long to piece together. Hopefully it’ll be useful to others.
Bear in mind that I DON’T REALLY KNOW WHAT I’M DOING and may have got things wrong. If you spot anything that could be improved, please do let me know (email or Twitter).