Description
Your previous assignment was to build an AppEngine app that stored a few things and returned them from a JSON-enabled endpoint when requested. You also have some mutating functions that allow you to add events and (optionally, until now) delete existing ones.
For this assignment, you will add authentication so that all of the events are tied to one and only one user; every user manages a unique set of data just for them. That user must be logged in to see their events. This means you will create a registration form, a login form, and send and receive session cookies.
While it is tempting to use the built-in Python 2 User class, for this assignment you will be rolling your very own authentication[1]. If a person is not logged in, they will be offered a login screen, and if they are not registered, you will provide a way for them to sign up with their own chosen username and password.
Because your existing data is not attached to a user, you will need to create a user for yourself and then migrate the existing data over to it. Migrating data is part of the assignment, so you will need to demonstrate code that does the migration, along with a written strategy for how it is used.
With Google AppEngine’s datastore, you will probably want to use the user as a parent entity for the event data. Before, you used a constant parent entity (I called it “ROOT” in the examples), and now you will set up some kind of a user key that becomes the parent of all events.
If you have not successfully implemented all of the necessary facilities in the previous lab, a starting implementation will be provided for you.
As with the previous lab, you will turn in code and demonstrate functionality using a public cloud and link to your working system.
Minimum Requirements
● HTML/JavaScript still served statically, dynamic requests through JSON requests
● All interactions are over HTTPS.
● At least one user can log in with a password.
● When not logged in, the user is automatically directed to a login page.
● Create a secure session cookie on login, use it for ongoing communications.
● All JSON endpoints are protected; they also require a session token.
● Only secure derivatives of passwords are stored in the database. Use key stretching.
● Original data (that was not previously attached to a user) is migrated to a user.
My suggestion for storing password derivatives (for Python users) is to include Python bcrypt libraries inside your project directory, maybe under a “lib” folder. See this StackOverflow answer for one way to do that.
Overview
Password-based authentication is fairly simple (and not the most secure thing you can do). It consists of asking a user for their username and password over a secure channel, checking them against a database, and sending back a session token if they match. That session token can then be used to access your site for a time.
This lab involves using cookies. You should not use built-in functions for creating sessions (e.g., flask.session) this time around because we’re doing this to improve your mental model, not your productivity. In real life you would use a library for this, but this is an important point: in real life you will likely be the one evaluating whether that library does the right thing.
Best to know how it works.
From the user’s perspective, there are two pages: they try to go to /, but get a login form instead. They fill it in, and then they get the main page (/).
From the server’s perspective, when the user is not yet authenticated, it handles four distinct requests:
1. A GET request for /. No valid session cookie, so it redirects the browser to /login instead.
2. A GET request for /login[3]. Returns the login form. Best if it includes a CSRF token.
3. A POST request for /login[4] containing the username and password (and CSRF token if you use one). Server checks password by running the appropriate function, creates and stores a session token, and sets a cookie containing that token.
4. A GET request for /. Now it has the session cookie. The server finds that session token in the database, sees what user is in there, and provides that user’s main page content.
The server really handles 3 different endpoints here: GET /, GET /login, and POST /login. Any content that should be protected by login will need to do the session cookie check with redirect. Here we use “/” as that path, but really all paths containing user content will need this check/redirect behavior. The GET /login handler returns a login form, and POST /login accepts the username and password from that form, checks the password, and creates the session token.
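The three endpoints can be sketched without any web framework, which may help when building the mental model. This is a minimal sketch, not a finished implementation: the names handle, SESSIONS, LOGIN_FORM, and check_password are hypothetical, the session store is a plain dictionary standing in for the datastore, the password check is a hard-coded placeholder, and the 303 redirect after a successful POST is one reasonable choice among several.

```python
import secrets

# Hypothetical in-memory stand-in for the session documents in the datastore.
SESSIONS = {}  # session token -> username
LOGIN_FORM = "<form method='post' action='/login'>...</form>"


def check_password(username, password):
    # PLACEHOLDER ONLY: a real app compares stretched password derivatives
    # loaded from the user record, never a hard-coded plaintext value.
    return username == "demo" and password == "correct horse"


def handle(method, path, cookies=None, form=None):
    """Return (status, headers, body) for one request."""
    cookies = cookies or {}
    form = form or {}

    if method == "GET" and path == "/login":
        return 200, {}, LOGIN_FORM

    if method == "POST" and path == "/login":
        username = form.get("username", "")
        if not check_password(username, form.get("password", "")):
            return 401, {}, LOGIN_FORM
        token = secrets.token_urlsafe(32)
        SESSIONS[token] = username
        # Cookie attributes (Secure, HttpOnly, expiry) are omitted for brevity.
        headers = {"Location": "/", "Set-Cookie": "session=" + token}
        return 303, headers, ""

    # Every other path is treated as protected content.
    user = SESSIONS.get(cookies.get("session", ""))
    if user is None:
        return 302, {"Location": "/login"}, ""
    return 200, {}, "main page for " + user
```

Calling handle("GET", "/") with no cookie yields the redirect to /login; repeating the call with the cookie issued by a successful POST /login yields the main page.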
The process is basically as depicted below:
CSRF Tokens
We’ll get deeper into cross-site request forgery issues later on, and you won’t be required to implement CSRF protection for this lab. It is important to know, however, that you should never ever make a mutating POST request without a CSRF token. It’s super dangerous to leave that out. Again, we’ll cover that in more depth later on, so for now we’re going to punt on it.
Non-Storage of Passwords
As part of this assignment, you will be creating a form that sends a username and password from a user’s browser to your service. This can happen when registering a new user or merely logging in an existing one. Because these will be sent over the Internet, you must ensure that the connection when POSTing this information to the server is always secure.
To do this in AppEngine, you can set “secure: always” in each handler in app.yaml. That will force SSL. Note that it won’t necessarily redirect you from HTTP to HTTPS, so you might need to ensure that you’re typing “https://” in your URL bar as you test.
If you wanted to be especially careful, you could compute an appropriate hash of the password using a cryptographic hash like SHA-256 in the browser and send that hash to the server. Note that this does not mean you can avoid using SSL: if someone hacks your site or is otherwise able to inject their own JavaScript into the transmission (because it’s an unsecured channel and they could feasibly execute a meddler-in-the-middle attack), then they would be able to exfiltrate the password with some malicious scripting while still preserving overall site functionality. In short, you always want end-to-end encryption when sending sensitive data — even if it’s a cryptographic hash of sensitive data — to a server.
Note that it is not at all typical to compute hashes in the browser like this. For one thing, you can never guarantee that a company is doing that, and for another, SSL means you already trust the endpoint and have an encrypted channel to it. Therefore, passwords are sent in plaintext over the encrypted channel. What’s special is what happens when they arrive: they are never to be stored in plaintext.
Again, your server should not store the actual password, but a cryptographic derivative (e.g., a secure hash) of it. When testing whether a username and password are correct, therefore, you will not compare passwords, but two hashes: one stored, and one computed from what the user typed during login. Use key stretching, something like Bcrypt or Argon2. Libraries exist for these: don’t make your own.
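To make "store a derivative, compare derivatives" concrete, here is a minimal sketch. It deliberately uses PBKDF2 from the Python standard library as a stand-in so that it runs with no extra packages; it is not Bcrypt or Argon2, and for the lab you would swap in one of those libraries. The function names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, key). Store these three, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, key


def verify_password(password, salt, iterations, stored_key):
    """Re-derive the key from what the user typed and compare the two keys."""
    _, _, candidate = hash_password(password, salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_key)
```

The salt and iteration count are stored next to the derived key so that the same derivation can be repeated at login time.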
User Records
With AppEngine, it is possible to create what’s called a “parent document”[5]. We used a single universal one in the previous lab to ensure ordered database operations (everything was under a single “ROOT” key). For this assignment, you will create a document whose key is a user ID. The user document will contain a cryptographic derivative of the user’s password, and potentially other things that you think it ought to contain.
Shared Session Tokens
If you require a user to log in every time they do anything on the site, you and your users are going to have a bad time. Therefore, once you have identified the user as being in your system, and the password is correct, you should create a random secret token that you then pass back in the Set-Cookie header of the response. Take very careful note of the domain of the cookie. You are likely going to be operating on somedomain.appspot.com, and appspot.com is shared with every other AppEngine app, so you will need to ensure that your cookie is scoped to your own host and not to appspot.com as a whole. (A cookie set without a Domain attribute is sent back only to the exact host that set it.) You will also want to set an expiration time for the cookie so that the browser knows to age it out.
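One way to build the token and the Set-Cookie value is sketched below using only the standard library. The cookie name "session", the function name, and the one-hour lifetime are assumptions. The sketch sets no Domain attribute; adjust that to whatever scope you decide is appropriate for your host.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_cookie(max_age_seconds=3600):
    """Return (token, set_cookie_value) for a freshly created session."""
    # 32 random bytes, URL-safe encoded: unguessable and cookie-friendly.
    token = secrets.token_urlsafe(32)

    cookie = SimpleCookie()
    cookie["session"] = token
    morsel = cookie["session"]
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds  # lets the browser age it out
    morsel["secure"] = True              # only ever sent over HTTPS
    morsel["httponly"] = True            # not readable from page JavaScript

    # OutputString() gives the text that goes after "Set-Cookie: ".
    return token, morsel.OutputString()
```

The returned token is what you store on the server side; the returned string is what you put in the response header.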
Because the same user can log in from multiple places or devices, you will need a separate session document to keep track of sessions. This document will likely be keyed on the session token, and will contain a reference to the user. Because of the way the datastore works, you should have a special root session key under which all of your sessions live.
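The shape of such a session document can be sketched with a dictionary standing in for the datastore. Everything here is an assumption for illustration: the field names user_id and expires_at, the function names, and the choice to store an expiry time on the server as well as in the cookie. In the real app each record would be an entity under your root session key.

```python
import secrets
import time

# Hypothetical in-memory stand-in: token -> session document.
SESSIONS = {}


def create_session(user_id, ttl_seconds=3600, now=None):
    """Create a new session for user_id and return its token."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = {"user_id": user_id, "expires_at": now + ttl_seconds}
    return token


def lookup_user(token, now=None):
    """Return the user for a live session token, or None."""
    now = time.time() if now is None else now
    record = SESSIONS.get(token)
    if record is None:
        return None
    if record["expires_at"] <= now:
        del SESSIONS[token]  # expired: clean it up and treat as logged out
        return None
    return record["user_id"]
```

Because each login creates its own record, the same user can hold several live sessions at once, one per device.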
It is also fairly important that you ensure that the user has a reasonable experience when the session token expires. For example, since you are asking for JSON information and posting in a similar way, it is possible that a user will load the site, see their dates, then leave it open for a while before attempting to make a change. What will you do to ensure that something reasonable happens when your backend request (which is not tied to a page load!) fails due to a need to log in? How will you surface that to the user when the request is happening in the background?
One option would be to continually refresh the session in the background. Another would be to trigger a full reload to an error page if a background request fails because of an expired session. You decide, just let us know what you choose!
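Whichever option you pick, the JSON endpoints need to answer an expired session with JSON rather than an HTML redirect, since the browser is not doing a page load. A minimal sketch of that server-side guard follows. The function name, the 401 status, and the "error" field are assumptions; the "redirect" field is simply one way to tell the client code where to send the user.

```python
import json

JSON_HEADERS = {"Content-Type": "application/json"}


def guard_json_endpoint(user_id, handler):
    """Run handler(user_id) for a logged-in user; otherwise answer in JSON.

    user_id is whatever the session lookup returned (None if no live session).
    Returns (status, headers, body).
    """
    if user_id is None:
        body = json.dumps({"error": "session_expired", "redirect": "/login"})
        return 401, JSON_HEADERS, body
    return 200, JSON_HEADERS, json.dumps(handler(user_id))
```

The client code can then check for a 401 (or for the redirect field) on every background request and react in the way you have chosen.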
Migrating Data
Once you have registered your first user (or a user that you want to have the original data), you should migrate the old data over to that user so that it is now behind a login. Make a migration plan and write it up. It can be a one-page document. Then execute the plan and verify that your data shows up for that user when logging in. Importantly, also test that it does not show up for other users.
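As a starting point for the plan, keep in mind that a datastore entity's parent is part of its key, so an event cannot simply be "moved": it has to be copied under the new parent and the original deleted. The sketch below models that with a dictionary whose keys are tuples of (kind, id) pairs. The store layout, the kind names "Root", "User", and "Event", and the function name are all hypothetical stand-ins for your real datastore calls.

```python
def migrate_events(store, old_parent, new_parent):
    """Re-parent every Event directly under old_parent to new_parent.

    store maps key paths (tuples of (kind, id) pairs) to entity dicts.
    Returns the number of events moved. Running it again is harmless.
    """
    moved = 0
    for key in list(store):  # copy the keys: the dict is mutated below
        parent, (kind, ident) = key[:-1], key[-1]
        if parent != old_parent or kind != "Event":
            continue
        new_key = new_parent + ((kind, ident),)
        store[new_key] = dict(store[key])  # copy under the user
        del store[key]                     # then remove the original
        moved += 1
    return moved


# Example with made-up data: two legacy events under ROOT, one owned by bob.
ROOT = (("Root", "ROOT"),)
ALICE = (("User", "alice"),)
BOB = (("User", "bob"),)
store = {
    ROOT + (("Event", 1),): {"name": "Birthday", "date": "2024-05-01"},
    ROOT + (("Event", 2),): {"name": "Exam", "date": "2024-06-10"},
    BOB + (("Event", 3),): {"name": "Trip", "date": "2024-07-04"},
}
moved_count = migrate_events(store, ROOT, ALICE)
```

After the run, nothing is left under ROOT and bob's event is untouched, which mirrors the two checks asked for above: the data shows up for the target user and does not show up for anyone else.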
Write-Up
For this lab, write up your migration plan. Discuss as part of the write-up what would need to happen (including how expiration of tokens would be handled, data migrated, etc.) if the following occurred:
● A user desires to change their username
● A user desires to change their password
● A user desires to delete their account and all associated data
● A user loses their password
● A user has their password stolen and used by someone else
Include what you learned. What surprised you about this task? What seems like it should be easier than it was? What seemed like it was easier than expected?
This write-up can be relatively short, but should be complete enough that someone technical could do the things you did.
Random Advice
As before, I will be posting random bits of advice on the lab, with code snippets, etc., as the week progresses. If you are stuck, make sure you use the labs channel to get help. Also, help your peers. I’m watching participation, and that’s a great way of getting a good participation score. Be helpful, start conversations, collaborate. Remember that the only requirement that I have for lab copying is that you write your own code using the ideas that you share with others. Share ideas, share solutions, write your own code.
[1] Never do this in real life. Great for a class, terrible for the world. Always use a well-tested library for authentication. ALWAYS.
[2] Hint: you will need to send a JSON message back to the browser indicating it should redirect, and your client code will see that message and do something like window.location = data.redirect (or something like that).
[3] Hey, server, you started it. Stop complaining.
[4] Usually we use GET for reading things and POST for writing.
[5] A document is basically a Python dictionary when your Python code sees it.