This is a complete beginner guide to Amazon Cognito. you’ll learn about User Pools, Identity Pools/Federated Identities, and how to tie them together.
Amazon Cognito is a huge service that offers many authentication and authorization features. Folks tend to get intimidated by the service because not only do you need to learn about Amazon Cognito itself, but also key Authentication and Authorization concepts.
The purpose of this article is to explain the core concepts of Cognito from a beginner perspective. I’ll also give you the TLDR of Authentication and Authorization concepts to round out your understanding.
By the end of this article, you’ll have a solid understanding of what Cognito has to offer and how it can be used to provide user login and access control to AWS services. This article requires zero understanding of how authentiction/authorization concepts work. We’ll start from essentially zero knowledge.
But before we get started with the features, lets quickly overview some important definitions you’ll want to understand. I’ve attempted to classify the definitions into categories that are specific to Cognito, and others that are more generally related to authentication/authorization. Don’t get intimidated if you don’t understand the specifics of these definitions right now – they’ll become more clear throughout the article. Feel free to scroll back to the following section if you need a reminder.
- Authentication – Sometimes called authN, Authentication refers to identifying who a user is. This is commonly accomplished with a login form and a username/password. However, a user can be authenticated using other means such as biometrics (such as using your face id when opening your iPhone).
- Authorization – Sometimes called authZ, Authorization refers to what a user has access to.
- Authorization Server – Despite having the name Authorization in it, an Authorization Server is also responsible for Authenticating the user. The Authorization Server generates tokens that are used to identify a user and guard access to backend resources. Authorization servers are also responsible for communicate with other Identity Providers such as Social Sign On providers (Google, Facebook, etc) to authenticate the user.
- OAuth 2.0 – The internet standard protocol for authorizing users to an application. Is a delegated authorization framework that allows users to obtain access to applications.
- OpenID Connect – An extension of OAuth 2.0 that offers authentication (login) features.
- Identity Token – An encoded string, that once decoded contains information about the user (name, email, etc).
- Access Token – An encoded string that is used to validate a user’s access to a resource server.
- Authorization Code – A temporary code (string) that is exchanged for an Access Token. Is used in the Authorization Code flow described below.
- Implicit Grant Flow – An OAuth 2.0 flow used to grant access tokens to users. This flow has security loopholes and should be avoided.
- Authorization Code Flow – An OAuth 2.0 flow used to grant authorization codes to users, which are then exchanged for access tokens. This is an improvement upon the Implicit Grant flow that has a better security profile and fewer loopholes.
- Scopes – Scope is a mechanism in OAuth 2.0 to limit an application’s access to a user’s account.
- Claims – Keys/Values about the user encoded in the Access/ID token.
- Identity Provider – Sometimes called IDP, is a system that provides authentication services to client applications. Cognito is an Identity Provider, as is Google and Facebook when leveraging social sign on.
- Social Sign On – A method to use a third party Identity Provider to access your Cognito User Pool.
Cognito Specific Definitions
- User Pools – A directory of users that sign up either directly through Amazon Cognito or via a Social Sign On Identity Provider (Google, Facebook, etc.).
- Federated Identities / Identity Pools – A method to grant Cognito users temporary access to AWS services.
- Hosted UI – A Cognito provided URL / Website that allows users to sign in to your userpool. Post successful authentication, the user is re-directed back to your website (sometimes called callback URL) with either an Access Token (in the case of Implicit Grant flow) or Authorization Code (in the case of Authorization Grant flow).
- Groups – Users in your Cognito User pool can be tagged as a member of a group that you define (i.e. student, instructor) to grant different levels of access.
- Triggers – Snippets of code that can be run at different stages during the login process (i.e. pre-signup, post-signup, post-login). Triggers can be used to add different functionality such as tagging a user as part of a group after they initially register.
Great – now we have an understanding of some common authentication/authorization and Cognito specific terms. Lets jump into discussing the two key categories of Cognito: User Pools and Federated Identities.
User Pools and Federated Identities (previously known as Identity Pools)
The first thing to understand about Amazon Cognito is that it is broadly separated into two distinct categories – User Pools and Federated Identities (up until recently, referred to as Identity Pools). Lets briefly overview what these two are before diving into the details for both.
User Pools at a High Level
To put it simply, a User Pool is a user directory within Amazon Cognito. It contains records of users that either sign up directly in Amazon Cognito, by Social Identity Providers (Google, Facebook, LWA, etc) or by using legacy SAML Identity Providers. Essentially, think of a user pool as a table of user records. User Pools abstract away the complexity of you as a developer having to store and manage usernames, passwords, logins, sign ups, lost passwords, so on and so forth.
As an administrator of Cognito and owner of a User Pool, you set the rules in terms of which Identity Providers an individual can use to register for your application. Some prefer to only allow users to sign up directly within Cognito. Others prefer to offer social sign on which allows you to delegate authentication to the other service like Google, Facebook, and others. Authentication is achieved through a well known OAuth 2.0 extension called Open ID Connect (OIDC). All well known Social Sign On providers like Google and Facebook support this well known protocol. You can learn more about OIDC here.
Here’s a brief image outlining the role of User Pools in the ecosystem of Amazon Cognito:
The key thing to remember is that User Pools are all about user directories and user management.
Now we have a high level understanding of what User Pools are. Lets briefly cover the other half of Cognito – Federated Identities.
Federated Identities at a High Level
Federated Identities (formerly Identity Pools) are a way for you to uniquely identify users and assign temporary access to AWS resources. With an Identity pool you link your User Pool or other Social Identity providers such as Google/Facebook. Afterwards, you can set up IAM Roles that users will inherit. You can set up some pretty complicated rule matching to specify which users (based on attribute or other means) will get to use which IAM Role. As a quick reminder, IAM stands for Identity and Access Management and is the primary AWS service used to manage access to AWS resources.
You can also have “Guest Users” that use a default role. This allows you to create applications that provide restricted temporary credentials to users that do not yet register for an account in your application.
The thing to remember about Federated Identities is that they are used to manage what AWS resources a user has access to.
Now that we have a high level understanding of the two big concepts in Cognito, lets dive into each of them more in detail – starting with User Pools.
User Pools in Detail
Part of the reason many folks get confused by User Pools is that there are so many additional features that Cognito offers. But don’t fret – we’re going to tackle these one at a time. So let’s go.
Creating a User Pool
The first step to use User Pools is to create one. When you initially set up a User Pool there are many settings you as an administrator need to specify. This includes things like:
- The name of your user pool.
- Which attributes you will require from the user (phone number, email, address, etc).
- Password policies such as minimum length or combinations of numbers and special characters.
- Requiring Multi Factor Authentication (MFA).
- Post User Signup Email Verification.
- Triggers to run Lambda functions at specific points during the user sign up, login, and many other hooks to provide custom functionality.
You can see the multitude of options available when creating a user pool in the screen shot below. Do note that the Cognito UI is currently undergoing an overhaul and may look slightly different depending on when you access it. The below screenshot is the original or legacy version of the UI.
Remember that user pools are simply a collection of users. But what if we want to leverage this pool of users for a multitude of different applications? This is where User Pool Application Clients come into the picture. We discuss them in the next section.
User Pool Application Clients
According to AWS, a Cognito Appliction Client is “an entity within a user pool that has permission to call unauthenticated API operations“. This is a pretty meaningless definition that doesn’t tell you much.
I like to think of an Application Client as a literally a different application that wants to interact with (sign up, login) with your user pool. You may use one app client for your web app, and another for your mobile application.
Application Clients are created by the administrator of the user pool. Post creation, you are provided with a public client secret. This secret is to be used in all API interactions with Cognito. The secret is used to identify the Application Client and distinguish it from others.
Remember that in order for a User Pool to provide any functionality, you need at least one application client. This makes sense since a user pool without a client to interact with it is effectively useless.
You don’t need to know too much more about Application Clients. But if you’re curious you can read more about them here.
Once you set up your first application client, you now have access to the Cognito Hosted UI which is our topic for our next section.
Cognito Hosted UI
The Cognito Hosted UI is a website provided by Amazon Cognito that allows the user to sign up or log in to your User Pool. Each Application Client has their own URL to access your User Pool.
Note that you do not HAVE to use the Hosted UI to log in your users. If you are looking to build a custom front end experience that captures the user’s username and password, you can do so by calling specific Cognito APIs using the AWS Cognito SDK. For example, the InitiateAuth API as documented here allows you to submit user usernames/passwords. This option may seem attractive, but it is much more complicated than using the hosted ui. Reason being, you will need to also implement all other API interactions with cognito such as user sign up, password recovery, changing passwords, etc. All of this is handled for you out of the box if you use the hosted UI.
I would suggest folks avoid this approach unless they have a ton of time on their hands or explicitly require a custom login experience.
Setting up your Hosted UI looks something like the screenshot below
The selected Identity Providers is Cognito User Pool. This means that currently, the only way for the user to sign up and log in with your user pool is by registering as a user directly in your user pool. In other words, Cognito will act as the Identity Provider.
Another important section is the Callback URL and Signout URL. These are whitelisted URL endpoints that should be owned by your application. How the flow essentially works is that once a user successfully signs in with their credentials through the hosted ui, Cognito will redirect them to the specific Callback URL. In this case, www.mywebsite.com/cb. Added to the query string will be either an access token (if using the implicit grant flow which is not a suggested approach) or an authorization code (a more secure approach). Additionally, if you are using the profile and openid scope, you will get access to the user’s profile attributes such as name, email, and address. This data is provided through the Identity Token which is also returned post sign in.
I want to take a quick detour to explain the difference between Implicit Grant and Authorization Code Grant. Note that these are OAuth 2.0 concepts and not something specific to Cognito.
Implicit Grant Flow
In the implicit grant flow, the interaction between client and server takes on this form:
As you can see in the diagram, the flow is quite simple – just replace “Okta” with Cognito. Post authentication, Cognito will redirect your client to your application’s callback URL. Embedded within the query string parameters will be an access token. The access token is then used in subsequent calls to your backend APIs. Your backend then cross-checks the access token with Cognito before letting through the request. The downside of this flow is that the access token is directly embedded in the URL. This is a security vulnerability that makes it possible for bad actors (browser extensions, packet sniffers) to get access to your token. This can be a big danger.
The Implicit Grant is kind of outdated and no longer suggested as an approach to use. Instead, you should consider using the Authorization Code Flow/Grant which is described below.
Authorization Code Flow
The Authorization Code Flow is a slightly more complicated flow but has better security characteristics. An Image that describes (again from Okta) can be seen below.