Building a web-based file browser for S3 sounds straightforward. List objects, show them in a folder tree, let users click to download. But once you add authentication, authorization, and the realities of browser-based access to AWS services, the complexity multiplies quickly.
This post walks through the architecture of a production-grade S3 file browser with Cognito authentication, the non-obvious challenges you'll face, and the design decisions that matter.
The Target Architecture
A well-designed S3 file browser has five layers:
Browser (React SPA)
↓
CloudFront (CDN + routing)
↓
API Gateway (REST API)
↓
Lambda (business logic)
↓
S3 (file storage) + DynamoDB (config/permissions) + Cognito (auth)
CloudFront serves dual duty: it hosts the static React frontend and proxies API requests to API Gateway under a unified domain. This eliminates CORS issues between the frontend and backend since both are served from the same origin.
API Gateway with a Cognito authorizer validates JWT tokens on every request. If the token is invalid or expired, the request is rejected before it reaches Lambda.
Lambda functions handle the business logic: listing files, generating pre-signed download URLs, and enforcing access policies. They read configuration from DynamoDB to determine which buckets and prefixes each user group can access.
Challenge 1: CloudFront + API Gateway Routing
The first real challenge is setting up CloudFront to route correctly. You need two origin behaviors:
- /api/* routes to API Gateway
- Everything else routes to the S3 bucket hosting the React frontend
The tricky part is that API Gateway expects specific headers and path formatting. You'll need to configure CloudFront to forward the Authorization header to the API origin (it strips it by default), set the correct origin path to strip the /api prefix, and handle the API Gateway stage name in the URL.
If you're using CDK, this means creating an HttpOrigin for API Gateway with a custom origin path, and a BehaviorOptions that allows the necessary headers. Getting this wrong results in either 403s from API Gateway or the Cognito authorizer rejecting requests because it never receives the token.
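As a rough sketch of that CDK wiring (assumes CDK v2 with aws-cdk-lib; api and siteBucket are placeholders for constructs defined elsewhere in your stack):

```typescript
import * as cdk from "aws-cdk-lib";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";

// Placeholders: assumed to be defined elsewhere in the stack.
declare const stack: cdk.Stack;
declare const api: cdk.aws_apigateway.RestApi;
declare const siteBucket: cdk.aws_s3.Bucket;

// API origin: point at the execute-api domain and bake the stage name
// into originPath so CloudFront prepends it to every forwarded path.
const apiOrigin = new origins.HttpOrigin(
  `${api.restApiId}.execute-api.${stack.region}.amazonaws.com`,
  { originPath: `/${api.deploymentStage.stageName}` }
);

new cloudfront.Distribution(stack, "Dist", {
  // Default behavior serves the static React frontend from S3.
  defaultBehavior: {
    origin: new origins.S3Origin(siteBucket),
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
  },
  additionalBehaviors: {
    // /api/* goes to API Gateway. Caching is disabled, and the origin
    // request policy forwards the Authorization header -- without that,
    // the Cognito authorizer never sees the token and returns 401s.
    "/api/*": {
      origin: apiOrigin,
      allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
      cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
      originRequestPolicy:
        cloudfront.OriginRequestPolicy.ALL_VIEWER_EXCEPT_HOST_HEADER,
    },
  },
});
```

Note that CloudFront appends the full viewer path (including /api) after originPath, so the simplest arrangement is to define your API Gateway resources under an /api root resource rather than rewriting paths in CloudFront.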
Challenge 2: Cognito Token Management in the Browser
Cognito issues three tokens: an ID token, an access token, and a refresh token. Your frontend needs to manage all three correctly.
The ID token contains the user's identity claims (email, name, Cognito groups). The access token is what you send to API Gateway. The refresh token lets you get new tokens without forcing the user to log in again.
The subtle issues:
- Token expiry: Access tokens expire after 1 hour by default. Your API client needs to detect 401 responses, use the refresh token to get new tokens, and retry the original request transparently.
- Token storage: Storing tokens in localStorage is convenient but vulnerable to XSS. The amazon-cognito-identity-js library uses localStorage by default. For higher-security applications, consider httpOnly cookies through a backend token exchange.
- Group claims: Cognito groups are included in the ID token under the cognito:groups claim. API Gateway's Cognito authorizer validates the token but doesn't enforce group membership. You need to check groups in your Lambda function.
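The refresh-and-retry logic is worth isolating in a small wrapper around your HTTP client. A minimal sketch, with the fetch and refresh operations injected so the retry logic is testable in isolation (FetchLike, TokenStore, and authedFetch are illustrative names, not from any SDK):

```typescript
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ status: number; body: string }>;

interface TokenStore {
  accessToken: string;
  // Exchanges the refresh token for a new access token;
  // how depends on your auth library.
  refresh: () => Promise<string>;
}

// Sends the request with the current access token; on a 401, refreshes
// once and transparently retries the original request.
async function authedFetch(
  fetchFn: FetchLike,
  tokens: TokenStore,
  url: string
): Promise<{ status: number; body: string }> {
  const attempt = (token: string) =>
    fetchFn(url, { headers: { Authorization: `Bearer ${token}` } });

  let res = await attempt(tokens.accessToken);
  if (res.status === 401) {
    // If refresh() throws, the refresh token itself has expired:
    // let that propagate and force a fresh login.
    tokens.accessToken = await tokens.refresh();
    res = await attempt(tokens.accessToken);
  }
  return res;
}
```

Only one refresh is attempted per request; a second 401 surfaces to the caller rather than looping.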
Challenge 3: S3 Listing with Pagination
S3's ListObjectsV2 API returns a maximum of 1,000 objects per call. For buckets with thousands of files, you need to handle pagination with ContinuationToken.
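The pagination loop itself is mechanical. A sketch with the list call injected so it runs without AWS (listPage stands in for an S3 ListObjectsV2 call; Page mirrors the relevant response fields):

```typescript
interface Page {
  Contents: { Key: string }[];
  NextContinuationToken?: string;
}

// Keeps calling the injected list function, passing back the
// ContinuationToken from each page until S3 reports no more pages.
async function listAllKeys(
  listPage: (token?: string) => Promise<Page>
): Promise<string[]> {
  const keys: string[] = [];
  let token: string | undefined;
  do {
    const page = await listPage(token);
    keys.push(...page.Contents.map((o) => o.Key));
    token = page.NextContinuationToken;
  } while (token);
  return keys;
}
```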
But there's a UX consideration: you don't want to list every object in a bucket. Users expect folder-based navigation. S3 doesn't have real folders, but you can simulate them using the Delimiter and Prefix parameters. Setting Delimiter to / returns CommonPrefixes (folders) and objects at the current level only.
Your Lambda function should accept a prefix parameter and return two lists: folders (common prefixes) and files (objects at the current level). The frontend renders these as a navigable folder tree.
Another gotcha: S3 returns the full key for each object, including the prefix. Your frontend should strip the current prefix to show just the filename, and handle edge cases like empty folders (which are sometimes represented as zero-byte objects with a trailing /).
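Shaping the raw listing into that folder/file view is a pure transformation, sketched below (ListResult mirrors the relevant ListObjectsV2 response fields; toListing is an illustrative name):

```typescript
interface ListResult {
  CommonPrefixes?: { Prefix: string }[];
  Contents?: { Key: string; Size: number }[];
}

// Converts a ListObjectsV2-style response (Delimiter="/") into the view
// the frontend renders: strips the current prefix from every name and
// drops zero-byte "folder marker" objects whose key ends with "/".
function toListing(prefix: string, res: ListResult) {
  const folders = (res.CommonPrefixes ?? []).map((p) =>
    p.Prefix.slice(prefix.length).replace(/\/$/, "")
  );
  const files = (res.Contents ?? [])
    .filter((o) => !(o.Size === 0 && o.Key.endsWith("/")))
    .map((o) => ({ name: o.Key.slice(prefix.length), size: o.Size }));
  return { folders, files };
}
```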
Challenge 4: Pre-Signed URL Generation
When a user clicks "download," your Lambda function generates an S3 pre-signed URL and returns it to the browser. The browser then redirects to or fetches from that URL directly.
Key decisions:
- URL expiry: Shorter is more secure. 5-15 minutes is typical. Too short and slow downloads might fail. Too long and shared URLs become a security risk.
- Content-Disposition: Include a ResponseContentDisposition parameter in the pre-signed URL to force the browser to download the file rather than rendering it inline. Without this, PDFs and images will open in the browser tab.
- Filename encoding: Files with special characters in their names (spaces, Unicode, etc.) need careful URL encoding in both the S3 key and the Content-Disposition header.
The Lambda function must verify that the requesting user's group has access to the requested file before generating the URL. This authorization check is your primary security boundary.
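Building the Content-Disposition value you pass as ResponseContentDisposition is the fiddly part for non-ASCII names. One approach is the dual form from RFC 6266/5987 (an ASCII fallback plus an encoded filename*), which current browsers handle well; attachmentDisposition is an illustrative helper name:

```typescript
// Builds a Content-Disposition that forces download and survives
// spaces and Unicode: a quoted ASCII fallback for old clients plus
// an RFC 5987 filename* with percent-encoded UTF-8 bytes.
function attachmentDisposition(filename: string): string {
  // ASCII fallback: replace non-ASCII and quote-breaking characters.
  const fallback = filename.replace(/[^\x20-\x7e]|"/g, "_");
  // RFC 5987 form: encodeURIComponent leaves !'()* bare, but RFC 5987
  // requires '()* to be percent-encoded too.
  const encoded = encodeURIComponent(filename).replace(/['()*]/g, (c) =>
    "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`;
}
```

For example, attachmentDisposition("q1 report.pdf") yields attachment; filename="q1 report.pdf"; filename*=UTF-8''q1%20report.pdf.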
Challenge 5: Group-Based Access Control
This is where the design gets interesting. You need a data model that maps Cognito groups to S3 bucket/prefix permissions, and a way for admins to manage these mappings.
A DynamoDB single-table design works well here. Each record maps a group name to a bucket name and allowed prefix. Your Lambda functions query this table to determine what a user can see and download based on their Cognito group membership.
The complexity increases when users belong to multiple groups. Your file listing endpoint needs to aggregate permissions across all groups, deduplicate results, and present a unified view. If group A has access to reports/ and group B has access to reports/confidential/, a user in both groups should see the full reports/ tree without duplicate entries.
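The deduplication step can be expressed as a small pure function over the grants fetched from DynamoDB. A sketch under the single-table model described above (GroupGrant and effectiveGrants are illustrative names):

```typescript
interface GroupGrant {
  bucket: string;
  prefix: string; // e.g. "reports/" or "reports/confidential/"
}

// Aggregates grants across all of a user's groups: drops exact
// duplicates, and drops any grant already covered by a broader one
// (reports/confidential/ is redundant if reports/ is granted on the
// same bucket).
function effectiveGrants(grants: GroupGrant[]): GroupGrant[] {
  const result: GroupGrant[] = [];
  for (const g of grants) {
    const covered = grants.some(
      (other) =>
        other !== g &&
        other.bucket === g.bucket &&
        g.prefix.startsWith(other.prefix) &&
        other.prefix.length < g.prefix.length
    );
    const duplicate = result.some(
      (r) => r.bucket === g.bucket && r.prefix === g.prefix
    );
    if (!covered && !duplicate) result.push(g);
  }
  return result;
}
```

Listing and download checks can then both run against the reduced grant set, so the user in both groups sees one reports/ tree.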
Challenge 6: Error Handling and Edge Cases
Production file browsers need to handle many edge cases gracefully:
- Buckets in different regions: Pre-signed URLs must be generated with the correct region endpoint. A us-east-1 pre-signed URL won't work for an eu-west-1 bucket.
- Large files: Downloads over a few gigabytes may time out or fail. Consider using multipart download with range headers for very large files.
- Deleted objects: A file might be deleted between listing and download. Handle 404s gracefully in the frontend.
- Versioned buckets: Decide whether to show object versions or just the latest version.
- Empty states: What does the UI show when a user has no group memberships, or their groups have no bucket access configured?
Is It Worth Building from Scratch?
If you have specific requirements that no existing product meets, building a custom S3 file browser is absolutely viable. The architecture above is well-understood and uses standard AWS services.
But the total effort is significant. Between the infrastructure setup, the Lambda functions, the React frontend with auth handling, the admin interface for managing permissions, and the testing and hardening work, you're looking at 3-6 weeks of focused development for a senior engineer. And then you own the maintenance.
BucketDrive implements this entire architecture as a single deployable CloudFormation stack. The frontend, backend, auth layer, access control, and audit logging are all pre-built and tested. It deploys into your AWS account in 5 minutes and gives you the same architecture described above without the engineering investment. If you want to understand the problem deeply before deciding whether to build or buy, this post should give you a realistic picture of what's involved.
Skip the build, deploy the solution
BucketDrive gives you a production-ready S3 file browser with Cognito auth out of the box. One CloudFormation stack. Five minutes.
Try BucketDrive Free