June 9th 2020
Serverless computing lets you run functions without provisioning servers. Most major cloud providers offer serverless options; this article reviews the storage options for serverless on AWS. First, you will learn the core principles of storage and data persistence in serverless computing, and then you'll see which AWS services suit databases, microservices, IoT, and storage synchronization.
Storage and Data Persistence in Serverless Computing
Serverless functions are stateless: nothing held inside a function instance is guaranteed to survive past the invocation. For many functions this is not a problem, but most applications need some form of data persistence. To provide it, developers attach storage resources that ephemeral functions can freely read from and write to.
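A minimal sketch of that idea in Python. The `handler(event, context)` signature follows the AWS Lambda convention; `InMemoryStore` is a hypothetical stand-in for an attached storage service, used only so the example runs locally.

```python
class InMemoryStore:
    """Hypothetical stand-in for an external storage service."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def make_handler(store):
    """Build a handler that keeps its state in `store`, not in the function instance."""

    def handler(event, context):
        # Local and module-level variables may vanish when the platform
        # recycles this instance, so state is read and written externally.
        user = event["user"]
        visits = store.get(user, 0) + 1
        store.put(user, visits)
        return {"user": user, "visits": visits}

    return handler


store = InMemoryStore()
first_instance = make_handler(store)
second_instance = make_handler(store)  # simulates a fresh function instance

first_instance({"user": "ana"}, None)
result = second_instance({"user": "ana"}, None)
print(result)  # {'user': 'ana', 'visits': 2}
```

The second instance sees the first one's write only because both share the external store; neither keeps any state of its own.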
Storage Types That Work Well for Serverless Apps
When creating serverless applications, you can attach either local storage or cloud storage for data persistence. Local storage gives you full control over your data but severely limits the scalability that serverless is designed to provide. Because of this, it is much more common to attach apps to cloud storage services.
Cloud services are attached via service API and can provide storage that scales smoothly with your app traffic. The most commonly used storage types include:
- Databases: relational, NoSQL, graph, and time series database services are available, depending on how you need to store and access your data. Examples include Amazon Aurora Serverless, Google Cloud Datastore, and IBM Cloudant.
- Object storage: stores any type of unstructured data, which makes it useful for documents, collaborative files, and media. Examples include Amazon S3, Google Cloud Storage, and Azure Blob Storage.
- In-memory cache: provides high-performance, low-latency access to data for highly time-sensitive applications. Examples include Amazon ElastiCache and Google App Engine's Memcache. At the time of writing, there are no serverless versions of in-memory caches.
Serverless Storage Properties
When looking for the storage service that best fits your application needs, it makes sense to prioritize serverless storage options. These services mirror the function-as-a-service (FaaS) model: they scale automatically, require no provisioning, and charge only for the resources you use.
By avoiding services that require provisioning and choosing ones that scale dynamically in real time, you avoid bottlenecks. You also avoid paying for resources that sit unused when traffic drops.
Storage Options for Serverless on AWS
When considering which services are right for you, it helps to consider what you need to do with your data and what limitations you have. Below are a few services that can be especially helpful for serverless applications and workloads.
NoSQL vs Relational databases
When choosing a serverless database service in AWS, you have two main options: DynamoDB and Aurora Serverless.
DynamoDB is a NoSQL database that supports both key-value and document data models. It offers eventually consistent and strongly consistent reads and is accessed over an HTTP API, usually through an AWS SDK. DynamoDB does not support joins, so you get fast, predictable reads by modeling (and often denormalizing) your data around known access patterns, at the cost of ad hoc query flexibility. It is best for web apps, mobile backends, and microservices.
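As an illustration, here is a hedged sketch of a write followed by a strongly consistent read. The `put_item` and `get_item` calls and the `ConsistentRead` flag mirror the boto3 DynamoDB Table resource; the `orders` table, its `order_id` key, and the `FakeTable` class are assumptions made so the example runs without an AWS account.

```python
def save_order(table, order_id, customer, total_cents):
    """Write one item. `table` should behave like a boto3 DynamoDB Table resource."""
    item = {"order_id": order_id, "customer": customer, "total_cents": total_cents}
    table.put_item(Item=item)
    return item


def load_order(table, order_id, strong=False):
    """Read one item by key; strong=True requests a strongly consistent read."""
    response = table.get_item(Key={"order_id": order_id}, ConsistentRead=strong)
    return response.get("Item")  # the "Item" key is absent when nothing matches


def connect(table_name="orders"):
    """Real wiring; assumes boto3 is installed and AWS credentials are configured."""
    import boto3
    return boto3.resource("dynamodb").Table(table_name)


class FakeTable:
    """Hypothetical local stand-in that mimics the two calls used above."""

    def __init__(self):
        self.items = {}

    def put_item(self, Item):
        self.items[Item["order_id"]] = Item

    def get_item(self, Key, ConsistentRead=False):
        item = self.items.get(Key["order_id"])
        return {"Item": item} if item else {}


table = FakeTable()  # swap for connect() to talk to a real table
save_order(table, "o-1", "ana", 1999)
print(load_order(table, "o-1", strong=True))
print(load_order(table, "missing"))  # None
```

Amounts are stored as integer cents because the boto3 resource layer rejects Python floats; `Decimal` is the alternative.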
Aurora is a relational database, compatible with MySQL and PostgreSQL, that can also store documents as a secondary model. It provides immediate consistency and is accessible via ADO.NET, Java Database Connectivity (JDBC), and Open Database Connectivity (ODBC) APIs. With Aurora you gain query flexibility, such as joins and ad hoc SQL, at some cost in performance. Aurora can also be used from a wider range of programming languages than DynamoDB. It is best for web and mobile gaming apps and software as a service (SaaS).
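To make the query-flexibility point concrete, the sketch below runs an ad hoc join with an aggregate. It uses Python's built-in `sqlite3` module purely as a local stand-in for a relational engine; against Aurora you would connect with a MySQL- or PostgreSQL-compatible driver instead. The `customers` and `orders` tables are hypothetical.

```python
import sqlite3

# In-memory database standing in for a relational service.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total_cents INTEGER);
    INSERT INTO customers VALUES (1, 'ana'), (2, 'ben');
    INSERT INTO orders VALUES (10, 1, 1999), (11, 1, 500), (12, 2, 750);
""")

# An ad hoc question nobody planned for at design time:
# how many orders has each customer placed, and for how much in total?
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total_cents)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)  # [('ana', 2, 2499), ('ben', 1, 750)]
```

In a key-value model, answering the same question typically means maintaining a precomputed total per customer or reading every order in application code.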
Microservices
When building microservices, most of your data should be stored in Amazon S3, an object storage service. With S3, you can let your users access storage directly rather than going through a microservice gateway. This can make data retrieval more efficient but requires careful authorization measures.
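One common way to grant that direct access safely is a presigned URL that expires after a short time. The sketch below is an illustration rather than a prescribed setup: `generate_presigned_url` is the boto3 S3 client call, while the bucket name, object key, and `FakeS3` class are hypothetical and exist only so the example runs locally.

```python
def download_link(s3_client, bucket, key, expires_seconds=300):
    """Return a time-limited URL that lets a user fetch one object directly from S3."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_seconds,
    )


def real_client():
    """Real wiring; assumes boto3 is installed and AWS credentials are configured."""
    import boto3
    return boto3.client("s3")


class FakeS3:
    """Hypothetical local stand-in that records the call instead of signing anything."""

    def __init__(self):
        self.calls = []

    def generate_presigned_url(self, operation, Params, ExpiresIn):
        self.calls.append((operation, Params, ExpiresIn))
        return f"https://example.invalid/{Params['Bucket']}/{Params['Key']}?expires={ExpiresIn}"


fake = FakeS3()  # swap for real_client() to produce a usable link
url = download_link(fake, "media-bucket", "users/42/avatar.png", expires_seconds=60)
print(url)
```

The microservice still decides who gets a link, but the bytes themselves travel straight from S3 to the user.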
If you need to process a large number of events before storing them, Amazon Kinesis Data Streams is another service to consider. It temporarily stores events while you batch process the data with Lambda. Once processing is finished, the data can be sent directly to S3, where it can be accessed as needed.
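A rough sketch of that pipeline as a Lambda handler. The event layout (base64 payloads under `Records[i].kinesis.data`) follows the Kinesis-to-Lambda integration, and `put_object` is the boto3 S3 client call. The bucket name, key scheme, JSON payloads, and `FakeS3` class are assumptions made for illustration.

```python
import base64
import json


def decode_records(event):
    """Kinesis delivers each payload base64-encoded under Records[i].kinesis.data."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def make_handler(s3_client, bucket):
    def handler(event, context):
        readings = decode_records(event)
        # One object per batch, named after the first sequence number in it.
        key = "batches/" + event["Records"][0]["kinesis"]["sequenceNumber"] + ".json"
        body = "\n".join(json.dumps(reading) for reading in readings)
        s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
        return {"stored": len(readings), "key": key}

    return handler


class FakeS3:
    """Hypothetical local stand-in for boto3.client("s3")."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body


def kinesis_event(payloads):
    """Build a synthetic event shaped like the one Lambda receives from Kinesis."""
    return {"Records": [
        {"kinesis": {
            "sequenceNumber": str(100 + i),
            "data": base64.b64encode(json.dumps(p).encode()).decode(),
        }}
        for i, p in enumerate(payloads)
    ]}


s3 = FakeS3()
handler = make_handler(s3, "event-archive")
summary = handler(kinesis_event([{"t": 21.5}, {"t": 22.0}]), None)
print(summary)  # {'stored': 2, 'key': 'batches/100.json'}
```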
Metrics and Internet of Things (IoT)
If your application involves analysis and creation of metrics or includes streaming data from IoT sensors, a time series database service may be useful. Amazon Timestream is designed for this purpose and enables you to store and process data in immutable time intervals.
This service is especially useful if you need to ensure data fidelity, for example, with financial applications. It also works well for DevOps performance tooling or adding web-traffic analytics to your applications.
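For a sense of what writing to Timestream looks like, here is a hedged sketch. The record fields and the `write_records` call follow the boto3 `timestream-write` client; the `iot` database, `readings` table, `sensor_id` dimension, and `FakeTimestream` class are hypothetical.

```python
import time


def build_record(sensor_id, measure, value, timestamp_ms=None):
    """Shape one sensor reading as a Timestream record (all values are strings)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "Dimensions": [{"Name": "sensor_id", "Value": sensor_id}],
        "MeasureName": measure,
        "MeasureValue": str(value),
        "MeasureValueType": "DOUBLE",
        "Time": str(timestamp_ms),
        "TimeUnit": "MILLISECONDS",
    }


def write_readings(client, database, table, records):
    """Send a batch of records; `client` should behave like boto3.client("timestream-write")."""
    client.write_records(DatabaseName=database, TableName=table, Records=records)


class FakeTimestream:
    """Hypothetical local stand-in that keeps the batches it receives."""

    def __init__(self):
        self.batches = []

    def write_records(self, DatabaseName, TableName, Records):
        self.batches.append((DatabaseName, TableName, Records))


client = FakeTimestream()
record = build_record("sensor-7", "temperature", 21.5, timestamp_ms=1591660800000)
write_readings(client, "iot", "readings", [record])
print(record["Time"], record["MeasureValue"])  # 1591660800000 21.5
```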
Data storage synchronization
Although a pub/sub messaging service is not technically storage, you may need one. Pub/sub messaging lets you set up asynchronous communication between services and is useful for keeping data storage synchronized.
For this task, AWS offers Amazon Simple Notification Service (SNS). SNS lets you connect your microservices and set services as publishers, subscribers, or both. Then, based on the triggers you configure, data changes are pushed across your storage services.
This enables you to use different storage services or service configurations for each microservice if needed. Because of this, SNS enables you to customize storage without sacrificing performance.
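The flow can be sketched as a publisher that announces a data change and a subscriber that applies it to its own store. The `publish` parameters follow the boto3 SNS client, and the subscriber reads the `Records[i].Sns.Message` field that Lambda receives from SNS. The topic ARN, message format, and `FakeSNS` class are assumptions so the example runs locally.

```python
import json


def publish_change(sns_client, topic_arn, entity, entity_id, action):
    """Announce a data change so other services can update their own storage."""
    message = {"entity": entity, "id": entity_id, "action": action}
    sns_client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(message),
        MessageAttributes={"entity": {"DataType": "String", "StringValue": entity}},
    )
    return message


def make_subscriber(local_store):
    """Build a Lambda-style handler that applies each change to this service's store."""

    def handler(event, context):
        for record in event["Records"]:
            change = json.loads(record["Sns"]["Message"])
            local_store[change["id"]] = change["action"]
        return len(event["Records"])

    return handler


class FakeSNS:
    """Hypothetical local stand-in that delivers each message to every subscriber."""

    def __init__(self):
        self.subscribers = []

    def publish(self, TopicArn, Message, MessageAttributes=None):
        event = {"Records": [{"Sns": {"Message": Message}}]}
        for handler in self.subscribers:
            handler(event, None)


search_index = {}   # one microservice's storage
audit_log = {}      # another microservice's storage
sns = FakeSNS()
sns.subscribers += [make_subscriber(search_index), make_subscriber(audit_log)]

topic = "arn:aws:sns:us-east-1:123456789012:product-changes"  # placeholder ARN
publish_change(sns, topic, "product", "p-1", "updated")
print(search_index, audit_log)
```

Each subscriber is free to keep the change in whatever storage suits it; the publisher never needs to know which services are listening.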
Conclusion
To enable data persistence, you need to attach storage resources. Cloud storage options you can attach include databases, object storage, and in-memory cache. However, you might want to avoid storage services that require provisioning, and attach only serverless storage options.
On AWS, there are a number of good storage options for serverless, including DynamoDB for NoSQL databases and Aurora Serverless for relational databases. For microservices, you can use a combination of Amazon S3 for storage, AWS Lambda for functions, Amazon CloudFront for serving files, and Amazon Kinesis Data Streams for temporary event storage.
There are many more options for serverless on AWS. You can use the services reviewed in this article as your starting point for creating a serverless ecosystem that suits your projects. Try not to use too many services, though, because this might complicate management and security tasks.