Databases, such as Cloud SQL and Firestore, store data that needs to be persisted and readily accessed by an application or user. Data belongs in the database while there is a reasonable chance it will be queried or updated. When data is no longer queried or updated, it can be exported and stored in object storage.
In the case of time-series databases, data may be aggregated over larger time spans as it ages. For example, an application may collect performance metrics every minute. After three days, there is no need to query at the minute level of detail, and the data can be aggregated to the hour level. After one month, the data can be aggregated to the day level. This incremental aggregation saves space and improves response times for queries that span large time ranges.
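As an illustration of incremental aggregation, the following is a minimal sketch in Python using pandas; the metric name, sample values, and rollup boundaries are assumptions for the example, not details from any particular service.

import pandas as pd

# Minute-level metrics indexed by timestamp; "cpu_utilization" is a
# hypothetical metric name used only for illustration.
minute_metrics = pd.DataFrame(
    {"cpu_utilization": [0.42, 0.55, 0.61]},
    index=pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:01", "2024-01-01 00:02"]
    ),
)

# After three days, roll minute-level rows up to hourly averages.
hourly = minute_metrics.resample("1H").mean()

# After one month, roll hourly rows up to daily averages.
daily = hourly.resample("1D").mean()

print(hourly)
print(daily)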
Object storage is often used for unstructured data and backups. Standard Storage class should be used for frequently accessed data. If data is accessed at most once a month, then Nearline storage can be used. When data is not likely to be accessed more than once in 90 days, then Coldline storage should be used. Archive storage is appropriate for objects that are not accessed more than once per year.
Consider how to take advantage of Cloud Storage's lifecycle management features, which allow you to specify actions to perform on objects when specific conditions are met. The two actions supported are deleting an object and changing its storage class. Standard storage objects can be transitioned to Nearline, Coldline, or Archive storage. Nearline storage can be transitioned to Coldline or Archive storage, and Coldline storage can be transitioned to Archive storage. Durable Reduced Availability (DRA) storage, a legacy class, can be transitioned to any of the other storage classes.
Lifecycle conditions can be based on the following:
The age of an object
When it was created, including CreatedBefore and CustomTimeBefore conditions
The number of days since a custom time set in the object's metadata
The object's storage class
The number of versions of an object as well as the number of days since the object became noncurrent
Whether or not the object is “live” (objects in nonversioned buckets are always “live”; noncurrent, archived versions are not live)
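As a concrete illustration of combining these actions and conditions, here is a minimal sketch using the google-cloud-storage Python client library; the bucket name, age thresholds, and retention period are assumptions for the example, not recommendations.

from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
bucket = client.get_bucket("example-metrics-archive")  # hypothetical bucket name

# Transition Standard objects to Nearline 30 days after creation and
# Nearline objects to Coldline after 90 days, then delete objects after
# roughly five years.
bucket.add_lifecycle_set_storage_class_rule(
    "NEARLINE", age=30, matches_storage_class=["STANDARD"]
)
bucket.add_lifecycle_set_storage_class_rule(
    "COLDLINE", age=90, matches_storage_class=["NEARLINE"]
)
bucket.add_lifecycle_delete_rule(age=1825)

bucket.patch()  # persist the updated lifecycle configuration

The same rules can also be expressed as a JSON lifecycle configuration and applied with the gsutil or gcloud storage command-line tools.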
You can monitor data lifecycle management either by using Cloud Storage usage logs or by enabling Pub/Sub notifications on Cloud Storage buckets. The latter sends a message to a Pub/Sub topic when a lifecycle action occurs.
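If you take the Pub/Sub approach, a notification configuration is attached to the bucket. The following is a minimal sketch with the same Python client library; the bucket and topic names are placeholders, and the chosen event types are an assumption about which events are worth auditing.

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("example-metrics-archive")  # hypothetical bucket name

# Publish a message to the given Pub/Sub topic when objects are deleted
# or their metadata is updated.
notification = bucket.notification(
    topic_name="storage-lifecycle-events",  # hypothetical topic
    event_types=["OBJECT_METADATA_UPDATE", "OBJECT_DELETE"],
    payload_format="JSON_API_V1",
)
notification.create()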
Systems Integration and Data Management
Business requirements can give information that is useful for identifying dependencies between systems and how data will flow through those systems.
Systems Integration Business Requirements
One of an architect's responsibilities is to ensure that systems work together. Business requirements will not specify technical details about how applications should function together, but they will state what needs to happen to data or what functions need to be available to users.
Let's review examples of systems integration considerations in the case studies. These are representative examples, not an exhaustive list.
EHR Healthcare Systems Integration
The EHR Healthcare case study notes that there are several legacy file- and API-based integrations with insurance providers that will be replaced over the next several years. The existing systems will not be migrated to the cloud. This is an example of a rip-and-replace migration strategy.
Even though the existing systems will not be migrated, new cloud-native systems will be developed. As an architect working on that project, you would consider several challenges, including the following:
Understanding the volume and types of data exchanged
Deciding how to authenticate service requests
Encrypting data at rest and in transit
Managing encryption keys
Decoupling services to accommodate spikes in service demand (see the messaging sketch following this list)
Designing ingestion and data pipelines
Monitoring and logging for service performance as well as security
Using multiregion storage and compute resources for high availability while operating within any regulations that put constraints on where data may be stored and processed
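For the decoupling consideration noted in the list, one common approach is to publish events to Pub/Sub rather than call downstream services directly, letting subscribers pull work at their own pace during spikes. The following minimal sketch uses the Pub/Sub Python client; the project, topic, and message fields are invented for illustration.

import json
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-ehr-project", "claims-submitted")  # hypothetical

# Publish an event describing the work to be done instead of calling the
# downstream claims-processing service synchronously.
message = json.dumps({"claim_id": "12345", "provider": "example"}).encode("utf-8")
future = publisher.publish(topic_path, message)
print(f"Published message {future.result()}")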
In addition to these technical design issues, the architect and business sponsors will need to determine how to retire existing on-premises systems while bringing the new systems online without disrupting services.
Helicopter Racing League
The Helicopter Racing League is highly focused on improving predictive analytics and integrating their findings with the viewer platform. Consider two types of analytics described in the case study: (1) viewer consumption patterns and engagement and (2) race predictions.
To understand viewer consumption patterns and engagement, the company will need to collect details about viewer behaviors during races. This will likely require ingestion systems that can scale to large volumes of data distributed over a wide geographic area. The ingestion system will likely feed a streaming analysis data pipeline (Cloud Dataflow would be a good option for this service), and the results of the initial analysis as well as telemetry data may be stored for further analysis.
In fact, the data may be stored in two different systems for further analysis. BigQuery is optimized for scanning large volumes of data, which makes it a good choice for analyzing data that spans one or more races and encompasses hundreds of terabytes. Bigtable provides low-latency writes and is highly performant for key-based lookups and small scans, such as time-series data for a single viewer over the past 10 minutes.
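To make the pipeline shape concrete, here is a minimal Apache Beam sketch in Python that could run on Cloud Dataflow; the subscription, dataset, and field handling are assumptions for illustration rather than details from the case study.

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

options = PipelineOptions(streaming=True)  # add --runner=DataflowRunner to run on Dataflow

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        # Ingest viewer telemetry events from a Pub/Sub subscription.
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            subscription="projects/example-hrl/subscriptions/viewer-events")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        # Window into one-minute intervals; aggregation transforms for
        # engagement metrics would be inserted after this step.
        | "Window" >> beam.WindowInto(FixedWindows(60))
        # Write rows to an existing BigQuery table for large historical scans.
        | "WriteRows" >> beam.io.WriteToBigQuery(
            "example-hrl:analytics.viewer_events",
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )

A parallel branch of the same pipeline could write per-viewer telemetry to Bigtable for low-latency, key-based lookups.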
Mountkirk Games Systems Integration
Let's consider how datastores and microservices architectures can influence systems integration.
Online games, like those produced by Mountkirk Games, use more than one type of datastore. Player data, such as the player's in-game possessions and characteristics, could be stored in a document database like Cloud Datastore, while the time-series data could be stored in Bigtable, and billing information may be kept in a transaction processing relational database. Architects should consider how data will be kept complete and consistent across datastores. For example, if a player purchases a game item, then the application needs to ensure that the item is added to the player's possession record in the player database and that the charge is authorized by the payment system. If the payment is not authorized, the possession should not be available to the player.
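One way to keep the possession record consistent with the payment outcome is to grant the item only after the charge is authorized and to record the purchase atomically. The following minimal sketch uses a Firestore transaction; the payment client, collection, and field names are hypothetical.

from google.cloud import firestore  # pip install google-cloud-firestore

db = firestore.Client()

def grant_item_after_payment(player_id, item_id, payment_client):
    # payment_client is a hypothetical wrapper around the payment processor.
    charge = payment_client.authorize(player_id=player_id, item_id=item_id)
    if not charge.authorized:
        # Do not grant the item if the charge was declined.
        return False

    # Record the possession and a reference to the charge together so
    # readers never see one without the other.
    @firestore.transactional
    def add_possession(transaction):
        player_ref = db.collection("players").document(player_id)
        transaction.update(player_ref, {
            "possessions": firestore.ArrayUnion([item_id]),
            "last_purchase_id": charge.id,
        })

    add_possession(db.transaction())
    return True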
Mountkirk Games uses a microservices architecture. Microservices are small, independent services that each implement a single function of an application, such as storing player data or recording player actions. An aggregation of microservices implements an application. Microservices make their functions accessible through application programming interfaces (APIs). Depending on security requirements, services may require that calls to their API functions be authenticated. High-risk services, such as a payment service, may require more security controls than other services. Cloud Endpoints can help manage APIs and help secure and monitor calls to microservices.
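For service-to-service calls, one common pattern is for the caller to attach a Google-signed ID token that the receiving service, or Cloud Endpoints in front of it, validates before executing the request. A minimal sketch follows; the service URL and request payload are placeholders.

import requests

import google.auth.transport.requests
from google.oauth2 import id_token

SERVICE_URL = "https://payments.example.com/api/charge"  # hypothetical endpoint

# Fetch an ID token for the target audience using the environment's
# default service account credentials.
auth_request = google.auth.transport.requests.Request()
token = id_token.fetch_id_token(auth_request, SERVICE_URL)

response = requests.post(
    SERVICE_URL,
    headers={"Authorization": f"Bearer {token}"},
    json={"player_id": "p-123", "amount_cents": 499},  # hypothetical payload
)
response.raise_for_status()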
TerramEarth Systems Integration
From the description of the current system, we can see that on-board applications communicate