Medium · 1 mark · Multiple Choice
Subtask 5.2: Reliability · Tags: SRE, Reliability, SLI, SLO
This question is part of a case study (Case 11).

CASE STUDY: TerramEarth

Company Overview: TerramEarth manufactures heavy equipment. 2 million vehicles in the field.
Current Environment: Vehicles send telemetry via cellular. Processing 100,000 msgs/sec. On-prem Hadoop cluster.
Business Requirements: Predict equipment failure. Reduce warranty costs. Provide fleet dashboard.
Executive Statements: CEO: 'Monetize data.' CFO: 'Storage costs spiraling.' CTO: 'Need scalable ingestion and ML.'
Technical Requirements: Ingest 500,000 msgs/sec. Store petabytes cost-effectively. Train ML models. Real-time anomaly detection.
Constraints: Intermittent connectivity. Strict vehicle authentication.

QUESTION:
Which architecture should you design to handle the ingestion of 500,000 messages per second from vehicles with intermittent connectivity?

GCP PCA · Question 15 · Reliability

QUESTION:
To ensure the reliability of the new ingestion pipeline, the operations team wants to implement SRE practices. How should they measure the performance of the ingestion API?

Answer options:

A. Measure the CPU utilization of the underlying VMs and alert if it exceeds 80%.

B. Define Service Level Indicators (SLIs) for API latency and error rate, and set a Service Level Objective (SLO) of 99.9%.

C. Create a Service Level Agreement (SLA) with the development team to guarantee zero downtime.

D. Use Cloud Profiler to continuously monitor the code execution time.

How to approach this question

Recall the core concepts of Google Site Reliability Engineering (SRE): SLIs, SLOs, and SLAs.

Full Answer

B. Define Service Level Indicators (SLIs) for API latency and error rate, and set a Service Level Objective (SLO) of 99.9%. ✓ Correct
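Site Reliability Engineering (SRE) uses Service Level Indicators (SLIs) to measure the performance of a service as its users experience it (e.g., latency, error rate). A Service Level Objective (SLO) is the target value for an SLI (e.g., 99.9% of requests succeed in under 200 ms). Together they give a quantifiable measure of reliability. The other options do not: CPU utilization and profiler output are internal system signals rather than user-facing indicators, and an SLA is an external commitment to customers, not a way to measure an API.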
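
The relationship between an SLI and an SLO can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the `Request` type, the function names, and the sample window are hypothetical, and the 200 ms threshold and 99.9% target simply reuse the example figures from the explanation above. In practice the request data would come from a monitoring system rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One ingestion API call (hypothetical record shape)."""
    latency_ms: float
    ok: bool


def good_ratio(requests, latency_threshold_ms=200.0):
    """SLI: fraction of requests that succeeded within the latency threshold."""
    if not requests:
        return 1.0  # no traffic means nothing violated the objective
    good = sum(1 for r in requests if r.ok and r.latency_ms < latency_threshold_ms)
    return good / len(requests)


def error_budget_remaining(sli, slo=0.999):
    """Share of the error budget still unspent (1.0 = untouched, <= 0 = exhausted)."""
    allowed_bad = 1.0 - slo
    actual_bad = 1.0 - sli
    return 1.0 - actual_bad / allowed_bad


# Assumed sample window: 9,995 good requests, 3 slow ones, 2 failures.
window = (
    [Request(120.0, True)] * 9995
    + [Request(450.0, True)] * 3
    + [Request(80.0, False)] * 2
)

sli = good_ratio(window)
print(f"SLI = {sli:.4f}, SLO met: {sli >= 0.999}")
print(f"Error budget remaining: {error_budget_remaining(sli):.0%}")
```

The SLI here is a single "good events over total events" ratio that combines latency and errors; tracking them as two separate SLIs, each with its own SLO, is an equally valid design.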

Common mistakes

Confusing system metrics (CPU/RAM) with SLIs, or confusing internal SLOs with external SLAs.
