Medium · 1 mark · Multiple Choice
Domain 2: Managing and Provisioning a Solution Infrastructure · Cloud Bigtable · Schema Design · Case Study
This question is part of a case study (Case 16).

CASE STUDY: AutoMakers Inc

Company Overview:
AutoMakers Inc is a global vehicle manufacturer. They have recently launched a line of connected cars.

Current Technical Environment:

  • 1 million connected cars currently on the road
  • Cars send telemetry data (speed, engine temp, location) every 5 seconds
  • Current on-premises MQTT brokers are crashing under the load

Business Requirements:

  • Enable predictive maintenance to alert drivers before parts fail
  • Provide real-time fleet tracking for commercial customers
  • Support over-the-air (OTA) software updates

Executive Statements:

  • CEO: "Data is our new revenue stream. We need to monetize this telemetry data."
  • CTO: "We expect to have 10 million connected cars in 3 years. The architecture must scale infinitely without manual intervention."
  • CFO: "The cost of ingesting and storing this data must be strictly controlled. We cannot pay for idle capacity."

Technical Requirements:

  • Ingest up to 100,000 messages per second
  • Low-latency processing for real-time alerts
  • Time-series data storage for historical analysis
  • Handle variable network connectivity (cars driving through tunnels)

Constraints:

  • Strict budget for data ingestion
  • Small data engineering team

GCP PCA · Question 17 · Domain 2: Managing and Provisioning a Solution Infrastructure

QUESTION:
When designing the Cloud Bigtable schema for the telemetry data, how should you structure the row key to prevent hotspotting and allow efficient querying of a specific car's history?

Answer options:

A. Use a row key format of [timestamp]#[car_id].

B. Use a row key format of [car_id]#[reversed_timestamp].

C. Use a completely random UUID for the row key.

D. Use an auto-incrementing integer as the row key.

How to approach this question

Bigtable row key design is critical. Avoid monotonically increasing values (like timestamps) at the start of the key. To query time-series data efficiently, put the identifier first, then a reversed timestamp.

Full Answer

B. Use a row key format of `[car_id]#[reversed_timestamp]`. ✓ Correct
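In Cloud Bigtable, rows are stored sorted lexicographically by row key. If a key starts with a timestamp, consecutive writes land in the same narrow key range and therefore on a single node, causing a hotspot. Starting with the `car_id` spreads writes across the key space and keeps each car's rows contiguous, so one car's history can be read with a single prefix scan. Appending a reversed timestamp (e.g., `Long.MAX_VALUE - timestamp`, stored at a fixed width so that lexicographic and numeric order agree) sorts the newest data first, which optimizes reads for the 'real-time fleet tracking' requirement.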
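The key layout in option B can be sketched in plain Python, without any Bigtable client. This is only an illustration under stated assumptions: `make_row_key`, `car_prefix`, the `car-0001` style identifiers, and the sample readings are all hypothetical, and `sorted()` on ASCII strings stands in for Bigtable's byte-wise row ordering.

```python
MAX_INT64 = 2**63 - 1          # same value as Java's Long.MAX_VALUE
WIDTH = len(str(MAX_INT64))    # 19 digits; fixed width keeps string order == numeric order


def make_row_key(car_id: str, ts_millis: int) -> str:
    """Build '<car_id>#<reversed_timestamp>' (hypothetical helper)."""
    reversed_ts = MAX_INT64 - ts_millis
    return f"{car_id}#{reversed_ts:0{WIDTH}d}"


def car_prefix(car_id: str) -> str:
    """Prefix shared by every row of one car; the '#' stops 'car-1' matching 'car-10'."""
    return f"{car_id}#"


# Hypothetical sample readings: (car_id, epoch milliseconds), deliberately unordered.
readings = [
    ("car-0002", 1_700_000_005_000),
    ("car-0001", 1_700_000_000_000),
    ("car-0001", 1_700_000_010_000),
    ("car-0002", 1_700_000_000_000),
    ("car-0001", 1_700_000_005_000),
]

# Mimic the table: rows held in lexicographic key order.
keys = sorted(make_row_key(car, ts) for car, ts in readings)

# "Prefix scan" for one car: its rows form one contiguous block of the key space.
history = [k for k in keys if k.startswith(car_prefix("car-0001"))]

# Decode the timestamps back to confirm the newest reading comes first.
newest_first = [MAX_INT64 - int(k.split("#")[1]) for k in history]
print(newest_first)
```

Because the most recent reading sorts first within each car's block, a "latest position" lookup for fleet tracking only needs the first row of that car's prefix.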

Common mistakes

Choosing option A. While it seems logical to sort by time, it is the classic Bigtable anti-pattern that causes write hotspots.
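To see why option A concentrates writes, the toy simulation below splits existing rows into equal contiguous key ranges and counts where the next batch of writes lands. It is a rough model only: the four fixed "tablets", the 1,000 cars, and every function name are assumptions made for illustration, and real Bigtable splits and rebalances tablets dynamically.

```python
from bisect import bisect_right
from collections import Counter

MAX_INT64 = 2**63 - 1
NUM_CARS = 1000                 # assumed fleet size for the toy model
BASE_TS = 1_700_000_000_000     # arbitrary starting epoch, in milliseconds
INTERVAL_MS = 5_000             # one reading per car every 5 seconds


def ts_first_key(car: int, ts: int) -> str:
    """Option A layout: [timestamp]#[car_id]."""
    return f"{ts:019d}#car-{car:04d}"


def car_first_key(car: int, ts: int) -> str:
    """Option B layout: [car_id]#[reversed_timestamp]."""
    return f"car-{car:04d}#{MAX_INT64 - ts:019d}"


def tablet_boundaries(existing_keys, num_tablets):
    """Split points that cut the sorted keys into equal contiguous ranges."""
    ordered = sorted(existing_keys)
    step = len(ordered) // num_tablets
    return [ordered[i * step] for i in range(1, num_tablets)]


def busiest_share(key_fn, num_tablets=4):
    """Fraction of the next write batch that lands on the busiest range."""
    # Ten earlier rounds of readings define where the range boundaries sit.
    stored = [key_fn(c, BASE_TS + r * INTERVAL_MS)
              for r in range(10) for c in range(NUM_CARS)]
    splits = tablet_boundaries(stored, num_tablets)
    # The next round: every car writes one new reading.
    incoming = [key_fn(c, BASE_TS + 10 * INTERVAL_MS) for c in range(NUM_CARS)]
    hits = Counter(bisect_right(splits, key) for key in incoming)
    return max(hits.values()) / len(incoming)


print("timestamp first:", busiest_share(ts_first_key))
print("car_id first:   ", busiest_share(car_first_key))
```

In this model every timestamp-first write sorts after all existing rows and so falls into the last range, while the `car_id`-first writes spread roughly evenly across the four ranges.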

Practice the full GCP Professional Cloud Architect Practice Exam 3

50 questions · hints · full answers · grading
