Medium · 1 mark · Multiple Choice
Subtask 3.1: Security Design · Tags: Security, Cloud DLP, Serverless, HIPAA
This question is part of a case study (Case 11).

CASE STUDY: HealthData Inc

Overview:
Industry: Healthcare Analytics
Size: 1000 employees

Environment:

  • Co-located data center
  • Hadoop cluster
  • SFTP servers
  • 50 TB patient data

Requirements:

  • ML models for diagnostics
  • Secure data sharing portals
  • Break data silos

Exec Statements:

  • CEO: Need compute for ML.
  • CRO: HIPAA compliance is top priority.
  • CTO: Managed services needed to replace Hadoop.

Tech Reqs:

  • Strict HIPAA compliance
  • Automated PHI de-identification
  • Comprehensive audit logging
  • CMEK
  • Network isolation (no public internet)

Constraints:

  • US data sovereignty
  • 7-year retention (immutable)
  • Easy auditor access

GCP PCA · Question 14 · Security Design

QUESTION: How should you design the architecture to automate the de-identification of Protected Health Information (PHI) as data is ingested?

Answer options:

A. Use Cloud Storage triggers to invoke a Cloud Function that calls the Cloud DLP API to de-identify data before moving it to BigQuery.
B. Write a custom Python script on a Compute Engine VM that uses regex to find and replace PHI.
C. Use BigQuery authorized views to hide columns containing PHI from analysts.
D. Enable default encryption at rest (Google-managed keys) on the BigQuery dataset.

How to approach this question

Combine an event-driven ingestion mechanism with GCP's native sensitive data protection API.

Full Answer

A. Use Cloud Storage triggers to invoke a Cloud Function that calls the Cloud DLP API to de-identify data before moving it to BigQuery. ✓ Correct
The Cloud Data Loss Prevention (DLP) API is designed to inspect, classify, and de-identify sensitive data. By triggering a Cloud Function when new data arrives in Cloud Storage, you can automatically run the data through DLP to mask PHI before it ever reaches the analytics data warehouse (BigQuery).

B. ✗ A hand-rolled regex script on a Compute Engine VM is brittle, must be maintained and patched yourself, and conflicts with the CTO's requirement for managed services.
C. ✗ Authorized views restrict who can see PHI columns, but the PHI itself still lands in BigQuery un-de-identified; this is access control, not de-identification.
D. ✗ Encryption at rest protects stored data from unauthorized access; it does not de-identify PHI, and the case study requires CMEK rather than Google-managed keys in any event.
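The trigger-function-DLP flow in option A can be sketched as a small Cloud Function. This is a minimal illustration, not a production implementation: it assumes the `google-cloud-dlp` and `google-cloud-storage` client libraries are available, the project ID and infoType list are placeholders, and the final BigQuery load is left as a comment.

```python
def build_deidentify_config(
    info_types=("PERSON_NAME", "US_SOCIAL_SECURITY_NUMBER", "PHONE_NUMBER"),
):
    """Return (deidentify_config, inspect_config) dicts for the DLP API.

    Each detected infoType is replaced with its name, e.g. "[PERSON_NAME]".
    The infoType list here is illustrative; tune it to your PHI profile.
    """
    deidentify_config = {
        "info_type_transformations": {
            "transformations": [
                {"primitive_transformation": {"replace_with_info_type_config": {}}}
            ]
        }
    }
    inspect_config = {"info_types": [{"name": t} for t in info_types]}
    return deidentify_config, inspect_config


def deidentify_gcs_object(event, context):
    """Entry point for a Cloud Function triggered on object finalize in GCS."""
    from google.cloud import dlp_v2, storage  # lazy import keeps cold starts cheap

    project = "my-project"  # assumption: replace with your project ID

    # Read the newly uploaded object that fired the trigger.
    text = (
        storage.Client()
        .bucket(event["bucket"])
        .blob(event["name"])
        .download_as_text()
    )

    deidentify_config, inspect_config = build_deidentify_config()
    dlp = dlp_v2.DlpServiceClient()
    response = dlp.deidentify_content(
        request={
            "parent": f"projects/{project}",
            "deidentify_config": deidentify_config,
            "inspect_config": inspect_config,
            "item": {"value": text},
        }
    )

    # response.item.value now holds the masked text; load it into BigQuery here
    # (e.g. via the BigQuery client library) so only de-identified data lands
    # in the warehouse.
    return response.item.value
```

Because the function only ever writes the DLP response into BigQuery, raw PHI never reaches the analytics layer, which is the crux of why option A satisfies the automated-de-identification requirement.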

Common mistakes

Confusing access control (Authorized Views) with actual data de-identification.
