How we used an API-first approach and automation to fast track our SOC 2 and HIPAA compliance

Pranav Shikarpur

This article was written by Baruch Mettler.

We’re excited to announce that we’ve been recognized in the 2023 CSO50 Awards! This award highlights security projects and initiatives that demonstrate outstanding business value and thought leadership. Congratulations to all of the winners! 🎉

We were recognized for our innovative approach to attaining a clean SOC 2 Type II report and HIPAA compliance attestation within six months of project initiation. These are significant achievements for an early-stage startup, and we’re sharing the details in this blog so others can benefit!

As a fast-scaling, early-stage startup building API-based security services, we're all hands on deck: team members do as much as possible, as quickly and efficiently as possible. That includes everything from building our organization and products to meeting complex security compliance requirements.

And no one ever thinks of security compliance as quick…

Just add: innovation and automation

Even with our team’s deep security expertise, we knew that when it came to SOC 2 and HIPAA compliance, we’d need to rely on our other strengths too: innovation and automation 💪. So we formed a small engineering team (called Project Craton, after the geological term for Pangea-era continental crust 🌏) tasked with determining the most effective approach to achieving compliance, with a focus on limiting the operational demands that compliance audits can bring.

Consolidated evidence collection

Accurate and complete evidence collection is one of the key components of any compliance framework. However, compliance evidence can be scattered across various systems and tools, which makes it challenging to show a consistent history of activity. Here at Pangea, we need to look across CI/CD pipelines, vulnerability scanners, cloud logs, email systems, and alerting tools like AWS GuardDuty and Google Cloud Security Command Center.

The most common approach to evidence gathering for compliance audits is some sort of audit log strategy or manual ticket creation. However, implementing this reliably comes with its own challenges, including:

Sensitive information leakage in the logging output. In this case, a good process can quickly turn into a security liability when PII or API tokens leak into audit evidence.
The potential manipulation of audit logs by unauthorized modifications or deletions.
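
To make the first risk concrete, here is a minimal sketch of scrubbing secrets from a log line before it is written anywhere. It is a plain regex stand-in, not our Redact service, and the three patterns and the `redact` helper are assumptions chosen for illustration; a real rule set would need to be far broader.

```python
import re

# Hypothetical patterns for illustration only; these three rules are
# nowhere near a complete catalogue of secrets or PII.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "BEARER_TOKEN": re.compile(r"\bbearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE),
}


def redact(text: str) -> str:
    """Replace every match of a known pattern with a <LABEL> placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


if __name__ == "__main__":
    # The scrubbed line is what would be handed to the audit log.
    print(redact("deploy by alice@example.com with Authorization: Bearer abc.def-123"))
```

Running the scrub step before the log call, rather than after, means the raw secret never reaches storage in the first place.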

Due to the sensitive nature of our API-based security components and security platform as a service (SPaaS) offering, we track every change made to our infrastructure. Because we are an agile organization, that means dozens of changes each day through our infrastructure-as-code pipeline.

To log data reliably and accurately enough to meet compliance obligations, it’s important to ensure you have:

Guaranteed log retention windows. Various compliance frameworks require logs to be retained for a specific period of time.
A central log sink that can be written to from many different environments. Modern corporate networks are often isolated and span multiple CSPs; cloud-accessible logging endpoints make it easy to record data from all of them.
A system that will not run out of capacity. Storing data on local disk or on physical servers can exhaust available disk space and interrupt audit logging.

Fortunately for us, this was an opportunity to dogfood two of our own services: Secure Audit Log, for immutable, cryptographically verifiable audit logs, and Redact, to remove sensitive data from any logs or retained data. A win-win for us 👏.

Three-pronged approach

So how did we satisfy SOC 2 and HIPAA compliance requirements while rapidly building our offerings and team? We applied code-first techniques to compliance evidence collection. Much as SOAR techniques have helped security operations, we used an API-first approach to connect with our security tooling: simple Python scripts query our infrastructure and record compliance-relevant facts.
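
As a rough sketch of what one such script might look like, the example below turns a scanner's JSON output into a small, normalized evidence record. Everything specific here is an assumption made for illustration: the `build_evidence` helper, the `id`/`severity`/`status` field names, and the `vulnerability-management` label are hypothetical, not the output format of any particular tool.

```python
import json
from datetime import datetime, timezone


def build_evidence(source, control, raw_output, collected_at=None):
    """Reduce raw tool output to the facts an auditor would ask about.

    `raw_output` is assumed to be a JSON array of findings, each with
    hypothetical `id`, `severity`, and `status` fields.
    """
    findings = json.loads(raw_output)
    collected_at = collected_at or datetime.now(timezone.utc)
    open_high = [
        f["id"]
        for f in findings
        if f["severity"] in ("high", "critical") and f["status"] == "open"
    ]
    return {
        "source": source,
        "control": control,
        "collected_at": collected_at.isoformat(),
        "total_findings": len(findings),
        "open_high_or_critical": sorted(open_high),
    }


if __name__ == "__main__":
    sample = json.dumps([
        {"id": "V-1", "severity": "high", "status": "open"},
        {"id": "V-2", "severity": "low", "status": "open"},
    ])
    record = build_evidence("example-scanner", "vulnerability-management", sample)
    print(json.dumps(record, indent=2))
```

Keeping the record small and uniform across tools is the point: each scheduled run yields one timestamped fact set that can be sent to an audit log as-is.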

Our team created a unique three-pronged approach:

First, automation was essential. We parsed our tool output with custom code, and the clean output produced audit-ready artifacts that governance, risk and compliance teams could quickly reference during audits. By scheduling these jobs with cron, we provided the consistency that compliance frameworks require while eliminating human variability.
Next, using a verifiable and secure audit log was vital. For this, we were able to dogfood Pangea’s Secure Audit Log service to record an audit trail of every Terraform change made to production environments. This trail contains the actual actions performed and includes both successful and unsuccessful actions. Any questions about the validity, timing, and completeness of our audit logs can be answered with certainty because the entries are hashed into a cryptographic data structure called a Merkle tree. Additionally, cryptographic proof of the entire audit trail is verifiable against an immutable public ledger.
Lastly, automatic redaction of potentially sensitive information was key. Our CI/CD pipelines contain sensitive information, and there was a non-zero chance that we could inadvertently end up logging tokens, API keys, and other secrets. This gave us another dogfooding opportunity: using our Redact service, we were able to mask PII and internally sensitive items like tokens and API keys.
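
For readers unfamiliar with Merkle trees, here is a minimal, self-contained sketch of the idea behind the second prong: log entries are hashed into a tree, and a short proof shows that a given entry is covered by the published root. This is not the Secure Audit Log implementation; the function names are hypothetical, and duplicating the last node on odd-sized levels is a simplification that production designs handle differently.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(entry: str) -> bytes:
    # Distinct prefixes keep leaf hashes and interior hashes from colliding.
    return _h(b"\x00" + entry.encode("utf-8"))


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(entries):
    """Return every tree level, leaves first; the root is levels[-1][0].

    Expects at least one entry.
    """
    level = [leaf_hash(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            left = level[i]
            # Simplification: an unpaired last node is paired with itself.
            right = level[i + 1] if i + 1 < len(level) else left
            nxt.append(node_hash(left, right))
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node was hashed with itself
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(entry, proof, root):
    """Recompute the root from one entry and its proof."""
    current = leaf_hash(entry)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = node_hash(sibling, current)
        else:
            current = node_hash(current, sibling)
    return current == root
```

Changing, removing, or reordering any entry changes the root, so anyone holding a previously published root can detect tampering without having to trust whoever stores the log.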

The results: by the numbers

We’re proud of the innovative and collaborative approach our team developed and implemented. They kicked off work in September 2022, and we successfully completed our SOC 2 Type I in October 2022, with SOC 2 Type II finishing up in April 2023. Additionally, we received a clean HIPAA compliance attestation in May 2023. You can learn more about our compliance initiatives in our trust center. Thanks to our project team of Akshay, Ruchika, Jimmy, Andres, Jason, and Baruch. 🙌

Within a few months of project commencement, our team:

Identified and resolved 354 vulnerabilities rated medium to critical, using automated techniques to fully manage their task creation, assignment, and tracking in Jira.
Analyzed and filtered 249 DMARC reports for important attacker telemetry, giving our security team insight into attacker campaigns and infrastructure.
Collected and archived over 10,000 key logs and alerts.
Saved an estimated 100+ hours of audit evidence collection work.
Saved several hundred more person-hours through automated processing of vulnerability information, alert collection, and event triage.

Do these challenges sound familiar? Do you have a question about our approach? Let us know in the comments, or join our Slack community and chat with our team and developer community there!