Truth in IT

Implementing Vault for Databricks Secret Management

HashiCorp
04/09/2026

Transcript


TL;DR

  • MiQ extended HashiCorp Vault from microservices to Databricks data pipelines to eliminate hard-coded secrets across 3,000+ daily jobs processing 30+ TB of data
  • Implementation included automated secret scanning with TruffleHog/GitLeaks, user-specific Vault folder structures, and a Python utility library for seamless authentication and secret retrieval
  • Custom integration with Studio (MiQ's low-code pipeline tool) provides inline secret detection and UI-based secret management, eliminating developer friction during migration
  • Weekly automated monitoring scans all Databricks workspaces and reports violations to owners and leads, preventing regression to insecure practices
  • Solution prioritized user experience by embedding Vault functionality directly into existing development workflows rather than requiring separate tools or processes

The Hard-Coded Secrets Challenge in Data Pipelines

MiQ, a programmatic media agency processing 30+ terabytes of data daily across 3,000+ Databricks jobs, faced a critical security challenge with hard-coded secrets scattered throughout their data pipelines. The company needed a platform-agnostic secret management solution that could scale across their Databricks workspaces while maintaining developer productivity. Rather than relying on Databricks' native secret engine, MiQ extended their existing HashiCorp Vault implementation from microservices to data pipelines, creating a unified secret management approach across their infrastructure.

Four-Phase Implementation Strategy

MiQ's solution involved capturing existing secrets using TruffleHog and GitLeaks scanners, building automated remediation workflows, developing a Python utility library for seamless Vault integration, and establishing continuous monitoring. The team created user-specific folder structures in Vault, automated the migration of hard-coded secrets, and built custom tooling to minimize disruption to data engineering workflows. Their approach prioritized user experience by integrating Vault directly into Studio, their in-house low-code/no-code pipeline development platform, eliminating the need for developers to context-switch between applications.
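The capture phase relies on TruffleHog and GitLeaks, whose rule sets are far richer than anything shown here. As a rough illustration of what pattern-based secret detection does, the sketch below scans text with two regular expressions. The pattern names, the `Finding` type, and the `scan_text` function are hypothetical and are not part of either tool or of MiQ's code.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption); real scanners ship many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*=\s*[\"'][^\"']{4,}[\"']"
    ),
}


@dataclass(frozen=True)
class Finding:
    rule: str      # name of the pattern that matched
    line_no: int   # 1-based line number within the scanned text


def scan_text(text: str) -> list[Finding]:
    """Return one Finding per (line, rule) pair that matches."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(rule, line_no))
    return findings


# Example notebook cell; the key is AWS's documented placeholder value.
notebook = (
    'df = spark.read.table("events")\n'
    'password = "hunter2-example"\n'
    'key = "AKIAIOSFODNN7EXAMPLE"\n'
)
for finding in scan_text(notebook):
    print(finding)
```

A consolidated report of the kind the speakers describe would additionally need the repository, file, and owner for each finding; those fields are left out here to keep the sketch short.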

Automated Secret Management and Governance

The implementation includes a Python utility library that handles JWT token authentication, retrieves user-specific secrets from Vault, and integrates with AWS Secrets Manager for private key storage. MiQ enhanced their Studio platform with inline secret detection, preventing developers from saving code containing hard-coded credentials, and providing a UI-based workflow for moving secrets to Vault without writing additional code. Weekly automated scans across all Databricks workspaces generate reports identifying any new hard-coded secrets, with notifications sent to repository owners and team leads to maintain ongoing compliance and prevent regression to previous insecure practices.
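MiQ's utility library is not shown in the summary, so the following is only a minimal sketch of the two Vault calls it describes: a JWT login followed by a read from a user-specific folder. It uses Vault's HTTP API directly and assumes the JWT auth method is mounted at `jwt` and a KV version 2 engine at `secret`. The `users/<user>/<name>` path layout, the role name, and the injectable `send` transport are hypothetical. The AWS Secrets Manager step for private key storage is omitted.

```python
import json
import urllib.request
from typing import Callable, Optional

# Hypothetical transport signature: (method, url, headers, body) -> parsed JSON.
Transport = Callable[[str, str, dict, Optional[dict]], dict]


def http_json(method: str, url: str, headers: dict, body: Optional[dict]) -> dict:
    """Default transport: send a JSON request with urllib and parse the reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def vault_login_jwt(addr: str, role: str, jwt: str, send: Transport = http_json) -> str:
    """Exchange a JWT for a Vault client token (auth mount 'jwt' is assumed)."""
    reply = send(
        "POST",
        f"{addr}/v1/auth/jwt/login",
        {"Content-Type": "application/json"},
        {"role": role, "jwt": jwt},
    )
    return reply["auth"]["client_token"]


def read_user_secret(addr: str, token: str, user: str, name: str,
                     send: Transport = http_json) -> dict:
    """Read a KV v2 secret from a per-user folder (path layout is an assumption)."""
    url = f"{addr}/v1/secret/data/users/{user}/{name}"
    reply = send("GET", url, {"X-Vault-Token": token}, None)
    return reply["data"]["data"]  # KV v2 nests the payload under data.data


# Demo against a stand-in transport so the sketch runs without a Vault server.
def fake_vault(method, url, headers, body):
    if url.endswith("/v1/auth/jwt/login"):
        return {"auth": {"client_token": "example-token"}}
    return {"data": {"data": {"password": "from-vault"}}}


token = vault_login_jwt("https://vault.example.com", "databricks-job", "<jwt>", send=fake_vault)
secret = read_user_secret("https://vault.example.com", token, "alice", "warehouse-db", send=fake_vault)
print(sorted(secret))
```

Passing the transport as a parameter keeps the two functions testable without network access; in a real notebook the default `http_json` (or a maintained Vault client library) would be used instead.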

Chapters

0:00 - Introduction and Speakers
1:05 - About MiQ and Programmatic Media
2:32 - How Programmatic Advertising Works
3:57 - Data Scale and Processing Stats
4:34 - Problem Statement: Hard-Coded Secrets
5:42 - Four-Phase Solution Approach
5:56 - Phase 1: Capturing Secret Statistics
6:40 - Phase 2: Fixing and Migration
7:41 - Phase 3: User Experience and Studio Integration
8:44 - Phase 4: Ongoing Monitoring
9:39 - Python Utility Library Architecture
11:05 - Studio UI Features and Inline Detection
12:45 - Weekly Monitoring Reports
14:01 - Q&A

Key Quotes

4:37 "Hard-coding the secrets directly into the data pipeline is a common but risky practice, which poses several security and operational challenges, and which requires a secret manager to be used."
6:06 "We had secrets lying around across all our repositories. And we were coming across it, but we did not have any consolidated report on how many secrets are we talking about, what kind of secrets are we talking about, where is it lying, who owns it."
7:46 "This would have been quite disruptive if we had just asked user to move all their secrets to the new secret manager. And going forward also, asking them to change the way they have been doing their development by going to a new application, a new UI, making changes in their code."
12:10 "While you're typing your code at that time itself, inline, you'll get to know what all secrets you have added. And the validation will not allow you to save this unless we have moved this secret out of this code editor."
12:57 "That doesn't ensure that nobody's going to add, nobody's going to not add any secrets in their notebook. So we have this report, which is scheduled at a weekly basis, which scans all the notebooks across the MiQ and generates a report and share it with the respective owner."
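
The weekly report described in the last quote amounts to grouping scan findings by notebook owner and sending each owner their own list. A minimal sketch of that grouping step follows. The `Violation` fields, the function names, and the report wording are hypothetical; the talk summary does not describe MiQ's actual report format or delivery mechanism.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    workspace: str  # Databricks workspace name (illustrative)
    notebook: str   # notebook path
    owner: str      # person to notify
    rule: str       # which detection rule fired


def group_by_owner(violations: list[Violation]) -> dict[str, list[Violation]]:
    """Bucket findings so each owner receives only their own notebooks."""
    grouped = defaultdict(list)
    for violation in violations:
        grouped[violation.owner].append(violation)
    return dict(grouped)


def render_report(owner: str, items: list[Violation]) -> str:
    """Build a plain-text summary for one owner, sorted for stable output."""
    lines = [f"Hard-coded secrets report for {owner}: {len(items)} finding(s)"]
    for v in sorted(items, key=lambda v: (v.workspace, v.notebook)):
        lines.append(f"- {v.workspace}:{v.notebook} ({v.rule})")
    return "\n".join(lines)


findings = [
    Violation("prod", "/etl/load_events", "alice", "hardcoded_password"),
    Violation("dev", "/scratch/test", "bob", "aws_access_key_id"),
    Violation("dev", "/etl/backfill", "alice", "aws_access_key_id"),
]
for owner, items in group_by_owner(findings).items():
    print(render_report(owner, items))
```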

Tags:
  • Data Protection
  • DevSecOps
  • Compliance & Governance
  • Technical Deep Dive
  • Customer Story
  • Best Practices
  • Secret Management
  • HashiCorp Vault
  • Databricks Security
  • Data Pipeline Governance
  • DevSecOps Automation
  • Secret Scanning
  • JWT Authentication