Q1. Over the past few years you have led the effort to bring Facebook's production engineering group and software engineering teams together to better address business requirements, particularly around data security. What's been the most challenging part of doing this? What lessons have you learned along the way?
It's not uncommon in the industry to see operations and software engineering treated as two fairly independent disciplines within tech companies. The software engineering (SWE) teams often focus solely on writing code while relying on a centralized operations team to run services somewhat independently. In the last few years, companies have increasingly looked to bring these disciplines together under a "Secure DevOps" model, and that makes a lot of sense. My goal at Facebook, from the very start, has been to bring these teams as close together as possible and to make sure their values, roles and responsibilities are aligned across the board.
I strongly believe that operations isn't just one team's responsibility; it's everyone's, from the software engineers who build the software, to the embedded production engineers (PEs) who deploy and run the services, to the leaders who determine priorities. Everyone should be engaged in building reliable, feature-rich services in a holistic way.
The same applies to securing our infrastructure and services. The most rewarding and impactful work my team is focused on is ensuring security is built into services and products across the entire design lifecycle. Involving software engineers in how their software is deployed, how it's stress-tested, how it recovers during an outage and how it can be exploited from a security perspective naturally leads to technology products that are a lot more resilient. As Facebook has grown, the operations and security teams have become more integrated with the software teams, and this has enabled us to scale out our infrastructure with stronger automation and resiliency.
More specifically, so that engineers can keep up with their daily responsibilities while still securing our products, we invest heavily in building coding frameworks that give engineers built-in safeguards as they write code, and in automated testing tools that can inspect code and find security errors at scale and as quickly as possible. For example, our security team built abstractions into code to remove full classes of issues, like XSS and CSRF vulnerabilities. What we've learned, sometimes the hard way, is that we need to work closely together to prevent and solve these problems instead of depending on the security team to fix other teams' errors.
Q2. How has Facebook's operations team needed to evolve over the years to ensure that Facebook's products are available 24x7, in a secure way, to the tens of millions of users who hit its infrastructure on a daily basis?
Just like with operations, security is never just one team's mission. Given how essential security has become to every technology, we believe that every team has a responsibility to protect the systems they build. The traditional approach of centralized responsibility for security, long dominant in the technology industry, is due for a change. To enable this change, we need to examine the way we, as a community, enable data protection and shift away from using audits, checkbox compliance, or bureaucracy as a crutch for getting security done.
Our teams at Facebook have had to evolve so that our software teams can take ownership of security and operations, which is what enables data protection at scale. We've taken a defense-in-depth approach so that our software teams stay as efficient as possible and continue to innovate, while using multi-tiered security frameworks to catch bugs through code reviews, static analysis, bug bounties and rigorous pen-testing. We invest in talking to product and other teams about how their services could be exploited so we can work with them to implement secure frameworks or re-architect systems to ensure their resilience. We build frameworks and tools that help engineers easily integrate things like service authentication and authorization with minimal security team involvement.
We work on evangelizing the security mindset across our software teams through software development frameworks and abstractions, and also through training and fun initiatives like Hacktober. Every October is our Security Awareness month, during which we run CTF competitions and smaller-scoped Red Team exercises. Employees earn points and swag in addition to bragging rights when they solve these puzzles or discover and report rogue authentication pushes. This helps us raise security awareness across our entire company and exercise the organizational muscles we would need during a real incident. We know that sophisticated adversaries will always be interested in services like ours, so we continually examine our internal processes to ensure that our security operations and technology keep pace to combat future threats.
Q3. Why is it important for Facebook to be at a security event like Black Hat Asia 2019? What is your main focus at the event?
Everyone in the industry is working towards the same goal of making the Internet more secure. Many of us face the same security challenges and that's why communities like Black Hat are so important. To us, this is an opportunity to share our insights and discoveries around threats with one another so we can collectively get better at defending against sophisticated adversaries.
We want to hear from our industry colleagues about the practices and methods they use to protect people online so we can learn from them and apply those lessons to our own security work. This type of open sharing has played a key role in our security operations, including through our Bug Bounty program, one of the longest-running in the industry. We're committed to working with the security researcher community on improving the security of our infrastructure so we can get faster at finding, fixing and preventing bugs.
We've also long invested in sharing our insights, tools and technology with our infosec peers to help shore up our collective defenses against evolving threats. This continued collaboration is more important now than ever as we all increasingly rely on interconnected technologies in our daily lives.