Paths to the Future: A Year at Google Brain

I am currently a PhD student at Stanford, studying optimization and machine learning with Stephen Boyd, but from 2017 through 2018 I was a software engineer on the Google Brain team. I started three months after receiving a Master’s degree in computer science (also from Stanford), having just spent the summer working on a research project—a domain-specific language for convex optimization. At the time, a part of me wanted to continue working with my advisor, but another part of me was deeply curious about Google Brain. To see a famous artificial intelligence (AI) research lab from the inside would, at the very least, make for an interesting anthropological experience. So, I joined Google. My assignment was to work on TensorFlow, an open-source software library for deep learning.

From 2017 to 2018, I worked as an engineer at Google Brain, Google’s AI research lab. Brain sits in Google’s headquarters in Mountain View, CA. Photo by Robbie Shade, licensed CC BY 2.0.

Brain was a magnet for Google’s celebrity employees. For the past few years, Google’s CEO Sundar Pichai (who believes AI is “more profound than electricity or fire”) has emphasized that Google is an “AI-first” company, with the company seeking to implement machine learning in nearly everything it does. In a single afternoon, in the team’s kitchenette, I saw Pichai, co-founder Sergey Brin, and Turing Award winners David Patterson and John Hennessy.

I didn’t work with these celebrity employees, but I did get to work with some of the original TensorFlow developers. These developers gave me guidance when I sought it and habitually gave me more credit than I deserved. For example, my coworkers let me take the lead in writing an academic paper about TensorFlow 2, even though my contributions to the technology were smaller than theirs. The unreasonable amount of trust placed in me, and credit given to me, made me work harder than I would have otherwise.

The culture of Google Brain reminded me of what I’ve read about Xerox PARC. During the 1970s, researchers at PARC paved the way for the personal computing revolution by developing graphical user interfaces and producing one of the earliest incarnations of a desktop computer.

PARC’s culture is documented in The Power of the Context, an essay written by PARC researcher Alan Kay. Kay describes PARC as a place where senior employees treated less experienced ones as “world-class researchers who just haven’t earned their PhDs yet” (similar to how my coworkers treated me). Kay goes on to say that researchers at PARC were self-motivated and capable “artists,” working independently or in small teams towards similar visions. This made for a productive environment that at times felt “out of control”:

A great vision acts like a magnetic field from the future that aligns all the little iron particle artists to point to “North” without having to see it. They then make their own paths to the future. Xerox often was shocked at the PARC process and declared it out of control, but they didn’t understand that the context was so powerful and compelling and the good will so abundant, that the artists worked happily at their version of the vision. The results were an enormous collection of breakthroughs, some of which we are celebrating today.

At Brain, as at PARC, researchers and engineers had an incredible amount of autonomy. They had bosses, to be sure, but they had a lot of leeway in choosing what to work on — in finding “their own paths to the future.” (I say “had”, not “have”, since I’m not sure whether Brain’s culture has changed since I left.)

I’ll give one example: a few years ago, many on the Google Brain team realized that machine learning tools were closer to programming languages than to libraries, and that redesigning their tools with this fact in mind would unlock greater productivity. Management didn’t command engineers to work on a particular solution to this problem. Instead, several small teams formed organically, each approaching the problem in its own way. TensorFlow 2.0, Swift for TensorFlow, JAX, Dex, Tangent, Autograph, and MLIR were all different angles on the same vision. Some were in direct tension with each other, but each was improved by the existence of the others—we shared notes often, and reused each other’s solutions when possible. Many of these tools may never become more than promising experiments, but it’s also possible that at least one will be a breakthrough.

TF 2.0, Swift for TensorFlow, and JAX, developed by separate sub-teams within Google Brain, are different paths to the same vision—an enjoyable, expressive, and performant programming language for machine learning.

I would guess that the PARC-like context that Brain operated in was instrumental in bringing about the creation of TensorFlow. In late 2015, Google open-sourced TensorFlow, making it freely available to the entire world. TensorFlow quickly became enormously popular. Instructors at Stanford and other universities used it in their curricula (my friend Chip Huyen, for example, created a Stanford course called TensorFlow for Deep Learning Research), researchers across the world used it to run experiments, and companies used it to train and deploy models in the real world. Today, TensorFlow is the fifth most popular project on GitHub by star count, out of the many millions of public software repositories hosted there.

And yet, at least for TensorFlow, Google Brain’s hyper-creative, hyper-productive, and “out of control” culture was a double-edged sword. In the process of making their own paths to a shared future, TensorFlow engineers released many features sharing a similar purpose. Many of these features were subsequently deemphasized in favor of more promising ones. While this process might have selected for good features (like tf.data and eager execution), it frustrated and exhausted our users, who struggled to keep up.

A Google TPU, a hardware accelerator for machine learning. While at Brain, I created a custom TensorFlow operation that made it easier to load balance computations across TPU cores.

Brain differed from PARC in at least one way: unlike PARC, which infamously failed to commercialize its research, Google productionized projects that were incubated in Brain. Examples include Google Translate, the BERT language model (which informs Google search), TPUs (hardware accelerators that Google rents to external clients, and uses internally for a variety of production projects), and Google Cloud AI (which sells AutoML as a service). In this sense Google Brain was a natural extension of Larry Page’s desire to work with people who want to do “crazy world-breaking things” while having “one foot in industry” (as Page stated in an interview with Walter Isaacson).

Leaving Google Brain for a PhD was difficult. I had grown accustomed to the perks, and I appreciated the team’s proximity to research. Most of all I loved working alongside a large team on TensorFlow 2.0—I’m passionate about building better tools, for better minds. But I also love the creative expression that research provides.

I’m often asked why I didn’t simply involve myself in research at Brain, instead of enrolling in a PhD program. Here’s why: the zeitgeist had little room for topics other than deep learning and reinforcement learning. Indeed, in 2018, Google rebranded “Google Research” to “Google AI,” redirecting research.google.com to ai.google.com. (The rebranding understandably raised some eyebrows. It appears this change was quietly rolled back at some point, and the Google Research brand has been resurrected.) While I’m interested in machine learning, I’m not convinced that today’s AI is anywhere near as profound as electricity or fire, and I wanted to be trained in a more intellectually diverse environment.

In fact, most of my mentors at Brain encouraged me to enroll in the PhD program. Only one researcher strongly discouraged me from pursuing a PhD, comparing the experience to “psychological torture.” I was so shocked by his dark warning that I didn’t ask any follow-up questions; he didn’t elaborate, and our meeting ended shortly afterwards.

These days, in addition to machine learning, I’m interested in convex optimization, a branch of computational mathematics concerned with making optimal choices. Convex optimization has many real-world applications—SpaceX uses it to land rockets, self-driving cars use it to track trajectories, financial companies use it to design investment portfolios, and, yes, machine learning engineers use it to train models. While well-studied, as a technology, convex optimization is still young and niche. I suspect that convex optimization has the potential to become a powerful, widely-used technology. I’m interested in doing the work—a bit of math and a bit of computer science—to realize its potential. My advisor at Stanford, Stephen Boyd, is perhaps the world’s leading expert on applications of convex optimization, and I simply could not pass up an opportunity to do useful research under his guidance.

SpaceX solves convex optimization problems onboard to land its rockets, using CVXGEN, a code generator for quadratic programming developed at Stephen Boyd’s Stanford lab. Photo by SpaceX, licensed CC BY-NC 2.0.

It’s been just over a year since I left Google and started my PhD. Since then, I’ve collaborated with my lab to publish several papers, including one that makes it possible to automatically learn the structure of convex optimization problems, bridging the gap between convex optimization and deep learning. I’m now one of three core developers of CVXPY, an open-source library for convex optimization, and I have total creative control over my research and engineering projects.

There are many things about Google Brain that I miss, my coworkers most of all. But now, at Stanford, I get to collaborate with and learn from an intellectually diverse group of extremely smart and passionate individuals, including pure mathematicians, electrical and chemical engineers, physicists, biologists, and computer scientists.

I’m not sure what I’ll do once I graduate, but for now, I’m having a lot of fun—and learning a ton—doing a bit of math, writing papers, shipping real software, and exploring several lines of research in parallel. If I’m very lucky, one of them might even be a breakthrough.

16 Comments

  1. The journey at so-connected yet diverse places/paradigms is wonderfully captured! You do have a gift of writing. Not many STEM aficionados tend to love creative writing but that is from where all Science has originated (You may like to read https://aeon.co/essays/bring-back-science-and-philosophy-as-natural-philosophy). Glad you mentioned that you get to interact with people from various domains while pursuing your research. While we tend to specialize so much which is unavoidable for solving complex problems, it is also super important to find ways to be inter-disciplinary especially when you wish to address some really high-stakes challenges. All the best CVXPY and keep posting updates from your journey ahead!

  2. So awesome! Great read, awesome to learn about convex optimization. As a freshman in undergrad, it’s great to still be able to explore my career and degree options, despite being online. Currently thinking of majoring in CSE, maybe focusing on ML/AI because the physical sciences interest me too and that seems like a possible link between CS and the real (physical-science) world.
    Do you know of any other CS fields which link physical engineering problems and computation, perhaps like convex optimization’s application in SpaceX?
    I’d love to hear about any keywords/fields if you know of something.

    • Hi Shilpa,

      I can think of a few subfields that might meet your criteria. Robotics is the obvious one. For robotics, tools like mathematical optimization, control (that’s another keyword), and machine learning will be very useful.

      Electrical engineering is another example. It’s a nice application of E&M & quantum mechanics (used in transistors, solid state drives, etc.), plus computer science. Exciting areas in EE right now include quantum computers and custom accelerators (like TPUs). I’m sure there are many others.

      You might also consider exploring computer graphics. That’s not a “physical” application, but it is an application in which the end-product is very tangible. For graphics, you’ll need linear algebra, calculus, and basic optimization.

      There’s also quantum computing. Likely not a viable career option in 4-5 years, unless you’re interested in research positions (in either academia or industry). For quantum computing, again, you’ll need linear algebra, calculus, optimization, and also probability.

      Here’s my recommendation: don’t focus narrowly on machine learning, or any other skill/field. Instead, focus on picking up a broad range of skills.

      Computer science fundamentals are a must. No matter which application you choose, being proficient in programming will pay dividends. If you focus on applications in the physical sciences, experience with computer systems and embedded-systems might also help.

      Make sure to also develop a strong foundation in math fundamentals: linear algebra, calculus, and probability are the essentials. Once you’re comfortable with these three, you can also pick up convex optimization. If you develop all these skills, you’ll easily be able to jump into any (quantitative) application you like.

      Hope that helps!

  3. Thank you for sharing your experience. I think you have pursued a unique life-arc. As someone beset with a similar question – would you say that part of your decision was motivated by the need to do original research and intellectual freedom in choosing the problems you’d like to work on?
    You mentioned proximity to research at one point and your wish to achieve a breakthrough. Did you feel you needed a PhD to move from this periphery to the center. Being a creator than a follower?

    • Hi Apurv,

      > Did you feel you needed a PhD to move from this periphery to the center. Being a creator than a follower?
      No, I don’t think so. As a graduate student, I’m surely still not at the center. And I was a creator at Google as well, just working on different kinds of things. Had I stayed at Google, I suspect I would have been happy there.

      My decision to start a PhD was more about the kinds of things I would get to work on at Stanford, and training under the supervision of my advisor.

      Hope that helps!
      -Akshay

  4. This might prove to be a very stupid question, but I am wondering: is it possible to do a Master’s while being advised by a researcher at a technology company, like Google Brain?
    Similar to how research labs work, it is obvious that there is high intersection between researchers at Google and Stanford; for example, Sebastian Thrun is an Adjunct Professor at Stanford University but also @Google.

    • Things like that happen, but they are rare and ad-hoc. They typically require (I think) very good relationships between both the university & the company/some employees thereof.

    • Thanks Akshay, that sounds very interesting and it would be great to see more of such programs. On another topic, I have been attending the Google TensorFlow Quantum open meetings that are open to the public; seeing the process of how they develop the software is just amazing to witness. There is great attention to detail about increasing access and listening to the audience using TensorFlow.
      Can this library (CVXGEN) be used for finding the convex hull in R^n?
      https://docs.google.com/document/d/1saB5LSFGomllY10ZddHaoyNF2KtzMBqH4yXxEri8IWs/edit

  5. Very nice, well written and refreshing experience to read about. I find it super motivating. Best wishes for the pursuit of your PhD, I think you made a wise choice!

    Thanks for sharing 🙂

  6. I always had this one question for people
    Working in academia, is there any chance to contribute/ be part of research and learn by being an outsider?

  7. Your article puts a great deal of optimism and hope on PhD life. I am so glad to hear that. I started my PhD in the field of wireless comms, almost at the same time as you did. But unfortunately I tend to see my PhD life as half empty rather than half full.
    I have struggled to find a good research problem and solutions are often based on tuning the assumptions in the problem to tools available. Does this happen to you too? Do you rely on your guide to give you specific problems to work with? I wish that was the case…
    Nonetheless, I haven’t given up yet. Convex optimization is going to flourish and hopefully CVXPY will flourish with it. 🙂

    Best,
    Teenvan

    • Identifying what research problem to work on is more than half the battle.

      In my case, during the first year of my PhD, my advisor helped me considerably in choosing what to work on. We started by working on small, manageable research projects that were well-suited to my interests and skills.
      These projects helped build my confidence as a researcher, and helped me develop a taste for the kind of research I was interested in.

      I now have enough research experience that I do not need to rely on my advisor in choosing what to work on, though I do still brainstorm ideas with him (and, of course, collaborate on projects with him).

      An alumnus of our lab gave me some helpful advice when I first started my PhD. I’ll share it with you; I hope it’s helpful.

      Here it is. Don’t spend too much time worrying about whether a research project is novel or groundbreaking. Agonizing over the topic will slow your development as a researcher. Instead pick bite-sized topics you’re interested in and that align with your skills, and write papers on these topics.

      By writing several papers quickly, you’ll develop (1) confidence and (2) taste. Confidence lets you tackle new problems without being held back by doubt, and taste is what you use to decide what problems to work on in the first place. By going through the process of writing papers several times, you’ll also hone your skills as a writer, which are extremely important for a PhD.

      (This advice is applicable broadly, not just in PhDs. It’s often told in the form of a story, involving a ceramics class. See https://blog.codinghorror.com/quantity-always-trumps-quality/)

      I hope this advice is helpful. I wish you the best of luck!
