Chloe Hsu

This is my third year as a PhD student at UC Berkeley, advised by Jennifer Listgarten and Moritz Hardt. The PhD journey is made possible with support from wonderful people at Berkeley and beyond, as well as from the Berkeley Graduate Fellowship, the LLNL Secure Biosystems Design SFA, and the Microsoft Research PhD Fellowship.

My research contributes to the machine learning basis for the design, engineering, and interpretation of proteins. More recently, I have become deeply fascinated by immunology and immune repertoire data. A motivation close to my heart is the understanding and healing of immune system disorders.

One topic I'm curious about is the intersection between immunology and women's health. For most autoimmune diseases, there is a higher prevalence amongst women. Many autoimmune disorders tend to affect women during periods of extensive stress, such as pregnancy, or during a great hormonal change. I'm also curious about the immune system's role in the healthy gut and in digestive disorders.

Prior work experience as a machine learning engineer at Google Brain and Google Health (2018-2019) helped shape my interests in human health. Deep gratitude also goes towards Chris Umans and Peter Schröder for their kind and inspiring mentorship during my time at Caltech (BS 2018). In 2021, I had a fun internship with Adam Lerer and the Protein team (led by Alex Rives and Tom Sercu) at Facebook AI Research.

In the student community, I currently serve as a Seminar Chair for the UC Berkeley Computer Science Graduate Entrepreneurs (CSGE) and as the lead editor for the Berkeley AI Research Blog.

Email  /  Google Scholar  /  Twitter

Research

Learning inverse folding from millions of predicted structures


Chloe Hsu, Robert Verkuil, Jason Liu, Zeming Lin, Brian Hie, Tom Sercu, Adam Lerer*, Alexander Rives* (*equal contribution)
bioRxiv, 2022
paper | code | colab notebook

Inverse folding aims to design sequences that fold into a desired structure. We augment the training data by nearly three orders of magnitude by predicting structures for 12M protein sequences using AlphaFold2. Trained with this additional data, our new inverse folding model designs such sequences more accurately, while also generalizing to a variety of more complex tasks, including the design of protein complexes, partially masked structures, binding interfaces, and multiple states.

Learning protein fitness models from evolutionary and assay-labeled data


Chloe Hsu, Hunter Nisonoff, Clara Fannjiang, and Jennifer Listgarten
Nature Biotechnology, 2022
paper | talk | code

A simple combination approach to protein fitness prediction, learning from both (unlabeled) evolutionarily related protein sequences and variant protein sequences with experimentally measured labels. Also an analysis that highlights the importance of systematic evaluations and sufficient baselines.


Design and source code from Jon Barron's website and Leonid Keselman's Jekyll fork.