Wall Street Journal
She’s Self-Taught and In Demand, but Crunching All That Code Is a ‘Slog’
CUPERTINO, Calif.—In many ways, Anne West is a typical 17-year-old California teenager. She wears her hair long. She likes to hang out with her friends. She went to the prom.
She is also analyzing her family’s genome.
Having been diagnosed with a pulmonary embolism in 2003, Anne’s father John decided last year to get the family’s genes sequenced. The process involves an advanced technology that spews out the six billion letters that represent the makeup of a person’s genetic code. But after putting up $160,000 to get the four-member family tested, the Wests realized something: sifting through the reams of data was tougher than they ever imagined.
Whole genome sequencing generates so much information that most companies just send clients reams of raw data but don’t offer a way to interpret the results. This doesn’t usually matter, since most clients are scientists working in academic labs where they can assign legions of Ph.D.s and use sophisticated computer programming to do the work themselves.
The Wests have Anne. She hasn’t studied computer languages, and her only potential colleague is her 16-year-old brother, Paul, who just started high-school biology. So Anne decided to do it the old-fashioned way: by hand.
On the family computer, she downloaded a Microsoft Excel spreadsheet to help her sort the data. While other kids were out at the mall or the beach, Anne was painstakingly comparing the sequences of letters that represent the nucleotides that make up the DNA that makes up the tens of thousands of genes in the human body.
To get an idea of the magnitude of the task, Nathan Pearson, director of research at the genomics company Knome Inc., says to think of the human genome as a huge book. “If you got an auctioneer to read out loud someone’s genome at six letters every second, it would take 34 years to finish,” he says.
Or, as Anne says: “It’s a bit of a slog.”
The idea to study the family genome first came up over dinner last fall. As an engineer, Mr. West had spent much of his career at companies that make the machines for DNA sequencing, including a stint as CEO of Solexa Inc., which was bought by Illumina Inc., the company that sequenced the West family’s genome.
The Wests started small. After the genetic-testing company 23andMe Inc. launched in 2006, Illumina offered a break for employees who wanted to get a small portion of their genome analyzed. The Wests signed up in 2007 and gave 23andMe genetic testing as Christmas presents to family members.
From that experience, Mr. West learned that Anne also inherited the mutation on the Factor V gene that increases the risk of a pulmonary embolism. He wanted to know more. By last year, whole genome sequencing was being offered to individuals, and the price had dropped enough that Mr. West raised the idea at dinner. “Dad got curious about the whole family,” Anne says.
Biology has been a passion of hers since the fifth grade. She works at the family computer in a room with a view outside to the pool. “I haven’t been swimming in a while,” she says, glancing outside.
Getting started, she knew it wasn’t possible to analyze the meaning behind every one of the six billion data points that comprise her own DNA. So Anne decided to narrow down her search. 23andMe had already looked at some places on 571,000 pieces of Anne’s genetic code, a swath of the genome where researchers have already identified certain genes that increase health risks. Anne downloaded this same section of the genome for herself, her brother, and her parents onto the family computer. She set up an Excel spreadsheet, with a column for each person.
She then created a series of formulas, using Excel’s help function when she got stuck. One formula separated which portions of genetic data came from Mr. West’s side of the family and which came from Mrs. West’s side. That information helped her to plot graphs to see how genetically close she and her brother are to each other, and to each of their parents.
The next step was to look more closely at one of the 20,000 genes we all have. Anne focused on the Factor V gene. She inherited the same mutation that her father has, putting her at increased risk for embolisms.
She decided to see if she had any additional mutations on the same gene from her mother’s side of the family, something that might increase her risk even more. She discovered that she does have other mutations, but that researchers don’t believe they will cause any health issues.
It took six months for Anne to get through just one gene. “There is a lot of technical jargon,” she says. “It was a lot for me to learn.” The project is complicated by time—Anne can work mainly on weekends, after practice for the school cross-country team.
Anne says she doesn’t talk about her efforts with friends. And they say their own geeky interests tend to differ. “Anne is a bio nut,” explains her friend Erica Hansen, “and I’m reading super string theory.”
In the world of genetics, though, Anne is becoming something of a celebrity. After a talk at a personal-genetics conference in Boston last May, Anne was “mobbed,” says Knome’s Dr. Pearson, who heard her talk. Scientists gave her business cards.
Nervous about presenting to such a high-powered audience, Anne printed out every slide and practiced the presentation twice a day with her dad. “I felt a wave of self-consciousness” about the attention, she says.
Still, the talk led to a job offer. Over the summer, Anne worked in the lab of George Church, a prominent scientist at Harvard and MIT. She is now at work on a paper based in part on her family’s data, with researchers from a Seattle institute. Last month, she was a speaker on a panel at a personal-genomics conference held at Cold Spring Harbor, New York, a scientific mecca.
In the coming months, Anne wants to keep working on the family DNA.
But she’s also determined to find time to achieve a more elusive goal. “I still haven’t taken driver’s ed,” she says.