Berkeley - In a study led by researchers at the University of California, Berkeley, and the Salk Institute in La Jolla, Calif., scientists have accurately mapped the genes of the common mustard weed, Arabidopsis. The achievement may lead to the next generation of genetically modified crops that can grow faster, produce more food and resist disease.
The study, which appears in the Oct. 31 issue of Science, reveals the existence of nearly 6,000 genes, about one-third of the genes that exist in Arabidopsis. Knowing these genes and how they work could allow researchers to quickly use them to change the characteristics of other plants.
"Arabidopsis has all the genes a plant needs," said Joe Ecker, Salk professor of plant biology. "All flowering plants are closely related, and so the genes that encode various traits are also shared. It's possible, then, to take a gene for flowering from Arabidopsis and insert it into rice or poplar, and have that gene function."
Ecker and Athanasios Theologis, adjunct professor at UC Berkeley's College of Natural Resources and senior scientist at the Plant Gene Expression Center, are the principal investigators on the project, which includes a team of 72 scientists from nine institutions in the United States and Japan. The Plant Gene Expression Center is a collaboration between UC Berkeley's Department of Plant and Microbial Biology and the USDA's Agricultural Research Service.
The findings revealed some shortcomings of computer-based gene prediction programs, including those that have been used to find genes in the sequenced genomes of humans and of Arabidopsis, the plant biologists' equivalent of the fruit fly for genetics research.
The researchers point out that computer algorithms can't always distinguish whether a stretch of genetic code corresponds to a single gene or to two overlapping genes. And while the programs have become increasingly accurate in recent years, the researchers added, they may still put parts of genes in the wrong places, find genes that aren't really there, or miss genes altogether. What an initial genome sequence often provides, the researchers say, is a "best-estimate" lineup of transcription units.
To get the real picture of what's there and what's not, researchers say they need empirical, experimental verification.
The research team placed the entire Arabidopsis genome, consisting of about 25,000 suspected genes, on a series of six gene chips, and then analyzed the chips for any protein-making activity, the primary function of genes. They isolated one-third of the plant's genes, which will be made publicly available so that researchers can fix errors in the current blueprint of the genome. In addition to finding shortcomings in the much-heralded computerized methods of identifying genes in a genome, they discovered about 3,300 functioning genes for the first time.
"By putting the entire genome on the gene chips, we could find that what the computers predicted as genes were wrong about a third of the time," said Ecker. "But we also found other genes we had not seen before. Genetically, plants are much simpler than animals, so this information can be used almost immediately to improve crop yields and disease resistance."
"We eventually want to be able to understand the function of all the proteins within an organism," said Theologis. "If you know the correct gene structure, you can clone DNA to express and study proteins. This type of research eventually will lead to advances in proteomics."
Many of the researchers on this study were part of the team that sequenced the genome of Arabidopsis nearly three years ago. The initial genome work and the current research are funded by the National Science Foundation (NSF), which established a project to identify an entire plant genome by 2010.
"The technology used in this research will be able to reveal the dark matter in a genome," said Theologis. "We will be able to identify never-before-seen RNA in regions that were once thought to contain no genes. Researchers could also use this method to get a more definitive answer to how many genes are in the human genome."
"Finding the genes that lurk in the DNA sequence sounds like an easy problem, but in fact is tremendously challenging," said Robert Last, program director of the NSF's plant genome research program. "Completion of the DNA sequence of a genome such as Arabidopsis is an important milestone towards understanding the function of every gene in the plant, and discovering the genes that can positively influence the productivity, nutritional and medical value of the plant to human beings."