During my PhD, I attempted to disentangle the evolution of a protein family that is important in photosynthesis. I had noticed that something was wrong with the accepted evolutionary models of the light-harvesting complex (LHC) protein superfamily, so I pushed the available phylogenetic methods as far as possible in order to solve this problem. Along the way, I discovered a new gene subfamily in red algae and diatoms, which we subsequently analyzed in functional experiments. In photosynthesis, sunlight interacts with colorful photosynthetic pigments such as the chlorophylls, carotenoids and phycobilins. The first two of these pigment classes can be bound by members of the extended LHC protein superfamily and are organised so that they function either in the collection of sunlight or in the defense against it.
The main result of this work was an improved model for the evolution of the extended LHC protein superfamily. After careful searches for homologous protein sequences in public sequence databases, we developed a coherent classification system for the different protein families, based in part on hidden Markov model analyses. With this approach, we identified numerous new LHC-like genes, including several from the model plant species Arabidopsis thaliana, and described new families, such as RedCAP from red algae and complex algae with red plastids, as well as new subfamilies of two-helix proteins from glaucophytes, red algae, diatoms and plants. By adapting different phylogenetic methods to our questions, we showed that LHC and PSBS, as well as other eukaryotic three-helix proteins, evolved independently, contrary to previous suggestions. Over billions of years, and in a still ongoing process, adaptive processes including the evolution of new protein functions, the origin of novel protein families and the secondary loss of others, as well as lineage-specific family expansions, have shaped this protein superfamily.