Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes
Mahajan A., Wessel J., Willems SM., Zhao W., Robertson NR., Chu AY., Gan W., Kitajima H., Taliun D., Rayner NW., Guo X., Lu Y., Li M., Jensen RA., Hu Y., Huo S., Lohman KK., Zhang W., Cook JP., Prins B., Flannick J., Grarup N., Trubetskoy VV., Kravic J., Kim YJ., Rybin DV., Yaghootkar H., Müller-Nurasyid M., Meidtner K., Li-Gao R., Varga TV., Marten J., Li J., Smith AV., An P., Ligthart S., Gustafsson S., Malerba G., Demirkan A., Tajes JF., Steinthorsdottir V., Wuttke M., Lecoeur C., Preuss M., Bielak LF., Graff M., Highland HM., Justice AE., Liu DJ., Marouli E., Peloso GM., Warren HR., Afaq S., Afzal S., Ahlqvist E., Almgren P., Amin N., Bang LB., Bertoni AG., Bombieri C., Bork-Jensen J., Brandslund I., Brody JA., Burtt NP., Canouil M., Chen Y-DI., Cho YS., Christensen C., Eastwood SV., Eckardt K-U., Fischer K., Gambaro G., Giedraitis V., Grove ML., de Haan HG., Hackinger S., Hai Y., Han S., Tybjærg-Hansen A., Hivert M-F., Isomaa B., Jäger S., Jørgensen ME., Jørgensen T., Käräjämäki A., Kim B-J., Kim SS., Koistinen HA., Kovacs P., Kriebel J., Kronenberg F., Läll K., Lange LA., Lee J-J., Lehne B., Li H., Lin K-H., Linneberg A., Liu C-T., Liu J., Loh M., Mägi R., Mamakou V., McKean-Cowdin R., Nadkarni G., Neville M., Nielsen SF., Ntalla I., Peyser PA., Rathmann W., Rice K., Rich SS., Rode L., Rolandsson O., Schönherr S., Selvin E., Small KS., Stančáková A., Surendran P., Taylor KD., Teslovich TM., Thorand B., Thorleifsson G., Tin A., Tönjes A., Varbo A., Witte DR., Wood AR., Yajnik P., Yao J., Yengo L., Young R., Amouyel P., Boeing H., Boerwinkle E., Bottinger EP., Chowdhury R., Collins FS., Dedoussis G., Dehghan A., Deloukas P., Ferrario MM., Ferrières J., Florez JC., Frossard P., Gudnason V., Harris TB., Heckbert SR., Howson JMM., Ingelsson M., Kathiresan S., Kee F., Kuusisto J., Langenberg C., Launer LJ., Lindgren CM., Männistö S., Meitinger T., Melander O., Mohlke KL., Moitry M., Morris AD., Murray AD., de Mutsert R., Orho-Melander M., Owen KR., Perola M., Peters A., Province MA., Rasheed A., Ridker PM., Rivadeneira F., Rosendaal FR., Rosengren AH., Salomaa V., Sheu WH-H., Sladek R., Smith BH., Strauch K., Uitterlinden AG., Varma R., Willer CJ., Blüher M., Butterworth AS., Chambers JC., Chasman DI., Danesh J., Duijn CV., Dupuis J., Franco OH., Franks PW., Froguel P., Grallert H., Groop L., Han B-G., Hansen T., Hattersley AT., Hayward C., Ingelsson E., Kardia SLR., Karpe F., Kooner JS., Köttgen A., Kuulasmaa K., Laakso M., Lin X., Lind L., Liu Y., Loos RJF., Marchini J., Metspalu A., Mook-Kanamori D., Nordestgaard BG., Palmer CNA., Pankow JS., Pedersen O., Psaty BM., Rauramaa R., Sattar N., Schulze MB., Soranzo N., Spector TD., Stefansson K., Stumvoll M., Thorsteinsdottir U., Tuomi T., Tuomilehto J., Wareham NJ., Wilson JG., Zeggini E., Scott RA., Barroso I., Frayling TM., Goodarzi MO., Meigs JB., Boehnke M., Saleheen D., Morris AP., Rotter JI., McCarthy MI.
Identification of coding variant associations for complex diseases offers a direct route to biological insight, but depends on appropriate inference concerning the causal impact of those variants on disease risk. We aggregated coding variant data for 81,412 type 2 diabetes (T2D) cases and 370,832 controls of diverse ancestry, identifying 40 distinct coding variant association signals (at 38 loci) reaching significance (p < 2.2×10⁻⁷). Of these, 16 represent novel associations mapping outside known genome-wide association study (GWAS) signals. We make two important observations. First, despite a threefold increase in sample size over previous efforts, only five of the 40 signals are driven by variants with minor allele frequency <5%, and we find no evidence for low-frequency variants with allelic odds ratio >1.29. Second, we used GWAS data from 50,160 T2D cases and 465,272 controls of European ancestry to fine-map these associated coding variants in their regional context, with and without additional weighting to account for the global enrichment of complex trait association signals in coding exons. At the 37 signals for which we attempted fine-mapping, we demonstrate convincing support (posterior probability >80% under the “annotation-weighted” model) that coding variants are causal for the association at 16 (including novel signals involving POC5 p.His36Arg, ANKH p.Arg187Gln, WSCD2 p.Thr113Ile, PLCB3 p.Ser778Leu, and PNPLA3 p.Ile148Met). However, at 13 of the 37 signals, the associated coding variants represent “false leads”, and naïve analysis could have led to an erroneous inference regarding the effector transcript mediating the signal. Accurate identification of validated targets depends on correct specification of the contribution of coding and non-coding mediated mechanisms at associated loci.
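
To illustrate how weighting by annotation can change fine-mapping results, the following is a minimal sketch and not the authors' pipeline. It assumes that per-variant evidence is summarised as Wakefield approximate Bayes factors computed from effect sizes and standard errors, and that annotation enters as a relative prior weight per variant. The function names, the prior variance of 0.04, and the example weights are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def wakefield_log_abf(beta, se, prior_var=0.04):
    """Natural-log approximate Bayes factor in favour of association.

    Uses Wakefield's approximation; prior_var is an assumed prior variance
    on the log odds ratio, not a value taken from the abstract.
    """
    v = se * se
    z2 = (beta / se) ** 2
    shrink = prior_var / (prior_var + v)
    return 0.5 * (math.log(1.0 - shrink) + z2 * shrink)


def posterior_probabilities(log_abfs, prior_weights=None):
    """Posterior probability of being causal for each variant at one signal.

    Assumes exactly one causal variant per signal. prior_weights are relative
    (unnormalised) prior weights; None means every variant is weighted equally.
    """
    if prior_weights is None:
        prior_weights = [1.0] * len(log_abfs)
    log_terms = [l + math.log(w) for l, w in zip(log_abfs, prior_weights)]
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical signal: one coding variant followed by two non-coding variants.
log_abfs = [wakefield_log_abf(b, s) for b, s in [(0.10, 0.02), (0.11, 0.02), (0.09, 0.02)]]
unweighted = posterior_probabilities(log_abfs)
weighted = posterior_probabilities(log_abfs, prior_weights=[10.0, 1.0, 1.0])
```

Comparing `unweighted[0]` with `weighted[0]` shows how much of the posterior mass for the coding variant comes from the annotation prior rather than from the association data. A coding variant that still has low posterior probability after upweighting would correspond to the kind of “false lead” the abstract describes.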