PenCards: a global and community-contributed public archive of variant penetrance

Zhaopo Zhu, Ling Shang, Chuhan Shao, Zheng Wang, Xinxin Mao, Yuanfeng Huang, Pei Yu, Bin Li, Jinchen Li, Guihu Zhao

Citation: Zhaopo Zhu, Ling Shang, Chuhan Shao, Zheng Wang, Xinxin Mao, Yuanfeng Huang, Pei Yu, Bin Li, Jinchen Li, Guihu Zhao. PenCards: a global and community-contributed public archive of variant penetrance[J]. Journal of Genetics and Genomics, 2026, 53(2): 332-342. doi: 10.1016/j.jgg.2025.07.001

doi: 10.1016/j.jgg.2025.07.001
Funding:

We thank the members of the National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University. We are grateful for resources from the High-Performance Computing Center of Central South University. We are grateful for technical support from the Bioinformatics Center, Furong Laboratory and Bioinformatics Center, Xiangya Hospital, Central South University. This work was supported by the National Natural Science Foundation of China (32070591, 82371552, and W2512102), the Scientific Research Program of FuRong Laboratory (2023SK2093-1), the Central South University Research Programme of Advanced Interdisciplinary Study (2023QYJC010), the Natural Science Foundation of Hunan Province (2023JJ30975), and the Fundamental Research Funds for the Central Universities of Central South University (2025ZZTS0834).

Details
    Corresponding authors:

    Bin Li, E-mail: lebin001@csu.edu.cn

    Jinchen Li, E-mail: lijinchen@csu.edu.cn

    Guihu Zhao, E-mail: ghzhao@csu.edu.cn

Abstract: Penetrance is a crucial indicator for accurately assessing disease risk and plays a vital role in disease research, gene therapy, and genetic counseling. However, because penetrance data are dispersed across many sources, accessing and consolidating this information efficiently is a challenge, and a comprehensive platform that integrates penetrance data is urgently needed. Here, we present PenCards, a global, community-contributed public archive of variant penetrance, built by first collecting penetrance data from all published literature and then using large international cohorts to calculate the penetrance of autism-related variants. PenCards contains a total of 244,531 variants, including 239,244 single nucleotide variants, 4,994 insertions and deletions, and 293 copy number variants, covering approximately 300 phenotypes. We also provide a submission portal so that penetrance data can be updated dynamically. Additionally, to help users access genetic information efficiently, we integrate over 150 variant- and gene-level resources. In summary, PenCards is a powerful platform designed to advance genetic research and diagnostics. PenCards is publicly available at https://genemed.tech/pencards/.
Publication history
  • Received: 2025-04-10
  • Revised: 2025-06-30
  • Accepted: 2025-07-01
  • Published: 2026-02-10
