Finding and Working with Data Sets

Research Data

  • National Center for Biotechnology Information
    The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information.
  • ENCODE (Encyclopedia of DNA Elements)
    The Encyclopedia of DNA Elements (ENCODE) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.
  • Functional Glycomics Gateway by the Consortium for Functional Glycomics (CFG)
    The Functional Glycomics Gateway is a comprehensive resource for functional glycomics research brought to you by the Consortium for Functional Glycomics (CFG). Use this site to access our extensive databases, to request CFG resources and services, and to visit our archives of research publications and editorials.
  • RCSB Protein DataBank
    The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids. These are the molecules of life that are found in all organisms including bacteria, yeast, plants, flies, other animals, and humans. Understanding the shape of a molecule helps to understand how it works. This knowledge can be used to help deduce a structure's role in human health and disease, and in drug development. The structures in the archive range from tiny proteins and bits of DNA to complex molecular machines like the ribosome.

Multidisciplinary Data Sets or Repositories

A data repository is a directory or system that allows researchers to discover datasets (or, in some cases, other data repositories). Data repositories may or may not host the datasets they index: some (like ICPSR) host datasets and make them available to download, while others only link to the official sources where the datasets can be downloaded.

  • Bureau of Labor Statistics
    The Bureau of Labor Statistics is an independent national statistical agency that collects, processes, analyzes, and disseminates essential statistical data to the American public, the U.S. Congress, other Federal agencies, State and local governments, business, and labor.
  • Climate Change Knowledge Portal (CCKP)
    The Climate Change Knowledge Portal (CCKP) Beta is a central hub of information, data and reports about climate change around the world. Here you can query, map, compare, chart and summarize key climate and climate-related information. (The World Bank Group)
    As a priority Open Government Initiative for President Obama's administration, increases the ability of the public to easily find, download, and use datasets that are generated and held by the Federal Government. provides descriptions of the Federal datasets (metadata), information about how to access the datasets, and tools that leverage government datasets. The data catalogs will continue to grow as datasets are added. Federal, Executive Branch data are included in the first version of
    Registry of Research Data Repositories, the largest and most comprehensive registry of data repositories.
  • Harvard Library Open Metadata
    This dataset contains over 12 million bibliographic records for materials held by the Harvard Library, including books, journals, electronic resources, manuscripts, archival materials, scores, audio, video and other materials.
    The metadata has been created, acquired and modified over decades, and represents a range of cataloging rules and practices. The records have not been altered or quality-checked during the export process and are offered as is. For more information about the dataset, please see the Documentation file, below.
    HUD USER provides interested researchers with access to the original data sets generated by PD&R-sponsored data collection efforts, including the American Housing Survey, HUD median family income limits, as well as microdata from research initiatives on topics such as housing discrimination, the HUD-insured multifamily housing stock, and the public housing population.
    Social and political science research
  • Office of Minority Health (OMH): Data/Statistics
  • Statistical Databases and Data Sets - Library of Congress
    An annotated list of reference websites from the Library of Congress. Sets include Census Bureau data, Bureau of Justice data, HIV/AIDS Surveillance Reports, National Center for Health Statistics (CDC), and many more.
  • UK Data Service
    The UK Data Service is a comprehensive resource funded by the ESRC to support researchers, teachers and policymakers who depend on high-quality social and economic data. Here you will find a single point of access to a wide range of secondary data including large-scale government surveys, international macrodata, business microdata, qualitative studies and census data from 1971 to 2011.
  • UNSD Statistical Databases
    Our work is guided by a mandate from the United Nations Statistical Commission, the apex entity of the global statistical system. The Statistics Division's mission is to advance the global statistical system. We are responsible for collecting, compiling, classifying, publishing, and disseminating global official statistical data.
  • US Census: International Programs Data
    International Database (IDB), World Population Summary, Country Rankings, HIV/AIDS Surveillance, Global Population Mapping
  • World Bank Open Data
    World Bank Open Data: free and open access to data about development in countries around the globe.

Statistics and More Datasets Arranged by Subject Area

State, Federal, Foreign, and International Statistical Resources. Statistics are arranged by topic.


  • Budget and Taxes
  • Climate and Weather
  • Crime and Justice
  • Economy, Labor, and Trade
  • Elections and Voting
  • Foreign and International Resources
  • Health and Vital Statistics
  • Population and Housing
  • Science and Technology

More Datasets