Answered By: Bryan Kasik
Last Updated: Dec 20, 2023

"The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. LDC was formed in 1992 to address the critical data shortage then facing language technology research and development.

Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise.

LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences. LDC’s connection with Penn provides a strong foundation for the Consortium’s research and outreach to an active and diverse member community." - from

UVA does not pay membership fees, but it is listed as the umbrella organization. Users must create an individual account listing UVA as their institution; the account request goes to Erin Pappas for approval. Licensing fees are paid by the end user, by invoice or credit card; the library does not pay for any data or corpora.

There is no cost to create an account or to download data already in the account. (Please note that some “E” datasets in the account belong to a specific Evaluation that someone previously participated in and are not available for public download.) Users can also request data as non-members: each dataset in the LDC catalog has a “View Fees” button that shows the fee schedule. Users can license data by credit card through the online transaction system or request that an invoice be sent.

See Licensing Data as a Nonmember for information about obtaining, licensing, and paying for data.

Contact Erin Pappas or Jenn Huck for more information.


