Metadata is data that provides information about other data. In scholarly publishing, metadata refers to structured information that describes the attributes of an article, including its title, authors, date of publication, copyright and licensing status, and more.
Metadata should be created following appropriate standards. It is commonly deposited with Crossref (article metadata), DataCite (metadata on other research objects such as data and software) or indexes (journal metadata). Journals have a responsibility to make their metadata easily available, so that published content is discoverable by readers via a broad range of search approaches.
It is important to provide a common structure for metadata, so that it can be read by machines and automatically presented to users. Metadata standards promote interoperability by helping to ensure that records remain accurate and consistent across systems.
Using metadata standards also enables and promotes the development of indexing and discovery services, particularly when used in combination with persistent identifiers for articles (e.g. digital object identifiers, permalinks), authors (ORCID) and organisations (e.g. Research Organization Registry).
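As an illustration of how persistent identifiers can be checked before they are added to a metadata record, the following is a minimal Python sketch. The function names are hypothetical. The DOI pattern is only a loose plausibility heuristic (an assumption, not an official rule), while the ORCID check digit follows the ISO 7064 MOD 11-2 algorithm that ORCID documents for its identifiers.

```python
import re

# Loose plausibility check only: a "10." prefix, a registrant code, a slash
# and a non-empty suffix. This is an assumption for illustration; it does not
# confirm that the DOI is registered or resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def is_plausible_doi(doi: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(doi.strip()))


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 ORCID digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Validate the format and check character of an ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


if __name__ == "__main__":
    print(is_plausible_doi("10.1000/182"))
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only catches typing errors. Confirming that an identifier belongs to the right article or person still requires a lookup against the registry concerned.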
Some notable metadata standards include Dublin Core, Machine-Readable Cataloging (MARC), and the Crossref and DataCite metadata schemas. Whilst JATS is primarily a format for storing the entire article (see Structured content), it is also used as a metadata interchange format between publishers and archivists.
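To make the idea of a metadata standard more concrete, here is a minimal Python sketch that serialises a few article fields as a simple Dublin Core record, using the `oai_dc` wrapper element and the Dublin Core element namespace. The function name `build_dublin_core` and all sample values are hypothetical placeholders, not taken from any real journal.

```python
import xml.etree.ElementTree as ET

# Namespaces for simple Dublin Core and the oai_dc container element.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def build_dublin_core(fields: dict[str, list[str]]) -> str:
    """Serialise a mapping of Dublin Core element names to values as XML.

    Each key is an element name such as "title" or "creator"; each value is
    a list, because Dublin Core elements are repeatable.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    record = {
        "title": ["An example article title"],
        "creator": ["Example, Author A.", "Example, Author B."],
        "date": ["2023-01-31"],
        "rights": ["https://creativecommons.org/licenses/by/4.0/"],
    }
    print(build_dublin_core(record))
```

Because every record uses the same element names, a harvester can read the title, creators and licence from any journal that follows the standard without journal-specific code.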
Typical differences between article and journal metadata are noted in the following table.
|Focus of the metadata||Typical fields|
Journals typically display clear metadata alongside individual articles. This helps readers identify, for example, the title of the article, its author(s), publication date and persistent identifier. If the journal publishes JATS XML versions of its articles, these can be used to supply metadata in a structured form, which can be helpful for text and data mining. Metadata can also be embedded in PDF documents, using the Extensible Metadata Platform (XMP). Most publishing systems (e.g. Open Journal Systems, Janeway) will make Dublin Core metadata available on the article’s abstract page, so that it can be read by referencing tools (e.g. Zotero), and will provide an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) feed for metadata harvesting.
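The two access routes described above can be sketched in Python using only the standard library. The first function collects Dublin Core `<meta>` tags from the HTML of an abstract page, assuming the page names them with a `DC.` prefix (exact tag names vary between platforms). The second builds an OAI-PMH `ListRecords` request URL. The function names and the example base URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class DublinCoreMetaParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> tags from an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        # Assumption: Dublin Core tags are named with a "DC." prefix.
        if name.lower().startswith("dc.") and content is not None:
            self.fields.setdefault(name, []).append(content)


def extract_dublin_core(html: str) -> dict[str, list[str]]:
    """Return the Dublin Core meta tags found in an HTML document."""
    parser = DublinCoreMetaParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    page = (
        '<head><meta name="DC.Title" content="An example article title">'
        '<meta name="DC.Creator.PersonalName" content="Author A. Example">'
        "</head>"
    )
    print(extract_dublin_core(page))
    # Hypothetical endpoint; the real path depends on the journal platform.
    print(oai_list_records_url("https://journal.example.org/oai"))
```

Reference managers work along the lines of the first function, reading the abstract page of a single article, whereas indexes and aggregators typically use the OAI-PMH feed to harvest records for a whole journal in bulk.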
- Merriam-Webster. (n.d.). Metadata.
- ORCID. (n.d.). About ORCID.
- ROR. (n.d.). Home.
- Dublin Core. (n.d.). Dublin Core.
- Library of Congress. (2009). What is a MARC record, and why is it important?
- Crossref. (2021, October 22). Metadata principles and practices.
- DataCite Schema. (2021, March 30). DataCite Metadata Schema.
- National Center for Biotechnology Information, U.S. National Library of Medicine. (n.d.). Journal Article Tagging Suite (JATS).
- Padula, D. (2019, August 22). Journal indexing: core standards and why they matter. LSE blog.
- Wikipedia. (2023, March 21). Extensible Metadata Platform.
- Open Journal Systems. (n.d.). Open Journal Systems.
- Janeway. (n.d.). Janeway.
- Zotero. (n.d.). Zotero.
- Open Archives Initiative. (n.d.). Protocol for Metadata Harvesting.