Glossary

Best Practice

Professional procedures that are generally accepted as the best or most ‘correct’ way of doing something.

Collection

In the DOH repository, a collection is a group of items that have been deliberately placed together or that have come together naturally. While digitizing, we want to reflect the physical groupings of items through the creation of ‘collections’ in our filenames and directory structures.

Compound object

An item that has two or more parts that are connected to each other in some way. This can be a double-sided postcard, a multi-page document stapled together, or a book. As we digitize items, we want to keep the digital images connected in the same way that the physical items were.

Digitizing

The process of turning a physical item into a digital item and documenting metadata for that item.

Directory structure

A way of saving files that is hierarchical and systematic.

DPI (‘dots per inch’)

A unit of measurement that indicates the density of physical dots per inch of an image. Traditionally, DPI was used in printing environments, where ink printers deposited thousands of small dots on a page, which together formed the desired image. Although DPI is often used synonymously with PPI, the two are not exactly the same: DPI describes the output of a digital image to the physical environment, while PPI describes the input resolution of the image in digital space. In either case, an image with a higher DPI or PPI captures more detail and is generally of higher quality. To avoid confusion, this toolkit uses DPI, since it is the acronym most commonly used by digitization technology.
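As a rough illustration of what the number means in practice, the pixel size of a scan is the physical size in inches multiplied by the DPI. The sketch below is only an example: the function name and the 600 DPI setting are assumptions chosen for illustration, not settings recommended by this toolkit.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return the (width, height) in pixels of a scan made at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)

# A 4 x 6 inch photograph scanned at 600 DPI (example values only)
print(pixel_dimensions(4, 6, 600))  # (2400, 3600)
```

Doubling the DPI doubles both dimensions, so the scan holds four times as many pixels and the file grows accordingly.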

Filenames

The name given to a digital derivative upon creation. Filenames should be unique and identify what fonds/collection the item belongs to.
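One way to picture how a filename and a directory structure can keep items together is sketched below. The pattern used here (collection name, item number, part number) is a hypothetical convention invented for this example, as are the function name and the collection name "smith"; it is not a DOH requirement.

```python
from pathlib import PurePosixPath

def build_path(collection, item_number, part_number=1, extension="tif"):
    """Build a hypothetical 'collection/filename' path for one scanned image.

    The pattern <collection>_<item>_<part>.<ext> is an assumption made for
    illustration only.
    """
    filename = f"{collection}_{item_number:04d}_{part_number:02d}.{extension}"
    return PurePosixPath(collection) / filename

# Front and back of a double-sided postcard: item 12 in a collection called "smith"
print(build_path("smith", 12, 1))  # smith/smith_0012_01.tif
print(build_path("smith", 12, 2))  # smith/smith_0012_02.tif
```

Because both files share the same collection and item number, the two sides of the compound object stay next to each other when the folder is sorted by name.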

Fonds

Fonds is an archival term for a group of documents that have come together naturally over time. While digitizing, we want to keep the items in the fonds together through the use of filenames and a directory structure. Throughout these documents, the term ‘collection’ is used instead of ‘fonds’ for simplicity.

Item

The smallest single unit, the most basic level of archival description. For a photo collection, a physical ‘item’ would be a single photograph, and for a digital photo collection, the ‘item’ would be the single file that represents that photo.

JPEG

A common image file format that works well on the internet as it is quick to load. This is a lossy compression format and should only be used as an access copy for digitized images.

Lossy

Lossy compression occurs when digital information is lost during the conversion to a particular file format. Lossy compression is typically used to reduce the size of a file and removes information that the converter considers superfluous, such as small differences between similar colours. Lossy compression is acceptable for access copies, but should never be used for archival master files.

Lossless

Lossless compression refers to file compression that does not lose any digital information during the conversion process. Lossless files are typically larger than lossy files, but should always be used for archival master files to ensure that no important information is lost.
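The difference between the two kinds of compression can be shown with a small sketch. The pixel values are made up for illustration, and the rounding step is only a crude stand-in for lossy compression: real formats such as JPEG use far more sophisticated methods. The lossless half uses zlib from the Python standard library.

```python
import zlib

# Hypothetical row of 8-bit greyscale pixel values (made up for illustration)
pixels = bytes([10, 11, 12, 13, 200, 201, 202, 203])

# Lossless: compress, then decompress, and every original value comes back
restored = zlib.decompress(zlib.compress(pixels))
print(restored == pixels)  # True

# Lossy (crude stand-in): round each value down to a multiple of 4,
# so similar shades collapse into one and the originals cannot be recovered
lossy = bytes((p // 4) * 4 for p in pixels)
print(list(lossy))      # [8, 8, 12, 12, 200, 200, 200, 200]
print(lossy == pixels)  # False
```

Once the similar shades have been merged, nothing in the lossy data records what the original values were, which is why lossy files are unsuitable as archival masters.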

Metadata

Information about an object that helps to situate it in context and allows people to search for the item or items like it.

OCR

Optical Character Recognition (OCR) is the process of converting digitized images of text into machine-readable text.

PDF (‘Portable Document Format’)

This is a file format that provides an electronic image of text or graphics that looks like a printed document and can be viewed, printed, and electronically transmitted.

PPI (‘pixels per inch’)

A unit of measurement that indicates the density of digital pixels per inch of an image. While PPI is often used synonymously with DPI, the two are not exactly the same: DPI describes the output of a digital image to the physical environment, while PPI describes the input resolution of the image in digital space. In either case, an image with a higher DPI or PPI captures more detail and is generally of higher quality. To avoid confusion, this toolkit uses DPI, since it is the acronym most commonly used by digitization technology.

Provenance

The origin or source of an item or archival aggregation. In archives, collections and fonds are kept separate according to provenance to distinguish their origin and preserve the context of their creation.

Repository

A place where things are stored and can be found. For DOH, this refers to all of our project partners who hold the physical versions of the items that have been or will be digitized.

Scanning

Using a scanner to turn a physical image or item into a digital item.

Sprockets

Film sprockets are holes on the edge of a film strip that are used by cameras and/or projectors to pull the film from one frame to the next. Different film stocks have different sprocket shapes and placements. Certain rare film stocks have sprockets in the center of the frame, rather than on the edges.

TIFF

A common image file format, short for Tagged Image File Format. Unlike a JPEG, a TIFF can be both a file format in its own right and a wrapper for other image files, such as JPEGs. TIFF files are the ideal format for archival masters because they are both lossless and stable. However, it is important to note that lossy compression is irreversible, and wrapping a lossy image format in a TIFF will not undo this loss. Consequently, TIFFs and JPEGs need to be derived separately, rather than by wrapping a JPEG in a TIFF afterwards.

Transcribe

To transcribe something is to copy it exactly as it appears in the original.