Module 3: Part 2


Digitization of Analog Materials

When working with your personal records, you might find yourself interested in digitizing some of them. Digitization can help by creating digital backups of your paper-based files, allowing you to share images or recordings more easily with others, and making reproductions of your materials. It is important to note that digitization in and of itself should not be considered a replacement for the physical materials. When you digitize something, you should still keep the physical item as the original.

The digitization itself can often be done as a DIY project, although in some cases using a professional service might prove beneficial.

The following section provides basic information about the digitization process, including the steps, different equipment, and considerations.

The first step in any digitization project is developing a plan. Consider what you would like to digitize. Do you want to digitize everything? Or, more likely, only some of your material? How much time do you want to spend on the project? What types of materials do you want to digitize (e.g., paper-based, photographs, moving images, recorded audio, etc.)? Do you want to do all the digitization at once or in phases over time? How much money can you spend on either purchasing or renting equipment or using a professional service? Where will you keep the resulting digital files?

We recommend using the following worksheet to plan your digitization project.

The following sections discuss the equipment, procedure, file formats, and standards for each major format type.

Paper-based objects

Paper-based objects are the most common type of material and the easiest to digitize at home with relatively inexpensive equipment. Three types of scanners are used for paper-based objects: a photocopier or office printer, a flatbed scanner, or a drum scanner. An office printer’s scanner function often includes an automatic document feeder. This can significantly increase the speed and efficiency of scanning, but you should not send damaged or brittle documents through the document feeder, as they can jam and be damaged further.

When scanning documents, you should use either PDF or PDF/A format. Although PDF originated as an Adobe format, it is now an open ISO standard (ISO 32000), and both formats are widely used in archival settings. In particular, PDF/A (ISO 19005) is designed for long-term digital preservation. Depending on the documents themselves, you may scan in color, grayscale, or black & white. Scanning in color will result in larger file sizes but should be used if the original document is in color. After scanning your documents, follow the digital preservation procedures noted in the next module.

Photographs

Most people think of digitizing their photographs as a priority in order to share images with others or create reproductions of the originals. Digitizing photographs requires some additional decisions and can be more complicated than digitizing documents. In general, digital images fall into two categories: images that are scanned or digitally captured, and images created on a computer using graphics software (e.g., logos). The former tend to be bitmap or raster graphics. This means they are made up of pixels, with each pixel being assigned a specific location and color. Each image contains a fixed number of pixels, and the overall size of the image is based on the image resolution. They can represent subtle gradations of color, as you would see in a photograph. Images created in graphics software, on the other hand, are often vector graphics and can be scaled to any size without losing detail. When working with photographs or moving image material, you will most likely be creating the bitmap form of files, those with individual pixels.

One of the big things to consider when digitizing photographs is the file format you will use. Formats fall into two categories: those that use lossy compression and those that use lossless compression. Lossy compression creates smaller file sizes by discarding parts of the image information. While it focuses on details and color information that may go unnoticed by the human eye, the discarded information cannot be recovered. Lossless compression creates smaller files by rewriting the image data in a more compact form. It does not remove any of the image data; it simply uses fewer words to say the same thing.
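The difference between the two kinds of compression can be sketched in a few lines of Python. This is only an illustration: the pixel values are made up, `zlib` stands in for a lossless method, and the "lossy" step is a toy rounding scheme, not how JPEG actually works.

```python
import zlib

# Hypothetical 8-bit grayscale pixel values (0-255), repeated to mimic scan lines.
pixels = bytes([12, 12, 13, 200, 201, 203, 90, 91] * 100)

# Lossless: the data gets smaller, and decompressing returns every original byte.
packed = zlib.compress(pixels)
lossless_ok = zlib.decompress(packed) == pixels

# Lossy (toy illustration): round each value down to a multiple of 16.
# The small differences between neighboring pixels are discarded for good.
quantized = bytes((p // 16) * 16 for p in pixels)
lossy_packed = zlib.compress(quantized)
lossy_ok = zlib.decompress(lossy_packed) == pixels

print("original bytes:", len(pixels))
print("lossless bytes:", len(packed), "identical after round trip:", lossless_ok)
print("lossy bytes:", len(lossy_packed), "identical after round trip:", lossy_ok)
```

The lossless round trip gives back the exact original, while the lossy version can never be turned back into it. This is why lossless formats are usually preferred for master copies.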

Yet another thing to consider is bit-depth. This refers to the number of bits used to represent the color of each pixel in an image. The greater the bit-depth, the more bits of information per pixel. An 8-bit setting will display 256 colors. A 16-bit setting will display thousands of colors (65,536), and a 24-bit setting will display millions of colors (about 16.7 million). Here are some examples of different bit-depths:

  • Black & white: 1-bit; one bit to describe each pixel; can be used for line art

  • Grayscale: 8-bit; 256 possible values (e.g., 256 shades of gray in a grayscale image)

  • Full color: 24-bit; three 8-bit channels; about 16 million color combinations, which represents a significant portion of the range of colors visible to the human eye

  • 32-bit: CMYK images or RGB images with a fourth (alpha) channel

  • 48-bit: three 16-bit channels; generally the highest bit-depth available; most software and hardware are not able to display this much data
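The color counts above follow from simple arithmetic: each extra bit doubles the number of values a pixel can take, so a bit-depth of n allows 2 to the power of n values. A minimal sketch (the function name `color_count` is just an illustrative choice):

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel can hold at the given bit-depth."""
    return 2 ** bits_per_pixel

# Labels mirror the examples in the list above.
for label, bits in [
    ("Black & white", 1),
    ("Grayscale", 8),
    ("16-bit color", 16),
    ("Full color", 24),
]:
    print(f"{label}: {bits}-bit -> {color_count(bits):,} possible values")
```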

Here are some common file formats used for digitized images.

  • Graphics Interchange Format (GIF): In general, this format is best for line drawings, cartoons, illustrations, logos, or images that use large flat areas of color. It is a lossless compression format with 8-bit color support. It is now best known for supporting animation.

  • Joint Photographic Experts Group (JPEG): In general, this format is best for continuous-tone photographic images. JPEGs use lossy compression and support 24-bit color.

  • Portable Network Graphic (PNG): In general, this format is used as a replacement for GIF when more bit-depth or transparency is needed.

  • Tagged Image File Format (TIFF): This format is a good way to save scanned images for long-term archival storage. It is a platform-independent format that can store images uncompressed or with lossless compression. The main drawback to saving a file in TIFF format is that the file size may be quite large because of the large amount of information saved.

One last thing to consider for scanning is the resolution you will capture. Resolution is the number of pixels per inch (PPI) in an image. The overall size of an image is often given as pixel dimensions, such as 800 x 600. Many people are more familiar with dots per inch (dpi), which refers to the “dots” of ink per inch when a digital image is printed. When we create or scan a digital image, we are capturing pixel information. Scanners record the color value and brightness of each area of an image when scanned. We decide how much pixel information to capture by setting the resolution as we scan. The resolution and pixel dimensions determine what we can do with the image. A lower-resolution image may look fine when it is small but quickly becomes pixelated when enlarged. Scans with higher resolution and higher bit-depth will result in larger file sizes, since they contain more pixels and more data per pixel, but they can reproduce more detail and subtle color transitions.
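To see how resolution and bit-depth drive file size, here is a rough calculation. It is a sketch under stated assumptions: the 4 x 6 inch print and the scan settings are hypothetical examples, and the result is the uncompressed size, ignoring file headers and any compression.

```python
def scan_size(width_in: float, height_in: float, ppi: int, bits_per_pixel: int):
    """Return (width in pixels, height in pixels, uncompressed size in bytes)."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes

# Hypothetical 4 x 6 inch print scanned in 24-bit color at two resolutions.
for ppi in (300, 600):
    w, h, size = scan_size(4, 6, ppi, 24)
    print(f"{ppi} ppi: {w} x {h} pixels, about {size / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution doubles both the width and the height in pixels, so the uncompressed size grows by a factor of four.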

When deciding on how to scan your photographs, first consider how many photographs you would like to digitize (and their sizes). If you have many of the same sizes, you might consider using a drum scanner, as this will automate some of the scanning. If you prefer using a flatbed scanner, consider the quality and features of the one you purchase. Some may have higher bit-depth and resolution capabilities than others—typically for an additional cost.

If you can, scan at the highest resolution and bit-depth your scanner supports and save the scans in an uncompressed file format, such as TIFF. You can then create a derivative file from this original to edit or correct in programs such as Adobe Photoshop, and optimize or compress the image after all edits are complete. As with documents, follow the advice given in the next module regarding digital preservation once you’ve created your digital images.

If you have a lot of images to digitize, you could work on them in batches. Using a laptop and flatbed scanner, you can efficiently complete many scans in an evening while watching TV or a movie. You can also consider outsourcing the digitization to one of many companies.

Moving Image Magnetic Tape (e.g., VHS) & Cassette Tapes

Digitizing moving image and audio materials is a more complex task that requires specific equipment to be completed properly. If you have a well-maintained playback machine (a VCR or tape deck), you can connect its audiovisual outputs to a digital converter that connects to your computer. You can then use software to digitally capture the analog feed. For moving images, this can be done using a screen or direct-to-optical capture (such as with VCR/DVD recorder decks). For audio, this would likely involve using Audacity or a similar program to capture the audio feed through your computer's sound card.

Similar to digitizing photographs, you need to consider the type of capture formats used, the sampling quality, and capture resolutions. As magnetic tape materials are more fragile, we recommend considering using a professional company to digitize these materials.
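As with image resolution and bit-depth, audio sampling settings determine how large the captured files will be. The sketch below estimates uncompressed audio size. The settings used (44,100 samples per second, 16 bits per sample, two channels, often described as CD quality) are assumptions chosen for illustration, not recommendations from this module.

```python
def audio_size_bytes(seconds: float, sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Approximate size of uncompressed audio, ignoring file headers."""
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate * bytes_per_sample * channels)

# Assumed example: a 60-minute cassette captured at 44.1 kHz, 16-bit, stereo.
one_minute = audio_size_bytes(60, 44_100, 16, 2)
full_tape = audio_size_bytes(60 * 60, 44_100, 16, 2)
print(f"one minute: about {one_minute / 1_000_000:.1f} MB")
print(f"60-minute tape: about {full_tape / 1_000_000:.0f} MB")
```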

Digitization Plan

Download a handout with more details on developing a digitization plan here.
