Ancestry Daily News
10/23/2003 - Archive

RootsWorks: Scanning Newspapers
Last year, I volunteered again. I know I said I wouldn't, but if you are one of those people who volunteers for things, you know how it goes. I just hate to see a service to others go undone. This time it was a high school reunion: I agreed to help with the website for the Alief High Class of '72 30-year reunion.

Everything I do turns into a history project. In this case, I was attracted to the possibility of establishing a timeline of our local history and the real world outside Alief. I was excited to find old school directories, because I could map where people used to live. We found a yearbook and scanned the pictures from our senior year for a yearbook page. We found old football programs, basketball programs, and the graduation program.

That was nice, but then someone found a stack of old school newspapers. That was treasure for me—lots of the events of our lives during that time were there in words and pictures. As I've joked many times, I paid good money to forget everything before 1983. Some great memories came back. Those football games and school competitions seemed like life and death back then, and all of the urgency and hopes of youth came flooding back to me.

I decided to scan the newspapers and put them on the reunion site. That's when I learned that scanning newspapers isn't like scanning photos. In hopes that you can avoid some of the pitfalls I encountered, I'll tell you how to get the best scans from newspapers. But before I do, let's talk about newspapers and why scanning them is a special problem for genealogists.

How Newspaper is Made
First, they start with some very cheap paper. They don't make it to last 80 years. It ages badly, the ink goes everywhere when you handle it, and some real care is called for in handling the paper.

The Propes family holds a reunion in Henderson, Texas, each June. The Henderson Daily News has covered that event, publishing group photos from the reunion from time to time. I have newspapers from the late 1950s containing the only photos I have of many long-departed relatives. Those pages are brittle, and they have gone past yellow to an orange-brown color.

Second, newspapers are printed on a printing press. That means that, like magazines, they are printed with a "halftone screen," which is a pattern of black dots of varying sizes. If you are looking at color, it's actually four screens. Color printing uses a "CMYK" process to print four different screens of cyan, magenta, yellow, and black ink, each set at a different angle from the others. If you scan them, the scanner's own sampling grid interferes with the dots and you will get a pattern of lines and blocks, called a "moiré pattern." Take a look at a newspaper under a magnifying glass and you'll see the dot pattern that causes it on the original.

The word moiré comes from the French verb "moirer," which means "to water," as in giving cloth a watered finish. A moiré pattern is an optical interference pattern, and it got the name because it resembles the rippled, watery look of "moiré fabric." The fabric's name, in turn, is most likely a corruption of the English word "mohair." (You get your information value here at the Ancestry Daily News.)

Scanning Resolution
Newspapers are generally printed at about 85 lines per inch (LPI). Magazines vary between 135 and 150 LPI, so there is a big difference in the quality of the printing. Most people suggest that you not scan at higher than 2 times the LPI, which would be about 170 dots per inch (DPI). That's probably pretty close—I've scanned newspaper at 150 and 200 DPI with good results, if I'm only working on the text. Images are a different problem.
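The rule of thumb above is simple arithmetic, and a few lines of Python can spell it out. This is only a sketch: the function name `max_scan_dpi` is made up for illustration, and the factor of 2 is the rule of thumb quoted above, not a hard limit.

```python
def max_scan_dpi(lpi, factor=2):
    """Upper bound on scan resolution: `factor` times the print screen's LPI."""
    return lpi * factor

# Screen rulings quoted in the article, in lines per inch.
screens = {"newspaper": 85, "magazine (low end)": 135, "magazine (high end)": 150}

for name, lpi in screens.items():
    print(f"{name}: {lpi} LPI -> scan at no more than {max_scan_dpi(lpi)} DPI")
```

For newspaper this gives the 170 DPI figure mentioned above, which is why 150 to 200 DPI is a comfortable range for text.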

I scan in color, even if the paper is black and white, for two reasons. One, I get more information, even if I'm going to convert the final image to grayscale. Two, I kind of like that sepia color that the old papers have, so sometimes I keep it. But I don't like that moiré pattern on my photos, and there are several ways to get rid of it.

The best tool is a scanner that has a "descreen" setting in the scanner software. If you don't see the word "descreen," you might look for "newspaper" as a source for your scans. Examples of images with and without screens are available on the RootsWorks site, at the link listed below the article. This works well for most newspapers: just scan at 150 DPI with a descreen filter and you're done.

If you want to try a more advanced approach, there are lots of ways to make it more complicated. Since we saw the pattern in the magnifying glass, we know that it is part of the image. What we have to do to remove it is to "blur it out of existence" as Wayne Fulton, the author of the scantips.com website, suggests. Wayne scans at 300 and 600 DPI, without the descreen filter. Then, using a program like Photoshop Elements, he shrinks the image to 1/3 its scanned size, and applies an "Unsharp Mask" as the icing on the cake.
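To see why shrinking a high-resolution scan blurs the dots away, here is a toy sketch in plain Python. It is not Wayne's procedure or what Photoshop Elements does internally: it assumes a grayscale image stored as a list of rows, uses a made-up function name (`shrink_by_averaging`), and replaces the editor's resampling with a simple block average. The Unsharp Mask step is left out.

```python
def shrink_by_averaging(pixels, factor=3):
    """Shrink a grayscale image (list of rows, values 0-255) by averaging
    each factor x factor block of pixels into one output pixel."""
    # Ignore any leftover rows/columns that do not fill a whole block.
    height = len(pixels) - len(pixels) % factor
    width = len(pixels[0]) - len(pixels[0]) % factor
    small = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [pixels[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(round(sum(block) / len(block)))
        small.append(row)
    return small

# Synthetic "halftone": one black dot (0) in every 3 x 3 cell of white paper (255).
halftone = [[0 if (x % 3 == 1 and y % 3 == 1) else 255 for x in range(9)]
            for y in range(9)]

smooth = shrink_by_averaging(halftone, factor=3)
print(smooth)
```

Each block of one black dot and eight white pixels collapses into a single light gray pixel, so the dot pattern disappears and only the overall tone is left. A real scan is messier, which is where the sharpening step afterward earns its keep.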

It's Not Your Paper
Newspapers are published by companies that own the copyrights to the material in them. Copyright gives the owner the exclusive right to make copies of a work as well as to distribute them, so a publisher's claim that you can't freely copy its product does have a basis in copyright law. Scanning an article for your own personal consumption may well qualify as fair use, but there you have to make your own choice.

Just don't distribute it without permission from the copyright owner. Many newspapers will grant permission to reuse their content if you just ask nicely. If your society wants to scan old newspapers, or your family association wants to put pictures on a website, do it right. Many newspapers have microfilm or digital libraries of back issues, and if you are nice to them, you might get a pleasant surprise.

Words, Not Pictures
If you are interested in converting newspaper articles into text, you might consider using an Optical Character Recognition (OCR) program. There are many of them, and they work better every year. I'm partial to OmniPage Pro, but there are others that also do a fine job. If you want to run an OCR program on a page, be sure to get a black and white image, instead of color or grayscale. You want a high contrast between the letters and the page for the best results. Some users like to scan a bit larger than actual size, say 25% larger, and then shrink the scan back down to actual size to smooth things out. I never go to the trouble.
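If your scanner only gives you grayscale, the conversion to black and white is just a cutoff: everything darker than some value becomes ink, everything lighter becomes paper. The sketch below shows the idea in plain Python. The function name, the sample values, and the cutoff of 128 are all assumptions for illustration; a badly yellowed page may need a different cutoff.

```python
def to_black_and_white(pixels, threshold=128):
    """Map grayscale values (0-255) to pure black (0) or pure white (255)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]

# Hypothetical sample: dark ink strokes on yellowed (light gray) paper.
scan = [
    [200, 190,  60, 195],
    [205,  50,  55, 198],
    [210, 185,  45, 202],
]

# Print the result as text: '#' for ink, '.' for paper.
for row in to_black_and_white(scan):
    print("".join("#" if value == 0 else "." for value in row))
```

Most scanner software and image editors offer the same thing as a "threshold" or "line art" setting, so you rarely need to do this by hand.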

More Information
For links and more information about scanning newspapers, please see the RootsWorks site. If you want to discuss your scanning challenges, please drop by the RootsWorks Forums. Registration is free, and I'd be interested to know what kinds of issues you are facing.


The RootsWorks series of articles focuses on genealogical applications for generic technologies. Beau would like to hear from you. Whether you have something to add or something to ask, please point your browser to www.rootsworks.com/forums and discuss this or any topic related to the use of technology in family history. Tell us about your scanning experiences. Please note that he cannot assist you with your individual computer problems. Visit the RootsWorks website for links to previous articles and Beau's lecture schedule.

Copyright 2003, MyFamily.com.

