World Archives Project: International Characters and Diacritics

From Ancestry.com Wiki

Revision as of 20:42, 27 April 2011

There are many projects within the World Archives Project that use international characters or diacritics. In order to maintain the integrity of documents, keyers are asked to key all diacritical marks as seen on the original image.

What is a diacritic?

A diacritic is a glyph that accompanies a letter and is mainly used to change the sound of the letter. A diacritical mark typically appears above or below a letter, but it can also be positioned within the letter itself.

Examples: Ã Ę Ł

Map of International Characters used in the World Archives Project

File:AWAP International Characters.png


How to enter diacritical marks in the keying tool

When you key letters that have diacritical marks, use only the international character set provided or the numeric shortcut shown in the keying tool.

NOTE: Not all diacritics will have a shortcut displayed. Do not use other keyboard shortcuts.

Other key combinations can produce a diacritic that looks the same on screen, but a character entered that way will not be recognized during processing in the World Archives tool and will cause errors.
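One general way this kind of mismatch can arise is that Unicode allows the same visible letter to be stored as different sequences of code points. The sketch below illustrates that idea only; it does not describe how the World Archives tool processes text internally, which this page does not specify. The helper name `code_points` is hypothetical.

```python
import unicodedata

# Two ways to produce a character that displays as "ę".
precomposed = "\u0119"   # one code point: LATIN SMALL LETTER E WITH OGONEK
combined = "e\u0328"     # the letter "e" followed by COMBINING OGONEK


def code_points(text):
    """Return the code points in text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(precomposed))   # ['U+0119']
print(code_points(combined))      # ['U+0065', 'U+0328']

# The two strings look alike but do not compare as equal.
print(precomposed == combined)    # False

# Normalizing to the composed form (NFC) makes them match.
print(unicodedata.normalize("NFC", combined) == precomposed)  # True
```

A system that compares entries without normalizing them would treat these two strings as different, which is one reason to enter every diacritic through the same character set.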

Known problems with Windows XP

Some older Microsoft Windows operating systems do not fully support certain diacritics. The default Windows XP fonts, for example, do not include some diacritics, such as the Romanian T-comma. When your operating system cannot display a particular diacritic, words in the wiki pages or in the keying tool (particularly during review and arbitration) may look as though a box has been inserted in place of a letter.

Example: Mie□dzyrzecz instead of Międzyrzecz.

These errors typically occur with letters that have a ring (such as ů), a comma or cedilla (ș ş), or, as in the example above, the E caudate (ę).
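The letters listed above are separate Unicode characters, and the comma and cedilla forms of s are easy to confuse by eye. As a rough illustration (not part of the keying tool), the following sketch prints the code point and official Unicode name of each one using Python's standard `unicodedata` module. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name) for a single character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))


# ů, ș, ş, ę written as escapes so the example does not depend on fonts.
for ch in "\u016f\u0219\u015f\u0119":
    code, name = describe(ch)
    print(code, name)
```

The second and third characters differ only in the mark below the s (comma versus cedilla), yet they have different code points, so keying one in place of the other produces a different character.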

Windows XP Remedies

One way to remedy this is to upgrade your operating system to Windows Vista or Windows 7. If upgrading is not possible and you are running Windows XP, you may be able to install the European Union Expansion Font Update, which adds support for many missing diacritics:

http://www.microsoft.com/downloads/en/details.aspx?FamilyID=0ec6f335-c3de-44c5-a13d-a1e7cea5ddea&DisplayLang=en

Note: Ancestry.com and the Ancestry World Archives Project are not responsible for any third-party software updates. Installing any update is done at your own risk.

Notes for Arbitrators and Reviewers

Arbitrators and reviewers who are using Windows XP: If you see the □ displayed when you are reviewing or arbitrating an image set, consider installing the font update described above. Because you cannot see the correct diacritic on screen, key it using the international character symbols provided so that the proper diacritic is entered. If all keyers enter diacritics the same way, the system will not flag them during arbitration as discrepancies to be arbitrated.
