World Archives Project: International Characters and Diacritics

From Ancestry.com Wiki

Current revision as of 06:24, 5 October 2012

There are many projects within the World Archives Project that use international characters or diacritics. In order to maintain the integrity of documents, keyers are asked to key all diacritical marks as seen on the original image.

What is a diacritic?

A diacritic is a glyph that accompanies a letter, mainly to change the sound of that letter. A diacritical mark typically appears above or below a letter, but it can also be positioned within the letter itself.

Examples: Ã Ę Ł

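Map of International Characters used in the World Archives Project

[Image: AWAP International Characters.png] This is the screen you will see in the PC version of the keying tool.

[Image: Internationalcharacters2.png] This is the screen you will see in the Mac version of the keying tool.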

How to enter diacritical marks in the keying tool

In some cases you will need to enter international characters that are not found on your computer's keyboard. To enter one, click the International Characters icon located in the menu bar just above the form where data is entered. Once the International Characters window appears, click the character you wish to enter to highlight it, then click the Insert button. Alternatively, double-click the character to insert it directly.

When you key letters that have diacritical marks, you should only use the international character set provided or the numeric shortcut shown in the keying tool. In the image above, the capital letter A with acute (Á) is highlighted. This letter also has a numeric shortcut of Alt+0193.
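For readers curious about where a number such as 0193 comes from, here is a minimal sketch. It assumes the common Windows behaviour in which Alt+0nnn codes are read as byte values in the Windows-1252 code page; that is an assumption about the operating system, not something documented for the keying tool. The helper name `alt_code_to_char` is hypothetical.

```python
import unicodedata


def alt_code_to_char(code: int) -> str:
    """Return the character for an Alt+0nnn code.

    Assumption: the code is a byte value in the Windows-1252 code page.
    """
    return bytes([code]).decode("cp1252")


ch = alt_code_to_char(193)
print(ch, unicodedata.name(ch))  # Á LATIN CAPITAL LETTER A WITH ACUTE
```

This is only an illustration of the numbering. When keying, use the shortcut exactly as the keying tool displays it.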

Note: Not all diacritics will have a shortcut displayed. Do not use other keyboard shortcuts.

Other key combinations can produce a diacritic that looks the same on screen, but the resulting character will not be recognized during processing in the World Archives tool and will cause errors.
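The page does not say which alternative methods cause problems. One plausible cause, offered here only as an assumption, is that different entry methods can store different underlying code points for a character that looks identical on screen. The sketch below compares a single precomposed Á with the letter A followed by a combining acute accent. The helper name `code_points` is hypothetical.

```python
import unicodedata

precomposed = "\u00c1"  # Á stored as a single code point
combined = "A\u0301"    # A followed by COMBINING ACUTE ACCENT


def code_points(text: str) -> list[str]:
    """List the code points in text as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in text]


print(code_points(precomposed))  # ['U+00C1']
print(code_points(combined))     # ['U+0041', 'U+0301']

# The two strings render alike but do not compare equal.
print(precomposed == combined)   # False

# Normalizing to NFC composes the pair into the single code point.
print(unicodedata.normalize("NFC", combined) == precomposed)  # True
```

Whether the World Archives tool compares entries in this way is not stated in the source. The sketch only shows why two entries that look the same may still be treated as different.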

Known problems with Windows XP

Some older Microsoft Windows operating systems do not fully support certain diacritics. The default Windows XP fonts, for example, do not support some characters such as the Romanian T-comma. When your operating system does not recognize a particular diacritic, words in the wiki pages or in the keying tool (particularly during review and arbitration) may look like they have a box inserted in place of a letter.

Example: Mie□dzyrzecz instead of Międzyrzecz.

These errors typically occur with letters that have a ring (such as ů), a comma or cedilla (ș, ş), or, as in the example above, an ogonek (ę).
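The comma and cedilla forms of s are easy to confuse by eye. As a small illustration, the following sketch prints the Unicode code point and official character name for each of the letters mentioned above, using Python's standard `unicodedata` module. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Escapes are used so the exact code points are unambiguous.
letters = [
    "\u016f",  # u with ring above
    "\u0219",  # s with comma below
    "\u015f",  # s with cedilla
    "\u0119",  # e with ogonek
]

for ch in letters:
    print(ch, describe(ch))
```

The output shows that s with comma below (U+0219) and s with cedilla (U+015F) are separate characters, even though they can look nearly identical in some fonts.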

Windows XP Remedies

One way to remedy this is to upgrade your operating system to Windows Vista or Windows 7.

If upgrading is not possible and you are running Windows XP, you may be able to install the European Union Expansion Font Update (http://www.microsoft.com/downloads/en/details.aspx?FamilyID=0ec6f335-c3de-44c5-a13d-a1e7cea5ddea&DisplayLang=en), which adds support for six additional characters: Ș, ș, Ț, ț, Ѝ, ѝ (S and T with comma, Cyrillic I with grave). If these characters already display correctly on your computer, you do not need to install the update.

Note: Ancestry.com does not endorse, guarantee or provide support for any of these products.

Notes for Arbitrators and Reviewers

Arbitrators and reviewers using Windows XP: if you see □ displayed while reviewing or arbitrating an image set, consider installing the font update described above. Because you cannot see the correct diacritic on screen, key it using the international character set provided to ensure that the proper diacritic is entered. If all keyers enter diacritics the same way, the system will not flag them as discrepancies during arbitration.
