1. Properly Exporting / Importing Characters
See the examples in the lettered paragraphs below, in four parts: Illegal Characters, Special Symbols, Non-alphabetical Characters and Foreign Languages. These are followed by a solution or two.
Part I: Illegal Characters
a. In English, there is a difference between ’ and its dissimilar twin ': Jacob’s or Jacob's? (Note the difference: the first apostrophe is slanted, whilst the second one is straight.)
b. – means: - (a simple dash, as in Mars Record Code 4315)
c. – = - (a simple dash)
d. Ch’n means: ' (a simple apostrophe, as in “Ch’n” of Record Code 3337)
e. English extended characters like an apostrophe become mangled: “Intâ€™l” should be “Int’l”. The same goes for “Sollyâ€™s”, which should be “Solly’s”.
f. The same with “wife?��s mobile” for “wife’s mobile”.
g. A Dash or Stroke, as in the name of the following company: “Molkerei Niesky â€“ Niesky Werk”, which should read: “Molkerei Niesky – Niesky Werk”.
h. Similarly, apostrophes that were transferred from FileMaker and opened in Excel show “â€™” instead of “’”, as in “Shackeltonâ€™s Milling Ltd”, which should read: “Shackelton’s Milling Ltd”. (Replaced over seventy (!) occurrences.)
i. Similarly, quotation marks “like these” that were transferred from FileMaker and opened in Excel show “â€œ” instead of the opening “.
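Damage of this kind usually arises when text saved as UTF-8 is opened by a program that assumes a single-byte Windows code page. The Python sketch below reproduces the effect, assuming the export was UTF-8 and the reader used Windows-1252 (an assumption; the actual code page depends on the Excel and Windows setup). The helper name `misread_utf8_as_cp1252` is made up for illustration.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Show what UTF-8 text looks like when its bytes are read as Windows-1252."""
    raw = text.encode("utf-8")   # the bytes the exporting program writes
    return raw.decode("cp1252")  # what a reader assuming Windows-1252 displays


# Assumed sample strings, modelled on the examples in Part I.
for sample in ["Jacob’s", "Int’l", "Jacob's"]:
    print(sample, "->", misread_utf8_as_cp1252(sample))
```

The straight apostrophe survives because it is plain ASCII, which has the same byte value in both encodings; only the slanted one is stored as three bytes and therefore turns into three characters.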
Part II: Special Symbols / Characters
a. The Registered Symbol (®) needs to be repaired, as it came over as “®” or “®”.
Part III: Non-alphabetical, non-special Characters
These are also found in search strings copied from the Address Field of a web browser. When Paste Special is used to paste such a string into Word, the search operands and code strings are sometimes converted into computer lingo, i.e. percent-encoded byte values (see [i]):
a. Dash (-): %E2%80%8E (strictly, these are the UTF-8 bytes of the invisible left-to-right mark, U+200E; a plain hyphen is %2D)
b. Underscore or horizontal bar (_): %5F
c. Period or full stop (.): %2E
d. Comma (,): %2C
e. Slash (/): %2F (the backslash, \, is %5C)
f. Colon (:): %3A
g. Ampersand (&): HTML: &amp;
h. Chevron (< or >): HTML: &lt; or &gt;
i. Quotation mark ("): %22
j. Open quotation mark (“): %9C (the final byte of its UTF-8 sequence, %E2%80%9C)
k. Close quotation mark (”): %9D (the final byte of its UTF-8 sequence, %E2%80%9D)
l. HTML code equivalent “”: %80
m. Plus sign (+): %2B
n. â (small a, circumflex accent): %E2
o. <: &lt;
p. HTML code equivalent “”: %8E
q. >: &gt;
r. Double apostrophe (quotation mark), also in the Hebrew character set: "
s. weeks' for weeks’ (apostrophe); the upper apostrophe (') in the Hebrew character set: '
t. Non-breaking space
u. When I copy a non-breaking hyphen from MSWord into FileMaker, it pastes a ¨
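The %XX codes in this part are percent-encoded bytes, so a sequence can be checked by decoding it rather than by guessing. The following is a minimal Python sketch using the standard library's `urllib.parse`; the helper name `describe` is hypothetical, and the sketch assumes the bytes are UTF-8, which is the default for `unquote`.

```python
import unicodedata
from urllib.parse import quote, unquote


def describe(encoded: str) -> list:
    """Decode a percent-encoded string and name each resulting character."""
    decoded = unquote(encoded)  # interprets the %XX bytes as UTF-8 by default
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}" for ch in decoded]


print(describe("%E2%80%8E"))  # the three-byte sequence listed under "Dash"
print(describe("%2F"))        # a single-byte code
print(quote("“quoted”"))      # the reverse direction: text to percent-encoding
```

Decoding whole sequences matters because a character outside ASCII occupies several bytes in UTF-8; a lone %9C or %80 is only a fragment of such a sequence and does not stand for a character on its own.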
Part IV: Foreign Languages
a. Wherever the words “Tel. ” or “Fax. ” appear in Contact Fields, delete them, as the dialler converts the letters to their keypad digits and shows them as “835.” and “329.” respectively.
b.
A space in Acrobat PDF
originating in the USA, turns into a in FileMaker.
c. Turkish characters are problematic, as they’re not recognised at all by Excel, like the name of the following company: “EVL?YA ?EKERLEME SAN. VE T?C. LTD. ?T?”, which should read: “EVLİYA ŞEKERLEME SAN. VE TİC. LTD. ŞTİ”, and “LÃ¼tfÃ¼ TÃ¼rkÃ¶zÃ¼”, which should read: “Lütfü Türközü”.
d. Spanish characters like “á” are problematic, as they’re not recognised at all by Excel, like the following name: “Andrea SÃ¡nchez”, which should read: “Andrea Sánchez”.
e. Characters like “â€” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Ingenio Azucarero â€œRoberto Barbery Pazâ€”, which should read: “Ingenio Azucarero “Roberto Barbery Paz””.
a. The following four ingredients were exported as an Excel file from FileMaker Pro (UTF-8); the fifth is from the Contacts:
i. Ac CaprÃlico C-8 = Ac Caprílico C-8
ii. Ac LÃ¡ctico = Ac Láctico
iii. Ac MirÃstico = Ac Mirístico
iv. Ac.AcÃ©tico = Ac.Acético
v. GaÃ©tan = Gaétan
f. Portuguese characters like “ã” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Amorim & IrmÃ£os., SA - Equipar Plant”, which should read: “Amorim & Irmãos., SA - Equipar Plant”.
g. JosÃ© = José
h. AsÃÂn = Asín
i. FrÃ©dÃ©ric = Frédéric
j.
Micheál = Michel
k.
?¹ = “Ö” from “Österreich”
l.
f??r = für
m. Swedish characters like “ö” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Arla Foods GÃ¶tene Ost”, which should read: “Arla Foods Götene Ost”.
n. The German “Jörg” becomes “J�rg”.
o. Scandinavian characters are problematic, as they’re not recognised at all by Excel, like the name of the following company: “HÃ¸gelund Dairy”, which should read: “Høgelund Dairy”.
p. Polish characters are problematic, as they’re not recognised at all by Excel, like the following names:
a. The company “SpÃ³ldzielnia Mleczarska”, which should read: “Spóldzielnia Mleczarska”
b. “Å‚” in “Malwina GaÅ‚ach”, which should read: “Malwina Gałach”
c. “KrÄ…Å¼yÅ„ska”, which should read: “Krążyńska”
d. The same for WrÃ³blewski / Wróblewski.
q. Irish characters are problematic:
a. Ã and ¡ (an inverted exclamation mark, like a lowered i) appear in place of á, so Micheál (Michael in English) comes out in the export file as “MicheÃ¡l”.
r. German characters are problematic, as they’re not recognised at all by Excel, like the following name: “RÃ¼ckert”, which should read: “Rückert”.
s.
The õ character creeps in
when doing an export and then import…
t. French ê from a factory name: “PÃªcheur” (Sensient) should read: “Pêcheur”.
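Where the damage follows the pattern of UTF-8 bytes displayed as Windows-1252 characters (an "Ã" or "Å" followed by a symbol), it can often be reversed mechanically before the data is imported again. The following Python sketch is one way to do that; it is not a FileMaker or Excel feature, the function name `repair_mojibake` is hypothetical, and the choice of Windows-1252 is an assumption about how the file was misread.

```python
def repair_mojibake(text: str, max_rounds: int = 3) -> str:
    """Undo UTF-8-read-as-Windows-1252 damage; leave anything else unchanged."""
    for _ in range(max_rounds):  # text damaged twice needs two passes
        try:
            # Turn the wrong characters back into bytes, then read them as UTF-8.
            fixed = text.encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # not this kind of damage (or already clean): stop here
        if fixed == text:
            break  # pure ASCII, nothing left to repair
        text = fixed
    return text


# Assumed sample values following the pattern described above.
for damaged in ["Andrea SÃ¡nchez", "RÃ¼ckert", "Malwina GaÅ‚ach", "Andrea Sánchez"]:
    print(damaged, "->", repair_mojibake(damaged))
```

The function works on the whole string at once, so a value that mixes damaged and correct accented characters is returned unchanged. Characters that were replaced by "?" or "�" during export cannot be recovered this way, because the original bytes are already gone.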
I’ll start with the last problem: Foreign languages
Copy the text containing the foreign-language characters from the source document (e.g. an Outlook eMail message) and paste it into MSWord. Then Cut & Paste the text from MSWord into FileMaker. Try to write a Script that will do all of it for you on the fly: launch MSWord, paste the text that you copied at source, cut it again and paste it into FileMaker.
Try using the following functions to ‘massage’ the text further: Replace(), TextStyleRemove(), TextStyleAdd(), Proper(), Lower(), Upper().
The instructions on the FileMaker Pro Gurus Website (see [ii]) will help with cleaning the data so that it is prepared for importing back into FileMaker (or any other database): see next page.
Remember that it is possible to set the import and export Character Set to one of the following:
· ANSI
· ASCII
· Unicode
· UTF-8 (8-bit Unicode Transformation Format), a variable-length character encoding for Unicode that is backwards compatible with ASCII
· Mac
· ISO-8859-1 (informally also called Latin-1), an 8-bit character set for Western European languages
· Code Page
The resulting text in the output file will depend on
the selection you made from the Character Sets in the above list, which
explains why sometimes the exported data contain strange characters, as shown
in the above examples.
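To see how much the chosen Character Set matters, the same name can be written out under several encodings and then read back the way a Western Windows program might. The Python sketch below does this. It assumes that "ANSI" corresponds to Windows-1252 and "Mac" to Mac Roman, which holds for Western European setups but is an assumption here, and the function name `export_then_open` is made up for illustration.

```python
NAME = "Jörg"


def export_then_open(text: str, export_charset: str, reader_charset: str = "cp1252") -> str:
    """Encode text with one character set, then decode it with another."""
    raw = text.encode(export_charset, errors="replace")   # the export step
    return raw.decode(reader_charset, errors="replace")   # the import step


# Python codec names standing in for the ANSI, UTF-8, Mac and ASCII settings.
for charset in ["cp1252", "utf-8", "mac_roman", "ascii"]:
    print(f"{charset:10} -> {export_then_open(NAME, charset)}")
```

Only the row where the export and the reader agree comes back intact. Exporting as ASCII loses the character for good, since "ö" has no ASCII code and is replaced by a question mark.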