Bikur Cholim בקור חולים

Tuesday, December 20, 2016

Data Import - Part IX

More pointers on cleaning data in preparation for transferring (or “migrating”) them from one FileMaker Pro file to another, especially if they are in different versions:
1.                   Try RefreshFM from Goya (See: [i]), which is designed to help streamline data transition from one (old) FileMaker solution to another (new) FileMaker system.
2.                   When trying to find text in FileMaker, remember to work by the rules: Fields containg quotation marks can only be found if you prefix the text string with a backslash:\.
3.                   Before anything, clean the data first:
a.                   The text from MSWord will come across with an added – invisible character.
1.       In Excel: If the text results with a Question Mark (“?”) at the end of every Cell, copy and paste into Notepad, Find & Replace “?” with “” (no text, blank) and Copy & Paste back into the spreadsheet.
2.       In FileMaker: If the text results with a “ý” to the same as the above.
a.       Status Check: Is all the Data still intact?  Are the IDs in the proper order still?
b.      Save the changes in a separate Spreadsheet, so you can roll back to it if the changes in the next step go pear shape.
3.       In Chrome: it comes across as “‎”.
                                           ii.            Arrange physically entered blank lines:
                              iii.            Find the following: ^l
                             iv.            & Replace all Line Feeds* with   ^p^t
                        (*as opposed to Paragraph Returns or Hard Carriage Returns)
                               v.            Then Copy & Paste the text into a new Excel Spreadsheet
1.       Status Check: Is all the Data still intact?  Are the IDs in the proper order still?
2.       Save the changes in a separate Spreadsheet, so you can roll back to it if the changes in the next step go pear shape.
b.                  Quotation Marks
1.       Sometime there are quotation marks, which are unnecessary; delete them by doing a Find&Replace.
                                                             ii.      Space Invaders:
                                                                                          iii.                        Many Records have only visually blank Fields, as during Data Entry, the Clerk sometimes failed to hit the Carriage Return to ‘create’ a blank line, when the last item/s didn’t have anything to enter.  However this shouldn’t be a concern as if there is no data then it might as well be without a blank line.
                                                                                         iv.                        Check wherever the character “0” appears, which was entered in the Manufacturer, Kosher Org, Ingredients Columns - instead of a line entry by Suzy Schleider, as seen in 10857470 and hundreds more.
These blank ‘spaces’ need to stay in the correct place, level with all other corresponding pieces of data: Ingredients, Supplier, Manufacturer, Kosher Organisation, Class.
To solve this problem, we need to replace all occurrences of a Value with a Primary (or unique) ID / Serial Number / Code, so that the system can have a separate list of data sets with their Unique ID’s that are ‘consulted’ whenever this unique ID is looked up.

Data Import - Part VIII

1.       Properly Exporting / Importing Characters – see examples in paragraphs a. to aa. below, in three parts: Illegal Characters, Special Symbols and Foreign languages; this is followed by a solution or two: 
Part I:          Illegal Characters
a.                            In English, there is a difference between ‘ and ' - its dissimilar twin: Jacob’s or Jacob's? (note the difference: the first apostrophe is slanted whilst the second one in straight).
b.                           – means: - ( = a simple dash, as in Mars Record Code 4315)
c.                            – = - ( = a simple dash)
d.                           Ch’n means: ‘ ( = a simple apostrophe, as in “Ch’n” of Record Code 3337)
e.                           English extended characters like an apostrophe becomes fangled: “Int’l” should be “Int’l”.  Same with “Solly’s” for: Solly’s
f.                             The same with “wife?��s mobile” for “wife’s mobile”
g.                            A Dash or Stroke: like the name of the following company: “Molkerei Niesky – Niesky Werk”, which means: “Molkerei Niesky – Niesky Werk”.
h.                           Similarly the apostrophes that were transferred from FileMaker and opened in Excel show the following text instead of “ ‘ “: “’”, as in “Shackelton’s Milling Ltd”, which should read: “Shackelton’s Milling Ltd”. (Replaced over seventy (!) occurrences).
i.                              Similarly the quotation marks “ “like these” ” that were transferred from FileMaker and opened in Excel show the following text instead of “ ‘ “: ““”. 
Part II:         Special Symbols / Characters
a.       The Registered Symbol (®) need to be repaired, as it came over as “®” “or “®”.
Part III:       Non-alphabetical, non-special Characters, also found in search string that were copied from the Address Field in a web browser and when Paste Special was used to paste it into Word, the search operands and code strings are sometimes converted into computer lingo (See: [i])…
a.       Dash: -                                                                                                                                      %E2%80%8E
b.      _ (Horizontal bar or Underscore)                                                                                  %5F
c.       . (Period or fullstop):                                                                                                          %2E
d.      Comma (or , ):                                                                                                                       %2C
e.      Slash: \ /                                                                                                                                  HTML: %2F
f.        Colon: :                                                                                                                                     %3A
g.       Ampersand or ( & ):                                                                                                            HTML: &
h.      Chevron: < or >
i.         Quotation mark “”                                                                                                             HTML: %22
j.        Open Quotation mark “”                                                                                                HTML: %9C
k.       Close Quotation mark “”                                                                                                 HTML: %9D
l.         HTML Code Equivalent “€”                                                                                   %80
m.    + (Plus sign)                                                                                                                            %2B
n.      â (Small a, circumflex accent)                                                                                          %E2
o.      <                                                                                                                                                 <
p.      HTML Code Equivalent “Ž”                                                                                   %8E
q.      >                                                                                                                                                 >
r.        Double Apostrophe (Quotation Mark) - also in the Hebrew Character set  "
s.       weeks' for weeks’ (apostrophe) - Upper Apostrophe (') in the Hebrew Character set '
t.        non-breaking space                                                                                                             
u.      when I copy a non-breaking hyphen from MSWord into FileMaker, it pastes a ¨
Part IV: Foreign languages
a.       Wherever the words “Tel. ” or “Fax. ” Appear in Contact Fields, delete them, as they appear in the dialler as “835.” And “329.” Respectively.
b.      A space in Acrobat PDF originating in the USA, turns into a in FileMaker.
c.       Turkish characters are problematic, as they’re not recognised at all by Excel, like the name of the following company: “EVL?YA ?EKERLEME SAN. VE T?C. LTD. ?T?”, which means: “EVLİYA ŞEKERLEME SAN. VE TİC. LTD. ŞTİ” and “Lütfü Türközü” means “Lütfü Türközü”…
d.      Spanish characters like “á” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Andrea Sánchez”, which means: “Andrea Sánchez”.
e.      Spanish characters like “”” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Ingenio Azucarero “Roberto Barbery Paz””, which means: “Ingenio Azucarero “Roberto Barbery Paz””.
a.       The following four ingredients were exported as an Excel file from FileMaker Pro – UTF-8, the fifth is from the Contacts:
                                                                           i.      Ac Caprílico C-8 = Ac Caprílico C-8
                                                                         ii.      Ac Láctico = Ac Láctico
                                                                        iii.      Ac Mirístico = Ac Mirístico
                                                                       iv.      Ac.Acético = Ac.Acético
                                                                         v.      Gaétan = Gaétan
f.        Spanish characters like “ã” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Amorim & Irmãos., SA - Equipar Plant”, which means: “Amorim & Irmãos., SA - Equipar Plant”.
g.       José = José
h.      Asín = Asín
i.         Frédéric = Frédéric
j.        Micheál = Michel
k.       ?¹ = “Ö” from “Österreich”
l.         f??r = für
m.    Swedish characters like “ö” are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Arla Foods Götene Ost”, which means: “Arla Foods Götene Ost”.
n.      The German “Jörg” becomes “Jrg”
o.      Scandinavian characters are problematic, as they’re not recognised at all by Excel, like the name of the following company: “Høgelund Dairy”, which means: “Høgelund Dairy”.
p.      Polish characters are problematic, as they’re not recognised at all by Excel, like the following names:
a.       The company: “Spóldzielnia Mleczarska”, which means: “Spóldzielnia Mleczarska
b.      “Å‚” in “Malwina GaÅ‚ach”, which should read “Malwina Gałach”
c.       “KrążyÅ„ska”, which means “Krążyńska”. 
d.      The same for Wróblewski / Wróblewski. 
q.      Irish characters are problematic
a.       à and ¡ (lowered i) as in Micheál (for Michael in English) comes out in the export file as “Micheál).
r.        German characters are problematic, as they’re not recognised at all by Excel, like the following name: “Rückert”, which means: “Rückert”.
s.       The õ character creeps in when doing an export and then import…
t.        French ê from a factory name “Pêcheur” (Sensient), should read: “Pêcheur”
I’ll start with the last problem: Foreign languages
Copy the text with the foreign language characters in them from the source document (i.e. an Outlook eMail message) and paste it into MSWord.  Then Cut & Paste the text into FileMaker.  Try to write a Script that will do all of it for you on the fly: Launch MSWord, Paste the text that you copied at source, Paste into FileMaker.  Try using the following Scripts to ‘massage’ the text further: Replace, TextStyleRemove, TextStyleAdd, Proper (),Lower(), Upper().
The instructions below come from the FileMaker Pro Gurus Website (See [ii]), which will help for cleaning the data so it is prepared for importing back into FileMaker (or any other database): See next page.
Remember that it is possible to set the import and export Character Set setting to one of the following:
·         ANSI
·         ASCII
·         Unicode
·         UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode, which is backwards compatible with ASCII.
·         Mac
·         ISO-8859-1 (informally also called Latin-1) is an 8-bit character set for Western European languages
·         Code Page
The resulting text in the output file will depend on the selection you made from the Character Sets in the above list, which explains why sometimes the exported data contain strange characters, as shown in the above examples.

Data Import - Part VII

Before Importing:
                                                              i.      Delete all blank records before starting the Import procedure, leaving only the records with Data in them, to shave off precious minutes from this operation.
                                                            ii.      Filter the Data Set that you want to export, by doing a Find with a specific range of dates, numbers or other Data – that will leave you with visible Records that are relevant and the irrelevant Records will be hidden (i.e. will not be exported).  This will save a lot of time.
                                                          iii.      Add another Column with Data that may be required when importing, such as:
1.      Data for the “Field Label Data” Field (where there is a Calculation that assigns a Label from a Value List, as for example is the case with Contacting_Label).
2.      Data for the “Unique ID” Field – remember to copy the Data in the “next value” Field in the Auto-Enter Tab of the Options for Field “Unique ID” Dialogue Box.  Also remember to copy the next value after the value found in the last record to be imported; increment by 1 = next Value.
3.      If you follow the advice contained in the previous paragraph, don’t tick the box to “Perform Auto-Enter (modification date, serial number, lookups, etc.)” during the Data Import, as shown in the Import Options Dialogue Box – the last step before executing any Data import.
4.      Company Status to denote whether this Record is for a Company, Associate or other type.
                                                          iv.      Make sure that
1.      Similar Fields (i.e. ID or Unique ID) are not mixed up, to preserve Data fidelity!
2.      Any Fields below the matched Fields are not ticked with an arrow, otherwise you’re overwriting all these Fields with blank Cells in the Source Spreadsheet (if importing an Excel Sheet).

    1. Open the export file (i.e. Test_02.fmp12): File->Import->Files of Type: FileMaker Files (*.fmp12)->File Name and click the Open Button.
    2. Choose “matching names” in the Arrange by Drop-down List. 
    3. Maximise the Combox (drag the bottom right-hand-corner, where you will see six dots in a small triangle), so you can see three times as many Field Names in one view; this will make it easier for you to ‘catch’ any unmatched Fields.
    4. Match up any Field names that haven’t automatically been paired with their counterparts; if needed, create new Fields in the Target Database, to accommodate those Fields that have no pair yet.
    5. Test the data by pressing the “Field Names [>>]” Button, the first of which should be showing the Field Names in the Source Field Column on the left hand side of the Dialogue Box.
    6. Look up the Import Data section in the RFC file, where there are many other points to consider before finalizing a data import to FileMaker