
7 August 2007

Saving Chinese Characters with Export to KML

A cool extension for ArcMap is Export to KML, which lets you choose a label field and the fields you want to appear as descriptions in the output KML file.

This extension works really well and preserves the original encoding of the attributes. If you export from one of the UTF-8 CHGIS shapefiles, your characters should be fine. However, if you do any preprocessing with other ArcToolbox tools, the characters may come out garbled.

I encountered this problem when exporting CHGIS Time Series data to KML. It turns out that the Time Series POINT layers have the data type set to MULTIPOINT. If you want to use Export to KML, you will have to convert the MULTIPOINT to POINT first, using ArcToolbox.

After you use ArcToolbox to convert to the POINT data type, the Chinese characters in the attribute table will appear garbled. You can still go ahead and use the Export to KML extension, then repair the output in two steps: first with a freeware tool, BabelPad, and then by editing in DOS Edit.

Let us assume you have a file called export.kml, which has been converted to POINT in ArcToolbox, then exported to KML. In BabelPad, browse to the file export.kml, and before opening, be sure to set the Encoding value to the last item on the drop-down list:

CESU-8: Compatibility Encoding Scheme for UTF-16


Open the file and the Chinese Characters should be okay.

Finally, save the file as:

UTF-8: Unicode 8 bit transformation format
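For anyone who would rather script the BabelPad step, here is a minimal Python sketch of the same re-encoding. It is not what BabelPad does internally, just one way to read CESU-8 bytes and write standard UTF-8. Python has no built-in CESU-8 codec, so the sketch decodes leniently and then joins surrogate pairs through a UTF-16 round trip. The function name `cesu8_to_utf8` and the file names in the usage note are made up for illustration.

```python
def cesu8_to_utf8(raw: bytes) -> bytes:
    """Re-encode CESU-8 bytes as standard UTF-8 (illustrative helper)."""
    # CESU-8 stores characters outside the BMP as two 3-byte surrogates.
    # "surrogatepass" lets the UTF-8 decoder accept those surrogates.
    text = raw.decode("utf-8", errors="surrogatepass")
    # A UTF-16 round trip merges each surrogate pair into one character.
    # An unpaired surrogate raises UnicodeDecodeError here.
    text = text.encode("utf-16-le", errors="surrogatepass").decode("utf-16-le")
    return text.encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Read a CESU-8 file and write it back out as UTF-8 without a BOM."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(cesu8_to_utf8(raw))
```

Calling `convert_file("export.kml", "export_utf8.kml")` would write a new file next to the original, so the exported KML is left untouched if something goes wrong.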

Once you have saved the file with BabelPad, the UTF-8 characters should be fine for viewing in GoogleEarth. However, BabelPad often adds a short header at the very start of the file (most likely a byte-order mark), and you need to DELETE it using MS-DOS Edit.

To accomplish this, place the saved UTF-8 KML file in a folder that you can easily navigate to in DOS Edit. Open the Command Prompt and change directory to that folder. Now fire up DOS Edit. If there are any stray characters at the beginning of the file (before the opening XML declaration, <?xml ), DELETE them!

Make sure the first line calls for [ encoding="UTF-8" ]. Now save the file.
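If DOS Edit is not handy, the same cleanup can be sketched in a few lines of Python. This is only an illustration of the manual step, assuming the junk is whatever bytes come before the `<?xml` declaration; the function names `strip_leading_junk` and `declares_utf8` are hypothetical.

```python
def strip_leading_junk(raw: bytes) -> bytes:
    """Drop any bytes (such as a byte-order mark) before the XML declaration."""
    start = raw.find(b"<?xml")
    if start == -1:
        raise ValueError("no <?xml declaration found")
    return raw[start:]


def declares_utf8(raw: bytes) -> bool:
    """Check that the first line asks for encoding="UTF-8" (case-insensitive)."""
    first_line = raw.split(b"\n", 1)[0]
    return b'encoding="utf-8"' in first_line.lower()


def clean_file(path: str) -> None:
    """Rewrite the file in place with the leading junk removed."""
    with open(path, "rb") as f:
        raw = f.read()
    cleaned = strip_leading_junk(raw)
    if not declares_utf8(cleaned):
        raise ValueError("first line does not declare UTF-8")
    with open(path, "wb") as f:
        f.write(cleaned)
```

Because `clean_file` overwrites the file, it is worth keeping a backup copy of the KML before running it.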



Your KML file should be good to go!

Open it in GoogleEarth and the Chinese characters should display correctly.


Posted by Lex Berman at August 7, 2007 5:13 PM