OK, for WikiServer 1.5.1, which I pushed to the App Store review team last night, I'm
doing Option 2: I actually run
    self name squeakToUTF8 capitalized utf8ToSqueak
This converts the String to a UTF-8 WideString, capitalizes it, then converts it back to a String.
Fortunately there is only one use of #capitalized in Pier, so the fix was easy.
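To illustrate why the round trip matters, here is a minimal Python sketch (not Squeak code) of the byte-level behaviour. It assumes the stored title holds UTF-8 bytes, one byte per character, as it would under WAKom; all helper names are made up for the illustration.

```python
def capitalize_first(s: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    return s[:1].upper() + s[1:]


def decode_utf8_bytes(raw: str) -> str:
    """Treat each character of `raw` as one byte and decode those bytes as UTF-8."""
    return raw.encode("latin-1").decode("utf-8")


def encode_utf8_bytes(text: str) -> str:
    """Encode `text` as UTF-8 and store each byte as one character."""
    return text.encode("utf-8").decode("latin-1")


def safe_capitalized(raw: str) -> str:
    """Decode, capitalize, re-encode (the shape of Option 2)."""
    return encode_utf8_bytes(capitalize_first(decode_utf8_bytes(raw)))


# Assumption: this is how a title would sit in the image as raw UTF-8 bytes.
stored = encode_utf8_bytes("中文 page")

# Capitalizing the raw bytes upper-cases the lead byte 0xE4 to 0xC4,
# which breaks the three-byte UTF-8 sequence.
naive = capitalize_first(stored)

# Capitalizing the decoded text leaves the bytes intact.
fixed = safe_capitalized(stored)
```

In this sketch `naive` no longer decodes as UTF-8, while `fixed` is byte-for-byte identical to `stored`.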
As for Option 1, I guess I'll have to look at all the Pier objects and see where
the data comes from, then build a post-loader that checks whether the database is
marked UTF-8 clean and, if it is not, runs the UTF-8 converter on the targeted
fields of all the instances. Once the Pier data model is saved I can mark it clean.
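The post-loader idea could look roughly like the following Python sketch. It is only an outline under assumptions: `Page`, `Database`, `TARGET_FIELDS` and the `utf8_clean` flag are hypothetical stand-ins, not Pier classes, and the data is assumed to be uncorrupted UTF-8 bytes stored one byte per character.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a Pier object with text fields."""
    title: str
    body: str


@dataclass
class Database:
    """Hypothetical stand-in for the saved data model."""
    pages: list = field(default_factory=list)
    utf8_clean: bool = False


# Assumption: the fields that were found to carry user-entered text.
TARGET_FIELDS = ("title", "body")


def decode_utf8_bytes(raw: str) -> str:
    """Treat each character as one byte and decode those bytes as UTF-8."""
    return raw.encode("latin-1").decode("utf-8")


def post_load(db: Database) -> int:
    """Convert the targeted fields once; return how many fields were converted."""
    if db.utf8_clean:
        return 0  # already migrated, nothing to do
    converted = 0
    for page in db.pages:
        for name in TARGET_FIELDS:
            setattr(page, name, decode_utf8_bytes(getattr(page, name)))
            converted += 1
    # The caller is expected to save the model after this point.
    db.utf8_clean = True
    return converted
```

Running `post_load` a second time is a no-op because of the flag, which is the point of marking the database clean.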
On 2010-02-04, at 1:03 PM, Philippe Marschall wrote:
Assuming you don't already have corrupted data in your image and want
to do a migration:
Option 1:
Do a utf-8 decoding on the Strings in your model and use WAKomEncoded
from that point on.
Option 2:
Hack #title method (and the other senders of #capitalized) to first do
a utf-8 decoding, then #capitalized and then utf-8 encoding. Continue
using WAKom.
Now if you already have corrupted data in your image you'll have to
clean that up. That can be tricky:
- find the potential places (senders of #capitalized, can't think of
anything else right now)
- find the actual places, e.g. try to do a utf-8 decoding on each
candidate and see if you get an exception
- undo the "capitalization"; if you can't, replace the String with
"ERROR" or something
- choose one of the options above
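The detect-and-repair steps above can be sketched in Python as follows. This is an illustration rather than Squeak code: it assumes each candidate String holds UTF-8 bytes, one byte per character, and that the only damage is an upper-cased first byte. The function names are hypothetical.

```python
from typing import Optional


def try_decode(raw: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw.encode("latin-1").decode("utf-8")
    except UnicodeError:
        return None


def repair(raw: str) -> str:
    """Keep valid strings, undo a capitalized first byte, else mark as ERROR."""
    if try_decode(raw) is not None:
        return raw  # decodes cleanly, so not one of the actual places
    # Try to undo the "capitalization" by lower-casing the first byte.
    undone = raw[:1].lower() + raw[1:]
    if try_decode(undone) is not None:
        return undone
    return "ERROR"
```

A string that still fails to decode after the undo attempt is replaced with "ERROR", as suggested in the list.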
Sorry for the inconvenience
Philippe
--
===========================================================================
John M. McIntosh <johnmci(a)smalltalkconsulting.com> Twitter: squeaker68882
Corporate Smalltalk Consulting Ltd.
http://www.smalltalkconsulting.com
===========================================================================