Decode a string encoded in UTF-8 format in Android
I have a string that comes via XML, and the text is in German. The German-specific characters are encoded in UTF-8 format. Before displaying the string I need to decode it.
I have tried the following:
    try {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        new ByteArrayInputStream(nodeValue.getBytes()), "UTF8"));
        event.attributes.put("title", in.readLine());
    } catch (UnsupportedEncodingException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
I have also tried this:
    try {
        event.attributes.put("title", URLDecoder.decode(nodeValue, "UTF-8"));
    } catch (UnsupportedEncodingException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
Neither of them works. How can I decode the German string?
Thanks in advance.
Update:
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // TODO Auto-generated method stub
        super.characters(ch, start, length);
        if (nodeName != null) {
            String nodeValue = String.copyValueOf(ch, 0, length);
            if (nodeName.equals("startdat")) {
                if (event.attributes.get("eventid").equals("187")) {
                }
            }
            if (nodeName.equals("startscreen")) {
                imageAddress = nodeValue;
            } else {
                if (nodeName.equals("title")) {
                    // try {
                    //     BufferedReader in = new BufferedReader(
                    //             new InputStreamReader(
                    //                     new ByteArrayInputStream(nodeValue.getBytes()), "UTF8"));
                    //     event.attributes.put("title", in.readLine());
                    // } catch (UnsupportedEncodingException e) {
                    //     // TODO Auto-generated catch block
                    //     e.printStackTrace();
                    // } catch (IOException e) {
                    //     // TODO Auto-generated catch block
                    //     e.printStackTrace();
                    // }
                    // try {
                    //     event.attributes.put("title",
                    //             URLDecoder.decode(nodeValue, "UTF-8"));
                    // } catch (UnsupportedEncodingException e) {
                    //     // TODO Auto-generated catch block
                    //     e.printStackTrace();
                    // }
                    event.attributes.put("title", StringEscapeUtils
                            .unescapeHtml(new String(ch, start, length).trim()));
                } else
                    event.attributes.put(nodeName, nodeValue);
            }
        }
    }
You can use the String constructor with the charset parameter:
    try {
        final String s = new String(nodeValue.getBytes(), "UTF-8");
    } catch (UnsupportedEncodingException e) {
        Log.e("utf8", "conversion", e);
    }
Also, since you get the data from an XML document, and I assume it is encoded in UTF-8, the problem is probably in how it is parsed.
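To illustrate why the parsing step matters more than re-decoding afterwards, here is a minimal sketch in plain Java. The class and method names are made up for this example and are not from the question. Once text has been decoded into a String, encoding it and decoding it again with the same charset changes nothing; garbled text can only be repaired if you know which wrong charset was applied, which is assumed to be ISO-8859-1 here.

```java
import java.nio.charset.StandardCharsets;

final class CharsetRoundTrip {
    // Encoding and decoding with the same charset returns an equal string,
    // so this cannot fix text that is already garbled.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Simulates a typical bug: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undoes misdecode(). This works only because ISO-8859-1 maps every
    // byte value to exactly one character, so the original bytes survive.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Gr\u00fc\u00dfe aus M\u00fcnchen";
        System.out.println(roundTrip(title).equals(title));
        System.out.println(repair(misdecode(title)).equals(title));
    }
}
```

If the title looks wrong on screen, it is usually more reliable to fix the charset at the point where the bytes are first turned into characters than to repair the string later.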
You should use an InputStream/InputSource instead of an XMLReader implementation, because it comes with the encoding. If you're getting the data from an HTTP response, you can either use both an InputStream and an InputSource:
    try {
        HttpEntity entity = response.getEntity();
        final InputStream in = entity.getContent();
        final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final XmlHandler handler = new XmlHandler();
        Reader reader = new InputStreamReader(in, "UTF-8");
        // The InputSource variable must be declared before it is used below
        InputSource is = new InputSource(reader);
        is.setEncoding("UTF-8");
        parser.parse(is, handler);
        // TODO: get the data from the handler
    } catch (final Exception e) {
        Log.e("parseerror", "error parsing xml", e);
    }
or just the InputStream:
    try {
        HttpEntity entity = response.getEntity();
        final InputStream in = entity.getContent();
        final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final XmlHandler handler = new XmlHandler();
        parser.parse(in, handler);
        // TODO: get the data from the handler
    } catch (final Exception e) {
        Log.e("parseerror", "error parsing xml", e);
    }
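The InputStream variant can be tried without a network connection. The sketch below is only an illustration: it feeds the parser raw bytes from a ByteArrayInputStream instead of an HTTP entity, and `SaxEncodingDemo` and `TextHandler` are hypothetical names standing in for the `XmlHandler` above. Because the parser receives bytes, it can pick the charset from the XML declaration itself. The handler appends to a StringBuilder because a SAX parser is allowed to deliver one text node in several `characters` calls.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEncodingDemo {
    // Collects all character data of the document.
    static final class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Always honour start: ch is the parser's internal buffer.
            text.append(ch, start, length);
        }
    }

    // Parses raw bytes; the parser reads the encoding from the XML declaration.
    static String parseText(byte[] xmlBytes) {
        TextHandler handler = new TextHandler();
        try (InputStream in = new ByteArrayInputStream(xmlBytes)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("Error parsing XML", e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><title>Stra\u00dfe</title>";
        System.out.println(parseText(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The same method should also handle a document declared as ISO-8859-1, as long as the bytes really are in the declared encoding.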
Update 1
Here is a sample of the complete request and response handling:
    try {
        final DefaultHttpClient client = new DefaultHttpClient();
        final HttpPost httppost = new HttpPost("http://example.location.com/myxml");
        final HttpResponse response = client.execute(httppost);
        final HttpEntity entity = response.getEntity();
        final InputStream in = entity.getContent();
        final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final XmlHandler handler = new XmlHandler();
        parser.parse(in, handler);
        // TODO: get the data from the handler
    } catch (final Exception e) {
        Log.e("parseerror", "error parsing xml", e);
    }
Update 2
As the problem is not the encoding but the source XML being escaped to HTML entities, the best solution (besides correcting the PHP so that it does not escape the response) is to use the apache.commons.lang library's handy static StringEscapeUtils class.
After importing the library, put the following in your XML handler's characters method:
    @Override
    public void characters(final char[] ch, final int start, final int length)
            throws SAXException {
        // This variable will hold the correct unescaped value
        final String elementValue = StringEscapeUtils
                .unescapeHtml(new String(ch, start, length).trim());
        [...]
    }
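To make clear what the unescaping step does, here is a small stand-in written with the standard library only. It is not the Commons Lang implementation: `GermanEntityUnescaper` is a hypothetical helper that knows just a handful of entities, and in a real project `StringEscapeUtils.unescapeHtml` is the better choice because it covers the full HTML entity set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GermanEntityUnescaper {
    // Insertion order matters: "&amp;" is handled last so that a doubly
    // escaped "&amp;uuml;" becomes "&uuml;" and is not unescaped twice.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();

    static {
        ENTITIES.put("&auml;", "\u00e4");
        ENTITIES.put("&ouml;", "\u00f6");
        ENTITIES.put("&uuml;", "\u00fc");
        ENTITIES.put("&Auml;", "\u00c4");
        ENTITIES.put("&Ouml;", "\u00d6");
        ENTITIES.put("&Uuml;", "\u00dc");
        ENTITIES.put("&szlig;", "\u00df");
        ENTITIES.put("&amp;", "&");
    }

    // Replaces each known entity with its character; unknown entities stay as-is.
    static String unescape(String s) {
        String out = s;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(unescape("Gr&uuml;&szlig;e aus M&uuml;nchen"));
    }
}
```

This also shows why the charset-based attempts had no effect: the text contains literal entity names such as `&uuml;`, which no charset conversion will turn into characters.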
Update 3
In your last code the problem is the initialization of the nodeValue variable. It should be:
    String nodeValue = StringEscapeUtils.unescapeHtml(
            new String(ch, start, length).trim());
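Besides the missing unescaping, the original initialization copies from index 0 instead of `start`. A minimal sketch of the difference, using a made-up buffer and hypothetical method names (the array passed to `characters` is generally a window into a larger parser buffer, so `start` is often not 0):

```java
final class CharactersOffsetDemo {
    // Mirrors the question's code: ignores start and copies from index 0.
    static String fromZero(char[] ch, int start, int length) {
        return String.copyValueOf(ch, 0, length);
    }

    // Mirrors the corrected code: copies the range the parser actually reported.
    static String fromStart(char[] ch, int start, int length) {
        return new String(ch, start, length).trim();
    }

    public static void main(String[] args) {
        // Pretend this is the parser's buffer; the text node starts at index 7.
        char[] buffer = "<title>Gr\u00fc\u00dfe</title>".toCharArray();
        System.out.println(fromZero(buffer, 7, 5));
        System.out.println(fromStart(buffer, 7, 5));
    }
}
```

With this buffer the first call returns the beginning of the tag rather than the title text, which is the kind of silent corruption the corrected initialization avoids.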