Caucho Forums  

  #1  
09-24-2013, 02:01 PM
tobia
Junior Member
How to handle UTF-8 data without unicode semantics?

I've been reading Quercus's documentation and studying its sources, as well as reading past threads, trying to understand how it handles string encoding, but I feel I'm still missing something.

I need to call some modules (chiefly PDO) passing Unicode non-Latin1 characters in the query text (and possibly in prepared statement parameters) as well as receiving those characters in the resulting records. I'd like to do so using PHP string variables encoded in UTF-8, those that come for example from json_decode() and go into json_encode(). I'd also like to do this without enabling unicode semantics, if at all possible.

The PDO classes internally handle java.lang.String values, both for the query text and for varchar return values; PHP string variables (ConstStringValue) are represented as a java.lang.String as well, so they could theoretically contain any Unicode character. But I'm not sure how many encoding/decoding steps they go through along the chain.

I should mention that enabling unicode semantics makes everything work beautifully out of the box, because the PHP variables are no longer encoded/decoded in strange places: they are just Unicode strings and are passed around as such.

My question is: how can I get this to work without unicode semantics, which breaks other random things?

Code:
$db = new PDO('java:comp/env/jdbc/something');

$qry = "select * from table where x = ".$db->quote(json_decode(...));
// Now $qry contains unicode characters encoded in UTF-8.

// How do I encode/decode $qry, so that it can be passed to the JDBC
// driver as a Java String with those same Unicode characters?
$cur = $db->query($qry);

$row = $cur->fetch(PDO::FETCH_NUM);
// How do I encode/decode $row values so that they come out
// encoded in UTF-8 and can be passed, for example, to json_encode()?
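
The two conversions asked about in the comments above can be pictured on the Java side. This is only a sketch under a stated assumption: that without unicode semantics a PHP string reaches Java with one char per byte (values 0-255). Whether Quercus does this at every step is not confirmed in this thread. The class and method names are hypothetical and are not part of Quercus or PDO; only the standard `java.nio.charset.StandardCharsets` constants are real API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Quercus or PDO.
final class Utf8Bridge {
    // Assumes every char in `raw` holds one byte (0-255) of UTF-8 data.
    // Narrows the chars back to bytes, then decodes those bytes as UTF-8.
    // Chars above 255 would be replaced by '?' in the narrowing step.
    static String toUnicode(String raw) {
        byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Inverse direction: encodes the text as UTF-8, then widens each
    // byte to one char so the result can be treated as a byte string.
    static String toByteString(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";               // 4 characters, last one is e-acute
        String raw = toByteString(text);          // one char per UTF-8 byte
        System.out.println(raw.length());         // 5, because e-acute takes 2 bytes
        System.out.println(toUnicode(raw).equals(text)); // true
    }
}
```

Under that assumption, `toUnicode` corresponds to the step needed before the query text goes to the JDBC driver, and `toByteString` to the step needed on fetched values before they go to json_encode().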
  #2  
01-04-2014, 08:44 AM
nam
Administrator

Without unicode.semantics, strings are just byte arrays. That is how PHP5 treats strings.
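
To make "just byte arrays" concrete, here is a small Java sketch of the difference between counting bytes and counting characters. It shows general UTF-8 behaviour, not Quercus internals; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only; the names here are hypothetical.
final class ByteVersusChar {
    // Length as byte semantics see it: the number of UTF-8 bytes.
    static int byteLength(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Length as Unicode semantics see it: the number of code points.
    static int codePointLength(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c\u8a9e";        // three CJK characters
        System.out.println(byteLength(text));       // 9
        System.out.println(codePointLength(text));  // 3
    }
}
```

With byte semantics, a length or substring operation works on the first number; with Unicode semantics, it works on the second.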

With unicode.semantics enabled, strings are Java strings. Things "breaking" with it on are most likely due to bugs. We ironed out a lot of those bugs recently, and you may want to try the upcoming 4.0.39, which should improve things a lot.

Tags
pdo, unicode, utf8
