A series of concerns (~4) from a Newbie

03-25-2011, 04:22 AM
People should never do this, but I'm going to preface with: I don't know what I'm doing. There, that's out of the way. Please, if I don't seem to be doing things right, just point me in the direction to get a better setup.

My Situation
Thanks for undertaking this project! It's a great concept, though it's tough enough to work out bugs on individual platforms, let alone bridge them. Speaking of which, I'm trying really hard to bridge Google's App Engine (Python/Java) with Facebook's mostly PHP (and Python) code. PHP is the language I know best, and Python's whitespace really drove me nuts.

Quercus Issues in General
But.. so far I'm having a really hard time with Quercus. Checking other reports and version updates, I'm not sure if things are getting better or worse. I tried to find others addressing my issues, but it appears that I'm quite special when it comes to having these problems.. so hopefully it's basic human error!?

Here's my current rundown:

1) Curl-HTTPS - 2) Regex: preg_match - 3) Register POST Globals? - 4) General Instability?

1) Curl-HTTPS - http://quercus-https.appspot.com/
This was one of my first issues out of the gate, running the very same core Facebook PHP SDK that worked fine on my home server: Quercus is not sending HTTPS curl requests on GAE. The only workaround I've found so far is to use Java's urlfetch (or is that Python from the GAE?)

The link above is my temporary detailed bug illustration until the problem is addressed, or I learn how to fix my setup! Until then, I'm going to mimic what I'm trying to do with curl through the Java (Python?) urlfetch method, which seems to work.

Another GAE Quirk, but not the issue:
Now, there's also the partly related issue of GAE's SDK not allowing any HTTPS from a dev server. I've worked around that just fine by bypassing SSL validation entirely in the GAE SDK, while keeping a copy of the original with SSL validation intact for deployment, so that potential security hole stays closed in the live app.

2) Regex: preg_match
This just plain seems to be broken right now.. I tried a number of regexes and got errors such as:

Warning: com.caucho.quercus.QuercusException: Can't find second ^ in regexp '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$'. [preg_match]
QuercusException: '{' is an unknown regexp flag in [a-zA-Z0-9\_\-]+[a-zA-Z0-9\.\_\-]*@([a-zA-Z0-9\_\-]+\.)+([a-zA-Z]{2,4}|travel|museum)$ [preg_match]
... '?' is an unknown regexp flag ... '(' .. '{' .. etc..

At this point, nothing would work: no escaping, changing 'flags', or deleting the offending character.

Then, after spending hours trying to tweak and rebuild regexes, as a last-ditch effort I tried eregi and BOOM, every regex I threw at it worked. Makes me wonder, is it another charset bug like the others I just saw at the top of the forum? Because supposedly (looking at the version history) preg_match has been re-'fixed' multiple times for Quercus, and yet the deprecated approach (eregi) wasn't similarly 'fixed', so eregi actually works while preg_match doesn't.

Am I doing (or expecting) something that I shouldn't with preg_match?

3) Register POST Globals? - minor observation
I was very startled when creating my first form on Quercus that I suddenly didn't have to individually extract $_POST values to variables.. they were all automatically extracted for me! I get how this should be much safer than traditional register_globals (which is not enabled), but still.. what if I don't want all variables dumped out at the start?

Is there a way to disable this? Is there any reason why I shouldn't want to?

4) General Instability? - speculation
When I took a gander at Python, I was amazed at how shaky it seemed.. half the Demos didn't seem to work out of the box, and just shifting a random "return true;" to the wrong tab made the whole demo freeze out.

Now, Quercus is far from that, but I get a similar impression when I look around at Caucho sites in general (I assume they run Resin) and notice intermittent responses (this forum didn't load the first couple of days I tried, and captcha images load about 30% of the time), reports of databases spitting out garbled text because the default encodings aren't supported, and various core functions like curl and regex processing failing.

My two cents
My assumption is that many of these issues were solved in the past, but because Resin seems to daringly 'fix' the very way these languages operate where they seem irrational (assuming databases won't use UTF-8, for example), many complications that were previously solved rise up again unforeseen.

Again, thanks so much for working on this project. I'm sure you all realize how incredible this will be when cross-language platforms like Resin can reliably make any application in any main language accessible everywhere! I'm super excited about it and wish you all the best!

le mig

03-25-2011, 05:32 PM
1) While our curl implementation has its flaws, this one is on Google. GAE has a very strict Java environment and, as you can see in the errors you were getting, it does not allow certain connections in this environment.

GAE is not an officially supported environment. While most things work, a few don't, and we don't have the resources to fully support GAE.

Please code your app in a standard Java environment (Resin/Tomcat/Jetty) first and then try to port it to GAE. If you try to do it on GAE first, there are simply too many factors in play (GAE, Facebook, Quercus, etc.).
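
For the urlfetch workaround mentioned in the first post, a minimal sketch in plain Java follows. It assumes the GAE Java runtime routes `java.net.URL` connections through its URL Fetch service, so no GAE-specific class is imported. The class name `HttpsGetSketch` and its method names are made up for illustration, and nothing here has been verified on GAE itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Quercus or the GAE SDK.
final class HttpsGetSketch {

    /** Reads a stream to its end and decodes the bytes as UTF-8. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Performs a GET and returns the body; the same code path serves http and https URLs. */
    static String get(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // milliseconds
        conn.setReadTimeout(5000);    // milliseconds
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the HTTPS URL to fetch as the only argument.
        if (args.length == 1) {
            System.out.println(get(args[0]));
        } else {
            System.err.println("usage: HttpsGetSketch <url>");
        }
    }
}
```

Running the same class under Resin, Tomcat, or Jetty first and on GAE afterwards helps separate GAE's connection restrictions from problems in the curl layer.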

2) preg_match is different from eregi: preg_match uses Perl-compatible regular expressions, while eregi uses POSIX extended regular expressions. Those two are completely separate implementations. The encoding issue you're referencing is a problem with the MySQL JDBC driver and has nothing to do with regular expressions.
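
The two error messages quoted in the first post are consistent with patterns passed to preg_match without delimiters. PHP's preg functions expect the pattern wrapped in delimiters with optional flags after the closing one, e.g. '/^[a-z]+$/i', whereas eregi takes the bare pattern. Without delimiters, the first character ('^' or '[') is taken as the delimiter and whatever follows the closing one is read as flags. The Java sketch below illustrates that parsing step. It is an assumption about the general behaviour, not Quercus's actual source, and the class name `PregDelimiterSketch` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of preg-style delimiter handling; not Quercus code.
final class PregDelimiterSketch {

    /** Splits a delimited pattern such as "/^a+$/i" into { body, flags }. */
    static String[] split(String pattern) {
        if (pattern.isEmpty()) {
            throw new IllegalArgumentException("Empty regexp");
        }
        char open = pattern.charAt(0);
        char close;
        switch (open) { // bracket-style delimiters close with their counterpart
            case '(': close = ')'; break;
            case '[': close = ']'; break;
            case '{': close = '}'; break;
            case '<': close = '>'; break;
            default:  close = open;
        }
        int end = pattern.lastIndexOf(close);
        if (end <= 0) {
            throw new IllegalArgumentException(
                "Can't find second " + open + " in regexp '" + pattern + "'");
        }
        String flags = pattern.substring(end + 1);
        for (char f : flags.toCharArray()) {
            if ("imsx".indexOf(f) < 0) { // only a few common flags are modelled here
                throw new IllegalArgumentException(
                    "'" + f + "' is an unknown regexp flag in " + pattern);
            }
        }
        return new String[] { pattern.substring(1, end), flags };
    }

    /** Compiles the body with java.util.regex and searches the subject for a match. */
    static boolean pregMatch(String pattern, String subject) {
        String[] parts = split(pattern);
        int options = 0;
        if (parts[1].indexOf('i') >= 0) options |= Pattern.CASE_INSENSITIVE;
        if (parts[1].indexOf('m') >= 0) options |= Pattern.MULTILINE;
        if (parts[1].indexOf('s') >= 0) options |= Pattern.DOTALL;
        if (parts[1].indexOf('x') >= 0) options |= Pattern.COMMENTS;
        return Pattern.compile(parts[0], options).matcher(subject).find();
    }

    public static void main(String[] args) {
        String bare = "^[_a-z0-9-]+(\\.[_a-z0-9-]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*(\\.[a-z]{2,3})$";
        try {
            pregMatch(bare, "user@example.com"); // no delimiters: '^' is taken as one
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // Same pattern wrapped in '/' delimiters with the case-insensitive flag.
        System.out.println(pregMatch("/" + bare + "/i", "User@Example.com")); // prints true
    }
}
```

If the patterns in question were passed without delimiters, wrapping them as '/.../' (plus an `i` flag to mirror eregi's case-insensitivity) would be the first thing to try.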

3) This is a new issue, and I can't actually reproduce it. If you can, please post a unit test for it.

4) Not everything is connected to encoding issues. Most issues you may encounter on our Caucho sites come from PHP software that uses certain functions in an uncommon or undocumented way (because it works in PHP).
The way we develop Quercus, we can only improve our implementation through documentation and tests, as we don't look at the PHP source code.
If a certain feature of PHP has undocumented side effects and people rely on them, it's hard for us to keep up when no one reports these issues.

I must admit, we had some problems lately with our infrastructure, but are currently fixing them, so everything "should" be fine again soon.