[ptt-users] setFile corrupting data on multi-part HTTP POST?
Aaron Romeo
aaron.romeo at artez.com
Fri Dec 15 09:41:11 PST 2006
Actually, I think it does make sense... So it sounds like Jython isn't
recognizing that the response is in UTF-8, and as a result the find command
(or a Unicode command) is doing really weird stuff...
Aaron.
-----Original Message-----
From: users-bounces at lists.pushtotest.com
[mailto:users-bounces at lists.pushtotest.com] On Behalf Of
Mark.Lutton at thomson.com
Sent: Friday, December 15, 2006 12:03 PM
To: users at lists.pushtotest.com
Subject: RE: [ptt-users] setFile corrupting data on multi-part HTTP POST?
What you saw in the HTTP response is correct UTF-8 and should render
correctly in the browser, but it may not be what you want stored in a text
file.
The French small c with cedilla is Unicode code point 00E7. This is hex E7
in the Windows Code Page 1252 character set and hex 87 in Code Page 437 (the
original IBM PC character set). (Just for fun, open a command prompt window
in Windows, hold ALT and type 135 on the numeric keypad, meaning x'87' in
code page 437, then try ALT and 0231, meaning x'E7' in code page 1252. They
are the same character. The leading zero signals which code page to use.)
Unicode code point 00E7 is represented in UTF-8 by the two-byte sequence C3
A7, as you saw. Read as code page 1252, those two bytes look like Ã§ in
Notepad and like something else again in a DOS window.
Here is the word "français" in hex codes:
UNICODE: 0066 0072 0061 006E 00E7 0061 0069 0073
Original IBM PC character set (code page 437): 66 72 61 6E 87 61 69 73
(As saved on disk by C:\WINDOWS\system32\EDIT.COM. Old-timers will remember
this program.)
Windows (Code page 1252): 66 72 61 6E E7 61 69 73
(As saved on disk by Microsoft Office Word 2003.)
UTF-8: 66 72 61 6E C3 A7 61 69 73
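The byte sequences for each character set can be checked with a few lines of Python 3. This is only a sketch for verification, not something from the original exchange; the helper name hex_bytes is made up for illustration.

```python
# Encode the same text under three character sets and show the bytes as hex.
def hex_bytes(text, encoding):
    """Return the encoded bytes of `text` as space-separated uppercase hex."""
    return " ".join("%02X" % b for b in text.encode(encoding))

word = "fran\u00e7ais"  # U+00E7 is the small c with cedilla

cp437_hex = hex_bytes(word, "cp437")    # original IBM PC character set
cp1252_hex = hex_bytes(word, "cp1252")  # Windows code page 1252
utf8_hex = hex_bytes(word, "utf-8")     # two bytes for the cedilla

print("cp437 :", cp437_hex)
print("cp1252:", cp1252_hex)
print("UTF-8 :", utf8_hex)
```

Only the byte or bytes standing for the c with cedilla differ between the three lines; the plain ASCII letters encode identically in all of them.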
Does that make sense? No? I'm not surprised. I have read only ONE
explanation of Unicode that made sense and that was here:
http://www.joelonsoftware.com/articles/Unicode.html
It is called "The Absolute Minimum Every Software Developer Absolutely,
Positively Must Know About Unicode and Character Sets (No Excuses!)" and the
title says it all.
-- Mark
-----Original Message-----
From: users-bounces at lists.pushtotest.com
[mailto:users-bounces at lists.pushtotest.com] On Behalf Of Aaron Romeo
Sent: Friday, December 15, 2006 10:28 AM
To: TestMaker users list
Subject: RE: [ptt-users] setFile corrupting data on multi-part HTTP POST?
Funny, I had run into a similar problem with images... Thanks Mark that
will be helpful.
By the way, I have had a similar problem dealing with French characters in
the response (like an HTTP response). They do not appear to be in UTF-8. I
have my French characters appearing as "franÃ§ais" rather than "français".
I know this is a character encoding issue, and I believe it is because the
Response content is not read as UTF-8.
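A small Python 3 sketch of that symptom, assuming the response bytes really are UTF-8 and that the client decodes them with a single-byte charset (code page 1252 is used here as a stand-in; which charset TestMaker actually applies is not established in this thread):

```python
# UTF-8 bytes for the word, as a server would send them on the wire.
raw = "fran\u00e7ais".encode("utf-8")

# Decoding with the wrong charset turns the two bytes C3 A7 into two characters.
wrong = raw.decode("cp1252")
# Decoding with the charset the server actually used gives the intended text.
right = raw.decode("utf-8")

# If only the garbled string is available, undoing the wrong step recovers it.
repaired = wrong.encode("cp1252").decode("utf-8")

print(repr(wrong), repr(right), repr(repaired))
```

The repair step only works while the wrong decoding is lossless; once a byte has been replaced with a substitute character, the original cannot be recovered this way.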
Aaron.
-----Original Message-----
From: users-bounces at lists.pushtotest.com
[mailto:users-bounces at lists.pushtotest.com] On Behalf Of
Mark.Lutton at thomson.com
Sent: Wednesday, December 13, 2006 5:05 PM
To: users at lists.pushtotest.com; users at lists.pushtotest.com
Subject: RE: [ptt-users] setFile corrupting data on multi-part HTTP POST?
What's probably happening in (1) is that ProtocolHandler is assuming UTF-8
encoding. I had the same problem in downloading applet code. Hex 85 in the
file was converted to hex 26.
I solved this by using Java. Here is Jython code to use the Java classes to
download. You can do something similar to upload data. Wrap the file in a
DataInputStream, read it and write it into the connection's output stream.
from java.net import URL
from java.io import DataInputStream, DataOutputStream, FileOutputStream

# Copy raw bytes through Java streams so that no character decoding is applied.
myURL = URL("http://localhost:8080/MyApp/applets/Applets.jar")
cc = myURL.openConnection()
jarFile = DataOutputStream(FileOutputStream("Applets.jar"))
inStream = DataInputStream(cc.getInputStream())
inNum = inStream.read()  # one byte as an int (0-255), or -1 at end of stream
while -1 != inNum:
    jarFile.write(inNum)
    inNum = inStream.read()
inStream.close()
jarFile.close()
print "Applets.jar written"
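The same byte-for-byte idea can be sketched in plain Python 3 (not Jython). Here copy_bytes is a hypothetical helper, and in-memory streams stand in for the file and the connection's output stream; for an upload, the source would be the file and the destination the connection.

```python
import io

def copy_bytes(src, dst, chunk_size=8192):
    """Copy one binary stream to another with no text decoding; return the byte count."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total

# Every possible byte value, including 0x85 and 0x90, which were altered in this thread.
payload = bytes(range(256))
source = io.BytesIO(payload)
sink = io.BytesIO()
copied = copy_bytes(source, sink)
print(copied, sink.getvalue() == payload)
```

Reading in chunks rather than one byte at a time is a performance choice only; the point is that the data never passes through a string type.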
Mark Lutton
Business Intelligence Services, a Thomson Business
________________________________
From: users-bounces at lists.pushtotest.com on behalf of Friedman, Seth
Sent: Wed 12/13/2006 4:44 PM
To: users at lists.pushtotest.com
Subject: [ptt-users] setFile corrupting data on multi-part HTTP POST?
Hi,
Two questions.
(1)
With the following excerpt of code,
self.body.setFile(filetoupload, "video/x-ms-wmv", "Filedata")
(..bunch of parameters..)
self.response = self.http.connect()
I'm seeing the source and received binary data differ. The filesizes are
identical, but hex 90s are all converted to hex 3F.
Is there an alternative to setFile() that would be more suited for binary
data?
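One mechanism that would produce exactly this symptom, shown as a Python 3 sketch: the binary data is decoded as text and then encoded again. This is an assumption about the cause, since TestMaker's internals are not shown here, and code page 1252 is an assumed charset chosen because it has no character assigned to byte 0x90.

```python
# Four bytes of binary data; 0x90 has no character assigned in code page 1252.
original = b"\x30\x90\x41\x90"

# Step 1: a text layer decodes the bytes; each undefined byte becomes U+FFFD.
as_text = original.decode("cp1252", errors="replace")
# Step 2: writing the text back out substitutes "?" (0x3F) for U+FFFD.
round_trip = as_text.encode("cp1252", errors="replace")

print(original.hex(), round_trip.hex())
```

The length is preserved because each byte maps to exactly one character and back, which matches identical file sizes with only some byte values changed.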
(2)
In TestMaker, the parts of the multi-part POST seem to be generated
independently of the order in which I add parameters. Regardless of whether I
put the setFile call at the beginning or end of a series of
self.body.addParameter() calls, the file is always POSTed as the first
part. Looking at RFC 2388 (http://www.ietf.org/rfc/rfc2388.txt, sec. 5.5),
it seems like this might actually matter. Is there a way to control the
order of the parts in a multi-part POST?
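For illustration, here is a Python 3 sketch of a multipart/form-data body assembled by hand, where the part order is simply the list order. It is not TestMaker's API; build_multipart, the "title" field, the file name clip.wmv, and the boundary string are all hypothetical.

```python
def build_multipart(parts, boundary):
    """Build a multipart/form-data body; parts are emitted in list order.

    Each part is (name, filename_or_None, content_type_or_None, payload_bytes).
    """
    lines = []
    for name, filename, content_type, payload in parts:
        lines.append(b"--" + boundary)
        disposition = 'form-data; name="%s"' % name
        if filename is not None:
            disposition += '; filename="%s"' % filename
        lines.append(("Content-Disposition: " + disposition).encode("ascii"))
        if content_type is not None:
            lines.append(("Content-Type: " + content_type).encode("ascii"))
        lines.append(b"")       # blank line separates part headers from payload
        lines.append(payload)   # payload stays as bytes, never decoded
    lines.append(b"--" + boundary + b"--")  # closing boundary
    lines.append(b"")
    return b"\r\n".join(lines)

body = build_multipart(
    [
        ("title", None, None, b"My clip"),                            # plain field first
        ("Filedata", "clip.wmv", "video/x-ms-wmv", b"\x30\x90\x41"),  # file part last
    ],
    b"example-boundary",
)
```

The request would also need a Content-Type header of multipart/form-data carrying the same boundary, and the boundary must not occur inside any payload.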
Thanks!
seth
_______________________________________________
Users mailing list
Users at lists.pushtotest.com
http://lists.pushtotest.com/mailman/listinfo/users