Binary Response Getting Mangled

Hello,

I’ve spent the last day or so trying to write a ResponseTransformer that converts PDF responses into something consumable (JSON, HTML, or whatever), and I’ve run into an issue. The binary body of the response (v1.body.bytes) seems to come back altered, and my PDF library can’t consume it. I’m using a Gatling 2.0 snapshot.

Just as an example, I’ve been using the IRS’s W-2 form to test that it works.

`

exec(http("asdf").get("http://www.irs.gov/pub/irs-pdf/fw2.pdf").transformResponse {
  new ResponseTransformer() {
    override def isDefinedAt(x: Response): Boolean = true
    override def apply(v1: Response): Response = {
      println("Content Length: " + v1.bodyLength)
      var position = 0
      // Hex-dump the body: 16 bytes (32 hex characters) per line, one space between bytes
      javax.xml.bind.DatatypeConverter.printHexBinary(v1.body.bytes).toCharArray.foreach((c: Char) => {
        print(c)
        position += 1

        if (position >= 32) {
          println()
          position = 0
        } else if (position % 2 == 0) {
          print(" ")
        }
      })
      println()
      // Return the response unchanged
      v1
    }
  }
})

`

This code grabs the W-2 PDF and prints a hex dump of what it gets. I ran it, then downloaded the same file and ran this in PowerShell:

`

Get-Content "C:\Users\stella.clemens\fw2.pdf" -Encoding Byte `
  -ReadCount 16 | ForEach-Object {
    $output = ""
    foreach ( $byte in $_ ) {
        # Format each byte as two uppercase hex digits followed by a space
        $output += "{0:X2} " -f $byte
    }
    $output
}

`

The full dump is unnecessarily long, but here are two analogous blocks of the hex dump from the end.
From Gatling:

`

20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 0A 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 0A 3C 3F 78 70 61 63 6B 65
74 20 65 6E 64 3D 22 77 22 3F 3E 0D 0A 65 6E 64
73 74 72 65 61 6D 0D 65 6E 64 6F 62 6A 0D 31 32
35 20 30 20 6F 62 6A 0D 3C 3C 2F 46 69 6C 74 65
72 2F 46 6C 61 74 65 44 65 63 6F 64 65 2F 46 69
72 73 74 20 32 34 2F 4C 65 6E 67 74 68 20 31 32
32 2F 4E 20 33 2F 54 79 70 65 2F 4F 62 6A 53 74
6D 3E 3E 73 74 72 65 61 6D 0D 0A 68 EF BF BD 32
34 33 36 51 30 50 30 34 33 36 55 30 EF BF BD 00
EF BF BD 66 0A EF BF BD 46 16 0A 36 36 EF BF BD
EF BF BD EF BF BD EF BF BD 79 25 0A EF BF BD EF
BF BD EF BF BD DE 99 29 EF BF BD EF BF BD 60 15
06 0A 41 10 25 40 46 EF BF BD 7E 48 65 41 EF BF
BD 7E 40 62 7A 6A EF BF BD EF BF BD 1D 5C EF BF
BD 29 4C EF BF BD EF BF BD 01 44 39 EF BF BD 34
EF BF BD EF BF BD EF BF BD 42 EF BF BD 10 EF BF
BD 01 EF BF BD 45 EF BF BD 20 1B 20 4E 08 EF BF
BD 6E EF BF BD 19 EF BF BD 3C EF BF BD EF BF BD
16 60 EF BF BD 08 62 EF BF BD EF BF BD 31 EF BF
BD EF BF BD EF BF BD 19 59 10 36 16 20 EF BF BD
00 EF BF BD EF BF BD 3B EF BF BD 0D 0A 65 6E 64
73 74 72 65 61 6D 0D 65 6E 64 6F 62 6A 0D 31 32
36 20 30 20 6F 62 6A 0D 3C 3C 2F 46 69 6C 74 65
72 2F 46 6C 61 74 65 44 65 63 6F 64 65 2F 46 69
72 73 74 20 37 2F 4C 65 6E 67 74 68 20 31 37 34
2F 4E 20 31 2F 54 79 70 65 2F 4F 62 6A 53 74 6D
3E 3E 73 74 72 65 61 6D 0D 0A 68 DE 8C EF BF BD
41 0B EF BF BD 30 18 EF BF BD EF BF BD EF BF BD
EF BF BD 74 07 73 EF BF BD 34 1C 22 EF BF BD EF
BF BD 29 41 32 EF BF BD 3C DD 8B 2D EF BF BD EF
BF BD EF BF BD 51 EF BF BD 3E 0F D1 A5 4B EF BF
BD EF BF BD EF BF BD 7B 58 1C 1E EF BF BD 42 EF
BF BD EF BF BD EF BF BD 6A EF BF BD DA B8 6D EF
BF BD 3B 5E EF BF BD 67 5E 37 EF BF BD 2F 0C 0A
EF BF BD EF BF BD 52 0A EF BF BD 6E EF BF BD 03
EF BF BD 42 16 04 31 4D 28 EF BF BD EF BF BD 47
23 EF BF BD 52 EF BF BD EF BF BD EF BF BD 5C EF
BF BD 1E EF BF BD 1E 58 EF BF BD EF BF BD 09 EF
BF BD C4 BB 1A 17 34 50 EF BF BD EF BF BD EF BF
BD 28 EF BF BD 6B 2D 7F EF BF BD EF BF BD 25 5F
EF BF BD 31 5A EF BF BD 03 EF BF BD 41 EF BF BD
6B 7F EF BF BD EF BF BD EF BF BD EF BF BD 18 11
EF BF BD 22 EF BF BD 22 EF BF BD EF BF BD EF BF
BD 4D EF BF BD 71 EF BF BD C4 BF 28 3B EF BF BD
EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD 0C
EF BF BD 17 EF BF BD 2C 7B 0B 30 00 EF BF BD 3B
46 34 0D 0A 65 6E 64 73 74 72 65 61 6D 0D 65 6E
64 6F 62 6A 0D 31 32 37 20 30 20 6F 62 6A 0D 3C
3C 2F 44 65 63 6F 64 65 50 61 72 6D 73 3C 3C 2F
43 6F 6C 75 6D 6E 73 20 35 2F 50 72 65 64 69 63
74 6F 72 20 31 32 3E 3E 2F 46 69 6C 74 65 72 2F
46 6C 61 74 65 44 65 63 6F 64 65 2F 49 44 5B 3C
30 45 46 31 33 35 34 34 32 34 32 37 39 30 34 44
38 31 42 33 30 37 46 39 30 30 38 35 37 37 37 32
3E 3C 31 45 36 41 32 45 31 36 45 42 45 36 30 42
34 38 39 31 34 32 43 30 45 45 37 32 34 39 35 44
36 33 3E 5D 2F 49 6E 66 6F 20 31 36 33 37 20 30
20 52 2F 4C 65 6E 67 74 68 20 33 34 35 2F 52 6F
6F 74 20 31 36 33 39 20 30 20 52 2F 53 69 7A 65
20 31 36 33 38 2F 54 79 70 65 2F 58 52 65 66 2F
57 5B 31 20 33 20 31 5D 3E 3E 73 74 72 65 61 6D
0D 0A 68 EF BF BD EF BF BD 31 2F 04 51 14 EF BF
BD DF 9B 35 6B 77 EF BF BD 04 11 EF BF BD 44 23
EF BF BD 10 0A 09 15 0A 0D 5B EF BF BD 4E EF BF
BD 46 EF BF BD 13 11 7E EF BF BD EF BF BD 3F 58
EF BF BD 4A 24 34 24 44 14 EF BF BD 42 47 49 4B
14 EF BF BD 0D 1D 62 EF BF BD 37 EF BF BD 24 66
23 EF BF BD 24 48 EF BF BD 16 5F EF BF BD EF BF
BD 77 EE 9B BB 73 EF BF BD 06 EF BF BD 3E EF BF
BD 77 D7 9B 2E 70 EF BF BD EF BF BD 1A EF BF BD
57 EF BF BD EF BF BD 31 EF BF BD 43 EF BF BD EF
BF BD EF BF BD EF BF BD 4E 63 EF BF BD 32 EF BF
BD 04 4E EF BF BD 35 61 74 47 54 1E EF BF BD EF
BF BD EF BF BD 31 7D 31 71 1E EF BF BD EF BF BD
4D 05 74 17 EF BF BD 35 EF BF BD 61 7A 1A EF BF
BD EF BF BD EF BF BD EF BF BD DB 8F EF BF BD EF
BF BD 00 EF BF BD 7D EF BF BD 3A 33 36 54 EF BF
BD EF BF BD 3D EF BF BD EF BF BD 69 EF BF BD 0B
EF BF BD 2F 60 11 3E 31 61 EF BF BD EF BF BD EF
BF BD 24 EF BF BD 04 54 5C 6D EF BF BD 1F 1A 33
5D EF BF BD EF BF BD DA 9D 3F 35 EF BF BD 6F EF
BF BD D3 B1 62 EF BF BD EF BF BD 79 EF BF BD EF
BF BD 0E 53 EF BF BD EF BF BD EF BF BD D9 98 EF
BF BD EF BF BD 28 EF BF BD 3C EF BF BD 66 EF BF
BD 16 EF BF BD 5D EF BF BD 78 EF BF BD 5E 34 EF
BF BD EF BF BD EF BF BD EF BF BD 0E EF BF BD EF
BF BD EF BF BD EF BF BD 0E 3D EF BF BD EF BF BD
EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD EF
BF BD EF BF BD 74 6D EF BF BD 47 2F EF BF BD 57
E1 BA B1 EF BF BD EF BF BD EF BF BD 44 EF BF BD
EF BF BD EF BF BD 57 EF BF BD EF BF BD 30 EF BF
BD 56 EF BF BD EF BF BD 17 EF BF BD 27 CC B0 EF
BF BD 69 EF BF BD 28 EF BF BD 47 5F EF BF BD 7B
10 EF BF BD EF BF BD EF BF BD 32 EF BF BD EF BF
BD 57 DB B9 57 4A 45 EF BF BD EF BF BD 45 EF BF
BD 4A 54 EF BF BD 44 51 EF BF BD 12 EF BF BD 2B
51 EF BF BD 12 45 EF BF BD 4A 54 EF BF BD 44 EF
BF BD 4A 14 EF BF BD 2B 51 EF BF BD 12 EF BF BD
EF BF BD EF BF BD EF BF BD EF BF BD 21 33 EF BF
BD DB BB 00 03 00 EF BF BD EF BF BD 69 4D 0D 0A
65 6E 64 73 74 72 65 61 6D 0D 65 6E 64 6F 62 6A
0D 73 74 61 72 74 78 72 65 66 0D 0A 31 31 36 0D
0A 25 25 45 4F 46 0D 0A

`

From PowerShell:

`

20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 0A 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 0A 3C 3F 78 70 61 63 6B 65 74 20 65
6E 64 3D 22 77 22 3F 3E 0D 0A 65 6E 64 73 74 72
65 61 6D 0D 65 6E 64 6F 62 6A 0D 31 32 35 20 30
20 6F 62 6A 0D 3C 3C 2F 46 69 6C 74 65 72 2F 46
6C 61 74 65 44 65 63 6F 64 65 2F 46 69 72 73 74
20 32 34 2F 4C 65 6E 67 74 68 20 31 32 32 2F 4E
20 33 2F 54 79 70 65 2F 4F 62 6A 53 74 6D 3E 3E
73 74 72 65 61 6D 0D 0A 68 DE 32 34 33 36 51 30
50 30 34 33 36 55 30 B1 00 D1 66 0A 86 46 16 0A
36 36 FA CE F9 A5 79 25 0A 86 86 FA DE 99 29 C5
D1 60 15 06 0A 41 10 25 40 46 AC 7E 48 65 41 AA
7E 40 62 7A 6A B1 9D 1D 5C BD 29 4C B9 89 01 44
39 98 34 87 B0 A1 42 C6 10 FD 01 89 45 A9 20 1B
20 4E 08 C2 6E 9E 19 D4 3C A8 E5 16 60 CA 08 62
90 91 31 84 82 C8 19 59 10 36 16 20 C0 00 8C A2
3B E6 0D 0A 65 6E 64 73 74 72 65 61 6D 0D 65 6E
64 6F 62 6A 0D 31 32 36 20 30 20 6F 62 6A 0D 3C
3C 2F 46 69 6C 74 65 72 2F 46 6C 61 74 65 44 65
63 6F 64 65 2F 46 69 72 73 74 20 37 2F 4C 65 6E
67 74 68 20 31 37 34 2F 4E 20 31 2F 54 79 70 65
2F 4F 62 6A 53 74 6D 3E 3E 73 74 72 65 61 6D 0D
0A 68 DE 8C CD 41 0B 82 30 18 C6 F1 AF F2 DE 74
07 73 D3 34 1C 22 88 DA 29 41 32 F0 3C DD 8B 2D
D4 C1 9A 51 DF 3E 0F D1 A5 4B F7 E7 FF 7B 58 1C
1E 80 42 9A FA F9 6A AF DA B8 6D C5 3B 5E E4 67
5E 37 C4 2F 0C 0A AB F4 52 0A 8B 6E C9 03 CA 42
16 04 31 4D 28 8B 98 47 23 87 52 E7 B3 DA D2 5C
EA 1E E1 A4 1E 58 BC 86 09 A1 C4 BB 1A 17 34 50
B5 90 EC 28 F1 6B 2D 7F A5 90 25 5F A9 31 5A AE
03 FE 41 B5 6B 7F C3 C1 BA 9D 18 11 C4 22 E1 22
9E D0 DA 4D 9F 71 B1 C4 BF 28 3B A1 BB DD EC E1
A8 CD 0C 9D 17 90 2C 7B 0B 30 00 E7 3B 46 34 0D
0A 65 6E 64 73 74 72 65 61 6D 0D 65 6E 64 6F 62
6A 0D 31 32 37 20 30 20 6F 62 6A 0D 3C 3C 2F 44
65 63 6F 64 65 50 61 72 6D 73 3C 3C 2F 43 6F 6C
75 6D 6E 73 20 35 2F 50 72 65 64 69 63 74 6F 72
20 31 32 3E 3E 2F 46 69 6C 74 65 72 2F 46 6C 61
74 65 44 65 63 6F 64 65 2F 49 44 5B 3C 30 45 46
31 33 35 34 34 32 34 32 37 39 30 34 44 38 31 42
33 30 37 46 39 30 30 38 35 37 37 37 32 3E 3C 31
45 36 41 32 45 31 36 45 42 45 36 30 42 34 38 39
31 34 32 43 30 45 45 37 32 34 39 35 44 36 33 3E
5D 2F 49 6E 66 6F 20 31 36 33 37 20 30 20 52 2F
4C 65 6E 67 74 68 20 33 34 35 2F 52 6F 6F 74 20
31 36 33 39 20 30 20 52 2F 53 69 7A 65 20 31 36
33 38 2F 54 79 70 65 2F 58 52 65 66 2F 57 5B 31
20 33 20 31 5D 3E 3E 73 74 72 65 61 6D 0D 0A 68
DE EC 93 31 2F 04 51 14 85 DF 9B 35 6B 77 D8 04
11 89 44 23 A1 10 0A 09 15 0A 0D 5B 8A 4E 8B 46
A3 13 11 7E 82 C4 3F 58 8D 4A 24 34 24 44 14 84
42 47 49 4B 14 CA 0D 1D 62 EE 37 C5 24 66 23 B1
24 48 CE 16 5F CE DE 77 EE 9B BB 73 CF 06 CE 3E
81 77 D7 9B 2E 70 CE F7 1A A3 57 F4 94 31 E8 43
F7 A3 F3 E8 4E 63 E3 32 FA 04 4E D2 35 61 74 47
54 1E F1 EC A1 CB 31 7D 31 71 1E A7 A7 4D 05 74
17 CE 35 F4 61 7A 1A 8D A4 A7 85 DB 8F BD E1 00
9E 7D A6 3A 33 36 54 8C AD 3D C6 F6 69 EA 0B F8
2F 60 11 3E 31 61 A6 E2 EB AA 24 F7 04 54 5C 6D
8F 1F 1A 33 5D DA F8 DA 9D 3F 35 DB 6F DD D3 B1
62 BF BD 79 90 CD 0E 53 AF B2 A3 D9 98 D1 C1 28
EF 8D 3C E4 66 E0 16 9E 5D 9E 78 8A 5E 34 E6 DB
C8 C9 0E BD E3 EC FD 0E 3D 87 EE 86 CF F6 AC 90
BD 87 97 74 6D E3 99 47 2F A1 57 E1 BA B1 E5 9C
FB 93 44 95 E2 E4 57 AF 98 30 FE 56 9B B9 17 88
27 CC B0 FC 69 AF 28 D6 47 5F D1 7B 10 95 AB EF
32 BA F9 57 DB B9 57 4A 45 FD EB 45 E5 4A 54 AE
44 51 B9 12 95 2B 51 B9 12 45 E5 4A 54 AE 44 E5
4A 14 95 2B 51 B9 12 95 AB BF C2 E0 21 33 F3 DB
BB 00 03 00 FB 92 69 4D 0D 0A 65 6E 64 73 74 72
65 61 6D 0D 65 6E 64 6F 62 6A 0D 73 74 61 72 74
78 72 65 66 0D 0A 31 31 36 0D 0A 25 25 45 4F 46
0D 0A

`

It seems like all the ASCII made it through intact. The two dumps are identical up to “3E 73 74 72 65 61 6D 0D 0A 68”. In ASCII, the text leading up to that point reads:

<?xpacket end="w"?>

endstream
endobj
125 0 obj
<</Filter/FlateDecode/First 24/Length 122/N 3/Type/ObjStm>>stream
h

But after that it’s a binary data stream and the two diverge. Is there any way I can get an unmodified version of the binary response? Is this intended behavior?

Thank you.

  • Stella

I’ll check this on Monday.

OK, actually, I know what happens.

In order to get the bytes, you had to disable chunks discarding, right?
When you do so, since you haven’t configured any check that would give Gatling a hint about what you want to do, it assumes you want the body as a string. So when you then ask for the bytes, you get them from that string: you’re doing a bytes/chars/bytes decoding/encoding round trip, which is destructive if the bytes weren’t actual text.
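
To illustrate the round trip described above, here is a minimal standalone sketch in plain Scala (no Gatling involved). It assumes the string decoding is done as UTF-8, which would be consistent with the dumps: EF BF BD is the UTF-8 encoding of U+FFFD, the replacement character that a decoder emits for byte sequences that are not valid text. The object and method names (RoundTripDemo, hex, roundTrip) are made up for this example.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object RoundTripDemo {
  // Render bytes as space-separated, two-digit uppercase hex values.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02X").mkString(" ")

  // Decode the bytes as UTF-8 text, then encode that text back to bytes.
  // Malformed input is replaced by U+FFFD during decoding, so this is lossy.
  def roundTrip(bytes: Array[Byte]): Array[Byte] =
    new String(bytes, UTF_8).getBytes(UTF_8)

  def main(args: Array[String]): Unit = {
    // First bytes of the first compressed stream in the PowerShell dump: "h", 0xDE, "2436"
    val original = Array(0x68, 0xDE, 0x32, 0x34, 0x33, 0x36).map(_.toByte)
    println(hex(original))            // 68 DE 32 34 33 36
    println(hex(roundTrip(original))) // 68 EF BF BD 32 34 33 36
  }
}
```

The second line matches what the Gatling dump shows at the same position: the lone 0xDE is not valid UTF-8, so it is replaced by EF BF BD and the original byte cannot be recovered.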

I’ll open an issue so that Gatling doesn’t assume text so aggressively.

Then, you probably shouldn’t be using a ResponseTransformer for this.

If you’re using a recent snapshot, you have a bodyBytes check where you can plug in a transform step. With it, Gatling knows that you actually want the bytes, so you don’t even have to disable chunk discarding:

exec(http("asdf").get("http://www.irs.gov/pub/irs-pdf/fw2.pdf").check(bodyBytes.transform(transformBytesToSomethingElse).saveAs("document")))

Get it?

See comment: https://github.com/excilys/gatling/issues/2018
Everything looks fine now.

Yes! Thank you, this worked great for me.

`

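// Imports assume Apache PDFBox 1.x, which these class names appear to come from
import java.io.ByteArrayInputStream
import org.apache.pdfbox.pdfparser.PDFParser
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.util.PDFTextStripper

exec(http("asdf").get("http://www.irs.gov/pub/irs-pdf/fw2.pdf")
  .check(
    bodyBytes.transform((bytes: Array[Byte]) => {
      val parser = new PDFParser(new ByteArrayInputStream(bytes))
      parser.parse()
      val doc = new PDDocument(parser.getDocument)
      println(new PDFTextStripper().getText(doc))
    })
  ))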

`

Great!

One more question about this: how would I chain a call to the regex system at the end? I’d like to write something like bodyBytes.transformString.regex("xy(z*)").is("1500"), but regex doesn’t seem to be available from ValidationCheckBuilder. Is there another way to access the regex system, or should I use the library directly when defining pdf2String?

Thank you!

Gatling’s regex is at the same level as bodyBytes: they’re both roots, so you can’t chain them.

If you can’t build your check logic from Gatling primitives, you either have to write your own check or transform the response.
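
As a rough sketch of doing the matching inside the transform function, the regex can be applied with Scala’s standard scala.util.matching.Regex once the text has been extracted. This is plain Scala, not Gatling API. The helper name firstGroup and the sample strings are made up for illustration, and the pattern is the one from the question.

```scala
import scala.util.matching.Regex

object RegexOnExtractedText {
  // Hypothetical helper: returns the first capture group of the first match, if any.
  def firstGroup(pattern: Regex, text: String): Option[String] =
    pattern.findFirstMatchIn(text).map(_.group(1))

  def main(args: Array[String]): Unit = {
    val pattern = "xy(z*)".r
    println(firstGroup(pattern, "abc xyzzz def")) // Some(zzz)
    println(firstGroup(pattern, "no match here")) // None
  }
}
```

A function like this could be composed with pdf2String inside the function passed to bodyBytes.transform, so that the check receives the captured group rather than the raw bytes. How that result is then validated depends on the Gatling snapshot in use.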

I just added a response processor example to the new doc: https://github.com/excilys/gatling/blob/master/src/sphinx/http/http_request.rst#response-processors

Hope it helps.

Excellent! Thank you!

  • Stella