In this paper we will examine one way to distribute dynamic binary data via HTTP to clients using ASP (Active Server Pages). The problem, specifically, is distributing a self-extracting compressed archive with the ability to change the data inside the archive in real time on the server side before sending it to the client. However, the method presented here can easily be generalized to incorporate binary data of any file format. This will require a few nifty tricks, but other than that it is fairly straightforward.
Building the database
Before we can construct the ASP code to change and serve the appropriate data we must have a source for the original binary file. To facilitate this we will use a simple database.
The design of the database is simple. We will use a Microsoft Access database with one table ('PacT') and two fields in it: 'Version' (Text) and 'Pac' (OLE Object). (Here 'Pac' is short for 'Package'.) You can obviously add more fields later if your application calls for it.
Next we create a record in the table 'PacT' with an arbitrary value for 'Version' and leave the 'Pac' field empty. This is the point at which we encounter our first problem: how do we get the binary data into the database? There really isn't an easy way. The best way I found is to open the database in a VB (Visual Basic) program and insert the data from there. Attached is the source code as well as a compiled executable [1] for the program created to do this. It is very short and simple; however, it has not been tested or optimized to a high degree, so it might need a little bit of tweaking. The main function of interest is 'Insert Data'; to confirm that the data got in there OK, use the 'Write Data To File' function. As a final check, when the database is opened again in MS Access, the 'Pac' field for the record of interest should say 'Long binary data'. Once this point is reached the database is good to go.
Before we get to the next section, for those of you who are interested in how I arrived at this method, read on. My first approach did not use a database; instead I tried to load the data into a string and then pass that string on to the client. However, as you might imagine, when you are dealing with large files this quickly becomes an extremely difficult and tedious task. I am also including the source code [2] to a program that I first created to read a file and then spit out code that would load that file into a string, byte by byte. Note that I am not including a compiled version, since you will have to change some paths and things yourself anyway. It is not nearly as refined as the final program I used to load data into the database, but it actually works (for the most part). If you are distributing a file < 1KB you *might* want to consider using this method under special circumstances, but I would strongly advise against it. The main reason I'm including this source code is that it is interesting; it turns out that it is not very useful.
Querying the database
This part is actually pretty simple. Use a simple query to get a record set with the data of interest in it (only one record for our example).
One very important thing I would like to mention here, however, is that you *must* enable ASP processing for the file extension of interest (here *.exe) on your IIS account. This created a huge problem for me at first. I turned this on, wanting to distribute one of my programs dynamically, but I assumed since the other executables were straight binary (no ASP code to be processed) that they would be passed right along as usual. I was wrong. By simple probability, you can see how a file of even a modest size could easily contain a '<%' or '%>' in it somewhere. Once the ASP processor hit this point it returned an error and nobody was able to retrieve these other programs. Because of this you must distribute all of the files of this type within your IIS account from a database, even if you are not changing any of their content on the fly.
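To put a rough number on that "simple probability", here is a small offline sketch in Python (not part of the ASP page). It assumes every byte of the file is independent and uniformly random, which is only approximately true for compressed data and not true for ordinary machine code, so treat the results as an order-of-magnitude estimate. The function names are made up for this illustration.

```python
# Rough estimate of how often random binary data contains an ASP delimiter.
# Assumption: bytes are independent and uniformly distributed.

def expected_delimiters(size_bytes: int) -> float:
    """Expected number of '<%' plus '%>' pairs in size_bytes random bytes."""
    positions = size_bytes - 1       # number of adjacent byte pairs
    per_position = 2 / 256 ** 2      # two delimiters, each one specific byte pair
    return positions * per_position


def chance_of_open_tag(size_bytes: int) -> float:
    """Approximate chance of at least one '<%' (positions treated as independent)."""
    return 1 - (1 - 1 / 256 ** 2) ** (size_bytes - 1)


for size in (10 * 1024, 100 * 1024, 1024 * 1024):
    print(size, round(expected_delimiters(size), 2), round(chance_of_open_tag(size), 3))
```

Under this model a 100 KB file is expected to contain about three delimiter pairs, and the chance of at least one '<%' is a little under 80%, so the failure described above is the normal case rather than bad luck.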
Depending on your hosting situation, it shouldn't be too hard to enable ASP processing for unusual file types. Chances are that you do not have in-house hosting, since if you do then it might be easier just to call a local DLL to do it all for you. Most competent hosts won't put up too much of a fight against this odd, yet secure (as long as you know what you're doing and your host does too), request.
Another important thing is to put the following two lines at the top of your code.
Response.buffer = TRUE
Response.ContentType = "application/octet-stream"
The first one allows the download to complete faster by telling the server to execute all ASP code before sending anything to the client [3]. The second line instructs the server to tell the client that the data you are sending is actual binary data and thus it will prompt the user with the typical file download prompt and so forth. Some example code might be:
<%
Response.buffer = TRUE
Response.ContentType = "application/octet-stream"
' The connection string must sit on one line (a VBScript string cannot span lines)
strConn = "provider=Microsoft.Jet.OLEDB.4.0;data source=D:\Inetpub\mydomain.com\db\mydb.mdb"
set conn = server.createobject("adodb.connection")
set rs = Server.CreateObject("ADODB.Recordset")
Conn.Mode = 3 '3 = adModeReadWrite
conn.open strConn
' 'Version' is a Text field, so the value is quoted
SQLStmt = "SELECT Pac FROM PacT WHERE Version = '100'"
set rs = conn.Execute(SQLStmt)
%>
Feeding Data
This is the good stuff that makes it all possible. If you want to just pass the data in the database on to the client without any modification, you would write:
<%
dim mFieldSize, mBytes
mFieldSize = rs.Fields("Pac").ActualSize
mBytes = rs.Fields("Pac").GetChunk(mFieldSize)
response.binarywrite mBytes
%>
This gives you the basis needed to send just about any kind of data that you want. To demonstrate we will return to our specific case of distributing a self-extracting executable. This archive contains one text-only file ten characters/bytes long which we will assign a random value for each download. During installation, this text file can be copied to a certain location and later read by the program or used in any other way imaginable. We are now immediately faced with a few problems:
1) Compression: We do not want to have to compress our new ten byte file every time we are making a distribution. This would be cumbersome, probably prompting the need for a file compression ActiveX control or the like. To get around this when creating the initial archive, our original file will be flagged as uncompressed. It is still possible, and advisable, to compress all other files except for this one. For a ten byte file you need not think twice about the increase in distribution size this will cause.
2) CRC Values: If we simply change the contents of the ten byte text file and send it off to the client as-is, then when the user tries to unzip/open the file they will receive a CRC error indicating that the file has been corrupted. A CRC (Cyclic Redundancy Check) is a checksum computed from some data, such that almost any modification to that data produces a different value. This is a mechanism to detect that a file was altered or damaged, for example during a download. In PKZip, WinZip, and every other compression program which follows the standard, a CRC value is created for each file and stored along with the file inside the archive. So when we change the value of the text file we must compute a new CRC value for it and replace the old CRC with the new one. After a little bit of research and thought this is no longer a large hurdle. A quick trip to the PKZip application note [4] reveals that the CRC value is stored in two places: a local file header and the central directory (assuming no data descriptor is included, which is generally the case). This is where a hex editor comes in handy. I highly recommend WinHex, but just about any one will serve our purpose here. I suggest you take a look at your file, search it for the current CRC value of the static file (WinZip has a column that shows the CRC for each file if you select to view it), and get comfortable with the general layout.
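Before hunting through the archive with a hex editor, it can help to reproduce the CRC offline. The following is a minimal sketch in Python (not ASP) that uses the standard zlib module, whose crc32 function computes the same CRC-32 that ZIP tools store. The helper name and the sample payload are made up for illustration.

```python
import struct
import zlib


def zip_crc_bytes(payload: bytes) -> bytes:
    """CRC-32 of payload in the byte order a ZIP header stores it (little-endian)."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.pack("<I", crc)


payload = b"ABCDE12345"   # hypothetical ten-byte file contents
crc = zlib.crc32(payload) & 0xFFFFFFFF
print("as an archive tool displays it:", format(crc, "08X"))
print("as it appears in the file:     ", zip_crc_bytes(payload).hex().upper())

# Sanity check with the standard CRC-32 test input "123456789",
# whose well-known check value is CBF43926.
assert format(zlib.crc32(b"123456789") & 0xFFFFFFFF, "08X") == "CBF43926"
```

The second printed line is the byte sequence to search for in the hex editor; it should turn up once in the local file header and once in the central directory.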
I was able to find some pretty good ASP source code for a CRC calculator on Planet Source Code. It was a while ago, so I'm not sure which entry it was from, and thus am unable to give proper credit for it. If you are the author please send me an email and I will give you credit here. At any rate, I am including the CRC source code [5] as well. Just paste it into the bottom of your ASP page and go from there. Here is an example of some source code which will generate a random value consisting of numbers from zero to nine and capital letters for our ten-character text file (the function is defined in the same file as the definitions for the CRC functions), insert this new value in for the text file, generate a new CRC value, and insert it in at the right points.
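<%
dim mFieldSize, mBytes
mFieldSize = rs.Fields("Pac").ActualSize
mBytes = rs.Fields("Pac").GetChunk(mFieldSize)
Dim myrnd
myrnd = GenRnd
Private Crc32Table(255) 'global array needed by CRC functions
' Everything before the first CRC (local file header)
response.binarywrite mid(mBytes, 1, 321862)
' New CRC, first copy
response.binarywrite GenMyCRC(myrnd)
' Data between the first CRC and the file contents
response.binarywrite mid(mBytes, 321865, 10)
' New ten-byte file contents, written one byte at a time
for i = 1 to 10
response.binarywrite chrb(asc(mid(myrnd, i, 1)))
next
' Data between the file contents and the second CRC (central directory)
response.binarywrite mid(mBytes, 321880, 4013)
' New CRC, second copy
response.binarywrite GenMyCRC(myrnd)
' Remainder of the archive
response.binarywrite mid(mBytes, 325895)
%>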
The most complicated things in this source are the numbers that restrict the range of data to send (i.e. 321862, 321865, 321880, and 325895). Below are the representative sections which they send.

Figure 1. Block diagram of file layout. Note that all stated positions are the offsets for the character before the break.
You find these positions for your particular archive by doing a search in your hex editor for things like the current CRC (once again given to you in WinZip) and the original file contents. However, you will have to take care when searching for the CRC. WinZip displays it in the reverse byte order relative to the file: the file stores the least significant byte (LSB) first, while WinZip shows it last. Also, when you find the offset for these values, you will probably have to divide it by two, since the hex editor reports a byte offset while Mid counts in characters, and here it takes two bytes to make one character. You should get an integer back since the first byte of each character sits at an even offset. For example, the original offset given to me by WinHex for the start of the first CRC was 9D28C. I punched this into the Windows Calculator in hex mode, divided by two, and then switched to decimal, which gives 321862. It sounds rather complicated at first, but if you get in there with your hex editor it becomes obvious what is going on fairly quickly.
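The calculator steps and the reversed-byte search can also be scripted. This is a small offline sketch in Python, not ASP; the two helper names are made up for illustration, and the two-bytes-per-character rule is taken from the description above.

```python
def mid_position(byte_offset: int) -> int:
    """Convert a zero-based byte offset from a hex editor into the number of
    two-byte characters that precede it (the length to pass to Mid)."""
    if byte_offset % 2:
        raise ValueError("odd offset: not on a two-byte character boundary")
    return byte_offset // 2


def find_crc(data: bytes, displayed_crc: str) -> list:
    """Return every byte offset where the CRC occurs in stored byte order.
    displayed_crc is the hex string as an archive tool shows it."""
    needle = bytes.fromhex(displayed_crc)[::-1]   # reverse into file order
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


# The worked example from the text: WinHex offset 9D28C.
print(mid_position(0x9D28C))   # 321862
```

Calling find_crc on the bytes of your own archive with the CRC shown by WinZip should return two offsets, one for each stored copy; feeding each one to mid_position gives the numbers used in the listing above.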
3) Self-extracting executable wrapping: It turns out that wrapping a zip file into an executable file does not change the structure of the zip file after all, so we do not need to worry about any changes that might take place during the self-extractor creation. Just make sure that when you get the offsets for the different points of the archive that you are getting them from the self-extracting version. Of course, if you do not wish to use a self-extracting archive then it is simply one less thing to worry about.
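Points 1 through 3 can be rehearsed offline before any offsets are hard-coded into the ASP page. The sketch below is Python rather than ASP and makes several assumptions: it builds a small throwaway archive in memory with the standard zipfile module, fakes the self-extractor by prepending 64 arbitrary bytes, and locates the payload and CRC by searching for them instead of using fixed offsets. All names, file names, and payloads are hypothetical.

```python
import io
import struct
import zipfile
import zlib


def build_archive(payload: bytes) -> bytes:
    """Build a small ZIP in memory; 'code.txt' is stored uncompressed (point 1)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(zipfile.ZipInfo("code.txt"), payload,
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr(zipfile.ZipInfo("readme.txt"), b"hello " * 200,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def crc_le(data: bytes) -> bytes:
    """CRC-32 in the little-endian byte order used inside the archive."""
    return struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF)


def patch(archive: bytes, old: bytes, new: bytes) -> bytes:
    """Swap the stored payload and both copies of its CRC (point 2).
    The lengths must match so that no offset in the archive moves."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    if archive.count(old) != 1 or archive.count(crc_le(old)) != 2:
        raise ValueError("payload or CRC cannot be located unambiguously")
    return archive.replace(crc_le(old), crc_le(new)).replace(old, new)


def read_back(archive: bytes, name: str = "code.txt") -> bytes:
    """Let the library verify every CRC, then return one entry."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        bad = zf.testzip()
        if bad is not None:
            raise zipfile.BadZipFile("CRC mismatch in " + bad)
        return zf.read(name)


original = build_archive(b"ABCDE12345")
sfx = b"MZ" + b"\x90" * 62 + original      # stand-in for an extractor stub (point 3)
patched = patch(sfx, b"ABCDE12345", b"ZYXWV98765")
print(read_back(patched))
```

If the CRC replacement is left out and only the payload is changed, read_back raises a BadZipFile error, which is the same CRC failure the end user would see.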
4) Yet another consideration is the case where you might want to pass some kind of arguments in the URL to the script generating the executable. For example, if you run an affiliate program you can tell each of your affiliates to send the people that they refer to 'myapp.exe?affiliate=2485'. This comes in extremely handy, but one problem that arises is that when the client starts the download the default filename will be literally 'myapp.exe?affiliate=2485'. Luckily the HTTP people have given us an easy fix. Somewhere in your code insert the following line (adapted for your filename, of course):
Response.AddHeader "Content-disposition", "attachment; filename=myapp.exe"
...bingo!
Applications
There are tons of things you can do with this construct. You're not limited to executables, zip archives, or any file format. Here are a few ideas.
Note: Many of the things you can do with this would be an extreme invasion on the end user's right to privacy! Make sure that anything implementing a system like this is not infringing on these rights!
1) Affiliate Program Tracking: As mentioned above, if you encode every single downloaded file with a unique string and store the affiliate ID associated with that string in some database, this can be useful. For example, if you are distributing shareware programs and the user must connect to your server to retrieve their key, you can configure the client to reproduce its string to the server. Alternatively, when a user clicks in your application on the button to register, append the ten digit code as an argument to the end of the web page they are directed to and read it from there. This way you can link every registration with the person who referred them. (This is a lot more robust than using cookies!) Even if you do not run an affiliate program, you could still track the referring URL instead of an affiliate ID.
2) Dynamic Digital Watermarking: There have been many papers [6] which describe methods with which artists can encode certain data into an image such that it cannot be removed without distorting the original image to a high degree. Thus if there is ever any question as to whether that image belongs to the artist, the watermark can be read. If the artist were able to encode a unique watermark inside every copy of the file distributed, and maintain a database with information about who received each copy, then when someone is selling it illegally the person who downloaded that specific copy can be identified. Such a scheme could obviously be detected easily, since if a user reloads the image and compares the two files they will be different. However, this does not mean the watermark could be easily removed.
There is another large problem which sets dynamic digital watermarking apart from our specific example here. A digital watermark is generally embedded in the entire image, not just a certain small representative range of data as in the case which we looked at. Also, because of the complexity of watermarking systems and ASP's very limited bit manipulation capabilities, the optimal suggestion for implementing such a system would be to employ an ActiveX component to do the hard work.
3) Basic Image Manipulation: A few ActiveX components have been released which allow image manipulation on the fly by making a few calls to these components. However, now there is the potential to do basic operations without any controls. In fact, you are limited only by how much bit manipulation you can do in ASP!
4) Encoding registration information into executables: This could also be a useful anti-piracy feature. If the name and email address of a registered user are "engraved" into every registered copy distributed, this would seriously deter piracy [7]. However, note that if you can change out the name and email address in real time then a cracker can definitely use a hex editor to change them too. If you want to implement a scheme like this, don't leave out the crypto!
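As one possible reading of "don't leave out the crypto", the engraved name and email could be accompanied by a keyed tag that only the server can produce, so that a hex-edited name no longer matches its tag. The sketch below is Python rather than ASP and is only an illustration: the key, the function names, and the choice of HMAC-SHA256 are assumptions, not something taken from the original scheme. It also assumes the check happens on the server (for example when the program phones home), since a key shipped inside the client could be extracted.

```python
import hashlib
import hmac

# Hypothetical secret; it stays on the server and is never shipped in the client.
SERVER_KEY = b"replace-with-a-long-random-secret"


def make_tag(name: str, email: str, key: bytes = SERVER_KEY) -> str:
    """Return a hex HMAC-SHA256 tag binding the name and email together."""
    message = name.encode("utf-8") + b"\x00" + email.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_genuine(name: str, email: str, tag: str, key: bytes = SERVER_KEY) -> bool:
    """Check a name/email/tag triple reported by a client, in constant time."""
    return hmac.compare_digest(make_tag(name, email, key), tag)


tag = make_tag("Ada Example", "ada@example.com")
print(len(tag), is_genuine("Ada Example", "ada@example.com", tag))
print(is_genuine("Someone Else", "ada@example.com", tag))
```

The tag is always 64 hex characters, so it fits a fixed-size slot in the executable and can be written with the same splice technique as the ten-byte file, without moving any offsets.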
If you would like to contact me with comments, suggestions, feedback, or for any other reason, you can reach me at adam@viratech.com.
References
[1] Database Insertion Source and Binary, Adam M Smith, January 2002, http://www.viratech.com/files/DBData.zip
[2] String Representation Source, Adam M Smith, January 2002, http://www.viratech.com/files/StringLoader.zip
[3] Response Object, Microsoft Developer's Network, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iisref/html/psdk/asp/vbob5sj8.asp
[4] PKZIP Application Note, PKWARE Inc., Version 4.5, http://www.pkware.com/support/appnote.html
[5] CRC32 ASP Source, December 2001, http://www.viratech.com/files/CRC.zip
[6] Information Hiding - A Survey, Fabien A. P. Petitcolas and Ross J. Anderson and Markus G. Kuhn, Proceedings of the IEEE, 87(7):1062-1078, July 1999, http://citeseer.nj.nec.com/petitcolas99information.html
[7] Defending Shareware Against Cracks, Adam M Smith, April 2000, http://www.viratech.com/sharenc.htm
This article Copyright © 2002 Sense of Security Incorporated
All Rights Reserved.