PRELIMINARY DOCUMENTATION                                              9-3-88


C-MODEM FILE TRANSFER PROTOCOL:

Finally, a powerful and FAST file transfer protocol that's relatively easy to
implement.  Why is it fast?  Because it's organized and the basic concept is
simple.  Why is it powerful?  Because C-Modem is a batch file transfer
protocol that takes batch file transferring a step further -- allowing the
transfer of entire subdirectories (folders) of files.  Don't let the power
scare you away though.  The support of subdirectories is not a requirement
for the implementation of this protocol.

The following explains the inner workings of C-Modem.  If you intend to
implement C-Modem in your own application, congratulations.  You'll soon
discover (if you haven't already) that you have made a wise choice.  C-Modem
has come into high demand in a very short amount of time because it is a fast
batch protocol.  From the developer's point of view, it is becoming popular
for yet another reason -- its ease of implementation.

Currently, the only protocols that can compete with C-Modem are window-type
protocols (like windowed X-Modem, Sealink, Z-Modem, etc...).  These protocols
are (relatively speaking) fairly difficult to implement.  Not only that,
C-Modem is faster than these protocols -- and it offers more features.

Enough talk.  Let's get to what you came for.


C-MODEM HEADER TYPES:

C-Modem has four different types of headers.  All headers consist of two
bytes: the first byte is $11 (17 decimal) and the second byte is one of the
following:

          Hex value          Binary value
          --------------------------------------------
             $33               00110011
             $55               01010101
             $aa               10101010
             $cc               11001100

So a C-Modem header is one of these four values ($1133, $1155, $11aa, $11cc).
Notice the bit pattern of the second byte of the header.  The four values are
designed so that each one differs from every other header byte in at least
four bit positions.
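
The four-bit separation claimed above is easy to check by machine.  What
follows is only a minimal sketch in Python; the names HEADER_LEAD,
HEADER_TYPES, hamming_distance, min_header_distance and header_words are
made up for this illustration and are not part of the C-Modem specification.

```python
from itertools import combinations

# Values taken from the header table above.
HEADER_LEAD = 0x11
HEADER_TYPES = (0x33, 0x55, 0xAA, 0xCC)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two byte values differ."""
    return bin(a ^ b).count("1")


def min_header_distance() -> int:
    """Smallest bit difference between any two distinct header type bytes."""
    return min(hamming_distance(a, b) for a, b in combinations(HEADER_TYPES, 2))


def header_words() -> list:
    """The four two-byte headers as 16-bit values ($1133, $1155, ...)."""
    return [(HEADER_LEAD << 8) | t for t in HEADER_TYPES]


if __name__ == "__main__":
    for a, b in combinations(HEADER_TYPES, 2):
        print(f"${a:02x} vs ${b:02x}: {hamming_distance(a, b)} bits differ")
    print("minimum:", min_header_distance())
```

Every pair differs in either four or eight bits, so at least four bit errors
have to land in the second byte before one header can turn into another.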
The first byte of a header ($11) ensures that all bits of the second byte
(one of the above four) line up properly for comparison (or at least makes it
very likely that they will).  This helps make sure that one header doesn't
get mistaken for another (due to line noise).  It is one thing for a header
to get scrambled and go unnoticed, but it is another thing when one header
gets scrambled and gets mistaken for another.

These headers are used on one or more C-Modem packets.  A packet header
describes the type of C-Modem data that is being sent.  In some cases,
headers are the data themselves.

Here are all of the different types of packets that C-Modem uses.  All values
between | | should be considered 1-byte hexadecimal numbers.  If there are
more than 4 characters between | |, then the field consists of multiple
bytes.  [SENDER/RECEIVER] designates the terminal that is sending the
information.


Packet "A"  INFO-PACKET [SENDER]:

                             SIZE OF   SIZE OF
                             FIELD 1   FIELD 3
             FILE-SIZE     (FILE NAME)  (ZERO)      FIELD 2        CRC-CHECK
          ----------------    ------    ------      --------      -----------
| 11 | AA |    |    |    |    |    |    |    |      |      |      |    |    |
-----------              ------    ------    --------      --------
  HEADER                 BUFFER   SIZE OF    FIELD 1       FIELD 3
                          SIZE    FIELD 2  (FILE NAME)      (N/A)
                        (BLOCKS)
                      OF SENDER-1


Packet "B"  STATUS-PACKET (NAK FOR INFO-PACKET) [RECEIVER]:

     | 11 | 33 | 00 |
     -----------
       HEADER


Packet "C"  STATUS-PACKET (ACK FOR INFO-PACKET) [RECEIVER]:

                          CRC-CHECK
                         -----------
     | 11 | 33 | 01 |    |    |    |
     -----------    ------
       HEADER       BUFFER SIZE (BLOCKS)
                    OF RECEIVER-1


Packet "D"  STATUS-PACKET (ABORT) [RECEIVER]:

     | 11 | 33 | FF |
     -----------
       HEADER


Packet "E"  SUBBLOCK-DATA-PACKET [SENDER]:

             SUBBLOCK #                                          CRC-CHECK
               ------                                           -----------
     | 11 | CC |    |                                           |    |    |
     -----------    ---------------------------------------------
       HEADER                 DATA (USUALLY 256 BYTES)


Packet "F"  STATUS-PACKET (FOR SUB-BLOCKS NOT RECEIVED CORRECTLY) [RECEIVER]:

     FIELD SIZE (NUMBER OF
     SUBBLOCKS TO RETRANSMIT)            CRC-CHECK
               ------                   -----------
     | 11 | 33 |    |    |    |    |    |    |    |
     -----------    ---------------------
       HEADER     RETRANSMIT THESE SUB-BLOCKS


Packet "G"  PROCEED (TELLS SENDER TO PROCEED AFTER LOADING BUFFER) [RECEIVER]:

     | 11 | 55 |
     -----------
       HEADER


Packet "H"  STATUS-PACKET (MID-ABORT) [RECEIVER]:

     | 11 | 33 | 00 |
     -----------
       HEADER


Packet "I"  STATUS-PACKET (MID-ABORT) [SENDER]:

     | 11 | AA |
     -----------
       HEADER


Packet "J"  STATUS-ACK (ALL SUBBLOCKS RECEIVED CORRECTLY) [RECEIVER]:

     | 11 | CC |
     -----------
       HEADER


Packet "K"  STATUS-PACKET (NAK FOR RETRANSMIT STATUS) [SENDER]:

     | 11 | 33 |
     -----------
       HEADER

-------------------------------------------------------------------------------

File_Size          = SIZE OF FILE (IN BYTES) BEING SENT
Buff_Len           = BUFFER LENGTH (MAXIMUM # OF BYTES THAT CAN BE STORED AT
                     ONCE)
Temp_Buff_Len      = (TEMPORARY BUFFER LENGTH (AGREED UPON BUFFER LENGTH)
                     / 256)-1
                     (E.G. 256 BYTE BUFFER: TBL=0, 512 BYTE BUFFER: TBL=1,
                     ETC...)
Other_Buff_Len     = LENGTH OF OTHER END'S BUFFER SIZE
Buff_Size          = BUFFER SIZE (I.E. # OF BYTES CURRENTLY IN BUFFER)
Block              = THE CURRENT BLOCK BEING TRANSFERRED
Last_Block         = LAST BLOCK TO TRANSFER (0 TO 255)
Last_Subblock      = LAST SUBBLOCK TO TRANSFER WITHIN LAST BLOCK (0 TO 255)
Last_Subblock_Size = SIZE OF LAST SUBBLOCK
Tags               = ARRAY FROM [0 TO TBL] HOLDING STATUS OF SUB-BLOCKS
                     (ALL INITIALIZED TO "NOT RECEIVED CORRECTLY" AND UPDATED
                     ONE-AT-A-TIME TO "RECEIVED CORRECTLY" AS THEY COME IN
                     WITH GOOD CRC)

-------------------------------------------------------------------------------

FLOWCHART [SENDER] (AS NUMBERED STEPS):

  1. Temp_Buff_Len=Buff_Len

  2. Last File?  If Y, send Packet "K".  Done.

  3. Get size of file and open for read.  Count=0.

  4. Send Packet "A".

  5. Wait for Packet "B", "C" or "D".
       - TIMEOUT or "B":  Count=Count+1.  If Count=10, ABORT.  Otherwise go
         back to step 4.
       - "D":  ABORT.
       - Otherwise get the rest of Packet "C".  On TIMEOUT or BAD CRC:
         Count=Count+1.  If Count=10, ABORT.  Otherwise send Packet "B" and
         repeat step 5.

  6. BUFFER SIZE OF RECEIVER=0?  If Y, do not send this file.

  7. IF Other_Buff_Len ...

  8. USER ABORT?  If Y, send Packet "I" and stop sending the file.

  9. Set all Tags[] (0 through Temp_Buff_Len) to BAD.

 10. IF Block=Last_Block THEN Last=Last_Subblock ELSE Last=Temp_Buff_Len

 11. File_Remaining=File_size-((Temp_Buff_Len+1)*256)*Block
     IF File_Remaining ...

 12. Subblock=0

 13. Subblock>Last?  If Y, go to step 16.

 14. Tags[Subblock]=bad?  If Y:
         Tags[Subblock]=good
         IF Block=Last_Block AND Subblock=Last_Subblock
            THEN Size=Last_Size+1
            ELSE Size=256
         Send Packet "E" [Subblock] of Length [Size]

 15. Subblock=Subblock+1.  Go back to step 13.

 16. Count=0

 17. Wait for Packet "H", "J" or "F".
       - TIMEOUT:  send Packet "K".  Count=Count+1.  If Count=10, ABORT.
         Otherwise repeat step 17.
       - "J":  If Block=Last_Block, go to step 19.  Otherwise Block=Block+1
         and go back to step 8.
       - Otherwise receive the rest of Packet "F" or "H".  On TIMEOUT or
         BAD CRC, handle it like the TIMEOUT case above.
       - "H":  ABORT.

 18. FOR I=0 TO NUMBER OF BAD SUBBLOCKS-1:
         SET Tags[Packet_"F"[I+3]]=BAD
     Then go back to step 12.

 19. Count=0

 20. Wait for Packet "G".  On TIMEOUT:  Count=Count+1.  If Count=10, ABORT.
     Otherwise repeat step 20.

 21. Send Packet "G".  Go back to step 2.

-------------------------------------------------------------------------------

NOTES:

- When both buffer sizes are exchanged with the info/status-packets, both
  ends use the smaller of the two buffer sizes.  This will synchronize the
  loading and saving of the buffer.  (This makes it possible to load and save
  at the same time, gaining a slight increase in speed.)

- The FILENAME field is a standard 1-8 character ASCII field with an optional
  1-3 character extender.  The filename and extender are separated by a
  period ( XXXXXXXX.XXX ).  Subdirectories may also be included within this
  FILENAME field (1-8 characters).  The subdirectories and filenames are
  separated by backslashes '\'.  Example: "IBM\TELECOMM\MODEM.PRG".  The
  first character in this filename field should NEVER be a backslash.

- If the receiving terminal cannot use subdirectories, they can be stripped
  out and discarded.  If subdirectories are used, then the receiving terminal
  should attempt to place the program within the subdirectory.  If the
  subdirectory doesn't exist, then it should be created first.  This will
  allow multiple subdirectories of files to be transferred all at once.

- Field 1 and field 2 in the INFO-PACKET are expansion fields.  As of now,
  field 2 is not used.  Field 1 MAY be used (with some implementations) as a
  time/date stamp.
  If time/date stamping of files is implemented by the sender, then the size
  of field 1 will be 7 bytes (but if the size is greater than 7, you must
  make sure that that number of bytes is read in, to ensure upward
  compatibility).  Keep in mind that a field can contain more than just one
  item of interest.  As of now, only one item (the time/date) is used, but if
  another feature were later found useful, it could be added to this field
  (therefore making the field larger).  The time/date item should look like
  this:

                            FIELD 1
              ------------------------------------
              | 01 | yy | mm | dd | hh | mm | ss |
              ------------------------------------

  The time/date values are the actual byte values, not the ASCII values.  The
  first byte ($01) indicates that this item is indeed a time/date item.  If a
  second item were to exist, it would be assigned the identifier $02, the
  third $03, and so on.  Up to 127 items can be contained in one field (if
  they can all fit within 255 bytes), but of course, I don't suspect this
  will happen (that's a lot of afterthought).

  Notice that seconds are included as well.  Most (if not all) time/date
  stamping does not use seconds; however, it very well could.  The seconds
  byte should always be zeroed if seconds are not supported by the sender's
  computer system.

  If the sender's version of C-Modem does not implement time/date stamping of
  files at all, then one of two things can take place.  Either the high order
  bit of the time/date identifier ($01) can be set (yielding $81, or 129
  decimal), or the size of field 1 can be set to zero (which would indicate
  that NO extra features are implemented).  Setting the high bit of the
  time/date identifier indicates that this item exists, but is not to be
  used.  Since (as of this time) the only extra feature available is
  time/date stamping, the best way to dispose of this feature is to set the
  field length to zero.
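
  The time/date item described above can be sketched in a few lines of
  Python.  This is only an illustration under stated assumptions: the names
  build_time_date_item and parse_time_date_item are hypothetical, and storing
  yy as the year modulo 100 is a guess, since the text only says that yy is
  an actual byte value.

```python
TIME_DATE_ID = 0x01   # identifier of the time/date item
NOT_USED_BIT = 0x80   # high bit set: the item is present but not to be used


def build_time_date_item(year, month, day, hour, minute, second=0, used=True):
    """Build the 7-byte time/date item as raw byte values (not ASCII).

    Assumption: yy is stored as year modulo 100.
    """
    ident = TIME_DATE_ID if used else TIME_DATE_ID | NOT_USED_BIT
    return bytes([ident, year % 100, month, day, hour, minute, second])


def parse_time_date_item(field1: bytes):
    """Return (yy, mm, dd, hh, mm, ss), or None if there is no usable stamp.

    A zero-length field, or an identifier of $81, both mean "no stamp".
    Bytes beyond the first seven are ignored here, so a longer field 1
    from a newer sender is still accepted.
    """
    if len(field1) < 7 or field1[0] != TIME_DATE_ID:
        return None
    return tuple(field1[1:7])


if __name__ == "__main__":
    item = build_time_date_item(1988, 9, 3, 14, 30)
    print(item.hex(" "))
    print(parse_time_date_item(item))
```

  A sender without time/date support would simply send a size of zero for
  field 1, which parse_time_date_item() treats the same as the $81 form.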
  Field 2 is used for a similar purpose as field 1; however, it is reserved
  for more dynamic types of data (data that may not have a fixed length).
  Again, as of now, this field is not used and should be zero.

- The last subblock transferred may not necessarily be 256 bytes (since files
  generally aren't multiples of 256 bytes in length).  However, since the
  file size is known on both ends, the receiver will know when the last
  subblock will occur and how big it should be.

- This protocol can send up to 64K in one block of subblocks.  The maximum
  file size is 16 megabytes.  One of the disadvantages of this protocol is
  that you need to know the exact file size before the file is sent.  If the
  Disk Operating System being used cannot return this value immediately, then
  a byte count may need to be performed manually (which may take some time).
  However, most DOSes have this feature.

-------------------------------------------------------------------------------

PERFORMANCE:

A test was run with a file of 6360 bytes, and errors were generated to test
efficiency.  The C-Modem test was run using only 4 subblocks per block (1024
bytes total) to show an interesting comparison with Y-Modem (which also sends
1024 bytes at a time).  This test was run at 300 baud to improve the accuracy
of these figures.  As baud rates increase, propagation delays become a more
significant factor in speed loss.  Doing a comparison at 300 baud is being
very kind to X-Modem and Y-Modem.

 P              | 0 ERRORS | 1 ERROR  | 2 ERRORS | 3 ERRORS | SECONDS/BLOCK
 R     ---------------------------------------------------------------------
 O     X-Modem  |   4:18   |   4:22   |   4:26   |   4:30   |  4 seconds
 T     ---------|----------|----------|----------|----------|---------------
 O     Y-Modem  |   3:42   |   4:17   |   4:53   |   5:28   | 36 seconds
 C     ---------|----------|----------|----------|----------|---------------
 O     C-Modem  |   3:42   |   3:51   |   4:01   |   4:10   |  9 seconds
 L

RETRANSMISSION TIME PER BAD BLOCK:

     X-MODEM:   4 seconds
     Y-MODEM:  36 seconds
     C-MODEM:   9 seconds


TRANSFER PROTOCOL HISTORY:

X-Modem was, and in a lot of cases still is, one of the most widely used
file-transfer protocols.  However, as you can see, it is not one of the most
efficient.  There are 2 major problems to consider when developing a fast
file-transfer protocol: retransmission time for bad blocks, and propagation
delays.

X-Modem has a relatively low retransmission overhead (i.e. if a bad block
occurs, it only has to retransmit 128+5 bytes); the reason that X-Modem is so
slow is propagation delay.  When a block is sent, it is not received
immediately.  It takes time for the transmission to propagate through (in
most cases) the phone system.  After receiving a block, the status of that
block has to be sent back so that the transmitter will know if the block was
received correctly.  If not, then it has to be sent again.  This status also
takes time to propagate through the phone system.

Y-Modem was the result of someone's attempt to make this delay less
prominent.  By making the block size 8 times bigger, Y-Modem only needs to
wait through these propagation delays 1/8th as often.  As you can see, this
approach works fine (if there are very few bad blocks).  And as services like
PC Pursuit (a low-cost computer networking service) become more and more
popular, propagation delay becomes a much more significant factor in speed,
so Y-Modem begins to look better.
The obvious disadvantage of Y-Modem is that if noisy lines become a factor
(as they quite frequently do), Y-Modem's efficiency drops RAPIDLY.

C-Modem literally does have the best of both worlds.  Blocks range anywhere
from 256 bytes to 64k (a 256-byte block really defeats the purpose; blocks
should start at no less than 1K).  These blocks are broken down into
sub-blocks of 256 bytes.  In the above example, we used a small 1k block.
This made C-Modem look very much like Y-Modem.  However, as block errors
start occurring, Y-Modem has to retransmit the entire 1K block again, where
C-Modem only has to retransmit 256 bytes.  This nets a gain of about 27
seconds per bad block over Y-Modem.  In practice, however, C-Modem blocks
will generally range anywhere from 8k to 64k.  The bigger blocks make C-Modem
do an even better job than Y-Modem was intended to do, even without errors.
When line noise comes into play, C-Modem can still get the job done with good
efficiency whereas Y-Modem can't.

Here is an example of how you could expect C-Modem to compare with X-Modem
and Y-Modem under various conditions.  Let p = the propagation delay from one
terminal to another, and let P = the total propagation delay.  Suppose we
transfer a 64k file using C-Modem, Y-Modem, and X-Modem.  Assuming no errors
occur, this is what you might expect:

     X-MODEM:      P=(2*p)*65536/128  =(2*p)*512=1024*p
     Y-MODEM:      P=(2*p)*65536/1024 =(2*p)*64 = 128*p
     C-MODEM 4k:   P=(2*p)*65536/4096 =(2*p)*16 =  32*p
     C-MODEM 16k:  P=(2*p)*65536/16384=(2*p)*4  =   8*p
     C-MODEM 64k:  P=(2*p)*65536/65536=(2*p)*1  =   2*p

What do these formulas mean?
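
One way to get a feel for them is to plug in numbers.  The following is a
minimal sketch in Python; the function name total_propagation_delay and the
rounding up of a partial final block are assumptions of this sketch, not part
of the protocol.

```python
def total_propagation_delay(p: float, file_size: int, block_size: int) -> float:
    """P = (2*p) * number_of_blocks, in the same time unit as p.

    Each block costs one trip out (the block itself) and one trip back
    (its status), hence the factor of 2.  A partial final block still
    needs its own round trip, so the block count is rounded up.
    """
    blocks = -(-file_size // block_size)   # integer ceiling division
    return 2 * p * blocks


if __name__ == "__main__":
    sizes = [("X-Modem", 128), ("Y-Modem", 1024), ("C-Modem 4k", 4096),
             ("C-Modem 16k", 16384), ("C-Modem 64k", 65536)]
    for name, block in sizes:
        row = [total_propagation_delay(p, 65536, block) for p in (1/20, 1/10, 1/2)]
        print(f"{name:12s}", "  ".join(f"{seconds:7.1f}" for seconds in row))
```

Each row printed is the total time spent doing nothing but waiting, in
seconds, for p = 1/20, 1/10 and 1/2 second.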

          UNPRODUCTIVE WAITING TIME FOR A 64K FILE (seconds)

                     | p=1/20 | p=1/10  |  p=1/2
        -------------------------------------------------
 P      X-Modem      |  51.2  |  102.4  | 8.5 minutes
 R      -------------|--------|---------|-----------------
 O      Y-Modem      |   6.4  |   12.8  | 64 seconds
 T      -------------|--------|---------|-----------------
 O      C-Modem 4k   |   1.6  |    3.2  | 16 seconds
 C      -------------|--------|---------|-----------------
 O      C-Modem 16k  |   0.4  |    0.8  | 4 seconds
 L      -------------|--------|---------|-----------------
        C-Modem 64k  |   0.1  |    0.2  | 1 second

Remember, this is with NO errors.  As errors occur, Y-Modem can easily pass
X-Modem in unproductive waiting time.  As you can see, C-Modem behaves even
better than Y-Modem's positive side.  And when block errors come into the
picture, they become incomparable.  Also keep in mind that even if you were
using 192,000 baud, these delays would stay the same!


ADDED ADVANTAGES:

Aside from batch file transfers and time/date stamping, C-Modem also has a
few inherent advantages.

Unlike X-Modem and Y-Modem, subblock numbers can never get out of
synchronization.  The sender tells the receiver which subblock it is
receiving, and if a subblock has been omitted somehow, then it is the same as
not receiving the subblock correctly (bad CRC, etc...).

Since the size of a file to be transferred is known before it is transferred,
the file does not have to be aborted half way through a transfer if there is
not enough storage space on the receiver's end.  It can be aborted before the
transfer even takes place.

Because Y-Modem's blocks are so big, padding the last block of a file (to
fill in the remaining unused bytes of a block) can be too costly, so (under
certain implementations) Y-Modem can switch to X-Modem when it reaches the
last 1024 bytes of a file.  C-Modem doesn't need to do this.  Not only that,
it doesn't need to pad the last subblock at all (the last sub-block size is
known: FILE SIZE mod 256).
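
As an illustration of that last point, here is a minimal sketch in Python of
how either end might work out where a file ends.  The function name
transfer_geometry is hypothetical, and so is the treatment of a file whose
length is an exact multiple of 256 (the final subblock is taken to be a full
256 bytes rather than 0); the text above only gives FILE SIZE mod 256.

```python
SUBBLOCK_BYTES = 256


def transfer_geometry(file_size: int, temp_buff_len: int):
    """Return (last_block, last_subblock, last_subblock_bytes).

    temp_buff_len is the agreed buffer length in subblocks minus one
    (0 means a 256-byte buffer, 255 means a 64K buffer).
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if not 0 <= temp_buff_len <= 255:
        raise ValueError("temp_buff_len must be in 0..255")

    block_bytes = (temp_buff_len + 1) * SUBBLOCK_BYTES
    last_block = (file_size - 1) // block_bytes             # blocks count from 0
    remaining = file_size - last_block * block_bytes        # bytes in last block
    last_subblock = (remaining - 1) // SUBBLOCK_BYTES       # subblocks count from 0
    last_subblock_bytes = remaining - last_subblock * SUBBLOCK_BYTES  # 1..256
    return last_block, last_subblock, last_subblock_bytes


if __name__ == "__main__":
    # The 6360-byte test file with 4 subblocks (1024 bytes) per block.
    print(transfer_geometry(6360, 3))
```

For the 6360-byte test file this gives a final subblock of 216 bytes, which
is exactly 6360 mod 256, so nothing has to be padded.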

C-Modem is a public domain protocol, but proper credit should be given to
Jerry Horanoff and Carina Software Systems for its development (within
supplied documentation or what-have-you).

Flowchart for C-Modem receiver will follow shortly.