MXP UDP Transport Draft 0.6

Introduction

The UDP transport layer implements the specialized MXP transport layer requirements. This is an early draft; please excuse the quality of the language at this stage.

Requirements

  1. Protocol should support connectivity.
  2. Protocol should support congestion avoidance.
  3. Protocol should support guaranteed delivery, where both delivery and message order are guaranteed over the transmission link.
  4. Protocol should support signal delivery, where delivery is not guaranteed and outdated messages are ignored.
  5. Protocol should support multiple channels.
  6. Protocol should support fair bandwidth sharing between signal and guaranteed data transmission.
  7. Protocol should support fair bandwidth sharing between individual channels.
  8. Long messages should be fragmented into frames and sent at intervals to avoid congestion.
  9. One network packet should be able to contain many message frames.

Changes

Next Version Tasks

Functional Layers

Channel Layer

Channels are a useful concept for carrying multiple ordered streams of messages over the same connection. Each channel has its own independent sequencing mechanism and bandwidth allocation.

Guaranteed Channels

Guaranteed channels guarantee message delivery and retain the order of messages.

Signal Channels

Signal channels offer best-effort service and discard any message that has a lower order index than a previously arrived message. In other words, the order of accepted messages is retained.
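The discard rule above can be sketched as follows. This is a minimal illustration, not part of the specification: the class name SignalChannelReceiver is hypothetical, a repeated index is assumed to be discarded as well, and wraparound of the rotating message index is deliberately ignored.

```python
class SignalChannelReceiver:
    """Accepts a message only if its index is newer than the last accepted one."""

    def __init__(self):
        self.last_index = None  # no message accepted yet

    def accept(self, message_index):
        # Assumption: an equal index (a duplicate) is treated like an outdated one.
        if self.last_index is not None and message_index <= self.last_index:
            return False  # outdated message: discard
        self.last_index = message_index
        return True
```

A real implementation would also need a rule for comparing indexes across wraparound, which this draft does not define.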

Assembly Layer

Messages to Packets Assembly

Messages are divided into one or more frames, which are assembled into the data section of transport packets. This is implemented with a fragmentation list holding at most N (max_messages_fragmented) messages. The messages in the list are assembled into packets at the same time. When all frames of a message have been placed into packets, the message is removed from the list and a new message is added to the list from the pending message queue.
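As a rough sketch of the first step, splitting one message into frames might look like this. The function name fragment and the tuple layout are illustrative assumptions; the 255 byte limit is the maximum frame data size given in the frame structure section. The fragmentation list itself is not modelled here.

```python
MAX_FRAME_DATA = 255  # maximum frame data size from the frame structure section


def fragment(message: bytes):
    """Split a message into (frame_index, frame_count, data) tuples."""
    chunks = [message[i:i + MAX_FRAME_DATA]
              for i in range(0, len(message), MAX_FRAME_DATA)]
    if not chunks:
        chunks = [b""]  # assumption: an empty message still occupies one frame
    return [(index, len(chunks), chunk) for index, chunk in enumerate(chunks)]
```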

Packets to Messages Assembly

Message frames are assembled back into messages by the reverse process, in which an unlimited number of incomplete messages form an aggregation list. When a packet is received, its frames are placed into the messages they belong to. New messages are added to the list on demand. Completed messages are removed from the list and queued for further processing. Messages for which one or more frames have timed out are removed from the list.
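The aggregation list could be sketched along these lines. The class name Aggregator and its method are hypothetical, and frame timeouts are left out to keep the example short.

```python
class Aggregator:
    """Collects frames per (channel, message) until a message is complete."""

    def __init__(self):
        # (channel_index, message_index) -> {frame_index: frame data}
        self.incomplete = {}

    def add_frame(self, channel_index, message_index, frame_count, frame_index, data):
        key = (channel_index, message_index)
        frames = self.incomplete.setdefault(key, {})  # added on demand
        frames[frame_index] = data
        if len(frames) < frame_count:
            return None  # still waiting for more frames
        del self.incomplete[key]  # message is ready: remove it from the list
        return b"".join(frames[i] for i in range(frame_count))
```

Frames may arrive in any order; the message is returned once the last missing frame has been placed.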

Channel Bandwidth Allocation

Per-channel bandwidth allocation is realized by choosing the messages to be sent from the different channels according to a round-robin algorithm. If a channel already has a message in the fragmentation list, that channel is skipped on that round.
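A minimal sketch of this selection, assuming a fixed set of channels known up front. The class ChannelScheduler and its method names are illustrative only.

```python
from collections import deque


class ChannelScheduler:
    """Round-robin choice of the next message to place in the fragmentation list."""

    def __init__(self, channel_ids):
        self.order = deque(channel_ids)  # rotation order of the channels
        self.pending = {c: deque() for c in channel_ids}
        self.busy = set()  # channels that already have a message being fragmented

    def enqueue(self, channel, message):
        self.pending[channel].append(message)

    def next_message(self):
        """Return (channel, message) for the next eligible channel, or None."""
        for _ in range(len(self.order)):
            channel = self.order[0]
            self.order.rotate(-1)  # this channel moves to the back of the rotation
            if channel in self.busy or not self.pending[channel]:
                continue  # skipped on this round
            self.busy.add(channel)
            return channel, self.pending[channel].popleft()
        return None

    def message_done(self, channel):
        """Call when all frames of the channel's message have been sent."""
        self.busy.discard(channel)
```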

Transport Layer

Guaranteed messages have to be delivered without any loss. A fraction of the signal messages may be dropped without risking data consistency. Guaranteed and signal data are transported in different packets because guaranteed packets need acknowledgements. If signal data shared a packet with guaranteed data, a dropped packet would cause both to be resent, which would waste bandwidth.

Acknowledging

Acknowledgements are sent in the first outbound packet. If there are no outbound packets, an empty transport packet is sent once the maximum acknowledgement wait window (max_ack_wait) has elapsed.
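One way to sketch this piggybacking rule is shown below. The class AckQueue and its methods are hypothetical; the limit of 255 acks per packet follows from the one byte "number of acks" field in the packet header.

```python
class AckQueue:
    """Holds acks until they can ride on an outbound packet."""

    def __init__(self, max_ack_wait):
        self.max_ack_wait = max_ack_wait
        self.pending = []   # indexes waiting to be acknowledged
        self.oldest = None  # time at which the oldest pending ack was queued

    def add(self, index, now):
        if not self.pending:
            self.oldest = now
        self.pending.append(index)

    def take(self, limit=255):
        """Called when any outbound packet is built: attach the pending acks."""
        acks, self.pending = self.pending[:limit], self.pending[limit:]
        if not self.pending:
            self.oldest = None
        return acks

    def needs_empty_packet(self, now):
        """True when acks have waited max_ack_wait and no packet carried them."""
        return bool(self.pending) and now - self.oldest >= self.max_ack_wait
```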

Retransmit

If a guaranteed packet is not acknowledged within the acknowledgement timeout (ack_timeout), the packet is retransmitted. A packet is retransmitted at most N times (max_packet_retransmit). A packet is retransmitted by inserting it at the front of the channel's outbound packet queue, so that retransmits honor the per-channel bandwidth allocation. Open question: what happens when the maximum number of retransmits has been reached? Should the connection break immediately, or should exponential back-off be applied a number of times before that? If a guaranteed message cannot be delivered, the connection should ultimately be disconnected.
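The bookkeeping for retransmits might be sketched as follows. RetransmitTracker is a hypothetical name, and giving up as soon as the retransmit budget is exhausted is only one possible answer to the open question above, not a decision made by this draft.

```python
class RetransmitTracker:
    """Tracks unacknowledged guaranteed packets and their retransmit counts."""

    def __init__(self, ack_timeout, max_packet_retransmit):
        self.ack_timeout = ack_timeout
        self.max_packet_retransmit = max_packet_retransmit
        self.unacked = {}  # packet_id -> [last_send_time, retransmit_count]

    def sent(self, packet_id, now):
        self.unacked[packet_id] = [now, 0]

    def acked(self, packet_id):
        self.unacked.pop(packet_id, None)

    def due(self, now):
        """Return (packet ids to retransmit, give_up flag)."""
        retransmit = []
        for packet_id, entry in self.unacked.items():
            sent_at, count = entry
            if now - sent_at < self.ack_timeout:
                continue  # still inside the acknowledgement timeout
            if count >= self.max_packet_retransmit:
                return [], True  # assumption: budget exhausted means disconnect
            entry[0], entry[1] = now, count + 1
            retransmit.append(packet_id)
        return retransmit, False
```

The caller would push the returned packets to the front of the outbound queue of their channel.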

Ignoring Duplicates

Duplicate packets have to be ignored and the number of duplicates counted. Duplicates may be caused by a retransmit, a network layer error or an attack.
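A minimal sketch of duplicate detection by packet id. DuplicateFilter is a hypothetical name, and the unbounded set of seen ids is a simplification; a real implementation would need to bound it, for example with a sliding window.

```python
class DuplicateFilter:
    """Remembers seen packet ids and counts the duplicates."""

    def __init__(self):
        self.seen = set()
        self.duplicate_count = 0

    def is_new(self, packet_id):
        if packet_id in self.seen:
            self.duplicate_count += 1  # counted, then ignored by the caller
            return False
        self.seen.add(packet_id)
        return True
```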

Congestion Avoidance

On connection initialization, the upstream (maximum) bandwidth level defaults to a specified default bandwidth level. The remote process transmits its input (message) buffer fill percentage in the packet header. The local process tunes the outbound bandwidth between predefined transmit levels based on packet loss and remote peer input buffer congestion. Quality is measured within a sample time window. If a sample window does not achieve the predefined minimum quality level, a bandwidth level lower than the realized bandwidth is selected. If the quality level stays above the predefined good quality level for N sample windows and the realized bandwidth is 9X% of the current bandwidth level, a higher bandwidth level is selected.
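The level selection can be sketched as below. Everything named here is an assumption for illustration: the class BandwidthController, the parameter names, a quality value between 0 and 1 (the draft does not define how packet loss and buffer fill are combined into it), and utilization_threshold, which stands in for the unspecified "9X%" figure.

```python
class BandwidthController:
    """Steps between predefined transmit levels once per sample window."""

    def __init__(self, levels, default_index, min_quality, good_quality,
                 good_windows_needed, utilization_threshold):
        self.levels = sorted(levels)  # predefined transmit levels, e.g. bytes/s
        self.index = default_index
        self.min_quality = min_quality
        self.good_quality = good_quality
        self.good_windows_needed = good_windows_needed      # "N sample windows"
        self.utilization_threshold = utilization_threshold  # placeholder for 9X%
        self.good_windows = 0

    @property
    def level(self):
        return self.levels[self.index]

    def on_sample_window(self, quality, realized_bandwidth):
        if quality < self.min_quality:
            # Step down to the highest level below the realized bandwidth.
            lower = [i for i, lvl in enumerate(self.levels)
                     if lvl < realized_bandwidth]
            self.index = min(self.index, lower[-1] if lower else 0)
            self.good_windows = 0
        elif (quality > self.good_quality and
              realized_bandwidth >= self.utilization_threshold * self.level):
            self.good_windows += 1
            if self.good_windows >= self.good_windows_needed:
                self.index = min(self.index + 1, len(self.levels) - 1)
                self.good_windows = 0
        else:
            self.good_windows = 0  # the good windows must be consecutive
        return self.level
```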

Guaranteed vs Signal Bandwidth Allocation

Guaranteed and signal channels share a single connection and its bandwidth. The two channel types are sent in separate packets, and the bandwidth allocation is defined in terms of a packet ratio. This is implemented in the send loop by trying to send a series of N guaranteed packets and M signal packets. If there are not enough packets of one type to send, the other type fills the missing slots. The default N/M ratio is defined in this specification. In the future this ratio may be application specific or dynamic.
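One pass of such a send loop might look like this. The function send_series and its arguments are hypothetical; the queues are assumed to hold packets that are ready to send.

```python
from collections import deque


def send_series(guaranteed, signal, n, m):
    """Return the packets for one series of n guaranteed and m signal slots."""
    out = []
    for wanted, other, slots in ((guaranteed, signal, n), (signal, guaranteed, m)):
        for _ in range(slots):
            queue = wanted if wanted else other  # other type fills a missing slot
            if not queue:
                break  # nothing left to send in either queue
            out.append(queue.popleft())
    return out
```

With one guaranteed packet queued and a 2/1 ratio, the second guaranteed slot is filled with a signal packet.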

Keep Alive

If no other packets are sent within half of the activity timeout window (activity_timeout), a keep alive packet is transmitted to avoid an activity timeout.
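Expressed as a check, assuming the sender records the time of its last outbound packet (the function name keep_alive_due is illustrative):

```python
def keep_alive_due(last_send_time, now, activity_timeout):
    """True when half of the activity timeout has passed without sending."""
    return now - last_send_time >= activity_timeout / 2
```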

Connectivity Layer

Sockets

Each peer (process) has one socket, which is used to send and receive all information. Incoming packets are tied to a connection based on remote address, port and connection id.

Connecting

Each peer assigns the connection an upstream id that is unique inside its local process. Together these two stream ids form the connection id. The least significant bits of the connection id are the upstream id defined by the initiating peer.

A connection is initiated by a connection request packet. The connection request is resent N (connect_retry_count) times if a connection response is not received within the expected time window (connect_timeout). The connection request packet does not carry the complete connection id. A connection response is paired to its connection by remote address, port and the upstream id returned in the least significant bits of the response packet's connection id.
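The composition of the connection id can be illustrated as follows. The draft does not state how many bits each stream id occupies; the even 16/16 split of the 32 bit connection_id used here is an assumption, as are the function names.

```python
STREAM_ID_BITS = 16  # assumption: 32 bit connection_id split evenly
STREAM_ID_MASK = (1 << STREAM_ID_BITS) - 1


def make_connection_id(initiator_stream_id, responder_stream_id):
    """Initiator's id goes in the least significant bits."""
    for stream_id in (initiator_stream_id, responder_stream_id):
        if not 0 <= stream_id <= STREAM_ID_MASK:
            raise ValueError("stream id out of range")
    return (responder_stream_id << STREAM_ID_BITS) | initiator_stream_id


def split_connection_id(connection_id):
    """Return (initiator_stream_id, responder_stream_id)."""
    return connection_id & STREAM_ID_MASK, connection_id >> STREAM_ID_BITS
```

The initiator can match a response to its pending request by masking out the low bits of the returned connection id.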

Disconnecting

A disconnect is invoked by sending a disconnect packet. A disconnect can also occur due to an activity timeout (activity_timeout) or a protocol error.

Encoding

Byte Ordering

http://en.wikipedia.org/wiki/Endianness

Little-endian byte ordering is used, except for serializing UUIDs (see below). Open question: should we change to big-endian, as this is the standard wire format (network byte order)?

Signed Fixed Point Representation

Signed values are represented in two's complement.

http://en.wikipedia.org/wiki/Two's_complement

Floating Point Encoding

The IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754) is used for floating point encoding.

http://en.wikipedia.org/wiki/IEEE_754-1985

Universally Unique Identifier Encoding

UUIDs are serialized in big-endian byte order, as per section 4.1.2 of the UUID specification.

http://www.ietf.org/rfc/rfc4122.txt
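The difference between the two byte orders can be seen in a short example using the Python standard library. The helper names are illustrative; uuid.UUID.bytes yields the big-endian field layout.

```python
import struct
import uuid


def encode_uint(value):
    """Encode an unsigned 32 bit integer in little-endian order."""
    return struct.pack("<I", value)


def encode_uuid(value):
    """Encode a UUID as 16 bytes in big-endian field order."""
    return value.bytes
```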

Types

bitmask: 8 bit bitmask.

double: signed 64 bit float

udouble: unsigned 64 bit float

float: signed 32 bit float

ufloat: unsigned 32 bit float

long: signed 64 bit integer

ulong: unsigned 64 bit integer

int: signed 32 bit integer

uint: unsigned 32 bit integer

short: signed 16 bit integer

ushort: unsigned 16 bit integer

byte: unsigned 8 bit integer

uuid: 16 byte Universally Unique Identifier

time: 64 bit timestamp with unix time encoding.

Packet Structure

UDP packet data length is at most 1500 - 48 = 1452 bytes (1500 bytes minus 40 bytes of IPv6 header and 8 bytes of UDP header). The data consists of the packet header followed by as many frames as fit in the remaining bytes.

PACKET

packet header - data

PACKET HEADER

4  : int     : packet_id /* Packet identifier, unique inside connection. */
4  : int     : connection_id /* Connection identifier, unique between two peers. */
1  : bitmask : flags /* Bitmask containing packet type flags and guaranteed flag. */
/* Type bit being set is that of packet type index. */
/* 7th bit is guaranteed flag. 8th bit not used. */
1  : byte    : input_buffer_fill_percentage 
1  : byte    : number of acks
X  : ints    : message indexes acked
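
A sketch of how this header could be packed and parsed with little-endian encoding. Two points are assumptions rather than statements of the draft: bits are counted from 1, so packet type N sets the Nth least significant bit and the guaranteed flag is 0x40, and each acked index is a 4 byte int. The function names are illustrative.

```python
import struct

# packet_id, connection_id, flags, input_buffer_fill_percentage, number of acks
HEADER = struct.Struct("<iiBBB")

GUARANTEED_FLAG = 1 << 6  # assumption: "7th bit" counted from 1


def type_flag(packet_type):
    return 1 << (packet_type - 1)  # assumption: type N sets the Nth bit


def pack_header(packet_id, connection_id, packet_type, guaranteed,
                fill_percentage, acks):
    flags = type_flag(packet_type) | (GUARANTEED_FLAG if guaranteed else 0)
    head = HEADER.pack(packet_id, connection_id, flags, fill_percentage, len(acks))
    return head + struct.pack("<%di" % len(acks), *acks)


def unpack_header(data):
    """Return the header fields, the ack list and the total header length."""
    packet_id, connection_id, flags, fill, count = HEADER.unpack_from(data, 0)
    acks = list(struct.unpack_from("<%di" % count, data, HEADER.size))
    return packet_id, connection_id, flags, fill, acks, HEADER.size + 4 * count
```

The fixed part of the header is 11 bytes; each ack adds 4 bytes.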

Packet Types

CONNECT 1 (not guaranteed)


CONNECT_RESPONSE 2 (not guaranteed)


DISCONNECT 3 (not guaranteed)


TRANSPORT 4 (both guaranteed and not guaranteed)

An empty transport packet is used as a keepalive packet and to transmit acks if no payload data needs to be transmitted.

X  : bytes  : data

Message Frame Structure

Each frame is at most 269 bytes in total (14 byte header plus up to 255 bytes of data).

MESSAGE FRAME

frame header (14 bytes) - frame data (maximum 255 bytes)

FRAME HEADER

4  : int     : channel_index /* Channel identifier. */
4  : int     : message_index /* Rotating message index unique inside channel. */
2  : short   : frame_count /* Number of frames in this message. */
2  : short   : frame_index /* Index of the frame. */
1  : byte    : message_type /* Type of the message. */
1  : byte    : frame_data_size /* Number of bytes in the frame data section. */
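
A sketch of encoding frames and reading them back from a packet data section, assuming little-endian encoding and the field order listed above. The function names are illustrative.

```python
import struct

# channel_index, message_index, frame_count, frame_index, message_type, size
FRAME_HEADER = struct.Struct("<iihhBB")


def pack_frame(channel_index, message_index, frame_count, frame_index,
               message_type, data):
    if len(data) > 255:
        raise ValueError("frame data is limited to 255 bytes")
    return FRAME_HEADER.pack(channel_index, message_index, frame_count,
                             frame_index, message_type, len(data)) + data


def unpack_frames(payload):
    """Split a packet data section into (header fields, data) pairs."""
    frames = []
    offset = 0
    while offset < len(payload):
        header = FRAME_HEADER.unpack_from(payload, offset)
        size = header[5]  # frame_data_size
        start = offset + FRAME_HEADER.size
        frames.append((header[:5], payload[start:start + size]))
        offset = start + size
    return frames
```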

The largest message data size is limited by the maximum frame count and frame data size to roughly 16 Mbytes (65,535 frames of 255 bytes if frame_count is treated as unsigned; about 8 Mbytes if it is signed). Open question: could this protocol be used for transmitting larger amounts of data in the background? If so, another message type can be specified for large file transfers, where each message transmits part of the file.
