CMSC 498P February 13, 1996

V.A. Form input and URL encoding

When a form's "submit" button is pushed, the web client (usually a browser) gathers up the form's name/val pairs, URL encodes them, and passes the URL-encoded data to the server for action. URL encoding is an encoding scheme used to ensure that the form's data does not conflict with the URL (mostly for backwards compatibility).

Each of the form's input fields must have a name (as mentioned earlier). This is assigned by the element's NAME attribute. The client groups the name together with the data (value) into a pair called a name/val pair (*that* was a toughy). These are encoded by separating the name from the val with an equal sign (=), and by separating each name/val pair from its neighbor with an ampersand (&). This would look something like name1=val1&name2=val2&...&namex=valx.
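As a rough illustration of the grouping step only, here is a minimal Python sketch. The function name join_pairs and the sample field names are made up for this example, and it assumes the names and values contain no characters that need escaping.

```python
def join_pairs(pairs):
    """Join (name, val) tuples into name1=val1&name2=val2 form.

    Hypothetical helper for illustration: it performs no escaping,
    so it is only correct for names and values made of plain
    letters and digits.
    """
    # "=" separates a name from its val; "&" separates neighboring pairs.
    return "&".join("{}={}".format(name, val) for name, val in pairs)


form_data = [("name1", "val1"), ("name2", "val2")]
print(join_pairs(form_data))
```

The pairs are kept in a list rather than a dictionary so that the order of the fields in the form is preserved and a name may appear more than once.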

Some characters must be "escaped" or swapped with a different character when sent in a URL. Examples of such characters are ~ / & = + <space> and others. Spaces are swapped with a plus sign (+). Other characters, because they might be used in the URL or in the data grouping, are escaped by converting them to the hexadecimal form of their US-ASCII value, preceded by a percent sign (%). For example, a tilde (~), whose decimal US-ASCII value is 126 (7E in hexadecimal), would be converted to %7E (or %7e, since both lowercase and uppercase letters are allowed).
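The escaping rules can be sketched in a few lines of Python. This is a simplified illustration, not what any particular browser does: the function name escape is made up, the input is assumed to be plain US-ASCII, and every character other than a letter, digit, or space is escaped, which is more aggressive than strictly required. Python's standard urllib.parse.urlencode does the escaping and the name/val grouping in one call and is shown for comparison.

```python
from urllib.parse import urlencode


def escape(text):
    """Escape one name or val (hypothetical helper, US-ASCII input assumed)."""
    out = []
    for byte in text.encode("ascii"):
        ch = chr(byte)
        if ch == " ":
            out.append("+")                      # spaces become plus signs
        elif ch.isalnum():
            out.append(ch)                       # letters and digits pass through
        else:
            out.append("%{:02X}".format(byte))   # everything else becomes %XX
    return "".join(out)


print(escape("~user name"))   # %7Euser+name
print(escape("a&b=c"))        # a%26b%3Dc

# The standard library handles both escaping and grouping:
print(urlencode([("name", "Lone Wolf"), ("q", "a&b")]))   # name=Lone+Wolf&q=a%26b
```

Note that a literal plus sign in the data is itself escaped (as %2B), so the server can tell it apart from a plus sign that stands for a space.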

An in-depth discussion on URL encoding may be found in section 2.2 of RFC 1738 [4].


Authored by LoneWolf (Mosh Teitelbaum).