Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290 or at support@mcp.com.

Notice: This material is excerpted from Running A Perfect Web Site with Apache, ISBN: 0-7897-0745-4. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.

Chapter 12 - HTML Forms

Static Web pages are fine for presenting information, but they don't harness the full capabilities of Web technology. The Web community's craving for interactivity has led to the incorporation of animation, audio, and video clips, and other multimedia items into Web pages. One of the earliest types of interactivity on Web pages was HTML forms - sets of clickable buttons and boxes, text fields, and menus into which the user enters data. The browser then passes the data to a script on a server that processes the data and sends a response back to the browser.

This powerful form of interactivity initially allowed for querying of databases and for soliciting feedback from Web users. As encryption technology improved, forms became a common part of electronic commerce sites where they gave Internet shoppers a secure interface for entering their orders, shipping addresses and credit card numbers. Recent innovative applications of HTML forms include real-time chat and conducting online research.

On the client side of HTML forms are a few basic yet versatile tags that make it easy to create familiar graphical elements for data input and to specify where and how to send the form data.

In this chapter, you learn:

  • The basics of form data flow

  • How to create HTML forms

  • The two methods clients use to pass form data to a server

  • How to query a script without filling out a form

Overview of Form Data Flow

Before learning about the HTML tags used to create forms, it is helpful to have a perspective on the "path" that form data takes as it moves from the browser to the processing script. Once a user enters the data, the browser encodes it and makes a call to the server where the processing script resides. One part of this call indicates which script the server should run and the other part passes the form data.

Once the server has the data and knows which script to run, it starts the script and passes the form data to the script by the Common Gateway Interface (CGI) - a set of specifications that allows clients to make calls to server scripts, regardless of platform. The script processes the data and creates a response to be sent back to the client. Typically this response is an HTML file, but it can also take the form of things like a plain text file or a URL. The response is (in most cases) handed back to the server which then passes it on to the browser for display to the user.


When a script produces an HTML document, it is sometimes referred to as generating HTML "on-the-fly."
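The last leg of this data flow, in which the script turns encoded form data into an HTML response, can be sketched in Python. This is only an illustration under stated assumptions: the function name build_response and the layout of the generated page are hypothetical, and a real script would obtain the encoded data from the server rather than from a literal string.

```python
from html import escape
from urllib.parse import parse_qs


def build_response(query_string: str) -> str:
    """Turn encoded form data into a complete CGI response (header + HTML)."""
    # parse_qs decodes the name=value pairs; each name maps to a list of values.
    fields = parse_qs(query_string)
    rows = "".join(
        f"<LI>{escape(name)} = {escape(', '.join(values))}\n"
        for name, values in fields.items()
    )
    body = (
        "<HTML><BODY><H1>Form data received</H1>\n"
        f"<UL>\n{rows}</UL></BODY></HTML>\n"
    )
    # A CGI response is a header line, a blank line, and then the document.
    return "Content-Type: text/html\r\n\r\n" + body


# Hypothetical form data, already encoded by the browser.
print(build_response("name=Beth+Roberts&number=2025551234"))
```

The script writes the header and the generated page to its standard output, and the server relays them to the browser.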

Form Support in HTML

Like other elements of HTML, forms have a similar appearance in different browsers, but the appearance is not identical. The appearance of a form always matches the graphical environment in which the form is displayed. For example, Windows pull-down menus and check boxes look significantly different than they do in X-Windows. This platform portability is part of the power of forms. Authors of HTML forms don't need to worry about the details of interacting with the user's graphical operating system - the browser handles all the details. This is what allows you to use the same HTML form under Windows, Mac, OS/2, X-Windows, and even in text-mode with Lynx.

HTML's form support is very simple, and yet surprisingly complete. A handful of HTML tags can create the most popular elements of modern graphical interfaces, including text windows, check boxes and radio buttons, pull-down menus, and push buttons. In fact, using HTML forms in conjunction with server scripts is arguably the fastest and simplest way to create cross-platform graphical applications! The only programming required is the script itself, and the programmer can choose the language.

Creating Forms

Composing HTML forms might sound like a complex task, but there are remarkably few tags that you need to master to do it. All form-related tags occur between the <FORM ...> and </FORM> container tags. If you have more than one form in an HTML document, the closing </FORM> tag is essential for distinguishing between the multiple forms.

Each HTML form has three main components: the form header, one or more named input fields, and one or more action buttons.

The Form Header

The form header is really just the <FORM ...> tag and the attributes it contains. The first of these is the ACTION attribute. You set ACTION equal to the URL of the processing script so that the client knows where to send the form data once it is entered. ACTION is a mandatory attribute of the <FORM ...> tag. Without it, the browser has no idea where the form data should go.

The ACTION URL can also contain extra path information at the end of it. The extra path information is passed on to the script so that it can correctly process the data. It's not found anywhere on the form and is therefore transparent to the user. Allowing for the possibility of extra path information, an ACTION URL has the form:

protocol://server/path/script_file/extra_path_info
You can use the extra path information to pass an additional file name or directory information to a script. For example, on some servers the imagemap facility uses extra path information to specify the name of the map file. The name of the map file follows the path to the imagemap script. A sample URL might be:

http://cgi-bin/imagemap/homepage 
The name of the script is imagemap, and homepage is the name of the map file used by imagemap.
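Under CGI, the server hands the extra path information to the script in the PATH_INFO environment variable. The split the server performs can be sketched as follows. The helper name split_extra_path is hypothetical, and the sketch assumes the caller already knows the script's own path; a real server works that out from its configuration.

```python
def split_extra_path(url_path: str, script_path: str) -> tuple[str, str]:
    """Split a URL path into the script path and the extra path information."""
    if url_path != script_path and not url_path.startswith(script_path + "/"):
        raise ValueError(f"{url_path!r} does not start with {script_path!r}")
    # Whatever follows the script's own path is the extra path information.
    return script_path, url_path[len(script_path):]


script, extra = split_extra_path("/cgi-bin/imagemap/homepage", "/cgi-bin/imagemap")
print(script)  # /cgi-bin/imagemap
print(extra)   # /homepage
```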


On many CGI capable servers, you will find the script executable files in the cgi-bin directory. Having a special directory for the executable files helps the server administrator keep ill-intentioned users from getting to portions of the server where they might do serious harm.

The second attribute found in the <FORM ...> tag is the METHOD attribute. METHOD specifies the HTTP method to use when passing the data to the script and can be set to values of GET or POST. When using the GET method, the browser will append the form data to the end of the URL of the processing script. The POST method sends the form data to the server separately from the URL, in the body of the HTTP request. More specific information about the differences between these two methods can be found in the "HTTP Methods" section later in this chapter.

METHOD is not a mandatory attribute of the <FORM ...> tag. In the absence of a specified method, the browser will use the GET method.


Some servers may have operating environment limitations that prevent them from processing a URL that exceeds 1 kilobyte of information. This can be a problem when using the GET method to pass a large amount of form data. Since the GET method appends the data to the end of the processing script URL, you run a greater risk of passing a URL that's too big for the server to handle. If this is a concern on your server, you should use the POST method to pass form data.

A more conservative limit is 256 characters for the whole URL. For example, if the URL including the script name is 40 characters long, the part after the ? or / can be at most 215 characters. There is no fundamental limit, but the specifications are expected to require applications to support only 256-character URLs in order to be considered compliant. If your URLs may be longer than that, use POST.
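One way to apply this rule of thumb is to measure the URL that GET would produce before choosing a method. The sketch below is only an illustration: the helper choose_method and the constant URL_LIMIT are hypothetical names, and the encoding is delegated to Python's standard urllib.parse.urlencode.

```python
from urllib.parse import urlencode

URL_LIMIT = 256  # conservative whole-URL limit discussed above


def choose_method(action_url: str, fields: dict[str, str], limit: int = URL_LIMIT) -> str:
    """Return "GET" if the finished URL fits within limit, otherwise "POST"."""
    # GET appends "?" plus the encoded form data to the ACTION URL.
    full_url = action_url + "?" + urlencode(fields)
    return "GET" if len(full_url) <= limit else "POST"


print(choose_method("http://www.xyz.com/cgi-bin/add.cgi", {"name": "Beth Roberts"}))
print(choose_method("http://www.xyz.com/cgi-bin/suggest.cgi", {"Suggest": "x" * 300}))
```

The first call fits comfortably and reports GET; the second exceeds the limit and reports POST.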

In summary, a form header follows the syntax:

<FORM ACTION="URL" METHOD={GET|POST}>
Following the form header are the tags to create named input fields and action buttons. We discuss these tags next.

Named Input Fields

The named input fields typically comprise the bulk of a form. The fields appear as standard GUI controls such as text boxes, check boxes, radio buttons, and menus. You assign each field a unique name that eventually becomes the variable name used in the processing script.


If you aren't coding your own processing scripts, be sure to sit down with your programmer to agree on variable names. The names used in the form must exactly match those used in coding the script.

You can use several different GUI controls to enter information into forms. The controls for named input fields appear in table 12.1. The TYPE="FILE" control allows you to create forms that ask for files as input. This control is a Netscape extension to standard HTML and is only supported by the Netscape Navigator browser.

Table 12.1 Types of Named Input Fields

Field Type      HTML Tag
Text Box        <INPUT TYPE="TEXT" ...>
Password Box    <INPUT TYPE="PASSWORD" ...>
Check Box       <INPUT TYPE="CHECKBOX" ...>
Radio Button    <INPUT TYPE="RADIO" ...>
Hidden Field    <INPUT TYPE="HIDDEN" ...>
File            <INPUT TYPE="FILE" ...>
Text Window     <TEXTAREA ...> ... </TEXTAREA>
Menu            <SELECT ...> ... <OPTION> ... </SELECT>


Even though it is a text-only browser, Lynx emulates GUI elements to achieve complete support for forms.

The <INPUT ...> Tag

You may have noticed in table 12.1 that the versatile <INPUT ...> tag, together with the appropriate TYPE attribute, is used to produce most of the named input fields available to form designers. The following sections discuss each of the different possible TYPE attributes in greater detail.

Text and Password Fields

Text and password fields are simple data entry fields. The only difference between them is that text typed into a password field appears on-screen as asterisks (*).


Using a password field may protect users' passwords from the people looking over their shoulders, but it does not protect the password as it travels over the Internet. To protect password data as it moves from browser to server, you need to use some type of encryption or similar security measure.

To learn more about encryption and other security issues, see Chapter 6, "Managing an Internet Web Server."

The most general text or password field is produced by the following HTML (attributes in square brackets are optional):

<INPUT TYPE="{TEXT|PASSWORD}" NAME="Name"
    [VALUE="default_text"] [SIZE="width"]
    [MAXLENGTH="max_length"]>
The NAME attribute is mandatory as it provides a unique identifier for the data entered into the field.

The optional VALUE attribute allows you to place some default text in the field, rather than having it initially appear blank. This is useful if there is a certain text string that the majority of users will enter into the field. In such cases, you can use VALUE to put the text into the field, thereby saving most users the effort of typing it.

The optional SIZE attribute gives you control over how many characters wide the field should be. The default SIZE is 20 characters. MAXLENGTH is also optional and allows you to specify the maximum number of characters that can be entered into the field.


Previously, the SIZE attribute took the form SIZE="width,height", where setting a height other than 1 produced a multiline field. With the advent of the <TEXTAREA ...> ... </TEXTAREA> tag pair for creating multiline text windows, height has become something of a vestige and is ignored by most browsers.

A simple application of text and password fields would be to provide a user login interface. For example, listing 12.1 produces the screen in figure 12.1.

Listing 12.1 HTML for Figure 12.1

<HTML>
<HEAD>
<TITLE>XYZ Corporation Login Screen</TITLE>
</HEAD>
<BODY>
<H1>Welcome to XYZ's Computer System!</H1>
<HR><P>
Please enter your ID and password.<P>
<FORM ACTION="http://www.xyz.com/cgi-bin/login.cgi" METHOD="POST">
ID:  <INPUT TYPE="TEXT" NAME="Username" SIZE="10" MAXLENGTH="10"><P>
Password:  <INPUT TYPE="PASSWORD" NAME="Password" 
    SIZE="10" MAXLENGTH="10"><P>
<INPUT TYPE="SUBMIT" VALUE="Log On">
</FORM>
</BODY>
</HTML>
The "Log On" button you see is an action button. Action buttons are described more fully later in the chapter.

Fig. 12.1 - You can use text and password boxes to produce a login facility like this one.


The <INPUT ...> tag and other tags that produce named input fields just create the fields themselves. It's up to you as the form designer to include some descriptive text next to each field so that users know what information to enter.


Because browsers ignore white space, it's difficult to line up the left edges of text input boxes on multiple lines when the labels to the left of the boxes are of different lengths. One solution is to put label text to the right of input boxes. Another solution is to set up the text labels and input fields as cells in the same row of an HTML table.

Check Boxes

You can use check boxes to provide users with several choices, from which they may select as many as they want. An <INPUT ...> tag to produce a check box option has the syntax:

<INPUT TYPE="CHECKBOX" NAME="Name" VALUE="Value" [CHECKED]>
Each check box option is created by its own <INPUT ...> tag and must have its own unique NAME. If you give multiple check box options the same NAME, there will be no way for the script to determine which choices the user actually made.

Check boxes only show up in the form data sent to the server if they are selected. Check boxes that are not selected do not appear. For check boxes that are selected, the VALUE attribute specifies what data is sent to the server. This information is transparent to the user. The optional CHECKED attribute will preselect a commonly selected check box when the form is rendered on the browser screen.

Figure 12.2 shows an expanded version of the login screen in figure 12.1. The expanded screen has several options that the user can specify at login time. Because people generally want to check their electronic mail, the Check for New Messages option is preselected. The HTML to produce the check box options in figure 12.2 is:

<B>Login Options -</B> Select any, all or none of the following:<P>
<INPUT TYPE="CHECKBOX" NAME="Suppress" 
    VALUE="Yes">Suppress greeting screen<P>
<INPUT TYPE="CHECKBOX" NAME="Email" 
    VALUE="Yes" CHECKED>Check for new messages<P>
<INPUT TYPE="CHECKBOX" NAME="Schedule" 
    VALUE="Yes">Display today's schedule<P>
Fig. 12.2 - Check boxes like this one give users many options from which they can choose as many as they like.
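Because unselected check boxes are left out of the form data entirely, a processing script has to treat a missing name as "not selected." The following sketch shows one way to do that for the three login options above. The function name checkbox_states is hypothetical, and the sample data string is made up to resemble what a browser would send.

```python
from urllib.parse import parse_qs

# Check box names taken from the login options form.
CHECKBOX_NAMES = ("Suppress", "Email", "Schedule")


def checkbox_states(query_string: str) -> dict[str, bool]:
    """Map each known check box name to True if it was selected."""
    submitted = parse_qs(query_string)
    # A box that was not selected is simply absent from the submitted data.
    return {name: name in submitted for name in CHECKBOX_NAMES}


# The user left "Suppress greeting screen" unselected.
print(checkbox_states("Email=Yes&Schedule=Yes"))
```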

Radio Buttons

Radio buttons are used to present users with several options from which they may select one and only one option. When you set up options in a radio button format, make sure that the options are mutually exclusive so that a user won't try to select more than one.

The HTML to produce a set of three radio button options is:

<INPUT TYPE="RADIO" NAME="Name" 
    VALUE="VALUE1" [CHECKED]>Option 1<P>
<INPUT TYPE="RADIO" NAME="Name" 
    VALUE="VALUE2">Option 2<P>
<INPUT TYPE="RADIO" NAME="Name" 
    VALUE="VALUE3">Option 3<P>
The VALUE and CHECKED attributes work exactly the same as they do for check boxes, though you should only have one preselected radio button option. A fundamental difference with a set of radio button options is that they all have the same NAME. This is permissible because the user can only select one of the options.

Figure 12.3 shows one more extension to our login screen, giving the user the choice of a UNIX shell or to load X-Windows at login. The HTML to produce the radio buttons is:

<INPUT TYPE="RADIO" NAME="X_WIN" 
    VALUE="NO" CHECKED>UNIX Shell&nbsp;&nbsp;
<INPUT TYPE="RADIO" NAME="X_WIN" 
    VALUE="YES">Start X-Windows<P>
Fig. 12.3 - Radio buttons present users with multiple options from which they may select one and only one option.

Note that the two radio button options are side-by-side in figure 12.3, separated by nonbreaking space. This is fine for radio button or check box options described with a small amount of text.

Hidden Fields

Technically, hidden fields are not meant for data input. However, you can send information to the server about a form without displaying that information anywhere on the form itself. The general format for including hidden fields is:

<INPUT TYPE="HIDDEN" NAME="name" VALUE="value">
One possible use of hidden fields is to allow a single general script to process data from several different forms. The script needs to know which form is sending the data, and a hidden field can provide this information without requiring anything on the part of the user. For example, all forms processed by the script can have a hidden name of FormID and hidden values of Sales, Order, Followup, NewUser, and so on.

A closely related use of hidden fields is to use a generic script to process several forms that vary only in one or two fields. For example, a generic script to send comments via e-mail might use a hidden field to specify the e-mail address. This way, the user doesn't have to type an address or even know where the mail is going, but because the form contains the address information in a hidden field, a single script can still be used to send automated feedback to several different e-mail addresses.


In general, anything you can do with hidden fields, you can do by specifying extra path information in the form's ACTION attribute. However, hidden fields appear as regular data items in a form and may therefore be easier to process, especially if there are multiple hidden items.

A third possible use of hidden fields is to embed state information into forms generated on-the-fly. For example, a form that is generated in response to a previous form can contain the original contents of the first form in a hidden field. This way, when the data from the second form is sent, the data from the first form is sent, too, and the processing script has a complete history of the necessary information. This can be useful in a search for returning preliminary results to the user while still maintaining a record of the original query.
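A script that generates the follow-up form on-the-fly could emit the earlier data as hidden fields along these lines. This is a sketch only: the function name hidden_fields and the sample field names are hypothetical. The values are escaped so that quotation marks in the user's earlier input cannot break the generated tags.

```python
from html import escape


def hidden_fields(previous: dict[str, str]) -> str:
    """Render earlier form data as hidden <INPUT> tags for the next form."""
    return "\n".join(
        f'<INPUT TYPE="HIDDEN" NAME="{escape(name)}" VALUE="{escape(value)}">'
        for name, value in previous.items()
    )


# Carry the original search query and form identifier into the second form.
print(hidden_fields({"FormID": "Search", "query": 'say "hi"'}))
```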

Files

The Netscape Navigator supports an extension to the <INPUT ...> tag that allows a file to be specified as input. To accomplish this, you need to do two things. The first is to add the ENCTYPE attribute to the form header to let the browser know that it will be sending a file. The modified form header looks like:

<FORM ACTION="URL" METHOD="POST" ENCTYPE="multipart/form-data">
The second change is to set the TYPE attribute in the <INPUT ...> tag to FILE:

Enter file name:<INPUT TYPE="FILE" NAME="filename">

Because the browser will be transferring an entire file as part of the form data, use the POST method for this type of form so that you don't run the risk of creating too large a URL for the server to process.

The <TEXTAREA ...> ... </TEXTAREA> Tag Pair

Text and password boxes are used for simple, one-line input fields. You can create multiline text windows that function in much the same way by using the <TEXTAREA ...> and </TEXTAREA> container tags. The HTML syntax for a text window is:

<TEXTAREA NAME="Name" [ROWS="rows"] [COLS="columns"]>
Default_window_text
</TEXTAREA>
The NAME attribute gives the text window a unique identifier just as it did with the variations on the <INPUT ...> tag. The optional ROWS and COLS attributes allow you to specify the dimensions of the text window as it appears on the browser screen. The default number of rows and columns varies by browser. In Netscape Navigator the defaults are 1 row and 20 columns, while in Microsoft Internet Explorer they are 3 rows and 30 columns.

Multiline text windows are ideal for entry of long pieces of text such as feedback comments or e-mail messages. Figure 12.4 shows a text window being used as an online suggestion box. The corresponding htmL is shown in listing 12.2.

Listing 12.2 HTML for Figure 12.4

<HTML>
<HEAD>
<TITLE>XYZ Corporation Suggestion Box</TITLE>
</HEAD>
<BODY>
<H1>XYZ Corporation Suggestion Box</H1>
<HR><P>
<FORM ACTION="http://www.xyz.com/cgi-bin/suggest.cgi" 
    METHOD="POST">
<TEXTAREA NAME="Suggest" ROWS="10" COLS="60">
Enter your suggestions here.
</TEXTAREA>
<P>
<INPUT TYPE="SUBMIT" VALUE="Submit Suggestion">
</FORM>
</BODY>
</HTML>
Fig. 12.4 - Multiline text windows permit entry of larger amounts of text.

The <SELECT ...> ... </SELECT> Tag Pair

The final technique for creating a named input field is to use the <SELECT ...> and </SELECT> container tags to produce pull-down or scrollable menus of options. Listing 12.3 shows the HTML used to create a general menu.

Listing 12.3 HTML That Produces a General Menu

<SELECT NAME="Name" [SIZE="size"] [MULTIPLE]>
<OPTION [SELECTED]>Option 1
<OPTION [SELECTED]>Option 2
<OPTION [SELECTED]>Option 3
...
<OPTION [SELECTED]>Option n
</SELECT>
In the <SELECT ...> tag, the NAME attribute again gives the input field a unique identifier. The optional SIZE attribute lets you specify how many options should be displayed when the menu is rendered on the browser screen. If there are more options than there is space for displaying them, they will be available either by a pull-down window or by scrolling through the window with scroll bars. The default SIZE is 1. If you want to let users choose more than one menu option, you can include the MULTIPLE attribute. When MULTIPLE is specified, users can choose multiple options by holding down the Control key and by using the mouse to click on the options they want.


If you specify the MULTIPLE attribute and SIZE=1, a one-line scrollable list box is displayed instead of a drop-down list box. This is because you can only select one item (not multiple items) in a drop-down list box.

Each option in the menu is specified with its own <OPTION ...> tag. If you want an option to be pre-selected, you can include the SELECTED attribute in the appropriate <OPTION ...> tag.

Figure 12.5 shows the menu produced by listing 12.4.

Listing 12.4 HTML for the Menu in Figure 12.5

<HTML>
<HEAD>
<TITLE>XYZ Corporation Report Generator</TITLE>
</HEAD>
<BODY>
<H1>XYZ Corporation Report Generator</H1>
<HR><P>
Select the reports you want to generate.<P>
To select multiple options, hold down the Control key 
while clicking the mouse.<P>
<FORM ACTION="http://www.xyz.com/cgi-bin/reports.cgi" METHOD="POST">
<SELECT NAME="Reports" SIZE="4" MULTIPLE>
<OPTION SELECTED>Bi-weekly Payroll
<OPTION>Accounts Payable
<OPTION>Accounts Receivable
<OPTION>YTD Revenue
<OPTION>YTD Expense
<OPTION>YTD Profit and Loss
<OPTION>Balance Sheet
</SELECT>
<P><INPUT TYPE="SUBMIT" VALUE="Submit Report Request">
</FORM>
</BODY>
</HTML>
Fig. 12.5 - Scrollable menu boxes allow you to pack several options into a compact space.

You may have noticed that there are no VALUE attributes on the <SELECT ...> or <OPTION ...> tags in these listings. When an <OPTION ...> tag has no VALUE attribute, the value passed to the server is the text item that appears after that tag.
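When several options are chosen from a menu, the browser sends one name=value pair per selected option, all under the menu's single NAME. A script can collect them as a list. The sketch below assumes two reports were selected in the Reports menu; the helper name selected_options is hypothetical.

```python
from urllib.parse import parse_qs


def selected_options(query_string: str, menu_name: str) -> list[str]:
    """Return every option chosen in the named menu, in the order sent."""
    # parse_qs gathers repeated names into a single list of values.
    return parse_qs(query_string).get(menu_name, [])


data = "Reports=Bi-weekly+Payroll&Reports=YTD+Revenue"
print(selected_options(data, "Reports"))  # ['Bi-weekly Payroll', 'YTD Revenue']
```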


You can replace radio buttons with pull-down menus to save space on-screen. Including the MULTIPLE option in a <SELECT ...> tag allows menus to replace check boxes, as well.

Action Buttons

The handy <INPUT ...> tag returns to provide an easy way of creating the form action buttons you have seen in the preceding figures. Buttons may be of two types: submit and reset. Pressing a submit button instructs the browser to package the form data and send it to the server. Pressing a reset button clears out any data entered into the form and sets all the named input fields back to their default values.

Any form you compose should have a submit button so that users can submit the data they enter. The one exception to this rule is a form containing only one input field. For such a form, pressing Enter automatically submits the data. Reset buttons are technically not necessary, but are usually provided as a user courtesy.

To create submit or reset buttons, you use the <INPUT ...> tags:

<INPUT TYPE="SUBMIT" VALUE="Submit Data">
<INPUT TYPE="RESET" VALUE="Clear Data">
The VALUE attribute is used to specify the text that appears on the button. You should set VALUE to a text string that concisely describes the function of the button. If VALUE is not specified, the button text will read "Submit Query" for submit buttons and "Reset" for reset buttons.


Normally, forms include only one submit button. In some cases, however, you may want to include multiple buttons that take different actions. You can achieve this by naming submit buttons with a NAME attribute so that the NAME and VALUE of the button pressed show up in the query string. However, this capability is not yet part of standard HTML and is not supported by many browsers.

A One-Field Form: The <ISINDEX> Tag

There is an exception to the rule about forms having headers, input fields and action buttons. You can use the <ISINDEX> tag to create a single field form. No other tags are required. <ISINDEX> fields are used to allow a user to enter search criteria for queries against Gopher servers or database scripts. For example, you may be maintaining a directory of employees where you work that is searchable by a person's last name. You can use an <ISINDEX> field as a front-end to search the directory. Figure 12.6 shows such a field. The user would enter the last name to search on and press Enter to initiate the search.

Fig. 12.6 - The <ISINDEX> tag creates a single field form that can be used for entry of search criteria

You may be wondering where the data entered into an <ISINDEX> field goes. After all, there's no <FORM ...> tag with an ACTION specified. How does the client know which URL to send the data to? The answer is that it sends the data to the URL of the page containing the <ISINDEX> field. This requires one of two things: (1) that the page be created by some sort of a script, since a static HTML page could not receive and process the data or (2) that the <ISINDEX> field be part of a Gopher document, since Gopher servers are configured to process such queries.

Chapter 13, "CGI Scripts, and Server APIs," contains more details on how to create scripts that return HTML documents.

Note in figure 12.6 that the <ISINDEX> field is preceded by the default text "This is a searchable index. Enter search keywords:". The Netscape Navigator supports a PROMPT attribute of the <ISINDEX> tag that allows you to alter this default and make the text in front of the field more descriptive. For example, the following HTML produces the page shown in figure 12.7.

<BODY>
<H1> XYZ Company Employee Directory</H1>
<ISINDEX PROMPT="Enter the last name to search by:">
</BODY>
Fig. 12.7 - Netscape lets you customize the text in front of an <ISINDEX> field.

Passing Form Data

Once the user clicks the submit button after filling out a form or presses Enter after specifying an <ISINDEX> query value, the form data or query is packaged by the browser for transmission to the server. There are two key aspects to this transmission: the HTTP method used to transmit the data and the URL encoding used to format the data. All form data or query information gets encoded, regardless of which HTTP method is used. Once you understand how the encoding works, it becomes easy to make calls to the same script with the same data later on.

HTTP Methods

In the earlier discussion of form headers, you learned that there are two HTTP methods by which form data can be passed to the server. The GET method attaches form or query data onto the end of the URL of the processing script. The POST method sends form data to the server in the body of the HTTP request rather than in the URL. If you don't specify the METHOD attribute in the <FORM ...> tag, the client will use the GET method by default.


Query data entered into an <ISINDEX> field is always sent by the GET method.

The GET Method

The default GET method appends data onto the end of the URL specified in the ACTION attribute in the case of forms and onto the end of the URL of the page containing the <ISINDEX> field in the case of an <ISINDEX> query. A URL created by the GET method has the form:

protocol://server/path/filename/extra_path_info?query_string
where the query string is the form data or <ISINDEX> query data formatted as described in the "URL Encoding" section below.
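The pieces of such a URL can be pulled apart with Python's standard urllib.parse.urlsplit, as in the sketch below. The sample URL and the helper name split_get_url are hypothetical. Note that the URL alone does not reveal where the file name ends and the extra path information begins; only the server, which knows where its scripts live, can make that split.

```python
from urllib.parse import urlsplit


def split_get_url(url: str) -> dict[str, str]:
    """Break a GET URL into the components named in the general form above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.netloc,
        "path": parts.path,          # path/filename/extra_path_info together
        "query_string": parts.query,  # everything after the "?"
    }


print(split_get_url("http://www.xyz.com/cgi-bin/lookup.cgi/staff?Adams"))
```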


If your form contains several input fields, it's possible that your query string will grow too large for the server to process (more than 1 kilobyte of data). If this is a concern, use the POST method.

Since <ISINDEX> query information is generally short, use of the GET method is not a problem.

The POST Method

The POST method sends the form data to the server in the body of the HTTP request rather than in the URL, and the server passes it to the processing script on the script's standard input. Because the data does not travel in the URL, you no longer need to be concerned about a URL becoming too long for the server to process. Even though it is sent separately from the URL, the data is still encoded in the same way as data sent by the GET method.
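From the script's point of view, the only difference between the two methods is where the encoded data arrives. Under CGI, the server sets the REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH environment variables, and POST data is read from standard input. The sketch below takes the environment and input stream as parameters so it can be tried without a server; the function name read_form_data and the sample field values are hypothetical.

```python
import io
from typing import BinaryIO, Mapping
from urllib.parse import parse_qs


def read_form_data(environ: Mapping[str, str], stdin: BinaryIO) -> dict[str, list[str]]:
    """Return decoded form fields regardless of the HTTP method used."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: read exactly CONTENT_LENGTH bytes of encoded data from stdin.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: the encoded data is the part of the URL after the "?".
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulate a POST from a login form (field values are made up).
body = b"Username=beth&Password=secret"
fields = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))},
    io.BytesIO(body),
)
print(fields["Username"])
```

In a real CGI script, the two arguments would be os.environ and sys.stdin.buffer.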


The HTTP protocol supports the ability to POST data files of any type from browser to server, even outside of a script context. However, this capability is not yet widely supported by browsers or servers. Form data is of MIME type application/x-www-form-urlencoded.

URL Encoding

The browser must somehow convert all data represented graphically in a form to a string of text it can send to the server. This involves both packaging the form data and then formatting it for proper transmission to the server.

Packaging Form Data

The form designer assigns each field, or graphical control, a unique name using the NAME attribute, except for submit and reset buttons. This naming provides a way for the server to associate data with where it came from.

The browser translates the entire contents of a form into a single text string using the following format:

name1=value1&name2=value2&name3=value3...
where name1 is the name of the first form variable and value1 is the value of that variable as entered by the user. For example, the URL from a form that adds names and phone numbers to the employee directory described earlier might look like this:

http://www.xyz.com/
    cgi-bin/add.cgi?name=Beth+Roberts&number=2025551234
Because you give each field in a form a unique name, the processing script can figure out what each form entry represents. The type of graphical control in which a user enters a value is not specified directly in the information sent to the server. This lack of specification is okay, though, because the processing script can use each field's unique name to figure out the control type if necessary.
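The packaging step the browser performs can be reproduced with Python's standard urllib.parse.urlencode. The sketch below rebuilds the employee directory URL shown above; the helper name package_form is hypothetical.

```python
from urllib.parse import urlencode


def package_form(action_url: str, fields: list[tuple[str, str]]) -> str:
    """Build the URL a browser would request for a GET form submission."""
    # urlencode joins the pairs as name1=value1&name2=value2...
    return action_url + "?" + urlencode(fields)


url = package_form(
    "http://www.xyz.com/cgi-bin/add.cgi",
    [("name", "Beth Roberts"), ("number", "2025551234")],
)
print(url)
```

A list of pairs is used instead of a dictionary so that the fields keep the order in which they appear on the form.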


Query data entered into an <ISINDEX> field is simply added on to the end of the URL. For example, a query to the employee phone directory might take the form:

http://www.xyz.com/cgi-bin/lookup.cgi?Adams


Since there is only one input field, it is impossible to confuse the meaning of the query data and there is no need to use a name=value packaging approach.
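A script that serves both an <ISINDEX> field and ordinary forms can tell the two kinds of query apart by checking for an equals sign. The following is a sketch under that assumption. The function name parse_query is hypothetical, and splitting a multiword query on plus signs relies on the space encoding covered under "Formatting Rules."

```python
from urllib.parse import parse_qs, unquote


def parse_query(query_string: str):
    """Return a keyword list for an <ISINDEX> query, else a dict of form fields."""
    if "=" not in query_string:
        # No name=value packaging: treat the data as search keywords.
        return [unquote(word) for word in query_string.split("+") if word]
    return parse_qs(query_string)


print(parse_query("Adams"))                  # ['Adams']
print(parse_query("name=Beth+Roberts"))      # {'name': ['Beth Roberts']}
```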

Formatting Rules

Most operating environments interpret spaces in character strings as some type of delimiter indicating the start of a new field, a new parameter, and so on. Consequently, you must remove spaces from all form data and queries in order to ensure that the data is successfully received by the processing script. By convention, all spaces become plus signs (+).

This replacement presents a minor problem. What if a query itself contains a plus sign? Or what if form data contains an equals sign or ampersand, both of which are used to package the form data? There must be some way to distinguish between those characters inside form data versus those characters used to package the data. Consequently, when these characters appear in the form data itself, they are escaped by converting them into their hexadecimal ASCII representations, beginning with a percent sign (%). For example, the string "#$%" is converted to "%23%24%25". In hexadecimal ASCII, 23 represents the pound sign (#), 24 represents the dollar sign ($), and 25 represents the percent sign. For programming convenience, most nonalphanumeric characters are represented in hexadecimal ASCII notation.


The exact range of characters represented in hexadecimal ASCII is not important because the decoding operation converts every character sequence beginning with a percent sign back into the character it encodes. Even if letters and numbers in a query string were encoded this way, they would still be decoded properly.
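
The decoding operation can be sketched in a few lines of Python. This is only an illustration under simple assumptions (ASCII data, and a hypothetical helper named decode_value); it is not the code of any particular server or script.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def decode_value(encoded):
    """Turn '+' back into a space and '%XX' back into the character it encodes."""
    decoded = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        hex_pair = encoded[i + 1:i + 3]
        if ch == "+":
            decoded.append(" ")
            i += 1
        elif ch == "%" and len(hex_pair) == 2 and all(c in HEX_DIGITS for c in hex_pair):
            # Two hexadecimal digits follow the percent sign: convert them.
            decoded.append(chr(int(hex_pair, 16)))
            i += 3
        else:
            # Ordinary character (or a stray '%'): copy it unchanged.
            decoded.append(ch)
            i += 1
    return "".join(decoded)


print(decode_value("%23%24%25"))     # #$%
print(decode_value("Beth+Roberts"))  # Beth Roberts
print(decode_value("%41dams"))       # Adams
```

The last call shows that a letter encoded as "%41" still decodes to "A", so the decoder does not need to know which characters the sender chose to escape. Python's standard library offers urllib.parse.unquote_plus for the same job.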

In summary, for both form data and <ISINDEX> queries, any data inside the form or query itself is converted according to the following rules:

  • All spaces in the data are converted to plus signs (+).

  • Other nonalphanumeric characters are represented by their hexadecimal ASCII equivalents, preceded by a percent sign (%).
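
The two rules above can be expressed as a short Python sketch. The helper name encode_value is hypothetical, and the sketch assumes the data is plain ASCII and escapes every nonalphanumeric character, which is stricter than most browsers.

```python
import string

ALPHANUMERIC = set(string.ascii_letters + string.digits)


def encode_value(text):
    """Encode one form value: spaces become '+', other symbols become '%XX'."""
    parts = []
    # Encoding to ASCII first makes non-ASCII input fail loudly
    # instead of producing a malformed escape sequence.
    for byte in text.encode("ascii"):
        ch = chr(byte)
        if ch == " ":
            parts.append("+")
        elif ch in ALPHANUMERIC:
            parts.append(ch)
        else:
            parts.append("%{:02X}".format(byte))
    return "".join(parts)


print(encode_value("Beth Roberts"))  # Beth+Roberts
print(encode_value("#$%"))           # %23%24%25
print(encode_value("1+1=2"))         # 1%2B1%3D2
```

Python's urllib.parse.quote_plus follows the same idea but leaves a few punctuation characters, such as the hyphen and underscore, unescaped.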

Storing Encoded URLs

As you have seen in the previous discussion of URL encoding, packaging form or query data into a single text string follows a few simple formatting rules. Consequently, it is possible to "fake" a script into believing that it is receiving form or query data without using a form. To do this, you simply send the URL that would be constructed if a form were used. This may be useful if you frequently run a script with the same data set.

For example, suppose you frequently search the Web index Yahoo for new documents related to the scripting language JavaScript. If you are interested in checking for new documents several times a day, you could fill out the Yahoo search query each time. A more efficient way, however, is to store the query URL in your browser's hotlist or bookmark list. Each time you select that item on the hotlist, a new query is generated as if you had filled out the form. The query URL stored in the hotlist would look like:

http://search.yahoo.com/bin/search?p=JavaScript
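
If you would rather generate such a URL than type it by hand, a minimal Python sketch along these lines would do it. The helper name build_query_url is hypothetical; the script address and the field name p are taken from the example above.

```python
from urllib.parse import urlencode


def build_query_url(script_url, fields):
    """Append encoded name=value pairs to a script URL, as a form would."""
    # urlencode joins the pairs with "&" and applies the "+" and "%XX" rules.
    return script_url + "?" + urlencode(fields)


print(build_query_url("http://search.yahoo.com/bin/search", {"p": "JavaScript"}))
# http://search.yahoo.com/bin/search?p=JavaScript
```

The resulting string can be stored in a hotlist or bookmark list like any other URL.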

Innovative Uses of Forms

Forms have come a long way from being just front ends for search facilities and directory updates. Today, forms are used to conduct electronic commerce, to research who is using the Web, and to have fun! Here are some examples:

Electronic Commerce: The Nashville Country Store

The Nashville Country Store at http://www.countrystore.com/ is an example of doing business over the Web, or electronic commerce. As technology to keep Internet transactions secure has emerged, Web users have developed more confidence in buying merchandise online. Figure 12.8 shows a screen from the Country Store that uses check boxes, text fields, and pull-down menus to create a shopping interface for visitors. Hidden fields are used to keep track of a shopper's purchases as he or she browses through the store.

Fig. 12.8 - HTML forms support electronic commerce by providing users with an interface to facilitate their shopping.

Research: Georgia Tech GVU Web Surveys

The Graphics, Visualization, and Usability Center at Georgia Tech has conducted four World Wide Web user surveys since January 1994. Online forms collect data on user demographics and the results are made publicly available on the Web. The last survey occurred in October 1995, with the next one planned for April 1996. You can visit the GVU Center's Web site at http://www.cc.gatech.edu/gvu/user_surveys/.

Fun: The Dreaded Matching Question '95

In a play on the popular testing format, a faculty member at Tulane University has put a 26-item matching question online. The question tests knowledge of popular culture and is taken from actual exams given in the course Management of Promotion. Figure 12.9 shows the first few items in the question. The boxes next to the numbered items in the first column are text boxes into which you type the letters of the matching item from the second column. You can take the test at http://129.81.234.19/courses/dmq95.htm.

Fig. 12.9 - HTML forms allowed one faculty member to replicate a matching question online.

