Bill Fugate, July, 2000
Web pages are written in a language called HTML (Hypertext Markup Language). Most web pages are created with web publishing tools like Microsoft FrontPage, which let you compose a web page with design tools and then click a button to have the appropriate HTML code written for you. Such tools are convenient but not necessary, and I think it is fun to see how HTML itself works. This document provides a brief overview of the language. Reading it and performing the exercises takes less than an hour and requires no special software other than what you already have on your PC. When you are finished, you will have:
Preparation

Begin by opening Windows File Explorer and creating a temporary work directory to hold the files you will be creating. (You do this so you don't accidentally interfere with anything else on your PC.) This overview presumes that you are storing files in a directory called c:\temp, but you can call it anything you wish.

Next, you need an easy way to edit the HTML files. The odds are high that you have a web browser on your PC called Internet Explorer. If so, simply open Internet Explorer and click on Tools, Internet Options, Programs. Under "HTML editor," choose Windows Notepad and click OK. Now whenever you are in File Explorer, you can right-click on an HTML file and choose Edit, which will open the file in Notepad. (Do you use a web publishing program like FrontPage? If so, you will probably want to change the HTML editor setting back to its original value when you are finished with this exercise.) Note that using Internet Explorer to change this setting does not mean that you must use that browser for the remainder of this exercise; for that, you can use Netscape or any other browser.

(In the unlikely event that you don't have Internet Explorer, there is a more tedious
way to accomplish the same thing. First, make sure that file extensions are being displayed
in Windows File Explorer. If not, click on View, Options while within Windows File
Explorer. Go to the View tab and uncheck the "Hide MS-DOS file extensions" box.
Then you need to make sure that your PC is set up so that *.txt files are associated with the
Notepad text editor and *.htm files are associated with your browser. This allows you to
open these two types of files by double-clicking on them. These associations are
probably already in place on your PC. If not, you can create them by using File Explorer
and clicking on View, Options. Click the File Types tab and find "Text Document" and
"Netscape Hypertext Document" [if you are using the Netscape browser; if you are using
the Internet Explorer browser, just look for a similar name for it]. If necessary, you
can open Notepad manually by clicking on Start, Programs, Accessories.)
HTML

HTML files are ASCII text files that contain special tags. You can create HTML files with any ASCII editor, such as Notepad. You can also create them with general editors such as Word; see that editor's Help system for details.

Your web browser can read any text file, even if it doesn’t contain HTML tags. To demonstrate this, create a file called C:\temp\temp1.txt and type this text into it.

After saving the file, you can display it in your browser by entering this URL and pressing Return: file:///C:/temp/temp1.txt
Tags

HTML tags tell the browser how to display the file contents. To create a simple HTML file, use Notepad to edit temp1.txt, which you created above, and change the text as follows:

    This is an <b>HTML</b> file.

Next, rename the file to temp1.htm. Change the URL in the browser to point to temp1.htm instead of temp1.txt, and press Return. Now the browser displays "This is an HTML file." By renaming the file to *.htm, you told the browser to process HTML tags, which are delimited by "<" and ">". The <b> tag tells the browser to begin displaying text in bold font, and the </b> tag tells it to stop. Most HTML tags appear in pairs, with the ending tag identical to the beginning tag except that it is prefixed with a "/". It doesn’t matter whether you use upper or lower case letters within HTML tags; <B> works the same as <b>.

Note that the browser does not display anything between the "<" and ">" symbols. More accurately, if the browser does display the contents of an HTML tag, there is a syntax error somewhere in the file that needs to be corrected.

For something fancier, rename the file back to temp1.txt and edit it with Notepad. (The general procedure throughout this exercise is to rename a file to something.txt to edit it, and then rename it to something.htm to display it in the browser.) Change the contents to read as follows:

Rename the file to temp1.htm, and display it in the browser. To display
it in the browser, probably all you need to do is to double-click the temp1.htm
file. If that doesn’t work, type the URL of the file into your browser.
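The "fancier" version of temp1.htm might look something like the sketch below. It is a hedged illustration, not the original sample: the heading (<h1>), paragraph (<p>), and italic (<i>) tags are assumptions chosen to show a few more paired tags alongside <b>.

```html
<!-- Illustrative contents for temp1.htm; the tags chosen here are assumptions -->
<h1>My First Page</h1>
<p>This is an <b>HTML</b> file with a heading.</p>
<p>Tags can be nested: <b>bold and <i>bold italic</i></b> text.</p>
```

Each beginning tag has a matching ending tag, and nested tags are closed in the reverse of the order in which they were opened.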
Links

To illustrate an HTML link, use Notepad to create a second text file called temp2.txt. Type this text into the file:

You have now created a web of two connected HTML files.
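A link is written with the anchor tag, <a>, whose href attribute names the file to open. As a hedged sketch (the wording inside the file is an assumption, not the original sample), temp2.txt could contain a link back to the first file:

```html
<!-- Hypothetical contents for temp2.txt (rename it to temp2.htm to display it) -->
This is the second file.
<a href="temp1.htm">Go to the first file</a>
```

For the two files to point at each other, temp1.htm would need a matching link, such as <a href="temp2.htm">Go to the second file</a>.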
Images

You can include an image in a web page by pointing to the image file with HTML code something like this:

If you need an image file to use for testing, you can retrieve image files directly from the web as follows:
The one-line HTML sample files above work on most browsers, and they are good enough for test programs, but you wouldn’t want to put such bare-bones files into production. A complete HTML file should include these additional tags:

    <head>
    <title>Whatever you type here will be displayed at the top of the browser window.</title>
    </head>
    <body>
    This is an <b>HTML</b> file.
    </body>

These additional tags look useless at this point only because the HTML
code in this overview is so simple. They have a purpose in more realistic
HTML files, so it is good to get into the habit of always including them.
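Putting these tags together, a complete file might look like the sketch below. The enclosing <html> element is an assumption added here because it conventionally wraps the head and body; the title text is a placeholder.

```html
<html>
<head>
<!-- The title appears at the top of the browser window, not in the page itself -->
<title>My Test Page</title>
</head>
<body>
This is an <b>HTML</b> file.
</body>
</html>
```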
Tables

Below is HTML code for a simple table. The indentations are for readability only. The <tr> ("Table Row") tag marks the beginning of a new row. The <td> ("Table Data") tag marks the beginning of a new cell (column) in a row.

    <table>
      <tr>
        <td> first row, first cell </td>
        <td> first row, second cell </td>
      </tr>
      <tr>
        <td> second row, first cell </td>
        <td> second row, second cell </td>
      </tr>
    </table>

Tables are often used to control the placement of images by placing a pointer to the image file inside a table cell, like this:

Tables can also be used to control the placement of text. Unlike in a word
processor, you cannot format HTML text output with tabs, because a tab
character in an HTML file, or even a series of tab characters, will be
displayed by the browser as a single blank space. One trick is to
place text in a two-column table with invisible borders, placing text in
the second column and leaving the first column blank. That makes the text
look like it has been tabbed to the right. You can make a table’s borders
invisible by specifying "<table border=0>".
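Both table uses described above can be sketched as follows. This is an illustration under stated assumptions: picture.gif is a hypothetical image file in the same directory as the HTML file, and the 40-pixel width of the blank column is arbitrary.

```html
<!-- An image placed in a table cell; picture.gif is a hypothetical file -->
<table border=0>
  <tr>
    <td> <img src="picture.gif" alt="Test picture"> </td>
    <td> This text sits to the right of the image. </td>
  </tr>
</table>

<!-- The indent trick: a blank first column and text in the second column -->
<table border=0>
  <tr>
    <td width="40">&nbsp;</td>
    <td> This text looks as if it has been tabbed to the right. </td>
  </tr>
</table>
```

Changing border=0 to border=1 while testing makes the cell boundaries visible, which helps when a layout does not come out as expected.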
Forms

Forms are used to pass information to other programs for further processing. It is beyond the scope of this overview to present a complete working form, but the example below gives the flavor of how they work.

    <form action="geo_code.cfm">
    Enter Geographic Code here:
    <input type="text" size="2" name="geo_code">
    </form>

To see what this form looks like, type the code above into a temporary HTML file and display the results in a browser. An actual working form would be much more complex than this example, and it would have a SUBMIT button to call the application that performs the additional processing. That application could be written in ColdFusion, Perl, Visual Basic, etc. If the form above did have a SUBMIT button, and if the form user clicked
on it, an error message would appear unless there truly was a program named
geo_code.cfm ready to take over at that point.
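For illustration only, the example could be extended with a SUBMIT button as sketched below. The method="post" attribute and the button label are assumptions; geo_code.cfm is the program named in the text, and it would have to exist on the web server for the submission to succeed.

```html
<form action="geo_code.cfm" method="post">
  Enter Geographic Code here:
  <input type="text" size="2" name="geo_code">
  <!-- Clicking this button sends the field's value to geo_code.cfm -->
  <input type="submit" value="Submit">
</form>
```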
Spacing

HTML can be confusing at first when you are trying to format a page:
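One common source of this confusion, offered here as an illustration and not as the original list of points, is that browsers collapse any run of spaces, tabs, and line breaks in the file into a single space. The sketch below assumes the standard <br>, <p>, and <pre> tags and the &nbsp; entity:

```html
<!-- Illustrative only: how whitespace in the file maps to the display -->
These     words
are displayed on one line, separated by single spaces.
<br> The br tag forces a line break.
<p> The p tag starts a new paragraph. </p>
Two&nbsp;&nbsp;non-breaking spaces are both displayed.
<pre>
Text   inside pre   keeps its
    spaces and line breaks.
</pre>
```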
Publishing a web involves nothing more than copying that collection
of web files to a machine that has web server software installed on it
and that is available to the public.
HTML Isn’t a Programming Language

HTML is not a programming language; it is a markup language. That is, instead of working with IF-blocks, FOR-loops, etc., HTML is designed to process tags that specify how items are to be displayed. Note the following points:
Unlike HTML, JavaScript is a programming language; books on it are available at major bookstores. (Despite the similarity of their names, JavaScript has no relationship to the Java programming language.) JavaScript is different from HTML, yet JavaScript code can be embedded within an HTML file, where it can be used to handle form field validation, calculations, and other logic tasks. Below is a contrived sample of JavaScript code. You can type it into a temporary HTML file, as explained above, and run it in a browser to see what it displays.

    <script>
    var action = "exit";
    if (action == "exit")
        alert("Are you sure you want to exit?");
    </script>

HTML Editors

HTML files for web sites are usually built with special editors, like FrontPage or Dreamweaver. As the examples above show, these editors aren’t really necessary, especially for simple chores, although they certainly make life easier. Learning the basics of raw HTML coding is beneficial even if you do use an HTML editor, because editors sometimes get confused, forcing you to correct the HTML code they produce by hand. You can view the HTML source code behind any web page, including this one.
If you are using Netscape, for example, click on View, Source. Using
cut-and-paste, you can "borrow" that code for your own use.
Index pages

Browsers can be used to display the contents of a directory. For example, you can display the contents of the root directory of your PC by entering this URL: file:///C:/

You probably don’t want users to view the raw contents of the directory that holds your web pages. To prevent this, place an index file in that directory. An index file is an ordinary HTML file with a special name, which is usually either index.html or index.htm, depending on how the web server is set up. If the user’s URL specifies only the directory name, the index file will be displayed by default, which means that a browser can’t be used to view the contents of that directory. A good practice is to include an index file in every directory and subdirectory on your web site. Index files typically display a menu of available pages.
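A minimal index file with a menu might look like the sketch below. It assumes the server is set up to use the name index.htm and reuses the temp1.htm and temp2.htm test files from this overview as the menu entries; the <html> wrapper, title, and link text are placeholders.

```html
<!-- Hypothetical index.htm: a simple menu of the pages in this directory -->
<html>
<head>
<title>Index</title>
</head>
<body>
<b>Available pages</b>
<br> <a href="temp1.htm">First test page</a>
<br> <a href="temp2.htm">Second test page</a>
</body>
</html>
```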