TFM:Writing the Internets

From ProgSoc Wiki

Writing the Internets: HTML, CSS, JavaScript, and other small insects.

James Ducker

This chapter is intended to provide an introduction to HTML, CSS, and maybe tie in a little JavaScript. I'm not going to lie, it's pretty easy[1]. At time of writing, the big players are XHTML 1.0 and CSS2. HTML5 and CSS3 are slowly trickling in, but HTML5 browser support is still shaky, so I won't address it. Maybe YOU can for TFM 2020 (HTML5 should have decent IE support by then).

Warning!

JavaScript is not Java. It has nothing to do with Java, beyond sharing the same C-style language syntax. JavaScript's "proper" name is ECMAScript. If you thought otherwise, drop it right now. This is your only warning. Trespassers will be kill -9'd with prejudice.

A Quick Note About Web Browsers

Not all web browsers are created equal. The big players (Microsoft Internet Explorer, Mozilla Firefox, and to a lesser extent Apple Safari/Google Chrome and Opera) all do things their own way. Internet Explorer uses a rendering engine known as Trident (which Windows also uses to render Windows Explorer - I smell a security vulnerability!), Firefox uses Gecko, and Chrome and Safari both use WebKit. You'll also find that at any given point in time a significant number of people are not using the latest version of their web browser, which introduces more rendering discrepancies to watch out for.

Though not perfect, modern browser rendering engines are programming works of art, in that they all handle really bad markup very elegantly.

Talk to your kids about IE6

Internet Explorer 6 refuses to die[2], though the world's web developers are slowly beating the life out of it. Even Microsoft thinks it's pretty bad that people still use IE6. That being said, it's really not such a big problem anymore, thanks to various JavaScript libraries that make it play nice.

Everyone Should Have Standards

We're all about standards, and you should be too, which means your code has to validate against W3C[3] specifications. To help you care about the quality of your markup, begin equating invalid HTML to a segmentation fault in C. Hitler wrote invalid HTML[4].

The Parts of an HTML File

 !DOCTYPE

Any HTML or XML file should begin with a DOCTYPE. This tells the client (the thing reading the HTML or XML) what to expect. It contains a reference to a Document Type Definition (DTD), which explicitly defines what is and isn't allowed in the file. If you're familiar with XML schemas, DTDs perform a similar role.

Take a look at the XHTML 1.0 Strict DOCTYPE:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

By comparison, check out the up-and-coming HTML5 DOCTYPE:

 <!DOCTYPE html>

Despite the fact that this guide isn't directly addressing HTML5, I'm going to use the HTML5 doctype in examples, somewhat for brevity, but mostly because I like it, and it won't impact any of the examples.

HTML Elements (a.k.a. Tags)

Some people refer to elements as tags, while others refer to tags as elements. The simplest way is to use the terms the same way you would use the terms "class" and "object" in programming. A "tag" is some markup in a file, whereas an "element" is a tag as interpreted and stored by a web browser. Thus, you write tags, but a rendered page consists of elements, and browsers will occasionally work some magic on your tags before producing the associated elements (though this normally won't be of any significance).

You can look at HTML in two ways: as a bunch of elements (<a>, <p>, <strong>, <div>, etc.) or as a tree-like structure of nodes, with a well-defined hierarchy. I recommend the latter, because you can apply that knowledge to XML and vice-versa, and it helps no end when you move on to writing JavaScript. So let's take a look at a Hello, World! example:

  <!DOCTYPE html>
  <html>
     <head>
        <title>My First HTML Page</title>
     </head>
     
     <body>
        <h1>Hello, World!</h1>
     </body>
  </html>

So already you can see (a) a hierarchy of nodes, and (b) the major parts of any HTML document. As a hierarchy, we might express the h1 element as html → body → h1, and the text inside the h1 element as html → body → h1 → #text. The title element (or node, if you'd rather) sets the title in the window bar of your browser (up at the very top). Most browsers will suffix the title with the browser name, for example My First HTML Page - Mozilla Firefox.
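
To make the tree idea concrete, here's a minimal sketch in JavaScript. The tree object below is a hand-written stand-in for the Hello, World! page (a browser builds the real thing, and would also include whitespace text nodes that are left out here), and pathTo is a hypothetical helper, not a built-in.

```javascript
// A hand-built stand-in for the node tree of the Hello, World! page.
// (Hypothetical structure for illustration; a browser builds the real one.)
var tree = {
  name: "html",
  children: [
    { name: "head", children: [
      { name: "title", children: [ { name: "#text", children: [] } ] }
    ] },
    { name: "body", children: [
      { name: "h1", children: [ { name: "#text", children: [] } ] }
    ] }
  ]
};

// Depth-first search returning the path to the first node called `target`
// found at or beneath `node`, or null if there is no such node.
function pathTo( node, target ) {
  if ( node.name === target ) {
    return [ node.name ];
  }
  for ( var i = 0; i < node.children.length; i++ ) {
    var rest = pathTo( node.children[i], target );
    if ( rest !== null ) {
      return [ node.name ].concat( rest );
    }
  }
  return null;
}

var h1Path = pathTo( tree, "h1" ).join( " -> " );  // "html -> body -> h1"
```

Asking for a node that isn't in the tree (say, "div") returns null rather than a path.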

The title element also consists of an opening tag <title>, some text, and a closing tag </title>. In XHTML, all elements must be explicitly closed. For elements without any text content (for example, an image element), you use a self-closing tag, like so:

<img src="foo.jpg" alt="Some image" />

Note the forward-slash just before the closing bracket. This indicates that the tag is self-contained.

Okay, so, now you know about tags, and about how tags can go inside other tags and whatnot, it's time to look at...

HTML Element Attributes

Attributes are pieces of information that go inside of tags. Under XHTML and any sane HTML standard, they exist in key="value" format. There is a limited set of attributes available for any given tag, but attributes like id, class, style, and title[5] are almost universal.

<a href="http://www.example.com" class="myLink">visit example.com</a>

What have we here? We have an anchor element with an opening tag, some inner text, and a closing tag. Inside the opening tag are also two attributes, href and class.
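
If it helps to see the key="value" format picked apart, here's a rough sketch. parseAttributes is a hypothetical helper written for this guide (browsers do this for you, and far more robustly), and it assumes every value is double-quoted.

```javascript
// Pull key="value" pairs out of an opening tag.
// A rough sketch only: it assumes double-quoted values.
function parseAttributes( openingTag ) {
  var attrs = {};
  var pattern = /([a-zA-Z][a-zA-Z0-9-]*)="([^"]*)"/g;
  var match;
  // exec() with the g flag steps through each match in turn
  while ( ( match = pattern.exec( openingTag ) ) !== null ) {
    attrs[ match[1] ] = match[2];
  }
  return attrs;
}

var attrs = parseAttributes( '<a href="http://www.example.com" class="myLink">' );
// attrs.href is "http://www.example.com" and attrs["class"] is "myLink"
```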

So there's a crash course in HTML tags and element (or node) structure. There's not a lot more to HTML than that; the rest is just a matter of learning all the different tags and how to use them appropriately. This guide isn't going to provide an exhaustive list of tags; that's what the Resources section is for! In the following sections we'll introduce a few more tags, but I'll explain them as we go.

CSS

Cascading Style Sheets (CSS) are the language of choice for making your web pages look fashionable (you could also use Extensible Stylesheet Language Transformations (XSLT) if you were writing your pages in pure XML, but that's neither here nor there). They work by creating rules that target an element or groups of elements in your HTML using something known as a selector. There are three basic types of selector:

Element Selectors

These select all HTML elements of a particular type, e.g. div or p or a. The basic syntax is:

body {
  margin: 0;
  padding: 0;
}

Class Selectors

These select all HTML elements with a particular value for their class attribute. For example, take the following div:

<div class="foo">

You may target it using the Element Selector described above, but you can also target it using the class selector for foo, like so:

.foo {
  color: #333333;
  background-color: #ffcc00;
}

Note that to target a class value, you prefix the class name with a period. You can have multiple class values in your HTML by simply space-delimiting each one, as in this example:


<div class="foo bar">

This element may now be targeted via foo and/or bar. Note that you can target it with both at the same time. Conflicting stylesheet rules are resolved by a set of precedence rules, but I will leave the specifics of those to you to find out. Hah!

ID Selectors

In HTML, elements may also be given an id attribute which is unique to that element. Unlike the class attribute, the id attribute may only contain one ID. IDs are also useful in JavaScript for targeting elements via the document.getElementById function. The syntax again:

<div id="myDiv">

To target this specific element in CSS, you use the id myDiv and prefix it with a hash, as follows:

#myDiv {
  border: 1px solid #000000;
}
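
To tie the three selector types together, here's a small sketch of how each one decides whether it matches. It's only an illustration: matchesSimpleSelector is a made-up helper, and the element is a plain object standing in for a parsed tag, not anything a browser hands you.

```javascript
// Does a simple selector (element, .class, or #id) match an element?
// `element` is a plain object standing in for a parsed tag (hypothetical shape).
function matchesSimpleSelector( element, selector ) {
  if ( selector.charAt( 0 ) === "#" ) {
    // ID selector: the id must match exactly
    return element.id === selector.slice( 1 );
  }
  if ( selector.charAt( 0 ) === "." ) {
    // Class selector: the class attribute is space-delimited, so split first
    var classes = ( element.className || "" ).split( /\s+/ );
    return classes.indexOf( selector.slice( 1 ) ) !== -1;
  }
  // Element selector: compare against the tag name
  return element.tagName === selector;
}

// Stand-in for <div id="myDiv" class="foo bar">
var myDiv = { tagName: "div", id: "myDiv", className: "foo bar" };
```

With this stand-in, the selectors div, .foo, .bar, and #myDiv all match, while p, .baz, and #foo do not.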

Inline and Block-level elements

Most HTML elements fall into one of two display types: inline and block-level. There are a few others, but you can safely ignore them for about 99.9% of your work. Put simply, block-level elements are good for laying stuff out, and inline elements are good for things like styling bits of text, and so on.

Block-level elements provide structure. Headings, paragraphs, and divs are all block-level by default. Any block-level element can be identified by the fact that, by default, it expands as wide as possible, and causes a vertical break. In other words, the element following a block-level element will sit below the block element, and the block element itself will sit below whatever element came before it.

On the other hand, inline elements act as their name implies - they sit inline with other elements. For instance, if I were to use <strong> to make some text bold, the text inside the strong element and surrounding the strong element would all sit inline, just like this text here. Awesome.

TODO: Create diagrams showing block-level and inline elements.

JavaScript

Before I begin, JavaScript has nothing to do with Java. The proper name for JavaScript is ECMAScript. Some champ at Netscape in the '90s decided JavaScript would be a great name for the language.

Okay, so because you're reading TFM, I'm going to assume you have some concept of programming. JavaScript is (in a nutshell) Object-Oriented. If you disagree with me, then you obviously know enough about JavaScript already and this guide will be of no further use to you!

So let's jump straight into a couple of important things to know about JavaScript.

JavaScript is concise. Awesomely concise. And, being a scripting language, it's loosely-typed (sometimes very loosely), so you can concatenate strings with numbers without any casting, and discover the many subtleties of inter-type comparisons, like 0 == "0", or false == 0.
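
Here's a short sketch of those conversions in action. The strict equality operator (===) isn't mentioned above, but it's the usual way to sidestep the surprises, so it's included for comparison.

```javascript
// Concatenating a string with a number: the number is converted for you
var label = "Answer: " + 42;          // "Answer: 42"

// The loose equality operator (==) converts types before comparing
var looseZero  = ( 0 == "0" );        // true
var looseFalse = ( false == 0 );      // true

// The strict equality operator (===) compares without converting
var strictZero  = ( 0 === "0" );      // false
var strictFalse = ( false === 0 );    // false

// One of the subtleties: "+" concatenates, but "-" does arithmetic
var concatenated = "3" + 4;           // "34"
var subtracted   = "3" - 4;           // -1
```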

I'll just throw a bunch of examples at you to start with. If you're familiar with C++ or Java, you already know a lot of the syntax.

  // A string
  var myString = "Hello, world!";

  // A number (behind the scenes it's a double)
  var myNumber = 123;

  // A regular expression
  var myRegexp = /^[a-zA-Z]+$/;

  // A function
  var myFunc = function ( arg1, arg2 ) {
     return (arg1 == arg2);
  };

  // An empty object
  var foo = {};

  // An object with some properties and a method
  var bar = {
     foo: "Hello, world!",
     bar: "Yo Earth whatup dawg?",
     baz: function ( arg ) {
        return "Look at this: " + arg;
     }
  };

  // Accessing those properties
  document.write(bar.foo);
  document.write(bar.bar);
  document.write(bar.baz("I'm awesome."));

  // An array with some values
  var baz = [ "Hello", "World", "!" ];

JavaScript also has all your favourite bits and pieces, such as while, do... while, and for loops, and if... else, and if... else if... else statements. It will also switch on just about any type you want.
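
A quick sketch of a few of those in one place. The function and variable names here are made up for the example. Note that switch compares its cases strictly, so the string "1" does not hit the case for the number 1.

```javascript
// Classify a value with a switch; cases are compared strictly (like ===)
function describe( value ) {
  switch ( value ) {
    case "foo":
      return "the string foo";
    case 1:
      return "the number one";
    case true:
      return "the boolean true";
    default:
      return "something else";
  }
}

// A for loop over an array, building up a string
var words = [ "Hello", "World", "!" ];
var joined = "";
for ( var i = 0; i < words.length; i++ ) {
  joined += words[i];
  if ( i < words.length - 2 ) {
    joined += ", ";
  }
}
// joined is now "Hello, World!"

// A while loop counting down
var n = 3;
var countdown = [];
while ( n > 0 ) {
  countdown.push( n );
  n--;
}
// countdown is now [ 3, 2, 1 ]
```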

The Document Object Model

The Document Object Model (DOM) is a pattern by which the node structure of an HTML or XML page is translated into an Object hierarchy for easy programmatic access.

Consider the following HTML sample:

<div id="foo"></div>

<a href="#">Bar</a> <a href="#">Baz</a>

We can access these nodes in a number of ways, but the most straightforward would be the following:


// Get a reference to the div
var foo = document.getElementById("foo");

// Get references to all of the anchor elements
var links = document.getElementsByTagName("a");

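In this example, foo is a reference to the HTMLElement object for the div with the id foo, and links is an array-like collection of HTMLElement objects (not a true array, though you can index it and read its length like one), representing each of the anchor elements in the sample.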
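
Whatever getElementsByTagName hands back, you can walk it with an ordinary for loop. Here's a minimal sketch. linkTexts is a hypothetical helper, and it takes the document as an argument purely so it can be tried outside a browser; in a real page you'd pass the global document. It also assumes the browser supports the textContent property.

```javascript
// Collect the text of every anchor in a document.
// `doc` is passed in so the function can be tried with a stand-in object;
// in a page you would call linkTexts( document ).
function linkTexts( doc ) {
  var links = doc.getElementsByTagName( "a" );
  var texts = [];
  // The result is array-like: it has a length and numeric indexes
  for ( var i = 0; i < links.length; i++ ) {
    texts.push( links[i].textContent );
  }
  return texts;
}
```

Run against the HTML sample above, this would collect the strings "Bar" and "Baz".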

Events

JavaScript events are fairly easy to use:

window.onload = function() {
  alert("Hello, World!");
};

This little sample demonstrates an event in JavaScript. So, we have the window object, which represents the browser viewport context, and it has an event called onload. This particular event is fired when page loading completes, including the loading of all images and other resources on the page. Simple, effective, but sub-optimal. Here's another common example, detecting a click on an element:

document.getElementById("foo").onclick = function ( e ) {
  alert("I was just clicked!");
};

You'll notice this time an argument in the function, e. This is because the function signature for an event handler includes an event object, which contains all sorts of useful information.
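
As a sketch of what that event object gives you, the handler below reads two of its standard properties, type and target. recordClicks and the log array are made up for the example, and the sketch ignores older versions of Internet Explorer, which expose the event as window.event instead of passing it in.

```javascript
// Attach a click handler that records details from the event object.
// `element` would normally come from document.getElementById;
// `log` is a hypothetical array used here just to capture the results.
function recordClicks( element, log ) {
  element.onclick = function ( e ) {
    // e.type is the event name; e.target is the node that was clicked
    log.push( e.type + " on " + e.target.id );
  };
}
```

In a page you might call recordClicks( document.getElementById("foo"), [] ) and inspect the array after clicking around.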

You've probably heard of jQuery. jQuery and other JS frameworks usually implement an event known as domready, or ready, as jQuery calls it. This event occurs when the DOM structure of a web page has been loaded, but before things like images have been downloaded. It's the event you want to rely on when you need to do some DOM manipulation as soon as the page is ready. Unfortunately, it's not natively supported in every browser (older versions of Internet Explorer lack the underlying DOMContentLoaded event), and a cross-browser implementation is somewhat complicated, so I'll leave it to you to decide whether you want to use a framework, or find a dependency-free implementation floating around on the Internet.
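
For browsers that do support the standard DOMContentLoaded event, the core of the idea is short. This is a minimal sketch and not how jQuery or any other framework actually does it: onDomReady is a hypothetical name, it assumes addEventListener and document.readyState are available, and it makes no attempt to cover old versions of Internet Explorer.

```javascript
// Run `callback` once the DOM is ready. A minimal sketch that assumes the
// browser supports addEventListener and the DOMContentLoaded event.
// `doc` is passed in so this can be tried with a stand-in object;
// in a page you would call onDomReady( document, ... ).
function onDomReady( doc, callback ) {
  if ( doc.readyState === "loading" ) {
    // Still parsing the page: wait for the event
    doc.addEventListener( "DOMContentLoaded", function () {
      callback();
    }, false );
  } else {
    // "interactive" or "complete": the DOM is already built
    callback();
  }
}
```

The readyState check matters because a handler registered after the event has already fired would otherwise never run.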

Resources

  • W3Schools: an okay reference resource for learning HTML, CSS, JavaScript, and various other computery things. Note that W3Schools is not related to the W3C.
    http://www.w3schools.com

  1. It used to be grueling, until science invented magic, and JavaScript libraries to make IE6 play nice. Also, if you're reading this from the future, ha ha ha, I bet IE6 still hasn't died yet!
  2. Since writing this, Google has dropped IE6 support, so we may yet see it become insignificant in the near future.
  3. World Wide Web Consortium, a club for large companies that make web browsers and related paraphernalia.
  4. TFM obeys Godwin's Law.
  5. Note that there is also a tag called title, as outlined in the HTML subsection.