Julius Caesar started his history of the Gallic Wars with "All Gaul is divided into three parts". We will try to conquer web sites by dividing our topic into three parts. It is not rocket science. Alas, each of these will be divided into three parts, and so on.

You need three things (apart from a computing device) to view (or build) a web site.

Internet
Web Server
Web Browser

The Internet is simply lots of smaller computer networks (like the ones in the Computer Room) all connected together, and all talking with each other. The USA Defense Advanced Research Projects Agency (DARPA) paid for ARPANET, the 1969 predecessor of the Internet. ARPANET taught a lot about how to connect multiple different computer networks using packet switching over circuit switched telephone lines. A workable Internet protocol suite was devised and tested between 1973 and 1977.

A quick look at three aspects of the Internet.

Protocols
Standards
Internet Address Book

Protocols

Communication protocols are a fancy name for agreements about how to talk with each other: which language to use, who you are talking to, what time it is there, whose turn it is. Pretty much like a conversation at a party, except that computers are really stupid. They need to be told exactly how to have a conversation. They need a formal protocol.

Each computer company once had its very own local area network protocols, and usually its own cables. Apple had AppleTalk. IBM had Token Ring. Microsoft Windows had Server Message Block (SMB), also known as Common Internet File System (CIFS), running over NetBIOS and NetBEUI. Novell had NetWare. Unix often used Network File System (NFS) to access remote files. A Tower of Babel.

Ethernet (1980) became a standard, and now pretty much all local area networks run on Ethernet wiring (or wirelessly). Eventually almost everyone also agreed to support Transmission Control Protocol and Internet Protocol (TCP/IP) for internet connections.
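To make the idea of a protocol concrete, here is a minimal sketch in Python of two programs holding a conversation over TCP/IP inside one computer. The rules are invented purely for illustration and are not any real protocol: the client speaks first, the server answers once with the same text in capitals, then both hang up. The function names serve_once and converse are made up for this example.

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept one caller, listen, then reply in capitals."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)      # the client's turn to talk
        conn.sendall(data.upper())  # the server's turn to reply


def converse(message: bytes) -> bytes:
    """Run one complete conversation over the loopback address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the operating system for any free port.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()

        # Client side: connect, speak first, then wait for the reply.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply


if __name__ == "__main__":
    print(converse(b"hello"))
```

Even this tiny exchange only works because both sides agree in advance who talks first and when the conversation is over. Real protocols spell out the same sort of agreement in far more detail.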
Running over TCP/IP are applications such as File Transfer Protocol (FTP), which lets us transfer files (such as web pages) to a web server. Your web browser can download files (such as new programs) from a file server via FTP. Other well known applications fetch your mail using Post Office Protocol (POP) or Internet Message Access Protocol (IMAP). Your email is sent using Simple Mail Transfer Protocol (SMTP). Web servers use the HyperText Transfer Protocol (HTTP), invented by Sir Tim Berners-Lee in 1989.

Standards

The Internet Society guides and funds the Internet Engineering Task Force (IETF). The IETF maintains the protocols in use on the Internet. Volunteers produce Requests for Comments (RFCs), which describe standards for various internet protocols. They work in conjunction with the International Organization for Standardization (ISO) and the World Wide Web Consortium (W3C).

The W3C sets standards for the World Wide Web (WWW). The consortium is made up of member organisations which maintain full-time staff for the purpose of working together in the development of standards for the World Wide Web. W3C was founded in 1994 by Sir Tim Berners-Lee, who also invented the World Wide Web.

The W3C also provides validators to confirm your web pages are valid and meet standards. If your web page is not valid, it is often very hard to work out why a web browser is not displaying it correctly.

Internet Address Book

When we seek a web resource, how do we translate a domain name we humans find comfortable into the Internet Protocol address numbers computers use to identify each other? We do it via the Domain Name System (DNS).

The Internet Assigned Numbers Authority (IANA) delegates number assignment to five Regional Internet Registries. Domain names are handled separately: IANA delegates each top level domain to a registry, and names within it are sold through a Domain Name Registrar. I bought our computer club name from a Domain Name Registrar in the USA. It costs about US$12 a year.
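You can ask your own computer to do this address book lookup. Below is a minimal sketch in Python using the standard socket module; the function name lookup is invented for this example. The name localhost always means this computer and is normally answered without asking any Domain Name Server, so the example works offline. Substituting a real domain name needs an internet connection, and the numbers returned will vary.

```python
import socket


def lookup(name: str) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for a name."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the address as text.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(lookup("localhost"))
```

The result may include an old style IPv4 number, a longer IPv6 number, or both, depending on how the computer is set up.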
IANA registries also delegate to more local Domain Name Servers. Usually our Internet Service Provider tells our computer the address of the Domain Name Server it needs for a connection. We do not need to worry about it, unless something goes wrong.

A Uniform Resource Identifier (URI) starts with a scheme, such as http, which is one of many URI schemes registered with IANA. The familiar Uniform Resource Locator (URL) is the kind of URI that says where to find a resource. As well as the name of the computer, it may include a path to a resource. You may be calling that mouthful a web address.

There are three standards we need for a web site.

HyperText Transfer Protocol (HTTP).

A hypertext markup language. Specifically a HyperText Markup Language (HTML) that includes hyperlinks. The contents of a web page are written in an HTML markup language. However these days we do not try to describe the presentation within the web page, since any presentation change means rewriting every page. Most early HTML variations are obsolete or ill advised. The three current HTML markup languages we could use are HTML 4.01, the later XHTML™ 1.1 (which I use), or the forthcoming and very exciting HTML5.

Cascading Style Sheets (CSS). A style sheet is much like a style sheet in a word processor. It describes how to present the contents of a web page: the size of headings, the fonts for text, the colours, and almost everything to do with presentation. Since each web page in even a very large web site will link to only a few style sheets, changing the look of an entire web site can be done by changing only those few style sheets.

More advanced web sites also use the JavaScript interpreted programming language to manipulate the HTML using the Document Object Model (DOM) Level 2 HTML Specification. Programming in JavaScript is not needed for a simple site, and can add a lot of complexity. However it is standard for commercial web sites. The language standard is actually called ECMAScript.

Do web pages follow web standards? Unfortunately standards compliance by web sites has been pathetic, as a 2008 report showed.
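One common cause of an invalid page is a tag that is opened and never closed. Here is a rough sketch in Python of a tag balance check, using the standard html.parser module. It is far weaker than a real validator, which checks a page against the full specification. It also assumes the strict XHTML habit of closing every element, although HTML 4.01 allows some end tags to be left out. The names VOID, TagBalanceChecker and find_tag_problems are invented for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class TagBalanceChecker(HTMLParser):
    """Track open tags and note any closing tag that does not match."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing XHTML form such as <br />: nothing to balance.
        pass

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def find_tag_problems(html_text):
    """Return a list of problems found; an empty list means the tags balance."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


if __name__ == "__main__":
    print(find_tag_problems("<p>Hello <em>world</p>"))
```

A page that passes this check can still be invalid in many other ways, so treat it as a first look before running a proper validator.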
Here from IBM are some reasons to check and make your web site valid.
Some hints from W3C on making your web site valid.
Here are W3C links to tools for making valid web sites.

An HTTP Web Server

A computer that runs a web site. Using the HTTP protocol, the web server delivers web pages to browsers, as well as other data files to web-based applications. Web servers are not always used for serving the World Wide Web. They can also be found embedded in devices such as printers, routers and webcams, serving only a local network.

Where do I get a web server?

Your own computer, if you have sufficient upload speed (we generally do not). First install a web server program such as Apache HTTP Server. Installing a web server may be tricky (easy on a Macintosh). Configuring it correctly is definitely tricky. I strongly advise against doing this.

Your Internet Service Provider. Many operate a web server, often free. They let you install small web sites on it. You may not be able to use your own domain name, or that may cost extra.

A specialist web server provider. Use your own domain name as standard, for web sites, email and much else. Carlyle Gardens Computer Club's web site is one of millions hosted on Dreamhost.

A Web Browser

A web browser is an application such as Microsoft's Internet Explorer, Mozilla Firefox or Google's Chrome. Each uses a different web browser engine (or layout engine) component to paint the contents of a web page on a virtual window. The window can be a display, or a printer, or even a voice synthesiser. Because it may not be a browser, we often call it a User Agent.

Because the web browser engine is simply a software component, it can be used by many different applications. For example, fancy graphic email generated as a web page is now usually displayed by the web browser engine. ePub ebooks, which are really just a special zipped web site, can be displayed by a web browser engine.

An HTML web page can be considered as having three parts.
Document Type Declaration
Head
Body

The first line, a Document Type Declaration, is a link to a Document Type Definition (DTD). This is the specification for that version of HTML or XHTML. While technically it is human readable, it is specifically intended to be computer readable. It describes exactly what markup the web page can contain. So unless you have learnt how to read that sort of specification, looking at it is a bit of a waste of time. If writing a web page, you just copy the correct version of that line.

Head is hidden information to help a web browser understand and present a web page. The only bit you normally see is the Title, which typically shows up at the top of your browser window. You may also see a Meta tag called Description, which a search engine like Google may display as a guide to the contents of that web page.

Body is the actual content of a web page. The stuff that you see on the display. It is all just text, marked up with surrounding tags.

Other Items

Browser sniffing
Character encoding
Derek Powazek comments on If you’re not paying for the product, you are the product
Handling Internationalisation
Handling character encodings in HTML and CSS
Declaring character encodings in HTML
John Klensin, "Simple Mail Transfer Protocol", RFC 2821
David Siegel, The Web is Ruined and I Ruined it
List of three digit HTTP status codes returned to a web browser by a web server.
RT @barbiche: HTTP response codes for dummies. 50x: We stuffed up. 40x: You stuffed up. 30x: Ask them over there. 20x: Cool!
Knowledge engines: Evi (formerly TrueKnowledge). Quora. Computational knowledge engine Wolfram Alpha.
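The "for dummies" summary of HTTP status codes quoted in Other Items can be written as a few lines of Python. This is only a sketch of that summary, using the standard http module for the official reason phrases; the function names blame and describe are invented for this example.

```python
from http import HTTPStatus

# The first digit of the status code tells you which family it belongs to.
SUMMARY = {
    2: "Cool!",
    3: "Ask them over there.",
    4: "You stuffed up.",
    5: "We stuffed up.",
}


def blame(code: int) -> str:
    """Return the quoted summary for the family a status code belongs to."""
    return SUMMARY.get(code // 100, "Not covered by the quoted summary.")


def describe(code: int) -> str:
    """Combine the official reason phrase with the summary."""
    # HTTPStatus(code) raises ValueError for a code the module does not know.
    return f"{code} {HTTPStatus(code).phrase}: {blame(code)}"


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(describe(code))
```

Codes starting with 1 are informational and are not mentioned in the quoted summary, so the sketch simply says so.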