NetSpeak

Articles and definitions relating to the Internet.

Articles

The World Wide Web

The World-Wide Web (WWW) project began in 1989 at the Geneva-based European Centre for Nuclear Research (CERN) and its commercial possibilities were quickly recognised.

WWW, or the Web, can be thought of as a collection of documents resident on thousands of servers around the world. Each document, written in the HyperText Mark-up Language (HTML), can contain text, images, sounds and video. More importantly, it can contain links to documents held on other machines accessible through the Internet, creating what is usually called "hypertext".

These links appear as hotspots in the document, and can be a word, phrase or even an image. Selecting a hotspot automatically connects your computer with the one referenced in the underlying hypertext document, and loads onto your machine a new hypertext document held there. As the user, you need know nothing about where the machines are, or how the connection is made.
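As a rough illustration of how the hotspots are recorded inside a page, the Python sketch below uses the standard library's html.parser module to pull the link targets out of a small HTML document. The sample page, its placeholder URLs and the names LinkCollector and extract_links are invented for this example; a real browser does a great deal more.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href target of every <a> (anchor) tag it meets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs belonging to the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: the addresses are placeholders, not real sites.
SAMPLE_PAGE = """
<html><body>
<p>See the <a href="http://www.example.org/prices.html">price list</a>
or visit our <a href="http://www.example.com/">partner site</a>.</p>
</body></html>
"""


def extract_links(html_text):
    """Return the link targets found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


if __name__ == "__main__":
    for target in extract_links(SAMPLE_PAGE):
        print(target)
```

Each address found this way is what the browser would contact if the corresponding hotspot were selected.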

Since these hypertext links often take you to other documents rich in hotspots, the ensemble can be thought of as a huge hypertext system spanning the world, a kind of global version of a Microsoft Windows Help file.

To access WWW you need a browser: these programs let you read hypertext documents, view any built-in images and activate hotspots.

Two of the most important browsers are Netscape and Microsoft Internet Explorer.

Browsers can also be used to link to file transfer protocol (FTP) servers (where files are stored), gophers (where information can be found using a menu structure) and wide area information servers (WAISs, which allow free-text searches).

Businesses can benefit from the World Wide Web by providing: -

Advanced E-mail features include attachments, which enable spreadsheets, documents, images and presentations to be distributed with your message.

FTP can also be used to obtain free updates from leading software companies.

This allows technicians to control computers from anywhere in the world, so that a single, centralised support department can correct software faults throughout an organisation.

How it all works and why nobody runs it

The Internet is not an online service, but a collaborative collection of networks that adhere to certain basic standards when exchanging information among themselves.

Each Internet service provider pays for the cost of the connections it runs, and makes a profit from the charges to its subscribers for their use of them. Economies of scale and continuing advances in technology make these cheap, even for intercontinental traffic.

Extensive connectivity among major Internet providers means that information is sent around the Internet efficiently, traversing only a few separate networks.

Unsurprisingly no one body controls it. Although various standards need to be adhered to by Internet service providers, there are no Internet police who check and enforce them. The system is self-policing; if any organisation strays from collective standards, it loses the benefits of universal connectivity - which is the whole point of becoming part of the Internet in the first place.

There are bodies that carry out central functions for the Internet, such as the InterNIC (http://www.internic.net), which, among other things, registers companies that are connected to the Internet, and the Internet Society (gopher.isoc.org). The society has various engineering committees that help make technical recommendations for the future development of the Internet, but none of these has the power to force a particular direction or action on the Internet community.

Much of the information available over the Internet is held on university computers and is managed by public-spirited individuals. Similarly, many of the Internet indexing and cataloguing services available - Archie, Gophers, World Wide Web servers - have been set up and run by university departments. Manufacturers, who effectively sponsor an Internet site, often provide the computers themselves and their huge storage requirements free. For example the important UK FTP site at Imperial College (src.doc.ic.ac.uk) is sponsored by Sun Microsystems, which receives due acknowledgement each time you log on.

Internet Providers

There are hundreds of Internet providers and choosing between them is difficult. Most, however, rely upon the services of a select few. These include Demon Internet Services, EUnet GB, Pipex (owned by UUnet, which itself is partly owned by Microsoft) and BT. These companies, together with Ukerna (the UK Education and Research Networking Association - the body that runs the UK academic network Janet), formed Linx, the London Internet Exchange. This is a neutral point of interconnect where they can exchange data among themselves directly.

Hops

The structure of the Internet consists of many joined links. Cyber road maps show the information as a hierarchical tree, with local Internet service providers connecting to national connectivity providers. For users a more natural way of conceiving the Internet is as a series of hops between your computer and the site you are trying to reach. Different parts of the same message - that is different packets - may take different paths that vary from moment to moment according to the status of the intervening network.

Whatever path is taken, it consists of sections between computers that decide how to pass on the packets (the routers). These sections are commonly called hops. How many hops there are between computer and destination depends upon the end-points, the time of day and other factors. It can be as low as three or four, or may extend into the 20s or even 30s.

The number of hops required to reach the destination matters: the more there are, the longer the transit time, and the more chance that the packet will get lost. Indeed, if the number of hops is too large, the system may simply give up, since for reasons of efficiency packets are generally discarded automatically once they have passed through a predetermined number of hops.
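The discarding rule can be pictured with a short Python sketch. It is a deliberately simplified model, not a description of any real router: the function name forward_packet, the router names and the hop limits are all invented for illustration. Each router on the path uses up one unit of the packet's hop allowance, and a packet whose allowance runs out before it arrives is thrown away.

```python
def forward_packet(route, hop_limit):
    """Walk a packet along route (a list of router names).

    Every router decrements the remaining hop allowance by one. If the
    allowance is exhausted before the end of the route, the packet is
    discarded. Returns a pair: (delivered, hops_used).
    """
    remaining = hop_limit
    hops_used = 0
    for _router in route:
        if remaining == 0:
            return False, hops_used  # allowance spent: packet discarded
        remaining -= 1
        hops_used += 1
    return True, hops_used


if __name__ == "__main__":
    short_route = ["router-%d" % n for n in range(4)]   # a 4-hop path
    long_route = ["router-%d" % n for n in range(30)]   # a 30-hop path

    print(forward_packet(short_route, hop_limit=16))
    print(forward_packet(long_route, hop_limit=16))
```

With the assumed allowance of 16, the four-hop journey is delivered while the 30-hop journey is abandoned part-way, which is the behaviour described above.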

How best to get your company name on the Net

An early problem of registering Internet names for companies was that sharp individuals in the US were registering the names of major corporations. These companies then found themselves unable to use their main brand online. There is now a formal requirement in the US that the requested name to be used on the Internet does not infringe the intellectual property of any third party and will not be used for any unlawful purpose (full details at ftp://rs.internic.net/policy/internic/internic-domain-4.txt).

In the UK, things have always been simpler. A considerable amount of material has appeared online that will prove useful for companies considering making an application for their first or subsequent Internet names.

The main .UK country domain is divided up into various sub-domains: .ac.uk (the academic world); .gov.uk (for government bodies); .mod.uk (Ministry of Defence); .net.uk (network suppliers); .org.uk (general organisations that do not fall into any other category); .nhs.uk (for NHS organisations); .sch.uk (for schools); and .co.uk as the main commercial domain. See the URL http://www.nic.uk/ for details.
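Because Internet names are read from right to left, the sub-domain tells you what kind of organisation lies behind an address. The Python sketch below, whose function name classify_uk_name is made up for this example, looks at the label immediately before .uk and reports the category using the list given above. It is only an illustration and makes no attempt to check whether a name is actually registered.

```python
# Categories taken from the list of .uk sub-domains in the text.
UK_SUBDOMAINS = {
    "ac": "the academic world",
    "gov": "government bodies",
    "mod": "Ministry of Defence",
    "net": "network suppliers",
    "org": "general organisations",
    "nhs": "NHS organisations",
    "sch": "schools",
    "co": "commercial",
}


def classify_uk_name(hostname):
    """Return the category of a .uk host name, or None if not recognised."""
    labels = hostname.lower().rstrip(".").split(".")
    # Need at least name + sub-domain + "uk", e.g. example.co.uk
    if len(labels) < 3 or labels[-1] != "uk":
        return None
    return UK_SUBDOMAINS.get(labels[-2])


if __name__ == "__main__":
    for name in ("src.doc.ic.ac.uk", "www.example.co.uk", "www.yahoo.com"):
        print(name, "->", classify_uk_name(name))
```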

There are formal sub-domains within these (called neutral sub-domains), such as .music.co.uk, .law.co.uk, .tv.co.uk, .radio.co.uk and .internet.co.uk (for Internet suppliers); see http://www.britain.eu.net/naming-co/other-domains.html.

EUnet GB administers the .co.uk sub-domain. However, the final decision as to whether a name will be accepted or not is down to an informal committee, called the naming-co list. It has voting and non-voting members. The former are essentially the UK Internet suppliers who have their own international connections. See http://www.britain.eu.net/naming-co/members.html.

The main page with information on how to register company names in this sub-domain is at http://www.britain.eu.net/naming-co/. Perhaps the most important information here is the hints and tips on what names will be accepted by the naming committee (at http://www.britain.eu.net/naming-co/tips.html).

Remember that the name should reflect the company applying for it, and that two- and three-letter names will not in general be accepted (unless the company is very well known, like BT or HP).

Names that are already registered will obviously not be given out again, but neither will new names be given to a company that does not intend to stop using its old name. Requests for a large number of similar names for related organisations are frowned on; such names should be created at the next level down. One easy way of finding out if a name has already been registered is to use the search engine at http://www.britain.eu.net/naming-co/whois-form.html which will look through all the registered names in the .co.uk sub-domain. Also useful is the list at http://www.hensa.ac.uk/uksites/co/index.html, which lets you look through a consolidated list of company names in this sub-domain.

Finally, to keep up-to-date with who has applied recently, see http://http.demon.net/external/networks/ncprov.html, which shows the various company names that have been applied for by each of the Internet suppliers.

The normal procedure is to go via your Internet provider in order to register your name, although you can also apply to EUnet GB directly using the form at http://www.britain.eu.net/naming-co/user-form.html.

If all this sounds too complicated to bother with, you might want to visit the URLs at http://www.britain.eu.net/naming-co/count-couk/count.html and http://www.britain.eu.net/naming-co/count-couk/history.txt. These show how the growth of the .co.uk sub-domain is more or less exponential. In other words, if you don't register soon, you may well find yourself left behind in the great business stampede to the Internet.

Creating A Web Site

Quite how you establish a World Wide Web site is perhaps the least important aspect of the whole process. A company might choose to develop the skills of people in-house (for example in the marketing and IT departments), exploiting the fact that setting up HTML pages is extremely simple (though hard to do well). The pages will then be placed on a server run as part of the company's network and connected to the Internet through a firewall, or isolated from it completely for total security and corporate peace of mind.

Alternatively, the work can be farmed out to specialists who will write pages (rather as advertising agencies produce copy for marketing purposes) and then arrange for them to be held on a Web server where space can be rented. This has the advantage that there are no security worries, and design is left to experts. The downside is that you will obviously pay a premium for these services and there is no opportunity to learn from the process of creating pages - which means forgoing the chance to develop skills that in the future all companies will need to understand if not practise.

More crucial to the success of a Web site is the content. It is content that draws people to pages and holds them there. Design can help or hinder, but never acts as a substitute.

The three most important aspects of Web site content are that it needs to be appropriate to its intended audience, genuinely valuable (not marketing fluff) and constantly changing. You cannot just place anything on a Web page and hope that somebody finds it interesting: you need to have a clear idea of what kind of visitor you want to attract (generally this will be the same as your typical customer profile). In fact, even more than with conventional marketing, online activities need to be very highly targeted so that visitors can tell at a glance whether it is worth their while lingering and exploring further.

Assuming that they decide to do so, it will then be the richness of the online content that determines whether they stay long and what their overall impressions will be. There is nothing worse than a site that promises much and delivers little. Word will soon get out on the Internet grapevine (and in a sense the Internet is all grapevine) that the pages in question are not worth visiting, and the site will languish.

Finally, assuming that visitors find your pages of genuine interest, you need to ensure that there is at least some element that changes on a regular basis so as to draw people back. There is so much competition online that Web sites must fight for their audiences every day.

Assuming this battle is won, and visitors find the site interesting and worth returning to, you will have created a powerful online marketing tool. All the time that people view your site they are imbibing your corporate messages (either implicit or explicit). If what they have seen there is useful, impressive or entertaining, they will leave with an enhanced opinion of the site's brand and owner.

But there is another, rather novel benefit. In creating a site which meets the criteria mentioned above, you will effectively be putting together an online periodical, targeted at a particular sector (in fact the same as that served by your company). What you gain is an online readership - readership, moreover, that if big enough might even be sold to online advertisers in the form of links to their Web pages. Every company setting up a WWW site adds the second business of Internet publishing to its traditional activities.

How companies can set up stalls on the Internet

Once a company has decided to use the Internet for promotional or sales purposes by creating some kind of publicly-accessible site - whether FTP, Gopher, telnet or World Wide Web - it is faced with the problem of letting potential visitors know about its existence. For one of the novel aspects of the Internet is that its users are active rather than passive: it is they who decide to go to the company rather than waiting for the company to come to them as with conventional marketing.

An obvious place to start in the process of informing people about a new site is to join the great list of commercial sites at Yahoo (http://www.yahoo.com/Business/). This offers perhaps the most comprehensive list of companies on the Internet, and has an extremely large number of visitors who use it as a jumping-off point.

The entries in Yahoo are strictly informational, with little scope for imaginative design, subtle approaches or more comprehensive material. To meet this clear need for a forum where companies on the Internet can inform and attract potential visitors to their sites, a new kind of Web site has evolved, generally called an Internet shopping mall by analogy with the physical equivalents that are such a feature of the US.

The pre-eminent source of information in this area is called, appropriately enough, The Internet Mall. It was begun in February last year as an E-mail document with just 34 companies offering various items for sale over the Internet. Today, it has over 1000 cybershops, with tens joining each day, and the main document is almost a megabyte in size; if your E-mail system can cope, it can be retrieved by sending the message send fullmail to the address taylor@netcom.com. It might be more advisable to retrieve instead one of the fortnightly updates: use the message send mall passed to the same address. The full list can also be retrieved by FTP from ftp://ftp.netcom.com/pub/Gu/Guides/Internet.Mall.

But perhaps the best way to access the information is through its hypertext incarnation at the URL http://www.mecklerweb.com/imall/. This very impressive site offers various ways of approaching the material: a thematic organisation (based around mall 'floors') and via a search engine. There is also information (at http://www.mecklerweb.com/imall/howto.htm) on how to add your own company to the list.

It costs nothing to join the Internet Mall since the Internet Shopping Network sponsors the project. The latter is part of the US cable TV company Home Shopping Network, which had sales of $1.2 billion in 1993. The Internet Shopping Network can be found at the URL http://shop.internet.net/, and claims to be the world's largest Internet shopping mall with 600 companies (presumably the original Internet Mall is not included since it is more of an information source than a commercial venture).

Nor is the Internet Shopping Network the only Internet mall to be owned by a major company with plenty of experience in online selling. NetMarket (at http://www.netmarket.com/) is part of CUC International, which sells goods at discount to its 30 million members using conventional means, and obviously hopes to do the same in cyberspace. Particularly interesting are the pilots of the advanced interactive shopping services (follow the Business Solutions link on the home page above).

Alongside these giants there are many other smaller Internet malls. A comprehensive list can be found at the URL http://www.yahoo.com/Business/Corporations/Shopping_Centers/. In the UK, for example, there are sites at Apollo UK (http://apollo.co.uk/) and MarketNet (http://mkn.co.uk/). Also of note is Downtown Anywhere (http://www.awa.com/) which offers both commercial and other information in a form that extends the basic metaphor of the US mall to include additional areas in a virtual city.

A Web page has been set up to aid the submission of new Web sites to the main global listings and search engines such as Yahoo, EInet Galaxy, Lycos, Web Crawler and so on. It allows you to cut and paste your URL to the various submission forms. It can be found at http://www.cen.uiuc.edu/~banister/submit-it/ or http://submit-it.permalink.com/submit-it/.

A comprehensive list of those pages willing to consider new URLs for inclusion in their holdings can be found at http://www.homecom.com/global/pointers.html. The list includes HomeCom global Village (http://www.homecom.com/global/gc_entry.html); a graphically based "Starting Point" for new site listings (http://www.stpt.com/); The World Wide Yellow Pages (http://www.yellow.com/); Open Market (http://www.directory.net/dir/submit.cgi); Entrepreneurs on the Web (http://sashimi.wwa.com/~notime/eotw/EOTW.html); The Centre of the 'Comp-Uni-Verse' for Net Surfers (http://netcenter.com/yellows.html); BizWeb (http://www.bizweb.com/InfoForm/infoform.html); and Net Search (http://www.ais.net/netsearch/).

Why everyone on the Net should know about HTML

Although most of the essential components of the Internet have been in existence for nearly 25 years, it is only in the last two or three that the Net's use in business has become widespread.

In part this has been caused by the increasing appreciation of how E-mail can simplify and extend all kinds of business contacts. However, this corporate awakening has been largely due to the introduction and extraordinarily rapid take-up of the World Wide Web. Even before the exciting latest developments of Java and VRML, the Web offered a business medium with immediate impact, interactivity and a sense of immersion hitherto lacking in Internet services. Its applications for marketing and sales purposes, particularly the former, have been so obvious that there has been surprisingly little resistance within corporate structures to at least experimenting in this new arena.

And so it is that some sense of how Web pages are put together - what is and is not possible - is increasingly becoming a prerequisite for modern managers.

If you use a Web browser that offers a bookmark/hotlist feature to store your favourite Internet sites, this is effectively your own home page, although represented on screen in a slightly unconventional form. In fact it is a trivial matter to export a set of bookmarks and turn them into a fully-fledged and even more accessible home page that can be loaded as the default every time you run your browser.
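To give a feel for how little is involved, here is a minimal Python sketch that turns a list of bookmarks into a simple home page. It assumes the bookmarks have already been exported as pairs of name and URL; the function name bookmarks_to_html and the page title are invented for the example, and the two entries are sites mentioned elsewhere in these articles.

```python
from html import escape


def bookmarks_to_html(title, bookmarks):
    """Build a basic HTML page listing (name, url) pairs as links."""
    lines = [
        "<html>",
        "<head><title>%s</title></head>" % escape(title),
        "<body>",
        "<h1>%s</h1>" % escape(title),
        "<ul>",
    ]
    for name, url in bookmarks:
        # escape() guards against characters such as & and < in the data
        lines.append('<li><a href="%s">%s</a></li>'
                     % (escape(url, quote=True), escape(name)))
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


if __name__ == "__main__":
    my_bookmarks = [
        ("Yahoo business listings", "http://www.yahoo.com/Business/"),
        ("InterNIC", "http://www.internic.net"),
    ]
    print(bookmarks_to_html("My Home Page", my_bookmarks))
```

Saving the output to a file and setting that file as the browser's start-up page gives the result described above.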

For those who remain sceptical about the relevance of Web page creation in the ordinary business context, there is now another reason why some knowledge of HTML is likely to become an indispensable skill for the modern manager.

The use of internal TCP/IP networks - the so-called "intranets" - is already a major trend, especially in the US. There, such internal E-mail, news and Web systems are fast turning the long-promised groupware/Executive Information Systems into reality. Where intranets flourish, personal pages are usually encouraged as a way of allowing staff to participate and to add a human touch.

A by-product of this is that those unable to write their own HTML pages will be denying themselves the opportunity to stake a personal claim in this new, very public corporate space. Adding your home page gives you the chance to present to both peers and superiors a honed self-image that can act as a kind of online CV in permanent application for that next promotion.

It is surely not too fanciful to predict that in the future the ability to knock up a solid Web page will be just as useful as good memo or report writing skills are today.

How to get the best from your Net site pages

Writing World Wide Web pages is very easy, but as with so much else on the Internet, understanding the underlying principles can be an enormous help in getting the most from the tools available and avoiding the various pitfalls. For example, few are aware that the HyperText Markup Language (HTML) used for creating Web pages derives part of its name and much of its underlying philosophy from the world of the Standard Generalised Markup Language (SGML).

SGML is about defining logical structures of documents in a formal and self-consistent way (for an excellent introduction see Readme.1st, SGML for Writers and Editors, £30.47, ISBN 0-13-432717-9). Although this might appear to be a fairly dry, academic exercise, it has considerable benefits. It allows you to take an SGML document and use it in many different contexts. For example, it may be displayed on a screen, printed on a dot matrix or laser printer, or even converted into Braille. Because SGML codifies the structure of the document, it allows each logical element of the document to adjust itself according to the final medium, without the need for further user intervention. HTML is what is known as an application of SGML. This means it uses the conventions and ideas of SGML to define the basic structural elements of its documents. HTML is an extremely simple application of SGML, but there are two very important properties that must always be remembered when using it to create Web pages.

First, HTML is not a language for describing the appearance of a page, however much it might seem to be. Instead, it describes the underlying structure of that page. In this it differs radically from the more familiar desktop publishing programs. These explicitly define the size and position of all the components of a page.

Because HTML describes the structure, not the appearance, when creating a Web page you do not know how exactly it will appear on the screen of the person viewing it. To understand why this is so, it is worth examining what happens when a Web page is requested and retrieved.

When you use a browser such as Netscape to access a Web page, say the one at http://www.gm.com/index.htm, a request is sent across the Internet to the Web server at that address. Assuming that the requested page exists and is freely available (some require passwords before they are sent), the server returns that page to the browser over the Internet.
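The first step a browser takes can be sketched in a few lines of Python. The fragment below splits a URL into the name of the server and the path of the page, and composes the text of a simple request for that page. Nothing is actually sent over the network, and the function name build_request is invented for the example; the request format shown is a bare-bones HTTP/1.0 request, which is an assumption about the protocol version rather than something stated in the text.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Return (host, port, request_text) for a simple page request."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # Web servers conventionally use port 80
    path = parts.path or "/"         # an empty path means the top-level page
    if parts.query:
        path += "?" + parts.query
    # A blank line (the final \r\n) marks the end of the request
    request_text = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
    return host, port, request_text


if __name__ == "__main__":
    host, port, request_text = build_request("http://www.gm.com/index.htm")
    print("Connect to %s on port %d and send:" % (host, port))
    print(request_text)
```

The server's reply to such a request is the text of the page itself, which the browser then interprets.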

The page itself consists of nothing but a text file, written according to the rules of HTML. Within this document there may be references to multimedia elements (graphics, sounds etc), in which case these are sent separately. When the main HTML document arrives at the browser, it is processed in a simple but important way. The HTML structural markers are located (following SGML conventions, they are all written between angled brackets <>), and then converted to an on-screen representation. However - and this is crucial to remember - it is the browser that chooses what form this will take.

For example, some of the commonest structural elements within an HTML document are various levels of headings. When creating the HTML file, you simply specify that certain phrases are first or second level headings, etc. You are not able to specify the size of headings or a certain typeface, since these are determined by the settings of the browser that processes your HTML document. These settings can often be altered by the user.
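The point that the browser, not the author, decides the appearance can be demonstrated with a toy Python example. Two imaginary browsers, here called BROWSER_A and BROWSER_B, are given the same HTML fragment but hold different settings for first and second level headings. Both sets of settings and the render function are invented purely for illustration, and the pattern matching used is far cruder than the processing in a real browser.

```python
import re

# Matches <h1>...</h1> up to <h6>...</h6>, in upper or lower case
HEADING = re.compile(r"<h([1-6])>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)

# Hypothetical display settings for two different browsers
BROWSER_A = {
    1: lambda text: text.upper(),
    2: lambda text: text + "\n" + "-" * len(text),
}
BROWSER_B = {
    1: lambda text: "*** " + text + " ***",
    2: lambda text: "  " + text,
}


def render(html_text, settings):
    """Replace each heading with the form chosen by the browser settings."""
    def show(match):
        level = int(match.group(1))
        text = match.group(2).strip()
        style = settings.get(level, lambda plain: plain)
        return style(text)
    return HEADING.sub(show, html_text)


PAGE = "<h1>Price List</h1>\n<h2>Widgets</h2>"

if __name__ == "__main__":
    print(render(PAGE, BROWSER_A))
    print()
    print(render(PAGE, BROWSER_B))
```

The HTML is identical in both cases; only the settings differ, yet the two results look quite unlike each other.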

The fact that HTML is an SGML application therefore changes the way you must think about the Web pages you create. Remember that what you see on your screen is not what other users will necessarily get, and that special and "clever" effects will almost certainly be lost on some browsers. For HTML documents, logical structure, not dramatic layout, is paramount, and simple but effective design is the order of the day.

How to survive the data deluge from the Internet

The culmination of the incredible progress of the various Web search engines has been the recent appearance of services offering full-text searches of a very large proportion of the Internet, with all of the Web promised for some time in the near future.

The more you use these search engines, the more you become aware of the yawning gap between what is available globally and what is available locally. The problem is exacerbated by the sheer richness of the Internet and the fact that most of what you find is free. The temptation to download a file or a page is almost irresistible.

The result is plain to see on anyone's hard disc: tens if not hundreds of Mbytes of Internet-derived data and programs. For anyone who uses the Internet regularly the challenge is to manage this local data flood in the same way that the search engines are helping people cope on the global scale.

Fortunately there are solutions available that give you much of the power of an Alta Vista or Open Text search through the contents of your hard disc. Although these have not been designed specifically for Internet users, they lend themselves very readily to the task.

There are two programs for PCs running Windows: AskSam, which costs £99.95 from Guildsoft (01752) 895100, and ZyIndex, costing £395 from ZyLab UK (01235) 861681.

Both allow you to carry out full-text searches of groups of documents and other text files that have previously been indexed using this software. Both programs offer a good range of features, including Boolean searches (using AND, NOT, OR, etc), proximity searches (finding two words within a certain distance of each other) and fuzzy searching (where near-equivalents can be found for a given search word).
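The three kinds of search can be illustrated with a small Python sketch. It is emphatically not how AskSam or ZyIndex work internally, which the text does not describe; it simply builds a word index over three invented documents and shows one plausible way of answering Boolean, proximity and fuzzy queries. All function names and sample texts are assumptions made for the example, and the fuzzy matching relies on the standard library's difflib module.

```python
import difflib
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to {document name: [positions of the word]}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(name, []).append(position)
    return index


def search_and(index, *words):
    """Boolean AND: documents containing every one of the words."""
    found = [set(index.get(word.lower(), {})) for word in words]
    return set.intersection(*found) if found else set()


def search_not(index, wanted, unwanted):
    """Boolean NOT: documents with wanted but without unwanted."""
    return set(index.get(wanted.lower(), {})) - set(index.get(unwanted.lower(), {}))


def search_near(index, first, second, distance):
    """Proximity: both words present, within distance words of each other."""
    a = index.get(first.lower(), {})
    b = index.get(second.lower(), {})
    return {name for name in set(a) & set(b)
            if any(abs(p - q) <= distance for p in a[name] for q in b[name])}


def search_fuzzy(index, word, cutoff=0.8):
    """Fuzzy: documents containing words that closely resemble word."""
    close = difflib.get_close_matches(word.lower(), list(index), n=5, cutoff=cutoff)
    found = set()
    for match in close:
        found |= set(index[match])
    return found


# Invented sample documents
DOCS = {
    "modems.txt": "A fast modem makes downloading large files practical",
    "malls.txt": "Internet shopping malls list companies and their sites",
    "search.txt": "Search engines index files and sites across the Internet",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(search_and(index, "files", "sites"))
    print(search_not(index, "internet", "shopping"))
    print(search_near(index, "index", "sites", 3))
    print(search_fuzzy(index, "modems"))
```

Building the index once and then querying it is what makes repeated searches fast, at the cost of the extra disc space the index occupies.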

Response times are good for both: just a few seconds to search through several Mbytes of data. However, the approach taken is quite different in each case, and which you choose depends to an extent on your working environment.

For example, AskSam creates an entirely new file containing the data and the index, roughly the same size as the files themselves. This has the virtue that you can carry copies of this indexed version to other machines running AskSam.

ZyIndex, in contrast, creates an index that is separate from the files themselves. This has the advantage that the index is smaller than that of AskSam, and allows data spread across a network to be indexed more efficiently.

The down side is that once words and phrases have been located, you need to load the file's native application (e.g. Microsoft Word) to view it in its original form. (ZyIndex can only display rather crude ASCII excerpts.)

For the Macintosh there is the program On Location, £99 from ESP (01628) 23453. This offers fewer text retrieval facilities than the PC programs, but also serves as a more general file-search tool.

These free-text search engines solve one part of the problem of the data deluge - how to find things - but leave untouched another aspect. Using a 28.8kbit/s modem it is quite possible to download tens of Mbytes of files from the Internet in a day, and with leased lines much more can be obtained.
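A quick calculation shows that the figure is realistic. The Python sketch below takes the modem speed as 28,800 bits per second, the usual rating of such modems, and works out the most that could be transferred in a given number of hours. It assumes eight bits to the byte and a million bytes to the Mbyte, and it ignores protocol overheads, compression and line noise, so real figures will differ; the function name is invented for the example.

```python
def max_download_mbytes(bits_per_second, hours):
    """Upper limit on data transferred, in Mbytes (millions of bytes)."""
    bytes_per_second = bits_per_second / 8
    seconds = hours * 3600
    return bytes_per_second * seconds / 1_000_000


if __name__ == "__main__":
    for hours in (1, 4, 24):
        print("%2d hours: %.2f Mbytes"
              % (hours, max_download_mbytes(28_800, hours)))
```

At 3,600 bytes a second, one hour online yields just under 13 Mbytes at best, so a few hours of downloading is enough to reach the tens of Mbytes mentioned above.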

Even with Gbyte hard discs, such a stream of files can soon fill up the most ample storage. Moreover, many programs (such as the latest versions of Netscape Navigator) are so large they will not fit on a floppy disc (and dividing programs across floppies is not simple).

Happily, a new generation of low-cost removable discs has arrived to meet this need. Products such as Iomega's Zip drive (available for the PC and Macintosh, priced at about £108+VAT) make it possible to store 100Mbytes of data on a disc costing under £10. And if that isn't enough, the Iomega Jaz drive (£229+VAT) offers no less than one Gbyte on a removable medium (£52), which should keep even the most voracious of downloaders happy. Iomega is on (0800) 973194.

Creating a World Wide Web for every tongue

Few people have tried to add multilingual capabilities to their browsers. A good reference point for users interested in this area is the site for multi-lingualism and the Internet (http://wwli.com/library/localize.html).

Perhaps not surprisingly, Microsoft's Internet Explorer is well thought-out as far as international use is concerned. For all its faults, Microsoft is a company that is keenly aware of the importance of localisation for software. Internet Explorer is available in nine languages (that is, with menus changed appropriately), and it is also easy to add extra character sets to view multilingual pages.

This is done by downloading the appropriate font file from http://www.microsoft.com/msdownload/ieadd/03.htm (the choices are Simplified Chinese, Traditional Chinese, Japanese, Korean and Pan-European). Running the file causes it to be installed and the relevant changes for Internet Explorer to be made automatically. Thereafter, a pop-up list of the available character sets appears in the bottom right-hand corner of the browser window. When you encounter a page using a character set other than Latin-1 you can simply select from this list to refresh the page.
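Why the choice of character set matters can be seen from a short Python sketch. The same sequence of bytes is interpreted twice: once as Latin-1 and once as Shift-JIS, which is assumed here as a typical Japanese encoding (the text does not say which encodings the browsers actually use). The function name decode_page and the sample text are invented for the example.

```python
def decode_page(raw_bytes, character_set):
    """Interpret the bytes of a page using the named character set."""
    return raw_bytes.decode(character_set, errors="replace")


# "Nihongo", the Japanese word for the Japanese language
JAPANESE_TEXT = "日本語"

# The bytes as they might arrive from a Japanese Web server
RAW_BYTES = JAPANESE_TEXT.encode("shift_jis")

if __name__ == "__main__":
    print("Read as Latin-1:  ", repr(decode_page(RAW_BYTES, "latin-1")))
    print("Read as Shift-JIS:", decode_page(RAW_BYTES, "shift_jis"))
```

Read as Latin-1 the bytes come out as a string of unrelated accented letters and control codes; only when the matching character set is selected does the original text reappear, which is what choosing from the browser's list achieves.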

Adding this capability to Netscape is much harder, and reflects this young company's relative inexperience in dealing with international markets. First, you need to find the relevant fonts yourself (in practice, the simplest solution is to use those provided by Microsoft). Once these have been installed, you must then activate them in Netscape. This is done from the Options menu, choosing General Preferences and Fonts. For each of the encodings you specify the font that you have added. Then, to use this encoding for a Web page, you will need to go to the drop-down list available on the Document Encoding entry on the Options menu.

None of this is very intuitive; worse is the fact that for Japanese font capabilities you have to edit an entry in the Windows registry - the software equivalent of brain surgery, and about as risky. If you want full foreign language capabilities for Netscape, it may be easier to buy the plug-in (http://www.accentsoft.com/) called Navigate with an Accent from Accent Software. This adds a new drop-down list of character sets alongside the main menu buttons. An evaluation copy (http://www.accentsoft.com/download/dleng.htm) is available. Unfortunately the add-in disables important features such as plug-ins, frames and Java.

Accent produces its own standalone browser (based on the original Mosaic). This too adopts a drop-down list of language options, though strangely Chinese is absent. Accent does, however, offer both Arabic and Hebrew, something that neither Internet Explorer nor Netscape is capable of. An evaluation copy is available from the URL given above. Another product in the Accent range that can be downloaded from there in a trial version is Accent Publisher. This addresses the other side of the multilingual problem: creating Web pages with character sets other than Latin-1.

With Accent Publisher, you can design a page in most European languages plus Arabic and Hebrew (floating keymaps let you use a QWERTY pad to enter non-Latin characters) and then convert it to an HTML file automatically. More advanced features such as tables are supported. Also notable is the ability to swap among 21 languages (including Arabic, Greek, Hebrew, Russian and Turkish) for the menus.

Another browser product based on the original Mosaic is Tango, from Alis Technologies (http://www.alis.com/), whose site was mentioned last week as a useful starting point for exploring Internet multilingual issues. An evaluation copy (http://www.alis.com/internet_products/try_form.en.html) can be downloaded. Tango can display no fewer than 90 languages, including Arabic, Chinese, Greek, Hebrew, Korean, Russian and Thai. The interface can be switched to any of 19 languages. The corresponding creation software, called Tango Creator, lets you compose HTML pages in 90 languages using character sets other than Latin-1, and supports tables and frames.

Why the Web will be the font of all corporate data

Several surveys in the US have indicated that the majority of larger companies there already have intranets or are implementing them. The UK is still a little behind in this area, but as with the Internet in general it is probably further advanced than any other country outside North America.

Since even the most basic of intranets - one where you replace costly and inefficient paper-based communication systems within companies by equivalent TCP/IP network technologies - is so simple to grasp and so compelling as an idea, it is easy to forget its intrinsic limitations.

Much of the intranet's enormous potential derives from the use of Web browsers as the common front-end to corporate information. These are easy to use and platform-independent, making rollout across a company straightforward both in terms of training and development. But where simple intranets fall down is at the back-end.

Normally an internal Web system will run off one or more Web servers; this will therefore mean that all of the information to be made available on the intranet must first be transferred (and possibly translated) to the store of HTML documents that are served across the network. This is relatively straightforward for text documents, but for anything more complex - in particular for the kind of information held in corporate databases or financial systems - the simple intranet approach is not enough.

As a result, there is now a growing interest in the marriage of the conceptual simplicity of Web front-ends with the rich complexity of corporate databases, mediated by the TCP/IP-based intranets.

The vision driving the very many disparate approaches being developed is to employ the Web browser as the universal client that will allow anyone to access any information held on a company's heterogeneous array of data servers. This would avoid the need for new, proprietary software on every desktop or costly re-training.

In many respects, this coming together of Web and database represents the second generation of the Internet and intranets in business. On the external Web such systems are paramount for electronic commerce, and still thin on the ground as a consequence of the work they require in setting them up.

There, typically, a customer will access a database of product information via a Web browser, and place an order. The customer and order details would then be entered into another database and fed from there into the vendor's fulfilment system. Both sides of the equation therefore require tight integration of the Web front-end and the database back-end.
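The round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names, columns and sample products are invented, and for brevity the product and order tables live in one in-memory SQLite database rather than in the two separate databases the text describes.

```python
import sqlite3
from html import escape


def build_catalogue(conn):
    """Create two small tables (hypothetical schema, for illustration only)."""
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price_pence INTEGER)"
    )
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, product_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO products (name, price_pence) VALUES (?, ?)",
        [("Modem", 9900), ("Network card", 4500)],
    )


def render_catalogue(conn):
    """Turn the product rows into the HTML page a browser would receive."""
    rows = conn.execute(
        "SELECT id, name, price_pence FROM products ORDER BY id"
    ).fetchall()
    lines = ["<html><body><table>"]
    for product_id, name, price_pence in rows:
        lines.append(
            "<tr><td>%d</td><td>%s</td><td>%.2f</td></tr>"
            % (product_id, escape(name), price_pence / 100)
        )
    lines.append("</table></body></html>")
    return "\n".join(lines)


def place_order(conn, customer, product_id):
    """Record an order; a fulfilment system would later read this table."""
    cursor = conn.execute(
        "INSERT INTO orders (customer, product_id) VALUES (?, ?)",
        (customer, product_id),
    )
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
build_catalogue(conn)
page = render_catalogue(conn)                    # what the browser displays
order_id = place_order(conn, "A. Customer", 1)   # what the vendor stores
```

The point of the sketch is the division of labour: the browser only ever sees HTML, while all reading and writing of data happens behind the Web server.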

The importance of this integration of Web and database is even more crucial for intranets.

Corporate databases in all their forms represent a unique and therefore highly valuable store of information about customers, markets, divisions and future trends. Enabling people to get at these easily, and to drill down in myriad ways to find other, possibly unsuspected kinds of data deposits could offer major benefits in terms of a company's day-to-day running and longer-term planning.

Because the Web, intranets and their associated standards are so new, and because databases are so much an established part of corporate computing, there remain many thorny problems to be resolved before this golden age of internal information retrieval dawns. Although it is quite possible to lash up quick fixes using more or less any of the many development tools that are currently available, the strategic importance of this area means these must be superseded by solutions that are properly thought-out, reliable and fully scalable.

Why small is beautiful on an intranet

Scalability is one of the main reasons why the TCP/IP protocol is used for business intranets. The same basic elements can be used for a multi-national company employing hundreds of thousands of people as for a small office network: at no point is it necessary to switch technologies as the number of users increases, and there is no need for bridges between different networks.

Another great advantage of intranets is that they are very easy to set up, although maintenance may be another matter. This means that they can be rolled out not just centrally, but locally too, with each department, office or even worker able to create and run their own personal servers, which will typically be Web-based.

Clearly, money is an important issue: if many sites are being set up, the overall expense can soon mount up. This means that traditional heavy-duty Internet or corporate intranet solutions such as Sun Sparc systems running Solaris are ruled out for widespread local use. One solution that might well find favour is to use a PC running the Linux operating system and the Apache Web server, both of which can be downloaded for free.

Although some managers may be nervous about entrusting their data to free software, they can be comforted by the knowledge that more than 40% of all Internet Web servers use this software, according to the definitive Netcraft guide. Another obvious candidate is Windows 95. Microsoft has brought out its aptly named Personal Web Server, which is free, while Netscape has its Fast Track Server (£220), and there are many other freeware and shareware programs for Windows 95.

However, as anyone who has used the platform for a while will know, Windows 95 is hardly robust enough for this kind of role. Windows NT is far more appropriate, albeit more expensive. This is particularly the case since Microsoft has chosen to impose an arbitrary limit in terms of maximum simultaneous TCP/IP connections on its cheaper Windows NT Client product, which more or less forces you to use Windows NT Server if you are likely to exceed this limit. The benefit of moving up to NT Server is that you automatically get Microsoft's Internet Information Server (IIS) for free. This is an impressive piece of software, now in its third release, and may well be justification enough for swallowing the price of NT Server.

An alternative is Netscape's Fast Track server, which also runs under NT. A particular benefit of this product is the way it employs a Web browser as the main administration tool, rather than a standalone program as with Microsoft's IIS. Indeed, this is likely to become the standard way of administering many functions, and Netscape deserves much of the credit for pioneering this approach. Alongside these high-profile products, there are two others that are worth noting: Purveyor Encrypt (£599 from Process Software) and Website (which costs £365 from O'Reilly & Associates). Both are notable for the excellence of their documentation.

Nor do they suffer in comparison with Microsoft's and Netscape's products. Like them, Purveyor and Website offer secure transactions (increasingly important even for small intranets), as well as easy access to programming interfaces. Where Microsoft and Netscape have ISapi and NSapi, Purveyor has ISapi too (which Process helped Microsoft develop) and Website has WSapi. All four products have sophisticated ways of integrating with back-end databases.

Version 2.0 of the Wingate program is now available. This software allows entire networks to be connected to the Internet through a single Windows 95/NT machine, which acts as a proxy server.

How to make money from your Web site

One of the initial promises of commercial Web sites was to generate revenues from the huge and global audiences the Internet offers. Three revenue models were available: subscriptions, advertising and commerce. Subscriptions, almost without exception, have failed miserably: Internet users seem unwilling to pay for something they have by now come to think of as free. E-commerce is still very small-scale, though recent forecasts have become very optimistic in the light of technological developments (the agreement of SET etc.).

This has left advertising as the primary means of making money with Web sites, notably through the use of banner ads. These are block-shaped advertisements, typically placed at the top and bottom of Web pages, with links to the advertiser's site. Practically all of the most popular sites employ them, and earn millions of dollars as a result. For those companies that have established a Web site and built up a more modest readership, it has hitherto not proved possible to convert this audience into money. The main problem is creating a viable sales infrastructure: setting up a Web site is one thing, managing an ad sales force to go with it, quite another.

This has led to the rise of services that sell advertising space on others' Web sites. There is much to be said for this idea. For the Web site owner, it obviates the problem of running an ad sales team, and even small sites can gain revenues by being bundled with others to form attractive packages of sites - advertisement networks, as they are called. For the advertiser, there is the advantage of a single purchase, along with better targeting: with large advertising networks it is possible to put together tailor-made groupings of sites.

Moreover, the technique employed by most services - whereby the advertisements are held on a central server and accessed by URL references to them in the participating sites' Web pages - means that obtaining statistical information about visitors is much easier. Although this is a very new area, there are already hundreds of companies offering these services. The best resource is the excellent collection of links at Web Site Banner Advertising (http://www.ca-probate.com/comm_net.htm). As well as companies offering to sell advertising space on a company's Web site on a commission basis, listed here are also those that hold ads centrally and pay according to hits. Other variants include companies that use an auction technique (for example AdBot http://www.adbot.com/) and even host the entire site themselves (Intercity Oz Network http://interoz.com/network/).
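The central-server technique is easy to picture in code. The following Python sketch is purely illustrative: the server name ads.example.com, the URL layout and the class and function names are all assumptions, not a description of how any of the services listed here actually works.

```python
from collections import Counter

# Hypothetical central ad server; not a real service.
AD_SERVER = "http://ads.example.com"


def banner_tag(site_id, slot):
    """Return the HTML a participating site would paste into its pages.

    The image is fetched from the central server, so every page view on
    the participating site produces a request the ad service can count.
    """
    image_url = "%s/banner/%s/%s" % (AD_SERVER, site_id, slot)
    click_url = "%s/click/%s/%s" % (AD_SERVER, site_id, slot)
    return '<a href="%s"><img src="%s" alt="Advertisement"></a>' % (
        click_url,
        image_url,
    )


class AdServer:
    """Toy stand-in for the central server: it only counts impressions."""

    def __init__(self):
        self.impressions = Counter()

    def serve(self, site_id):
        """Register one banner request coming from a page on site_id."""
        self.impressions[site_id] += 1

    def report(self):
        """Impressions per site, busiest first."""
        return self.impressions.most_common()


server = AdServer()
for visit in ["site-a", "site-b", "site-a"]:
    server.serve(visit)
```

Because every banner request passes through one machine, the per-site figures fall out of a single counter rather than having to be collected from each participating site's own logs.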

There are a few relatively well-established services such as DoubleClick (http://www.doubleclick.net/), SIMWeb ( http://www.simweb.com/, part of Softbank) and WebRep (http://www.webrep.com/). The quality of their current clients is perhaps the best guarantee that they are offering a serious service generating real money. However, they only deal with very large sites. Of the newer entrants, it is probably safe to assume that schemes involving well-known companies such as Infoseek (http://info.infoseek.com/network/info.html), British Sky Broadcasting and the UK publisher EMAP (http://www.webwidemedia.com/ ) will be less problematic than working with unheard-of start-ups.

But in general the watchword for this whole new area is caution. Few of these companies have any track-record that can be examined, and it will take many months - perhaps years - before this interesting idea has been refined into something that will allow ordinary corporate Web sites to generate money in this way. Until then, you may prefer to adopt a rather safer strategy. Rather than worrying about whether the people selling space on your pages are just making money out of you, you can simply exchange banner ads with other sites. The relatively well-established Link Exchange (http://www.linkexchange.com/) has been organising this for some time, and uses a system of ad banner credits. It costs nothing to join and there are no charges. There is a UK-based operation called LinkSwap UK (http://www.ukwebmarketing.com/linkswap-uk/) that is also free, and even simpler.

New tools to create the perfect Web site

NetObjects Fusion is notable for the power of its design tool; you can place elements on a Web page with pixel-level accuracy. This means that designers can fine-tune their pages and be reasonably sure that what appears in the visitor's browser will look very similar to their intentions. However, there is a price to be paid for this very unWeb-like behaviour: the huge number of tables and small graphical elements the program uses to pad out space.

HTML purists will probably be shocked if they examine the HTML code generated, and certainly they will find it very hard to edit it by hand. However, if such matters do not worry you, and design considerations are paramount, then Fusion is probably the best tool. It also offers standard site management facilities such as link checking, and a very visual though fairly limited approach to hooking up with back-end databases.

Microsoft's FrontPage 97 is more orthodox, and it is much easier to edit the HTML code directly, not least because FrontPage produces code that is automatically indented (examples of this can be seen at the site http://dialspace.dial.pipex.com/glyn.moody/, created using FrontPage).

Microsoft also offers other, complementary tools for the Web site development process. For example you can use Visual Source Safe to manage simultaneous development, while the new Visual InterDev (see) is a powerful tool for linking Web pages to back-end databases, and creating Active Server Pages, which generate HTML on the fly.

Corel's WebMaster Suite provides a similar range of tools for editing pages (Web.Designer), managing a site (Web.SiteManager) and database integration (Web.Data - which employs a very simple nine-step approach), and trial versions can be downloaded from. In some respects Corel's tools are even better than Microsoft's. For example, as well as indenting HTML source, it also colour-codes the links within it according to whether they have been checked, are broken, etc.

SoftQuad's Web site tools represent in many respects the antipodes to NetObjects' Fusion. Where the latter is particularly concerned with appearance, the former concentrates on content. One consequence of this is that HoTMetal Pro 3, SoftQuad's Web page editor, is now looking distinctly old-fashioned with its non-WYSIWYG approach and its limited design capabilities.

However, its site tool, called Information Manager, offers a number of interesting features. For example, unlike Fusion's simple hierarchical representation, or the more flexible approaches of FrontPage and WebMaster, Information Manager uses what it calls a Cyberbolic display of elements employing a spherical geometry to give you an ingeniously compact view of your site.

Perhaps even more impressive is SoftQuad's intranet product, HoTMetal Intranet Publisher (HiP). This offers the same basic features as the Internet product, but adds the extremely powerful ability to add user-defined extensions to HTML to allow different views to be produced from the same basic HTML code (so that accounts and production can pull out different relevant information from the same document, for example).

It does this by employing cascading stylesheets, the only Web site tool to support this new standard. Also provided is a server-side tool for monitoring Web site usage and sending user notifications of changes to pages of interest. However, HiP's lack of database tools is a serious weakness in the current product - one that is apparently being addressed in the next release.

How the Web is going to turn up everywhere

Microsoft is making a fascinating attempt to integrate Internet functionality directly into the Windows operating system: in effect, the Web browser becomes the interface to the entire computing environment. Although bold - not least for the rigour with which it has been carried out - this move is by no means unprecedented.

For example, last year Microsoft introduced NT Web Admin to allow Windows NT environments to be administered using a Web interface. Similarly, it has been possible for some time to control both Microsoft's Internet Information Server and the cut-down Personal Web Server employing just a Web browser.

There are two big advantages to this approach. First, and most obvious, is the ability to exercise control at a distance: if the target system is connected to a TCP/IP network it can be manipulated from anywhere. More subtly, Web interfaces draw on the intuitive nature of the browser. One of the reasons why the World Wide Web has taken off so dramatically, particularly in business, is that it requires only the most minimal training. As some cynics have put it, a browser is software even a Chief Executive can use.

The credit for this shift towards using the Web for general interface purposes must be given to Netscape. When it launched its Web servers it adopted the then-new technique of administering them using a standard Web browser (which accesses the server on a non-standard port number). Microsoft's adoption of this technique sets the seal on the idea. The beauty of the Web approach, along with its essential simplicity, is that it can be applied to almost any field. For example, Pipex has created a service whereby its Internet subscribers can read their e-mail using a Web interface - which means that it can be read anywhere in the world that an Internet connection is available.

More ambitious is the attempt by a consortium including Microsoft, Cisco, Compaq and Intel to create a complete Web-based approach to managing all aspects of networks, and from any location. An exemplary site devoted to the initiative has been created, with details of the basic ideas and the elements involved, including an illuminating Web-based demo and full details of the new HyperMedia Management Protocol.

Nor is this all promises; BMC has already added support for this proposed standard to its Patrolwatch Management Suite. Similarly, Cisco has employed a Web interface for its ClickStart management software for some time. Other network hardware devices are also being drawn into this approach. For example, Hewlett-Packard has developed Web JetAdmin for managing its JetDirect printers, while IBM has created a Java-based Network Printer Manager. Even CD-ROM units can now be controlled from a distance through a Web browser.

The Web interface can also be applied to software, as the server administration tools described above show. But so powerful is the approach that it can be used with any kind of application. For example, both Softquad and Netiva have come up with databases that are accessed purely through Web interfaces. And the use of Java means that potentially any kind of functionality can be added to a Web page while retaining the basic metaphor.

This Web-based approach could become even more central to computing if the ambitious vision underlying the next version of the TCP/IP protocols, IPv6, is realised. IPv6 was designed in part to allow IP addresses to be allocated to just about every electrical object on the planet - from light bulbs and toasters up; what could be more natural than accessing them from a distance via a Web interface?

Make a spectacle of your Web site

Web site development is no longer a job for amateurs and enthusiasts, writes Brian Clegg. As delegates at the Web 98 Design and Development Conference in Boston, US, in September can vouch, doing a professional job requires knowledge of strategy, usability, information and visual design, and programming.

The first consideration in setting up a Web site is the server software needed to host it. A Web site is, in effect, a simple database. This database, the server, uses hypertext transfer protocol (HTTP) to deliver the specific Web page requested by many browsers simultaneously.

But delivering pages efficiently is no longer enough: a Web server must be able to execute programs to retrieve data, format pages on request and give a page life.

Depending on the server's abilities, this external program might be written in a language such as Perl, C++ or Visual Basic. In each case the application communicates with the Web server using a Web technology called the common gateway interface (CGI).
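To make the idea of CGI concrete, here is a minimal sketch in Python (chosen here only for brevity; it is not one of the languages named above). Under CGI the server passes request details to the program in environment variables such as QUERY_STRING, and the program writes an HTTP header, a blank line and then the page to its standard output. The function name handle_request and the "name" parameter are invented for the example.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a complete CGI response from the server-supplied environment.

    `environ` is any mapping that carries the CGI variables; a real server
    places them in the process environment before starting the program.
    """
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a default greeting.
    name = query.get("name", ["world"])[0]
    body = "<html><body><h1>Hello, %s</h1></body></html>" % escape(name)
    # A CGI response is: header lines, one blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When launched by a Web server, answer the request on standard output.
    sys.stdout.write(handle_request(os.environ))
```

A request for a URL ending in ?name=Ann would, on this sketch, produce a page greeting Ann. The user's input is escaped before being placed in the page, a precaution worth taking in any program that echoes form data.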

Microsoft has provided an extra twist by adding ISapi - an interface that allows external programs to plug more directly into the server.

Operating system
The next thing to consider is the operating system on which the Web server will run. If the platform is Windows NT, the obvious choice is Microsoft's Internet Information Server, which is free, fast and feature-rich.

Page functionality can be added by writing direct to ISapi, using any development environment that can build a dynamic linked library; or by scripting Active Server Pages using Perl or the popular browser scripting languages. If you want to provide access to a wide range of data try OLE-DB, Microsoft's latest object interface for databases.

Even more development options are available with O'Reilly's Website Professional, which adds CGI and its own proprietary interface. However, it costs about £500, has less general functionality and is probably better suited to smaller enterprises.

If the platform is Unix, the most powerful choice of server is Netscape's Enterprise Server. Here, development interfaces are provided for NSapi (Netscape's equivalent of ISapi), for CGI and for the Java language. Enterprise Server also has a huge range of connectivity options, making it an excellent choice for fronting up a database. But it isn't cheap - about $1,200 (£750).

One budget option is Apache. It is limited to CGI but has add-on modules to extend it - and it's free. If you have a free choice, Internet Information Server probably has the edge at the moment, with Enterprise Server coming a close second.

Once the server is chosen, the site needs to be built. Hypertext markup language (HTML), which is used to describe Web pages, is plain text.

However, developing a site with a text editor such as Notepad in Windows is tedious and risky. Most developers use a visual design tool, which acts like a word processor for the Web.

Simple pages can be created using actual word processors such as Microsoft Word and Lotus Wordpro, which generate HTML from an ordinary document. Developers can also use free Web editors such as Netscape's Composer which is included in the Communicator Suite, or Microsoft's Frontpage Express which comes with Windows 98 or Internet Explorer 4.

For professional sites, though, products such as Softquad Hotmetal Pro, Microsoft Frontpage, Adobe Pagemill and the unusual free-format Netobjects Fusion offer a much wider range of features.

These typically allow Web designers to preview pages in multiple browsers. They offer high-end facilities such as style sheets, and support for scripting. There are also software packages that manage the Web site, map its layout and check for dead ends, and upload pages to the live site from a test environment.

Style sheets
As the tools have matured, so has the underlying language. HTML has been revised and extended. Most notably, it now offers cascading style sheets, providing a mechanism to fix items on a Web page and set standard styles across a site.

At the same time, other extensions enhance the presentation of data within Web pages (XML), and make possible pages that can be tailored to an individual or change on the fly.

Traditionally, scripting ran on the server. This meant it was independent of the functionality that the end-user's Web browser had to offer. But increasingly, scripting has moved to the browser.

Both major manufacturers support the de facto standard Javascript, a scripting language that borrows some of its syntax from Java but is otherwise distinct from it, while VBScript, derived from Visual Basic, is available in Microsoft's browser, or through a Netscape plug-in.

One development that has been around for a while without achieving the penetration initially expected is "push". This technology enables users to subscribe to "channels" - special subsets of sites with active information. By contrast, another advance - hybrids of the Web and TV - is already under test.

Few environments change more quickly than the Web, and consequently Web site development remains a frantically moving world.

Design trends

Definitions

Active-X

The surprisingly wide support that Java has generated derives in part from manufacturers excited by the possibility of software-on-demand, perhaps sold on a per-use basis and delivered directly to your machine over the Internet.

Microsoft has responded vigorously to this and has come up with its own Java-like approach. Recognising that there was a demand for 'componentware', Microsoft has plucked a fairly technical aspect of its programming products from obscurity and promoted it to linchpin of its new Internet strategy.

Active-X is the latest incarnation of OCX, which itself derives from OLE Component Object and evolved from VBXs (Visual Basic custom controls) found in Microsoft Visual Basic. These are software elements - components - that can be used in a variety of projects.

Active-X controls are an outgrowth of this software recycling. They add extra features, the ability to work in both 16- and 32-bit environments, and greater portability than that offered by VBXs. This last property has allowed Microsoft to recast them as its platform-independent Java-killer, with the added bonus that this builds on the highly popular and well-understood technology of VBXs.

The distinction between Active-X components and Java is that the former is compiled and therefore the appropriate binary has to be downloaded when required for the target computer. Java is transmitted as code and compiled upon demand.

Visual Java mirrors Microsoft Visual C++ tools and includes a way of turning Java applets into ActiveX controls. Using Visual Basic scripts as the glue that binds all these ActiveX elements together, Microsoft has managed to extend the functionality offered by Java.

Microsoft has made available a free software development kit (SDK) for use with its new Visual InterDev Web application development system. The Design-Time ActiveX Control SDK aids the creation of server-side components for use in Active Server Pages.

Why Microsoft has made ActiveX take a back seat

The most interesting aspect of the UK launch of Internet Explorer 4.0 was not what was said but what was omitted. During the surprisingly amateurish two-hour presentation the ActiveX approach was referred to neither directly nor indirectly. ActiveX was introduced back in December 1995 as Microsoft's answer to the then relatively new Java applets. Rather than taking a chance on new and untried technology, so Microsoft's argument went, far better to go with ActiveX, a new incarnation of OCXes.

According to Microsoft, ActiveX could match all the exciting new interactive and dynamic features offered by downloaded Java applets in Web browsers, and required no investment in new languages or techniques. The fact that OCXes were strictly for the Windows platform would be addressed by extending the technology to the Macintosh and Unix at some point in the future.

With its usual superb marketing, Microsoft was able to convince the Internet world that there was therefore a real rival to Java applets, and that Java's apparent raison d'être - to extend Web browsers - had disappeared. Having come up with an approach that was clearly reactive and tactical, the company built on the initial success of the idea and made it central to its Internet strategy and, later, to its entire computing strategy.

ActiveX controls migrated from the client side to the server, notably with the Active Server Pages approach, and formed the basis of Microsoft's DCom component technology and fledgling transaction processing model. But while these important shifts were going on behind the scenes, the most visible manifestation of ActiveX remained on the client side.

This makes Microsoft's retreat all the more dramatic: dramatic, but perhaps inevitable. ActiveX's security model is fundamentally flawed. Whereas Java applets are restricted in terms of the operations they can carry out once they have been downloaded to a client, ActiveX controls can do anything their creators desire.

Microsoft's response to this unacceptable situation is Authenticode. This employs digital certificates to ensure that ActiveX controls are not tampered with as they pass across the network, and to provide a sure way of establishing who wrote them. The idea is that if a control can be shown to come from a trusted source - a major software house, say - then it can be left to operate freely on the user's system. But the trouble with Authenticode is that it places the burden on the user: he or she must decide whether a digital certificate provides enough assurance to allow the corresponding control full access to a system.

This is of course unrealistic for most users who are not experts in digital certificates and simply want to get on with their work. Moreover, if rogue ActiveX controls are accepted and cause damage, there is no guarantee that the certified software publisher still exists, much less is within any useful jurisdiction where it can be sued or prosecuted.

In other words, for all practical purposes, ActiveX controls are useless on the open Internet, since they simply cannot be trusted. Microsoft has tacitly recognised this with the introduction of security zones in Internet Explorer 4.0. Using these it is possible to set defaults in terms of accepting or rejecting ActiveX controls, according to whether they originate on the Internet or from within an intranet.

Security zones mean that ActiveX controls can still be employed within a corporate intranet, since their origin and capabilities are presumably known. And this in its turn means that Microsoft's server-side ActiveX strategy - and with it Active Server Pages, DCom and Transaction Server - is still viable. But it does mean that ActiveX controls are unlikely to be used for public Web design.

Instead, Microsoft is pinning its hopes on Dynamic HTML, which made an appearance at the Internet Explorer 4.0 launch, and which will presumably now take over as Microsoft's latest anti-applet technology.


ActiveX Web Database Programming (£27.49, ISBN 1-861000-46-4) presents an excellent practical introduction to Microsoft's alternative middleware technologies.

Address classes/IP address

The Internet uses a 32-bit address scheme, called IPv4 (Internet Protocol version 4), to define hosts on the global network. This 32-bit address is usually written as four eight-bit numbers in the decimal form 123.45.67.89, where each of the four elements is less than 256. These Internet addresses, of which there are theoretically 4,294,967,296 (though in practice there are fewer because blocks of numbers are reserved for special purposes), are split up into various classes, each of which has important characteristics. Moreover, the way in which blocks were allocated to users (especially in the early days of the Internet) means that there are now relatively few of these addresses left.

Class A IP addresses are those whose first element runs from 1 to 127. Because the other 24 bits can be freely assigned by the holder of a Class A address, this gives a network with a potential 16,777,216 Internet addresses. Examples of these lucky organisations include IBM, which has the Class A address 9.0.0.0, and MIT, which has 18.0.0.0.

It was quickly realised that the entire address space would soon be exhausted if many Class A addresses were handed out, and so Class B numbers became the default.

Class B numbers run from 128.0.0.0 through 128.1.0.0 to 191.255.0.0, giving 65,536 addresses for each of the 16,000 or so networks available.

However, such has been the growth of the Internet that even Class B addresses are in short supply. It is now the policy to give out Class C numbers, which run from 192.0.0.0 through 192.0.1.0 to 223.255.255.0, and each of which has 256 addresses.

Organisations requiring more than this number may need to take several Class C networks.

The remaining IP addresses, those from 224.0.0.0 up to 255.255.255.255, are divided into two more classes: D and E. However, these are reserved for special purposes, and are not allocated to individual networks on the Internet.
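Since the class of an address depends only on its first element, the scheme can be expressed as a short lookup. The sketch below is illustrative: the function name address_class is invented here, and the split between Class D (224 to 239) and Class E (240 to 255) follows the conventional boundary rather than anything stated above.

```python
def address_class(address: str) -> str:
    """Return the classful category (A to E) of a dotted-decimal address."""
    first = int(address.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError(f"first element out of range: {first}")
    if first < 128:
        return "A"
    if first < 192:
        return "B"
    if first < 224:
        return "C"
    if first < 240:
        return "D"  # conventional boundary, not taken from the text above
    return "E"


# Addresses available inside a single network of each allocatable class.
ADDRESSES_PER_NETWORK = {"A": 2 ** 24, "B": 2 ** 16, "C": 2 ** 8}

for example in ("9.0.0.0", "18.0.0.0", "128.1.0.0", "192.0.1.0"):
    cls = address_class(example)
    print(example, cls, ADDRESSES_PER_NETWORK[cls])
```

The table of sizes shows why Class A allocations were so costly: one Class A network holds as many addresses as 65,536 Class C networks.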

The solution to this shortage is relatively simple: to increase the address size and with it the number of possible Internet nodes. This is the approach adopted with what is called IPv6 or IPng (for Next Generation). What is surprising is the scale of the extension that has been adopted. Instead of the current 32-bit system, IPng will use no fewer than 128 bits for addresses. This does not give a mere four-fold increase: in fact, according to the excellent introduction to the whole subject of IPng at the URL http://playground.sun.com/pub/ipng/html/INET-IPng-Paper.html, an address length of 128 bits implies 340,282,366,920,938,463,463,374,607,431,768,211,456 (3.4x10^38) nodes. Since such large numbers are difficult to grasp, the same source puts things in context by pointing out that this figure represents 665,570,793,348,866,943,898,599 (6.7x10^23) possible nodes for every square metre of the planet.
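The arithmetic is easy to check, since Python handles integers of any size. In this rough sketch the figure used for the Earth's surface area (about 5.1x10^14 square metres) is an assumption supplied for illustration, so the per-square-metre result is approximate.

```python
# Number of distinct 128-bit addresses.
total_addresses = 2 ** 128

# Assumption: the Earth's total surface area is roughly 5.1e14 square metres.
EARTH_SURFACE_M2 = 5.1e14

per_square_metre = total_addresses / EARTH_SURFACE_M2

print(f"{total_addresses:,}")
print(f"{float(total_addresses):.1e}")
print(f"{per_square_metre:.1e}")
```

Going from 32 to 128 bits multiplies the address length by four, but the number of addresses is raised to the fourth power: 2^128 is (2^32)^4.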

This extraordinary number does not simply represent some rather excessive caution on the part of the IETF working group that drew up the new standard embodied in RFC1752, which is available from ftp://ds.internic.net/rfc/rfc1752.txt . It hints at something altogether grander and more exciting.

For with such huge numbers available it is possible to look beyond allocating Internet addresses to every computer. IP numbers could be given to every computer peripheral; to every piece of business equipment in an office - fax machines, photocopiers, telephones (assuming they had some kind of networkable digital control element that could become part of this super-Internet).

And beyond this lies the integration of an even broader range of digitally-controlled electrical devices into a massive total network girdling the world. Included could be transport systems (most cars already have as many chips inside them as a desktop PC) and even semi-intelligent domestic appliances. The day when individual light bulbs can be accessed and controlled over the Internet is perhaps not so far off - and already implicit in IPng.

Although the physical infrastructure can expand almost endlessly, some of the logical elements have limits in their current incarnations. Perhaps the most serious of these has to do with routing tables.

These determine how the individual data packets should be routed over the multiply-connected Internet. As the latter grows and changes on a daily basis, so these tables need to be updated - sometimes several times a day.
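As a very simplified picture of what a routing table does, the sketch below picks the most specific matching entry for a destination address. The table contents and link names are entirely hypothetical, and real routers use far larger tables and specialised data structures; only the standard Python ipaddress module is assumed.

```python
import ipaddress

# Hypothetical routing table: network prefix -> outgoing link.
ROUTES = {
    "0.0.0.0/0": "default-gateway",
    "9.0.0.0/8": "link-a",
    "128.1.0.0/16": "link-b",
    "192.0.1.0/24": "link-c",
}


def next_hop(destination: str) -> str:
    """Return the link for the most specific route containing the address."""
    dest = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, hop in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, hop
    return best_hop


print(next_hop("9.1.2.3"))
print(next_hop("192.0.1.77"))
print(next_hop("10.0.0.1"))
```

Every new network that appears on the Internet potentially adds another entry to tables like this one, which is why their size and the frequency of updates become a problem.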

As well as resolving issues of address space - finding enough Internet addresses for new users is now becoming a problem - IPv6 adopts a more efficient approach to routing that avoids the need for massive tables and their frequent update.

Leaving aside the fact that IPv6 has not yet been implemented - and will require some years before it is fully rolled out - there remain broader challenges for the Internet. For example, Internet phones, along with other kinds of multimedia traffic, generate huge quantities of data packets. The current Internet infrastructure and pricing system is simply not designed to cope with this flood, which is likely to clog the system increasingly in the future.

It may, for example, be necessary to implement new pricing schemes for Internet connections whereby you pay for the volume of data transmitted, rather than a flat rate. Another possibility is that Internet users, particularly companies who are beginning to depend on their connections, will demand Quality of Service (QoS) guarantees from their ISPs - and be prepared to pay for them.

Here, too, IPv6 should help. It supports the new ReSerVation Protocol (RSVP) that attempts to negotiate a certain QoS on the Internet. Once this is in place users will be able to expect the same kind of reliability from their Internet suppliers that they do from telecom companies or energy utilities. In a few years' time it will be unthinkable for companies to work with an ISP unable to offer these kinds of guarantees. Although this is still a way off, now is the time to start talking to Internet connectivity suppliers about their plans in this area.

Further information can be found at Digital http://www.digital.com/info/ipvt/ with a noteworthy main white paper at http://www.digital.com/info/ipv6/white-paper.html, one of the best non-technical introductions to the subject.

ADSL

The recent features on ISDN described the advantages of this by-now rather old technology. Interestingly, the potential throughput - 64Kbit/s on each of two channels that can be combined to give 128Kbit/s without compression - is beginning to look less impressive. The new generation of 56Kbit/s modems running over ordinary telephone lines are not so far behind, though ISDN does have other advantages.

Independently of these modem technologies, ISDN is unlikely to remain the speed champion for long. As a previous Net Speak explained, a new kind of modem working with cable TV networks offers the promise of Mbit/s download speeds in the not-too-distant future. Trials are already underway in the UK, and there is talk of launching cable modem services next year.

Needless to say this cable modem challenge is not being taken lying down by the telecom companies of the world. Almost as if by magic, their engineers have discovered that they can push data down a telephone line at not just the current 28.8 Kbit/s, or the coming 56 Kbit/s, but at a stunning and rather convenient multi-Mbit/s speed.

This is accomplished by using higher frequency transmissions over conventional telephone lines. There are limitations of distance from user to exchange - typically 10-18 kilofeet, as the jargon has it - and not every exchange or area will be able to offer the technology. But the new service - generally known as Asymmetric Digital Subscriber Line (ADSL) - promises cable modem speeds down an ordinary phone line. It is asymmetric because upload speeds are "only" a few hundred Kbit/s - more than enough for most people's needs.
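To put these speeds in perspective, the following sketch estimates how long a one-megabyte file would take to download at each rate. The modem and ISDN figures are those quoted above; the ADSL rate of 2 Mbit/s is an assumed example of a "multi-Mbit/s" service, and protocol overheads are ignored.

```python
# Line rates in Kbit/s. The ADSL figure is an assumption for illustration.
RATES_KBITS = {
    "28.8K modem": 28.8,
    "56K modem": 56,
    "ISDN (2 x 64K)": 128,
    "ADSL (assumed 2 Mbit/s)": 2000,
}

FILE_SIZE_BYTES = 1_000_000  # a one-megabyte download


def transfer_seconds(size_bytes: int, rate_kbits: float) -> float:
    """Idealised transfer time: total bits divided by bits per second."""
    return size_bytes * 8 / (rate_kbits * 1000)


for name, rate in RATES_KBITS.items():
    print(f"{name}: {transfer_seconds(FILE_SIZE_BYTES, rate):.1f} s")
```

On these idealised figures the same file that ties up a 28.8 Kbit/s modem for several minutes arrives in a few seconds over the assumed ADSL link.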

Microsoft has a white paper about ADSL.

Agents

There are more than one million Web sites according to the Netcraft survey, with many more being added each day. As well as the new sites that are cropping up, established locations are being updated, often on shorter and shorter time-scales as constantly refreshed content and form becomes a paramount factor in attracting and keeping visitors.

The rise of the Internet search engines has been one response to this data deluge. Initially they seemed almost unbelievably powerful tools that placed millions of Web pages within the user's grasp, and allowed unimaginably large quantities of data to be sifted in seconds.

But now, as the Net continues to expand, even search engine results have become unmanageable. Instead of reducing data to information, the listings of hundreds or even thousands of hits across the Internet provide simply a first winnowing.

It is clear that two new elements are needed to help users in their struggle against this flood of facts. First, a way of conducting searches automatically, without needing to specify every time what you are looking for. And, secondly, more intelligence applied to the filtering process to produce usable results.

The new class of programs designed to offer these two elements is called agent software. Although grandiose claims have been made for agents, so far their incarnations have been simple and disappointing. But this will undoubtedly change, and probably soon - not least because the Internet would otherwise soon drown in its own content.

The AgentNews Webletter, devoted to the fashionable subject of software agents, can be found at http://www.cs.umbc.edu/agents/agentnews/.

Agents are not limited to simple browsing but can help to find the best prices on offer by online merchants. One of the leaders in this field was Jango from Netbot at http://www.netbot.com/, now owned by the search engine Excite at http://www.excite.com/. The shopping section at http://jango.excite.com/cf/index.html now offers Jango-enhanced searches for various categories including Computers and Software, Movies and Games & Toys.

Alexa

If you hate the way browsers lead you around the Internet by the nose, you'll welcome a new helper which aims not only to take you where you want to go but also to make things more interesting along the way.

Anyone who has spent any time browsing through the sections of netspeak and articles in magazines and books online will no doubt have come across links that no longer lead to the Web pages cited. Many of the resources referred to have not just moved, but disappeared completely.

This is a fundamental problem with the Internet: it is not like a traditional library gradually gaining more titles, but a huge, organic entity that changes constantly with time, losing sites and pages as well as gaining them. The idea of trying to capture and save these previous incarnations may seem utterly unrealistic, and yet this is precisely what Internet Archive is trying to do.

The man behind the Archive is Brewster Kahle, who developed the Wide Area Information Server search system. He has set up an ancillary company called Alexa, which aims to use the Internet Archive (which it largely generates) for commercial purposes.

Alexa is also the name of a product (currently for Windows 95/NT). It is a kind of browser helper: it runs alongside Navigator or Internet Explorer, and monitors which sites you visit. On the basis of where you are and where you have been, it offers suggestions about other sites that you might find relevant and interesting.

It does this in part by drawing on its database of where other people who have visited similar sites have gone: what it calls usage paths. A corollary of this is that it keeps a record of where you have been, though the company insists this information is kept private.

Novel navigation
There are full details of the technology, and a good article on some of the information that has been gleaned about the Web in general.

The aim of Alexa is twofold: to offer a novel way to navigate through the World Wide Web, and to create a kind of consensus about which sites are worth visiting on the basis of what Internet users think and do, not according to some self-appointed arbiter's judgements. Alexa offers other benefits. As well as providing suggested locations to visit, it tells you about the site you are currently viewing: who runs it, how big it is etc.

Get into the archive
More interestingly, Alexa allows you to hook into the Internet Archive: when older material relevant to the site you are viewing is available, this is signalled on the Alexa toolbar that appears with your browser. You can then view previous versions of Web pages. There is also a facility to search through the electronic version of the Encyclopaedia Britannica (or at least some of it) and an online dictionary and thesaurus.

Alexa is free: the company aims to generate revenue by a new kind of online advertising. In the pop-up menus offering suggestions of where to go next, small banner ads (http://www.alexa.com/company/advertise.html) can appear. I found the suggestions offered by Alexa stimulating, not least because they were frequently unexpected, and nearly always worthwhile. Too often search engines take you to pages that are irrelevant or dull. If enough people start using Alexa seriously - and so feeding in their use patterns to the system - it could represent a genuinely innovative way of using the Internet.

America Online (AOL)

AOL's dramatic acquisition of Netscape for $4.2bn (£2.6bn) is deeply ironic. For the start of Netscape's decline can perhaps be traced to the moment in March 1996 when AOL chose Microsoft's Internet Explorer, rather than Netscape's Navigator, as its default browser.

As Netscape lost market share in the browser sector, it also began to lose focus, proclaiming first the central importance of its server products, and then of its Netcenter portal.

It was this corporate schizophrenia that allowed AOL to step in with an offer that effectively split off the portal from the software, with Sun taking over the latter.

This "strategic alliance" will allow Sun to sell and co-develop the Netscape line of software, and guarantees the company $500m of business from AOL in terms of systems and services. America Online will receive more than $350m from Sun over the next three years.

AOL's acquisition of Netcenter certainly makes sense. It turns AOL into the undisputed leader in terms of Internet advertising and gives it a large number of new visitors at a stroke (since AOL's core market is outside business, where Netcenter's strengths lie).

With AOL's backing, Netcenter is now in a position to challenge even Yahoo, something that seemed unlikely before the deal. Similarly, Microsoft will have to work even harder to establish its own portal.

Whether AOL will switch to Netscape Navigator as its bundled browser is less clear. Although this might seem an obvious move, doing so would cost AOL the preferential placing it receives in Windows - Microsoft's trump card, which was the key to gaining the AOL deal. AOL may prefer to wait - not least because it will be easier to integrate later versions of Navigator that can be designed with this in mind.

If the browser is not bundled immediately, and the Netscape servers are being farmed out to Sun, the entire software side of the deal reveals itself as rather pointless for AOL - and problematic. Quite how the agreement with Sun will pan out remains to be seen; in particular, the product overlap between Sun's and Netscape's server lines will not be easy to manage.

Collateral benefit

However, irrespective of how things unravel on the software side, the deal will certainly change the Internet landscape dramatically and have a considerable impact on most players and sectors.

Microsoft is clearly affected adversely in terms of competitive pressures, since the deal represents the formation of a new AOL-Sun axis, with Netscape as the glue. None the less, there may be a collateral benefit in that the formation of a strong rival would seem to weaken the threat that Microsoft will operate as a monopoly (though it leaves open the question of whether it has already acted as one).

Sun does well out of the deal. It picks up guaranteed business, gets to control - and so neutralise - an erstwhile competitor, and cosies up to an emerging power in the online world. Perhaps even more important is the boost the deal will give to Java.

AOL is committed to support Java, including the imminent Java 2 and PersonalJava. The latter is very interesting, because it opens up the possibility of AOL-branded devices such as mobile phones, pagers and personal digital assistants.

Alternative approach

Clear losers in the deal are the other portals, especially the second-tier ones. Yahoo will doubtless survive (perhaps through alliances or acquisitions), but there may not be room for many smaller players. Another group that may not be too happy is the open source movement. Netscape was one of the most important evangelists for this alternative approach, and although the Mozilla project is safe (since open source code cannot be revoked), AOL may well be less committed ideologically.

The other main loser is Netscape, its employees and users. Although the name may linger on, the company now straddles uncomfortably two markets and two masters.

To see the first Internet company, once such an exciting and innovative player, disappear in this way is sad. Its passing represents if not the beginning of the end, certainly the end of the beginning for the Internet world.

Apache

Besides the Mosaic browser, the National Center for Supercomputing Applications (NCSA) has another claim to fame on the Internet, as the creator of the NCSA httpd Web server.

Just as Mosaic played a crucial role on the client side in popularising the idea of graphical navigation of the World Wide Web, so the NCSA httpd Web server was one of the key programs in providing a practical demonstration on the server side of just what could be achieved. The "d" in "httpd" in the name refers to the daemon, or continuous Unix process, that runs the HTTP service used by the server to supply Web documents to the client browser.

Like Mosaic, the NCSA server was (and still is) free. However, also like Mosaic, it suffered from a number of bugs of varying severity. To solve these problems, and to improve the overall performance, a new Web server was developed, taking as its starting point some of the fixes - patches - to the NCSA httpd software. Because of its origins as a "patchy" server it was dubbed Apache, a name it retains to this day.

Despite its less-than-glamorous origins, Apache has been a phenomenal success. According to the Netcraft survey of Web servers, over 40% of sites are on machines that are running Apache, a market share far ahead of any commercial server program. Apache has the advantage of offering full-power encryption - even outside the US. Apache is explicitly designed for Unix platforms, and one of the most popular of these is Linux, not least because like Apache it can be obtained free of charge.

Apache's home page is at http://www.apache.org/. There is a very full FAQ at http://www.apache.org/docs/misc/FAQ.html and the main documentation is at http://www.apache.org/docs/. There is a version for Windows under development at http://www.apache.org/docs/windows.html.

Application Programming Interfaces

Now that the Internet is becoming more integrated into the rest of corporate computing it may well be that the larger ensemble created begins to lose some of the very qualities that made the Internet such a success in the first place. Platform-independence is a case in point.

One of the current key areas of Internet development is how to plug the Internet into non-TCP/IP elements - specifically databases, perhaps using some kind of middleware. This means that the platform-independent techniques that lie at the heart of the Internet must be supplemented with others, some of which do depend on the platform in question.

For example, in the database arena, the use of Application Programming Interfaces is proving to be an important part of linking Internet Web servers with the heterogeneous components of the corporate computing matrix.

An API is essentially a published set of functions that are available to other third-party programs to enable them to carry out actions. They provide a common standard that can be used by many disparate programs without the need for special patches to be created on a case-by-case basis. However, APIs must be defined - usually by a leading player with enough clout to make them viable in the marketplace - which means that they are arbitrary to a certain extent, and tied to that defining manufacturer.

In the Internet field, two of the most important APIs that are starting to be widely supported are the Netscape Server API (NSAPI) and Microsoft's Internet Server API (ISAPI). Although end-users need not worry about the details of such APIs, it is likely that they will be more aware of their presence in the future.

Archie

Archie was the first attempt to solve the Internet's biggest problem: the lack of a central directory. Begun at McGill University in Canada, the Archie project (the name comes from the fact that it has to do with file archives) provides users with a way to search for files on the Internet. It is indispensable when you are looking for a particular program among the 50 Gbytes of publicly-accessible software held on some sites.

Information about where files can be found is held on a number of Archie servers located around the world. These update their lists by contacting major FTP sites in turn and retrieving information on the directories and files held. To find a particular file you should ideally use an Archie client program resident on your desktop machine or elsewhere on the company network. There are such clients for all major platforms, including Microsoft Windows. You simply enter the name of the file you are looking for, send the request to an Archie server (the main one for the UK is archie.doc.ic.ac.uk) and wait for the response. This will generally consist of hundreds of locations throughout the world where that particular file is held. You would then use FTP to retrieve the file. If you don't have an Archie client on your system, it is also possible to use the telnet facility to log into an Archie server (enter Archie for log-in and the password) to carry out the searches directly.

There is an E-mail version of Archie that complements the FTPmailers well. You send an E-mail message containing your search to a special Archie site (e.g. archie@archie.hensa.ac.uk). For example, to locate pkz204g.exe you would send the message find pkz204g.exe to the address archie@archie.hensa.ac.uk. You will eventually receive a list of locations and directory entries for the file. You will then need to extract the name of the site, the relevant directory and the name of the file (already known). You then send an E-mail message to ftpmail@doc.ic.ac.uk such as:

open teseo.unipg.it
binary
uuencode
chdir pub/stat/jse/software/misc/
get pkz204g.exe
end
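The body of such an FTPmail request follows a fixed pattern, so it can be assembled mechanically once Archie has supplied the site and directory. The helper below is a hypothetical illustration (the name ftpmail_request is invented here); it assumes one command per line, in the order given in the example above, and only builds the text without sending any mail.

```python
def ftpmail_request(site: str, directory: str, filename: str) -> str:
    """Build the body of an FTPmail request for one binary file."""
    commands = [
        f"open {site}",        # connect to the FTP site named by Archie
        "binary",              # transfer the file as binary data
        "uuencode",            # encode it as text so it survives E-mail
        f"chdir {directory}",  # move to the directory holding the file
        f"get {filename}",     # request the file itself
        "end",                 # end of the request
    ]
    return "\n".join(commands)


print(ftpmail_request("teseo.unipg.it", "pub/stat/jse/software/misc/", "pkz204g.exe"))
```

The resulting text would be pasted into, or sent as, the body of the message to the FTPmail address.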

Archie is a rather crude instrument; it is difficult to refine its searches unless you have a mastery of regular expressions (and if you don't, you might like to see the page at http://venus.ubishops.ca/course/regex.html, which has a good explanation of the subject).
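For readers new to the idea, here is a small taste of what a regular expression does, using Python's re module. The list of file names is made up for the example, and Archie's own regular expression dialect may differ in detail from Python's, so treat this as a sketch of the concept rather than of Archie's syntax.

```python
import re

# Hypothetical file names of the kind an archive listing might contain.
FILENAMES = ["pkz204g.exe", "pkz110.exe", "pkunzip.txt", "readme.pkz", "PKZ204G.EXE"]

# "pkz", then one or more digits, an optional letter, then ".exe",
# ignoring the difference between upper and lower case.
pattern = re.compile(r"^pkz\d+[a-z]?\.exe$", re.IGNORECASE)

matches = [name for name in FILENAMES if pattern.match(name)]
print(matches)
```

A single pattern like this finds every version of the program at once, instead of requiring a separate search for each exact file name.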

Rather better in some ways is Shase - the Shareware Search Engine. Its home page is at http://www.fagg.uni-lj.si/SHASE/ - which, as the .si domain indicates, is located in Slovenia (at the University of Ljubljana). A UK mirror can be found at http://shase.doc.ic.ac.uk/SHASE.

Shase offers the possibility of carrying out searches for files in a more sophisticated way than is possible with Archie. Its index covers over 110,000 files totalling 13.6Gb. An interesting list of these archives (which includes CICA, SIMTEL, Info-Mac and Microsoft) can be found at http://dolphin.doc.ic.ac.uk/DB-SHARE/statistics.html.

Useful too is the list of 100 new files at each site. Selecting one of these files takes you to a page with a list of possible FTP locations: as well as providing hot links to the site, there is a useful statistic that indicates how often a test program was able to access the directory in question. This indicator of how easy it is to log in to leading FTP servers is probably unique.

Asynchronous Transfer Mode (ATM)

Asynchronous Transfer Mode (ATM) in the context of broadband communications is one of the most important technologies for the future. Broadband simply means able to transmit large quantities of data (over 1.5 Mbits per second according to the official definition).

As anyone who has used the Internet for a while knows, sooner or later you hit against transmission speed limitations - either locally, in the connection to your computer, or in terms of the size of the data pipe that your Internet provider uses to connect to the rest of the Internet (especially the size of the transatlantic connection, often a critical bottleneck).

For this reason many see ATM as offering one of the best ways of upgrading the global Internet infrastructure to provide bigger data pipes from which faster local feeds can be taken.

ATM has nothing to do with the physical side of these connections, but is all about how the data is packaged and transmitted. ATM employs packets of fixed size (rather perversely chosen to be 53 bytes - 48 bytes of data plus five for routing information). Once data has been chopped up into these packets, the latter are then transmitted across the physical network in question asynchronously (hence the name). That is, the sender and receiver do not have to be rigidly synchronised before or during the transmission.
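The chopping-up process can be illustrated with a toy example. In the sketch below the five-byte header is just a placeholder of zeros, not the real ATM header layout, and the padding of the final cell with zero bytes is likewise an assumption made for simplicity.

```python
from typing import List

CELL_SIZE = 53      # total bytes per cell
HEADER_SIZE = 5     # bytes of routing information
PAYLOAD_SIZE = 48   # bytes of data


def to_cells(data: bytes, header: bytes = b"\x00" * HEADER_SIZE) -> List[bytes]:
    """Split data into fixed-size cells, padding the last payload with zeros."""
    if len(header) != HEADER_SIZE:
        raise ValueError("header must be exactly five bytes")
    cells = []
    for offset in range(0, len(data), PAYLOAD_SIZE):
        payload = data[offset:offset + PAYLOAD_SIZE].ljust(PAYLOAD_SIZE, b"\x00")
        cells.append(header + payload)
    return cells


cells = to_cells(b"x" * 100)
print(len(cells), [len(cell) for cell in cells])
print(f"header overhead: {HEADER_SIZE / CELL_SIZE:.1%}")
```

Because every cell is the same size, the price of this simplicity is a fixed overhead: five bytes in every 53 carry routing information rather than data.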

The great advantage of ATM, apart from its ability to offer high data throughput reliably, is that its very simplicity means that it can cope with any kind and mixture of data and run over any kind of network. This flexibility makes it a good match for the similarly minimalist Internet, which is defined by little and which can be used in almost any situation.

A tutorial on ATM can be found at http://juggler.lanl.gov/lanp/atm.tutorial.html, which introduces all the main concepts. Even more information can be found in the ATM FAQ directory at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/. As well as the very detailed three-part FAQ itself (contained in the files ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/ATM-FAQ1.txt, ATM-FAQ2.txt and ATM-FAQ3.txt) it also contains a file devoted to ATM-acronyms at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/ATM-Acronyms.txt and more about the Fibre Distributed Data Interface than you could ever imagine at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/FDDI-FAQ1.txt.

A list of ATM sites can be found at http://www.cl.cam.ac.uk/users/cm213/Project/atm.html. It has good links to other locations with ATM information, ATM testbeds, and ATM product vendors.

The ATM Forum is a non-profit organisation bringing together most of the main players in this market. Its home page can be found at http://www.atmforum.com/. From here there are links to its newsletter (at http://www.atmforum.com/atmforum/atm_newsletter.html), called 53 Bytes after the basic cell size used for ATM transmission, a list of its members and other useful information. The Internet Engineering Task Force has a group looking at IP protocols running over ATM; details can be found at http://www.ietf.cnri.reston.va.us/html.charters/ipatm-charter.html.

Acceptable Use Policy (AUP)

The acronym stands for Acceptable Use Policy, and referred originally to a short but important document drawn up by those running the NSFnet, the first backbone of the Internet.

It defined very basic rules governing who could and could not join the Internet (which necessarily meant using the NSFnet, since the latter tied everything else together), and for years was the main limiting factor on employment of the Internet for commercial purposes.

Its opening statement is unequivocal: "NSFnet Backbone services are provided to support open research and education in and between US research and instructional institutions, plus research arms of for-profit firms when engaged in open scholarly communication and research. Use for other purposes is not acceptable."

The full document can be found at ftp://nic.merit.edu/acceptable.use.policies/nsfnet.txt, although this is now only of historical interest. NSFnet has more or less shrivelled away.

There are other AUPs apart from that of the NSF: for example, the subsidiary academic networks frequently impose constraints similar to the original NSFnet statement. Even commercial suppliers usually have some kind of AUP, usually limiting people to legal activities.

Avatars

One of the most exciting applications of Virtual Reality Modelling Language (VRML) is in the creation of virtual worlds through which users can move, for example to allow information about data hierarchies and relationships to be conveyed in a simple visual way. A completely different use of VRML involves the fashioning of shared virtual environments. Here the emphasis is on interaction with other users, extending the other forms of online communication currently employed such as Internet chat or Internet telephony.

Since a crucial aspect of these worlds is their three-dimensional nature, it follows that some kind of virtual corporeal presence is required in them if the overall metaphor is to be preserved. This has led to the growth of an entirely new online element: avatars. An avatar is the incarnation or form that you take in one of these virtual worlds. Typically it will have some three-dimensional characteristics, a front and a back, for example, so that other participants in these worlds can move around you. The form might be minimalist - a floating head, or a simple object - or a complex piece of three-dimensional graphics crafting that is a work of digital art.

Although the word itself comes ultimately from the Hindu religion, the first use of the term avatar in a computing context is generally traced back to Neal Stephenson's seminal cyberpunk novel Snow Crash. In the world described there, avatars inhabit a huge and rich virtual domain called the Metaverse, where all kinds of activities and transactions are conducted. Current implementations are rather cruder, but may one day evolve into an important business and entertainment medium.

Auctions

For those interested in auctions, there are more and more sites springing up where you can make your bids over the Internet.

A natural approach to a new medium like the Internet is to try to transfer pre-existing business activities online, as well as inventing wholly new ones. A case in point is the auction: since the Internet is essentially about communication, it obviously lends itself very naturally to the process of receiving information about lots and making bids for them against others.

It should therefore come as no surprise that this field is flourishing, as Yahoo's long list of online auction sites testifies.

However, it is important to distinguish between two quite different kinds of auctions, since the way they work and their relevance for business are quite different.

Merchant

The first kind might be called the merchant auction. Here the online site auctions goods that come from manufacturers and that are frequently old stock that is being sold off cheaply. The online auction is a good way for both the seller and the buyer to agree a price efficiently.

The relation of the merchant auction to its conventional counterparts is made clear by the affiliation of some of the leading sites in this area. For example First Auction is part of Internet Shopping Network, which in turn is part of the huge US-based Home Shopping Network.

Similarly, SurplusAuction is an arm of Egghead stores, U-bid a division of Creative Computers, and Webauction part of Microwarehouse - all companies that sell computer equipment in the ordinary way. For them, such merchant auctions are simply a natural extension of their activities.

Other players in this area include Onsale, Dealdeal and FairMarket. There are also some Europe-based merchant auction houses, like Quixell (there is also a German version) and Online Auctions UK.

Given the large and expanding list of auction houses, it is obviously difficult to track them all. An interesting service in this context is Lycos' AuctionConnect, which searches other merchant auctions for certain items then notifies users.

The other major class of auctions is that between individuals: what might be called personal auctions. Probably the most famous firm here is Ebay, which is notable for being one of the few publicly-traded Internet companies to turn a profit. Ebay makes its money from charging sellers an initial insertion fee and commission on sold items.

Of course, personal auctions are inherently more risky than merchant auctions, since buyers are dealing with private individuals, not firms, and it is generally much harder to obtain information about their trustworthiness. This will tend to make personal auctions of less interest to companies. Interestingly, the main way of safeguarding users who engage in personal auctions is through other users' comments on sellers.

Protection

A more formal approach to protecting users is offered by Auction Universe, which sells a kind of insurance policy called BidSafe. Auction Universe is a subsidiary of Times Mirror, which owns seven major US newspapers, and these personal auction sites are a natural outgrowth of the classified advertising that sustains local newspapers.

A dedicated Classifieds site, Classifieds2000, has added auctions. Classifieds2000 is a division of Excite, which is not alone in adding this service to its portal. Yahoo has started its own personal auctions area, and Lycos has promised to follow suit.

It may well be that this re-invention of classified ads online will prove to be as important a source of revenue for Web sites as banner advertising now is.

Banner sizes

The commonest form of Web advertising is through the use of images with promotional messages placed on a Web page. Given that many people do not scroll all the way through to the end of a document, the prime position is "above the fold", in the initial screen displayed to visitors when they reach a site. In particular, ads are frequently found at the top of Web pages to ensure that they are the first thing seen (unless inline images have been turned off).

These banner ads, as they are generally called, have sprung up in a completely uncontrolled way; not surprisingly, given the Internet's general lack of supervisory bodies. As a result, the ads tend to be designed to fit in with the overall form of the Web page on which they appear. This means that there are currently hundreds of slightly different shapes and sizes employed for banners.

For the user this is not a problem, but it is for companies such as Microsoft that wish to place the same advertisement on hundreds of sites. Rather than being designed for one or two standard sizes, the promotional image must be tweaked to fit the demands of particular pages.

This is clearly extremely inefficient from the advertiser's viewpoint, and indicative of the immaturity of the Web advertising market.

To combat this, the Internet Advertising Bureau and the Coalition for Advertising Supported Information and Entertainment have drawn up some standard sizes for banner advertisements. Unfortunately, to date there has been little effort to enforce them, and so the current banner size anarchy continues.

BinHex

The file extension .hqx refers to the BinHex format, commonly encountered in the context of Macintoshes, and occasionally seen elsewhere on the Internet too.

Macintosh files are unusual in that they can consist of two parts, called the data fork and resource fork. This structure is used for programming convenience, and is part of the independent approach mentioned above as being characteristic of the Macintosh world.

One problem that the BinHex format solves is how to convey both parts from one computer to another. This is not a trivial operation, and BinHex is now perhaps the most widely used means of combining the two forks into a single entity.

But BinHex has another side, one which means that it is of more general interest. As well as combining the data and resource fork into a single file, BinHex also converts the eight-bit binary code into something that can be represented with fewer bits. In this respect, BinHex is very similar to the UUencoding and MIME schemes that similarly take binary files and convert them into a form that can be represented in ordinary ASCII. As a consequence, like those produced using UUencoding or MIME, BinHex files are bigger than the original binary form.
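The size penalty of binary-to-ASCII conversion can be illustrated with a minimal Python sketch. It uses the standard library's base64 (the encoding used by MIME) and uuencode routines rather than BinHex itself, so it shows the general effect only; the helper name ascii_overhead and the sample payload are assumptions made for illustration.

```python
import base64
import binascii


def ascii_overhead(payload: bytes) -> dict:
    """Return the size of payload before and after two binary-to-ASCII encodings."""
    # MIME base64: every 3 input bytes become 4 printable characters.
    mime = base64.b64encode(payload)
    # UUencoding: binascii.b2a_uu handles at most 45 input bytes per line,
    # so the payload is fed through in 45-byte chunks.
    uu = b"".join(
        binascii.b2a_uu(payload[i:i + 45]) for i in range(0, len(payload), 45)
    )
    return {"raw": len(payload), "mime_base64": len(mime), "uuencode": len(uu)}


# Hypothetical sample: 768 bytes covering every possible byte value.
sizes = ascii_overhead(bytes(range(256)) * 3)
print(sizes)
```

In both cases the encoded form is roughly a third larger than the raw bytes, which is the price paid for restricting the output to plain ASCII characters.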

Because BinHex represents an alternative to MIME or UUencoding (at least as far as its conversion of binary to ASCII goes) it is sometimes encountered outside the Macintosh world. For example, the Windows version of the Eudora E-mail package offers BinHex as well as the more usual MIME when sending binary attachments with messages.

Bolero

Even though Bill Gates must envy the name, Software AG is hardly familiar in computing circles. The firm is probably best known for its database Adabas and its fourth generation language Natural, neither of which is a product people are likely to be passionate about.

Software AG has won a certain fame (some would say notoriety) through its work with Microsoft to port the Distributed Component Object Model (DCom) technology to other platforms. Many have seen this as another ploy by Microsoft to fend off accusations that DCom is limited to the Windows platform, but without having to support rival operating systems directly. But Software AG's strategy has long needed a complete overhaul. The firm's current products are all rather wedded to older programming models, while new-fangled concepts such as the Internet arise only tangentially.

Bold leap

Rather than a series of incremental updates, Software AG has opted for a single bold leap into this new market. Its Bolero product - the name derives from Business Objects Language Environment - is extremely ambitious. A good white paper on the subject is at www.softwareag.com/corporat/solutions/Bolero/papers/boltwp.htm. In a sense, Software AG has taken its expertise in fourth generation languages and applied it to Java and the Internet. Bolero the product consists of an object-oriented language, also called Bolero, closely modelled on Java, which is used to create component-based business applications. The end result is server-side Java byte-code, even though no Java programming is employed.

Platform-friendly

The advantage of this approach is that Bolero's output can run on any platform which has a Java virtual machine available: a clever way to obtain cross-platform capability.

The development environment of Bolero comes with graphical user interface front-ends, compilers, wizards, editors, debuggers and other programming tools. The rest of Bolero, called the Application Server, consists of heavyweight modules that handle things like long transactions (see Net Speak) and links to databases.

Even though DCom is the main way of communicating between Bolero applications and existing software (Corba's Internet Inter-Orb Protocol will be added later), Bolero components can be either Com objects or Java Beans. Software AG is further hedging its bets through the ability to switch in other virtual machines if Java fails to catch on as the universal platform.

Bolero is still in beta, and therefore remains more promise than reality at the moment, though the demos look interesting. None the less, it is potentially significant in a number of ways. First, the use of the Java virtual machine through the output of Java byte-code is an approach that other software manufacturers will doubtless be interested in adopting.

Second, it is worth noting that Bolero will be written in Java, making it perhaps one of the largest and most complex Java projects to date. Assuming that Bolero is finished and works, its mere existence will be yet another significant demonstration of the capabilities of Java as a serious programming environment for the enterprise.

Finally, Bolero represents an important statement from Software AG, which is more or less betting the company's future on this product. Its next project is proof that it is not content to sit back from here on. Since EntireX, the port of DCom to non-Windows platforms, represented an updating of Software AG's older Entire middleware range, and Bolero is a kind of Net-enabled Natural with bells and whistles, it is logical that the next move will be a revamp of the Adabas software range.

The key new ingredient here is XML, which will lie at the heart of the successor to Adabas: yet another indication of how this language is moving to the centre of corporate computing.

Broadband Sites

When the high-speed cable access company @Home bought the portal Excite in 1999, the business logic was not immediately obvious, but the move has since proved prescient. What @Home is hoping to create with Excite is a broadband portal - an entry point for those with fast connections - and now it seems that everyone is doing the same.

Yahoo aims to join the broadband portal club by following up its $4.5bn deal to acquire Geocities with its recent purchase of Broadcast.com for $6bn. Broadcast.com describes itself as "the leading aggregator and broadcaster of streaming media programming on the Web".

Snap

Another portal, c|net's Snap, has already set up a trial of its broadband service. AOL, on the other hand, has so far only announced what it calls "a premium upgrade" to its service that will be available to users with high-speed connections in the autumn of 1999.

Another company with a broadband site in the offing, this time from the world of entertainment, is Warner Bros. Its Entertaindom site is described as "the Internet's first vertical entertainment portal," whatever that means.

In a sense, this move to broadband is a logical development of a process that began when the Internet passed from text-only pages to graphics, and then to sounds and limited video. The thinking behind this latest "advance" - albeit one that is hardly indispensable for most users - is that it will allow audiences to be served with new premium content, ideally paid-for.

There are two kinds envisaged: "lean forward" and "lean back". The former embraces factual and interactive content, while the latter refers to more passive, entertainment-based forms. Alongside this new content, there will also be new kinds of ads. For example, it will be possible to replace simple animated banners with fully-fledged videos.

In other words, for both content and advertising, the broadband Internet represents a convergence of the Web with TV and cinema - something that makes the concept highly popular with Old Media companies, which are still struggling to come to terms with the Internet and its implications.

Alongside the (relatively) established Internet players that are attempting to extend their offerings to include broadband services, there are many new companies hoping to set up early in this area.

For example, the video compression company Duck, which has developed various broadband technologies, has also set up what it hopes will be an entire network of broadband sites called On2. Currently there are just three of these, focusing on film, travel and education.

Another firm, FasTV, offers a collection of TV and other video programming that is searchable by keyword or phrase, while Quokka is concentrating on "immersive" digital sports entertainment.

User appetite

One possible indication of future user appetite for these and other broadband services is the rise of the MP3 compression technology for digital music downloads. In a sense, MP3 allows listeners to obtain all the benefits of broadband without needing special equipment.

And for any remaining sceptics about the importance of the MP3 phenomenon, Microsoft's recent launch of its own version called Windows Media Technologies 4 should be proof enough that this is now a serious market.

Of course, whether or not this vision of a brave new broadband world is realised depends not just on firms creating the content for it, but even more on whether users will have fast enough access to make these services commercially viable.

The Last Mile

Although it is relatively straightforward to provide very high-speed connections between continents and cities, say, the real difficulty is reaching the individual end-user. This has been called the "last mile" problem: finding technologies that are fast and cheap enough, and which can be rolled out to millions and eventually billions of users.

One enterprising idea solves the last mile problem very neatly by using existing power lines. The leader in this field is the UK-based Nor.Web with its Digital Powerline technology (home page at http://www.nor.webdpl.com/index.htm), which provides speeds of 1 mbps.

This is such a radical approach that many questions remain open about its viability for widespread use (see http://www.nor.webdpl.com/press/990413rebuttal.htm), though trials are under way in the UK, Germany, Sweden and elsewhere (more at http://www.nor.webdpl.com/casestudy.htm).

An approach that is less innovative - but more likely to come to fruition - is to use terrestrial radio links. This is essentially a high-speed extension of the mobile phone systems used around the world.

Unfortunately, since then the ITU, which was overseeing the negotiations for the global IMT-2000 standard, has put forward a rather unsatisfactory compromise solution that does not create a single, global approach, but one involving multiple access technologies (see http://www.itu.int/newsroom/press/releases/1999/99-04.htm). This is likely to cause costs for IMT-2000 equipment to be higher, and for progress in the provision of broadband access in this way to be slower.

Struggles

Although the world's users have lost out with this failure to agree on a single technology, one group must be delighted. These are the companies planning broadband wireless access via satellite.

This very costly technology is suffering by association through the struggles of satellite phone company Iridium (http://www.iridium.com/), which not only had to ask for a waiver of loan conditions, but lost both its chief executive and chief financial officer in the second quarter of 1999 (see the press releases at http://www.iridium.com/english/inside/media/press_index.html).

One company is already offering Internet access via satellite, although not at broadband speeds. Telstra's service (see http://www.telstra.com/press/yr98/dec98/98122203.htm) uses Inmarsat. Another is Cyberstar (www.cyberstar.com/), which does offer the Internet at broadband speeds.

Other major Internet satellite projects are under development. Spaceway (http://www.hns.com/spaceway/spaceway.htm) is a $1.4bn investment by Hughes Electronics Corporation (http://www.hns.com/news/pressrel/corporat/p031799.htm), and aims to be ready in the US by 2002.

Skybridge (see http://www.skybridgesatellite.com/), a consortium led by the French company Alcatel, also hopes to be up and running by 2002. It uses what it calls "bent-pipe" architecture whereby signals are always routed down to ground, not inter-satellite, and requires some 80 low earth orbiting (Leo) satellites (see http://www.skybridgesatellite.com/system/cont_41.htm).

The cost for this will be about $6.1bn, and the result will be a bit-rate of up to 20 mbps on the forward link, and 2 mbps on the return link (for details, see http://www.skybridgesatellite.com/system/cont_43.htm).

Teledesic (http://www.teledesic.com/) is even more ambitious, offering up to 64 mbps on the downlink and up to 2 mbps on the uplink.

This $9bn project, funded in part by such techno-luminaries as Bill Gates, Craig McCaw and Prince Alwaleed Bin Talal of Saudi Arabia, will require no fewer than 288 Leo satellites.

There are some good explanations of this project at http://www.teledesic.com/technology.html, and details of its innovative but untried space-based network at http://www.teledesic.com/tech/details.html.

Broadband by Cable

The recent $58bn (£36bn) bid from AT&T for the MediaOne Group represents not just a consolidation of the US cable TV market, but further evidence that the broadband revolution is spreading to most high-tech sectors.

This is demonstrated by some of the names involved in the complicated negotiations surrounding the deal. As well as AT&T, the highly-diversified electronic media company Comcast was interested in expanding through the purchase of MediaOne. To that end, it had talks with both Microsoft and AOL - two other companies increasingly active in the broadband world - to seek support for its bid.

Horse-trading

But it was AT&T that won MediaOne, in a deal whereby AT&T and Comcast would exchange some cable customers and money. An important side-effect of the high-level horse-trading that led to this result is that AT&T will use Microsoft's Windows CE for its set-top boxes, and Microsoft will invest $5bn in AT&T.

The absorption of MediaOne by AT&T is also likely to spur AOL to make some big deals in the broadband sector soon, and to encourage Sun to put fresh momentum behind the use of Java in set-top boxes.

AT&T's purchase of MediaOne joins its earlier acquisition of the cable company TCI for $48bn (see www.att.com/press/0698/980624.cha.html). AT&T is hoping to create through these massive purchases a leading position as a supplier of broadband access via cable.

The idea is that the cables currently carrying mostly TV signals will in the future provide multimedia Internet services, along with TV and telephony.

For this to happen, users need a key piece of hardware: a dedicated set-top box or a cable modem. The cable modem works just like a conventional telephone modem - it connects to a PC to allow computer data to be sent over the cable medium.

In the US, the cable modem standard is called Docsis 1.1, which is under the aegis of the industry group CableLabs. There are now three certified cable modems, from 3Com, Toshiba and Thomson Consumer Electronics.

Unfortunately, in Europe there is a completely different standard for cable delivery called DVB/Davic. There is an interesting paper about the rivalry between the two camps, which suggests DVB is better for set-top boxes, while Docsis is preferable for cable modems attached to PCs (though, as its URL indicates, the document comes with certain biases). CableLabs has a set-top initiative called OpenCable.

The DVB/Davic standard has been adopted by the Euromodem project, which aims to produce a standard cable modem for Europe. See the specification.

Promoted

Euromodem is being promoted by the European Cable Communications Association, which has set up EuroCableLabs. See background on the latter.

Just to confuse matters, there is another EuroCableLabs, and information about its aims can be found at http://www.eurocablelabs.org/labsbgd.htm. Needless to say, there is some argument over the common name.

There is information on cable activity in Europe, and other details of cable companies in the UK.

Broadband by ADSL

The use of cable to deliver broadband Internet connections has a number of advantages. The technology is relatively mature, several cable modems are available and broadband services are offered by many cable providers (in the US, at least). But cable also suffers from some drawbacks.

In addition to the standards rift that is opening up between Europe and the US, there is the technical problem that, as more cable users in a given locality take up such broadband services, the speed for each user tends to drop since they are sharing a common connection for that zone. Perhaps more seriously, cable is available only in certain areas.
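The effect of sharing can be sketched with a few lines of Python. This is a deliberately simplified model that assumes the zone's capacity is split evenly among active users; the function name per_user_mbps and the 30 mbps zone capacity are hypothetical figures chosen for illustration, not values from any cable standard.

```python
def per_user_mbps(shared_capacity_mbps: float, active_users: int) -> float:
    """Evenly divide one zone's shared capacity among its active users."""
    if active_users < 1:
        raise ValueError("need at least one active user")
    return shared_capacity_mbps / active_users


# Hypothetical zone with 30 mbps of shared capacity.
for users in (1, 10, 100):
    print(users, "users:", per_user_mbps(30.0, users), "mbps each")
```

Real networks do not divide capacity this neatly, since not every subscriber is active at once, but the trend is the same: the more neighbours online, the slower each connection.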

Telephone lines, the main rival to cable in the field of terrestrial broadband supply, certainly do not suffer from this last problem. This is partly what makes the idea of using the telephone for broadband delivery so attractive: given that there are currently 800 million telephone lines in the world, soon to grow to a billion, the potential is clearly huge.

The technology that hopes to tap into this market is Asymmetric Digital Subscriber Line (ADSL). It allows extremely fast Internet access to be provided over conventional copper telephone lines. The maximum download speed is about 8 megabits per second (mbps), and the maximum upload speed is 1 mbps (hence the asymmetric part of ADSL), depending on the particular conditions of the line in question. Just as cable systems need cable modems, so ADSL requires an ADSL modem. The beauty of ADSL is that the same telephone line can be used for voice calls and broadband Internet services.
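To make the asymmetry concrete, the following Python sketch estimates how long the same file takes in each direction at the maximum rates quoted above. It ignores protocol overhead and line conditions, and the 10 Mbyte file size and the name transfer_seconds are assumptions for illustration only.

```python
def transfer_seconds(size_mbytes: float, rate_mbps: float) -> float:
    """Seconds needed to move size_mbytes megabytes at rate_mbps megabits per second."""
    # 1 byte = 8 bits; protocol overhead is ignored in this rough estimate.
    return size_mbytes * 8 / rate_mbps


DOWNLOAD_MBPS = 8.0  # maximum ADSL download speed quoted in the text
UPLOAD_MBPS = 1.0    # maximum ADSL upload speed quoted in the text

file_mbytes = 10.0   # hypothetical file size
down = transfer_seconds(file_mbytes, DOWNLOAD_MBPS)
up = transfer_seconds(file_mbytes, UPLOAD_MBPS)
print(f"download: {down:.0f} s, upload: {up:.0f} s")
```

Under these assumptions, sending the file takes eight times as long as receiving it, which suits typical Web use where far more data flows to the user than from them.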

The main industry body in this area is the ADSL forum. Unlike the rather secretive cable modem group, the ADSL forum has put together many good documents on the technology, including a basic FAQ, a technical FAQ, a glossary, background papers, market information and news releases.

Beyond ADSL there is very high-speed Digital Subscriber Line (VDSL) which can offer up to 52mbps. There is a FAQ on the subject. As the main ADSL FAQ explains, there is a slight split in the ADSL world regarding the modulation systems employed, but it seems unlikely that this will be a serious impediment to the uptake of ADSL.

One development likely to aid the spread of this technology is the agreement of the G.Lite standard, a mass-market version of ADSL, which has been drawn up by the Universal ADSL Working Group. The group has a home page, and there is also information about the body.

G.Lite eliminates the need for a special signal "splitter" inside the home to separate voice and data, provides 1.5 mbps downstream and 384 kbps upstream, and is interoperable with full ADSL. See the FAQ.

Information about ADSL trials can be found at trials matrix and point-topic.com. In the UK, of course, everything depends on BT, which owns the critical last mile of wiring to the home. Although BT is running ADSL trials, many feel that it is dragging its heels so as to protect its investment in the much older ISDN technology.

In this context Oftel recently issued a consultation document called Access to bandwidth: bringing higher bandwidth services to the consumer, which looks at the whole area of how broadband services can be supplied using the telephone network in the UK. BT's response is available on their site. The latter has a good section summarising the various alternative technologies.

The European Commission has also issued a paper that touches on many issues in the area of broadband services. It is available online.

Browsers

Like many great breakthroughs, the World Wide Web browser Mosaic was put together almost by chance, and certainly not as the result of a carefully planned project to produce what some have called the "killer app" of the Internet. Its author is Marc Andreessen, a research student working at the University of Illinois. Apparently he wanted a browser for the then-new World Wide Web, and so knocked up one of his own. The result amazed not just his supervisors, but also the millions of people who have since downloaded the freeware product from the FTP directory at the NCSA (ftp://ftp.ncsa.uiuc.edu/Web/Mosaic). Although more glamorous browsers are now available, notably Netscape, Mosaic remains a standard by which such software is judged and is a key part of Internet history.

The 32-bit version of Mosaic (mos*.exe) can be found at ftp://ftp.demon.co.uk/pub/ibmpc/winsock/apps/wmosaic/. An advantage of Mosaic is that it remains free to all. Netscape (at http://home.netscape.com/) is not. Unfortunately, development of Mosaic ceased only four years after it all started.

The aftermath of the battle of the browsers

Now that the dust has settled a little since the launch of version 3.0 of Microsoft's Internet Explorer and Netscape's Navigator (in September 1996), it seems clear that Microsoft has caught up on the browser front in technical terms and, through a series of very adroit moves, succeeded in out-manoeuvring Netscape in marketing terms. This does not mean that the Internet war is over: many important issues remain to be decided. But it has put the pressure on Netscape to respond with something dramatic for its next generation of products.

In one respect, Netscape has done that with the recently announced strategy for 1997. For it effectively cedes the general Internet browser market to Microsoft, largely because Navigator costs money, and Internet Explorer doesn't. Netscape has shrewdly chosen to concentrate on the highly lucrative corporate market, and in particular on intranets.

Netscape is perhaps lucky that its only choice happens to be the best thing it could have done anyway. The uptake of intranets has been, if anything, even more spectacular than that of the Internet; not least because the advantages of the former are so obvious. Even better for Netscape, those advantages have recently been quantified by the independent market research company IDC, which found that the return on intranet investment was, on average, over 1000%. This effectively makes intranets the best investment in business today.

Signalling this strategic shift is Netscape's bold move to turn its flagship product Navigator into just a relatively small part of a whole suite of intranet clients, called Netscape Communicator. Navigator 4.0 will be the primary interface for moving around the corporate LAN, but allied to it are a number of other components which together go to make up a complete groupware solution.

Given that Netscape bought the groupware company Collabra back in September 1995, it comes as no surprise to find a client called Netscape Collabra which offers standard threaded newsgroup discussions. But Communicator also offers an enhanced e-mail client (Messenger); a new HTML editor (Composer - replacing the unloved Navigator Gold); and an audio conferencing client supporting the new H.323 standard (Conference). There is also a professional edition that adds a central administration tool (AutoAdmin) and a calendaring and scheduling tool (Calendar).

Matching these is a series of servers which comprise SuiteSpot 3.0 (for full details of these and the client components see Netscape's customary polished document). The main novelty here is the Media Server, for publishing streamed audio files, and the use of intelligent agents (how intelligent remains to be seen).

Also important is the native support for Microsoft Office file formats. This is part of Netscape's other sensible if equally necessary move: recognising Microsoft's place in the Internet/intranet scheme of things. Rather cheekily, Netscape has dubbed this new policy "embrace and integrate" - a pointed reference to Microsoft's own more arrogant "embrace and extend" approach. Among the elements embraced are Windows 95 and Windows NT through tighter integration; OLE/COM and ActiveX (though there seems to be some hedging about how total the support for ActiveX will be, and when it will appear); and Microsoft Office and BackOffice.

IT managers implementing intranet strategies will applaud this partial rapprochement. They will also doubtless be pleased that through these announcements Netscape is providing a totally open, component-based groupware solution. One important effect of the Netscape announcements (due to be implemented early next year) is that the focus shifts from the browser to the server side. The battle here is likely to be complex and drawn out. Writing a good browser is a relatively simple task; creating ten or so enterprise-level servers is a mammoth undertaking. Netscape has not yet come out with final versions of all the elements, and Microsoft is even further behind (though its high-end server project, code-named Normandy, is gathering pace).

The Browser Watch site at http://browserwatch.iworld.com/news.html carries rumours and sightings sent in by Internet users. Pages are updated daily. Reports include rarely encountered products such as Netsurfer for Openstep and the original NCSA Mosaic. The most notable feature of Opera, a new browser from Norway (see http://traviata.nta.no/), is its size: at less than 1 Mbyte, it is probably a tenth of what the next versions of Internet Explorer and Navigator will weigh in at. The page at http://browserwatch.iworld.com/activex.html lists just about all available ActiveX controls. Similarly, http://browserwatch.iworld.com/plug-in.html is a list of Netscape plug-ins. Alternatives to the dominant Microsoft and Netscape products can be found at http://browserwatch.iworld.com/browsers.html.

Microsoft talks serious money on the Internet

Although Microsoft is currently (as of November 1996) behind Netscape as far as high-end Internet/intranet servers are concerned, with the launch of Merchant Server, Microsoft seems to be further ahead in one particular area: fully-integrated electronic commerce solutions. Microsoft's success in reaching this market first is due in no small part to its acquisition of the company eShop (http://www.microsoft.com/ecommerce/pressrel.htm) earlier this year. The fact that Netscape was also trying to buy the company is an indication of the importance of the technologies it had developed.

Microsoft's Merchant Server is part of the Normandy project, but is closely integrated with the BackOffice range of products. In fact it sits as middleware between Microsoft's Internet Information Server and any ODBC 2.5 compliant back-end database such as SQL Server. A trial version can be downloaded; the file is 13 Mbytes in size, and to run it you will need an NT Server system with 64 Mbytes of RAM. The product has a home page, and there is a FAQ and a very useful White Paper.

The Merchant Server's front-end offers no surprises, being based on a standard model employed by eShop and other pioneer online commerce sites. Pages of information about products are generated on the fly from the database; the software comes with some ready-made store templates that provide the overall structure and ambience. One element that Microsoft is at pains to emphasise is the scope for offering a personalised shopping experience. For example, the pages generated could be tailored on a visitor-by-visitor basis; it is also possible to offer personalised promotions.
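The general idea of building product pages on the fly from a database, with per-visitor tailoring, can be sketched in Python. This is not Merchant Server's own mechanism or template format: the table layout, the page template, the sample product and the function names build_db and render_product are all assumptions for illustration, using an in-memory SQLite database in place of an ODBC back end.

```python
import html
import sqlite3
from string import Template

# Hypothetical store template supplying the overall page structure.
PAGE = Template(
    "<html><body><h1>$name</h1><p>$blurb</p>"
    "<p>Price: $price</p>$promo</body></html>"
)


def build_db() -> sqlite3.Connection:
    """Create a tiny in-memory product catalogue (sample data only)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE products "
        "(id INTEGER PRIMARY KEY, name TEXT, blurb TEXT, price REAL)"
    )
    db.execute("INSERT INTO products VALUES (1, 'Modem', 'A fast modem', 79.0)")
    return db


def render_product(db: sqlite3.Connection, product_id: int, promo: str = "") -> str:
    """Generate a product page from the database, optionally personalised."""
    row = db.execute(
        "SELECT name, blurb, price FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        raise KeyError(product_id)
    name, blurb, price = row
    # A per-visitor promotion is added only when one has been chosen.
    promo_html = f"<p>{html.escape(promo)}</p>" if promo else ""
    return PAGE.substitute(
        name=html.escape(name),
        blurb=html.escape(blurb),
        price=f"{price:.2f}",
        promo=promo_html,
    )


db = build_db()
page = render_product(db, 1, promo="Ten per cent off for returning visitors")
print(page)
```

Because the page is assembled at request time, changing a price in the database or choosing a different promotion for a given visitor alters the output without any HTML file being edited.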

Secure payments over the Internet have been available for some time, but Merchant Server goes further through tighter integration with the external financial system. Using the vPOS system from VeriFone, it is possible to take credit card details from a purchaser which have been sent from any browser supporting secure standards such as SSL, and forward that information to a variety of financial institutions. These will then validate the credit card request, carry out the back-end transactions, and return a confirmation. For each bank or other financial company there will be a corresponding module used with Merchant Server. One of the options is a module for CyberCash's micropayments system.

As part of the complete billing process, there are also various tax modules (including one for European VAT); it is possible to use any currency. Shipping and inventory management are also included. For users, there is also a kind of electronic wallet utility (available as an ActiveX control or Netscape plug-in) that saves having to enter your credit card details every time.

This whole order pipeline, as Microsoft calls it, is modular, so third-party components can be swapped in at various points. Also notable is the fact that the entire control process of the Merchant Server is effected via a Web interface. This is an approach that Netscape pioneered, and is likely to become increasingly the standard way of controlling all Internet/intranet software.
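A modular pipeline of this kind can be sketched in Python as a list of interchangeable stages, each taking an order and returning an updated copy. The stage names, the order fields, the 17.5 per cent tax rate and the shipping fee are all hypothetical, and the sketch does not reflect Merchant Server's actual component interfaces.

```python
from typing import Callable, Dict, List

Order = Dict[str, float]
Stage = Callable[[Order], Order]


def add_vat(rate: float) -> Stage:
    """Build a tax stage for a given rate (the rate is an illustrative assumption)."""
    def stage(order: Order) -> Order:
        updated = dict(order)
        updated["tax"] = round(order["subtotal"] * rate, 2)
        return updated
    return stage


def flat_shipping(fee: float) -> Stage:
    """Build a shipping stage that charges one fixed fee."""
    def stage(order: Order) -> Order:
        updated = dict(order)
        updated["shipping"] = fee
        return updated
    return stage


def add_total(order: Order) -> Order:
    """Final stage: sum whatever the earlier stages produced."""
    updated = dict(order)
    updated["total"] = round(
        order["subtotal"] + order.get("tax", 0.0) + order.get("shipping", 0.0), 2
    )
    return updated


def run_pipeline(order: Order, stages: List[Stage]) -> Order:
    """Pass the order through each stage in turn."""
    for stage in stages:
        order = stage(order)
    return order


pipeline = [add_vat(0.175), flat_shipping(5.0), add_total]
result = run_pipeline({"subtotal": 100.0}, pipeline)
print(result)
```

Swapping in a third-party component then amounts to replacing one entry in the list, for example a different tax stage for another country, while the rest of the pipeline is left untouched.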

As well as being the first mainstream server product to offer such an integrated electronic commerce solution, Merchant Server is also notable for its pricing. There is a basic charge of around £9000 for each computer that the server runs on, together with a further 'per shop' cost of about £3000 - surely the highest price for a single Microsoft product. There are already a number of sites employing Merchant Server for real-life transactions. UK users include Tesco and Shoppers Universe. A list of international users can be found, including Microsoft's own store.

What users can expect as Explorer goes forth

In the struggle between Netscape and Microsoft for the hearts, minds and desktops of Internet users, release 4 of their respective browsers promises to be particularly important. Internet Explorer 3 showed that Microsoft had caught up with Netscape in terms of basic Internet functions, so the next iteration is crucial for both companies. Netscape needs to demonstrate that it has not lost the initiative, and Microsoft needs to show that it can not only match but trump its young rival.

Microsoft joins the battle at a slight disadvantage: as usual, its products are later than originally promised. Moreover, the available code for Internet Explorer 4 (IE4) is a Platform Preview - a pre-beta, in other words - whereas Netscape's Communicator browser is currently nearing the end of its testing. IE4 can be downloaded - but note that this is an 11 to 20 Mbyte file. For those who prefer to read about IE4, rather than risk running what Microsoft emphasises is not yet stable software, there is an excellent general introduction.

Superficially, IE 4 is not much different from IE 3. Wisely, perhaps, Microsoft chose to adopt its new Windows look from release 3, putting additional pressure on Netscape's product which at a stroke was made to look a little long in the tooth as far as user interface is concerned. One change that is not obvious is the Autocomplete feature: as you type in a URL that you have visited before, IE 4 completes the rest of it.

Also not immediately clear is the power of the new Search button on the tool bar. When you press this, an extra frame appears in the browser with six search engines: Infoseek, Lycos, Excite, Yahoo, Altavista and Hotbot. Once you have selected one of these, the results appear in the new frame. Clicking on a link brings up the site in the right-hand frame, while retaining the other links on the left.

Potentially one of the most interesting features of IE4 is hidden away on the Programs tab of Options on the View menu: Microsoft Wallet. This is a facility that will enable you to buy goods over the Internet using credit cards without having to fill in card or address details each time. Instead, the Wallet will safely transmit all the information to the merchant server. Also hidden away is the thumbnails feature for bookmarked sites. When you view sites that you have bookmarked (stored in a folder called Favourites by Microsoft) there is a thumbnail option that shows what the pages look like - handy for reminding you exactly what they refer to.

As well as supporting the usual Java applets and ActiveX controls, IE 4 offers a scripting engine - which allows greater interactivity to be added to Web pages - and support for Dynamic HTML. There are a number of ancillary programs designed to work with IE 4's browser. For example, there is FrontPad, a cut-down but serviceable version of Microsoft's FrontPage HTML editor, Outlook Express, a new e-mail and Usenet client, NetMeeting for audio and videoconferencing, and NetShow, which offers streaming audio and video in a single file.

Also worth noting is the new Windows Address book. This employs the increasingly standard LDAP protocol to allow white pages directories on the Internet and intranets to be searched for information about online users. The above indicates the breadth and richness of the changes to Internet Explorer in this latest version. But in many ways these pale in comparison to an even more ambitious goal Microsoft hopes to achieve with IE 4: the complete integration of the Internet with the desktop.

Microsoft's marriage of the Internet and desktop

Some of the new elements in Microsoft's Internet Explorer 4 Web browser are essentially only incremental improvements to previous releases, but the same cannot be said about the other aspect of IE4, which represents a fundamental shift in Microsoft's entire operating system philosophy. For IE4 is not only the tool you use for browsing the World Wide Web, but has also been fully integrated with Windows Explorer to allow you to browse your PC in exactly the same way.

A trivial example of this unification is that it is now possible to open folders and files with a single rather than double mouse-click - just like ordinary hotspots in browsers. More profound is the presence of an item called The Internet as one of the fundamental elements of the Desktop as displayed in Windows Explorer (along with My Computer and Network Neighbourhood). That is, the Internet becomes simply an extension of your hard disc, with pages that you have visited indicated as a hierarchical arrangement of HTML files.

Moreover, when you select one of these pages, it appears in the right-hand pane of Windows Explorer, as a fully active Web page. This means that you can click on hotspots within the document and you will be taken to the corresponding Web page (assuming you have a live connection to the Internet, or the page in question is cached) which will be displayed within Windows Explorer once more.

This equivalence between internal and external storage space also works the other way. Just as the Internet is turned into a huge hard disc, so your hard discs become Web pages. Using a new Web view option, hard discs and their contents appear on a special Web page within Explorer whose HTML you can examine and modify using a Wizard. You can also create internal bookmarks: not to Internet pages, but to files on your hard disc that you can jump to as you would with external Web pages. As Microsoft's example shows, this could be a real boon for corporate IT departments, enabling them to set up PCs for non-technical users where explanations are not hidden away in help files, but found in the file listing itself (as a background HTML document).

Even the underlying desktop on your screen is an HTML file. As such, it can have not only live hotspots but also ActiveX controls. This Active Desktop, as Microsoft calls it, is another major innovation for the Windows interface. For by embedding the appropriate ActiveX controls it is possible to obtain constantly updated information from the Internet which is then displayed automatically on your desktop - the ultimate in integrated push technologies. This new style of desktop push is also a feature of Netscape's Netcaster technology, part of its Communicator product, which will be discussed in a future column.

Other aspects of Microsoft's integrated Web and desktop include the new Taskbar, reminiscent of the icon bar in IE3 and above, and the Find option on the Start button. As well as searching for files through your hard disc, you can search for information out on the Internet and even for people using LDAP white pages servers. I was very impressed with IE4 - not just for what it did, but that Microsoft had done it at all. The extent to which it has managed to meld Web and desktop is extraordinary, and is eloquent testimony to just how serious Microsoft is about the Internet.

However, my admiration is tempered with considerable apprehension. As I have pointed out several times before, ActiveX controls are fundamentally insecure. Unlike Java applets, they are not confined within a safe 'sandbox', and the Authenticode scheme used by Microsoft essentially means that they are potentially viruses with birth certificates. Imagine, then, the devastating power a rogue ActiveX control could have in the context of IE4, where there is no distinction between Web and desktop or corporate network. IE4 will rightly be a great hit with end-users, but corporate IT managers will need to control its use extremely tightly.

Why users must resist Microsoft's advances

The software giant is slowly but surely increasing its influence in different spheres - which should set the alarm bells ringing.

The legal battle between Microsoft and the US Department of Justice indicates how Microsoft seemed to be quibbling, to say the least. Its interpretation of the judge's initial order forbidding it from forcing computer manufacturers to bundle Internet Explorer with Windows 95 has been similarly selective: it now offers what are effectively non-viable versions of Windows 95, with Internet Explorer and various ancillary but essential files removed.

Not content with this less-than-conciliatory approach, Microsoft has gone on to attack not only the so-called 'special master' appointed by the judge to carry out further research into the technical and legal issues surrounding this case, but also the judge himself. Of course Microsoft has a perfect right to defend itself. But what is interesting is its attitude to the whole legal process. This matters because whatever power Microsoft wields today is likely to pale into insignificance compared to the control it will soon have.

In the context of operating systems, Microsoft now has few real competitors. Windows 95 rules the desktop, and Windows NT is fast making inroads into the fragmented Unix market.

Cause for concern
Microsoft's impending dominance in all operating system sectors is not in itself cause for concern. If the history of computing shows anything, it is that once-powerful players can be reduced to a secondary role by the arrival of new technology in a matter of years (the case of IBM springs to mind). What is far more preoccupying is Microsoft's use of its technology - particularly Internet technology - to gain control of something much more fundamental: infrastructure.

Even though the Microsoft Network (MSN) has been a damp squib, it has taught the company a very important lesson about the difference between providing and exploiting online access. As MSN becomes increasingly marginalised, Microsoft is replacing it with a series of standalone Web services that already are among the key players in their fields.

For example, Expedia is one of the top online travel services, CarPoint is a major site for online car sales, and there is an imminent house-selling service, Boardwalk. Also worth noting are an online mall called Microsoft Plaza and a growing collection of local listings magazines known as Sidewalk.

Aggressive approach
Microsoft is also working more subtly to integrate itself into the fabric of business and education. It is setting up electronic filing for the US legal system, an architecture for linking banks and, with hardware partners, is trying to take over the running of California State University's computing network.

Given the $9bn (£5.6bn) cash reserves Microsoft has to realise these and other ambitions, its aggressive approach to the current court case is worrying. In particular, it raises the question of how the company might act if it ended up dominating the US's and perhaps the world's commercial, professional and educational infrastructure.

It also raises the question of what can be done to counterbalance such an unhealthy concentration of power. In its escalating attempts to Balkanise the world of Java - and to turn it into another Unix - Microsoft has provided the clearest indication that it is here that it sees its most serious rival. Java is not perfect, but it may offer the only alternative to Microsoft. A programming environment that is truly platform-independent would level the playing field and give other software houses and content providers a chance. Companies might like to think about deploying it more widely - while they still have the choice.

The Phoney Browser peace

Those with good memories may recall when browser updates seemed to appear every week, and when duration was measured in Web years, which ran 10 times faster than the more mundane variety. No longer: Web years are no more than double the normal kind, and new browser versions are very infrequent.

For example, it is now over a year since version 4 of Netscape Navigator and Internet Explorer first appeared. Internet Explorer 5 has finally made its appearance (July 1998) - as a pre-beta version - while Netscape Navigator 5 is still some months off. Nothing could illustrate more strikingly how browsers have moved away from the centre of the Internet stage.

Of course, they remain pivotal to the way the Internet and increasingly business works, but as a result they have become more workaday, the development cycles more leisurely and the improvements more incremental.

For instance, Internet Explorer 5 seems identical to version 4. One of the few differences between the two is the slight clicking noise that is produced when you select an option from the icon bar or follow a hypertext link. Less obvious is the new face of the Organise Favourites dialogue box. This comes up as a browser window written using Dynamic HTML, even though unusually it has no menus or other such screen furniture.

The biggest changes from Internet Explorer 4 are in the area of Dynamic HTML support. The company seems to be moving to make the language one of the main ways of creating and controlling objects on the desktop. Similarly, XML capabilities have been enhanced in Internet Explorer 5.

For the first hints of how all these new features might be applied, there is an overview of Internet Explorer 5, while more detailed information can be found.

If the development process of Internet Explorer has become largely hidden, with little obvious to show for the new engineering that lies beneath the browser surface - much of it tied to the Windows platform - Netscape is adopting a different approach for its next major browser release.

Netscape has boldly chosen to throw in its lot with the open source camp. This means making the development process of Navigator largely public. The results are at the Mozilla site where the source code can be downloaded, and the current status of the various building blocks of the browser reviewed.

However, quite how Netscape will pull all these together remains to be seen. In the meantime, it is about to issue version 4.5 of its browser. This will have new features including the use of the Alexa navigation system discussed elsewhere.

This lull might suggest that the browser wars are over. Certainly, in terms of market share things are settling down to a rough parity. With Netscape offering its browser free of charge, it is likely that it will staunch the ruinous flow of users towards Internet Explorer.

Similarly, now that a US Court of Appeals has overturned the initial decision regarding the bundling of Internet Explorer with Windows 95, it can be taken for granted that Internet functions, and Internet Explorer, will be part of the various Windows operating systems.

But this does not mean browsers have reached the end of their evolutionary path. In terms of user interface and basic HTML features perhaps they have, but the dynamic version, and particularly the imminent arrival of XML in the Web mainstream, will have great repercussions on browsers.

Microsoft has consistently led here, with limited XML support in Internet Explorer 4 now enhanced in version 5. Netscape has done important early work on the XML application Resource Description Framework, and needs to turn this now into new navigation features in the next version of its browser. So while the first browser battle may have ended in an uneasy truce, there can be little doubt that both parties are girding their loins for the next stage of the continuing war.

Cable modems

A natural instinct for Internet users is to want faster access, especially as new multimedia features such as animated gifs, Shockwave files and Java applets become increasingly common. For companies with leased lines, this comes down to cost: you can have any speed up to tens of megabits per second if you are prepared to pay for it. But for those using dial-up access, there are natural limits imposed by the ordinary telephone network that is generally used to reach the Internet Service Provider (ISP).

Modems pushed beyond 28.8 Kbit/s to 33.6 Kbit/s in the summer of 1996, with the possibility of 56 Kbit/s being dangled by Rockwell. ISDN offers 64 Kbit/s (128 Kbit/s if two lines are combined) by using a purely digital telephone connection. There is, however, another alternative that, in the US at least, is passing from pious hope into everyday reality. Instead of using the main telephone network, the idea is to employ cable television as an alternative infrastructure for accessing the Internet.

Like the telephone, cable TV offers a huge wiring system that can be pressed into service for carrying digital data, with the bonus that it was built for high-bandwidth transmission from the start. A new device, the cable modem, is required to translate from computer digital signals into a form that is acceptable to the cable network. On offer are download speeds in excess of 1 Mbit/s; upload speeds are more modest - generally tens of kilobytes per second - but fast enough for most end-users not trying to run Web sites using this method.
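As a rough illustration of what these line rates mean in practice, the sketch below estimates ideal download times for a single file at each of the speeds mentioned above. The 10 Mbyte file size and the function name are assumptions chosen for the example; protocol overhead and modem compression are ignored, so real figures will differ.

```python
# Nominal line rates in kilobits per second, taken from the text above.
LINE_RATES_KBITS = {
    "28.8K modem": 28.8,
    "33.6K modem": 33.6,
    "56K modem": 56.0,
    "ISDN (one line)": 64.0,
    "ISDN (two lines)": 128.0,
    "Cable modem (1 Mbit/s)": 1000.0,
}


def download_seconds(size_bytes: int, rate_kbits: float) -> float:
    """Ideal transfer time: bits to send divided by bits per second."""
    return (size_bytes * 8) / (rate_kbits * 1000)


if __name__ == "__main__":
    size = 10_000_000  # assumed file size: 10 Mbyte (decimal megabytes)
    for name, rate in LINE_RATES_KBITS.items():
        minutes = download_seconds(size, rate) / 60
        print(f"{name:24s} {minutes:6.1f} min")
```

On these assumptions the file takes about three-quarters of an hour over a 28.8 Kbit/s modem and a little over a minute over a 1 Mbit/s cable link.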

Cacheing

The main constraining factor for almost all Internet activity is bandwidth, the quantity of information that can be sent over a given connection. By far the easiest way to speed up the transport of data is not to send it over the Internet at all.

This may sound rather paradoxical, but is based on the observation that many of the things that you do on the Internet you have done before (that is, you often visit the same sites, or step back to view the same pages). The idea behind Internet cacheing is that by storing certain kinds of Internet information it is possible to retrieve much of your online requirements locally, thus avoiding the constraints of bandwidth.

Perhaps the best-known form of Internet cacheing occurs within leading Web browsers like Netscape. There, pages that you visit are cached for the session so that you can easily move back to one that you have visited without needing to reload it from the distant site.

Similarly, images can be cached between sessions, again allowing pages to be retrieved more quickly (since only the text need be collected, with the cached images dropped in from the local store). Microsoft's new Internet Explorer, found in the Windows Plus! kit for Windows 95, goes even further, and can cache entire Web pages between sessions, providing a kind of offline WWW reader.

Internet providers sometimes make use of caches in order to improve the speed of the service that they offer. For example, they may set up a proxy server to hold all or most of the files that have been accessed; when one of those files is requested again by a user, it comes from the cache rather than from the Internet site.
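The proxy idea can be sketched in a few lines. This is a minimal illustration rather than how any particular provider's cache works: the class name, the `fake_origin` stand-in for a remote site and the example URL are all hypothetical, and a real cache would also need rules for expiring stale pages.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy proxy that serves repeat requests from a local store."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # function that really goes out to the site
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Already held locally: no Internet traffic needed.
            self.hits += 1
            return self._store[url]
        # First request: fetch from the distant site and remember it.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Stand-in for a slow remote site (hypothetical; no network used)."""
    return f"<html>page for {url}</html>".encode()


if __name__ == "__main__":
    proxy = CachingProxy(fake_origin)
    proxy.get("http://www.example.com/")
    proxy.get("http://www.example.com/")  # served from the cache
    print(proxy.hits, proxy.misses)
```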

Cache poisoning

The Domain Name System (DNS) lies at the very heart of the Internet. Without it, we would be reduced to entering addresses in the form 123.45.67.89 instead of as www.bloggs.co.uk. It has also ensured the recent rapid expansion of the Internet and in particular its extraordinary uptake in the business sector.

It is therefore surprising that this fundamental element of the Internet's infrastructure remains almost completely defenceless in the face of subversion. This was proved when the DNS files were intentionally corrupted so as to redirect traffic meant for one site - the main US registry for Internet names - to another run by an unauthorised rival.

The technique used was one known by the dramatic name of cache poisoning. This exploits the fact that the DNS service assumes that participating servers are well intentioned; after all, it is in their best interests that the domain name system be as efficient as possible.

DNS servers therefore accept without question certain kinds of information fed to them by other domain name system servers. Generally this information, stored in their caches for future use, helps speed subsequent conversions between the numerical and domain-based addresses.

However, as the attack on InterNIC showed, it is possible for the cache to be fed false data: to be "poisoned". If the poisoned caches belong to sufficiently important DNS servers, this can cause wholesale redirection of Internet traffic.
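The weakness can be shown with a toy model. The sketch below is not real DNS software and does not reproduce the details of the InterNIC attack; it simply assumes a resolver that caches every record it is handed, including unsolicited extras. The attacker's name and the 10.x addresses are invented for the example.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (domain name, numeric address)


class TrustingResolver:
    """Toy DNS cache that believes every record it is given."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}

    def accept_response(self, answer: Record, extra: List[Record]) -> None:
        # The answer to the question actually asked is cached...
        self.cache[answer[0].lower()] = answer[1]
        # ...and so are any unsolicited extra records, with no check
        # that the sender has any authority over those names.
        for name, address in extra:
            self.cache[name.lower()] = address

    def lookup(self, name: str) -> str:
        return self.cache[name.lower()]


if __name__ == "__main__":
    resolver = TrustingResolver()
    resolver.accept_response(("www.bloggs.co.uk", "123.45.67.89"), extra=[])
    # A hostile server answers a harmless query but slips in a false record.
    resolver.accept_response(
        ("www.attacker.example", "10.0.0.1"),
        extra=[("www.bloggs.co.uk", "10.6.6.6")],
    )
    print(resolver.lookup("www.bloggs.co.uk"))  # the poisoned entry
```

Every later lookup served from this cache is sent to the false address until the entry is replaced.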

To solve this problem, several updates to DNS incorporating security and authentication have been proposed, but not yet implemented.

Cascading Style Sheets (CSS)

One solution to the problem of the ever-increasing demands of Web page design is a new standard for styling HTML called Cascading Style Sheets (CSS). The official Web stylesheets home page is at http://www.w3.org/pub/WWW/Style/, hosted by the World Wide Web Consortium, and itself employing stylesheets. This has a useful list of HTML editors that support this new feature. The site also maintains the official CSS recommendations at http://www.w3.org/pub/WWW/TR/REC-CSS1. Perhaps the best overall starting place for exploring the world of stylesheets is Microsoft's page on the subject at http://www.microsoft.com/opentype/css/default.htm (also using stylesheets). Microsoft has been at the forefront of supporting CSS in its Internet Explorer, starting with version 3, whereas it is only with version 4 of Navigator that Netscape caught up.

A graphic demonstration of what style sheets are can be found at http://www.microsoft.com/truetype/css/gallery/spec1.htm. A list of other sites employing stylesheets can be found at http://www.microsoft.com/truetype/css/gallery/cssinf7.htm.

Also worth looking at is High Five at http://www.highfive.com/, a site about design with back issues at http://www.highfive.com/core/backissues.html; a must for any serious designer.

Case sensitivity

One of the many confusing aspects of the Internet is the extent to which elements within URLs can be altered when you enter them. Clearly, nothing major can be changed - no spaces added or letters substituted. But there remains the question of case, and which parts of an Internet address can and cannot be changed from upper to lower case, or vice versa.

In fact the rule is quite straightforward: you can change the case of the address of the machine that you are accessing in any way you like, but you cannot safely fiddle with the directory structure that follows it. Thus http://www.ibm.com/ will work just as well as HTTP://WWW.IBM.COM/ or even HtTp://WwW.IbM.cOm/ .

However, the same is not true of the elements that follow this part, which must be entered exactly according to the information given. The reason for this stems from the underlying structure of the URL.

The first part - the domain name - is simply an alternative way of expressing the 32-bit address that is generally written as four decimal numbers, each in the range 0 to 255, to give a general shape of 123.45.67.89. When messages and requests are routed over the Internet, the domain name is converted to the numeric equivalent, and the exact form of the name as far as case is concerned is irrelevant.

However, the remaining part represents the directory structure of the site being accessed. The commonest operating system found on these remains Unix, which is case-sensitive when it comes to directories and file-names. For sites running Unix there is therefore a world of difference between http://www.abc.co.uk/pub/ and http://www.abc.co.uk/PUB/, even between http://www.abc.co.uk/pub/read.me and http://www.abc.co.uk/pub/Read.me.
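The rule can be expressed as a short sketch using Python's standard `urllib.parse` module. The helper names are invented for the example; the function lower-cases only the scheme and machine name and leaves the path exactly as entered, since whether the path is case-sensitive depends on the server.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_case(url: str) -> str:
    """Lower-case the scheme and host name; leave the path untouched."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),  # assumes no user name or password in the URL
        parts.path,            # directory and file names keep their case
        parts.query,
        parts.fragment,
    ))


def same_resource(a: str, b: str) -> bool:
    """True if two URLs differ only in the case of scheme and host."""
    return normalise_case(a) == normalise_case(b)


if __name__ == "__main__":
    print(same_resource("http://www.ibm.com/", "HtTp://WwW.IbM.cOm/"))
    print(same_resource("http://www.abc.co.uk/pub/read.me",
                        "http://www.abc.co.uk/pub/Read.me"))
```

The first comparison reports a match and the second does not, mirroring the examples above.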

Castanet

Two of the most important innovations on the Internet during 1996 have been Java and PointCast. In some ways they are very similar: both are about transmitting information over a network - the Internet or intranets - to provide functionality. Java has the advantage that it transmits fully-fledged programs, while PointCast offers the added bonus of constant updates.

Given the plaudits which Java and PointCast have won separately, it is therefore not such a surprise that an approach which puts together the best of both to offer constantly updated programs delivered over networks should meet with equal acclaim. The fact that the company doing so, Marimba, was founded by several key members of the original Java development team only adds to its credentials.

After months of rumours Marimba has released beta versions of its first product, Castanet. As the basic PointCast approach dictates, there is a transmitter and a tuner (respectively the server and client elements); a Visual Basic-like development tool for this new medium, called Bongo, has also been written. They can all be downloaded from Marimba's home page at http://www.marimba.com.

Two official books about Marimba products have been published; both have been written in co-operation with the company's developers. They are guides to the application distribution technology Castanet (£37.50, ISBN 1-57521-255-2) and the Java interface design environment Bongo (£37.50, ISBN 1-57521-254-4).

In keeping with the broadcasting metaphor, information is transmitted and received on various channels; a transmitter (that is, Castanet server) may offer several channels. These channels are effectively distinct programs (written in Java initially, but not limited to this language) that are sent from the transmitter to the receiver. They are stored on the client machine and run by the tuner, and receive updates both of data (for example a share service might display the latest prices) and of the basic program itself. This is one of the great advances of Castanet, particularly with respect to Java: it allows updates to be carried out on the fly and incrementally. This avoids having to download large applets again just to get the latest version.
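The general idea of an incremental update can be sketched as follows. This is not Marimba's actual protocol, whose details are not described here; it is one plausible approach in which the tuner reports a fingerprint of each file it holds and the transmitter sends back only the files that differ. All names and file contents are hypothetical, and file deletions are ignored for brevity.

```python
import hashlib
from typing import Dict

Channel = Dict[str, bytes]  # file name -> file contents


def fingerprint(channel: Channel) -> Dict[str, str]:
    """Summarise a channel as one short hash per file."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in channel.items()}


def plan_update(server: Channel, client_fingerprint: Dict[str, str]) -> Channel:
    """Return only the files the client lacks or holds in an old version."""
    return {name: server[name]
            for name, digest in fingerprint(server).items()
            if client_fingerprint.get(name) != digest}


def apply_update(client: Channel, update: Channel) -> Channel:
    """Merge the changed files into the client's stored copy."""
    merged = dict(client)
    merged.update(update)
    return merged


if __name__ == "__main__":
    on_client = {"app.class": b"program v1", "prices.dat": b"100"}
    on_server = {"app.class": b"program v1", "prices.dat": b"101"}
    update = plan_update(on_server, fingerprint(on_client))
    print(sorted(update))  # only the changed file is sent
```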

The tuner adopts a tabbed sheet approach, familiar from Windows 95. A configuration sheet allows you to set how often channels are updated. A Hot sheet provides some of the available transmitters. Marimba is currently providing an open transmitter with developer channels at http://trans.havefun.com:5282/. The Listing tab is the way to access transmitters in general, though for this you need to know their address; typically a port is also specified for Castanet transmitters.

Once a transmitter is located, you can obtain a listing of all its channels. These are displayed as hanging off a directory-type tree of the main transmitter addresses, and contain a short description about the nature of each channel. Double-clicking on a new channel causes the associated application to be downloaded and then run. These applets can be quite large - several megabytes - but as mentioned above, downloading them is a one-off operation.

Among well-known names included on the Hot spots tab are Talk.com, HotWired's Castanet-based chat channel (at http://trans.talk.com; this address can be entered directly in the Listing tab of Castanet), the Excite search engine (at http://trans.excite.com), and the Java directory Earthweb (at http://trans.earthweb.com). As you might expect, Marimba itself has a number of interesting channels (at http://trans.marimba.com), including an online tutorial that uses the Marimba tuner as a proxy server to an ordinary browser, serving up standard HTML documents (this requires a slight manual modification to the set-up of your browser).

Encouragingly there are also already a couple of entries from UK users. One is at http://trans.totem.co.uk. Particularly impressive is the other, the KMI/Stadium project (at http://kmi.open.ac.uk/stadium) which is an ambitious mock-up of a distance learning project using Internet channels to broadcast lectures worldwide. The application of this kind of idea in a business context is obvious. Although it is early days yet for this technology it is clear that it adds greatly to the possibilities of Internet and intranet information provision. Both Microsoft and Netscape have expressed interest in the idea, and it will be interesting to see how this area develops.

Cello

In 1994, there were two main Microsoft Windows browsers for the World Wide Web: Mosaic and Cello. The history of Mosaic since then has been dramatic: a constant flow of updates, the licensing of the program to a wide range of companies (including at one point Microsoft) and the appearance of Netscape, Mosaic's intellectual progeny, written by most of the team of programmers that devised and created the original product.

Against the blaze of publicity that Mosaic has enjoyed, Cello has almost vanished from sight. Things have not been helped by the fact that a number of commercial companies, such as Frontier Technologies Corporation and Booklink Technologies (now part of America Online), have also leapt into the browser market with products that offer alternative ways of accessing the Web, shifting the emphasis on to what is new and innovative rather than old and established.

And yet Cello has always had its loyal band of users. If the program was rather less showy in its capabilities compared with Mosaic, it was also much simpler to set up and easier to use. It also came with very full built-in help from the start - something that Mosaic lacked, although it does at least have links to online help from its home site (an approach pioneered by Netscape). Cello is available from many sites, including its home site at ftp://ftp.law.cornell.edu/pub/LII/Cello/cello.zip.

Certification

The public key encryption techniques used in the Pretty Good Privacy program and elsewhere (for example in the Netscape browser) solve several problems to do with security. Most obviously they provide a means of sending a message to someone in a completely secure fashion, without needing to exchange secret keys beforehand (as conventional encryption requires). This is done using two keys: a private one (which is kept secret) and a public one (published for others to use).

Public key encryption also allows you to be sure that a given message came from the person holding a certain private key, since only the private key could have been used to generate the encrypted form of the message that the public key unlocks.

However, on its own this approach cannot guarantee that the person who has the private key is who he or she claims to be. Imagine the situation where somebody publishes a public key falsely claiming to be someone else: encrypted messages will then indeed appear to come from the claimed person (because the published key unlocks the messages), but it is the matching of the key with the individual that is untrue.

The way round this is to use what is called certification. It is possible for a trusted third-party organisation to add its seal of approval to certify that a given public key did indeed come from the person who claims it as his or her own.
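The relationship between signing and certification can be illustrated with a toy version of the RSA public key scheme. This is a teaching sketch only: the primes (61, 53, 89 and 97), the names and the helper functions are arbitrary choices for the example, the numbers are far too small to be secure, and real systems use much larger keys and standard certificate formats.

```python
import hashlib
from typing import Tuple

Key = Tuple[int, int]  # (modulus, exponent)


def make_keys(p: int, q: int, e: int) -> Tuple[Key, Key]:
    """Build a toy RSA key pair from two small primes (NOT secure)."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later
    return (n, e), (n, d)  # (public key, private key)


def digest(message: bytes, n: int) -> int:
    """Reduce a message to a number smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, private: Key) -> int:
    n, d = private
    return pow(digest(message, n), d, n)


def verify(message: bytes, signature: int, public: Key) -> bool:
    n, e = public
    return pow(signature, e, n) == digest(message, n)


def statement(owner: str, owner_public: Key) -> bytes:
    """The claim being certified: this name goes with this public key."""
    return f"{owner}:{owner_public[0]}:{owner_public[1]}".encode()


def certify(owner: str, owner_public: Key, authority_private: Key) -> int:
    """The trusted third party signs the name-and-key pairing."""
    return sign(statement(owner, owner_public), authority_private)


def check_certificate(owner: str, owner_public: Key,
                      certificate: int, authority_public: Key) -> bool:
    return verify(statement(owner, owner_public), certificate, authority_public)


if __name__ == "__main__":
    alice_public, alice_private = make_keys(61, 53, 17)
    authority_public, authority_private = make_keys(89, 97, 5)
    certificate = certify("alice", alice_public, authority_private)
    print(check_certificate("alice", alice_public, certificate, authority_public))
```

Anyone holding the authority's public key can check the certificate, which is why the scheme depends on that authority being trusted.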

Of course, this creates a situation where you need some trustable authority for certifying keys, and it has been suggested that independent bodies such as the Post Office, banks or similar might fulfil this role.

Chain E-mail

There are two basic forms of Internet chain E-mail. The first is the "good luck" chain letter where you are invited to forward a self-perpetuating E-mail message that is supposed to bring you and the people to whom you send it good luck. It explicitly tells you to send no money, but does threaten you with "bad luck" if you fail to pass on the E-mail. So much for primitive superstitions being banished by high technology.

The other form of chain letter is more insidious, and probably illegal in most countries where the Internet is widely used. It specifically requires you to send money to people at the head of the chain. The nominal logic of such pyramid schemes is, of course, familiar: that by doing so and passing on the message it is a mathematical "certainty" that in due course you will receive huge sums from those lower in the pyramid.

If for no other reason, chain letters and pyramid schemes should be studiously avoided because they are a complete waste of the Internet's valuable bandwidth. Fortunately, on the rare occasions when they rear their heads in public Usenet newsgroups they are quickly shot down by the vast majority of intelligent Internet users, who have rather more useful messages to pass on.

Client-server

Many of the Internet's services operate using the client-server model, which is widely employed in business for corporate databases and other applications. The basic idea is to split up a function into two separate elements: the client, which instigates actions, and the server, which responds to them. The client and the server are usually physically distant, and connected by some form of network (such as the Internet).

The advantage of the client-server model is that most of the processing can be done where data and processing power are concentrated - for example a mini or mainframe with extensive storage. Only the information requested by the client is sent back over the network, thus minimising the traffic. This contrasts with the situation where the processing is always done locally, requiring larger amounts of data to be sent and greater processing power to be available on the local machine.
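The division of labour can be sketched without any real network. In the example below the class names, the "GET" request format and the sample data are all assumptions made for illustration; the `send` function stands in for whatever network connects the two halves.

```python
from typing import Callable, Dict


class PriceServer:
    """Holds the full data set and does the searching."""

    def __init__(self, records: Dict[str, float]) -> None:
        self._records = records

    def handle(self, request: str) -> str:
        # Request format (an assumption for this sketch): "GET <key>"
        verb, _, key = request.partition(" ")
        if verb != "GET":
            return "ERROR unknown request"
        if key not in self._records:
            return "ERROR not found"
        # Only the one requested value travels back to the client.
        return f"OK {self._records[key]}"


class PriceClient:
    """Instigates requests; needs no copy of the data itself."""

    def __init__(self, send: Callable[[str], str]) -> None:
        self._send = send  # stands in for the network link

    def price(self, key: str) -> float:
        status, _, value = self._send(f"GET {key}").partition(" ")
        if status != "OK":
            raise KeyError(key)
        return float(value)


if __name__ == "__main__":
    server = PriceServer({"widget": 9.5, "gadget": 12.0})
    client = PriceClient(server.handle)
    print(client.price("widget"))
```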

Archie file-searching, Gophers and the Wide Area Information Servers (WAIS) also use the same model. Less obvious is the fact that the familiar browsers are examples of WWW clients.

Common Gateway Interface

Although the World Wide Web is an immensely impressive achievement, the underlying technology is extremely simple. When you use a Web client like Netscape or Internet Explorer, you send a simple request using the HyperText Transfer Protocol (HTTP) to a Web server which then sends you back a page written using the HyperText Mark-up Language (HTML). For obtaining information, or just browsing, this is fine; but if you need something more complex, the limitations of this approach soon become apparent.

To get round the problem, a way has been devised whereby ancillary programs can be run at the Web site to add any kind of extra functionality to the basic features of the WWW server. For example, the forms that appear in many Web pages send information back to the server which is then processed using one of these programs - perhaps to search through a database, or to take an order for a product and process it. Another common use is to invoke different responses when different parts of a clickable image are selected.

The Common Gateway Interface (CGI) describes the way the Web server and the external program interact. You can often detect the presence of programs using the CGI by the appearance of the directory /cgi-bin/ somewhere in the URL.

This indicates that when you access a page containing this path you are probably running a program, sometimes called a CGI script. Such scripts are often written in Perl, but other languages like Visual Basic and Java can also be used depending on the circumstances.
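As a rough sketch of what such a script does, the Python fragment below handles a form submitted with the GET method. Under the CGI conventions the server passes the form data to the program in the QUERY_STRING environment variable, and the program writes its headers, a blank line and then the page to standard output. The field name "product" and the function name build_response are hypothetical, chosen only for illustration.

```python
import os
from html import escape
from urllib.parse import parse_qs


def build_response(query_string):
    """Turn a form's query string into a complete CGI response."""
    fields = parse_qs(query_string)
    # "product" is a hypothetical form field name used for illustration.
    product = fields.get("product", ["nothing"])[0]
    # Escape the value so that user input cannot inject HTML into the page.
    body = "<html><body><p>Order received for: %s</p></body></html>" % escape(product)
    # A CGI program writes headers, then a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # For GET requests the Web server supplies the form data in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

A request for a URL ending in something like /cgi-bin/order?product=modem would then cause the server to run the script with QUERY_STRING set to product=modem.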

Compression

No matter how fast your Internet connection, or how big the hard disc on your server, they are never fast or big enough. One way to increase both the effective connection speed and hard disc capacity is to use compression.

Essentially compression techniques store a file in a way that exploits structures and repetitions that exist within all but the most random of digital files. The greater the structure within a file (for example a graphics file might contain distinct areas of one particular colour) the greater the possible compression.
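The effect of structure on compression is easy to demonstrate. The sketch below uses the zlib module from Python's standard library (the same family of algorithms as .gz and .zip files) to compare a highly repetitive block of data with an equally long block of random bytes. The sample data and the function name compression_ratio are assumptions made for the example.

```python
import os
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (smaller means better)."""
    return len(zlib.compress(data)) / len(data)


# Highly structured data: the same short pattern repeated many times.
structured = b"blue pixel " * 1000
# Data with no exploitable structure: random bytes of the same length.
random_like = os.urandom(len(structured))

if __name__ == "__main__":
    print("structured: %.3f" % compression_ratio(structured))
    print("random:     %.3f" % compression_ratio(random_like))
    # Decompression restores the original exactly.
    assert zlib.decompress(zlib.compress(structured)) == structured
```

The repetitive data shrinks to a small fraction of its original size, while the random data does not shrink at all and may even grow slightly.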

As a result, wherever you go on the Internet you are almost certain to encounter compression. If you are using a dial-up connection, then your modem will probably offer some kind of compression on the fly, where files sent and received are squeezed for transmission over the telephone line and unsqueezed at the other end.

Similarly, files held on FTP or Web servers are usually stored in a compressed format, both to save space there and to reduce the download time for visitors (thus freeing up the connection for others). This is evident from the file extension employed. Typically these are .Z and .gz for Unix files, .zip for PC programs (less-common formats include .lzh, .arj and .arc), and .sea for the Macintosh.

The latter is unusual in that it is a self-extracting format: to retrieve the file it contains, you do not need any additional software to carry out the decompression; simply running the program causes the compressed file to be reconstituted automatically. There are facilities with PKZIP (Winzip) that convert a .zip file to a self-extracting .exe file.

Connectionless protocols

The experience of using the World Wide Web is such a smooth and fluid one that the nature of the underlying technology is a little surprising. For when you are accessing a site, the connection remains in place between your browser and the server only as long as it takes to download the file (or files if there are inline graphics, for example) to your desktop machine. The rest of the time - while you read the text or admire the images - the server forgets about you completely.

Indeed, when you click on a link in a page, in general the site holding that document has no memory of you retrieving it whatsoever. For this reason, text files called browser cookies are sometimes employed to store an access history on the client machine that can be retrieved by a server.

Technically speaking, the HyperText Transfer Protocol (HTTP), which handles the communications between the Web server and its client (the browser), is said to be connectionless, since there is no permanent link between the two.

This makes the Web very efficient in terms of bandwidth since it uses the Internet only at the moment when there is something to download (or upload if you are filling in a form). Other protocols - telnet and FTP, for example - maintain a connection between client and server whatever you are doing.

Even these, of course, are only virtual connections. They are unlikely to have any fixed physical path through the Internet's many constituent networks given the way in which all data is converted into packets that are sent independently and then re-assembled on arrival.

Cookie file

One of the problems of downloading executable files from the Internet (even assuming that they are virus-free) is that when you install them they start spraying components hither and thither until you end up with a hard disk full of unmemorably-named files.

Netscape is no exception, and many people must have wondered what on earth one called cookie.txt was for. Surprisingly, this is not just some feeble programmer's joke, but is a moderately well established fact of Internet life.

The cookie file is used to store information between Web sessions (normally, temporarily cached information is lost when the browser is shut down). This information is available to Web sites that may be configurable in some way, and which need to retrieve previously-defined parameters. One of the best examples of just such a configurable site is the home page of the Microsoft Network (at http://www.msm.com/). One of the options from this page is to customise it in various ways: adding favourite links, search engines, news stories and even screen colours.

Obviously the site itself cannot store all this information for every visitor. Instead, this customisation is held in a cookie file from where it can be retrieved by the browser and sent to the web server the next time you log on.
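The round trip can be sketched with Python's standard http.cookies module. The server hands the preferences to the browser in Set-Cookie header values, the browser stores them, and on the next visit it replays them in a Cookie header from which the server recovers the settings. The preference names "colour" and "engine", and the three function names, are hypothetical.

```python
from http.cookies import SimpleCookie


def server_set_preferences(colour, engine):
    """Server side: build Set-Cookie header values carrying the user's choices."""
    jar = SimpleCookie()
    jar["colour"] = colour
    jar["engine"] = engine
    return [morsel.OutputString() for morsel in jar.values()]


def browser_cookie_header(set_cookie_values):
    """Browser side: store the values, then replay them on the next visit."""
    stored = SimpleCookie()
    for value in set_cookie_values:
        stored.load(value)
    return "; ".join("%s=%s" % (name, morsel.value) for name, morsel in stored.items())


def server_read_preferences(cookie_header):
    """Server side again: recover the preferences from the Cookie header."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    sent = server_set_preferences("blue", "lycos")
    header = browser_cookie_header(sent)
    print(server_read_preferences(header))
```

Note that the server keeps nothing between visits: everything it needs to rebuild the customised page arrives with the request.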

Unfortunately, with typical Microsoft presumption, instead of placing the file cookie.jar in the main Internet Explorer directory (as would be natural), the company has elected to hide it away in the main Windows directory, cluttering it up even more.

Corba and IIOP

One of the most important aspects of the Internet - and of intranets - is platform independence. There is a real possibility of mix and match, where elements of one manufacturer can be integrated seamlessly with those of another. Add to this the distributed nature of the Internet/intranet and you have the foundation for a component-based approach to computing that allows pieces to be plugged together wherever they are located on a network.

Against this background of interest a pre-existing approach is finding itself increasingly in the limelight after years of obscurity. Called Common Object Request Broker Architecture (Corba), this object management protocol has the backing of almost the entire computing industry. In addition to describing how distributed elements should be constructed so that they can work together in this seamless way, Corba also defines how the process should work across the Internet.

Known as the Internet Inter-ORB Protocol (IIOP) - Corba seems blighted with very forgettable names - this area has received a significant boost through Netscape's espousal in its Open Networking Environment. The success of IIOP and Corba would seem assured if Microsoft were not pushing its own Distributed Component Object Model approach as an alternative, even though it is nominally a member of the Object Management Group which oversees the development of Corba and its related standards.

Crawlers

Web robots, spiders, crawlers and worms conduct methodical searches through the whole of Web space, trying to find all URLs that are connected. As an example of the scale of the problem of indexing WWW, Lycos, which can be found at http://lycos.cs.cmu.edu has a database of over 1.89 million URLs.

Searching is simple: just enter a word or short phrase and Lycos will find matches or near matches among its holdings. The match does not have to be the site name; it can be hidden fairly deeply within the site's contents.

The worm at http://www.cs.colorado.edu/hme/mcbryan/WWWW.html also has a large database including information on gophers.

The Web crawler at http://webcrawler.cs.washington.edu/WebCrawler/WebQuery.html has around 70,000 documents referenced in a 50Mbytes index. It knows of a further 700,000 un-visited documents in a database of 40Mbytes.

Other, smaller databases of URLs include:

RBSE at http://rbse.jsc.nasa.gov/eichmann/urlsearch.html

Nikos at http://www.rns.com/cgi-bin/nikos

Jumpstation Robot at http://www.stir.ac.uk/sjbin/js

WWW Wanderer at http://www/mit.edu:8001/cgi/wandex.

babyOIL at http://www.dstc.edu.au:80/babyOIL/.

A good consolidated list of Web search engines can be found at http://cuiwww.unige.ch/meta-index.html.

More information and theoretical details on Web robots can be found at http://web.nexor.co.uk/mak/doc/robots/robots.html.
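The methodical search that such robots carry out is, at heart, a traversal of the graph of links. The sketch below is a minimal breadth-first version in Python, run against a hypothetical miniature Web held in a dictionary rather than fetched over the network; the names WEB, fetch_links and crawl are assumptions for the example. A real robot would replace fetch_links with code that downloads each page and extracts its links.

```python
from collections import deque

# A hypothetical miniature Web: each URL maps to the URLs it links to.
WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/", "http://d.example/"],
    "http://c.example/": ["http://a.example/"],
    "http://d.example/": [],
    "http://island.example/": ["http://a.example/"],  # nothing links to this page
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return WEB.get(url, [])


def crawl(start, get_links):
    """Visit every URL reachable from start, breadth first, each exactly once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:  # avoid loops and repeat visits
                seen.add(link)
                queue.append(link)
    return order


if __name__ == "__main__":
    for url in crawl("http://a.example/", fetch_links):
        print(url)
```

The page at island.example is never found, because no visited page links to it. This is why a robot can only index the URLs that are connected to its starting points.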

Credit card security

It is accepted online folk wisdom that you never send credit card details over the Internet. The reasoning behind this is that ordinary Internet messages are more like postcards than letters, with anyone being able to read them who takes the trouble to do so.

While this is certainly true, it is hardly worth the effort required for this kind of credit card fraud, and such thefts are more or less unheard-of. Clearly, though, not being able to use a credit card for purchasing online because of these fears would render the Internet useless for day-to-day commercial transactions. Fortunately, there are now a number of completely secure methods of sending such details across the Internet. All of them use the public key encryption techniques described elsewhere.

The basic TCP/IP standard that underlies the Internet says nothing about the security of the data it transmits. Although TCP/IP is generally called a 'packet'-based system, in truth information is sent more like a series of postcards that can be read by anyone who takes the trouble to do so. This inability to guarantee secure financial transactions has perhaps been the biggest obstacle to the routine use of the Internet for all business activities.

Of course, technology companies have not been slow to spot this need and to satisfy it. Indeed, the problem has been that there are not just one but two highly-suitable solutions. The first comes from Netscape Communications Corporation, the company that burst upon the Internet scene at the end of 1994 with the slick and powerful Netscape Navigator browser.

Built into this product was a security standard called Secure Sockets Layer (SSL) - see the URL http://www.netscape.com/newsref/std/SSL.html for more details. When the Netscape browser connects to a Netsite Web server, the SSL standard allows any data that passes between the two to be encrypted, using almost identical techniques to those employed by the Pretty Good Privacy system. Applied fully, these techniques can be - according to current knowledge at least - for all practical purposes unbreakable. Perhaps just as importantly for ordinary users, they are implemented in a completely transparent fashion so that you are unaware that they are in place.

Or almost unaware: for Netscape has instituted a rather neat visual element that indicates the security or otherwise of the transaction: in normal, non-secure transfer, there is a small broken key in the bottom left-hand corner of the screen. As you pass to secure mode, this key magically joins itself together. To see this in action, try accessing the URL http://www.virtualvin.com/ with a Netscape browser.

SSL has the advantage that it exists now, and comes as standard with what is almost certainly the most popular Web browser. Its rival, called Secure HyperText Transfer Protocol (see http://www.eit.com/projects/s-http/), is backed by the NCSA (where Mosaic was developed) and the CommerceNet organisation (http://www.commerce.net/); it also seemed to be favoured by the WWW Consortium (but see http://www.w3.org/hypertext/WWW/Security/News/950303_Statement.html). It works in essentially the same way as SSL in that transactions are encrypted using the techniques also employed by PGP and without the user needing to worry about the details.

Until recently it looked as if the Internet world was faced with the prospect of a highly-damaging split as software vendors and commercial users lined up with one of the two opposing camps. Although working in similar ways, a browser set up for one could not work securely with servers using the other.

However, rather unexpectedly, all parties have shown tremendous good sense and got together to work on a common standard embracing both security schemes. This is possible because they work at different levels. SSL creates secure channels across the Internet at a very fundamental level (and therefore is not just restricted to Web operations, but can also be used for secure mail, news, FTP, telnet and Gopher operations), whereas SHTTP, as its name suggests, is bound up with the HTTP protocol that ties the World Wide Web together.

A variant of SHTTP has been developed by a UK company MarketNet whose product Workhorse uses SSL to encrypt E-mail containing financial information (see URL http://193.118.187.105/help/system/horse).

Apart from the fact that the two main camps have come together, thus avoiding any direct conflict, the success of this new standard seems assured by the fact that the move has been sanctified by four major online players: IBM, America Online, CompuServe and Prodigy, all of whom have invested (along with Netscape) in the company Terisa (see http://www.terisa.com/) that will produce the necessary toolkit for the new standard.

Unfortunately this cosy situation has now been disturbed by Microsoft's announcement of its Secure Transaction Technology (STT), developed jointly with Visa. The latter had originally joined with MasterCard to draw up a standard for the transmission of credit card details over the Internet, but the STT announcement sees Visa and Microsoft ranged against Mastercard, Netscape and many other major players in this market - and the waters muddied once more.

Another way to protect credit card details during transfer across the Internet is to avoid the latter completely. Instead, you set up an account with a company on the Internet (effectively a cyberbank) and use funds held there to pay for goods sold by companies or individuals who also have accounts. At the end of the month you are billed for your purchases, while vendors are credited with the appropriate sums. The most-developed scheme along these lines is the one run by First Virtual (see http://www.fv.com:80/info/). A similar system can be found at http://www.checkfree.com/.

Since the status of such cyberbanks is a little uncertain from a regulatory point of view, other companies have chosen to take a related but different approach. Instead of holding funds for you which can then be spent, they maintain instead direct links to banks, arranging for monies to be transferred from your account to that of the vendor. In this way no credit card or account details are passed over the open Internet, just orders to buy and confirmations of the sale. Among the schemes in various stages of development are those run by Cybercash (at http://www.cybercash.com/), NetChex (http://www.netchex.com/), Netbill (http://www.ini.cmu.edu/netbill/) and NetCheque (http://nii-server.isi.edu/info/NetCheque/).

NetCheque will also be offering something called NetCash, which represents the final stage in the evolution of online payment systems. Here there is no recourse to credit cards or matching orders of purchase and sale: instead a completely digital equivalent of money is employed. This is achieved using techniques very similar to those behind the encryption tool Pretty Good Privacy (and the SSL and S-HTTP standards). It allows messages with a value guaranteed by an ordinary bank to be sent safely, anonymously but non-repudiably over the Internet.

Link, the UK's largest shared cash machine network, has put its directory on the World Wide Web. It can be accessed at http://www.link.co.uk/LINK/, and has a search engine that can be used to find cash machines local to a given address.

The pioneer and leader in this area is Digicash, whose pages at http://www.digicash.com/ have more background on the subject. Also worth noting is the Mondex consortium that, although primarily concerned with electronic money on a card, also has plans to allow its digital currency to be used over the Internet (see http://www.mondex.com/mondex/net.htm). Such schemes may seem the stuff of science fiction rather than business at the moment, but it is likely that they rather than any of the others discussed above will form the basis of Internet commerce in the not-too distant future.

Several research projects are investigating ways of pulling together the security technologies and making them easy to use. The European Community project for the development of a "Secure Electronic Marketplace for Europe" (Semper) has a home page at http://www.darmstadt.gmd.de/~semper/, although all the links here seem to be out of date. A better starting point is http://www.digicash.com/products/projects/semper.html.

Also based in Europe is the E2S project - End-to-End Security on the Internet (home page at http://www.ansa.co.uk/E2S/index.html). A US equivalent is at http://www.llnl.gov/fstc/projects/commerce/index.shtml. The link at http://www.llnl.gov/fstc/projects/commerce/payments.html is good for background information.

The biggest impetus to conducting commerce over the Internet will come from agreements between Mastercard and Visa (see http://www.mastercard.com/set/set.htm and http://www.visa.com/cgi-bin/vee/sf/set/intro.html?2+0 for descriptions of both the business and technical aspects). Originally Mastercard and Netscape offered their own standard, SEP, with Visa and Microsoft offering a rival one. Netscape's announcement can be found at http://home.netscape.com/newsref/pr/newsrelease91.html and Microsoft's electronic retail strategy at http://www.microsoft.com/advtech/eretail/faq.htm.

Cybersitter

One of the brakes on the uptake of the Internet in businesses has been a worry that staff may abuse it. Along with the fear of general time-wasting, there is the more serious concern that the Internet may be used for undesirable activities such as downloading pornographic material. Obviously this is part of a much larger issue, that of who ultimately controls the Net, and who can or should apply any kind of censorship. The extent to which these questions are unresolved was highlighted recently by the strange case of the disappearing CompuServe Usenet newsgroups.

According to CompuServe, German prosecutors informed the company that making available 200 of the more extreme newsgroups in Germany would have left CompuServe and its employees liable to legal action.

Unable to block Usenet feeds in one country only, CompuServe pulled all 200 areas from its global service, provoking a furious debate in the US, where this was seen as a gross infringement of personal liberties. What makes the matter even more curious is that the German authorities deny that they demanded this action, and say that CompuServe overreacted to their initial comments.

Clearly many issues need to be clarified in this area. But there is a growing feeling - among Internet users, at least - that legislation is not the way to tackle people's concerns about who can access what on the Internet. The alternative is to control Net access at the user's rather than the supplier's end.

For example, the new AOL service allows several accounts to be set up, with one acting as the overall arbiter of what the others can access. It is therefore possible to block access to all newsgroups, to those containing certain key words or to specified areas. It is also possible to block all binary downloads.

Of course, if you know what you are doing, it is possible to circumvent these restrictions, but this feature is aimed at parents who are concerned about what their children access.

To serve this and related markets, a new class of software designed to offer control over what users do online has sprung up. Because of the way they work, many such programs could also be used in businesses by IT managers worried about rogue use.

For example, Net Nanny (at http://www.netnanny.com/netnanny/) allows you to block words, phrases, sites and content, all according to a user-defined dictionary. This applies to all Internet activity, whether it is on the Web, in Usenet newsgroups or at FTP sites, and can include E-mail.

It is also possible to block the transmission of sensitive information, and the software provides a full audit trail of attempted accesses to blocked sites. A corporate version that works across local area networks is under development.
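The general idea of dictionary-based blocking with an audit trail can be sketched as follows. This is not how Net Nanny or any other named product is implemented; it is a minimal illustration in Python, and the class name ContentFilter, the method allow and the sample terms are all hypothetical.

```python
import re


class ContentFilter:
    """Blocks text containing any word or phrase from a user-defined dictionary."""

    def __init__(self, blocked_terms):
        self.blocked_terms = [term.lower() for term in blocked_terms]
        self.audit_trail = []  # one (source, term) entry per blocked attempt

    def allow(self, source, text):
        """Return True if the text may pass; otherwise log the attempt and block it."""
        lowered = text.lower()
        for term in self.blocked_terms:
            # Match whole words or phrases only, not fragments of longer words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                self.audit_trail.append((source, term))
                return False
        return True


if __name__ == "__main__":
    checker = ContentFilter(["secret project", "gambling"])
    print(checker.allow("web", "Quarterly results are now available"))
    print(checker.allow("e-mail", "Notes on the Secret Project launch"))
    print(checker.audit_trail)
```

The same check can be applied to outgoing as well as incoming text, which is how a filter of this kind could stop sensitive information from being transmitted.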

Products working in very similar ways are Cybersitter (at http://www.solidoak.com/cybersit.html), and The Internet Filter (at http://www.xmission.com/~seer/jdksoftware/netfilt.html). More directly relevant to corporates is Cyber Sentry (at http://www.microsys.com/cybers/default.html). This maintains a list of permitted Internet addresses, and records all network activity. It even allows access to a set of Net addresses after work hours, allowing limited recreational use by employees in their own time.

WebTrack (at http://www.webster.com/) is a complete monitoring, filtering and management server, designed to run as a proxy behind a firewall. A complete HTML log of all corporate accesses is maintained, and daily server logs can be exported to Excel for reporting and analysis.

WebTrack is Unix-based, unlike the programs above. These all run under Windows (with some Macintosh versions under development), and free, limited-function demo versions are available for most of them.

Denial of service

One aspect of the Internet that is often hard for businesses to accept is that sites are either public or private: there is no half-way house whereby the site is visible and yet cannot be reached in some way.

This means any public Web site, for example, is exposed to all kinds of probes from anyone on the Net. The most familiar attack involves trying to exploit weaknesses in the configuration of the software running on sites to gain control, but recently another has become more common: denial of service. Here the idea is not to break into a site or network, but to block access to it, often by abusing the logic of the software itself.

For example, denial of service attacks can work by causing programs to crash or get into states that cannot be resolved, such as waiting for confirmations that will never come. Just as ordinary cracking is foiled by applying patches to the relevant code to close any loopholes, so denial of service attacks are fought by tightening up the underlying logic of the programs involved.

Denial of service attacks are generally carried out for the hell of it. When crackers find that a site - ideally a high-profile one - has omitted to configure software correctly, or failed to apply the latest patch, the temptation is to exploit that weakness as a demonstration of supposed superior computing skills. But in an age when Net connectivity is becoming indispensable, the use of such an attack as a weapon of illicit commercial warfare cannot be far off, and may already be here.

Diffie-Hellman

Encryption is crucial for the application of the Internet to business. Without it, communications are not secure, and the most important ingredient of electronic commerce - trust - is missing. Fortunately there are now many different encryption approaches that can be applied in electronic commerce applications. But because so much hinges on this area, encryption has become something of a battlefield.

There have been disputes about cryptographic patents (between the top company in this area, RSA, and the inventor of PGP), and there is the continuing argument about whether industrial-strength encryption should be exported from the US. In this context, the passing of the Diffie-Hellman encryption system into the public domain assumes a certain importance. Although few will have heard of this approach, it is in fact one of the earliest encryption systems and therefore one of the most-tested and best-understood.

If few have heard of Diffie-Hellman, even fewer will care about the mathematical details, which depend on the properties of very large prime numbers, as do many encryption techniques. In practical terms, Diffie-Hellman allows a secret number to be established by two parties using public communications. This number can then form the basis of conventional secret key encryption. The fact that the technique is now freely available should lead to a renaissance in its use and more low-cost electronic commerce solutions.
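For those who do care, the exchange can be sketched in Python using deliberately tiny numbers. Each party picks a private number, publishes the public base raised to that power modulo a public prime, and then raises the other party's public value to its own private number. Both arrive at the same secret, although the secret itself never crosses the network. The prime 23 and base 5 are toy values chosen for readability and are far too small for real use; the function names are hypothetical.

```python
import secrets

# Toy public parameters, agreed in the open. Far too small for real use.
P = 23  # public prime modulus
G = 5   # public base


def make_keypair(p=P, g=G):
    """Pick a private exponent and derive the public value to send openly."""
    private = secrets.randbelow(p - 3) + 2  # random number from 2 to p - 2
    public = pow(g, private, p)             # g ** private mod p
    return private, public


def shared_secret(their_public, my_private, p=P):
    """Combine the other party's public value with our own private exponent."""
    return pow(their_public, my_private, p)


if __name__ == "__main__":
    alice_private, alice_public = make_keypair()
    bob_private, bob_public = make_keypair()
    # Only the two public values are exchanged over the network.
    print(shared_secret(bob_public, alice_private))
    print(shared_secret(alice_public, bob_private))
```

The two printed numbers are always equal. An eavesdropper sees only the public values, and with realistically large primes recovering the private exponents from them is believed to be impractical.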

Digital certificates

Public key or asymmetric cryptography works using two encryption keys. One of these is kept secret (the private key), while the other is made public (by posting to Web sites, E-mailing to potential users etc). Information encrypted with one key can only be decrypted with the other. So if information is encrypted using an individual's public key, you can be sure (at least as far as is currently known) that only the person with the corresponding private key can read that message.
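The two-key principle can be illustrated with a toy version of the RSA scheme in Python. The primes 61 and 53 are a common textbook choice and give keys that could be broken by hand; real keys are built from primes hundreds of digits long, and real systems also add padding that is omitted here. The function names are hypothetical, and the modular inverse via pow requires Python 3.8 or later.

```python
# Toy RSA key pair built from two tiny primes. For illustration only.
P_PRIME, Q_PRIME = 61, 53
MODULUS = P_PRIME * Q_PRIME  # part of both the public and the private key
PUBLIC_EXPONENT = 17         # published to the world
# The private exponent undoes the public one; computing it needs the two primes.
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P_PRIME - 1) * (Q_PRIME - 1))


def apply_key(number, exponent, modulus=MODULUS):
    """Encrypt or decrypt one number (which must be smaller than the modulus)."""
    return pow(number, exponent, modulus)


def encrypt_text(text):
    """Anyone can do this: it uses only the public key."""
    return [apply_key(byte, PUBLIC_EXPONENT) for byte in text.encode("utf-8")]


def decrypt_text(numbers):
    """Only the holder of the private key can do this."""
    return bytes(apply_key(n, PRIVATE_EXPONENT) for n in numbers).decode("utf-8")


if __name__ == "__main__":
    scrambled = encrypt_text("order 42")
    print(scrambled)
    print(decrypt_text(scrambled))
```

The keys also work the other way round: a number transformed with the private exponent is restored by the public one, which is the basis of digital signatures.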

Public key cryptography works so well that it has been adopted for just about every situation where secure communications are required. RSA, the company that dominates this area, provides details. But public key techniques on their own are not enough to guarantee completely the secure transfer of information.

For even though the encryption techniques themselves are secure, there is the issue of identity. When you send a message to someone using their public key, you can be sure that only the owner of that key can read it. However, you cannot be sure that the key really belongs to the person you wish to communicate with. It would be easy for somebody to impersonate an individual online, and make available his or her own public key. Then any messages you sent using this key would be read by the impostor.

To get around this fundamental problem, the concept of digital certificates has been devised. The idea is simple: to associate an identity with a given public key. The certificate is provided by a certification authority. The authority is responsible for establishing that a given public key does indeed belong with a given individual. This can be done at various levels, ranging from little more than a simple confirmation that the person concerned has claimed the key, up to higher levels where individuals must present themselves in person with personal documents to prove their identity along with their public key.

As well as details about a public key and the individual associated with it, a digital certificate also includes information about who issued it and when it expires. The official standard is called X.509 v3, for which full details and a simple diagrammatic representation are available.

The other crucial element of a digital certificate is that it is signed by the certification authority. A probabilistically unique number is generated (using a one-way hash function) from the information about the individual, public key and authority, and then encrypted using the private key of the authority. The resulting number is added to the other entries to form the certificate.

The signature allows prospective users to check that the certificate has not been tampered with. If it has, the result of the hash function will be different from the original. The original can be obtained by applying the certification authority's public key to the encrypted hash function result appended to the certificate. It is not possible to forge the encrypted hash result, since the authority's private key is required for the computation.
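The signing and checking steps can be sketched in Python as follows. The authority's key pair here is a toy RSA key built from two well-known Mersenne primes, still far too small for real use, and the certificate is represented as a plain dictionary rather than in the real X.509 format. The field names, the function names and the use of SHA-256 as the one-way hash are assumptions made for the example.

```python
import hashlib

# Toy RSA key pair for a hypothetical certification authority.
CA_P, CA_Q = 2**31 - 1, 2**61 - 1  # two known primes, kept secret by the authority
CA_N = CA_P * CA_Q                 # public modulus
CA_E = 65537                       # public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))  # private exponent (Python 3.8+)


def digest(fields):
    """One-way hash of the certificate fields, reduced to fit the toy modulus."""
    text = "|".join("%s=%s" % item for item in sorted(fields.items()))
    full = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")
    return full % CA_N


def sign(fields):
    """Authority: encrypt the hash result with its private key."""
    return pow(digest(fields), CA_D, CA_N)


def verify(fields, signature):
    """User: recover the original hash with the public key and compare."""
    return pow(signature, CA_E, CA_N) == digest(fields)


if __name__ == "__main__":
    certificate = {
        "subject": "A. Customer",            # hypothetical individual
        "public_key": "subject-key-123",     # placeholder for the real key
        "issuer": "Example Authority",
        "expires": "1999-12-31",
    }
    signature = sign(certificate)
    print(verify(certificate, signature))
    tampered = dict(certificate, public_key="impostor-key-999")
    print(verify(tampered, signature))
```

Changing any field alters the hash, so the signature issued for the genuine certificate no longer matches the altered one.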

This does pose another problem: how do you know the public key supposedly of the authority really belongs to it? The answer is that the authority in turn has a digital certificate that allows its public key to be verified, which requires another certifying authority. In fact, the whole certification authority approach pre-supposes a hierarchy of authorities, with a central one sitting at the apex of each authentication pyramid. Among the leading commercial authorities are VeriSign and GTE.

Using certificates for Internet transactions

The beauty of digital certificates is that although the mathematical techniques that underlie them are complicated, using them is simple. Indeed, most of the time people will not be aware they are operational at all: everything is handled automatically by the relevant software.

A good example of this is offered by what is probably the most important public application of certificates: electronic commerce using credit or debit cards. One of the key considerations here is security, ensuring that card details remain hidden from third parties at all times. But in fact there are two other issues that also need to be addressed: privacy and non-repudiability.

Privacy means that in addition to keeping card details hidden from potential eavesdroppers, they should also, ideally, remain unknown to the vendor. All that the latter requires is confirmation from the company issuing the card that the requested payment will be made.

In this respect the widely-used SSL protocols are unsatisfactory: although they do indeed protect the card details in transit, they disclose them to the merchants, who might use them for fraudulent purposes. Another requirement not met by SSL is the need for the electronic equivalent of physical receipts. That is, it is important to be able to demonstrate that a sale was made and agreed to by both parties.

The new Secure Electronic Transaction (Set) protocols, recently finalised by the main developers Visa and Mastercard offer all three benefits, and make use of digital certificates extensively. Three parties are involved: in addition to the purchaser and vendor, there is also something known as a payment gateway, which acts as a gateway through to existing financial systems.

All three parties require digital certificates to use Set. These will be obtained from a Certificate Authority (CA) as explained above. They enable the three parties to obtain and trust the public keys necessary for encryption. The use of public keys also provides non-repudiability: if messages can be decrypted using a given public key you can be sure that they were encrypted with the corresponding private key. Since only one person should have access to this key, any message must have come from that person, and so is binding.

Privacy is provided by using the payment gateway as an intermediary: card details are sent by the purchaser to the gateway, which contacts the card issuer to obtain authorisation. The gateway then confirms the authorisation to the vendor, which never sees card details. Both purchaser and vendor can trust the gateway because it has a digital certificate, which is guaranteed by a Certificate Authority.

The open Internet is an obvious arena for the use of digital certificates. Potentially you are dealing with complete strangers who may be located the other side of the globe, so the need for some kind of authorisation is all the more pressing. Digital certificates can also be applied to E-mail to ensure full confidentiality, for example using the standard called S/Mime. Certificates may also offer a solution to the growing problem of junk E-mail: a company might choose to accept E-mail only from correspondents with certificates that can be checked first.

Another important use of certificates is on intranets, where they can be employed to establish permissions for accessing various levels of sensitive information. One advantage of using certificates here is that it is not necessary to obtain them from an external CA. Instead, a certificate server (for example from Netscape) is added to the intranet to handle certification and verification.

An obvious extension of employing certificates to determine access privileges is over an extranet. Here, though, external Certificate Authorities will probably be involved since each participating company's intranet may well have its own certificate server whose trustworthiness will need to be confirmed by an independent authority.

Digital Signatures

One of the ideas put forward for addressing the perennial problem of knowing when it is safe to download and run programs from the Internet is that of digital signatures, which Microsoft is suggesting as a way forward.

The idea is that if you know with absolute security who something came from, you can gauge the likelihood of its being hazardous for your computer. So if you can be sure a program comes from a major software firm, say, you might be rather happier running it than if it comes from an undergraduate at a Bulgarian university.

To achieve this certainty, a digital signature is used. Although meaningless in itself - it is just a string of bits - it can be used to show that the software it refers to must have come from a certain person. This is because signatures employ the same principles as encryption programs like Pretty Good Privacy: using the public key of a person or entity it is possible to check that the signature could only have been produced by that person or entity from the program it comes with.

The technique usually involves what is called a hash function that takes the program and derives from it a number that is, for all practical purposes, unique to that program. This number can then be encrypted with the private key of the program's owner to produce the signature.

If you download the program and apply the purported owner's public key to the signature, you will retrieve the hash value, which can then be checked against the output from the same hash function applied to the program. If they match, the program must have come from its claimed sender. If not, you should steer clear.
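The hash-then-sign idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the key is a toy RSA pair built from two small, publicly known primes, the hash is reduced to fit under that tiny modulus, and the function names are invented for the example. Real signing schemes use far larger keys and standardised padding.

```python
import hashlib

# Toy RSA key built from two well-known Mersenne primes (assumption for
# illustration only: a key this small offers no real security).
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)


def digest(program: bytes) -> int:
    """Hash the program, then reduce the result to fit under the toy modulus."""
    return int.from_bytes(hashlib.sha256(program).digest(), "big") % N


def sign(program: bytes) -> int:
    """Owner's side: transform the hash with the private key."""
    return pow(digest(program), D, N)


def verify(program: bytes, signature: int) -> bool:
    """Downloader's side: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest(program)


program = b"print('hello')"
signature = sign(program)
print(verify(program, signature))                    # genuine copy
print(verify(program + b" # tampered", signature))   # altered copy
```

The downloader needs only the program, the signature and the public values N and E; the private exponent D never leaves the owner.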

Directory-enabled networks

It is ironic that the very things that join people together - networks - should be so unconnected when it comes to detailed control of resources. Networks tend to exist almost in isolation, as separate entities that other core elements are plugged into. The Directory-Enabled Networks initiative aims to change that by integrating networks tightly with directories holding information about users and resources.

With such a standard in place, it will be possible to manage heterogeneous networks centrally, and from any location. But more importantly, it will be possible to provide network services on a completely individual basis. Using a directory, network attributes for users can be set according to their individual needs.

Currently, network bandwidth allocation, for example, tends to be on an all-or-nothing basis. With directory-enabled networks, different departments can be granted greater or lesser rights to corporate bandwidth. Similarly, key individuals can be given enhanced access privileges, perhaps for the duration of a special project.

Once this kind of fine-grained control is possible, other advanced network capabilities such as volume-based differential pricing and Quality of Service guarantees are likely to become more attractive.

The two main players driving the initiative are Microsoft and Cisco. Many other companies have also joined the movement, and it has now been passed over to the independent Desktop Management Task Force.

Distribution and Replication Protocol

Every Internet user waiting for Web pages or files to download knows that you cannot have too much bandwidth. Unfortunately normal Internet use is not particularly efficient as regards bandwidth utilisation, which means that the network's overall performance is even less than it could be.

An example of this inefficiency involves the updating of files. When information or data is updated it is nearly always the case that all of the files in each logical unit are sent over the Internet. Since many applications only change a small percentage of their constituent parts, this is manifestly extremely wasteful.

This was what made Marimba's approach with its Castanet product so interesting. There, Java programs could be updated incrementally, so that only those elements that had changed since the last download were re-sent. This meant great savings in bandwidth utilisation.

Marimba went on to extend its techniques to any programming language. Now it has gone even further, and proposed its general techniques to the World Wide Web Consortium as an open, non-proprietary standard for updating data, files and content. It is called the Distribution and Replication Protocol, and is designed as an extension of HTTP.
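The principle of sending only what has changed can be sketched as follows. This is a hedged illustration rather than the protocol itself: the function names, the file names and the use of SHA-256 checksums as the index are assumptions made for the example.

```python
import hashlib


def build_index(files: dict) -> dict:
    """Map each file name to a checksum of its content."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def plan_update(client_index: dict, server_index: dict):
    """Compare two indexes and work out the smallest set of transfers."""
    # Fetch files that are new on the server or whose checksum differs.
    to_fetch = sorted(
        name for name, checksum in server_index.items()
        if client_index.get(name) != checksum
    )
    # Delete files the server no longer lists.
    to_delete = sorted(name for name in client_index if name not in server_index)
    return to_fetch, to_delete


# Hypothetical application: one file changed, one added, one dropped.
old_version = {"app.class": b"v1", "help.html": b"text", "logo.gif": b"img"}
new_version = {"app.class": b"v2", "help.html": b"text", "icons.gif": b"new"}

to_fetch, to_delete = plan_update(build_index(old_version), build_index(new_version))
print(to_fetch)    # ['app.class', 'icons.gif']
print(to_delete)   # ['logo.gif']
```

Only the small index needs to be exchanged before every update; the unchanged help.html never crosses the network again.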

This eminently sensible proposal has the backing of most of the industry, except for Microsoft, which is probably concerned about losing control over what will doubtless prove a key element of software distribution in the future. To muddy the waters even further, another company, Novadigm, is claiming that elements of the proposed standard "may infringe its property rights".

Document Style Semantics and Specification Language

HTML is used for creating documents for display in a Web browser. Strictly speaking the markup used in a Web page is about the underlying structure, rather than its presentation, but for years HTML has been used as a way of creating a particular effect on the page.

The new Extensible Markup Language (XML) is much closer to HTML's original roots in the Standard Generalised Markup Language (SGML). This is purely about structure, and XML too says nothing about how any documents produced from its files will look. This task is left to a separate style sheet.

Style sheets have become more familiar with the support for cascading style sheets provided by the recent versions of Internet Explorer and Navigator. Style sheets allow advanced design elements to be created without the current HTML tricks, and for a single house style to be applied to different Web documents through the use of a separate style sheet file.

In the SGML world, style sheets are created in the Document Style Semantics and Specification Language, or DSSSL. As its rather intimidating name indicates, this is a very rigorous approach which allows SGML documents to be processed in a variety of powerful ways, including the arbitrary re-arrangement of elements.

Given XML's SGML heritage, it is perhaps not surprising that the emerging standard for XML style sheets, called the Extensible Stylesheet Language, draws heavily on DSSSL. Although this means that Extensible Stylesheet Language is unlikely to become widely used, it holds out the promise of some powerful applications when combined with XML.

Document Type Definition

Hypertext markup language (HTML) is sometimes described as a subset of the standard generalised markup language (SGML): that is, people understand HTML as a kind of cut-down version of SGML.

In fact, HTML is an application of SGML: an example of how the completely general SGML can define a particular document structure, in this case that of a Web page. That document structure is encapsulated in the document type definition (DTD). SGML is the language used to craft the DTD: it is a meta-language which can be used to create practical mark-up languages like HTML.

The full definition is never seen in HTML documents, although sometimes there is a vestigial reference found as a first line such as <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">. This essentially provides a cross-reference to the DTD used for the document.

How complicated DTDs are depends on how much structure you want to build into a document. The DTDs for HTML have become progressively more complex as features have been added; that for HTML 4.0 runs to tens of pages. DTDs contain abstract details about the markup that is permitted and what structure it has: which elements can be placed within others, for example. They also specify the characters that can be used in a document following that definition.

One of the main functions of a DTD is to let software check automatically whether a given document obeys all the relevant rules. Given the number of proprietary extensions to HTML it is probably just as well that such rigour has rarely been applied to Web pages.
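The kind of automatic checking a DTD makes possible can be sketched like this. The rule table below is a hypothetical, heavily simplified stand-in for a DTD: it records only which elements may appear inside which others, whereas a real DTD also governs order, repetition and attributes. The sketch assumes a well-formed document and uses Python's standard XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: the child elements each element is allowed to contain.
ALLOWED_CHILDREN = {
    "html": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"h1", "p"},
    "h1": set(),
    "p": {"em"},
    "em": set(),
}


def violations(document: str) -> list:
    """Return a list of rule violations; an empty list means the document conforms."""
    root = ET.fromstring(document)
    if root.tag not in ALLOWED_CHILDREN:
        return [f"unknown element <{root.tag}>"]
    problems = []
    # Visit every element and test each of its direct children against the rules.
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
    return problems


good = "<html><head><title>T</title></head><body><p>Hi <em>there</em></p></body></html>"
bad = "<html><body><p><h1>Oops</h1></p></body></html>"
print(violations(good))   # []
print(violations(bad))    # ['<h1> not allowed inside <p>']
```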

Domain Name System

The Domain Name System, or DNS, lies at the heart of the Internet's approach to addresses. When you send a message to bloggs@acme.co.uk or use the Web browser to download the home page from WWW site www.ibm.com, your message and request for data must somehow find their way through the tangled mesh of networks that make up the Internet to precisely the right host at the right site.

This is achieved by converting a host name such as www.ibm.com into a numerical Internet address of the form 192.147.13.12, a process known as performing a host look-up.

Ideally there would be a central registry where the Internet sites with names such as www.ibm.com were held. But with millions of users joining the Internet every month, this is simply not possible.

Instead a hierarchical naming system is used whereby there are registries for local areas or domains, small enough so that each can be updated. For example, within a company there may be a database for individual users, and within a country or group of countries there will be a database of companies.

Then, if an address is required for a site that lies outside the home country, a request can be sent to the database in the destination country. This, in turn, may send on the request to a sub-database handling a smaller region or a company, which can provide the address in question to the original enquirer.
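The way a request is passed down from one registry to the next can be sketched with nested tables. Everything here is hypothetical: the host names, the addresses (taken from ranges reserved for documentation) and the function name are assumptions for illustration, and a real look-up involves queries to separate name servers rather than one table in memory.

```python
# Hypothetical registries: each level knows only the names directly beneath it.
ROOT = {
    "uk": {
        "co": {
            "acme": {"www": "192.0.2.10", "mail": "192.0.2.11"},
        },
    },
    "com": {
        "example": {"www": "198.51.100.7"},
    },
}


def look_up(host_name: str):
    """Walk from the top-level domain down to the host, one registry at a time."""
    registry = ROOT
    # Names are read right to left: country first, then sub-domain, and so on.
    for label in reversed(host_name.lower().split(".")):
        if not isinstance(registry, dict) or label not in registry:
            return None                      # nobody at this level knows the name
        registry = registry[label]
    # Only a final entry that is an address counts as a successful look-up.
    return registry if isinstance(registry, str) else None


print(look_up("www.acme.co.uk"))      # 192.0.2.10
print(look_up("www.example.com"))     # 198.51.100.7
print(look_up("www.nowhere.co.uk"))   # None
```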

Sub-domains

The main domains are familiar to anyone who has entered a few URLs: in the US, there is .com for the commercial domain, .edu for the academic world and .gov for government bodies. In the UK, the equivalents are .co.uk, .ac.uk and .gov.uk, where the country domain .uk is necessary because the default is always the US.

Nonetheless, the .us domain does exist, and is being used more and more by those who are conscious that the proportion of US sites as a percentage of the whole is diminishing steadily. Many that use the .us domain also adopt the useful policy of putting their state as the next sub-domain. For example, ca.us is California, fl.us is Florida and so on.

The rest of the online world seems to fall into three camps: those that broadly follow the US sub-domains, such as Australia, which uses com.au and edu.au; those that follow the UK's approach, such as South Africa (ac.za and co.za); and those which dispense with sub-domains altogether.

For example, Germany tends to adopt domain names that are descriptive, such as bauphysik.architektur.uni-kassel.de.

Many other countries take this approach.

The new domain names

As more businesses take to the Internet the subject of domain names is becoming critical. The basic problem is that the DNS system - used for converting human-friendly names like www.ibm.com into IP addresses - was never designed to cope with the millions of names or complex legal issues such as trademarks that are both facts of life on today's Internet. As a result, the current structure can no longer cope, and it is clear that changes must be made.

The first domain name registry, run in the US by InterNIC, has grown into a more distributed organisation. Alongside the main registry (http://rs.internic.net/rs-internic.html now administered by the company Network Solutions), there are other bodies, for example the European centre RIPE (http://www.ripe.net/) which co-ordinate name allocation in other regions. At a more local level are the national registries (http://www.uninett.no/navn/domreg.html) - in the UK there is Nominet (http://www.nic.uk/). Each of these registries is responsible for allocating the domain names within its particular area.

As well as national domains such as .uk and .fr, there are general domains, including .com, .net and .org. While the former are functioning well, with new local sub-domains created where necessary - for example, there is now one called police.uk in the UK domain alongside the traditional co.uk and ac.uk etc - the .com domain is showing signs of strain. Originally designed for companies within the US, .com has become a global domain.

As a result of the increased demand, most of the memorable names have been allocated and - more seriously, perhaps - many thorny issues relating to trademarks are beginning to arise. For example, a company with a valid trademark in one region might register the name in the .com domain and thereby block another company's equally valid use of the same trademark elsewhere.

This has also led to the rise of what is often called trademark extortion: the registration of well-known trademarks as domain names in the hope of selling them to their rightful owners for a suitably large sum. It has therefore been clear for some time that the general top-level domains - those not within national domains - needed expanding, and the procedures for arbitrating disputes revising.

The main Internet body involved in this area is the IANA (Internet Assigned Numbers Authority), since it is responsible for handing out the corresponding IP numbers (into which the domains are converted by the world's DNS servers). The IANA therefore requested another venerable Internet body, the Internet Society, to create a committee to draw up suggestions for an overhaul of general domain names.

Called by the rather uninspired name of the International Ad Hoc Committee, this drew on some of the key Internet and international trade organisations. As well as the IANA and the Internet Society, there were representatives of the Internet Architecture Board (which oversees the more technical aspects of the Internet, such as the IETF working groups) and the US Government's Federal Networking Council (FNC, perhaps best-known for the image it created of the Internet's US backbone structure).

The other members came from the International Telecommunication Union (which uses the rarely-used .int domain), the International Trademark Association and the World Intellectual Property Organisation. The IAHC has now produced its proposals, and, perhaps surprisingly, will be implementing them almost immediately. The consequences and longer-term implications of this new initiative are potentially very important for businesses, and this is very much the beginning of a process that promises to be both long and painful.

The proposals from IAHC for new domain names and other changes to the whole area of the DNS system cover three main areas: domain names, registries and the resolution of trademark disputes. The IAHC has come up with seven new generic top-level domains.

The new domains (as well as the older generic ones at a later date) will be allocated by up to 28 new registrars. This number is arrived at by allowing up to four registrars in each of seven regions in the world. These new registrars will not supersede the current national registries but offer an alternative to them, and outside their national hierarchies. Rather bizarrely, the four registrars in each region will be chosen by lottery, subject to the fulfilment of certain conditions relating to business and technical practices.

One important innovation of these new registrars is that they will be competitive: potential registrants will be able to pick and choose among them - on the basis of price, for example. One of the main criticisms of the current system for the .com domain has been the monopoly held by Network Solutions. The presence of multiple registrars around the world means that it should be possible to reduce the cost of registration considerably. If this system works well, the plan is to add even more registrars, at the rate of twenty to thirty a year.

However, this competitive system means that some way of unifying the multiple registrations must be available. This will be achieved through the creation of a shared repository, run by an independent and neutral third party. There will also be a further independent and neutral third party to run the master DNS server: this is necessary so that the new domains are integrated into the current Internet system. Without it, requests to sites using these new domains could not be routed.

On the trademark front, one proposal is for the creation of trademark-specific domain name spaces. These include a new sub-domain tm.int for international trademark registration, as well as national versions (such as .tm.uk). France already has such a subdomain. On the important issue of resolution of trademark disputes, the proposals include a clause in the application form for domains that binds the applicant to agree to participate in either online mediation under the rules of the Arbitration and Mediation Centre of WIPO, or else in a binding expedited arbitration under these rules.

Leaving aside important questions of how these proposed dispute resolution procedures will work in practice - and especially how they will fit with national and international law in this area - there are some more general issues surrounding these IAHC proposals. First, there is the question of how the names were arrived at: the suggested new domains seem arbitrary and are certainly heavily-biased towards the anglophone world.

Secondly, they are being rushed through, without consultation, in a way that is bound to make their implementation much harder - see, for example, Nominet's comments. Finally, even before the proposals have been implemented, IAHC has a lawsuit to fight. All-in-all, it seems certain that much more work will need to be done in this area - and that companies will find themselves struggling with problems of domain names for some time to come.

Enter the weird domain of the Internet names

US presidential special adviser Ira Magaziner has created a furore by coming up with his own proposals for the Net naming system (January 1998).

The new domain name system proposed by the Internet Council of Registrars, announced over a year ago, was to have come into force around now.

Under this scheme, new domains ending in .firm, .store, .web, .arts, .rec, .info and .nom would have been added to the national domains such as .uk and .fr, and the generic domains .com, .net and .edu, which are widely treated as international in scope, even though strictly speaking they apply only to the US.

At the same time, some 28 registrars were to have been created, each able to offer any of these domains, as well as the pre-existing generic domains. The new domain system has not been activated, even though the contract with Network Solutions, the company that has a monopoly on handing out generic domain names, comes to an end in September. The delay is partly due to the US government's taking exception to the proposal to move ultimate control of the Internet's name space from the US to Geneva. In a move to head off this development, president Bill Clinton's special adviser on the Internet, Ira Magaziner, has drawn up his own proposals.

These were based on comments sent in by interested parties last year, partly as a result of their unhappiness with the council's approach. The draft of Magaziner's plan is at www.ntia.doc.gov/ntiahome/domainname/dnsdrft.htm

A splitting headache

Magaziner's scheme is fairly complicated. It involves splitting Network Solutions into two nominally independent parts, one providing registrar services (allocating names) and the other running the registry (the central database holding names and the corresponding IP addresses).

In addition, more or less anyone would be able to become a registrar and hand out names, while a small band of new registries would be created, each dealing with one of up to five new top-level domains (as yet unspecified).

However, the details are still very unclear; the Magaziner document asks for further input about many of its ideas, and leaves open key questions such as what the new top-level domains would be, who would be allowed to run the database registries, and how copyright clashes would be handled.

As the wealth of information at the Internet council's Web site indicates, the old plan may have been unpopular with some people, but at least it was complete and detailed. Some of Magaziner's ideas may have merit - though the general approach is very US-biased - but the manner in which the US government is trying to impose them is little short of disastrous.

Currently, it is not clear whether the council's proposals will simply fade away, whether they will be partially subsumed by whatever ultimately comes out of Magaziner's proposals, or even whether there will be a kind of Internet schism, with some DNS look-ups supporting the new council domains, and others following some new body set up by Magaziner.

An Internet council press release of 15 January 1998 cited March as the date for initiating its new domain names. A press release of 30 January, responding to the release of Magaziner's draft, seems to indicate that the Internet council is intending to press ahead.

Trial of strength

The Internet has already had a foretaste of how this might happen. At the end of January 1998, just as the Magaziner proposals were being finalised, the current root of the DNS system was bypassed, and a mirror system used instead. This was not the action of some crazed cracker, but a deliberate show of strength by Jon Postel, who is in charge of the Internet Assigned Numbers Authority, the official body that hands out blocks of IP addresses.

The authority is working closely with the group behind the council system but would lose its current role under the Magaziner plan. The fact that Postel was able to hijack the entire routing system of the Internet hints at just what might happen if mavericks decide to implement the council proposals regardless of the US government's attempts to steamroll its own plan through.

By March 1999, after two years of much talk and little action, the new body charged with sorting out the mess, the Internet Corporation for Assigned Names and Numbers (Icann), has barely got round to issuing proposed guidelines on how it intends to license new Internet domain name registrars.

As a result of this incredibly long and wearisome process, most users have probably assumed that either nothing much will ever happen, or if it does, it won't make much difference to the way they use the Internet for business. Unfortunately nothing could be further from the truth. For hidden within the draft guidelines from Icann are some potentially damaging proposals that could fundamentally change the way people use the Internet.

These proposals arise out of the white paper in which, as part of the process of privatising the DNS registry system, the US government asked the World Intellectual Property Organisation (Wipo) to "develop recommendations for a uniform approach to resolving trademark/domain name disputes involving cyberpiracy" among other things.

Although "ICANN's full consideration of the Wipo recommendations should await the final Wipo report", as the draft guidelines say, "many of the Wipo recommendations appear to serve the goals of the accreditation process," and ICANN seems likely to follow the final Wipo recommendations.

The background to these recommendations can be found online, as can the draft itself.

DNS round robin

For most people, the front door of commercial sites is represented by the domain name: www.ibm.com etc. But as far as the computers are concerned, what really counts is the IP address: 204.146.17.33 in the case of www.ibm.com. The technology used to map the domain name on to the IP address is the Domain Name System (DNS). The fact that most users are unaware of its operation is a testament to how efficiently it functions generally. Indeed, it is only when name look-ups fail that the system manifests itself.

Another situation where the DNS is working magic is particularly relevant to electronic commerce sites. Clearly, it is important never to turn away visitors, since they are all potential purchasers. This means that commerce sites must be able to scale to cope with visitor numbers.

Adding the hardware is simple enough, but a problem arises with the configuration. Since each machine must have a unique IP address, it would seem that different domain names must be used. This is a disaster in marketing terms, since users would never remember to rotate among them to ensure an even load at the site in question.

Fortunately this is not necessary, since DNS offers a facility called round robin. Instead of holding just one IP address to be associated with a domain name, several can be listed. Each of these is supplied in turn when a look-up is made, ensuring traffic is distributed evenly among the machines at a site, though not with any account taken of either machine power or current load.
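The rotation can be sketched as follows. The class name and the addresses (from a range reserved for documentation) are assumptions for illustration; real name servers typically return the whole list with its order rotated, rather than a single address as here.

```python
from collections import deque


class RoundRobinRecord:
    """Holds several addresses for one name and hands them out in rotation."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def look_up(self) -> str:
        address = self._addresses[0]
        self._addresses.rotate(-1)   # move the address just served to the back
        return address


# Hypothetical site with three identical servers behind one name.
site = RoundRobinRecord(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
answers = [site.look_up() for _ in range(6)]
print(answers)
```

Six successive look-ups visit each machine twice, in strict turn. The rotation pays no attention to how busy or how powerful each machine is.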

Domain squatting

The domain name system was designed to allow numerical Internet addresses in the form 123.45.67.89 to be replaced with the more easily remembered bloggs.co.uk. But as has happened so often in recent years, what began as a neat engineering shortcut has grown into a pivotal facet of the Internet for business use.

Now this address mnemonic has become nothing less than one of the key elements of a company's Internet presence. As a result, domain names have passed from rather unremarkable synonyms for 32-bit IP addresses into hot intellectual properties with a quantifiable value.

And, of course, once money starts to enter the equation, there are always sharp operators prepared to create and exploit the market. The world of Internet names is no exception, and something known as domain squatting has become a familiar if rather regrettable aspect of domain name selection.

The practice involves the registration of domain names that are likely to be sought after by other users; typical targets will be major organisations and their brands along with common words. The theory (and sometimes practice) is that when the company decides it wants this domain it has to pay the squatter a considerable sum to give up the name.

Unfortunately, with the confused state of law on the use and abuse of trademarks online, it is not clear what recourse firms have against these speculators. Until the legal situation has been clarified - and the domain name approach extended to resolve the shortage of "good" names - such cyber squatters are likely to remain part of the business Internet scene.

The draft guidelines from Icann contain some potentially damaging proposals that could fundamentally change the way people use the Internet. The danger lies in the reasonable-sounding paragraphs dealing with a new dispute procedure (paragraphs 139 onwards). The key point is that "each domain name applicant would, in the domain name registration agreement, be required to submit to the procedure if a claim is initiated against it by a third party". The idea is to simplify the process of winning back trademarks that have been wrongly registered by what are often called cyber-squatters.

Although companies will applaud this aim, they might want to consider the strong criticism of the proposed way this will be done voiced by a certain Michael Froomkin. An academic specialising in Internet law, Professor Froomkin is not only an expert in this whole area, but possesses the further inarguable advantage of being one of Wipo's group of experts that were called upon to help formulate the draft recommendations.

So when Froomkin writes: "It seems transparent to me the result of such a scheme [the new arbitration system proposed by Wipo] - at least in the hands of minimally competent trademark counsel - is to create an enormous opportunity for brutally effective blackmail on the part of trademark holders against ordinary people", his bleak vision needs to be taken seriously.

Froomkin has put together a long and closely-argued analysis of the Wipo recommendations, complete with comments and proposals. It should be read by anyone in business who cares about the free and efficient running of the Internet. This is because the threat to "ordinary people" that Froomkin identifies actually applies just as much to companies, which would find their ability to register innovative and legitimate domain names greatly hampered.

Unfortunately the Wipo consultation process has been very low key to the point of invisibility, and it is already too late to provide input on its latest ideas (though you could still try).

However, ICANN may still be open to comments on its guidelines (March 1999), which will draw on the final Wipo proposals, at the address comment-guidelines@icann.org.

Downsampling

Visitors are obviously crucial to nearly every business site on the World Wide Web. But once electronic commerce is added to the mix - in the form of online sales, say - they become its very life-blood. Maximising the time they spend at a site - and their purchases - is a central concern of those running such sites.

Fortunately, companies using Web servers as their chosen tool have a very powerful weapon at their disposal - one that marketing departments in conventional sales environments would pay dearly to have. For from the time that a visitor enters a site, every move they make is recorded in a visitor log.

Information about how shoppers use physical stores can be obtained only through painstaking and expensive market research; but the visitor log contains all this for every single visitor to a Web site.

Hidden within this log, then, is marketing gold: the common patterns and overall trends that can tell a company how to spend its resources more effectively. The trick, of course, is extracting it.

Here, the company offering online purchases finds itself in a rather novel situation: unlike its peers in the ordinary world of shopping, it has too much data. A busy site can generate millions of log entries every day, making analysis a daunting task.

To get round this problem, the technique of downsampling can be employed. Rather than trying to analyse the entire log, only those visits that began in a given fraction of every hour are used, and the results then extrapolated.
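A minimal sketch of the technique, assuming each visit is recorded with its start time: the field names, the six-minute window and the sample data are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
from datetime import datetime


def downsample(visits, minutes_per_hour=6):
    """Keep only the visits that began in the first few minutes of each hour."""
    return [v for v in visits if v["start"].minute < minutes_per_hour]


def estimate_total(sample_count, minutes_per_hour=6):
    """Scale a count taken from the sample up to the whole log."""
    return sample_count * 60 / minutes_per_hour


# Hypothetical log: one visit starting in every minute of two hours.
visits = [
    {"start": datetime(1999, 3, 1, hour, minute), "page": "/catalogue"}
    for hour in (9, 10)
    for minute in range(60)
]

sample = downsample(visits)
print(len(visits), len(sample), estimate_total(len(sample)))
```

Here a tenth of each hour is analysed and every count is multiplied by ten. The estimate is only as good as the assumption that visits starting in the sampled minutes are typical of the rest of the hour.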

Dynamic Addressing

To be truly "on" the Internet, you need to be identifiable by a unique Internet address of the form 123.45.67.89. This address is needed so that other nodes on the Internet - World Wide Web sites, FTP servers etc - can route to you the various information and data that you request. It is normally given to you either directly by your Internet supplier (e.g. Demon or City Scape) if you have a personal account, or else by your network manager in the case of a corporate connection where blocks of addresses have been allocated to the company in question.

However, it is not absolutely necessary that you have the same Internet address every time you log on as far as most Internet services are concerned. For example, an FTP server needs to know your address only at the moment that it sends a file, and is indifferent to what it was last week.

Many Internet providers and network managers take advantage of this fact to allocate addresses dynamically: each time that you log on (either to the Internet supplier, or to the corporate network) you are assigned an Internet address for that session from a pool of them.

The advantage is clear: instead of needing enough Internet addresses for every possible user, a provider needs only enough to cover the maximum number of users connected at the same time.
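Allocation from a pool can be sketched like this. The class, the user names and the addresses (from a range reserved for documentation) are assumptions for illustration, not a description of any particular supplier's software.

```python
class AddressPool:
    """Lends out Internet addresses for the duration of a session."""

    def __init__(self, addresses):
        self._free = list(addresses)
        self._in_use = {}

    def log_on(self, user: str) -> str:
        if user in self._in_use:               # already connected: keep the address
            return self._in_use[user]
        if not self._free:
            raise RuntimeError("no addresses left in the pool")
        address = self._free.pop(0)
        self._in_use[user] = address
        return address

    def log_off(self, user: str) -> None:
        address = self._in_use.pop(user, None)
        if address is not None:
            self._free.append(address)         # hand the address back for reuse


# Three hypothetical users share a pool of only two addresses.
pool = AddressPool(["192.0.2.1", "192.0.2.2"])
alice = pool.log_on("alice")
bob = pool.log_on("bob")
pool.log_off("alice")
carol = pool.log_on("carol")
print(alice, bob, carol)
```

Carol receives the address Alice has just given up, so three users are served by two addresses, provided no more than two are ever connected at once.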

Whether you have a dynamic address or a static one (that is, the same for all Internet sessions) only really surfaces when it comes to setting up the TCP/IP functionality that hooks you into the Internet, as with the Windows 95 software.

Using the dial-up networking feature of Windows 95 you can set up multiple SLIP or PPP Internet addresses (though SLIP functionality has to be added separately: PPP is the default protocol throughout). Both of these can employ either static or dynamic addressing. Moreover, unlike the original TCP/IP set-up process that requires you to restart your PC every time that you make a change, this dial-up Internet connectivity is active immediately.

Although setting up this new TCP/IP functionality is much easier than the alternative approach, it is still likely to be beyond most users (not least because the jargon used is so unfamiliar). Recognising this, Microsoft has written what it calls an Internet Set-up Wizard (similar to the other Wizards now found throughout its software) that guides you through the process step-by-step using a series of screens with simple questions. Once you have answered all the questions, the dial-up Internet connection is created automatically.

The Internet Set-up Wizard is supplied as part of the Internet Jumpstart Kit, itself found on the separately sold Windows Plus! CD-ROM, which adds various kinds of features to the basic Windows 95 product. Also of note here is the fact that the Windows Plus! disc contains Microsoft's Web browser, called Internet Explorer.

Based on the original Mosaic browser, which has been licensed by Microsoft, IE possesses a number of interesting features. For example, URLs (which are called shortcuts in Windows 95-speak) can be dragged from a Web page directly on to the desktop: double-clicking on them later causes an Internet connection to be made, Internet Explorer to be loaded and the relevant page accessed without further intervention.

Perhaps most fascinating of all, though, is another option of the Internet Set-up Wizard. As well as allowing you to use your own Internet supplier (by giving the telephone number, login name, password etc.) you can also choose to access the Internet via the Microsoft Network. This area signals perhaps the biggest change in Microsoft's strategy.

When you choose this option, an account is created for you with the Microsoft Network that allows you to access it not in the normal way, but with a full TCP/IP connection, using the built-in TCP/IP functionality of Windows 95. This lets you access all of the usual areas of the Microsoft Network, but also allows you to jump anywhere on the Internet (indeed, you can use the Microsoft Network and Internet tools simultaneously).

So, in effect, the Microsoft Network becomes just another part of the Internet, rather than a separate online service with limited gateways to it (the original plan). This enormously important shift of emphasis means two things: first, that Microsoft has realised that it cannot compete with the Internet, but must work with it; and secondly, that as a result the general rush to the Internet by companies offering services and the public using them is about to accelerate.

Dynamic Host Configuration Protocol

Despite the unplanned topology of the rag-bag collection of networks that go to make up the Internet, data packets do arrive (usually) at their intended destination, thanks to the addressing scheme employed. However, there are a couple of problems with the use of these addresses - numbers of the form 123.45.67.89 - which are assigned to the millions of Internet nodes.

First, addresses rather pre-suppose that they are allocated and then remain assigned to a particular computer that sits on a particular network. Unfortunately, the increasing use of mobile computers means that users may often log in to different parts of a network or even entirely different networks. Re-configuring the DNS tables to take cognisance of this fact soon becomes an administrative nightmare.

The second problem is perhaps rather deeper. The growth in the popularity of the Internet and the way that blocks of addresses are typically released means that there can often be a shortage of Internet addresses within a company.

The way round both of these problems is to use the Dynamic Host Configuration Protocol (DHCP) to allocate Internet addresses on the fly - dynamically - and temporarily. When someone logs on to any network supporting DHCP, they can be allocated an IP number for the duration of the session. This ensures that data packets are routed to them correctly wherever they are.

Moreover, the fact that the address is only 'leased' to them means that it can be used by another user when the first has finished. In this way a limited pool of Internet addresses can serve a far larger number of users who log on for only part of the time.
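
The leasing idea described above can be illustrated with a small Python sketch. It is only a toy model of a shared address pool, not an implementation of the DHCP protocol itself; the class name, method names and the 192.0.2.x addresses are assumptions made for the example.

```python
from ipaddress import IPv4Address


class AddressPool:
    """Toy model of dynamic address allocation (not the DHCP wire protocol)."""

    def __init__(self, first: str, size: int) -> None:
        start = IPv4Address(first)
        # Addresses that are currently unassigned, in allocation order.
        self.free = [start + i for i in range(size)]
        # Maps a user name to the address leased for the current session.
        self.leases = {}

    def log_on(self, user: str) -> IPv4Address:
        """Lease an address to the user for the duration of a session."""
        if user in self.leases:
            return self.leases[user]
        if not self.free:
            raise RuntimeError("address pool exhausted")
        address = self.free.pop(0)
        self.leases[user] = address
        return address

    def log_off(self, user: str) -> None:
        """Return the user's address to the pool so that others can use it."""
        self.free.append(self.leases.pop(user))


# Two addresses serve three users, because they are not all on at once.
pool = AddressPool("192.0.2.10", size=2)
alice = pool.log_on("alice")
bob = pool.log_on("bob")
pool.log_off("alice")
carol = pool.log_on("carol")  # receives the address that alice gave back
```

Here the pool holds only two addresses, yet three users are served in turn, which is the saving the text describes.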

Dynamic HTML

One noticeable trend over the last year has been the move towards interactive Web pages. What were once static presentations of words, images and occasionally sounds are now turning into dynamic multimedia experiences. But this new format is bought at a fairly steep price.

In order to achieve a chameleon-like ability to respond in this way, Web page designers are forced to take one of two routes. One involves generating the varying contents and design of the Web page at the Web server. This has the advantage that there is plenty of processing horsepower, but the disadvantage that interaction with the user involves sending information back and forwards across the Internet - with all that this implies in terms of low bandwidth and slow response.

An alternative approach is to use a Java applet or ActiveX control to provide advanced multimedia features in the page. But this too means sending a large number of bytes over the Internet connection - in this case all at once initially instead of spread out over the session.

To get round these problems, a new lightweight kind of interactivity is being developed. In this approach, the processing is carried out at the client end to avoid the delays as data is passed over the Internet, but instead of requiring custom applets or controls to be sent first, the functions required are built in to the Web client as extensions to HTML and scripting.

Through the use of additional elements within the HTML code, Web pages can change in pre-programmed ways, or respond to user input, immediately and directly, without wasting time or Internet resources.

The World Wide Web has been a continuing story of the triumph of ingenuity over the limitations of the basic underlying HTML technology. Perhaps the most dramatic instance of this has been the way that tables, formalised in HTML 3.2, have been deployed in an attempt to give designers the kind of pixel-level control they are used to elsewhere.

HTML 4, now at the draft stage (November 1997), promises to change all that. It offers the most radical extension of the basic HTML idea since its inception, providing total control over both the end-appearance and the underlying structure. Once HTML 4 is finalised, it promises to transform the Web far more radically than anything we have seen before.

Alongside incremental changes such as better support for international character sets (including right-to-left scripts) and improved accessibility for those with disabilities, HTML 4's most important innovation has to do with the way the basic elements of an HTML page are regarded and handled. Hitherto, HTML tags have simply been ways of describing and displaying elements of a document: headings, lists, paragraphs etc. HTML 4 instead regards them as a set of hierarchical objects. This Document Object Model (Dom), still only a rough sketch at the moment, is not just for the purposes of taxonomy: by turning HTML elements into objects, Dom allows them to be manipulated with great flexibility.

This means, for example, that it is possible to redefine how headings will look in a page, not just as it is loaded, but afterwards too. That is, Web pages no longer need to refer back to the Web server in order to change: they can interact with the user by employing Dom and code within the browser. Because of this ability to change and to respond, the name Dynamic HTML (DHTML) has been given to this new kind of Web page. Its full power is realised when Dom is coupled with scripting languages like Javascript or VBScript.

Using these lightweight languages, it is possible to create complex Web pages that possess, for many practical purposes, the full power of ordinary programs. For example, they can respond to events - pressing keys, clicking mouse buttons - redraw their screens and carry out computations just like conventional software. If this sounds too good to be true, it is. Although both Netscape and Microsoft agree that DHTML is the way forward, their implementations of it in version 4 of their browsers differ greatly. This means that, at the moment, developers are faced with the usual difficult choice: to follow either the Netscape or the Microsoft line.

Netscape has a DHTML basic resources page, more detailed technical information and demos.

Microsoft has DHTML pages with heavier stuff and yet more demos.

There is already a pair of excellent books on the different flavours of DHTML: Instant Netscape DHTML (£22.99, ISBN 1-861001-19-3) and Instant IE4 DHTML (£22.99, ISBN 1-861000-68-5).

For those who want to begin coding now, some tools that support DHTML are available. These include Microsoft's Front Page 98, HoTMetaL Pro 4 and Dreamweaver from Macromedia, all with trial downloads. Macromedia has some good DHTML resources.

Will DHTML end the demand for Java?

In many ways DHTML is an extension of the scripting languages already widespread on the Web. The revolutionary part of HTML 4/DHTML is that it allows scripting languages to access anything on the Web page via all the hidden tags not normally seen by users.

Hitherto scripting was limited by when it could be used and what it could be applied to. A good example of its current application is in the validation of input: when forms request information from users, Javascript or JScript (Microsoft's version of Javascript) can be used to check that dates are entered properly, or that E-mail addresses contain the "@" character, for instance.
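
The kind of input checking mentioned above can be sketched as follows. The example is written in Python rather than Javascript purely to show the logic; the function names and the day/month/year date format are assumptions, and the E-mail test is deliberately as simple as the one described (it only checks for text on both sides of an "@").

```python
from datetime import datetime


def is_valid_date(text: str, fmt: str = "%d/%m/%Y") -> bool:
    """Return True if the text is a real calendar date in the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        # Raised for malformed input and for impossible dates such as 31/02.
        return False


def looks_like_email(text: str) -> bool:
    """Return True if there is something on both sides of an '@' character."""
    local, separator, domain = text.partition("@")
    return bool(local) and separator == "@" and bool(domain)
```

A form handler would call these before submitting, and prompt the user again if either returned False.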

With DHTML, Javascript can be invoked at any time, and alter anything on the page. You no longer need to click on objects on a Web page: just passing a mouse cursor over them can trigger events. And using DHTML it is not only possible to change things like fonts and layouts - including the use of superposed and shifting layers - but even the displayed text of a Web document (though the original held on the server remains unaffected).

What is really new about all of this is that there is no reference back to the server for the underlying programming logic. All of this happens on the client, using the scripting code in the Web page and the DHTML engine present in compliant browsers.
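
To make the idea of page elements as scriptable objects more concrete, here is a minimal sketch of a document tree whose elements respond to events entirely on the client. It is written in Python for illustration only: the Element class and its methods are invented for this example and do not correspond to the object model of either browser.

```python
class Element:
    """A page element treated as an object with text, style and children."""

    def __init__(self, tag, text="", **style):
        self.tag = tag
        self.text = text
        self.style = dict(style)
        self.children = []
        self.handlers = {}  # event name -> function to call

    def append(self, child):
        """Attach a child element and return it for convenience."""
        self.children.append(child)
        return child

    def find_all(self, tag):
        """Walk the hierarchy and collect every element with the given tag."""
        found = [self] if self.tag == tag else []
        for child in self.children:
            found.extend(child.find_all(tag))
        return found

    def on(self, event, handler):
        """Register a script to run when the named event occurs."""
        self.handlers[event] = handler

    def fire(self, event):
        """Simulate the browser delivering an event to this element."""
        handler = self.handlers.get(event)
        if handler:
            handler(self)


page = Element("body")
heading = page.append(Element("h1", "Welcome", color="black"))
page.append(Element("p", "Move the mouse over the heading."))


def highlight(element):
    # Change both the appearance and the displayed text after loading.
    element.style["color"] = "red"
    element.text = element.text.upper()


heading.on("mouseover", highlight)
heading.fire("mouseover")  # no round trip to the server is involved
```

After the simulated mouse-over, the heading's colour and text have both changed, while the original document on the server is untouched.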

For businesses with intranets, this means that browsers can take on more complex roles, and provide richer front-ends. Whereas currently they act as a fairly easy-to-use but limited window onto corporate data, with DHTML they can start to interact with that data without needing to call auxiliary programs on either the client or server. This reduces the traffic on the corporate network and also simplifies the installation and maintenance of users' computers since updated versions of the scripting code can be downloaded along with the Web pages.

In this respect, DHTML begins to look suspiciously like Java. It can create highly-functional programs that are downloaded from a server to run on the client, and it will be completely cross-platform (provided there are DHTML-compliant browsers available).

It is this parallelism that has attracted Microsoft's interest. The ActiveX approach has revealed itself as fundamentally unsuitable as a general rival to Java technology on the desktop (though it still has its place on the server side). This would leave Java as the undisputed victor in the thin client stakes were it not for DHTML. Microsoft is now taking the view that what users want is not a completely new approach like Java but an extension of the very widely-used HTML.

This argument has many merits. HTML, it is true, has proved fundamental to the success of the new Internet/intranet/extranet model. HTML is very lightweight and extremely easy to employ, both for developers and users. And it is now under the aegis of the World Wide Web Consortium, a truly independent (and international) organisation.

But there are also some flaws in Microsoft's approach. Although HTML is very easy to code, DHTML is not. It requires an understanding of both a scripting language like Javascript and the new Document Object Model of HTML 4. Putting these two together creates something that is almost as complex as Java, but without the latter's rigour.

And while it is certainly true that many of the early uses of Java to provide advanced interfaces can now be achieved by DHTML, Java has moved on enormously since then. In fact these fairly trivial applets are becoming more and more rare as people realise that they contribute very little other than slowing down the user experience.

As DHTML becomes the standard solution for these kinds of interactive Web screens, Java will continue to flourish elsewhere: as a cross-platform solution, a general approach for reducing the costs of writing and maintaining software, and in entirely new domains - smartcards, embedded systems etc. - that DHTML, for all its power, will never reach.

DHTML Editors

The most dramatic new element of HTML 4 is doubtless the implicit support for Dynamic HTML (DHTML), which allows pixel-level placement of elements, stacked layers, animation and wide-ranging interactivity. But this new power has a high price. In order to use all of the dynamic language's new effects considerable programming is required: basic HTML may have been limited, but at least it was simple to write.

Clearly this creates an opportunity for software houses to come up with Web page editors that allow designers to exploit the power of DHTML while shielding them from some or all of its complexity.

The previous generation of HTML editors is represented by SoftQuad's Hotmetal Pro 4 (http://www.softquad.com/products/hotmetal/). Although this is a good tool for creating rigorous Web pages (its roots lie in the world of Standard Generalised Markup Language, of which HTML is an application), and offers support for cascading style sheets, users will need to add advanced DHTML effects by hand.

Front page freedom
Microsoft's Front Page 98, by contrast, goes out of its way to make the addition of these as easy as possible. For example, to add animations and page transition effects you select from pull-down menus, and all the code is generated automatically and added at the appropriate place in the Web page.

Although real Web professionals will find this approach limiting, for those who simply want to create dramatic Web sites with the minimum of work, it is a good solution, and the latest revision of Front Page offers other refinements too. Front Page is naturally geared towards taking advantage of the particular DHTML effects offered by Microsoft's Internet Explorer 4. mBed's Interactor (www.mbed.com/try.html) is rather more flexible in that you can create dynamic pages aimed specifically at Internet Explorer 4, Netscape's Communicator, Java-based browsers or Navigator with a special mBed plug-in.

Although the code generated depends on the end-environment selected, the process of constructing the dynamic page is the same. Various elements such as images, sounds, buttons, transition effects and paths are added to the basic HTML page. These all have properties that can be inspected (rather like controls in Visual Basic), and can be linked together by choosing from drop-down lists to create simple programs.

This technique allows a wide variety of dynamic effects to be created without the need for hand-coding. However, once again there is a trade-off between ease-of-use and power. In particular, it is hard to edit the underlying code once it has been generated.

If you want rather more control over this while still being able to take advantage of automatic code generation, a better bet might be Macromedia's Dreamweaver. This exposes the details of the properties, behaviour and interaction between the various DHTML elements on a page more explicitly, and so allows you to customise the effects to a greater extent.

Dream machine
Dreamweaver offers a good balance between automatically-generated effects and the ability to change HTML pages by hand. But for the real power user the best tool is probably ExperTelligence's Webberactive 4.0.

This offers little in the way of automatic code generation, but compensates by providing perhaps the most complete support for creating complex dynamic Web pages by hand. For example, along with all the usual facilities for inserting HTML tags, anchors, images, tables and frames, it has a powerful tool called Tag Assistant.

This not only shows all of the attributes for each tag, but allows the former to be altered directly, again, rather like changing properties of a Visual Basic control. You can choose which document type definition (see last week's Net Speak) to follow, and the list of available tags changes accordingly. Stylesheets can also be applied, or created separately. Impressive too is a similar facility that offers the available methods, properties and events for relevant page elements when writing scripts. All in all, Webberactive makes writing DHTML pages from scratch considerably easier.

EDI - Electronic Data Interchange

Security fears have hindered the use of EDI on the Net but new encryption techniques are ensuring users' safety

Internet technologies, as described elsewhere, can be used within a company to provide a complete groupware/EIS solution using a standard TCP/IP network. Exactly the same approach can be used in electronic data interchange (EDI), an area currently the domain of proprietary solutions constituting a small and rather fragmented market.

Despite its vague name, EDI is really about document interchange, and special kinds of documents at that: those involved in commercial transactions such as orders and invoices. EDI is designed to save costs and increase efficiency through faster turnaround times.

A good background document to EDI can be found at http://www.imaginet.co.uk/edi/feature4.htm#ABC, while fuller information is at http://www.ecworld.org/, part of the large holdings of the EDI World Institute (at http://www.ediwi.ca:6900/welcome.html).

The two main standards are X12, mostly used in the US (details from http://www.premenos.com/standards/X12/index/setindex.html) and the more recent global standard EDIFACT (EDI For Administration, Commerce and Transport, available from http://www.premenos.com/unedifact/).

Traditionally EDI has been conducted over special Value-Added Networks, or Vans. This was necessary to offer the required security (vital when financial transactions were involved). However, the downside has been the appearance of a number of rival Vans, resulting in compatibility problems. For example, for a company and one of its suppliers to use EDI both would have to agree on a common Van.

It is against this background that the Internet offers the perfect solution. It is a global, open standard, and so choosing it is a vendor-neutral decision. Most large companies are already attached to the Internet, and so adding EDI functionality does not require any additional communication infrastructure to be created (though obviously software will be required).

There are two main routes for Internet EDI: E-mail, which uses the fact that documents are involved, and FTP, which transfers them as binary files.

The benefits of using the Internet for EDI seem so compelling - no extra costs from using Vans, which traditionally have charged highly for their specialised services; instant, worldwide compatibility; and the ability to integrate EDI into other corporate network functions - that it might be asked why it has not become more common.

Of course, the principal problem of Internet EDI is security: as an open system, the Internet is not exactly watertight, and so sensitive documents cannot be sent over it directly. However, encryption techniques are now well-established for Internet services, and so wrapping up EDI documents appropriately is a straightforward task.

Nor is adding the other EDI paraphernalia that hard: there is both an Internet draft (http://www.va.gov/publ/standard/edifaq/index.htm) and RFC text (RFC 1767 at ftp://ds.internic.net/rfc/rfc1767.txt) that spell out how to do this.

Moreover, there is even a commercial Internet EDI product already available, called Templar (http://www.templar.net/). This has been written by Premenos, perhaps the leader in this field. As a couple of the URLs above show, Premenos also offers useful EDI resources (http://www.premenos.com/).

Complementing these, there is an excellent independent UK site covering all things EDI, set up by Jim Smith at http://www.ibmpcug.co.uk/~jws/index.html. Also worth noting is the IETF-EDI mailing list, which has been established as a forum for discussing methods of operating EDI transactions over the Internet. To subscribe, send the message sub ietf-edi YourFirstName YourSecondName to listserv@byu.edu.
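
The E-mail route described earlier can be sketched with Python's standard email library. RFC 1767 defines MIME content types for EDI objects, including application/EDIFACT, and the sketch below attaches an interchange under that type. The addresses, file name and the interchange itself are illustrative placeholders (the EDIFACT fragment has not been validated against the standard), and no encryption is applied here.

```python
from email.message import EmailMessage

# Placeholder EDIFACT interchange: illustrative only, not a validated order.
edi_payload = (
    b"UNB+UNOA:1+BUYER+SUPPLIER+971101:1200+1'"
    b"UNH+1+ORDERS:D:96A:UN'"
    b"BGM+220+PO12345+9'"
    b"UNT+3+1'"
    b"UNZ+1+1'"
)

message = EmailMessage()
message["From"] = "purchasing@buyer.example"    # hypothetical sender
message["To"] = "orders@supplier.example"       # hypothetical recipient
message["Subject"] = "Purchase order PO12345"
message.set_content("EDIFACT purchase order attached.")

# Label the attachment with the EDI content type so that the receiving
# software can hand it to an EDI translator rather than a human reader.
message.add_attachment(
    edi_payload,
    maintype="application",
    subtype="EDIFACT",
    filename="po12345.edi",
)

attachment = next(message.iter_attachments())
```

In practice the message would be encrypted and signed before being sent, as the discussion of security above makes clear.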

Elliptic curve cryptography

Public key cryptography lies at the heart of almost every secure Internet transaction. And yet it is not used to encrypt the messages themselves, but other encryption keys. These employ symmetric encryption methods: one secret key that both encodes and decodes the message. Public keys are used to send this important element to the recipient, since direct transmission of the symmetric key with the message would obviously nullify security.
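
The two-step arrangement just described can be sketched in a few lines of Python. Everything here is a toy chosen for readability and is emphatically not secure: the RSA-style key pair is built from two small, publicly known Mersenne primes with no padding, and the symmetric step is a simple XOR against a SHA-256 keystream. The function names are assumptions made for the example.

```python
import hashlib
import secrets

# Toy public/private key pair (far too small and structured for real use).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Symmetric step: the same secret key both encodes and decodes."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


def send(message: bytes):
    """Encrypt the message symmetrically, then wrap the key with the public key."""
    session_key = secrets.token_bytes(16)
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped_key, keystream_xor(session_key, message)


def receive(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Unwrap the session key with the private key, then decrypt the message."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)


wrapped, ciphertext = send(b"Please ship 200 units by Friday.")
plaintext = receive(wrapped, ciphertext)
```

Only the short session key passes through the expensive public key operation; the message itself, however long, is handled by the much cheaper symmetric step.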

This curious two-step process arises out of the fact that a price has to be paid for the wonder of asymmetric, public key encryption: it is extremely computation-intensive, and so it is not feasible to encode the entire message by this means. This dual technique has worked well so far, but there are two problems that need to be addressed soon. The first is that computer hardware continues to fall in price and increase in speed, making it necessary to strengthen the encryption employed to maintain the same relative security. This in turn means longer keys, and even longer processing times.

The other problem is that it is not possible to apply current public key encryption systems in situations where there is very little processing power: in smartcards, for example. A solution may be offered by a new public key encryption system called elliptic curve cryptography. Like the currently-used Rivest-Shamir-Adleman (RSA) technique, which employs number theory, elliptic curve cryptography is also based on obscure mathematics, but requires far less processing power. The downside is that it is not yet as well-tested as RSA's approach.
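
For readers curious about the mathematics, the following Python sketch shows the basic operation involved: adding points on a curve defined over the integers modulo a prime, and using repeated addition to agree a shared secret. The tiny curve (y^2 = x^3 + 2x + 2 modulo 17), the starting point and the two secret numbers are assumptions picked to keep the arithmetic visible; real systems use enormously larger values, and the function names are invented for this example.

```python
P_MOD, A, B = 17, 2, 2   # toy curve: y^2 = x^3 + 2x + 2 over integers mod 17
G = (5, 1)               # agreed starting point on the curve


def add(p, q):
    """Add two curve points; None stands for the 'point at infinity'."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # a point plus its mirror image
    if p == q:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def multiply(k, point):
    """Compute k * point by repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


# Each side keeps a secret number and publishes secret * G.
alice_secret, bob_secret = 3, 7
alice_public = multiply(alice_secret, G)
bob_public = multiply(bob_secret, G)

# Both arrive at the same shared point without ever sending their secrets.
alice_shared = multiply(alice_secret, bob_public)
bob_shared = multiply(bob_secret, alice_public)
```

The security of the real thing rests on how hard it is to work backwards from a published point to the secret number when the curve is very large.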

Electronic Business

Software companies have had to take great pains to keep up with the development of Internet technologies within business. For some years corporate re-engineering was all the rage, and software companies were happy to encourage this trend in the hope of some useful sales along the way.

Ironically, these same companies have more recently had to reinvent themselves in order to cope with the uptake of Internet technologies within business. As with the earlier spate of re-engineering, the process has often been difficult, and not all companies have managed the transition gracefully.

One company that has managed to do so, despite its late start, is Microsoft. Since 1995 or so the company has passed from Olympian indifference through amateurish first efforts to an extremely shrewd understanding of the Internet's implications and potential - and of how to work the Net, in all senses.

Refocusing
It has achieved this by refocusing the entire company on this area, and making the Internet central to just about everything it does. In the process, it has become a paradigm of Internet re-engineering.

But other software companies have not been so lucky as to have a powerful leader capable of recognising and remedying major blunders through swift and decisive action. Instead, they have had to work out piecemeal - and often slowly and painfully - their own accommodation with the new Internet world.

Rather ironically, IBM's task was easier than it might have been. For a company that had already lost the leadership of the software sector to Microsoft, the Internet represented as much an opportunity as a threat.

Recognising the shift in business caused by Internet technologies, IBM has cleverly taken the occasion to unify much of its rather sprawling product line under the catchy banner of e-business. It even invented its own 'e' symbol based on the by-now familiar @.

Of course, this has more to do with marketing than technology, but the e-business brand, with an optimistically long list of products, has allowed IBM to update most of the company in a fairly organic way.

Its Lotus subsidiary has also been surprisingly successful in this sphere. It has weathered the seemingly fatal challenge represented by open intranet software to its proprietary groupware Notes, by employing a clever migration policy, notably through the Domino product line.

Another company whose core business was undermined by the arrival of the Internet, Novell, has fared less well. It took too long to recognise that TCP/IP would supersede its own SPX/IPX protocols. With Intranetware it started to move towards Internet protocol seriously, but that process is still not complete. Still, Netware 5 should remedy that.

However, the inroads made into its market by Windows NT, which has always provided good TCP/IP support, may now be too deep to halt.

Keeping aloof
Oracle is an interesting example of a company that has been able to stay aloof from all these great technology battles, since databases can work as adjuncts to more or less any approach. However, the company has wisely chosen to bolster its position through a series of ancillary products that are designed for the Internet world.

These include an application server, commerce server, video server and proxy server.

A common thread to all these re-engineered companies is Java.

IBM has the ambitious San Francisco project, Novell an increasing commitment to Java - not to mention one of Java's godfathers, Eric Schmidt, running the company - while Oracle has become the leading proponent of Java-based network computers.

Electronic Commerce

The first generation of business Internet applications were little more than online poster sites. A basic Web server provided information about a company and its products, but apart from wandering around the usually rather thin holdings, there was not much a visitor could actually do. The next generation added connectivity to back-end databases, which in turn permitted Web pages to be generated on the fly. In this way product information could be completely up-to-the-minute and customised for each visitor.

But even these sites were relatively primitive. They certainly did not mirror the full reality of how a company functions. In this respect, the latest releases of the leading electronic commerce software packages are a real advance and represent perhaps the first third-generation business Internet applications. Their complexity stems from the richness of the processes they seek to model. As a result, although the Web server forms the portal to the electronic commerce site, it is now a relatively minor part of the whole solution.

With these products, the simple Web-database tie-up of second generation Internet software has become one involving databases for customers, products, inventory, accounts and so on. Many of these may well be further linked to other legacy systems. At the front-end of the site, alongside the dynamic generation of customised pages (either using standard languages such as Javascript, or through proprietary server-side scripting tools) there are important merchandising techniques that must be accommodated.

These include intelligent cross-selling and multiple discount schemes (depending on the buyer's history). Similarly, some kind of electronic shopping trolley to hold pending purchases needs to be provided. Once a sale has been closed, the ordering process must be set in train, involving inventory controls (checking that goods are still available and automatically notifying purchasing managers if stocks are low), sending confirmations (perhaps by E-mail) and invoices or receipts (by post, if necessary).
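
The checkout steps just described can be sketched as follows. This is a minimal illustration in Python, not a model of any of the products discussed: the Store class, the 10% repeat-buyer discount, the reorder level and the prices (held in pence to avoid rounding problems) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    prices: dict                  # item -> price in pence
    stock: dict                   # item -> units available
    reorder_level: int = 2        # hypothetical threshold for low stock
    alerts: list = field(default_factory=list)

    def checkout(self, trolley: dict, repeat_customer: bool) -> int:
        """Close a sale: check stock, take goods, flag low stock, discount."""
        # Check that everything in the trolley is still available.
        for item, quantity in trolley.items():
            if self.stock.get(item, 0) < quantity:
                raise ValueError(f"not enough stock for {item}")
        total = 0
        for item, quantity in trolley.items():
            self.stock[item] -= quantity
            total += self.prices[item] * quantity
            # Notify the purchasing manager when stocks run low.
            if self.stock[item] <= self.reorder_level:
                self.alerts.append(item)
        # Discount that depends on the buyer's history (assumed to be 10%).
        if repeat_customer:
            total = total * 90 // 100
        return total


store = Store(prices={"modem": 9900, "cable": 500},
              stock={"modem": 3, "cable": 10})
total = store.checkout({"modem": 2, "cable": 1}, repeat_customer=True)
```

A real package would follow this with confirmations, invoices and a link to the accounts system, but the sequence of checks is the same.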

At some point, monies must change hands. This implies the use of secure transactions (via SSL) at the very least, and may also mean offering the newly-established SET protocol to allow completely secure credit card transactions. One option to ease the process for users is to create an electronic wallet that resides on the client machine and stores information such as credit card details and home and office delivery addresses.

And as well as tying in with the vendor's existing accounting system, electronic commerce software may well have to cope with multiple taxation schemes (depending on where the supplier and purchaser are located). Another complex issue is shipping: purchasers will expect various options, and the pricing for these will probably change according to their location.
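
Location-dependent tax and shipping amount to table look-ups, as this small Python sketch suggests. The rates and charges shown are hypothetical figures invented for the example, not real tariffs, and amounts are again held in pence.

```python
# Hypothetical tax rates in basis points (1750 = 17.5%), keyed by country.
TAX_BASIS_POINTS = {"UK": 1750, "US": 0}

# Hypothetical shipping charges in pence, keyed by country and service.
SHIPPING_PENCE = {
    ("UK", "standard"): 300,
    ("UK", "express"): 900,
    ("US", "standard"): 1500,
    ("US", "express"): 4000,
}


def order_total(subtotal_pence: int, country: str, shipping: str) -> int:
    """Add the tax and shipping that apply to the purchaser's location."""
    tax = subtotal_pence * TAX_BASIS_POINTS[country] // 10000
    return subtotal_pence + tax + SHIPPING_PENCE[(country, shipping)]


uk_total = order_total(10000, "UK", "standard")
us_total = order_total(10000, "US", "express")
```

The same order therefore produces different totals for different purchasers, which is why commerce packages need these schemes built in rather than hard-coded into each page.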

Clearly, then, electronic commerce solutions meeting most or all of these requirements are extremely complex programs, and represent a major advance beyond previous Internet offerings. What is also interesting is how many of the leading products are now quite reasonably priced, and also available for Windows NT. That is, online commerce may soon be an option for every company, even relatively small ones.

In fact, even for those organisations that would never contemplate selling their products on the open Internet (perhaps because they serve only a local market, and shipping elsewhere would be prohibitively expensive), this software may still become a vital tool. The reason for this is the rise of extranets. These secure linked intranets mean that business-to-business selling (perhaps using the Open Buying on the Internet standard, discussed recently) is likely to become as important as sales to the end-user.

Indeed, it is significant that one major electronic commerce player, Actra (home page at http://www.actracorp.com/) - formerly a joint venture with electronic data interchange specialist GE Information Systems, now wholly owned by Netscape - is concentrating almost exclusively on this sector.

Saving money to make more dosh

There's no need for major investment to get a commercial Web site going: there are a host of cheap new products out there to help.

Once upon a time, setting up a Web site that sold goods or services would typically mean having to invest hundreds of thousands of pounds in bespoke software and an expensive Unix machine. Today the total cost for an out-of-the-box solution running Windows NT can be a few thousand pounds with no loss in quality.

O'Reilly's Website Professional 2.0 costs just £499, yet the range of software provided is astonishing. In addition to the basic Web server, there is a built-in search engine facility, a site-mapping program, the Home Site HTML editor, Perl 5 and even the programming language Python.

For electronic commerce, there is support for Java server applets, Microsoft's Active Server Pages, and another server-side scripting technology called iHTML. This employs non-standard HTML tags that are processed by the iHTML engine: because they reside purely on the server, and generate ordinary HTML, they work with any Web browser.

They form the basis of the bundled merchant server application: this is a powerful solution, but does mean that, as with Website, you have to be prepared to code to add advanced features. However, Website can be recommended for its Swiss army knife approach to site creation, its exemplary manuals and unbeatable value. A trial version can be downloaded from http://website.ora.com/wspro2/demo_form_frame.html.
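
The general idea behind server-side tags of the kind described above, where non-standard tags are expanded on the server so that the browser receives only ordinary HTML, can be sketched in Python. The x-price tag and the catalogue below are invented for illustration and are not iHTML syntax.

```python
import re

# Hypothetical product data that would normally come from a database.
CATALOGUE = {"modem-336": "£99.00"}


def render(template: str) -> str:
    """Replace each invented <x-price sku="..."> tag with the current price."""
    def expand(match):
        return CATALOGUE.get(match.group(1), "unavailable")
    return re.sub(r'<x-price sku="([^"]+)">', expand, template)


page = render('<p>Modem: <x-price sku="modem-336"></p>')
```

Because the substitution happens before the page leaves the server, the visitor's browser never sees the special tag and needs no extra software.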

For those looking for more help when building commerce sites, iCat's Electronic Commerce Suite 3.01 is as good a place to start as any. This takes a completely different approach from Website, one built around store templates. The idea is to shield users from the underlying HTML. Like Website, iCat too uses its own server-side language, called iCat Carbo Command language, to build complex catalogues from databases.

iCat's greatest strength is on the merchandising side: for example, it is ideal for cross-selling and creating discounted offers. However, not everyone will take to its rigid Web-based approach: while this makes it easy to get up and running, anything out of the ordinary means plunging into the Carbo Command language. Luckily, the manuals are good.

The professional version of iCat's product running under NT costs £6,995. Another solution taking a very similar approach is Intershop Online 2.0. As with iCat, the basic unit is the template into which data is dropped. Unfortunately the demo version allows little of the underlying technology to be observed, so it is hard to judge how malleable it is. Intershop Online for NT costs $4,995 (£3,122).

The firms mentioned so far are relatively small players, a reflection of the immaturity of the electronic commerce sector. This makes IBM's Net.commerce 2.0 all the more interesting. The installation process is a rather complicated business, but the end result is remarkably usable.

This is not least because IBM too has adopted a Web-based approach, and has topped everything off with a dash of Java for creating templates, to great effect.

One product that is still to make it on to Windows NT is Oracle's Internet Commerce Server: this should be available sometime next year. The other major player in the NT electronic commerce sector is, of course, Microsoft, and its rather unusual solution, the Siteserver bundle.

Microsoft aims to cash in through E-commerce

E-commerce will really take off in 1998. At least, that's what market research reports say. Microsoft must certainly be hoping this is true. With the Commerce Edition of its Site Server product (home page at www.microsoft.com/siteserver/commerce/default.asp), it has come out with its richest offering in the E-commerce field yet.

The Commerce Edition takes an agglutinative approach: you must have the standard version of Site Server installed first before adding E-commerce extras.

The product aims to serve three markets: internal company sales; sales to the public; and business-to-business sales. Internal sales are largely about authorisation processes, and the Commerce Edition is designed to address this. For external sales, a major addition is an advertising module. Microsoft has once again engulfed an entire fledgling application market by bundling the necessary software with other products.

Fine control
The Ad Server allows fine control over the ad size and placement, and gives plenty of management information about customer response. Other advanced online shopping features include intelligent cross-sell. This allows goods to be offered on the basis of what the individual customer has bought now or in the past, or on what customers with similar tastes have purchased.

Far less obvious, but in many ways far more crucial, is the fact that the selling process is handled by Microsoft's Transaction Server. This works in conjunction with the Order Processing Pipeline, already present in version 2 of the Commerce Edition of Site Server.

Extremely clever
The Order Processing Pipeline's visual representation of the various elements of the purchasing process is extremely clever. But version 3 goes even further. Because each of the elements in the pipeline is an object, its interaction and the overall control of the ordering process can be left to the Transaction Server.

In this way, the integrity of the purchasing transaction is automatically guaranteed without the need for further programming. Such transactional checks are vital for serious E-commerce applications, and the marriage of Site Server with Transaction Server is a shrewd move.

The transactional capabilities of Commerce Edition are important, but they are not Site Server's most innovative element. Alongside the Order Processing Pipeline, Microsoft has introduced the Commerce Interchange Pipeline. This is an extension of the Order Pipeline to business-to-business commerce. It allows firms working together across virtual private networks to automate the exchange of information, and to use Transaction Server to ensure integrity.

This is a very important move for Microsoft. It takes it squarely into the extranet domain that Netscape has hitherto both defined and dominated. What makes Microsoft's solution more impressive is that it is explicitly designed to use extensible markup language (XML) as one of the possible data exchange formats.

True to form
The idea is that whatever the form of the databases used by firms taking part in an extranet, the information they hold can be converted to XML files without losing any of the databases' structure. Also, one database structure can be mapped automatically on to a different one using XML as the supplier-independent translation medium.
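The idea of XML as a neutral translation medium can be sketched in a few lines of Python. This is only an illustration, not how Site Server does it: the table names, the column names and the mapping between them are all invented for the example.

```python
import xml.etree.ElementTree as ET

def record_to_xml(table, record):
    """Turn one flat database row (a dict) into an XML element."""
    row = ET.Element(table)
    for column, value in record.items():
        ET.SubElement(row, column).text = str(value)
    return row

def remap(element, new_table, column_map):
    """Rebuild an element using another firm's table and column names."""
    out = ET.Element(new_table)
    for child in element:
        ET.SubElement(out, column_map[child.tag]).text = child.text
    return out

# Hypothetical order record held by one firm in the extranet.
order = {"sku": "A-100", "quantity": 3}
xml_a = record_to_xml("order", order)

# Hypothetical mapping on to a partner's differently named structure.
xml_b = remap(xml_a, "purchase", {"sku": "part_no", "quantity": "qty"})

text_a = ET.tostring(xml_a, encoding="unicode")
text_b = ET.tostring(xml_b, encoding="unicode")
print(text_a)
print(text_b)
```

The structure of the record survives the round trip because each column becomes a named element; only the names change in the mapping step.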

The importance of the Commerce Edition of Site Server 3 is not just that it offers many powerful tools for conducting online commerce. Rather, it shows that Microsoft has addressed perhaps the biggest gap in its portfolio - extranets - and leap-frogged the competition by embracing the future standard in this area: XML.

Security alerts

Manufacturers are rushing out fixes to a minor flaw in a key encryption standard. None the less electronic commerce could find it falls prey to a much greater danger. After many years of scepticism, business finally seems to accept that not only can the Internet be used by companies, but that it will be an important medium for commerce itself. But a recent event has highlighted the extreme fragility of the entire E-commerce edifice.

The announcement by a researcher at Bell Labs that he had found a flaw in the Public Key Cryptography Standard number 1 (PKCS 1) sounded about as exciting as the discovery of a kind of beetle by entomologists. But the implications of this apparently academic exercise in cracking codes are important.

To understand what PKCS 1 is, and why it matters, we need to revisit the world of public key encryption as applied to commerce. The crucial advantage of public key cryptography is that it is not necessary to exchange a private key for the purposes of encryption beforehand.

Instead, it is possible to use two keys, one private and one public.
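The two-key idea can be seen in a toy version of the prime-number method, usually known as RSA. The sketch below uses tiny textbook primes chosen purely for illustration; real keys use primes hundreds of digits long, and real systems add padding such as that defined in PKCS 1.

```python
# Toy RSA with tiny primes. Illustration only: these numbers offer no security.
p, q = 61, 53
n = p * q                  # public modulus, 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (needs Python 3.8 or later)

def encrypt(m, public_key):
    """Anyone holding the public key can encrypt a number m < n."""
    exponent, modulus = public_key
    return pow(m, exponent, modulus)

def decrypt(c, private_key):
    """Only the holder of the private key can recover the original number."""
    exponent, modulus = private_key
    return pow(c, exponent, modulus)

public_key = (e, n)    # published freely
private_key = (d, n)   # never leaves its owner

message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext, recovered)  # 2790 65
```

The private exponent is never sent anywhere: the sender needs only the public pair, which is why no secret has to be exchanged beforehand.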

Secure protocol
This cryptological breakthrough allowed Netscape to come up with the secure sockets layer protocol. This enables secure channels to be opened between Internet clients and servers.

Secure sockets layer, in its turn, has become the basis for practically all today’s E-commerce.

The flaw in PKCS 1 matters because secure sockets layer uses this standard. A way of cracking messages sent using PKCS 1 means a way of reading nominally secure messages sent over a secure sockets layer Internet connection. Credit card details, confidential information and all the rest are, as a result, theoretically liable to electronic eavesdropping.

RSA is the company that owns the patents on these cryptography methods. They host a number of pages on the topic: an index page, a good question-and-answer document, and a more technical explanation.

In practice the problem is not enormous, because the researcher’s technique consists of sending literally millions or billions of special messages to a server, and using the errors the latter returns to decipher an encoded text. In practical circumstances, it is highly likely that such an attack would be noticed well before it was successful.

Even though the risk is largely theoretical, many software manufacturers have rushed out fixes, among them Netscape (notice how just about every product is affected), Microsoft and C2.

Others, such as IBM and its subsidiary Lotus, have downplayed the affair. RSA itself will be coming out with a revised version of the cryptography standard protected using something called Optimal Asymmetric Encryption Padding.

Trust is the key
The reason for the concern is that e-commerce will succeed only if the majority of users trust the underlying security. Even though the flaw in the cryptography standard is more apparent than real, the worry is that it might shake confidence.

In fact the ramifications of this incident go much further. As described above, the cryptography standard vulnerability filters down through the use of secure sockets layer to affect most secure servers. But underneath the cryptography standard is something even more fundamental: the public key techniques themselves.

Remarkably, there are only two basic ways of doing public key encryption: one based on prime numbers; the other on what are known as elliptic curves. Both derive their strength from the fact that brute force attacks against them would theoretically take millions of years.

But if one day some shortcut techniques are discovered (or have already been, secretly) the entire e-commerce house of cards will come tumbling down. And that will not be so easy to fix as the current minor problem.

Microsoft's real goal: E-commerce

Conventional operating systems are on the way out, but Microsoft is ready for the next big thing - online commerce. Whatever the final outcome of the current legal action by the US Department of Justice against Microsoft, it will soon be more or less irrelevant. Assuming that the case is not thrown out completely, the impact of a judgement involving Windows 98 or even NT5 will be undermined by the diminishing relative importance of conventional operating systems.

This may occur through attrition by the Open Source movement in general, and Linux in particular, or it may be because the main focus for software will move on to embedded systems, hybrid communication-computer units and other new digital devices - all areas where Microsoft has yet to gain a significant presence.

But the Seattle-based giant did not become the world's most valuable company through blindly clinging to one market-leading product line. If the firm has a watchword it is adaptability, as its successful U-turn on the Internet in December 1995 proved.

In fact the new, post-Windows Microsoft is already taking shape, out on the Internet, in the form of substantial if largely unsung successes in the E-commerce sphere. This group of sites is potentially so important that they deserve to be tracked regularly by anyone with an interest in online commerce.

Thinking ahead

The first sites appeared almost two years ago, and were already ahead of their time. Expedia and Microsoft Investor provided intermediation services - respectively in travel and finance - that were far more sophisticated than the other product-based offerings of better-known E-commerce sites such as Amazon.com and CDNow.

Where the latter essentially had huge databases of things, which they sold, the Microsoft sites offered relevant information that linked in to services and products provided by third parties.

Microsoft has since added Carpoint, which, alongside background information about particular models, sells them by putting potential buyers in touch with the nearest dealers holding models of interest and linking to relevant used car classifieds. More recently it has created a new Web site to act as a mediator between house buyers and sellers with Home Advisor.

Together with these monothematic sites, there are also a couple of more general E-commerce projects. Sidewalk offers local information, and related classified advertising, while The Microsoft Plaza is an online mall, with merchant sites for different categories.

Cross-promotion

As you might expect, the Plaza lists Microsoft's own E-commerce sites where possible. But such cross-promotion is also omnipresent on the single-theme sites. In fact, in E-commerce, as for the Internet, Metcalfe's law applies: the value of a network increases as the square of the number of nodes. The more E-commerce sites Microsoft has, the more cross-promotion it can run.
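As a rough illustration of why each extra site matters more than the last, the sketch below compares the Metcalfe value with a simple count of possible promotions. The assumption that every site can promote every other site is ours, made only for the example.

```python
def metcalfe_value(nodes):
    """Relative network value under Metcalfe's law: proportional to n squared."""
    return nodes ** 2

def cross_links(sites):
    """Site-to-site promotions possible if every site can link to every other."""
    return sites * (sites - 1)

growth = [(n, metcalfe_value(n), cross_links(n)) for n in (2, 4, 6)]
print(growth)  # [(2, 4, 2), (4, 16, 12), (6, 36, 30)]
```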

Of course, media firms do this all the time. What is novel is the instant linking that the Internet provides, allowing visitors to be carried straight to where they can buy at precisely the moment they are most likely to do so.

Microsoft's E-commerce sites may be young, but they are already very successful. For example, Carpoint drives $200m (£122m) of sales a month, while Expedia has weekly sales of $5m.

And Microsoft has barely begun. With the exception of Sidewalk, all the sites are US-centred. But the essence of the Net - to say nothing of Microsoft's ambition - is global, and Expedia, at least, is poised to offer its services elsewhere.

Given the trend for one online site to become dominant in a given sector, Microsoft may well soon become the main conduit for entire sectors of E-commerce.

As a result, these sites may be the beginning of a new Microsoft bigger than anything so far. Let's hope the Department of Justice is watching them.

E-mail

E-mail represents a cultural shift away from working methods based around telephone calls, letters and face-to-face meetings. E-mail combines the immediacy of speech with the convenience of the written word. Like letters, E-mail (nearly) always arrives: no engaged tones, no telephone tag. But as with telephone conversation, there is a tendency in E-mail to react immediately, to write - as you would speak - without thinking too much about the words or overall form.

This leads to the biggest problem of E-mail, the fact that while it reads as a transcript of speech with all the benefits of spontaneity and informality that implies, it lacks the vital ancillary clues usually accompanying conversation. In particular, the tone of voice and non-verbal signals sent by facial expressions or body language are missing. All too often this generates misunderstandings, rash responses and the escalation of E-mail to what is called flaming: raw outpourings of emotion rather than a reasoned reply. If there is even a faint possibility that your words will be misunderstood by somebody, they almost certainly will be. Writers of E-mail should read what they have written from the standpoint of their hardest critic, and wait before sending the message to give themselves a chance to re-read what has been written. What seemed witty at the time of writing may look pretty foolish when considered more objectively.

Comments can be added to explicitly hint about your written intent or smileys can be used. See the section Smileys.

How to retrieve binary files by Internet E-mail

The standard for Internet E-mail specifies messages composed of alphanumeric characters that can be represented by seven bits of a byte, whereas binary files use eight. A technique of encoding is used where three eight-bit characters are put together and re-written as four six-bit characters. The latter are admissible within an E-mail message, and can be sent over the Internet.

On reception they are converted back to the original group of three bytes by reversing the process. The intermediate encoding is quite arbitrary, provided a common standard is agreed. The most common one is uuencoding.
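The three-into-four regrouping is easy to see in code. The sketch below is a minimal hand-rolled version for a single group of three bytes, not a full uuencode implementation; the function names are our own. It relies on the usual uuencoding convention of adding 32 to each six-bit value to reach a printable character.

```python
import binascii

def uuencode_group(three_bytes):
    """Encode exactly three 8-bit bytes as four printable characters."""
    b0, b1, b2 = three_bytes
    bits = (b0 << 16) | (b1 << 8) | b2          # 24 bits in total
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Adding 32 moves each 6-bit value into the printable ASCII range.
    return "".join(chr(32 + s) for s in sextets)

def uudecode_group(four_chars):
    """Reverse the process: four characters back to three bytes."""
    bits = 0
    for ch in four_chars:
        bits = (bits << 6) | ((ord(ch) - 32) & 0x3F)
    return bytes([(bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF])

encoded = uuencode_group(b"Cat")
print(encoded)                   # 0V%T
print(uudecode_group(encoded))   # b'Cat'

# Python's own encoder adds a length character in front and a newline behind.
print(binascii.b2a_uu(b"Cat"))   # b'#0V%T\n'
```

A complete uuencoded line carries up to 45 bytes in this way, preceded by one character giving the byte count, which is the '#' (32 + 3) in the last line of output.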

So, to retrieve the binary file pkz204g.exe held at the site ftp.tex.ac.uk you would send the message:

connect ftp.tex.ac.uk
binary
uuencode
chdir ctan/tex-archive/tools/pkzip/
get pkz204g.exe
quit

Two new commands have been added: binary tells the public FTP site it will be sending a binary file, while uuencode tells the FTPmail server to send the file on to you as an E-mail message using the uuencoding technique. The request is sent in the usual way to ftpmail@doc.ic.ac.uk.

The utility uudecode is available on most UNIX systems. PC users will need a standalone program such as Wincode or Winzip that does the decoding.

When sending encoded files FTPmail servers usually break them up into messages of 64,000 characters or less. On receipt you must then rename each message sequentially: file1.uue, file2.uue, file3.uue, etc. The uudecode utility is intelligent enough to find subsequent files after the first one.
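For those without a uudecode utility to hand, the reassembly can be sketched in Python. This is a minimal illustration rather than a replacement for uudecode: it assumes the mail headers have already been stripped from each part, that the parts are supplied in order, and that each part was split at a line boundary. The function name decode_parts is our own.

```python
import binascii

def decode_parts(parts):
    """Join uuencoded message parts in order and return (filename, data)."""
    lines = "\n".join(parts).splitlines()
    filename, data, inside = None, bytearray(), False
    for line in lines:
        if not line.strip():
            continue                      # blank lines between parts
        if not inside:
            if line.startswith("begin "):
                # Header line looks like: begin 644 pkz204g.exe
                filename = line.split(" ", 2)[2]
                inside = True
            continue
        if line == "end":
            break
        data += binascii.a2b_uu(line)     # decodes one encoded line
    return filename, bytes(data)

# Build a small two-part example to decode (made-up data, not a real file).
payload = bytes(range(100))
body = "".join(
    binascii.b2a_uu(payload[i:i + 45]).decode("ascii")
    for i in range(0, len(payload), 45)
)
text_lines = ("begin 644 demo.bin\n" + body + "end\n").splitlines(keepends=True)
parts = ["".join(text_lines[:2]), "".join(text_lines[2:])]

name, decoded = decode_parts(parts)
print(name, len(decoded))
```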

Digging for Data at the Internet's Core

Many mailing lists have a secondary function as a store for information. Searches of these archives, for lists that use the Listserv software, can be carried out by E-mail. The document that is sent when you join a mailing list will give details of the searchable archive, if any. Normally you will have to obtain the basic reference guide.

To retrieve the guide send the message info database to the address listserv@irlean.ucd.ie (or any other Listserv mailing list address that you know of or have joined). The main syntax is quite straightforward. As an example if you were using Netscape and found difficulties with the URLs that you entered, leading to a DNS look-up failure you might hope that others on the Netscape mailing list had similar problems and perhaps even solutions. To find out, you would send the message

// Database Search DD=Question
//Question DD *
Search DNS OR 'lookup failure' in Netscape
Index
/*

to the address listserv@irlean.ucd.ie. This requests that the Listserv program searches for either of the phrases 'DNS' and 'lookup failure' in the archive of postings to the Netscape list. The Index command simply asks for the results. In due course you will get something like:-

Item#  Date     Time  Recs  Subject
------ -----    ----- ----- -------
000106 94/11/20 21:17   162 New Netscape beta ...
000728 94/01/23 06:08    26 Re:DNS lookup failure...(followup)(solved..sor+

etc.

To view item number 728 you would send the following command:

// Database Search DD=Question
//Question DD *
Search DNS OR 'lookup failure' in Netscape
Print all of 728
/*

to the same address. The only difference is the 'Print all of' command.

How to access the World Wide Web using Internet E-mail

At first sight it might appear a hopeless task to try to use E-mail to access the World Wide Web. After all, one of its most important facets is the hypertextual nature of its structure. And yet there is indeed a site that will send back to you Web documents upon receipt of your e-mail request. As usual, no charge is made for this ingenious service.

The syntax could hardly be easier: you simply send a message of the form

send http://www.tardis.ed.ac.uk/~paola/inetuk/providers.html

to the address listserv@mail.w3.org. That is, you just place the complete URL of the Web page you are interested in (including the http://) after the word send. In the example above, you would then receive back a document that begins

UK Service Providers UK SERVICE PROVIDERS This page was last updated on 25 Apr 1995. There are also automatically generated summary[1] and  detailed[2] lists from when they were posted to  Usenet on 25 Apr 1995. Send additions and corrections to inetuk@arcglade.demon.co.uk[3]. See the section on the required entry format[4] before submitting. Go back to the Internet index[5].

etc.

As you look through the e-mail document you will notice numbers in square brackets: these correspond to the hypertext links (the hotspots) that exist in the original Web page. At the bottom of the E-mail message there are 'footnotes' corresponding to these numbers which contain precisely the URL referenced by that hotspot. It is therefore possible, by sending another E-mail to the listserv@mail.w3.org address, to follow any of these links that exist in the original hypertext document.
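Following a link by hand soon becomes tedious, and the look-up can be sketched in Python. The exact layout of the footnotes is not shown here, so the sketch assumes each footnote sits on its own line as a bracketed number followed by the URL; the sample text and its example.org address are invented for the illustration.

```python
import re

# Assumed footnote layout: "[5] http://..." alone on a line.
FOOTNOTE = re.compile(r"^\s*\[(\d+)\]\s+(\S+)\s*$")

def footnote_urls(message_text):
    """Map each footnote number to the URL that follows it."""
    urls = {}
    for line in message_text.splitlines():
        match = FOOTNOTE.match(line)
        if match:
            urls[int(match.group(1))] = match.group(2)
    return urls

def follow_link(message_text, number):
    """Build the one-line request that fetches the page behind a hotspot."""
    return "send " + footnote_urls(message_text)[number]

# Hypothetical reply from the server, cut down to one link.
sample = """Go back to the Internet index[1].

[1] http://www.example.org/index.html
"""

request = follow_link(sample, 1)
print(request)  # send http://www.example.org/index.html
```

The resulting line is what you would then mail to the same address to fetch the linked page.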

For example, say you wanted to find out more about Compulink Information eXchange's WWW page, which has reference [94] in the above document. Using the corresponding footnote, you would therefore send the following message (derived from the URL given in footnote 94):

send http://www.compulink.co.uk/CIXemo.html

to listserv@mail.w3.org. This would then return text beginning

Overview of CIX (Compulink Information Exchange) Overview of CIX - Europe's largest computer conferencing system CIX (pronounced 'kicks') is a highly popular, British based, conferencing system, providing access to hundreds of conferences etc.

In this way, any Web page can be retrieved, and any link within it (including those to FTP and Gopher sites, though obviously not to interactive telnet addresses). Although the multimedia element is missing from the messages returned, this technique does at least allow you to access the content of Web pages in a remarkably simple way. If you wish to find out more about this service, send the message

www

to listserv@mail.w3.org as before.

How to track someone down on the Internet

One of the principal achievements of the Internet has been the bringing-together of more than 30 million people in a seamless web of personal communication that embraces one-to-one E-mail, one-to-many Usenet newsgroups and many-to-many interactions like Internet Relay Chat. It is therefore somewhat ironic that there is no kind of centralised directory for this huge collection of users where you can find out how to contact one of them. Although the Internet probably gives the general user direct access to more information than any other system previously invented, reliably finding out something as apparently simple as an e-mail address is almost impossible.

This is not because there are no Internet white pages - directories of users and their Internet locations: it is more that the many attempts to map out who uses the Internet are woefully limited, covering particular areas well, but little outside those domains.

A good example of an ambitious but so far unsuccessful attempt to create a central source of information is provided by the X.500 service. This highly-formalised system does indeed provide much in-depth information about people held on its servers around the world, but the coverage offered is only a small proportion of the total user population. X.500 is mainly found in Europe, particularly in academic establishments, where its rigour seems to appeal. More about the system including ways of accessing it can be found at http://www.earn.net/gnrt/x500.html.

If you are looking for someone based in the US, a better system to try is Whois. Like X.500, this adopts a client-server model with a distributed database; unfortunately, also like X.500, it has been taken up mainly in universities. Although the information that it holds is rather skimpier than that found in the X.500 directories, Whois is easier to use. A list of Whois databases is available from ftp://sipb.mit.edu/pub/whois/whois-servers.list.

The Whois system has at least attained a certain prominence in the Internet world as one of the main white pages; the same cannot be said about the Computing Services Office (CSO) servers and their gnomically-named client Ph (short for phone) which is used only very sporadically. Once more, this service is mainly found at academic establishments.

In the absence of a centralised database of users, a number of alternative approaches have been developed. For example, the Netfind program asks you to specify likely sites or type of organisation and then carries out searches using a list of possible candidates that it generates from that input. This makes Netfind a convenient tool if you have an approximate idea of where the person you are looking for is located. It can be accessed at telnet://monolith.cc.ic.ac.uk, logging in as netfind. There is also a World Wide Web gateway at http://alpha.acast.nova.edu/netfind.html.

Knowbot Information Service, also known as KIS or just Knowbot, adopts an interesting technique that shows what might one day be achieved in terms of drawing together all these disparate threads. It is not so much a directory service in its own right as a front-end to various pre-existing databases, including X.500 and Whois. Its big advantage is that it will interrogate those sources automatically, which saves you logging on separately; its disadvantage is that it carries out its search blindly and unstoppably, and cannot be fine-tuned by further intervention from the user. Knowbot can be accessed at telnet://info.cnri.reston.va.us; no login is required, but you are asked to leave your E-mail address in its visitor's book.

Junk E-mail

Spamming (see separate section) has been around for a while. Junk E-mail has taken longer to arrive because it is logistically harder to set up. It requires hundreds of thousands of E-mail addresses to be gathered (either by hand, or using software), while spams are trivial to carry out. But where spams effectively are washed away in the deluge of messages found in most Usenet newsgroups, junk E-mail by contrast pollutes one of the most personal aspects of using the Internet, your E-mail in-tray.

E-mail represents one of the most direct ways of reaching somebody - even more direct than letters or telephone calls. Junk E-mail is therefore a particularly personal affront in terms of wasting time and abusing a service.

These new junk E-mailers gather their lists from many sources, principally Usenet newsgroups. It is also possible that other mailing lists and Web sites requiring registration sell on their lists, though they should give you the option to block this. Since E-mail addresses must generally be public to be useful, it is hard to stop them falling into the hands of the unscrupulous.

In terms of response, you can try replying to junk E-mail with a message asking where they obtained your name. If this is not bounced by the system (as it often is), you might then pass a few idle moments sending a few more such messages. If junk E-mailers become flooded with junk replies from enough of their victims, it becomes difficult for them to function.

If your mail is bounced, you can try writing to the postmaster at the address the junk E-mail originated from (for example postmaster@abc.com if the sender's E-mail address ends abc.com). Every site must have such a postmaster, so it should generally be possible to reach the offending company in this way. However, there are various tricks that junk E-mailers can use to hide their identity further. In this case your only option is to direct your comments to any advertisers carried in the junk E-mail, pointing out how counterproductive their approach is. Unfortunately all of these actions require yet more of your time - something that has already been wasted enough by this new and regrettable Internet plague.

Industry acts to clear the Net of junk E-mail

Junk E-mail has become one of the most annoying problems of the Internet. It is therefore worthwhile keeping track of the latest developments in this area, not least in the hope of a solution. A good summary of the issues from a business point of view has been put together by the Internet Mail Consortium. There are also general junk E-mail/spam resources.

One way of fighting junk E-mail is to use existing legislation, notably that regulating pyramid schemes and other frauds. Recently, the US Federal Trade Commission and similar bodies around the world have started to act against such schemes. New legislation is also under consideration in the US. One bill has been promoted by an anti-spam organisation called the Coalition Against Unsolicited Commercial E-mail, and details can be found at http://www.cauce.org/why.html#smith.

Another anti-junk E-mail site offers good analysis of this and other similar bills.

Self-regulation
An alternative approach is self-regulation, and this has been adopted by the Canadian Direct Marketing Association, which forbids its members from employing junk E-mail without the user's consent. Of course, the problem with this and similar schemes is that it has no effect on the more dubious junk E-mail merchants who are not members of any such associations, and who will continue regardless.

Unfortunately, it is unlikely that legislation will succeed in controlling them either: on the Net, it is only too easy to move outside jurisdictions that have anti-spam laws. Similarly, some of the more extreme online vigilante action - whereby Internet service providers that harbour junk E-mailers are subjected to various forms of harassment - will ultimately only drive spammers off-shore. What is needed is a technological solution that can be applied by the user.

One obvious approach is to apply filters to E-mail. This can be done by the service provider, using lists of known offenders, or the more sophisticated system from the Mail Abuse Prevention System. Or it can be done locally, on the user's machine. Most advanced E-mail programs offer a facility whereby E-mail can be pre-sorted and even automatically deleted according to the sender, heading and contents.
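A local filter of this kind amounts to a handful of rules applied to each incoming message. The sketch below is a minimal illustration using Python's standard email package, not the method of any particular mail program; the blocked address and trigger phrase are invented for the example.

```python
from email import message_from_string

BLOCKED_SENDERS = {"bulk@abc.com"}        # hypothetical list of known offenders
BLOCKED_PHRASES = ("make money fast",)    # hypothetical trigger phrases

def classify(raw_message):
    """Return 'delete' or 'keep' for one raw mail message."""
    msg = message_from_string(raw_message)
    sender = (msg.get("From") or "").lower()
    subject = (msg.get("Subject") or "").lower()
    body = msg.get_payload()
    # Multi-part messages return a list; this sketch only reads plain text.
    body = body.lower() if isinstance(body, str) else ""
    if any(address in sender for address in BLOCKED_SENDERS):
        return "delete"                   # rule on the sender
    if any(p in subject or p in body for p in BLOCKED_PHRASES):
        return "delete"                   # rule on the heading or contents
    return "keep"

junk = "From: bulk@abc.com\nSubject: Hello\n\nBuy now\n"
scheme = "From: someone@example.org\nSubject: MAKE MONEY FAST\n\nAsk me how\n"
friendly = "From: friend@example.org\nSubject: Lunch\n\nSee you at noon\n"

print(classify(junk), classify(scheme), classify(friendly))  # delete delete keep
```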

Cloak and dagger
Unfortunately, junk E-mailers are only too adept at hiding their intentions, at least in subject lines. They also use fake E-mail addresses to cloak the origin of these messages.

None the less, it is possible to dig out addresses with a little work, or by using programs such as Spamhater, which not only finds originators, but also has template letters of varying anger for firing off to the offenders. But reacting to junk E-mail is a waste of the user's time, whether in setting up filtering rules or adding new domains to be blocked.

What is needed is a way of checking the credentials of the sender automatically; digital certificates are one solution. Once these become widespread - as they will for authentication and authorisation on intranets and extranets - they can be used to establish with certainty the provenance of E-mail.

Users might choose to accept only those certificates already known to them, or to accept any certificate from a reputable certification agency that promises to repudiate certificates of anyone who indulges in spamming. This would get around the problem of off-shore junk E-mailers, as they would still need certificates from well-known agencies.

Free E-mail

Free E-mail functions as an extension of the dominant economic model used on the Internet whereby Web site content, for example, is provided free in return for viewing and possibly responding to banner advertising. Similarly with free E-mail, accounts are provided on the basis that various forms of advertising will be displayed along with the mail messages you receive.

The pioneer in this field was Juno, which employs proprietary software and a collection of dedicated dial-up points. Users do not pay even for the phone-call to download their mail, but this rules out use by non-US nationals since there are currently no overseas dial-up points. Its main rival, Hotmail, takes a different approach: displaying E-mail messages as Web pages. This has a number of advantages. First, you do not need proprietary software to view your mail: any browser will do.

Secondly, and perhaps most importantly, you can retrieve your E-mail from anywhere in the world that you can connect to the Internet - which also means it is available for non-US users. The third advantage (for Hotmail, at least) is that the advertising which pays for the service can be displayed along with the E-mail message just as it would within a conventional Web page.

Irrelevant advertising
One downside is the fact that to enrol for Hotmail you must give various kinds of personal information, including your income. This is so the adverts placed within the Web pages displaying your E-mail can be targeted. This is a good idea in principle, since it means users are not troubled with the kind of irrelevant advertising that bedevils the world of junk E-mail.

None the less, users are rightly chary about giving up intimate personal details of this kind - whatever the privacy guarantees offered. Of course, once an idea takes off, others are quick to emulate. An interesting development among these second-generation free E-mail services is they tend to be used as adjuncts to existing Web sites.

For example, for the user directory Four11 a free E-mail service was a natural addition. The fit was also good when the search engine Yahoo bought Four11. In an attempt to establish itself as a one-stop shop on the Internet, Yahoo has constantly been adding ancillary services, and free E-mail was an obvious candidate.

Less intrusive
So obvious that one of Yahoo's main rivals, Excite, has done the same. The advantage of using Yahoo's and Excite's services is that the information required from you is less intrusive, and users may prefer to take this route rather than divulging more details to Hotmail. As well as the benefits of zero cost and accessibility from anywhere in the world, it is worth mentioning one other motive for setting up an extra E-mail account.

Since all of these services depend critically on their users reading their E-mail - so that adverts are seen, and advertisers are happy to pay - companies offering free E-mail tend to be among the most vigilant when it comes to fighting junk E-mail. They try to filter out the most infamous offenders, and often offer extra tools to allow users to sort their mail still further.

As a result, free E-mail services may well be the best for public use on the Internet - in Usenet postings, say, or for the many free services or trial software schemes that require an E-mail address. Rather than giving out your main E-mail details - and running the risk of having your inbox flooded with junk - you could use a secondary account just for this purpose. Your personal address could then be reserved for the use of close business associates, friends, family etc.

Dangers of E-mail

Readers will need no encomiums about E-mail, or the benefits that can accrue from its use in business. It is against this background of E-mail as a key business tool that a warning note is required. For E-mail, as currently employed, could turn out to be a business's worst - yet secret - enemy.

To see how this might be so, consider the US Department of Justice anti-trust action against Microsoft. The case itself has been made almost unbearable through the tedious - but undoubtedly purposive - pedantry of Microsoft's lawyers as they question every statement made by government witnesses.

Fortunately, these long stretches of legalistic nit-picking have been punctuated by moments of high drama when one side or the other has, as if from nowhere, pulled out old and highly embarrassing E-mails that undermine what the witnesses have just said. Aside from the schadenfreude of seeing the computer industry's powerful and rich tripped up, it is interesting that none of the leading companies seems immune from this error.

And if some of the shrewdest operators in the computing world have remained blind to the threat such incriminating messages represent, then it seems almost certain that most other companies using E-mail also harbour similar legal time-bombs just waiting to explode in some future court case.

Convenience

It is not hard to see how this situation has arisen. E-mail is so convenient that once an intranet is in place electronic messages quickly become the norm. E-mail is simple to create, and to send to as many as hundreds of recipients. Worse, it is easy to store.

The falling price of hard discs means that every worker can store every message they have ever sent or received - just in case. Moreover, there is also likely to be a consolidated corporate store of messages.

And these stores are just what a lawyer or police officer with a suitable court order will be looking for. Applying a decent intranet search engine to such a collection of messages will make it easy to locate any combination of incriminating words (such as "polluted" and "Java", say). What would have been impossible 10 years ago - searching through millions of documents by hand - can now be accomplished in a second or two thanks to intranet technology.

So companies need to consider carefully which E-mail messages are kept, and for how long. There is a strong argument that the default should be for all E-mail to be deleted as soon as possible unless there is a good reason - or legal requirement - for keeping it.

And certainly the decision to keep E-mail should be taken centrally, with input from company lawyers, and not left to individuals.

Ironically, XML will only make matters worse. It allows files in proprietary formats to be turned into text files just like E-mail - and makes searching through them even simpler. Drawing up an E-mail policy now will make it easier to deal with these later challenges.

Encryption key length

The main encryption technologies all have at their heart the public key technique that allows an encryption key to be sent in an open way (typically with the message itself) without jeopardising the security of the encrypted contents.

This key consists of an extremely large number used in the mathematics that makes the public key approach possible. The size of that number (expressed as the number of binary digits required to describe it) is known as the key length. The bigger the key size, the larger the number, and the more secure the encrypted message.
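The relationship between the size of the number and the key length can be sketched in a few lines of Python. This is only an illustration: the two numbers below are made up for the purpose, the function names are hypothetical, and the lengths of 40 and 128 binary digits are assumptions chosen to contrast a short key with a long one, not figures taken from the text.

```python
def key_length(key: int) -> int:
    """Return the number of binary digits needed to write the key."""
    return key.bit_length()


def keyspace_size(bits: int) -> int:
    """Return how many different numbers can be written in this many bits."""
    return 2 ** bits


# Hypothetical keys: real ones are generated by encryption software.
short_key = 2 ** 39 + 12345    # needs 40 binary digits
long_key = 2 ** 127 + 12345    # needs 128 binary digits

print(key_length(short_key), key_length(long_key))

# Each extra binary digit doubles the number of possible keys, so the
# longer key has 2**88 times as many candidates to try.
ratio = keyspace_size(key_length(long_key)) // keyspace_size(key_length(short_key))
print(ratio == 2 ** 88)
```

The point of the sketch is simply that the work facing someone who tries every possible key grows exponentially, not linearly, with the key length.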

The recent news that a message encrypted with the Netscape browser in its secure mode using these techniques had been cracked is a consequence of the fact that the key length employed in non-US versions of the software is relatively small.

This in turn is as a result of US export laws that forbid the general sale abroad of products with full-power encryption (since such systems are classed as munitions, of all things). The much larger key length in the US version of Netscape remains, for the moment, beyond the computational reach of any known computer, but unfortunately also unavailable to UK users. There is some hope that the recent success in cracking the smaller key length will allow at least somewhat more powerful versions to be exported.

Further information on cryptography and the main encryption tool on the Internet, Pretty Good Privacy (PGP), can be found at http://draco.centerline.com:8080/~franl/crypto.html. The cryptography page (at http://draco.centerline.com:8080/~franl/crypto/) has links to background information on the subject, and to digital cash. The PGP link (at http://draco.centerline.com:8080/~franl/pgp/) has many good links to other pages, including one explaining where you can get PGP (at http://draco.centerline.com:8080/~franl/pgp/where-to-get-pgp.html), a complex subject because of bizarre US munitions laws that forbid the export of elements contained within PGP products, even though these are now freely available around the world. Even if they weren't, they could easily be written from scratch - see the URL http://draco.centerline.com:8080/~franl/rsa-guts.html for an amusing demonstration of this.

Related to this issue is the plight of PGP's author, Phil Zimmermann. Because copies of PGP are now found outside the US, Zimmermann is under investigation for an alleged breach of the US export laws. More about this and his European Legal Defence fund can be found at http://draco.centerline.com:8080/~franl/pgp/phil-defense-fund-europe.html.

ECMAScript

One of the many confusing aspects of the Internet for newcomers is the strange dynamics that rule its operation. For example, there is no governing organisation that establishes correct practices, much less any structure for enforcing them, and yet there are indeed standards on the Internet: it is simply that they evolve by general agreement, and their acceptance is equally consensual.

Nobody is forced to use the Domain Name System, or HTTP; but without the former your Internet sites will be invisible, and without the latter they will not be accessible by tens of millions of Web clients. For this reason, the issue of standards has become a kind of Holy Grail for companies selling Internet and intranet products, an indispensable ingredient in the marketing mix.

This manifests itself in curious ways. For example, both Microsoft and Netscape (the two main players in the standards game as they are on the Internet) habitually trumpet the fact that they submit their approaches to the Internet Engineering Task Force or World Wide Web Consortium as if that by itself were enough.

More recently, there has been an even more extreme manifestation of the standards bug. Companies are now turning over their products to independent bodies in an attempt to demonstrate once and for all that they represent "real", open standards.

Microsoft's ActiveX is under the auspices of the Open Group, Sun is trying to establish Java in a similar way, and now there is ECMAScript. This is the new incarnation of Netscape's JavaScript (and Microsoft's JScript) as a "true" standard, under the auspices of Ecma, originally the European Computer Manufacturers Association.

Enfopol

One of the collateral benefits of the current Internet share frenzy is that the Net is no longer regarded, as it once was, simply as a playground for terrorists and pornographers. There is an increasing consensus that it is emerging as the central medium not just for global communications, but for worldwide commerce, too. However, governments are acutely aware that the rise of this supranational medium poses a unique threat to their old powers of surveillance and control.

In fact, there are already operations that aim to undermine the Internet's power. It will probably come as something of a shock to most readers to learn that "within Europe, all e-mail, telephone and fax communications are routinely intercepted by the United States National Security Agency, transferring all target information from the European mainland via the strategic hub of London then by satellite to Fort Meade in Maryland via the crucial hub at Menwith Hill in the North York Moors of the UK".

This statement can be found in the executive summary of an extremely detailed and ultimately rather dispiriting report on political control prepared by the Omega Foundation in Manchester and presented to the European Parliament's Scientific and Technical Options Assessment (STOA) panel. The full report can be found online.

As if this were not bad enough, there are plans to extend this surveillance to include all European Internet transmissions. The plans are known by the suitably Orwellian name of Enfopol 98, and there is a site devoted to the subject. A time-line detailing how the Enfopol 98 plans came about and have been progressing towards realisation is available.

The original Enfopol 98 draft resolution in German, and an early revision in English, are the most explicit about what the Enfopol plans entail, while the most recent version has adopted a more guarded formulation.

However, as the explanation of the current situation details, one of the requirements of the Enfopol legislation is for "Internet Service Providers to set up high-security interception interfaces inside their premises. These interception interfaces would have to be installed in a high-security zone to which only security-cleared and vetted employees could have access." This would allow law-enforcement agencies to gain access to any Internet communication, and aims to provide them with all the relevant IP addresses, passwords and e-mail addresses of any Internet session in more or less real time.

The Telepolis site, which houses the Enfopol papers, is to be lauded for making these important documents available and for providing useful commentary on them. This is in stark contrast to the European Parliament's site which is so labyrinthine that it is almost impossible to track down any of the relevant papers, despite their enormous importance to the public and business.

One of the few explicit references to the Enfopol plans (officially known as "Lawful interception of telecommunications") can be found here, recording that the draft resolutions were adopted by the European Parliament on 7 May 1999.

Given both the US origins of the entire Enfopol project, and the ways in which the US has already abused other such eavesdropping systems for commercial advantage, it seems extraordinary that such important and damaging resolutions should have been passed by the European Parliament without input from business, or public debate.

Extranet

The idea behind extranets - that the corporate intranets of business partners can be linked tightly together to create a secure network allowing closer co-operation - is clearly appealing.

However, the size of the challenge facing those who wish to create such an extranet should not be underestimated. It implies a complete review of the intranet technologies used by all the companies concerned.

It is significant that one of the few extranets to have been announced is Netscape's: only a company right at the forefront of Internet technologies stands much chance of realising the extranet vision at this early stage.

Alongside the logistical problems of connecting different networks together is a more subtle issue that needs to be addressed. Even after the basic extranet plumbing is in place, there is the question of how the data that flows across it should be structured.

For simple applications, such as accessing internal Web sites or LDAP directories, this is not a problem; but for more complex applications, such as intercompany sales and supply, the issue of data format becomes paramount.

Fortunately, an initiative begun last year by the Internet Purchasing Roundtable may provide a ready-made solution. The initial impetus for what is now known as the Open Buying on the Internet (OBI) standard was to ease the purchase of goods across the Internet in the business-to-business arena.

Specifically, it was aimed at high-volume, low-cost purchases, especially those in the area sometimes known as MRO (maintenance, repair and operation).

Because unit cost is low, it is important to hold infrastructure costs down. This, in turn, implies a widely adopted, open standard with many players, so that competition keeps margins tight. OBI was created with the backing of major companies, such as Ford, and financial institutions, such as American Express, which supported the initial work.

The basic idea behind OBI is relatively straightforward. A corporate requisitioner accesses a supplier's system using a Web browser across a network to view a catalogue of supplies. Once the requisitioner has chosen the required items, the electronic billing is handled automatically. However, OBI goes into rather more detail than this superficial analysis might suggest.

For example, it specifies that the catalogue seen by each buyer must be tailored specifically for them, perhaps generated dynamically from a supplier's back-end database. It must have search facilities and a list of frequently ordered items to expedite the selection process.

A user profile database holds information about the buyer, such as authorisation limits, billing and shipping preferences, to supply defaults in on-screen forms.

A receipt of order must be generated by the process to allow full audit, and many different payment options have to be supported, including bulk invoices and cheques, EDI invoices and electronic transfer methods.

Security is a key element of OBI: contract prices must remain confidential, and all transactions must be protected through encryption. Digital certificates are also vital for establishing the identity of the parties.

Sensibly, OBI is based on current Internet standards, such as HTML for content display, SSL for secure Internet communications, X.509 for digital certificates and SET for credit card transactions.

But the standard looks to the future too: it specifies that OBI applications must support international use and not be dependent on US-only technology. OBI has its own home page, and there are excellent and extremely full explanations of the standard available from the OBI Library.

Although complex for such low-level transactions, OBI seems likely to catch on, not least because Microsoft, Netscape (through its part-ownership of E-commerce company Actra) and Oracle have all said they will be supporting it in future products.

Moreover, as extranets are implemented, companies will find that they need precisely the structures defined by OBI to maximise the benefit they obtain from them.

FAQ

Frequently Asked Questions (FAQs) are documents that attempt to encapsulate fundamental knowledge about the Internet in various areas. They are not official publications in any sense, but have been put together by public-spirited Internet users as an aid to newcomers. They grew out of the Usenet newsgroups where there is a natural tendency for people just joining to ask the same basic questions. Usenet FAQs are posted periodically to relevant newsgroups, and can also be obtained from reference sites by FTP.

To obtain a list of available documents from the main Usenet FAQ site at the Massachusetts Institute of Technology, send the following message to ftpmail@doc.ic.ac.uk:

  connect rtfm.mit.edu
  chdir pub/usenet-by-group/
  dir
  quit

In due course a list of over 1,000 FAQs will be E-mailed to you. To receive a particular FAQ, say the one posted periodically to the newsgroup alt.winsock, send the following message to ftpmail@doc.ic.ac.uk:

  connect rtfm.mit.edu
  chdir pub/usenet-by-group/alt.winsock
  dir
  quit

This retrieves a list of the various files held in the pub/usenet-by-group/alt.winsock directory. You need this to give the correct name of the FAQ file you want to retrieve. The FTPmail message sent in response to the above commands is as follows:

  -rw-rw-r--14 root 3 109099 Nov 3 01:34 comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_1_of_3
  -rw-rw-r--14 root 3 87289 Nov 3 01:34 comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_2_of_3

These unwieldy names can then be used to retrieve the FAQ in question by sending to ftpmail@doc.ic.ac.uk messages such as the following:

  connect rtfm.mit.edu
  chdir pub/usenet-by-group/alt.winsock
  get comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_1_of_3
  quit

FAQs are not limited to technical or computing areas, but exist for most subjects.

A better way to obtain FAQs is from http://www.cis.ohio-state.edu/hypertext/faq/usenet . A Web front-end has been added to help you move around.

A list of more or less all the anonymous FTP servers in existence is found at http://hoohoo.ncsa.uiuc.edu:80/ftp-interface.html. The list includes a short description of what can be found at each site and hotspot linking for instant access.

A repository of FAQs can be found at http://www.cis.ohio-state.edu/hypertext/faq/usenet/.

File Transfer Protocol

One of the central features of the Internet is the ability to transfer files between computers connected to it. The standard that handles this is the File Transfer Protocol (FTP). Note that ftp in lower case forms the beginning of many Internet addresses.

Many sites allow so-called anonymous FTP, as they offer huge stores of free or shareware programs.

Anonymous FTP means you do not have to be a registered user of a site to access it. Instead, when the log-on prompt appears, enter the word anonymous or ftp as a generic user-name. A password is usually requested. Internet etiquette requires you to give your full E-mail address so that those running the anonymous FTP site have a record of users.

Whether you are admitted to such a system depends upon local off-peak hours and the number of permitted anonymous visitors. Once admitted, you will be faced with a directory structure. To see its topmost layer type dir. Each site is unique but usually there is a README file (read_me, index.txt, or similar). This often gives information about the directory structure. There is generally a directory called /pub which is a public directory containing downloadable files.
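The same log-on sequence can be carried out from a program. The following is a minimal sketch using Python's standard ftplib module; the function names are hypothetical, and the host and E-mail address shown in the closing comment are placeholders rather than a real site.

```python
from ftplib import FTP


def anonymous_credentials(email: str) -> tuple[str, str]:
    """Generic user-name plus the visitor's E-mail address as the password."""
    return ("anonymous", email)


def list_top_level(host: str, email: str) -> list[str]:
    """Log on anonymously and return the names in the topmost directory."""
    user, password = anonymous_credentials(email)
    with FTP(host, timeout=30) as ftp:      # open the connection to the site
        ftp.login(user=user, passwd=password)
        return ftp.nlst()                   # names only, much like typing dir


# Example call (placeholder host and address, so it is left commented out):
# for name in list_top_level("ftp.example.org", "reader@example.com"):
#     print(name)
```

A busy site may still refuse the connection if too many anonymous visitors are already logged on, in which case ftplib raises an error that the caller would need to handle.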

Imperial College's anonymous FTP site (ftp://src.doc.ic.ac.uk) has more than 30 Gbytes of free software. Of particular interest is src.doc.ic.ac.uk/computing/systems/ibmpc, which contains DOS, win3, win95, nt and simtel sub-directories. Each contains index files (in zipped and lst form) which are worth downloading and marking up off line. The same shareware files are available from ftp.informatik.uni-muenchen.de.

Microsoft's anonymous site is ftp://ftp.microsoft.com. Instead of organising in sub-directories Microsoft includes several thousand files in one directory, so using a browser to view the directory is not efficient. Enter the name of the file in the URL or use an FTP application.

Chinese software is available from ftp://ifcss.org/pub/software/.

It is possible to retrieve FTP-accessible files using basic E-mail that can send and receive messages over the Internet. Some sites accept requests for FTP files via E-mail. They then retrieve the files, using the standard FTP function, and forward them to the source E-mail address of the original request, all free of charge.

The requests are sent in the form of a list of simple commands that tell the FTPmail site where the file is and what it is called. A typical set of instructions would be as follows:

  connect ftp.src.doc.ic.ac.uk
  chdir computing/systems/ibmpc/win95 list/
  get index.txt
  quit

In the above example, the FTPmail site is being asked to connect to src.doc.ic.ac.uk, change directory and get the file index.txt. The quit command marks the end of the sequence. To see a full list of the possible commands available, send the message help to ftpmail@doc.ic.ac.uk. This is also the address to which you send your request for a file. Note that each command should be on a new line, obtained by pressing the Return/Enter key; you can leave the subject line blank or input an identifying name to help you recognise the file on its return. The FTPmail server will then process your request, fetch the file and send it to you. Within a few hours, or perhaps a day, the requested file should appear in your mailbox.

Files available by FTP will often be given in the following format:

ftp://ftp.demon.co.uk/pub/archives/uk-Internet-list/inetuk.lng

(technically known as a Uniform Resource Locator or URL), and also without the initial ftp://. To use the FTPmail facility you must break this up into its component parts, e.g.

  connect ftp.demon.co.uk
  chdir pub/archives/uk-Internet-list/
  get inetuk.lng
  quit

Eventually you will receive a list of Internet providers with details of who provides what and for how much.
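Breaking an FTP address into FTPmail commands is mechanical enough to sketch in code. The helper below is hypothetical (its name and behaviour are assumptions for illustration, not part of any FTPmail server), and the address it is given is assembled from the host, directory and file name of the example above.

```python
import posixpath
from urllib.parse import urlsplit


def ftpmail_commands(url: str) -> list[str]:
    """Turn an FTP address into the lines of an FTPmail request."""
    # Accept the address with or without the initial ftp://
    parts = urlsplit(url if "://" in url else "ftp://" + url)
    directory, filename = posixpath.split(parts.path.lstrip("/"))
    commands = [f"connect {parts.hostname}"]
    if directory:
        commands.append(f"chdir {directory}/")
    commands.append(f"get {filename}")
    commands.append("quit")          # marks the end of the sequence
    return commands


address = "ftp://ftp.demon.co.uk/pub/archives/uk-Internet-list/inetuk.lng"
# One command per line, ready to paste into the body of the E-mail message
print("\n".join(ftpmail_commands(address)))
```

The output would then be sent as the body of a message to the FTPmail address, with each command on its own line.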

Note that the files used in the examples are in ASCII or pure text format; binary files cannot be downloaded this way.

Firefly

Microsoft's announcement that it is to buy the company Firefly (April 1998) might at first seem surprising. Firefly is a typical Internet start-up, with products in the new and as-yet rather unproven area of online privacy: see the privacy resources.

It has developed a scheme it calls Passport, which enables users to create, free of charge, a profile for use online. When used at Passport-enabled Web sites, it allows personalised information or services to be offered automatically while giving users control over what personal information is conveyed to those sites.

Exploit information
On the supplier side, Firefly has a number of products designed to exploit the data held in Passport profiles: Passport Office, Passport Network Hub and Catalog Navigator.

These are all very interesting ideas, and certainly privacy is likely to become an increasingly important area as E-commerce takes off. But on its own it is hardly reason enough for Microsoft to spend a sizeable sum on acquiring the whole company. Indeed, Microsoft already has considerable expertise in this area, as its privacy and profiling submission to the World-Wide Web Consortium indicates.

Firefly is of interest not so much for its present products as for the work it is doing on an important future World Wide Web Consortium standard called the Platform for Privacy Preferences, known as P3P. This represents the generalisation and extension of a privacy proposal put together by Firefly and Netscape, called the Open Profiling Standard. However, the platform is much more ambitious, and has the backing of both Microsoft and Netscape.

It is the relatively long-standing (by Internet standards, at least) relationship between Firefly and Netscape that makes Microsoft's acquisition more interesting.

Stranglehold
Through its purchase of Firefly, Microsoft gains a stranglehold on an emerging technology and removes one of Netscape's allies. For, however diligent Firefly may be in continuing to work with Netscape, the latter will hardly be happy about revealing its secrets to a part of the Microsoft empire.

But Microsoft's move goes even further than this. Firefly was also one of the prime movers of the Information & Content Exchange (ICE) proposal. This is a way to allow the automatic but controlled exchange of business information. For example, one Web site may wish to draw in constantly changing data from other information providers: ICE offers a framework for doing this.

What is noteworthy about the proposal is that it is based on the new Extensible Markup Language (XML). Put this together with Microsoft's work on Extensible Style Language, its submission on XML-Data (and Net Speak) and XML-based E-mail threading, and you have a situation whereby the company almost totally dominates what many regard as the biggest advance since HTML.

The acquisition of Firefly, with its expertise in the Platform for Privacy Preferences - yet another major application of XML - sees Netscape cut off from a partner and potentially isolated from the single most important Internet development in recent years.

Firewalls

There are business advantages to be gained by a company connecting to the Internet. Unfortunately, there is also a downside to be considered, one that stems from the very nature of this global network.

The Internet is essentially democratic in that all points on it are equal: there are no centres. A related property is its symmetry: when you connect a computer to the Internet, the Internet also connects to you, and while the connection is in place any of the several tens of million users can access your machine - and probably those networked to it - in a variety of ways.

Clearly for businesses with confidential information stored on machines attached to their corporate networks, this is a worrying prospect. But of course it is not a new problem - it is implicit in the original design of the Internet - and there is now a well-established defence.

This generally goes by the name of the firewall, by analogy with the physical obstacles that are placed to halt the advance of fire within a building. Similarly, Internet firewalls are designed to block the spread of the digital fire that licks around networks in the form of unauthorised access. When potential intruders attempt to gain entry to a corporate system attached to the Internet the firewall stops them safely outside the internal network. At the same time, suitably configured, firewalls will still allow company users to access the Internet (were this completely blocked you might as well disconnect from the Internet completely).

At its simplest, the firewall is created by a separate computer that acts as a kind of cyberbouncer, rejecting unauthorised accesses. Unauthorised in this context might mean that it comes from the wrong Internet addresses (or, more usually, not from one of those held in a file of acceptable Internet addresses), using the wrong Internet port (which is generally associated with a particular Internet service like FTP or telnet, and which may have known weaknesses) or a combination of these.

These packet filters (also known as screening routers) work by examining individual TCP/IP packets for basic network information. Alongside them are the application gateways that allow Internet services and their users to be controlled more directly. An example of this approach is the proxy server running on a firewall computer which retrieves information on behalf of an internal user, and then transfers the results of the query to the originator. In this way, only the proxy machine is visible to the outside world, and can be well-armed against attempts to subvert it.
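The decision a packet filter makes can be illustrated with a short sketch. Everything here is an assumption made for the example: the Packet class, the permit function, the permitted address range (a block reserved for documentation) and the choice to block the FTP and telnet ports. A real screening router examines many more fields than these two.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    source: str   # Internet address the packet claims to come from
    port: int     # destination port, which identifies the service wanted


# Hypothetical file of acceptable Internet addresses
ALLOWED_SOURCES = [ip_network("192.0.2.0/24")]
# Ports for services with known weaknesses (here FTP and telnet)
BLOCKED_PORTS = {21, 23}


def permit(packet: Packet) -> bool:
    """Accept a packet only if its address is listed and its port is allowed."""
    source = ip_address(packet.source)
    from_known_address = any(source in network for network in ALLOWED_SOURCES)
    return from_known_address and packet.port not in BLOCKED_PORTS


print(permit(Packet("192.0.2.10", 80)))    # listed address, permitted port
print(permit(Packet("192.0.2.10", 23)))    # listed address, telnet blocked
print(permit(Packet("203.0.113.5", 80)))   # address not on the list
```

Note that the default is to reject: a packet gets through only if it matches the list of acceptable addresses, which mirrors the "more usual" arrangement described above.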

As the above indicates, the principle of the firewall is almost trivial: it is the details of the implementation that are crucial. For this reason there is no substitute for practical help and advice from practising firewall administrators, and one of the best places to obtain both is from the Firewalls mailing list. To subscribe to the digest form, send the message

  subscribe firewalls-digest

to the address majordomo@greatcircle.com.

    Now is a very good time to join: after several months of near-inactivity there has been a flurry of activity on this list, with several digests being sent out every day. For those who wish to catch up on some of the background to the subject, compressed back issues are available by FTP from the URL ftp://ftp.greatcircle.com/pub/firewalls/digest/.

    Also highly recommended are two books on the subject. The first, Firewalls and Internet Security by Cheswick and Bellovin (price £20.95, ISBN 0-201-63357-4) is the accepted classic in this area. As well as much hands-on detail, it is notable for containing a gripping tale of how the authors fought off the attempts of an intruder to enter their system, complete with logs of the epic battle. More sedate, but even more practical is Internet Firewalls and Network Security by Siyan and Hare (£32.49, ISBN 1-56205-437-6).

    Among the well-known names already in this area are Sun (http://www.sun.com:80/sunexpress/europe/catalog/uk_english/parts/FIR-121-B.html) and Digital (http://www.digital.com:80/info/Customer-Update/950601004.txt.html). Less familiar in this country, but well established in the US are ANS+CORE (http://www.ans.net/Products?interLock/InterLockBrochure.html) and BBN Planet (URL at http://www.bbnplanet.com/doc/spatrol/spchart2.html) part of one of the very first companies to work on the Internet.

    Other firms include Border Software (http://www.border.com/contents.html); Checkpoint (http://www3.checkpoint.com/intro-ver.2.0.html); Harris (http://www.hcsc.com/trusted/cyberguard_bulletin.html); Livingston (http://www.livingston.com/); Trusted Information (http://www.tis.com/docs/Products/gauntlet.html); Secure Computing (http://www.sidewinder.com/sidewinder.html); Raptor (http://www.raptor.com/prodinfo/ds/eagle/eagle.html); and Virtual Open Network (http://www.v-one.com/product/swall/oview.html).

    One of the best firewall resources is British, and can be found at http://www.zeuros.co.uk/firewall/. There are useful white papers about the basic concepts at http://www.zeuros.co.uk/firewall/mustread.htm. Particularly good is the Firewalls FAQ at http://www.zeuros.co.uk/firewall/mirror/www.v-one.com/pubs/fw-faq/faq.htm.

    Fonts

    Languages

    One of the great attractions of the World Wide Web for businesses is its global reach. But this apparent universality is something of an illusion: it is, indeed, easy to reach customers in other lands, but it is very hard to speak to them in their own language. Most people probably know that some "foreign" characters can be created in a Web page using numeric codes - for example &#233; produces an e-acute. It is also possible to enter entities such as &eacute; to produce the same effect. All of these generally refer to the basic character set formally known as ISO-8859-1 or ISO Latin-1 (see http://babel.alis.com:8080/codage/iso8859/jeuxiso.htm).
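The equivalence of the numeric and named forms can be checked with Python's standard html module, which decodes character references much as a browser does. This is just a sketch; it relies on the fact that e-acute occupies position 233 in ISO Latin-1.

```python
import html

# Numeric reference (decimal position in the character set) and named entity
numeric = html.unescape("&#233;")
named = html.unescape("&eacute;")

print(numeric, named, numeric == named)

# In ISO Latin-1 the character is stored as the single byte 233 (hex E9)
print(numeric.encode("iso-8859-1"))
```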

    It is generally assumed that Web pages are irredeemably wedded to this character set. But in fact HTML consists simply of a stream of bytes; what these bytes represent is arbitrary. One obvious solution to the problem of representing other character sets is therefore simply to re-define what those bytes mean. For example, instead of ISO-8859-1 (Latin-1) the ISO-8859-2 set could be employed. This gives all of the "normal" Latin characters, but replaces some of the Western accents with those required for Central and East European languages such as Czech, Hungarian, Polish and Slovenian.
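A short Python sketch shows what "re-defining what those bytes mean" amounts to in practice. The two byte values are arbitrary picks from the upper half of the 8-bit range; the same pair is simply read under each character set in turn.

```python
# Two bytes from the upper half of the 8-bit range, chosen arbitrarily
raw = bytes([0xB1, 0xE8])

for charset in ("iso-8859-1", "iso-8859-2"):
    # The bytes are identical; only the table used to interpret them changes
    print(charset, raw.decode(charset))

western = raw.decode("iso-8859-1")   # plus-minus sign, e-grave
central = raw.decode("iso-8859-2")   # a-ogonek, c-caron
```

Nothing in the byte stream itself says which table is intended, which is why the browser has to be told, or has to guess, the character set of a page.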

    To work in any of these languages, you need to have installed a font that offers the extra characters and the ability to change to this font in your browser. How to do this depends on the Web browser: next week's feature will give some practical details. For the Cyrillic alphabet there is the ISO-8859-5 set. However, for historical reasons most Web sites have chosen a rival encoding, known as KOI8-R; there is also another Windows encoding called Win1251.
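The practical consequence of having three rival Cyrillic encodings can be seen by encoding the same word in each. This is a sketch only: the sample word is arbitrary, and the names used are Python's codec names for ISO-8859-5, KOI8-R and Win1251.

```python
word = "Привет"   # arbitrary sample text

encodings = ("iso8859_5", "koi8_r", "cp1251")
encoded = {name: word.encode(name) for name in encodings}

for name, data in encoded.items():
    # Same six letters, six bytes each time, but different byte values
    print(f"{name:10} {data.hex(' ')}")

# Reading KOI8-R bytes as though they were Win1251 gives the wrong letters
garbled = encoded["koi8_r"].decode("cp1251")
print(garbled)
```

A browser set to the wrong one of the three will therefore still show Cyrillic letters, just not the ones the author intended.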

    There are similar problems with other languages. For example, Japanese Web pages employ one of three systems: JIS, Shift-JIS and EUC. This is over and above the complication that Chinese and Japanese Kanji characters cannot be encompassed within the 8-bit space used by ISO-8859. Instead a two-bit representation is used - requiring yet more intelligence in the software so that these can be converted correctly.
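The three Japanese systems can be compared in the same way. In this sketch the sample text is arbitrary, and iso2022_jp, shift_jis and euc_jp are Python's codec names, taken here to correspond to JIS, Shift-JIS and EUC respectively.

```python
text = "日本語"   # three Kanji characters, chosen as an arbitrary sample

for name in ("iso2022_jp", "shift_jis", "euc_jp"):
    data = text.encode(name)
    # JIS adds escape sequences to switch character sets, so it comes out longer
    print(f"{name:11} {len(data):2} bytes  {data.hex(' ')}")

# The same text cannot be expressed at all in the 8-bit ISO-8859-1 space
try:
    text.encode("iso-8859-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False
print(fits_in_latin1)
```

Shift-JIS and EUC each spend two bytes on every Kanji character here, but the byte values differ, so software has to know which system a page uses before it can convert it.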

    Arabic and Hebrew present another challenge to the browser: the representation of characters that are written right to left. There is also the issue of context: Arabic characters can take one of four forms depending on whether they stand alone, or are found at the beginning or end or in the middle of a word, and may also require ligatures to join them to surrounding letters.

    To try to define a standard format for all character sets the Unicode (http://www.unicode.org/) or Universal Character Set (UCS) has been drawn up. The idea here is to replace the single-byte representation used by ASCII by a general two-byte coding. Because such two-byte encodings can be problematic for many programs, two related encoding schemes (ftp://ds.internic.net/rfc/rfc2044.txt) have been drawn up, the UCS transformation formats (UTFs) UTF-7 and UTF-8. These employ sequences of single bytes, and have the property that they preserve the 7-bit ASCII character encoding.
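
    The property that the transformation formats preserve ASCII can be checked with Python's built-in codecs. This is only a sketch: the helper name encodings_of is hypothetical, and the two-byte form is produced here with the utf-16-be codec as a stand-in for the two-byte coding mentioned above.

```python
def encodings_of(ch):
    """Return the byte sequences for one character under three schemes."""
    return {
        "two-byte": ch.encode("utf-16-be"),
        "utf-8": ch.encode("utf-8"),
        "utf-7": ch.encode("utf-7"),
    }

# A plain ASCII letter, an accented Latin letter and a Kanji character.
samples = ["A", "\u00e9", "\u65e5"]
results = {ch: encodings_of(ch) for ch in samples}

for ch, forms in results.items():
    print(repr(ch), {name: data.hex() for name, data in forms.items()})

# ASCII text is byte-for-byte unchanged in UTF-8.
ascii_unchanged = "Web page".encode("utf-8") == "Web page".encode("ascii")
print(ascii_unchanged)
```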

    Next week's feature will show how all this theory is used in practice, and how multilingual Web pages are now relatively simple to create. But it is worth flagging up some other areas that still pose major challenges to the Web as currently constituted. For example, although it is relatively straightforward to create multilingual Web pages, the HTML code remains in ASCII. Alongside complex proposals such as the Extended Reference Concrete Syntax (http://www.sgmlopen.org/sgml/docs/ercs/ercs-home.html), there are simpler ones (http://www.alis.com:8085/ietf/html/draft-ietf-html-i18n.txt) involving the use of a language attribute with HTML elements.

    URLs too are currently limited to ASCII characters, but ideally would allow users to specify addresses in their own language and script. One proposal is to employ the UTF-8 encoding to offer this while providing backward compatibility. Yet another challenge is to cope with forms input in different languages. For good introductions to this whole area, see Multilingual World Wide Web (http://www.ebt.com/docs/multling.html), Internationalisation of HTML (http://www.w3.org/pub/WWW/International/francois.yergeau.html) and the excellent site at Babel (http://babel.alis.com:8080/).
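
    One way to picture the UTF-8 proposal for URLs is to encode the non-ASCII characters as UTF-8 bytes and then write each byte as a percent escape, so that the result is still plain ASCII. The Python sketch below does this with the standard library; the path itself is a made-up example, not a real address.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing an accented Latin word and a Cyrillic word.
path = "/caf\u00e9/\u043c\u0435\u043d\u044e.html"

# quote() encodes as UTF-8 by default and leaves "/" untouched.
encoded = quote(path)
print(encoded)

# The escaped form is pure ASCII, so older software can still carry it,
# and the original text can be recovered exactly.
print(encoded.isascii(), unquote(encoded) == path)
```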

    Few people in the anglophone world worry much about these, and probably equally thin on the ground are those who have tried to add multilingual capabilities to their browsers. A good reference point for users interested in this area is the site for multilingualism and the Internet at http://wwli.com/library/localize.html.

    Perhaps not surprisingly, Microsoft's Internet Explorer is well thought-out as far as international use is concerned. For all its faults, Microsoft is a company that is keenly aware of the importance of localisation for software. Internet Explorer is available in 9 languages (that is, with menus changed appropriately), and it is also easy to add extra character sets to view multilingual pages.

    This is done by downloading the appropriate font file (http://www.microsoft.com/msdownload/ieadd/03.htm); the choices are Simplified Chinese, Traditional Chinese, Japanese, Korean and Pan-European. Running the file causes it to be installed and the relevant changes for Internet Explorer made automatically. Thereafter, there is a pop-up list of available character sets in the bottom right-hand corner of the browser window. When you encounter a page using a character set other than Latin-1 you can simply select from this list to refresh the page.

    Adding this capability to Netscape is much harder, and reflects this young company's relative inexperience in dealing with international markets. First, you need to find the relevant fonts yourself (in practice, the simplest solution is to use those provided by Microsoft). Once these have been installed, you must then activate them in Netscape. This is done from the Options menu, choosing General Preferences and Fonts. For each of the encodings you specify the font that you have added. Then, to use this encoding for a Web page, you will need to go to the drop-down list available on the Document Encoding entry on the Options menu.

    None of this is very intuitive; worse is the fact that for Japanese font capabilities you have to edit an entry in the Windows registry - the software equivalent of brain surgery, and about as risky. If you want full foreign language capabilities for Netscape, it may be easier to buy the plug-in (http://www.accentsoft.com/) called Navigate with an Accent from Accent Software. This adds a new drop-down list of character sets alongside the main menu buttons. An evaluation copy is available from http://www.accentsoft.com/download/dleng.htm. Unfortunately the add-in disables important features such as plug-ins, frames and Java.

    Accent produces its own standalone browser (based on the original Mosaic). This too adopts a drop-down list of language options, though strangely Chinese is absent. Accent does, however, offer both Arabic and Hebrew, something that neither Internet Explorer nor Netscape is capable of. An evaluation copy is available from the URL given above. Another product in the Accent range that can be downloaded from there in a trial version is Accent Publisher. This addresses the other side of the multilingual problem: creating Web pages with character sets other than Latin-1.

    With Accent Publisher, you can design a page in most European languages plus Arabic and Hebrew (floating keymaps let you use a QWERTY pad to enter non-Latin characters) and then convert it to an HTML file automatically. More advanced features such as tables are supported. Also notable is the ability to swap among 21 languages (including Arabic, Greek, Hebrew, Russian and Turkish) for the menus.

    Another browser product based on the original Mosaic is Tango, from Alis Technologies, whose site was mentioned last week as a useful starting point for exploring Internet multilingual issues. An evaluation copy can be downloaded. Tango can display no fewer than 90 languages, including Arabic, Chinese, Greek, Hebrew, Korean, Russian and Thai. The interface can be switched to any of 19 languages. The corresponding creation software, called Tango Creator, lets you compose HTML pages in 90 languages using character sets other than Latin-1, and supports tables and frames.

    The changing (type)faces of the Internet

    Advances in font design for the Web are ushering in a new era of Net publishing that puts the emphasis on corporate identity.

    Design is a crucial issue for businesses today. For example, it is commonplace for companies to adopt a distinctive housestyle when it comes to typefaces for information about themselves and their products, and corporate designers naturally expect to have full control over how materials will look.

    On the Web, by contrast, it is the consumer rather than the supplier who determines the onscreen appearance. This arises from the way in which HTML works, whereby overall structure is transmitted from the Web server and then given a local form by the Web browser. For example, within an HTML document there are various kinds of headings, but the exact details of how each of these is implemented - typeface, size, weight - cannot be guaranteed.

    One solution to this problem is that adopted by Microsoft. It has introduced a special HTML tag that allows Web page designers to specify what typeface should be used if available. So <FONT FACE = "Arial Black, Courier New"> specifies two possible typefaces for the text it refers to.

    One difficulty with this approach is that it is browser-specific. For example, not even the latest version of Netscape supports this syntax. A more formal approach to the problem is offered by the idea of style sheets.

    These are a way of specifying the overall design elements of an HTML file, including things such as how the various levels of headings will appear, colours of body text etc. They are analogous to the style sheets found in many word-processing and desktop publishing packages. There are a couple of rival approaches to style sheets, of which Cascading Style Sheets (http://www.computerweekly.co.uk/gwfeat/gwspeak/css1.html) is the most important.

    The agreement of an official HTML style sheet standard is good news because it will avoid the proliferation of local HTML dialects that had started to occur. However, apart from the fact that the idea is so new that few browsers currently support the feature, there is a more fundamental problem.

    Style sheets specify what typefaces should be used if they are available. However, if a particular font is not on the client system a default will be substituted, completely changing the original design. To avoid this it is necessary to provide some kind of font delivery mechanism whereby any fonts that are needed in a page are sent with it (rather as Java applets are sent to add extra functions).

    This is such an obviously sensible approach that two rival standards were proposed to implement it, one from Adobe, Netscape and Apple, and the other from Microsoft. Miraculously the Internet spirit of co-operation seems to have prevailed again, and a joint standard called Opentype (http://www.microsoft.com/truetype/fontpack/pr3.htm) has been announced with support from all parties.

    On its own, this kind of font embedding would not be enough: Web pages would be grossly inflated and become unusable for most kinds of users. Two other components of the Opentype idea are therefore crucially important. The first is called subsetting, and means that not every character of a font set needs to be sent. For example, a font may include irrelevant characters; by omitting these from the embedded font that is sent with the Web page the overall size is kept within reasonable limits.
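
    The idea behind subsetting can be sketched in a few lines of Python. This is a toy model only, not the Opentype mechanism itself: the "font" is a dictionary with one placeholder string per character, and the function name subset_font is hypothetical.

```python
def subset_font(glyphs, page_text):
    """Keep only the glyphs for characters that actually occur in page_text."""
    needed = set(page_text)
    return {ch: outline for ch, outline in glyphs.items() if ch in needed}

# Toy "font": a placeholder outline for each of 224 characters.
# A real font would hold curve data for each glyph instead of a string.
full_font = {chr(code): "outline-%d" % code for code in range(32, 256)}

page = "Frame relay"
embedded = subset_font(full_font, page)

print(len(full_font), "glyphs in the full font")
print(len(embedded), "glyphs sent with the page")
```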

    Size is further reduced by employing a special kind of font compression technology. Developed by Agfa, Microtype Express is a lossless, on-the-fly compression technology that can compress a font in seconds and decompress it in far less than that.

    Given the very broad industry support for the Opentype standard, and the long-standing need of companies for precisely this kind of control, it is likely that a new era in Web publishing is about to begin. In the not-too-distant future businesses can start to treat the Net as a medium for marketing and creating brand-awareness that matches any other for power and flexibility and goes beyond them all in terms of reach and cost-effectiveness.

    Frame relay

    Frame relay, like Asynchronous Transfer Mode (ATM), is an advanced data communications technology that is starting to become more common in corporate environments. It is sometimes seen as a rival to ATM, but in reality it is a complementary solution.

    ATM is optimised for use with networks capable of tens of megabits per second or more and is good for carrying data-intensive applications such as multimedia.

    It employs fixed-length packets of data that can be quickly switched. Frame relay, by contrast, uses variable-length frames. This means it is more efficient for slower applications where the overhead of ATM can be a penalty. Frame relay is related to the older X.25 packet-switching technology, which used to be common among companies, but again it is more efficient.
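
    The overhead argument can be made concrete with some simple arithmetic, sketched here in Python. An ATM cell is 53 bytes, of which 5 are header. The frame relay figure of 6 bytes of overhead per frame is an assumption made for illustration, as is the simplification that the payload fits in a single frame and that no adaptation-layer trailer is added on the ATM side.

```python
import math

ATM_CELL = 53                         # bytes in every ATM cell (fixed length)
ATM_HEADER = 5                        # header bytes inside each cell
ATM_PAYLOAD = ATM_CELL - ATM_HEADER   # 48 bytes of user data per cell

FR_OVERHEAD = 6                       # assumed overhead per frame relay frame

def atm_bytes_on_wire(payload):
    """Bytes sent when the payload is cut into fixed-length cells (last one padded)."""
    cells = math.ceil(payload / ATM_PAYLOAD)
    return cells * ATM_CELL

def frame_relay_bytes_on_wire(payload):
    """Bytes sent when the payload travels in one variable-length frame."""
    return payload + FR_OVERHEAD

for size in (48, 100, 1500):
    atm = atm_bytes_on_wire(size)
    fr = frame_relay_bytes_on_wire(size)
    print(size, "byte payload: ATM", atm, "bytes, frame relay", fr, "bytes")
```

    Under these assumptions a 100-byte payload needs three ATM cells (159 bytes on the wire) but only 106 bytes as a single frame.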

    Its other advantage is that it tends to cost less than a leased line of equivalent speed. It uses virtual circuits that may exist only for the duration of the connection, rather like the Net, although frame relay also uses permanent virtual circuits.

    Frame relay is well suited to traffic that comes in bursts, where dedicated lines would not be filled to their capacity all the time.

    Internet traffic and communications between intranets often follow this pattern. Therefore, frame relay is a good solution for users connecting to Internet providers at higher speeds and for joining up intranet islands.

    Free Internet Services

    One of the Internet's most unsettling aspects for business users is the way it seems to fly in the face of conventional economics. Its basic operational characteristics, whereby connections to the other side of the globe cost the same as those to the other side of the road, are confusing enough. But even worse for many is the constant recrudescence of the "free" idea.

    Alongside the more formally constituted Free Software Foundation, there is the burgeoning open source movement, to say nothing of the various less-than-legal manifestations of a more general attempt to implement the "information wants to be free" idea by posting copyright material online.

    But it is not just the hippy/hacker fringe that is infected with the idea of freedom: practically all content is freely available, the main browsers are now all free, free e-mail services are commonplace, and other free Web-based services such as calendaring are being devised.

    Further proof that the business rules are being turned upside-down is provided by the launch of a US company called Free PC. As its home page explains, 10,000 qualifying participants in this scheme will be given a new PC and Internet connection completely free for two years. To qualify, applicants must provide extensive demographic details so that targeted Web-based advertising can be sent to them. In fact the ads will actually reside on the PC's disc: no less than 2 Gbytes will be allocated to multimedia displays from the scheme's sponsors. This will avoid long download delays for the ads - which are always present on-screen when users are online.

    Free PC is yet another venture from the ever-fertile IdeaLab. It represents the logical extrapolation of a number of other US services that provide free Internet connections, but no hardware.

    These services all generate their revenue by selling targeted advertising. Rather remarkably, it has proved possible in the UK to provide similar free Internet services without necessarily imposing any advertising. This is because of differences in the way that the telecoms industry functions in the two countries. In the US, local calls are frequently free, but in the UK they rarely are - a fact that has probably acted as a considerable brake on the uptake of the Internet both here and on the continent, where the call charge structure is similar. However, because of the way the deregulated telecoms industry works, it is possible for companies offering a free Internet service, say, to generate income from interconnect fees.

    This has led to an extraordinary spate of free Internet services in the UK, many of them with no advertising (unlike in the US, where it represents the only available revenue). Even BT has joined the club.

    The most famous of these is Dixons' Freeserve, which now boasts over one million users. Dixons can make money in a number of ways: the interconnect fees, online advertising and online sales. The last is perhaps the most interesting, since retailers are now able to remove a major barrier to the spread of e-commerce - the need for an Internet connection - by providing it free. Proof that this model is spreading is provided by the news that Tesconet, originally a paid-for Internet service, is to be made available free. Other originally paid-for Internet services that now offer a free service option include UKOnline and VirginNet.

    It will presumably soon be easy even for the smallest online merchants to offer a free Internet account from dedicated Internet service providers serving this market. Another manifestation of the Internet's "free" culture is even more radical. The US e-commerce company Onsale sells all its goods at wholesale prices - for zero profit, that is - and aims to make money from the advertising it carries. This idea of turning an entire product catalogue into just a marketing tool will send shivers down the spines of many business people brought up with more conventional approaches.

    Ghostscript

    Ghostscript, as you might guess from the name, is intimately related to PostScript, and in a sense is the Internet's home-grown version.

    PostScript is Adobe's page description language for documents (see http://www.adobe.com/prodindex/postscript/overview.html ). That is, instead of describing a printed page of words and images in terms of the dots that computer printers typically place on paper, it creates those letters and graphics by defining them mathematically as a series of lines, curves and shapes, placed together in a certain way.

    This has a number of advantages. First, the description is typically very compact: for example, to describe a large solid circle in PostScript is trivial, but a conventional graphics file holding the positions of every dot required to create it would be sizeable. Perhaps more importantly, PostScript descriptions are platform-independent: this means that the same PostScript file can be printed out from any machine without the need for further modifications to take account of the fact that the dots used to realise the image will be different in each case.

    Ghostscript is the online community's response to this powerful but proprietary product, a clone that can process PostScript files to produce the correct output, but one which is available free to end users.

    Ghostscript's home page is at http://www.cs.wisc.edu/~ghost/index.html. Here you can download the latest version of Ghostscript (which is an interpreter for PostScript) and Ghostview, an X11 previewer for PostScript files. GSView (http://www.cs.wisc.edu/~ghost/gsview/index.html) is a graphical interface for Ghostscript, and can be used as a helper application for browsers. This allows PostScript files to be displayed in the browser.

    A first guide to PostScript can be found at http://www.gkss.de/W3/PS/postscript.htm. General resources can be found from http://www.geocities.com/SiliconValley/5682/psotscript.html.

    Global Network Navigator (GNN)

    One of the most interesting developments on the Internet has been the appearance of commercial operations offering free information. Obviously this is not pure philanthropy: rather, it is a kind of sponsorship that gets these companies' names before the vast Internet public.

    One of the first firms to do this is O'Reilly & Associates. Computer Weekly readers will know it for its series of technical books on Unix and network issues. The Global Network Navigator (GNN), its World Wide Web server (best accessed by UK users through the mirror-site http://src.doc.ic.ac.uk/gnn/), has many other jumping-off points for further exploration of the Internet, and is open to all, though you are asked to register.

    At the heart of the GNN is the catalogue of Internet sites found in Ed Krol's classic The Whole Internet book, converted into a searchable hypertext page with hotspots.

    Particularly interesting is a list of the top 50 sites. A separate Business Pages contains various commercial services, while NetNews details some of the latest Internet developments. Other sections include a Help Desk for Internet beginners, and online 'exhibitions' that look at particular topics.

    Nearer home, and on a slightly smaller scale, there is the NewsWire server (http://info.learned.co.uk/) from the UK company Learned Information. As well as providing subscription details, this lets you read features from the current and previous issues of NewsWire. This often carries interesting and unusual stories from a nicely Euro-centric viewpoint. There is a Gopher at info.learned.co.uk.

    Finally, the Perl server at http://www.cis.ufl.edu/perl/ is a WWW server with serious information. It may not tell you everything about the Unix junkie's favourite scripting language, but you will find links to manuals, FAQs, FTP, other Perl sites and quotations from the Perl god Larry Wall.

    GNU General Public Licence

    A far more sophisticated kind of shareware has become an important part of Net culture. This is the GNU General Public Licence, which arose out of the larger project based on GNU (which stands recursively for GNU's Not Unix, and is a program to create a complete version of Unix in the public domain).

    The essence of the GNU General Public Licence is that it tries to safeguard everyone's rights to use a program by setting out the minimum freedom available, unlike most licences which try to do the opposite by strictly limiting what you can do with a statement of maximum freedom.

    In particular, the licence specifies that you can distribute copies of any software subject to the GNU licence (and charge for this service if you wish); that you can receive the source code or obtain it if you desire to do so; that you can change the software or use pieces of it in new free programs; and that you know you can do all these things. The other important element of the GNU General Public Licence is that there is no warranty for any software distributed according to its terms.

    The full text of the GNU General Public Licence version 2 can be found at http://www.cs.utah.edu/csinfo/texinfo/emacs19/emacs_3.html#SEC3.

    Glue Languages

    The Internet has had a dramatic impact on many aspects of computing, but one that has been little remarked upon concerns programming.

    The new connected world of computers has led to a shift away from traditional heavy-duty programming languages such as C/C++ towards lightweight tools. Most of these are generally called scripting languages, and the reasons for their adoption have varied. For example, Javascript is the scripting language par excellence for Web pages, while on the server side VBScript is popular because of the large installed base of Windows NT systems.

    Also important on the server side is Perl, which has established itself as the language of preference when creating CGI scripts. Other languages popular among Internet users include Tcl/Tk and Python.

    Scripting languages are becoming more widely used in part because the latest generation of Web programmers is growing up.

    For them, lightweight languages that can be learnt and applied quickly to solve immediate problems are more suitable than complex languages like C, which can require months to master. Such scripting tools also reflect a new approach to software projects: rather than creating huge programs, the tendency is to assemble pre-existing elements. In this respect, scripts form the glue that binds these components, so these new programming tools are sometimes referred to as glue languages.

    The use of smaller programming elements in this way is part of a more general move towards component-based development, for example with Javabeans or Microsoft's Component Object Model architecture.

    Rise and rise of glue languages

    Greatly simplified, the history of computer hardware can be viewed as one of progressively smaller and more personal units linked together ever more widely. After the mainframe with dumb terminals came minis offering a client/server architecture, to be followed by local area networks of peered PCs and the currently-evolving global Internet of IP-enabled devices.

    This evolution has gone hand-in-hand with the development of the interface for each hardware generation. After batch processing came character-based command lines for real-time control, followed by desktop graphical user interfaces culminating in the present browser-based, application-independent Webtop approach.

    And in the world of programming, there has been a parallel movement. Machine code and assembly language were followed by higher-level languages such as Cobol, Fortran, C, C++ and Java, and these in turn by today's scripting languages.

    Such scripts frequently make use of software code written in other languages. In this respect they act as a way of hooking up a broad range of pre-existing elements - hence their other name of 'glue languages'.

    In part the rise of scripting is down to economics. Programming is an expensive activity, and the ability to employ tools that can be learnt quickly, and applied widely and easily, together with the possibility of re-using existing code, is a compelling combination.

    Moreover, the emphasis is increasingly on getting more value out of current systems and the data they hold, rather than writing entirely new ones. This has put the focus on connecting to and working with often disparate collections of legacy resources - an area where scripting excels. But alongside such considerations, the Internet is also proving a major force in the current rapid uptake of scripting tools.

    By their very nature, the Internet and related technologies such as intranets and extranets are heterogeneous. Portability is therefore a key requirement of tools that can work in these environments, something that scripting tools typically offer.

    The shifting collection of platforms and applications found in IP-based distributed systems also means tools must be very adaptable. Scripting languages are typically highly extensible, allowing them to be modified quickly to meet the demands of a particular situation.

    Alongside these general considerations, there are also several specific reasons why scripting tools are spreading rapidly in the Internet world. For example, Perl is almost routinely used where the Common Gateway Interface approach is employed to link Web sites to back-end database and other software. And because Perl excels in string manipulation, it is a natural tool for processing HTML files, which are just text.
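
    Because HTML files are just text, a few lines of pattern matching are often enough to pull information out of them. The sketch below uses Python rather than Perl, purely for illustration; the sample HTML fragment is made up, and a pattern this simple assumes double-quoted href attributes placed directly after the tag name.

```python
import re

# A made-up fragment of HTML, treated as an ordinary string.
html_text = (
    '<p>See <a href="http://www.unicode.org/">Unicode</a> and '
    '<A HREF="gopher://gopher.tc.umn.edu/">the Mother Gopher</A>.</p>'
)

# Match the address inside each anchor tag, ignoring the case of the tag.
LINK = re.compile(r'<a\s+href="([^"]*)"', re.IGNORECASE)

links = LINK.findall(html_text)
for address in links:
    print(address)
```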

    Microsoft, too, has contributed to the popularity of Internet scripting, particularly on the server side.

    Its widely-used Active Server Pages technology employs scripting to generate Web pages on the fly from back-end databases and other sources. One reason why ASP has taken off is that it is scripting-neutral: you can use Perl, Microsoft's VBScript (a cut-down version of Visual Basic) or Javascript, among others.

    Netscape's introduction of Javascript in the Navigator 2 browser, its formalisation as ECMA-Script, and Microsoft's support for the latter, have all helped to ensure that scripting is also very common on the client side.

    Finally, the increasing acceptability of open source software like Apache or Linux has meant that free scripting languages such as Perl, Tcl and Python - all of which are open source - are now viable options in a corporate context.

    The Active Scripting organisation aims to promote scripting further through the development of open source ports of Microsoft's ASP technology to other platforms such as Apache and Mozilla. The fact that even new commercial scripting languages, like Rebol, are being released for free download seems likely to help spread the scripting word further.

    Gophers

    Gopher is the name given to a way of finding your way round the Internet using a simple menu-driven approach.

    Gophers start off with a list of top-level categories - e.g. arts, business, etc - each of which takes you to a further list of subsidiary options - the business option might lead to finance, economics, taxation, etc - until at last you reach the detailed information itself. Connections are handled automatically with text files being displayed on your screen and binary files being downloaded to your computer.

    The name gopher was chosen partly because, like its human counterpart, it can 'go for' things and partly because the animal gopher was the mascot of the University of Minnesota where the system was developed.

    Gophers consist of a client and server. You use the client to ask for information held by the Gopher server that contains the index.

    The easiest way to use a gopher is to have a Gopher client resident on your system - you need more than just an E-mail connection for this. Then, to search for information, call up your local Gopher client (either by typing something like gopher at the command line, or by running a Gopher application such as HGopher or WinGopher under Microsoft Windows), and connect to a Gopher server over the Internet.
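
    What a client receives from a server when it asks for a menu is a series of plain text lines. In the Gopher protocol each line starts with a one-character item type followed by the display name, and then carries a selector (path), host and port separated by tab characters. The Python sketch below parses one such line; the class and function names are hypothetical, and the sample line is constructed for illustration rather than captured from a real server.

```python
from typing import NamedTuple

class MenuItem(NamedTuple):
    item_type: str   # "1" for a further menu, "0" for a text file
    name: str        # text shown to the user
    selector: str    # path to send back to the server
    host: str        # machine holding the item
    port: int        # TCP port of the Gopher server

def parse_menu_line(line):
    """Split one tab-separated Gopher menu line into its fields."""
    fields = line.rstrip("\r\n").split("\t")
    type_and_name, selector, host, port = fields[:4]
    return MenuItem(type_and_name[0], type_and_name[1:], selector, host, int(port))

# Constructed sample: a "News" sub-menu on the Mother Gopher.
sample = "1News\t1/News\tgopher.tc.umn.edu\t70\r\n"
item = parse_menu_line(sample)
print(item)
```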

    Because Gophers are merely a way of structuring references to other information, it is also possible for companies to use them for disseminating information about themselves, by providing a menu-driven interface to marketing and sales documents, for example.

    You can save time and energy by choosing an appropriate Gopher before beginning to search for particular information. One of the best general Gophers is the Gopher Jewels implementation. This was an ASCII file containing interesting Gopher sites; it has now turned into a Gopher site itself with links to all the 2,200 or so entries to be found at the end of its nested menus. Among the mirror sites is gopher.csv.warwick.ac.uk.

    If you choose option number 9 on the main menu, and then option 4 on the sub-menu you will be taken to the Gopher Jewels section. From there you can either explore specific subjects or else carry out searches in a variety of ways.

    If you want a little mystery try http://kuhttp.cc.ukans.edu/cwis/organizations/kucla/uroulette/uroulette.html which takes you to URouLette and whisks your computer into the Internet and connects you to a random URL.

    Gopher exists in two forms, the basic Gopher and Gopher+ (pronounced 'Gopher plus'). Among the more advanced features of Gopher+ is the ability to request information about a Gopher menu item before you download it. This might include what kind of file it is (for example, a sound file or executable), who created it and when it was last updated.

    Gopher+ servers are also able to store the same file in different ways: for example, words as a text file or a PostScript file, graphics as both .gif and .jpg files. If you use a Gopher+ compliant client you can set it up in such a way that the choice among these alternatives is made for you automatically according to your preferences.

    Since the Gopher+ standard is backward-compatible with the vanilla variety you only see the extra features if your client software is capable of handling them. For everyone else, a Gopher+ machine looks exactly the same as the older kind.

    One of the richest selections of material available from one site can be found at the University of California gopher://peg.cwis.uci.edu:7000/. Here you can find links to other Gophers, telnet services and various kinds of White Pages. This site calls itself the Peripatetic Eclectic Gopher (PEG). PEG's main menu includes a number of fairly academic areas such as biology, the humanities, physics and philosophy, but also some general resources that business users will find well worth exploring.

    The link to Electronic Journals connects to the Electronic Newsstand with tables of contents and sample articles from business magazines such as the Economist, Business Week, Inc etc.

    The menu entry headed Virtual Reference Desk leads to the Internet Mall, a searchable Acronyms dictionary, the CIA World Factbook, information organised according to subject - including very good lists of economics and business information - the US National Trade Databank, stock market reports, Roget's Thesaurus and worldwide weather.

    A list of gophers can be found at http://www.infohiway.com/gof/index.html. At http://www.infohiway.com/index2.html is a list of FTP and Web servers.

    How to Gopher it using E-mail on the Internet

    To begin an E-mail Gopher search, you send the message help to the address gopher@earn.net. In the subject line you put the address of the Gopher where you wish to start. For example, using the address of the 'Mother Gopher' at gopher.tc.umn.edu in the Subject line you would receive a message beginning:

        Mail this file back to gopher with an X before the menu items that you want. If you don't mark any items, gopher will send all of them.

        1. Information About Gopher/
        2. Computer Information/
        3. Discussion Groups/
        4. Fun & Games/
        5. Internet file servers (ftp) sites/
        6. Libraries/
        7. News/
        8. Other Gopher and Information Servers/
        etc.

    This is the main Gopher menu: to retrieve one of these items, say number 7, you would send back to gopher@earn.net the message (the subject line can be left blank):

      Split=64K bytes/message <- For text, bin, HQX messages (0= No split)
      Menu=100 items/message <- For menus and query responses (0=No split)
      Name=News
      Numb=7
      type=1
      Port=70
      Path=1/News
      Host=gopher.tc.umn.edu

    In addition to the first two lines (essentially specifying information about how a message is to be packaged), a block of lines has been added from the second half of the returned E-mail (not shown): this gives vital information to the E-mail Gopher on which menu item you wish to retrieve.

    You can also obtain this item by sending back the whole message with an 'X' placed in front of the appropriate menu number.

    In due course you will receive the following sub-menu corresponding to item 7 in the main menu:

      1. AMInews Ski Reports/
      2. Cornell Chronicle(Weekly)/
      3. French Language Press Review/
      4. IT Connection (University of Minnesota)/
      5. Minnesota Daily/
      6. NASA News/
      7. National Weather Service Forecasts/
      etc.

    To retrieve item 6, say, you would send to gopher@earn.net the following message:

      Split=64K bytes/message <- For text, bin, HQX messages (0 = No split)
      Menu=100 items/message <- For menus and query responses (0 = No split)
      Name=NASA News
      Numb=6
      type=0
      Port=79
      Path=nasanews
      Host=space.mit.edu

    which has the same bipartite structure as the other request you sent: the format information and the details about which menu item you wish to retrieve.

    After a little while you should then receive the text corresponding to item 6:

      nasanews: "space" Fri Apr 7 01:46:44 1995
      MIT Centre for Space Research
      This NasaNews service is brought to you by the Microwave Subnode of NASA's Planetary Data.
      etc.

    Obviously any menu item could be chosen at any point in the above, and any Gopher could have been chosen for the subject line of the first message.

    Graphics

    One of the most important elements of the HTML pages that underlie the World Wide Web is graphics. These can be of two kinds: those that are pulled in only when you click on a link to them (and therefore exist as a separate element in the hypermedia web) and those that are inline images.

    Links to the latter are embedded in the HTML coding in such a way that they are generally loaded automatically. There is a very important subclass of inline images that are becoming increasingly common, and which appear on most of the more advanced Web sites.

    These images are called transparent, and they are readily recognisable by the way they seem to float in the page, with the background colour of the Web browser meeting the outline of the images exactly. Contrast this with a non-transparent image where there is a definite boundary, which is usually rectangular, that separates the image region from the rest of the surrounding page.

    A good example of a transparent image can be found in the top left-hand corner of Silicon Graphics' Home Page at http://www.sgi.com/. Here the company's logo seems to be an embossed element on the page, but is in fact a transparent image.

    Transparent images derive their name and their effect from the fact that it is the image's background colour (outside the central area) that is rendered transparent, and thus allows the underlying Web browser background to show through. Various graphics editing tools exist that can be used to create this effect.
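
    The two kinds of image reference described at the start of this section might be written in HTML along the following lines. This is only a sketch: the file names logo.gif and photo.jpg and the grey background colour are hypothetical, and the transparency itself is a property of the saved .gif file rather than of anything in the HTML.

    ```html
    <HTML>
    <HEAD><TITLE>Inline and linked images</TITLE></HEAD>
    <!-- BGCOLOR sets the page background that shows through a transparent image -->
    <BODY BGCOLOR="#C0C0C0">
    <!-- Inline image: generally loaded automatically with the page -->
    <IMG SRC="logo.gif" ALT="Company logo">
    <P>
    <!-- Linked image: fetched only when the reader selects the link -->
    <A HREF="photo.jpg">View the full-size photograph</A>
    </BODY>
    </HTML>
    ```

    If logo.gif has been saved with a transparent background, the grey of the page should appear to run right up to the outline of the logo; otherwise the image keeps its own rectangular background.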

    Graphical file formats

    One of the most important shifts that has taken place on the Internet in recent years has been that from an essentially text-based medium to one that routinely uses multimedia elements. Among these, visual components are by far the most common for the simple reason that no special equipment is required (as it is with sound on a PC, for example). The ineluctable rise of the World Wide Web browser with its inherently graphical approach has also contributed.

    And so it is that graphical file formats have moved closer to centre stage. One manifestation of this was the recent brouhaha over the .gif format. The Graphics Interchange Format was devised by CompuServe, but used older proprietary compression technologies - hence the undignified saga of accusations and counter-accusations between CompuServe and Unisys when it came to placing the blame for the sudden attempt to impose licensing fees for these.

    The important thing to note about .gif files is that they are lossless: that is, although they succeed in reducing the size of a raw graphical image, no data is lost in the process. This makes .gif files well suited to images where every detail must be preserved, although the format is limited to a palette of 256 colours.

    However, even though they are compressed, .gif files can still be large and hence take a long time to download over slower connections. This explains in part the popularity of the main alternative graphical format, .jpg. Named after the Joint Photographic Experts Group that devised it, a .jpg file is generally smaller than the corresponding .gif file. However, in the compression process information is lost, and so the quality tends to be lower.

    Fractal compression techniques (http://www.iterated.co.uk/info) provide higher compression ratios than JPEG, and images that can be zoomed more effectively. These require plugins to browsers such as Netscape. Fractal compression is asymmetrical - it takes more processing power to compress than to uncompress. This makes it ideal for all multimedia use, as the receiver's machine does not need to be as powerful as the creator's.

    Each format, then, has its advantages and disadvantages, and the appropriate one should be chosen according to the particular situation.

    Gutenberg Project

    The Gutenberg Project is another of those seemingly crazy, altruistic exercises that in part help to define the unique spirit of the Internet. The Project's basic plan is to convert first hundreds, then thousands and perhaps one day millions of books into a simple ASCII format, so that anyone can download them for free from the Internet, or acquire them for the cost of a floppy disc. The name derives from the Gutenberg revolution of the 1400s, when the spread of new printing technology brought the cost of a book down by a factor of several hundred, allowing a whole new class of readers to emerge. The aim is to reduce the effective cost of a book by the same ratio again, and perhaps enfranchise another great swathe of the population.

    Starting with a modest conversion rate of 10 books a year in 1991, and reaching around 100 in 1994, the project hopes that by the year 2001 it will have a library of some 10,000 E-texts. Practically everything is the work of dedicated amateurs who painstakingly enter volumes for no reason other than a desire to see the Project flourish and the love of reading spread.

    Already most of the obvious titles such as The Bible and Shakespeare have been entered. Copyright remains a problem, but happily there is a generous supply of worthwhile books that are in the public domain. One of its most recent feats shows the ambitions of the Project: Volume 1 of the classic 1911 edition of the Encyclopaedia Britannica is now available online, part of something called the Project Gutenberg Encyclopaedia that will grow to include all of the text. It can be downloaded from ftp://mrcnext.cso.uiuc.edu/pub/etext/etext95/pge0112.txt; but note that the file is over 8 Mbytes in size.

    H.323

    Using the Internet as a means for telephony, with all that this implies for costs, is clearly a fascinating idea. There is now a very wide range of Internet telephony products available. But in a sense, this is the problem: the many competing products mostly follow their own proprietary path. Clearly, the whole point of Internet telephony is nullified if you can only contact a small subset of those active in this area - itself a fraction of the total Internet population.

    What is needed is a standard, and it is becoming increasingly likely that it will be one called by the rather unmemorable name of H.323. This has been drawn up by the International Telecommunication Union (ITU), and applies more broadly than to just Internet telephony. As the standard's summary states, H.323 "describes terminals, equipment, and services for multimedia communication over Local Area Networks (LAN) which do not provide a guaranteed Quality of Service. H.323 terminals and equipment may carry real-time voice, data, and video, or any combination, including videotelephony."

    The key phrase here is the absence of Quality of Service: this means that some of the digitised packets sent out over the network (the Internet in this case) may get lost, and H.323 knows how to cope with this. One of the first Internet telephony products to support H.323 was from Intel; now Microsoft has come out with a revised version of its NetMeeting, while Netscape too has announced that it will be following the standard. With this kind of backing, the future of H.323 looks rosy.

    Hacker

    Although the Internet originated from work carried out for the US Department of Defence (a principal concern of which was designing a network system that could withstand a nuclear attack, one of the reasons for the Internet's robustness to this day), it soon developed into a mainly academic medium.

    As well as providing university researchers (mainly in the US, but later throughout Western Europe and further afield) with a means of communicating, it became closely bound up with the prevailing student ethos, particularly among those working in the computing fields. One of the consequences of this was that the early enthusiasm among this group for programming and related areas soon became central to the Internet culture too. (It was already tightly woven into that of computers and communications.)

    Among these young adepts, this activity was known as hacking. As Hackers, Steven Levy's highly readable 1984 description of this world recounts, "hacking" had none of the negative connotations it so often does today, and the term "hacker" was a badge of honour.

    According to this view, hacking refers to any skilful activity to do with computers, and has no sense of the illegal or illicit. That side of things is covered instead by the term "cracker".

    However, the increasingly widespread pejorative use of the term hacker has muddied the waters: it is not always clear whether an Internet hacker is to be admired or not. For this reason the unambiguous term cracker is preferred among Internet purists for those whose intentions are less laudable than their abilities.

    HDML

    Even though HTML is just an application of the more general and powerful SGML, its extraordinary success has meant that it has become an example others have tried to emulate. One of the latest manifestations of this is the hand-held device markup language (HDML).

    This takes as its starting-point the observation that HTML, for all its power, is essentially wedded to a navigational model that presupposes a decent-sized screen and a fair amount of memory and computing power. None of these is available in the world of hand-held devices. These are not the not-so-small computers running Windows CE, but those with very small displays and limited bandwidth.

    The idea of HDML is not to cram all the content of a Web page into a format that can be displayed by these units - this is clearly impossible - but to come up with a way of displaying appropriately formatted information on these kinds of devices.

    In particular, HDML offers a navigational model that is suitable for these screens. This is built on the idea of cards, rather like Apple's HyperCard, with the concepts of next and previous corresponding to forward and back buttons on a browser.

    HDML has a surprising number of big names behind it, including AT&T, Mitsubishi, Sun and Tandem. Its originators claim that more than 2,000 applications have already been created using it. Several US mobile phone companies offer HDML services, and the parcel delivery firms Federal Express and UPS both provide information in this format.

    Hits

    In 1994, a 'hit' in the context of the Internet would have meant a successful search item, probably using a WAIS system. Today that meaning is still widely found, but another has usurped it as the primary connotation - and become one of the hottest current discussion topics online.

    This type of hit refers to some kind of visit to a Web page: typically a site-owner would claim to get so many thousands of hits a day (or hour if they are a top spot). Hits matter because they are the nearest thing that the Internet (and in particular the World Wide Web) has to readership levels. And these are crucially important for attracting advertisers (now well-established as the preferred way of paying for free Internet sites) and extracting healthy ad revenues from them.

    Although everyone agrees hits are important, nobody can agree quite so well on what constitutes a hit. For example, many sites count every Web page access attempt as a hit. This clearly overstates the number of visitors in many ways, notably by including failed and repeated attempts by one person to access the same page. The other extreme is to count only each Internet address, but here there is the problem of multiple users of one account (for example in a company), to say nothing of the fact that no further information is obtained.

    It will probably be some time before the hit is pinned down satisfactorily, but its central importance in establishing who is looking at what, and where, on online sites means that at some point and in some way a generally accepted definition will evolve.

    HTML

    The HyperText Markup Language (HTML) lies at the heart of the World Wide Web, the multimedia hypertext system that has become the most popular way of providing and accessing information over the Internet.

    Web documents offer an amazing variety of features, including the ability to employ forms, drop-down menus and area-sensitive clickable images. Remarkably, the underlying HTML document is extremely simple and can be produced by nearly any text editor or word-processor.

    Any Web page can be viewed and saved locally. You can then view the file produced with a word processor. This is a good way of learning about features that interest you.

    HTML consists of a few tags that are placed in a text file in order to indicate to Web browsers how the text is to be displayed. For example, the HTML command <B> turns any text it frames into boldface. So if you type a line such as:

      <B>This text is bold</B>

    in an HTML document, it will produce the words "This text is bold" set in boldface in a Web browser.

    This use of the angled brackets and pairs of commands (one initiating, the other cancelling) is found throughout HTML.

    Text elements

    Just as the title of the page is enclosed within the <TITLE></TITLE> tags, so the main body is within the <BODY></BODY> pair (with the same "cancelling" forward slash in the second of them).

    The first thing to place in the body is generally a heading, which appears in the first line of the page. Headings are normally set in a larger typeface. There are six heading tags to choose from, <H1> to <H6>, each with its cancelling twin, </H1> to </H6>. The heading text is placed between them.

    You can then enter the main text, broken into paragraphs using the <P> tag, one of the few that does not require a matching tag to cancel its effect. Line breaks within the HTML file have no effect on its appearance in a Web browser, which is determined wholly by the tags within it. This means you can use them to create files that are more readable in their raw state. Putting these elements together leads to an HTML file along the lines of:

      <HTML>
      <TITLE>A Minimalist HTML document</TITLE>
      <BODY>
      <H1>My home page</H1>
      <H2>First Document</H2>
      Text for section one.
      <P>
      Note that the line breaks within
      HTML documents have no
      effect on their final appearance.
      </BODY>
      </HTML>

    You can change the character of the text using <B> for bold and <I> for italic. You could replace the line HTML documents have no

    with HTML documents have <B>no</B>

    to embolden the word "no".

    Another useful element that is easy to add is the list. There are two basic kinds, ordered and unordered.

    The former numbers each entry, while the latter uses simple bullets. The syntax is straightforward: first, you place the tag for the list type that you want - either <OL> for an ordered list, or <UL> for an unordered list - then each element of the list is prefixed by <LI>.

    Note that <LI> is like <P> in that it does not require cancelling, whereas both <OL> and <UL> have their corresponding closing tags </OL> and </UL>.

    A simple unordered list which could be placed anywhere within the body of the HTML document would therefore take the form:

      <UL>
      <LI>My first point
      <LI>My second point
      </UL>
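
    For comparison, an ordered list follows exactly the same pattern with <OL> in place of <UL>. The entries in this sketch are placeholders; the browser supplies the numbers itself.

    ```html
    <OL>
    <LI>My first point
    <LI>My second point
    </OL>
    ```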

    Although there are many refinements that can be added to these HTML elements, the basic uses described above are more than adequate to create a perfectly viable home page, albeit a slightly dull one.

    [For more on HTML look at http://www.computerweekly.co.uk/gwfeat/180496.html articles by Glyn Moody.]

    HTML 4.0

    The hypertext markup language (HTML) not only sparked the rapid uptake of the Web for business purposes, but also arguably drove the wider acceptance of the Internet in firms. Indeed, without HTML, the Internet would be little more than a global E-mail system for most companies.

    This central importance lends a particular weight to each HTML standard, and makes the formal release of the most recent, version 4.0, a significant event. The full standard can be downloaded in various formats. HTML 4.0's relationship to previous standards and to current Web practice is complex. The first official release was version 2.0, published as RFC1866. Unfortunately, by the time it emerged, the real world of Web publishing had moved on. In particular, the sharpening contest between Netscape and Microsoft led to both introducing unofficial extensions to the HTML standard.

    HTML 3.0 was meant to address this threat of fragmentation, but proved too ambitious in terms of achieving all parties' agreement on what should be included. As a result, it was never implemented.

    The next official release after 2.0 was 3.2, which included fewer new features than 3.0. Against this background HTML 4.0 was designed to accomplish a number of objectives. First, it had to bring the official standard closer to the reality of what users are doing on the Web.

    Second, it needed to establish a firm foundation for future developments. And third, it had to try to incorporate at least some of the long-planned extensions that HTML 3.0 had put forward. So, for example, HTML 4.0 formalises the use of frames for the first time. The standard also builds on the increasingly widely used tables, offering features such as the independent scrolling of table bodies (with table heads and feet fixed), new styles for cell dividers and body fills, and support for grouping columns.
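
    The table features mentioned above might be used roughly as follows. This is a sketch rather than an extract from the standard's own examples: the figures are invented, and whether the body actually scrolls independently of the head and foot is left to the browser.

    ```html
    <TABLE BORDER="1" RULES="groups">
    <!-- COLGROUP groups columns so that they can share attributes -->
    <COLGROUP SPAN="1" WIDTH="120">
    <COLGROUP SPAN="2" ALIGN="right">
    <THEAD>
    <TR><TH>Region</TH><TH>Quarter 1</TH><TH>Quarter 2</TH></TR>
    </THEAD>
    <!-- In HTML 4.0 the TFOOT section is written before TBODY -->
    <TFOOT>
    <TR><TD>Total</TD><TD>30</TD><TD>50</TD></TR>
    </TFOOT>
    <TBODY>
    <TR><TD>North</TD><TD>10</TD><TD>20</TD></TR>
    <TR><TD>South</TD><TD>20</TD><TD>30</TD></TR>
    </TBODY>
    </TABLE>
    ```

    Setting RULES="groups" asks for dividing lines only between the head, body and foot and between the column groups, which is one of the new cell-divider styles.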

    Some new buttons for forms are introduced, as are rather more important internationalisation features. These have long been a glaring omission: the anglocentric bias of HTML was clearly inappropriate in what has aspirations to become a truly global medium.

    HTML 4.0 adds support not just for other character sets (following the suggestions put forward in RFC2070) but also for the tricky question of direction. It is now possible to incorporate both left-to-right and right-to-left writing systems on Web pages, and even to mix them in the same line.
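
    As an illustration of mixed directions, the fragment below uses the LANG and DIR attributes of HTML 4.0. The Hebrew word is written with numeric character references so that the file itself stays plain ASCII; the choice of word and sentence is purely for the sake of example.

    ```html
    <!-- LANG names the language; DIR gives the base direction of the text -->
    <P LANG="he" DIR="rtl">&#1513;&#1500;&#1493;&#1501; (shalom)</P>
    <!-- A right-to-left word mixed into a left-to-right line -->
    <P>The Hebrew word <SPAN LANG="he" DIR="rtl">&#1513;&#1500;&#1493;&#1501;</SPAN> means "peace".</P>
    ```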

    Some of the most important aspects of HTML 4.0 are the least obvious, because they flow by implication rather than explicitly from the new standard. For example, there is a general move away from using attributes to change particular HTML elements - such as altering the typeface or typesize of a heading or body-text - to employing style sheets.

    These allow full control over the appearance of a document while keeping form and content quite separate; something of a religious issue for HTML purists. There is also a new <SPAN> tag that allows HTML elements to be grouped together: this is particularly useful when taken in conjunction with some newly-defined intrinsic events. These are things like mouse-clicks or the gain and loss of focus for HTML elements.

    Combined with scripting languages, intrinsic events allow new levels of interactivity to be added to HTML 4.0 pages.

    It is now officially possible to attach scripts to actions such as passing a mouse cursor over a hyperlink: for example, a new style sheet could be applied when this occurs, causing the hypertext link to change colour and font. These new features, coupled with the emerging Document Object Model, formalise what is called Dynamic HTML, perhaps the most important advance in Web design since the introduction of tables.
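
    A minimal sketch of such an intrinsic event is given below. It assumes JavaScript as the scripting language and a browser whose object model exposes each element's style properties to scripts; neither is laid down by HTML 4.0 itself, and news.html is a hypothetical target page.

    ```html
    <HTML>
    <HEAD>
    <TITLE>Intrinsic events</TITLE>
    <!-- Declares the default language used in event attributes -->
    <META HTTP-EQUIV="Content-Script-Type" CONTENT="text/javascript">
    </HEAD>
    <BODY>
    <!-- The link changes colour and font while the cursor is over it -->
    <A HREF="news.html"
       ONMOUSEOVER="this.style.color = 'red'; this.style.fontFamily = 'serif';"
       ONMOUSEOUT="this.style.color = ''; this.style.fontFamily = '';">Latest news</A>
    </BODY>
    </HTML>
    ```

    Clearing the two properties in ONMOUSEOUT hands control back to whatever style sheet is otherwise in force.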

    Frames

    The essence of the World Wide Web is non-linearity. Hypertext, by definition, lets you leap from one part of a collection of documents to other locations, determined only by what links have been embedded, and where you wish to go.

    However, in one respect, the Web page remains a prisoner of its linear origins in text. Although it may contain wonderful multimedia elements, each of which can be a hotspot taking you to many other pages, the basic structure of the Web page on the screen is remarkably conservative. What you see is essentially monolithic, a single page with fixed areas on it.

    The introduction of tables in HTML version 3 (through implementations from Netscape and others) alters the appearance of the page, not its underlying structure. This is what makes the new Web page element that Netscape calls "frames" so interesting. It represents a genuine step beyond what is currently possible (and along the way introduces yet another non-standard HTML element).

    Frames let you divide up your Web page into completely independent areas, each of which may have hot links that pull in new elements to that area, leaving the others unchanged. So, for the first time you can choose to place different combinations of hypertext elements on screen as you navigate through the pages.

    A more conventional but highly useful way of employing frames is to fix one down one side or along the bottom to act as an on-screen index. Some commercial sites have exploited frames to create advertising banners that do not go away when you move through the main document, making them something of a two-edged sword for users.
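
    The on-screen index arrangement could be sketched as a frameset document like the one below. The file names index.html and welcome.html and the frame names are hypothetical, and the 25%/75% split is simply one plausible choice.

    ```html
    <HTML>
    <HEAD><TITLE>A page with an index frame</TITLE></HEAD>
    <!-- Two columns: a narrow index on the left, the main document on the right -->
    <FRAMESET COLS="25%,75%">
    <FRAME SRC="index.html" NAME="index">
    <FRAME SRC="welcome.html" NAME="main">
    <!-- Shown only by browsers that do not support frames -->
    <NOFRAMES>
    <BODY>
    This page uses frames. <A HREF="welcome.html">Go to the main document</A>.
    </BODY>
    </NOFRAMES>
    </FRAMESET>
    </HTML>
    ```

    Links inside index.html would then carry TARGET="main", for example <A HREF="chapter1.html" TARGET="main">, so that selecting them replaces the right-hand frame and leaves the index in place.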

    A multi-purpose form for your pages.

    What are forms?

    Forms enable you to collect information from people viewing your Web pages in a structured way and have it automatically mailed back to you.

    How do I set one up?

    1. Include the HTML code below in your page.
    2. Replace index in the ACTION section with the subject line you wish to see on the emails.
    3. Replace johnc@ukonline.co.uk with your email address to send the email to yourself (or all your data will go to John!).
    4. Replace Manual with something to tell you where the email came from.
    5. Put up to ten fields onto your form. The information collected from them is mailed to the address you gave above. There are a number of types of input fields possible. The most commonly used are shown below:
      <INPUT TYPE = "text" NAME = "field0"> used for basic text fields.
      <INPUT TYPE = "hidden" NAME = "to" VALUE= "johnc@ukonline.co.uk"> used for hidden text fields to pass data between pages.
      <INPUT TYPE = "password" NAME = "mypassword"> used for password fields.
      <INPUT TYPE = "Submit" NAME = "Send"> A button to send/submit the form.
      <INPUT TYPE = "Reset" NAME = "Reset"> A button to clear mistakes.
      Each field must have a unique name - call the first one field0, the second one field1, and so on, up to field9 if you want that many fields.
    6. Replace the text 'Heading for Field' with whatever you want to call each field. This is
      what will appear on the page next to the field.
    7. Replace the word "Submit" in the VALUE section of the Submit line if you want the button that sends you information to have different text.
    8. Replace the word "Reset" in the VALUE section of the Reset line if you want the button that clears all the fields to have different text.

    What appears on my page?

    Have a look at this example. If you view the source, it's fairly easy to work out what's going on.

    Example HTML:

    <FORM ACTION="/public-cgi/user_form/index" METHOD="POST">
    <INPUT TYPE = "hidden" NAME = "to" VALUE = "johnc@ukonline.co.uk">
    <INPUT TYPE = "hidden" NAME = "from" VALUE = "Manual">
    Heading for field:
    <INPUT TYPE ="text" NAME="field0" SIZE ="100"> <BR>
    <INPUT TYPE = "submit" VALUE = "Submit" NAME = "Send">
    <INPUT TYPE = "reset" VALUE = "Reset" NAME = "Reset">
    </FORM>

    An autoresponder for your pages

    Add a gadget that emails information to your visitors.

    What is it?

    The auto-responder allows you to get a person's email address and then email them a file from your web space.

    How do I set it up?

    1. Include the HTML code below in your page.
    2. In line 2, replace john/example.txt with username/filename. This is the file that will be emailed to them. (Remember to include a signature in the file - it won't be added automatically.) If your file's not in the top level directory, you'll have to give the full pathname.
    3. In line 3, replace exampledoc with the subject you want to appear on the email
    4. In line 4, replace john with the sender you want to appear on the email
    5. You can, of course, change the text on line 5 to read anything you like.

    That's it!

    What appears on my page?

    Please enter your email address:

    Example HTML:

    <FORM ACTION = "/public-cgi/autorespond" METHOD = "post">
    <INPUT TYPE = "hidden" NAME = "send" VALUE = "john/example.txt">
    <INPUT TYPE = "hidden" NAME = "subject" VALUE = "exampledoc">
    <INPUT TYPE = "hidden" NAME = "from" VALUE = "john">
    Please enter your email address:
    <INPUT TYPE = "text" NAME = "to"> <BR>
    <INPUT TYPE = "submit" VALUE = "Send me the file">
    </FORM>

    A guestbook for your pages

    What's a guestbook?

    This form allows users to make comments that are then displayed on your Web page for all to see.

    How do I set it up?

    1. Copy the example HTML below into your page
    2. Change the text in line 1 to read whatever you like
    3. In line 4, change "john/guestbook.html" to the full name of the page where you want the results to appear - normally the same page you're putting the form on.
    4. Do the same for line 5 (yes, I know it seems redundant, but believe me, it makes it much faster and more secure!).

    That's it!

    Guestbook entries are added after the <!--NEXT ENTRY--> line in your page source.

    Example HTML

    <b>Use this form to add your entry to my guestbook:</b> <br>
    <FORM METHOD="post" ACTION="http://web.ukonline.co.uk/public-cgi/guestbook">
    <INPUT TYPE="hidden" NAME="URL"
    VALUE="http://web.ukonline.co.uk/Members/john/guestbook.html">
    <INPUT TYPE="hidden" NAME="page" VALUE="john/guestbook.html">
    Name: <INPUT TYPE="text" NAME="name" VALUE="" SIZE=20>
    Email: <INPUT TYPE="text" NAME="email" VALUE="" SIZE=20>
    Comment: <TEXTAREA NAME="comment" rows=5 cols=60> </TEXTAREA>
    Where are you? <INPUT TYPE="text" NAME="location" VALUE="" SIZE=20>
    <INPUT TYPE="submit" value="Okay">
    </FORM>
    <!--NEXT ENTRY-->

    Stylesheets/CSS1

    Style sheets are a way of complementing the structural information provided by HTML, and offer a means of concisely imparting design characteristics. There are currently two main style sheet approaches, DSSSL and CSS1. The former stands for Document Style Semantics and Specification Language, and is an ISO standard.

    Although DSSSL has various virtues, it is CSS1 - Cascading Style Sheets Level 1 - that is likely to have the biggest impact on the Internet. This is largely because it is backed by the official World Wide Web Consortium and all the major players in the Internet market.

    The style information contained in a style sheet can either be applied to an entire document, or to just parts of it. In the former case it can be stored as an external file, perhaps used by several Web pages. This can be done using a new tag that points to the external style sheet.

    Alternatively it can be embedded within the HTML document itself. In this case, each mini-style sheet - and there may be several within a document - is enclosed in a tag of its own. Style properties can be applied to blocks - for example paragraphs and lists - or to individual aspects such as bold, italic, emphasised text etc.

    Elements that can be controlled include font size, font weight, font family, letter spacing and word spacing. As the name implies, Cascading Style Sheets can be applied in layers, so that various style properties are overlaid on top of others, replacing them wholly or in part.
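
    Both approaches can be sketched in a single page. In CSS1 an external style sheet is attached with the LINK element and an embedded one is written inside a STYLE element; the file name house.css and the particular property values below are hypothetical.

    ```html
    <HTML>
    <HEAD>
    <TITLE>Style sheet example</TITLE>
    <!-- External style sheet, which several pages could share -->
    <LINK REL="stylesheet" TYPE="text/css" HREF="house.css">
    <!-- Embedded style sheet: overlays the external one for this page only -->
    <STYLE TYPE="text/css">
      P  { font-family: Arial, sans-serif; font-size: 12pt; word-spacing: 0.1em }
      H1 { font-weight: bold; letter-spacing: 0.2em }
      EM { font-weight: bold }
    </STYLE>
    </HEAD>
    <BODY>
    <H1>Cascading styles</H1>
    <P>Body text with <EM>emphasised</EM> words.
    </BODY>
    </HTML>
    ```

    Because the embedded rules come after the linked file, they are layered on top of it: any property they set replaces the external value, while properties they leave alone are still taken from house.css.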

    HTML Programs

    It is one of the World Wide Web's many interesting features that even the most complex of Web pages can be created on the humblest of desktop PCs. In fact all you need is a simple text editor like Windows Notepad to produce the HyperText Markup Language (HTML) documents that underpin the WWW. Notepad may do, but this has not stopped people from seeking to devise more powerful HTML editors. The overwhelming majority of these have been written for Microsoft Windows, not surprisingly given its dominance of the PC software sector.

    Three quite distinct approaches are used: word processor templates, standalone programs and extensions to browsers. The first, building on the fact that HTML files are essentially text, employs pre-existing word processors (mostly Microsoft Word, but also WordPerfect) to provide the main editing functionality. The word processors are customised through document templates that provide both the basic HTML document structure and also modify the on-screen appearance (usually through the addition of button bars and drop-down menus) to provide easy access to the HTML elements (such as lists, images and links to a Web page).

    Several such templates have been produced, including CU_HTML (available from ftp://emwac.ed.ac.uk/mirrors/indiana/winword/cu_html.zip), GT_HTML (http://www.gatech.edu/word_html/release.htm) and HTML Author (http://www.salford.ac.uk/docs/depts/iti/staff/gsc/htmlauth/summary.html), all of which are free. CU_HTML has the advantage that it will work with both Word 2 and Word 6 (the other two will only work with Word 6), while HTML Author offers more features (including the ability to create simple tables) and a better interface.

    Microsoft, no less, has also taken this route with its Internet Assistant, available from ftp://ftp.microsoft.com/deskapps/word/winword-public/ia/wordia.exe - note that this is a big file, over 1Mb for the 16-bit version (Word 6 for Windows 3.1x) and over 2Mb for Word 7. What is particularly remarkable about this program is not just that it is free, but that included with the basic template is a fully-functional Web browser. Moreover, this browser can be used as a standalone program: just run the file iwia.exe found in the Internet subdirectory of Winword on your PC and you will be able to access Web, Gopher and FTP sites.

    In the face of the competition from these free programs it was a somewhat foolhardy decision on Quarterdeck's part to charge £99 for its WebAuthor template, a product which really does little to justify its price-tag (see http://www.qdeck.com/beta/WebAuthor-highlights.html for more information). Perhaps a better move would have been to adopt the alternative approach of a standalone HTML editor which allows more advanced features to be added.

    Once again, there are a number of good freeware programs in this area, including HTML Writer (at http://lal.cs.byu.edu/people/nosack/get_copy.html) and HTML Assistant (from ftp://src.doc.ic.ac.uk/computing/systems/ibmpc/windows3/misc/htmlasst.zip, available as a supported commercial program too). The HoTMetaL program also comes in free and supported versions: see the URL http://www.sq.com/ for more details.

    More advanced is WebEdit, which can be obtained from its home page at http://wwwnt.thegroup.net/webedit/webedit.htm. As well as its very clear design, WebEdit is notable for supporting advanced HTML options like tables and the extensions introduced by Netscape. WebEdit is shareware, and costs $99.95 to register.

    Another product which is a step beyond the simpler HTML editors is Live Markup. Its innovative interface goes much of the way to offering a fully WYSIWYG editing environment for HTML - no mean achievement given that this is almost a contradiction in terms: HTML does not define the exact appearance of a Web page, only its underlying structure which is then realised by the Web browser that views it. Although slow in its current version, the $99 Live Markup is worth examining as an indication of how HTML editors are likely to develop; it can be downloaded for evaluation from http://www.mediatec.com/mediatech/download.html.

    Netscape Gold includes (and has done since version 1.2) a basic editor, but not all features of HTML 3 are covered and it is a little pedestrian on longer documents.

    Help on HTML

    As you would expect, help on HTML is available at several sites. An excellent list of guides to HTML can be found at http://union.ncsa.uiuc.edu/HyperNews/get/www/html/guides.html. One particularly good guide is at http://info.med.yale.edu/caim/StyleManual_Top.HTML, while there is another written by the inventor of the World Wide Web, the Briton Tim Berners-Lee, at http://www.w3.org/hypertext/WWW/Provider/Style/Overview.html.

    Extensions to HTML introduced by Netscape (see http://home.netscape.com/home/services_docs/html-extensions.html for a full definition of these) are open to abuse. Netscape has done more than any other product to turn the Web into some kind of hallucinogenic online theme-park full of the gaudiest and most bizarre designs imaginable.

    Anyone who doubts this should visit http://www.europa.com/~yyz/netbin/netscape_hos.html. Justly called the Netscape Hall of Shame, this presents the very worst offenders against good taste and public morals who, through their reckless use of Netscape HTML extensions, show how not to put together a World Wide Web site. Look, loathe and learn.

    Starting to get graphical with Web pages

    One of the most striking aspects of the World Wide Web is its graphical nature. Indeed, although the World Wide Web was creating much interest after it was devised by Tim Berners-Lee in 1989, its current steep growth curve can be dated from the appearance in 1993 of the graphical browser Mosaic.

    Given the dramatic effect inline graphics have had on the WWW, and how important an element of Web pages they can be, it is surprising how easy they are to insert. There is one basic command, which takes the form:
    <IMG SRC="one.jpg">
    This places the file one.jpg at the point in the HTML document where the tag is found. This assumes that the graphic is in the main Web root directory. A graphics file elsewhere can be used simply by giving its full path (or URL if it is located on another machine on the Internet).

    There are several refinements that can be added. For example, with this default form, the bottom of the image is aligned with surrounding text. However, you may well want the image to be placed differently. This is done by specifying the alignment explicitly using ALIGN within the tag:
    <IMG SRC="one.jpg" ALIGN=MIDDLE>

    Another variation is to change the size of the image as displayed within the page. This is achieved by specifying the number of vertical and horizontal pixels it should occupy on the screen:
    <IMG SRC="one.jpg" ALIGN=MIDDLE HEIGHT=50 WIDTH=50>

    If the ratio of the HEIGHT and WIDTH elements differs from that found in the image, the latter is stretched in the relevant direction. Since the size of the graphics file does not change if it is used in a scaled-down form (it is the browser that does the scaling of the original), it is seldom worth using this scaling for any but very slight changes. As the user will pay (in download times) for the graphic whatever size it is used, better to fit the graphic to the space beforehand if possible (or use the HEIGHT and WIDTH only for scaling up rather than down).
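
    The arithmetic behind avoiding a stretched image can be sketched in a few lines of Python. This is only an illustration: the function names scaled_size and img_tag are invented here, and the original image dimensions are assumed to be known in advance.

```python
def scaled_size(orig_width, orig_height, target_width):
    """Return (width, height) that fits target_width without distortion."""
    if orig_width <= 0 or orig_height <= 0 or target_width <= 0:
        raise ValueError("all dimensions must be positive")
    # Keep the HEIGHT/WIDTH ratio equal to that of the original image.
    target_height = round(orig_height * target_width / orig_width)
    return target_width, max(1, target_height)


def img_tag(src, width, height):
    """Build an IMG tag in the style used in this section."""
    return f'<IMG SRC="{src}" HEIGHT={height} WIDTH={width}>'


if __name__ == "__main__":
    width, height = scaled_size(300, 100, 150)
    print(img_tag("one.jpg", width, height))
```

    Because both dimensions are derived from the same ratio, the browser has no reason to stretch the image in either direction.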

    There is a final subtlety involving graphics images, generally accepted as good practice. Using an element of the form ALT="a .jpg image" the words "a .jpg image" will appear in the graphic's stead when it is not possible to display the image itself. Obvious applications of this are when visitors to a site are using text-based browsers such as Lynx, or when the visually-impaired employ devices for converting text into sound, say, where graphical elements without such text make the pages unusable.

    Moreover, there is another important class of users who will benefit from the use of ALT. Many people prefer to navigate around the Web with inline graphics turned off. This allows them to move through pages very rapidly without waiting for the sometimes large graphics files to be downloaded. The more types of users you cater for with your Web page, the more popular and successful it is likely to be.

    Putting all these elements together in the standard HTML containers gives a minimal Web page of the following form:

    <HTML> <TITLE> Graphics </TITLE>
    <BODY> <H1> Graphics in Web pages </H1> This is text <IMG SRC="one.jpg"> at the bottom. This changes the image size <IMG SRC="one.jpg" ALIGN=MIDDLE HEIGHT=50 WIDTH=50> This copes with text-only browsers by displaying the words <IMG SRC="one.jpg" ALIGN=TOP HEIGHT=50 WIDTH=150 ALT="a .jpg image"> </BODY> </HTML>

    HTTP 1.1

    The hypertext transfer protocol (HTTP) is so fundamental to the World Wide Web - it mediates every connection between Web client and server - that it is all but invisible apart from its presence in URLs. And yet ironically this crucial element of the Internet is responsible for much of the current slowdown in response.

    Of course, this is hardly the fault of Tim Berners-Lee, who drew up HTTP. It does what it was designed to do very efficiently: move packets across the network. But the pattern of use found today is so far removed from the original intention that HTTP is now beginning to show the strain. HTTP 1.1 is designed to address some of these problems.

    For example, every URL that is retrieved opens a completely new connection from the client to the server, even if it refers to the same Internet server. Setting up and tearing down these connections is wasteful, so the revised HTTP specification allows what are called persistent connections.

    Once a connection is made, data can be buffered up so that it is sent in the most efficient way possible, leading to faster throughput of information and faster Web page downloads.
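
    To make the idea of a persistent connection concrete, here is a minimal sketch using only Python's standard library. It starts a small throwaway server on the local machine (an assumption made purely so that the example is self-contained) and fetches several URLs over a single connection. The names EchoHandler and fetch_over_one_connection are invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the connection stay open between requests.
    protocol_version = "HTTP/1.1"
    client_ports = []  # source port of every request, for illustration

    def do_GET(self):
        EchoHandler.client_ports.append(self.client_address[1])
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # A persistent connection needs an explicit length so the client
        # knows where one response ends and the next begins.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_over_one_connection(paths):
    """Fetch each path in turn, reusing one client connection throughout."""
    EchoHandler.client_ports = []
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            results.append((response.status, response.read().decode("ascii")))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results, set(EchoHandler.client_ports)


if __name__ == "__main__":
    results, ports = fetch_over_one_connection(["/one.html", "/two.html"])
    print(results)
    print("distinct client connections used:", len(ports))
```

    If every request arrives from the same client port, the same connection has been reused rather than set up and torn down each time.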

    Another advance over the original HTTP is to do with waiting for replies. Under version 1.0, a client would normally wait for the response to each request before sending the next. With HTTP 1.1, it can send subsequent requests over the same connection without waiting for the response to the first, a process called pipelining.

    Although apparently minor changes, these refinements should go some way to speeding up the Web and the Internet, but only once HTTP 1.1 is widely implemented in software.

    HyperText Transfer Protocol

    Anyone who uses World Wide Web browsers such as Netscape or Mosaic soon becomes familiar with the four letters 'http'. All Web site addresses (URLs, or Uniform Resource Locators) begin with them, as in http://www.ukonline.co.uk/. These initial letters change according to the type of service you access.

    So, for example, if you want to connect to an FTP site with your Web browser, you would enter something like ftp://ftp.microsoft.com/.

    In this case 'ftp' stands for File Transfer Protocol, and refers to the set of rules (or protocols) used by the two computers in question - your client and the distant server you wish to access - to negotiate the admission to the FTP site in the first place and the subsequent transfer of files from there to your machine.

    In a precisely analogous way, 'http' stands for HyperText Transfer Protocol, and defines how your Web client program contacts the Web server whose address comes after it (the www.ukonline.co.uk part of http://www.ukonline.co.uk/), and how data is passed between the two.
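
    The split between the protocol and the server address can be seen by taking a URL apart. The following sketch uses Python's standard urllib.parse module; the helper name describe is invented for this illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the protocol (scheme) and the server name found in a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname


if __name__ == "__main__":
    for url in ("http://www.ukonline.co.uk/", "ftp://ftp.microsoft.com/"):
        scheme, host = describe(url)
        print(f"{url}: protocol={scheme}, server={host}")
```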

    A common source of confusion in this context is the almost universal use of HTML - HyperText Mark-up Language - alongside HTTP. In fact they serve quite different functions.

    HTTP is about how contact is made and maintained between the two sites - how messages are sent, with no reference to what is sent. HTML, on the other hand, defines the structure of the hypertext documents that are retrieved - what the messages are - with no reference to how they are passed around.

    Hypervideo

    One of the key elements of the World Wide Web is the hypertext link. Although this idea was not new when the Web offered it, it was the Web that made it easy to implement and that demonstrated so convincingly how useful the ability to jump from document to document could be.

    The XLink application of XML extends the range of possibilities as far as hypertext links are concerned. But it still only applies to static objects in a document and does not attempt to embed hyperlinks in elements that exist in time. This may be remedied by IBM's Chinese research laboratories, which have devised a hypervideo system with the rather unfortunate name of Hot Video, and have written creation and viewing tools for it. These allow hot spots to be embedded within popular video formats such as AVI, MPG and MOV.

    Users can play Hot Video content from local discs, CD-ROMs and servers as well as view content over the Internet using a special Hot Video plug-in for Netscape Navigator or Internet Explorer.

    HVML

    Even though many high-speed corporate connections are now being installed, the telephone is still a key element of the Internet's infrastructure. The Internet is also wedded to the telephone through Internet phones that allow conventional voice messages to be digitised and sent over global wiring for the cost of a local call.

    But there remains another important facet of the Internet-telephone interface that has so far remained untouched: accessing Web pages, say, using an ordinary phone without a computer.

    The idea is that the most important elements of hypertext markup language (HTML) pages are text and links; the former can clearly be converted into audio files, while selecting hyperlinks is no different from the already omnipresent touch-tone menus employed by many voicemail systems.

    All that is required is some standard way of converting information on HTML pages into output that can be channelled down the phone. A company called Stylus has extended work in the area of interactive voice response systems (such as touch-tone banking and fax-on-demand) to come up with the hypervoice markup language (HVML).

    These are extra tags added within an HTML document, transparent to most browsers, but enabling suitable software to take the content they refer to and make it available to users calling in with an ordinary phone, but no computer, to special servers.

    The HVML tags include ones for playing a pre-recorded file, prompting a caller for touch-tone input, speaking information from the HTML page, and following a hypertext link.

    ICQ

    One of the ironies of the Internet is that a medium with potentially tens of millions of simultaneous users has almost no sense of this huge online community. The two main forms of communication - E-mail and Usenet newsgroups - are not real-time, and Internet Relay Chat, which is real-time, is unknown in mainstream business circles.

    Various programs have allowed real-time conversations across the Internet, for example WS_Chat and Wintalk. Both of these allowed messages to be typed into a small window which would then appear on the recipient's screen. But before you could use such software, you needed to know both the recipient's Internet address and when they were online. These requirements meant that chat programs remained niche products with small user bases.

    Messenger
    AOL got round both problems with its AOL Instant Messenger service (see www.aol.com/aim/home.html), since user names could replace Internet addresses, and the central AOL servers could maintain lists of who was online at any given time. The disadvantage, of course, was that only members of AOL's online service could use this system.

    More recently, Netscape offered AOL's approach to anyone on the Internet using its browsers (http://home.netscape.com/newsref/pr/newsrelease511.html), but in the meantime an Israeli start-up company called Mirabilis has colonised this sector in the most dramatic manner conceivable. The firm claims that its ICQ program has more than 13 million users, with an astonishing 60,000 signing up every day. ICQ is available free for most platforms, including Java (see www.icq.com/productsdesc.html); there is also a groupware version for intranet use (see www.icq.com/groupware/). A cluttered home page (at www.icq.com/icqhomepage.html) gives some idea of the amazing culture that has grown up around this program.

    After you have downloaded the program and signed up to the service, you can create contact lists of friends and colleagues (sometimes called "buddy lists") with whom you wish to chat. The ICQ server monitors who is online when you are, and indicates this in your ICQ window. It is then a simple matter to initiate a chat session with one of your contacts, through further windows that appear (including one that shows the entire history of the conversation, useful for keeping copies). You can also send files, URLs, contact lists and E-mail. ICQ offers an E-mail service as well as free home pages.
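
    The principle of a central server tracking who is online can be illustrated with a toy model in Python. This is not ICQ's actual protocol or software, which is not described here; the class ToyPresenceServer and its methods are invented purely to show the idea of matching a contact list against the set of users currently signed on.

```python
class ToyPresenceServer:
    """A toy model of a central server that tracks who is online."""

    def __init__(self):
        self.online = set()   # users currently signed on
        self.contacts = {}    # user -> list of names on their contact list

    def set_contacts(self, user, names):
        self.contacts[user] = list(names)

    def sign_on(self, user):
        self.online.add(user)

    def sign_off(self, user):
        self.online.discard(user)

    def online_contacts(self, user):
        """Return the user's contacts who are online, in list order."""
        return [name for name in self.contacts.get(user, []) if name in self.online]


if __name__ == "__main__":
    server = ToyPresenceServer()
    server.set_contacts("jane", ["bob", "carol", "dave"])
    server.sign_on("jane")
    server.sign_on("carol")
    print(server.online_contacts("jane"))
```

    Because the server holds both the contact lists and the sign-on state, users need to know neither each other's Internet addresses nor when the other party is connected.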

    It is possible to search ICQ servers for particular users, or to investigate general groups of people with common interests. One powerful extension of ICQ is that it can be linked with Net telephony: that is, you can use ICQ to alert you when certain users are online, and then call them up using one of the many IP telephony products.

    Eminently usable
    As the level of users suggests, ICQ is something of a cult program. Despite this, and its rather gaudy user interface, it is eminently usable for business purposes.

    For example, with sales people on the road ICQ would be a valuable way for managers and colleagues to find out when they are logged on - and perhaps contact them cheaply using Net telephony, even if they are abroad. You can also create small closed groups of users, called Virtual Private Online Networks - see http://www.icq.com/create-network.html. ICQ will work behind corporate firewalls, including those using proxy systems, as the excellent resources for the subject at www.icq.com/firewall/firewallhelp.html explain.

    Alongside its vast user base, ICQ's most notable feature is that it provides Mirabilis with no revenue stream. The software is provided as a rather curious time-limited free beta, and there is currently no advertising. Quite why this company without profits or income, and with huge and constantly growing costs, was recently bought by no less than AOL - which already has its own buddy-list service - for $287m (£179m) is discussed below.

    Portal mania grips the Net

    The race is on as Web companies aim to bag the most users through portal sites.
    America Online's (AOL) decision to pay $287m (£179m) for the company Mirabilis (see press release at www.icq.com/press_release26.html), described above, was not some desperate attempt to take its rival out of the chat market. AOL was really buying the ICQ program's claimed 13 million users.

    Aside from their number, what made these so attractive was their geographical distribution: more than 60% are outside the US. With the purchase of Mirabilis, AOL was increasing its global Internet reach dramatically. That reach is important because the race to generate concentrations of users - through the creation of portal sites - is hotting up.

    Even though the costs of adding new content and services, and of buying complementary or rival sites, are enormous, all the major players have plunged into this cyber-maelstrom almost without worrying about the consequences.

    Valuations
    They are able to do this in part because the US stock market is placing truly incredible valuations on these key sites. Even though very young, with few tangible assets and never having turned in a significant profit, portals are now valued in billions of dollars.

    For example, according to the stock market (June 1998), Yahoo is worth about $6.85bn, Excite $1.94bn, Lycos $1.27bn, and Infoseek $950m. In fact portals are now so attractive to investors that Netscape, the original Internet company, is trying to redefine itself in part as a portal.

    This aping goes further. The new version of Netscape's Netcenter home page (http://home.netscape.com/) looks almost identical to Yahoo (www.yahoo.com/), Lycos (www-english.lycos.com/), Snap (www.snap.com/), and AOL (www.aol.com/).

    Microsoft, too, is trying to join the portal club with its new Start service, although the beta version (at http://home.microsoft.com/) doesn't adopt the Yahoo approach.

    This portal frenzy, and the accompanying stock market madness, are driven by two important trends: one happening now, and one that most people involved with the Internet business believe will occur at some point in the future.

    The first trend is the move by old media - press, TV, cable, and film - to get into new media as quickly as possible. There seems to have been a sudden realisation that the Net is not just a fad, but will revolutionise all aspects of every medium.

    Established media companies are only too conscious that they are being wrong-footed by smaller, swifter competitors, and are terrified that they have left things too late. As a result, there is a rush to create major alliances with the main online players - the portals.

    Perhaps the most dramatic meeting of old and new media was the stake Disney took in Infoseek (see http://info.infoseek.com/doc/PressReleases/magic/press_release.html).

    Fledgling
    Other deals have already taken place - for example, NBC with the fledgling portal Snap (see www.snap.com/main/help/item/0,11,-8032,00.html) - and many more will doubtless follow.

    The other, future trend is in many ways the justification both for the crazy valuations placed on these portals, and for the rash of morganatic media marriages. Most people believe E-commerce will be huge, and that portals will be one of the main channels for such purchases.

    Certainly the pace at which E-commerce initiatives are coming through is encouraging, and the technical issues such as security have largely been addressed.

    But it is hard not to see current stock market valuations as excessive, even against this positive background. In many cases, flotations of Net firms look to be highly opportunistic, riding a wave of investor enthusiasm that will surely evaporate as losses mount and huge returns prove slow in arriving.

    The danger is that there will then follow a backlash, with massive drops in share prices, a knock-on effect on the stock exchanges, and a general disenchantment with all things Internet.

    Infoseek

    As portals have moved to the heart of business Internet activity, it has mainly been the big names - Yahoo, Excite, Netcenter etc - that have made the news.

    One company notable by its absence from all this frenzy is Infoseek which, on the basis of its visitor levels and turnover, must be considered something of a second-division player.

    But Infoseek has been up to some interesting things, as a visit to the old URL soon shows. This takes you to the slightly modified address - and a very different site from previous versions.

    This new portal with the simple and memorable name of Go is a highly ambitious reworking of the increasingly marginalised Infoseek search engine and is backed by Disney.

    Now, alongside the basic and advanced search capabilities (the latter with many powerful search options), top news stories, the weather and several other typical portal features, there are the by-now familiar channels (called centres here).

    Among the 18 channels available, there are ones devoted to business, computers, the Internet and news.

    Go also offers several novelties. Each channel/centre adopts a kind of subsidiary portal design, and sports three (virtual) tabbed pages, one of which is visible. The other two provide parallel information, specifically Web sites that are relevant to the sub-portal area currently viewed, and related community chat areas.

    But the real significance of Go lies not in its interesting and generally excellent design (the underlying HTML code is well worth studying as an example of how to create complex but clear Web pages using tables). Rather, it is the business issues surrounding the appearance of Go that are key.

    Go is a joint venture of Infoseek with Disney. Disney currently owns 43% of Infoseek, and has options to acquire a majority stake.

    As part of the deal, Disney handed over to Infoseek another Internet company, Starwave, which it had bought in 1998. Starwave is one of the many companies set up by Paul Allen, the other Microsoft founder; it was founded in 1993 and has created several sites in conjunction with more traditional media companies.

    Among the most important of these is the ABCNews site, one of the big news offerings on the Web, and widely used by other sites as the featured news service. Netscape's Netcenter is one example, which means that there is now a growing relationship between AOL - Netscape's imminent owner - and Disney, through its stake in Infoseek.

    The Go portal represents the first in a major new wave of consolidation in the Internet industry. It is particularly important because it is a classic marriage of old media (with plenty of assets, both financial and intellectual) and new media (with plenty of Net experience).

    Both sides desire each other desperately: old media needs new media so as not to get left behind in the stampede to the Web, and is itself needed because its financial health is rather better grounded than that of the rich but fragile Net upstarts.

    Further proof that a major re-structuring is under way is provided by the acquisition of Excite by @Home.

    The latter is a relatively new company providing fast Internet access via cable. It is partly owned by TCI, which in turn is being acquired by AT&T - another old-time behemoth very keen to re-invent itself in the image of the Internet.

    The interesting question is now how the intertwined relationships among AOL, Amazon, Yahoo, Lycos, Microsoft, IBM, News Corporation, Bertelsmann and all the other new and old media giants (plus a few telecoms ones) will evolve into distinct, competing blocs over the next 12 months.

    One thing is certain: it's going to be a busy year for the portals, which will undoubtedly form the focus of these new forces.

    Inline images

    The original World Wide Web browsers were purely text-based, and offered a basic means for following hypertext links (one such browser, Lynx, is still encountered today). Perhaps the crucial advance beyond those path-breaking early versions was the addition of graphical elements.

    There are actually two quite distinct types of images found in such multimedia-enabled programs. The first are those images that are displayed as separate entities; more interesting - not least from a design viewpoint - are graphical files that are integrated into the page, so-called inline images.

    Although these are of necessity sent separately from the ASCII HTML file that defines the overall structure of the Web page, they seem to be embedded within it (though exactly how they will appear on the screen depends critically on the capabilities of the browser, as is the case with nearly all attributes of a World Wide Web creation).

    Browsers like Netscape or Mosaic that are able to cope with such inline graphics process them first (since they are almost always sent as a compressed .gif or .jpg file) before displaying them at the appropriate point in the on-screen representation of the HTML document.

    Other graphical formats (for example .bmp or .tif) require the specification of a helper application that can similarly process and display the file. However, in this case the image is placed in a separate window, and thus is not integrated into the fabric of the Web page. One of the latest developments in the Web arena has been the extension of the inline idea to other, more complex file formats such as VRML.

    Intermercials

    Web advertising remains the main way of generating money on the Internet. Given this importance, it is not surprising that advertising agencies are starting to create variations on this basic theme. The simplest approach is to place advertising messages at strategic points in a document, usually the top and sometimes the bottom. These banner ads normally take the form of rectangular areas with static messages and links to advertisers' sites or short animations.

    But visitors to Web sites can pass quickly over these formats. Indeed, the commoner such banner ads become, the more reflexive is the user's act of scrolling down a page to reach the editorial. As a result, a different approach is being developed. Rather than placing advertisements within the space of a Web document being viewed - in the page - the idea is to place them in the dimension of time.

    That is, as the visitor passes through a Web site, advertisements interpose themselves between the editorial pages. The viewer has to wait a pre-defined number of seconds before the ad gives way to the next editorial page. Because these advertisements take place in the self-created gaps of the viewing experience - the interstices - they are generally known as interstitials, or even Intermercials (an abbreviation of interstitial commercials). Clearly this forced viewing is good news for advertisers. But equally it goes against the whole dynamic of the Web, and if abused may prove counter-productive.

    Internet Connections

    The simplest Internet connection is that provided by E-mail. On its own E-mail is restrictive. To be "on the net" you need a true Internet address. These are expressed numerically, with an equivalent form expressed in characters, much as in an E-mail address. For example, the abc.industries.co.uk part of jane.smith@abc.industries.co.uk might correspond to a numeric address made up of four sets of numbers: 199.100.128.64.
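
    The four sets of numbers in an address such as 199.100.128.64 are simply a readable way of writing a single 32-bit number, one byte at a time. A short sketch using Python's standard ipaddress module shows the relationship; the helper name dotted_quad_parts is invented for this illustration.

```python
import ipaddress


def dotted_quad_parts(address):
    """Return the four byte values of an address and its 32-bit integer form."""
    addr = ipaddress.IPv4Address(address)
    return list(addr.packed), int(addr)


if __name__ == "__main__":
    parts, number = dotted_quad_parts("199.100.128.64")
    print(parts)
    print(number)
```

    Each of the four values must lie between 0 and 255; the module raises an error for anything else.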

    You can also use other Internet facilities, such as FTP (file transfer protocol), gopher, archie, telnet and World Wide Web, through dial-up online services.

    Many companies set up a central server to act as a gateway. To use the Internet in this way the company requires some kind of full Internet connection and the system administrator has to set up accounts for end-users within the organisation.

    Full control - and your own Internet address with its numerical equivalent - is available when you have access to some kind of transmission control protocol/Internet protocol (TCP/IP) link, which can be provided over a company network, or over a dial-up line using a modem. TCP/IP underpins the whole Internet and is basically a set of standards that determine how information is transmitted and received. It is also the name of the implementations of those standards.

    If you have a TCP/IP link to your desktop computer, you can then be truly "on" the Internet, have full access to all of its facilities, and can run advanced Internet programs that understand the TCP/IP protocols on your machine.

    When you obtain a full Internet connection, you will be allocated one or more numbers that make up one form of your Internet address - something along the lines of 129.34.139.4 (this is one of IBM's many addresses). A friendlier form, along the lines of bloggs.co.uk, is also provided. These names are registered centrally to catch potential abuses, such as intentionally registering someone else's name. It is possible for UK businesses to register names in the US (to give an address of bloggs.com). This means that although the UK committee will probably stop your competitors from filching your company name in the UK, there may be nothing to stop them doing so in the US, where the central registry will probably never have heard of either company. The thorny subject of trademarks and Internet names remains unresolved (see http://www.fenwick.com/newpub/sma-trade.html for a useful summary of the situation in the US).

    At the main InterNIC name registry, free registration has been replaced by a $100 charge for two years (see http://rs0.internic.net/announcements/fee-policy.html). In the UK the former gentleman's club has been replaced by a non-profit making organisation called Nominet who now register the .uk domain. The charge is £100 for two years' registration. See http://www.nic.uk/press.html for details. A new company called Nomin Nation is offering a two year registration using the uk.com sub-domain (instead of .co.uk, .plc.uk or .ltd.uk) for £45. Details at http://www.nomination.uk.com/. It may be possible in the future to create a domain, e.g. IBM could use .ibm instead of .com. Individual brands could also be registered. To join the debate and mailing list, send the message subscribe to the address newdom-request@iiia.org. Previous postings are held as a series of linked Web pages at http://www.iiia.org/lists/newdom/. Other useful documents on this topic are:-

    http://www.iiia.org/itld/rfcs.html which contains all the RFCs dealing with Internet names

    http://www.iiia.org/draft/draft-postel-ianaitld-admin-01.txt which presents the overall framework for extensions to the current registries.

    http://www.iiia.org/draft/draft-higgs-tld-cat-01.txt offers some proposals on what form the new domains may take based upon the International Trademark Schedule of Goods. Examples include .rope, .clothes, .handtool, .meat, and .cyborg: the last for manufacturers of surgical and medical equipment, apparently.

    Registration in the UK is about £100 and can be done through the main Internet providers:- Demon: sales@demon.net, EUnet GB: sales@britain.eu.net and Pipex: sales@pipex.com.

    Internet Message Access Protocol

    In the section POP3 v. SMTP the distinction between Simple Mail Transfer Protocol (SMTP) and Post Office Protocol 3 (POP3) is discussed. The former is used for the sending of E-mail messages across the Internet, which pass from system to system until they arrive at their end-destinations, while the latter is designed to allow users to retrieve E-mail from a POP3 server, which acts as a kind of poste restante.

    POP3 therefore allows you to download your E-mail at any time, and from anywhere, whereas SMTP can only send to a given address, and delivers as soon as it can.

    Although POP3 is a convenient adjunct to SMTP, it is very limited. All you can really do is download a message.

    More complex manipulation of the E-mail held at the electronic post office is not possible. For this reason the Internet Message Access Protocol (IMAP) was developed. It builds on the ideas of POP3, but offers a far wider range of features. These include the ability to create, delete, and rename mailboxes held on the IMAP server; check for new messages; permanently remove old ones; search through the messages; and fetch message attributes, texts or portions of texts.

    Messages are accessed through IMAP by the use of numbers. These are either message sequence numbers (where each message's relative position to the first message is used) or other, unique identifiers that have been assigned.
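    The difference between the two kinds of number can be shown with a toy model of a mailbox. This is not an IMAP client, and the class and method names are invented for the illustration; it only assumes that sequence numbers are positions counted from 1, while unique identifiers stay attached to a message once assigned.

```python
class Mailbox:
    """Toy model of a server-side mailbox (not a real IMAP implementation)."""

    def __init__(self):
        self._messages = []   # (uid, text) pairs in arrival order
        self._next_uid = 1

    def append(self, text):
        """Store a message and return the unique identifier assigned to it."""
        uid = self._next_uid
        self._next_uid += 1
        self._messages.append((uid, text))
        return uid

    def fetch_by_sequence(self, seq):
        """Sequence numbers are positions relative to the first message."""
        return self._messages[seq - 1][1]

    def fetch_by_uid(self, uid):
        for message_uid, text in self._messages:
            if message_uid == uid:
                return text
        raise KeyError(uid)

    def delete_by_uid(self, uid):
        """Remove a message; every later message moves up one position."""
        self._messages = [(u, t) for u, t in self._messages if u != uid]


box = Mailbox()
first = box.append("Invoice")
second = box.append("Meeting")
third = box.append("Lunch?")

box.delete_by_uid(first)   # "Meeting" is now sequence number 1
```

    After the deletion, the message that used to be number 2 is fetched as number 1, but its unique identifier is unchanged.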

    The basic document for IMAP is RFC1730.

    Internet Relay Chat

    Internet Relay Chat, or IRC, is one of those areas of the Internet that provoke strong feelings. For some, it represents the point at which the millions of Internet users can come together in the most unmediated way; for others, it is the biggest electronic waste of time ever invented, as proved by the thousands of helpless IRCers who spend most of their waking hours inhabiting this twilight world.

    The idea (first thought up in Finland) is simple: to use the Internet not just for communication from one person to another, but as a way of passing typed messages in real time among a group of people. This is achieved using the standard client-server architecture: an IRC server relays the messages sent by IRC clients to the other participants, who may be anywhere on the Internet and connected via other IRC servers (which send and receive the messages among themselves before passing them on to the users).

    Obviously it would not be feasible to pass all messages to all users: instead, IRC is divided up into channels. Each channel has a nominal topic, and is controlled by the person who creates it (new channels are created very easily). Typically there will be a few dozen people joined to a channel, with several thousand simultaneous channels.
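    The relaying of messages by channel can be sketched in a few lines. This is a toy model with invented names, not the IRC protocol itself: it uses a single server held in memory, whereas real IRC servers also pass messages among themselves.

```python
from collections import defaultdict


class ToyIrcServer:
    """Toy relay: delivers each message to the other members of one channel."""

    def __init__(self):
        self.channels = defaultdict(set)   # channel name -> set of nicknames
        self.inboxes = defaultdict(list)   # nickname -> messages received

    def join(self, nick, channel):
        # Joining a channel that does not yet exist creates it.
        self.channels[channel].add(nick)

    def say(self, nick, channel, text):
        for member in self.channels[channel]:
            if member != nick:             # do not echo back to the sender
                self.inboxes[member].append((channel, nick, text))


server = ToyIrcServer()
server.join("anna", "#linux")
server.join("ben", "#linux")
server.join("carl", "#chess")
server.say("anna", "#linux", "hello")
```

    Only ben receives the message: carl is connected to the same server but has joined a different channel.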

    Just to complicate matters, there are not one but two IRC systems, both working in exactly the same way, but consisting of separate groups of IRC servers. The main one is called EFNet, and is larger but renowned for its political divisions and consequent temporary fragmentation; the other is called the Undernet and at least theoretically offers a slightly stricter operational framework.

    Internet Users

    For a market as important as the Internet, remarkably little is known about its users. While it is true that there are an increasing number of surveys and market research reports, it is still quite hard to draw all these together to form an overall picture of the online world. This makes the CyberAtlas site all the more valuable. It has been put together by I/Pro, a company specialising in the field of Web measurement. Aside from the odd discreet plug of this company's own services, the site is pleasantly free of advertising.

    Its basic navigation metaphor is extremely simple, using points of the compass on the image of the opening page (though there seems to be no route in for text-only browsers - a rather basic blunder to make). The site itself is very shallow structurally, so there is no danger of getting lost in ever-deeper levels.

    Most of the links are self-explanatory. So News leads to the latest news in the field of Web measurement, while Market Size pulls together figures from many sources to provide a fascinating overall view of this area. Other obvious links are Demographics, Geographics and Usage Patterns.

    There are market research figures for browsers, modems, and servers. Other ways of looking at the online world are in terms of advertising, electronic commerce, intranet activity, and overall site building costs. One of the best links is to other related resources. It is also worth emphasising that all of the above-mentioned pages have links through to the original sources for their figures, allowing further research to be carried out at the original site.

    ISDN

    As most people know, to connect a computer to an ordinary telephone line requires a modem. This stands for MOdulator/DEModulator, and refers to the process of converting the computer's digital 1s and 0s into audio tones (the modulation) that can be sent across the telephone network and then decoded (demodulated) at the other end. Clearly this is an inefficient process, and ideally you would like to be able to send digital signals directly across a phone connection.

    This is precisely what the ISDN (Integrated Services Digital Network) system offers. ISDN allows you to send digital data across special lines to other computers without the need for conversion. Aside from obviating this need at both ends, another benefit is speed: ISDN lines usually run at 64Kbps, more than twice as fast as the raw speed of a V.34 modem.

    Moreover, ISDN lines typically come in pairs that can be combined to double the throughput to 128Kbps, although most service providers are not able to offer this service.

    ISDN is not a new technology; it is one that has hovered on the sidelines of the computing world for many years. It was originally developed with a view to allowing a wide range of high-throughput services - hence its name. However, it is increasingly attractive as a means of offering fast Internet access.

    More and more suppliers in the US are now starting ISDN services, with those in the UK following suit.

    The biggest barrier to the take-up of ISDN remains BT's high cost of installation for the lines. In this respect, the UK is well behind other European countries such as Germany where installation costs have fallen, leading to a sharp rise in the number of ISDN lines installed. Until recently, BT's charge for installing an ISDN line with 2B+D channels was £400, while annual rental was £336. This was far higher than in countries like France and Germany (both around £80 for installation and lower than BT for rental), and has doubtless been the main reason for the slow take-up of ISDN in this country compared to those countries.

    BT re-organised the price structure in September 1996. Although this reduced the first-year cost by over £100, the customer has to pay in advance, as the new price includes a number of calls and standing charges in the start-up cost (ISDN calls are charged at the same rate as analogue calls, as they are in most European countries). It is clear that BT does not want to offer this service, as it makes it very difficult to obtain pricing information.

    Then Oftel intervened. There was some hope that it would force BT to offer real reductions rather than the window-dressing of the new package. But instead, the revised charges that emerged from consultations between BT and Oftel actually increased the overall charges in the case of the frequent user - the only novelty of BT's pricing, and the one most likely to appeal to Internet users, who typically spend more time logged on than they do on voice calls.

    As of October 1996 the prices for ISDN-2 (2 lines) are as follows:-

    Connection charge: £199

    Quarterly charge (for the first 24 months): £132.75, which includes £105 worth of calls

    Annual standing charge (after 24 months): £230

    Calls, locally, nationally and internationally, are charged at the same rate as standard lines. International data connections are priced differently, e.g. (per minute):-

    Destination: day-time, evenings, week-end

    Hong Kong: 86.5p, 69.3p, 63.8p

    USA: 25.2p, 23.9p, 22.2p

    In terms of performance this compares favourably with leased line rates, which start at 64kbps. However, a leased line connection offers permanent connectivity, whereas ISDN (and POTS, the plain old telephone service) is on-demand. The dream of all Internet users must be to possess either a T1 or T3 line. Sometimes the letter E is used in Europe instead of T, with slightly different values: E1 is 2Mb/s and E3 34Mb/s while T1 offers 1.5Mb/s and T3 45Mb/s. Understandably, E3/T3 connections are rare at the moment, although probably there will come a time when everyone will have one.

    If BT is the continuing bad news about real-life ISDN for business Internet users, fortunately just about everything else is good news. For example, Demon Internet offers ISDN connectivity to the Internet for the same basic £10 a month it charges for its ordinary national dial-up service - very good value compared to many previous (and some current) ISDN pricing which ran to thousands of pounds a year. And Demon Internet is not the cheapest: Global Internet offers nationwide ISDN access to the Internet for just £7.50 a month. Moreover, even consumer-oriented services like UK Online offer country-wide ISDN, which shows how general ISDN is becoming.

    On the software front the biggest breakthrough, in Europe at least, has been the arrival of the German-led cross-platform standard Common ISDN API - CAPI - now at version 2.0. Basically it offers a standard software interface that lets software control ISDN hardware. This allows generic comms programs such as Procomm Plus to support any compliant ISDN hardware and drivers. An older standard sometimes still encountered is WinISDN.

    As far as hardware is concerned, there are three main ways of connecting to the Internet using an ISDN connection: external devices - usually known as Terminal Adapters or TAs - hooked up to a serial or parallel port; routers; and internal cards, the cheapest route.

    Two years ago, ISDN cards for PCs cost around £1000; today there are several for just a few hundred pounds. Perhaps not surprisingly, it is a German manufacturer, Teles, that offers about the cheapest product; its plug and play card supports all the main European standards, comes with a wide range of useful software, and costs just £169. I was able to connect at 64Kbit/s to the three ISPs mentioned above without problems with this card. Another European company, Chase Research, has launched the new NetChaser TA to complement its ISDN-PC internal card.

    Worth mentioning is Microsoft's growing support for ISDN, which is likely to push along these emerging standards even faster. It has devoted a whole section of its Web site to this area. As well as downloading ISDN hardware drivers you can even order ISDN connections online - at least you can if you are in North America or France (with Germany to be added soon).

    Why ISDN is (finally) perfect for business

    For many years, the Integrated Services Digital Network (ISDN) telephone technology has been a solution in search of a problem. There are increasing signs that the Internet may well be that problem, and that ISDN can offer the perfect fast and low-cost dial-up solution for companies.

    When ISDN first appeared over a decade ago, it was an exotic beast; it was based on the then strange idea that all information should be transmitted digitally over the telephone network, rather than as smoothly varying analogue signals. Today, when most of the telephone infrastructure is built around digital technology, it is the analogue last leg to your desktop that is the anomaly.

    Indeed, we find ourselves in the crazy situation where digital data from a computer is converted into analogue signals (using a modem) to be transmitted to the local exchange where it is then converted into a digital format for transmission over the telephone network. ISDN simply cuts out the intervening analogue step - with obvious benefits in efficiency and throughput. Moreover, no new wiring is required.

    The basic kind of ISDN, generally known as Basic Rate Interface, or BRI, offers three distinct channels in this datastream. These are conventionally called 2B+D, and refer to one channel (D) used for signalling, and two bearer channels (2B) which carry the data. The D channel has a capacity of 16 Kbit/s, while each data channel offers 64 Kbit/s. That is, each B channel has over double the throughput of a standard 28.8 Kbit/s V.34 modem. Because the signal is purely digital, there are almost none of the line noise problems typically found with high-speed analogue modems.
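    The arithmetic behind these figures is easily checked. The short sketch below uses only the channel rates quoted above; the file size is a hypothetical example, and the transfer times are ideal figures that ignore protocol overhead and compression (taking 1 Kbit as 1000 bits and 1 Kbyte as 1000 bytes).

```python
B_CHANNEL_KBITS = 64      # each bearer channel, from the text
D_CHANNEL_KBITS = 16      # signalling channel, from the text
V34_MODEM_KBITS = 28.8    # standard V.34 modem, from the text

bri_total = 2 * B_CHANNEL_KBITS + D_CHANNEL_KBITS   # 2B+D
bonded_data = 2 * B_CHANNEL_KBITS                   # both B channels combined


def transfer_seconds(size_kbytes, rate_kbits):
    """Ideal transfer time in seconds for a file of the given size."""
    return size_kbytes * 8 / rate_kbits


FILE_KBYTES = 1000  # hypothetical file size
for label, rate in [("V.34 modem", V34_MODEM_KBITS),
                    ("one B channel", B_CHANNEL_KBITS),
                    ("two B channels", bonded_data)]:
    print(f"{label}: {transfer_seconds(FILE_KBYTES, rate):.1f} s")
```

    The complete 2B+D interface therefore carries 144 Kbit/s, of which at most 128 Kbit/s is available for data.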

    As well as this far greater throughput, the distinguishing characteristic of ISDN is the presence of the D channel. This acts as a control channel, and coupled with the intelligence residing in a computer allows many advanced services to be offered. Perhaps most importantly, the D channel allows ISDN connections to be made almost instantly. Those connecting to ISPs via ordinary dial-up connections will be only too familiar with the long period of negotiation between modems at each end. With ISDN this generally takes less than a second.

    As a consequence, ISDN is perfect for obtaining quick and frequent access to the Internet. The fast connection and disconnection times (known as set-up and tear-down in the ISDN community) mean that it appears as if you have a permanent link, but you pay only for the time that you are connected - the best of both worlds.

    Other tricks that are possible using the D channel and PC intelligence include the combination of the two B channels to give you an effective throughput of 128 Kbit/s; the possibility of making and receiving calls on one B channel while the other is already in use; and the ability to use a series of numbers with a single ISDN line, allocating each to a different preset function (multiple telephone handsets, data, fax etc.).

    Given these and other advantages, ISDN represents a natural progression from ordinary analogue telephone connections. However, until recently, the obstacles to accessing the Internet in this way were so great as to render it impractical. The first problem was that there were many different and largely incompatible implementations of ISDN. By some miracle, Europe has managed to act in concert and devise a standard form of ISDN, known as Euro-ISDN.

    Ironically the US is much further behind in this respect, and ISDN services there have a dizzying array of standards and options. For a full explanation of this and all other aspects of ISDN see the excellent book Using ISDN (£36.99, ISBN 0-7897-0843). There is also an extremely comprehensive set of links relating to ISDN.

    The second problem was the absence of support for ISDN within general software, and the shortage of dedicated programs. The introduction of standards like WinISDN and CAPI has changed that, as has the ISDN support in Windows 95 and NT. Similarly, until recently there was no agreed way to connect to an ISP using ISDN, where today there is Synchronous PPP. Last, and by no means least, prices for ISDN equipment were prohibitive. During 1996 prices have continued to drop dramatically, and there is now no reason why the golden age of ISDN in business should not at last begin.

    Intranet

    The new World Wide Web - inside your company

    One of the most obvious trends during 1996 has been the rise to predominance of the World Wide Web for business purposes, particularly marketing and sales, and there seems little doubt that this growth will continue. But alongside the expansion in the use of these public sites - that is, those that are expressly designed to be accessed by anyone on the Internet - there is a new and interesting use of the Web purely internally within a company, often with no direct links to the outside world: Intranets.

    At first this might seem a strange thing to do, since part of the World Wide Web's power and appeal is that it is so effortlessly worldwide: anyone can access a public site from anywhere. But companies are beginning to realise that the Web is not just a trendy alternative to conventional marketing media, but a complete information delivery system. Moreover, it is one that possesses a number of advantages over proprietary solutions that attempt to offer the same facilities.

    For example, the Web is an open system that is completely independent of platform. You can mix and match any operating system and any hardware that supports the TCP/IP protocols underlying the Internet (and hence WWW). Similarly, it is completely scalable: you can start from a minimalist Web where the HTML files reside on the same machine as the Web browser and progress incrementally to millions of connected machines and thousands of distributed databases - as the Internet itself proves.

    Another advantage of internal Web systems over proprietary groupware and executive information systems (EIS) is cost: even Netscape costs only $39 for the supported commercial version, and there are other good browsers that are free. Ease-of-use is also a major benefit: few people have any difficulty using Web browsers with little or no training; moreover, the nature of the hypertext systems means that navigation is truly a point-and-click affair, and that multimedia elements are integral and active components rather than redundant frills tacked-on afterwards.

    The flexibility that has made the World Wide Web so popular for commercial applications means that it can be applied to almost any department and to any task. An obvious application is internal communications, replacing telephone directories, company organisation charts, corporate newsletters, annual reports, general circulation memos, noticeboards and job postings. The seamless integration of Web clients and servers hanging off intercontinental networks means that multinational companies can bring together staff in a way hitherto impossible.

    Alongside company-wide Webs, there are more local applications. Product development teams can set up stores of information that are shared among the relevant participants, ensuring that the appropriate parties are informed in a timely manner and working together efficiently. Internal Webs are an ideal way of gathering and distributing all kinds of marketing information, and allow physically separated departments to work together on common projects (which can involve graphics, audio files for radio work and even complete videos for television or cinema).

    Sales teams can similarly obtain up-to-the-minute information on company products and services, pricing, competitive data and client records, and access these at any time of the day or night when out on the road by dialling into the corporate network using TCP/IP protocols and a browser on their portable. Financial departments can use Web resources for quickly distributing time-sensitive information such as key indicators or budget allocations and obtaining managers' regular forecasts for analysis and consolidation. The new security features of Web servers mean that even the most confidential information can now be sent safely over open corporate networks.

    In fact, combined with standard Internet e-mail for one-to-one communication and newsgroup servers for many-to-many discussion groups, there is little that internal Webs and ancillary Internet software cannot now accomplish in a way that is far easier and certainly much cheaper than current solutions involving groupware or EIS.

    'Intranet'

    In a way, the name given to the Internet is highly confusing. It derives from the part of the TCP/IP suite of protocols that define its operation: IP stands for Internet Protocol, and refers to the way in which data packets are routed between networks.

    In fact, before the Internet, internets were nothing special, and despite the overwhelming importance of the Internet, internets still perform vital functions - though they probably tend not to be called internets for fear of confusion.

    Almost the opposite has happened with the word 'intranet'. The term is a neat adaptation of the word 'Internet' to apply to TCP/IP-based networks that exist entirely within a company - with no external portions. Just as the Internet supports multiple services, so intranets are not just Web-based, but can offer E-mail, FTP, telnet and much more.

    In some ways, intranets can be thought of as hidden zones of the Internet, either entirely disjoint from it, or perhaps attached at one or two points by a firewall that acts as a one-way mirror, allowing intranet users to see out, but no one to see in - in theory, at least.

    Obviously, any company can set up an intranet - or even more than one, if there are several disconnected internal networks running the TCP/IP protocols (if they are connected then they just merge into a larger, single intranet).

    For this reason, intranets should, strictly speaking, have a lower case first letter, since they represent a general class, rather than an upper case, as with the Internet, which is unique, and refers to the global TCP/IP network that started it all - and also gave rise to all this hair-splitting.

    Real-life internal Internets

    For Hewlett-Packard (HP), such an internal Internet system is central to its whole way of working: it has over 1,400 servers, an e-mail system that handles 1.5 million messages a day, private newsgroups and routine internal transfers of files using FTP.

    Among the 200 or so home pages that exist on its internal Web is one set up by the sales office in Seattle. Since this group of employees has responsibility for the major local customer Boeing, the home page lists all those working on the account, not just in Seattle, but in HP offices around the world. There are details of the current projects as well as links out through a firewall to Boeing's home page so that those involved have an up-to-date picture of their customer's activities.

    Another home page is run by HP's European personnel group. Jobs across the whole of European operations are listed, allowing employees easily to keep abreast of promotion opportunities at many sites and departments. One sales force uses a home page for product information, Q&As and news about product updates. There are also a few personal Web documents. More generally, HP uses the internal Web to share organisational charts, mission statements, executive speeches, employee newsletters and the personnel policy manual.

    Although the example of Hewlett-Packard shows how effortlessly Internet technologies can cope with enormous systems (the total monthly volume of data passed over the internal network is 5 trillion bytes), this certainly does not mean that they are only suitable for globe-spanning companies. An equally successful example of how these techniques can be applied comes from the other end of the spectrum.

    Zeneca Engineering is the corporate engineering design and build function of Zeneca, a company born out of ICI by demerger. The Zeneca Engineering Web (ZEW - which of course has its ZEWkeeper) links around 150 users, and is unusual in that it is completely serverless. Instead, a shared network drive is used, and Web pages are pulled in to Netscape simply by opening them as a local file.

    The benefits of this approach are that it is extremely easy to set up and maintain, can be implemented across any local area network (not just one running the TCP/IP protocols), and yet can also contain links out into the external WWW (where full Internet connectivity is provided). The downside is that advanced Web options like forms and clickable images are not available.
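    The serverless approach amounts to nothing more than addressing a page by its location on the shared drive. As a small sketch, the following turns such a location into the file address a browser can open; the drive letter and path are hypothetical and are not taken from the Zeneca system.

```python
from pathlib import PureWindowsPath

# Hypothetical location of the home page on a shared network drive.
home_page = PureWindowsPath("Z:/zew/index.html")

# A browser can open this address directly; no Web server is involved.
home_url = home_page.as_uri()
print(home_url)
```

    On a machine where the drive really exists, passing this address to webbrowser.open() from the standard library would hand it to the default browser.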

    The main application of ZEW is called Corporate Memory, a store of engineering information. It acts as a kind of expert system for non-specialised engineers, but unlike dedicated systems is both easy to set up and use (since it employs the familiar Netscape interface). Possible future plans include evolution into a collection of pointers to information on the 'real' World Wide Web, placed there by engineering equipment suppliers - a transition that is easy to accomplish because of the way HTML elements can be added and deleted incrementally. Other applications include distributing general information bulletins - for example, on corporate IT strategy.

    More case studies of internal Web systems, this time in the US at Eli Lilly and Sandia National Labs, can be found at http://home.netscape.com/comprod/at_work/customer_profiles/index.html, which describes some uses of Netscape products (unsurprisingly, given the URL). As you might expect from a company which is attempting to position itself as an Internet trendsetter, Netscape is well to the fore in evangelising about the use of this technology internally, and has produced an interesting paper on the subject which can be found at http://home.netscape.com/comprod/at-work/white_paper/index.html.

    IP Multicast

    Two noticeable trends on the Internet in recent years have been the upsurge in multimedia traffic and the rise of push services such as PointCast. Both of these place great loads on the Internet's infrastructure. This is largely because of the extremely inefficient way these heavy loads are sent. This, in turn, arises from the use of Internet techniques that were never designed to cope with these kinds of situations.

    For example, there are currently two basic ways of sending information across the Internet. The first is unicast. In this, packets are sent from one server to one client. If more than one client wishes to receive the data, the packets must be retransmitted separately. This is clearly inefficient, especially if services like PointCast are involved: the same information is being sent out over the Internet many millions of times, clogging up the network unnecessarily.

    The main alternative has been broadcast. Here a single transmission from a server is sent to every client out on the Internet: much more efficient for the server, but hideously inefficient for everyone else. The new multicast aims to combine elements of unicast and broadcast. Packets are sent out only once by the server, as with broadcasting, but these are then only forwarded to the network segments that have clients who wish to receive them.

    This is achieved by registering users on sub-networks: if all users on a particular network segment unsubscribe from a multicast service, that part of the Internet will no longer receive the multicast packets, thus saving bandwidth.
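    The saving can be illustrated by counting packets for the three methods. This is a deliberately simplified model rather than a description of the real routing protocols: the function name, the segment names and the subscriber counts are all invented for the example.

```python
def packets_on_network(segments, mode):
    """Count the copies needed to deliver one item of data.

    segments maps a segment name to the number of subscribed clients on it.
    Returns (copies sent by the server, segments that carry the data).
    """
    subscribed = {name: n for name, n in segments.items() if n > 0}
    if mode == "unicast":
        # One separate copy per receiving client.
        return sum(subscribed.values()), len(subscribed)
    if mode == "broadcast":
        # One copy, but every segment has to carry it.
        return 1, len(segments)
    if mode == "multicast":
        # One copy, forwarded only where somebody has subscribed.
        return 1, len(subscribed)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical network: three segments, one of them with no subscribers.
network = {"london": 40, "geneva": 2, "tokyo": 0}
for mode in ("unicast", "broadcast", "multicast"):
    print(mode, packets_on_network(network, mode))
```

    If the two Geneva users unsubscribe, setting that count to zero removes the segment from the multicast figure without any change at the server.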

    Java

    One of the most impressive aspects of the World Wide Web is the fact that you can click on hotspots within a hypertext document to navigate through what is sometimes rather grandly called Webspace. This gives a sense of interactivity that is actually rather misleading.

    For the range of options open to you from a normal Web browser is extremely limited: you can move from document to document, view various files (if you have the right helper programs set up correctly), and enter information in on-screen forms in order to obtain simple customised responses. But in terms of real interactivity - where the page you are viewing responds in real-time in an infinitely-extendible way - this is pretty poor stuff.

    The main problem is that although very easy to use, the HTML language is limited and develops very slowly (even HTML 3 is not yet universally used or even supported). This means that it is not possible to add custom features to a Web page because the current HTML language simply cannot support anything too complex.

    The new HotJava Web browser from Sun gets round this by downloading the extra functionality required for a given page in the form of specially-written helper programs, called applets, as and when they are needed, from the server holding that page. In other words, the Web page automatically supplies any extra features that the browser will require. To initiate these downloads from a Web page requires the addition of a single new HTML tag, <APP>, which is ignored by other browsers unable to use applets; in this way the Web page is still accessible by the rest of the Internet population, even if the full range of its features is not.

    Applets are written in a language called Java, similar in many ways to C++, but with particular properties developed specially for the task of extending Web browsers dynamically. One important issue that its design addresses is that of security: the potential of an automatically downloaded executable to wreak havoc on a system is obviously high, but Sun insists that its multi-level approach solves the problem (see http://java.sun.com/1.0alpha3/doc/misc/SystemSecurityHelp.html for more on this).

    HotJava is available for Sun Solaris 2.3, 2.4 and 2.5 SPARC-based machines, and for Windows NT 3.5 or Windows 95 (see Java's home page at http://java.sun.com/ for downloading instructions). In due course versions will also appear for other platforms such as the Apple Macintosh. Also well worth visiting is the Java repository at http://www.gamelan.com/. JavaWorld magazine is found at http://www.javaworld.com/.

    IBM's "Ultimate Resource for Java Developer's" at http://www.ibm.com/java/ has more than 100,000 "distinct pieces of Java information". Although much is heavily biased towards IBM there is a lot of free and useful software. There are sections devoted to news (http://www.ibm.com/java/news/), standalone applications (http://www.ibm.com/java/apps/applications.html) and applets (http://www.ibm.com/java/apps/applets.html). It is worth trying the Mapuccio applet (http://www.ibm.com/java/apps/showcase-apps.html) that creates maps of Web sites, and of Java hierarchies. Java tools (http://www.ibm.com/java/tools/) also links to Java development kits as well as plugging IBM's VisualAge range. Book lists and white papers can be found at http://www.ibm.com/java/education/.

    A list of Java applets to run and examine can be found at http://java.sun.com/applets/index.html. There are two kinds of Java applets, those designed to run on Netscape and those designed for Sun's own HotJava browser. For non-programmers the site http://www.noware.com.au/index2.htm helps you generate Java applets without coding directly.

    At the moment the Java applets that are available are fairly trivial demos of some of the language's capabilities. They include things like running tickertape displays of live share prices; on-screen animations; three-dimensional objects that can be moved in real-time by simply dragging them on-screen with the mouse pointer; and simulations that evolve as you input data. A good range of applets can be found at http://java.sun.com/1.0alpha3/doc/misc/SystemSecurityHelp.html and http://java.sun.com/contest/results.html (the results of a small competition to develop applets that show the potential of Java).

    The hope of adherents of the Javanese way is that this Web-mediated approach will re-introduce a level playing-field, giving other software manufacturers a second chance to colonise the desktop. Java's methods chime well with the idea of component-based software, built up in a modular fashion. Java also fits in with the rather more fanciful ideas of Internet terminals: supposed sub-$500 (£325) units with no local storage, which pull all their software from the network (Intranet or Internet).

    For companies, Java potentially offers yet more. In-house software projects can be split up into smaller and more manageable elements, delivered over the corporate Intranet to a heterogeneous mix of platforms, and developed incrementally as the need arises. Java itself will allow anyone with C++ skills to create applets, while the associated JavaScript (see http://home.netscape.com/comprod/products/navigator/version_2.0/script/script_info/index.html for an introduction) means that even those with limited programming skills will be able to deploy the full Java applets in advanced interactive Web applications.

    However much companies like Sun, IBM and Netscape might wish it, the success of Java is, of course, no more certain than that of Microsoft's Internet strategy. For example, in the face of the obvious appeal of Java and JavaScript, Microsoft is offering its own component approach using Active-X, together with the new Visual Basic (VB) Script. The latter has the advantage of being supported by more programmers than any other language (more than three million VB developers, claims Microsoft). Against this, Active-X controls (formerly OCXs) have been mostly used on the Windows platforms, whereas Java was designed to be platform-independent from the start.

    Crucial to the success of Java is Netscape and its browser. The 32-bit Windows version of Netscape comes with built-in Java support. Recognising this, even Microsoft has licensed Java for its Internet Explorer browser. Equally, through plug-ins from NCompass (see http://www.excite.sfu.ca/) and Object Power (at http://www.opower.com/) Netscape can support the OCXs that form such an important element of Microsoft's counter-attack.

    Books on Java include Teach Yourself Java in 21 Days (£37.50, ISBN 1-57521-030-4). This is aimed at those who want a step-by-step guide to learning the language. It is written by Laura Lemay, whose book on HTML has been much praised, and Charles Perkins, who writes the last third of the book, covering more technical issues.

    Also worth noting is Active Java (£21.95, ISBN 0-201-40370-6), written by Adam Freeman and Darrel Ince. Although it is rather thin, has a dull layout and lacks a CD-ROM, it has some exceptionally clear explanations which manage to show how Java works without going into complex details.

    The electronic magazines (E-zines) JavaWorld, put out by IDG (http://www.javaworld.com/), and Javology (http://www.magnastar.com/javology/) are rich sources of information and valuable points of reference. Javology is concise, written with a real love for the subject and has a simple but effective design. Everything is available from its opening page: the five or so main news stories, updated weekly, and the regular columns. Back issues are also online.

    Java-L is an independent mailing list devoted to Java. To join, send the message sub java-l YourFirstName YourSecondName to listserver@vm.ege.edu.tr.

    When will Java get down to business?

    Java has been such a constant theme of the Internet and intranet world during the last year that it is easy to get swept along by the enthusiasm of its supporters. But as any hard-headed manager will point out, for all the excitement it has generated, Java's concrete achievements - both in terms of real-life use and commercial products - are rather thin on the ground. One of the important steps made by Java has been the delivery of server-side technology. Originally code-named Jeeves, Sun's Java server indicates how the use of servlets will allow Web pages to be generated on-the-fly using Java technology.

    Although apparently an incremental change, the release of the Java Development Kit 1.1 is actually significant because it updates the basics of the Java language and addresses many of the more fundamental omissions in the original release. For example, the latest version of Java now offers Java Database Connectivity (JDBC) for hooking up to SQL databases using Java - indispensable if Java is to be taken seriously within the enterprise.
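
    To give a feel for what hooking up to an SQL database through JDBC looks like, here is a minimal sketch. The database URL, the customers table and its name and region columns are all hypothetical, invented purely for illustration, and the code uses present-day Java syntax rather than that of JDK 1.1. A real program would also need a JDBC driver for its particular database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcSketch {
    // Hypothetical table and column names, used only for illustration.
    static final String QUERY = "SELECT name FROM customers WHERE region = ?";

    // Runs the query over an already-open connection and collects the names.
    static List<String> customerNames(Connection conn, String region) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setString(1, region); // fills in the single '?' placeholder
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical URL: substitute the one documented by your database's driver.
        String url = args.length > 0 ? args[0] : "jdbc:example://localhost/sales";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println(customerNames(conn, "North"));
        } catch (SQLException e) {
            // Without a matching driver and database, the attempt ends up here.
            System.out.println("Could not query database: " + e.getMessage());
        }
    }
}
```

    The point of the design is that the same Java code talks to any SQL database for which a driver exists: only the URL (and the driver behind it) changes.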

    More interesting, perhaps, is the greatly increased support for various aspects of security. It is becoming apparent that elements such as encryption, digital signatures and certificates are all indispensable for Internet and even intranet use, and Java now supports these. It also allows code-signing - the ability to attach a unique personal identifier to a Java applet, say. This, of course, is the preferred technique of Microsoft with its ActiveX technology.

    The advantage is that if you decide to trust the signed applet you can allow it far greater freedom on your system: Java applets until now have been restricted to operate within a safe 'sandbox' - the idea being that this stops malicious code from wreaking havoc on files. If you are sure through a digital signature that the code comes from someone trustworthy, you may decide to allow the applet to access files etc - greatly increasing its power. The downside is that such code-signing ultimately says nothing about whether the coded applet is dangerous or not - just that you can be sure it comes from the person who claims to be its author.
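
    The principle behind code-signing can be sketched with the standard java.security classes. This is an illustration only, not the actual procedure for signing an applet: it signs an arbitrary array of bytes standing in for the applet's code, and the class name, method names and the choice of the SHA256withRSA algorithm are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class CodeSigningSketch {
    // Assumed algorithm for this example; any supported signature algorithm would do.
    static final String ALGORITHM = "SHA256withRSA";

    // Generates a fresh key pair standing in for the author's identity.
    static KeyPair newAuthorKeys() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // The author signs the code with the private key.
    static byte[] sign(byte[] code, PrivateKey authorKey) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(authorKey);
        signer.update(code);
        return signer.sign();
    }

    // The recipient checks the code against the author's public key.
    static boolean verify(byte[] code, byte[] signature, PublicKey authorKey)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(authorKey);
        verifier.update(code);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair author = newAuthorKeys();
        byte[] code = "bytes of some applet".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(code, author.getPrivate());

        System.out.println("Untouched code verifies: "
                + verify(code, signature, author.getPublic()));

        byte[] tampered = code.clone();
        tampered[0] ^= 1; // flip one bit, as if the code had been altered in transit
        System.out.println("Tampered code verifies: "
                + verify(tampered, signature, author.getPublic()));
    }
}
```

    Note what the check does and does not establish: it confirms that the bytes are exactly those signed by the holder of the private key, but it would verify a destructive applet just as happily as a benign one.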

    Another interesting addition is Java Beans. This allows Java applets to become fully-fledged objects and, as such, to communicate and interact with other objects such as ActiveX and OpenDoc components, which may be distributed across a network. JavaBeans essentially fills in some missing details to bring Java up to par in this area. In addition to the JavaBeans API, Sun has also b