How to Build up a Web Site

Bernhard Lorenz, International Society for Environmental Protection (ISEP), Vienna



  1. What data in what format is supposed to be made available to the public?
    1. General Issues
    2. Indexing Data
    3. Complex Databases
  2. Who are the main target groups - how to structure a web site?
  3. What software to use - security issues


Please note that this paper is only a short and very informal introduction to several aspects of web development and in no respect claims to be a full description of, or a set of guidelines for, any of the aspects and issues mentioned. When starting to develop a World Wide Web site, one will have to think about the following major issues:

  • What data in what format is supposed to be made available to the public?
  • Who are the main target groups - how to structure a web site?
  • What software to use - security issues to address within an institution.


1. What data in what format is supposed to be made available to the public?

1.1. General Issues

If one plans to put data on the Web, one will have to consider whether the format presently in use can be used for the WWW as well. This is normally the case for any sort of textual information (ASCII files, WinWord files etc.). There are many converters available that can help to turn *.doc files into a WWW-usable format (*.html).
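For plain ASCII text, the conversion step can be very small. The following is only a minimal sketch, not one of the converters referred to above: the function name text_to_html is hypothetical, and it assumes that paragraphs in the input are separated by blank lines.

```python
import html


def text_to_html(text: str, title: str) -> str:
    """Wrap plain text in a minimal HTML page.

    Assumption: paragraphs in the input are separated by blank lines.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Escape &, < and > so that the text cannot be mistaken for markup.
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<html>\n"
        f"<head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>\n"
    )
```

Word processor files such as *.doc carry formatting that a sketch like this does not handle; for those, a dedicated converter is still the more realistic choice.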

If the data to be put up is of a more complex form, like a relational database, one will probably either have to export the data to ASCII files (if possible) or program an interface between the database itself and the World Wide Web. The latter is the better solution, but also the more expensive one in terms of both money and time. Such interfaces range from small Perl scripts of only 100 lines to complex programs consisting of several thousand files. Once these main decisions have been taken, one can proceed to think about the way this data can be made available to the public.

In general, there are always two possible solutions:

  1. Browsing
  2. Searching

If one is offering general information, text and similar files, it is very common to simply develop a menu from which users select items and are thus guided to the files and information they want.

1.2. Indexing Data

However, if the number of files and the amount of data grow very large (for example, CEDAR offers more than 140 MB of data, mainly organized in text files), one can start building indices on a Web site, i.e. run an indexing program (like WAIS or Harvest), which normally includes a search interface as well, to allow for better data retrieval. Users looking for particular information can then enter keywords combined with boolean and case operators etc. to find the documents that contain the information they are looking for. Naturally, it is also possible to index only certain sections of a Web site, for example database by database.
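The idea behind such an index can be illustrated with a toy example. This is only a sketch of the general technique (an inverted index with boolean AND and OR lookups) and not a description of how WAIS or Harvest work internally; all function names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map every lower-cased word to the set of documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search_all(index: dict[str, set[str]], keywords: list[str]) -> set[str]:
    """Boolean AND: documents that contain every keyword."""
    hits = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*hits) if hits else set()


def search_any(index: dict[str, set[str]], keywords: list[str]) -> set[str]:
    """Boolean OR: documents that contain at least one keyword."""
    return set().union(*(index.get(k.lower(), set()) for k in keywords))
```

The index is built once, when the files change, so that each query only has to look up a few words instead of reading all the text files again.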

1.3. Complex Databases

Alternatively, one can develop a more complex scheme for data retrieval, especially when the data is organized in a relational database, like Oracle, Postgres95 etc. It is then better to program an interface between the WWW server and the database. Users enter keywords, operators etc. into a form, which passes these arguments on to the interface, which in turn selects all matching entries from the database and formats them for proper viewing in WWW browsers.
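The core of such an interface might look roughly like the sketch below. It is written under several assumptions: SQLite (from the Python standard library) stands in for a database such as Oracle or Postgres95, and the table "reports" with its columns "title" and "year", as well as both function names, are hypothetical.

```python
import html
import sqlite3


def find_entries(conn: sqlite3.Connection, keyword: str) -> list[tuple]:
    """Select all entries whose title contains the keyword.

    The keyword is passed as a bound parameter and never pasted into the
    SQL text, so input from the form cannot change the statement itself.
    Note that % and _ inside the keyword still act as LIKE wildcards.
    """
    cursor = conn.execute(
        "SELECT title, year FROM reports WHERE title LIKE ? ORDER BY year",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


def format_as_html(rows: list[tuple]) -> str:
    """Format the matching entries as an HTML list for the browser."""
    if not rows:
        return "<p>No matching entries.</p>"
    items = "\n".join(
        f"<li>{html.escape(str(title))} ({year})</li>" for title, year in rows
    )
    return f"<ul>\n{items}\n</ul>"
```

A form handler would call find_entries with the submitted keyword and send the result of format_as_html back to the browser.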


2. Who are the main target groups - how to structure a web site?

The main factor influencing how the actual pages look should be the target group: whoever is most likely to visit a site should feel "at home" there and should find the site useful and well developed in terms of layout and structure.

Furthermore, one has to keep in mind that - especially when it comes to overseas links - it takes some time to load a WWW page. Therefore it is better to avoid heavy graphics and icons where possible.

Additionally, it may be helpful to let users jump back and forth between the main sections of a web site from within the submenus as well. This allows for faster navigation and speeds up the process of looking for and eventually finding specific data.

Thus, a good Web site is often structured into several main sections, from where users can go on to more detailed information. It is not advisable to put every possible link onto the homepage itself - users will get confused and may overlook links that are actually there but are lost among the many other links on the same page.

Furthermore, it is probably best to have only one or two levels of subsections (depending on the amount of data) on a web site - if users have to go through ten or more different menus to find specific information, only to read that the section is under construction, the web site has definitely failed its purpose.


3. What software to use - security issues

Most web server software is available for free on the web. The main servers are

There are also several commercial servers available, which may be worth consideration, depending on the specific needs of an organization.

From the security point of view, one should always follow discussions of recently discovered bugs in these servers to minimize the risk of compromising a web server or even the machine it is running on. Special attention needs to be paid to CGI (Common Gateway Interface) programs as well as to certain strings which are passed through a web server.

Something which should always be avoided is running a web server with superuser privileges. Note that in order to open a connection on the HTTP port 80, the server has to be started as root, but it (and its child processes) will then switch to the user and group ID specified in the configuration files.
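The mechanism itself can be sketched in a few lines. This is an illustration of the general Unix technique only, not the code of any particular server; the function names are hypothetical, and the defaults "www" and "nogroup" are assumptions that must match accounts which actually exist on the machine.

```python
import grp
import os
import pwd
import socket


def open_listening_socket(host: str = "", port: int = 80) -> socket.socket:
    """Bind the listening socket; ports below 1024 normally require root."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


def drop_privileges(user: str = "www", group: str = "nogroup") -> bool:
    """Switch from root to an unprivileged user and group.

    Returns False (and does nothing) if the process is not running as root.
    """
    if os.getuid() != 0:
        return False
    uid = pwd.getpwnam(user).pw_uid
    gid = grp.getgrnam(group).gr_gid
    os.setgroups([])  # give up root's supplementary groups
    os.setgid(gid)    # the group must be changed while still root
    os.setuid(uid)    # after this call root privileges cannot be regained
    return True
```

The order matters: the socket is opened first, while the process is still root, and the group ID is changed before the user ID, because an unprivileged process is no longer allowed to change its group.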

Furthermore, file permissions should always be correct - it is common practice to let a dedicated www user and group own all files, with directory permissions set to 755 (rwxr-xr-x) and file permissions set to 644 (rw-r--r--). A web server then usually runs as user "www" and group "nogroup". Also note that if there are interfaces to other programs (databases etc.) on the same or other machines, this user probably needs access to their directories and files as well. Finally, one has to make sure that execute permissions for CGI scripts etc. are properly set - if users on the same machine are allowed to run CGI scripts, these scripts should be checked very carefully and frequently to prevent abuse of the machine by intruders.
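Permissions tend to drift over time, so it can be worth checking them automatically. The following sketch walks a document tree and reports everything that deviates from the 755/644 convention described above; the function name find_bad_permissions is hypothetical.

```python
import os
import stat


def find_bad_permissions(root: str, dir_mode: int = 0o755,
                         file_mode: int = 0o644) -> list[tuple[str, int]]:
    """Return (path, actual_mode) for every entry with unexpected permissions."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        entries = [(dirpath, dir_mode)]
        entries += [(os.path.join(dirpath, name), file_mode) for name in filenames]
        for path, expected in entries:
            if os.path.islink(path):
                continue  # the mode of a symbolic link itself is not meaningful
            actual = stat.S_IMODE(os.lstat(path).st_mode)
            if actual != expected:
                problems.append((path, actual))
    return problems
```

CGI scripts need the execute bit and would therefore be reported by this sketch; in practice one would either check the script directory separately or pass file_mode=0o755 for it.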

By restricting the access of a web server to a minimum of directories and files, possible intrusions and damage from hackers exploiting bugs in server software can be confined to the web site itself instead of affecting the complete machine.
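The same principle can be applied inside one's own scripts: a requested path is only accepted if it stays below the document root. This is a sketch of an application-level check that complements, but does not replace, correct file permissions; the function name resolve_under_root is hypothetical.

```python
from pathlib import Path


def resolve_under_root(document_root: str, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside the document root."""
    root = Path(document_root).resolve()
    # resolve() removes ".." components and follows symbolic links.
    candidate = (root / requested.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside the document root")
    return candidate
```

A request such as "../../etc/passwd" then raises PermissionError instead of exposing a file that lies outside the web site.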

However, many more security issues may arise within an organization, for example restricting access to certain sections of a web server - either by setting the appropriate parameters in the config files or even by implementing a firewall solution. These topics will vary depending on the type of institution, but should always be addressed when setting up public services on institutional hosts.

