World Wide Web

The Web, that is, the WWW, is an application for disseminating and accessing information, and it is chiefly responsible for the success and wide reach that the Internet has achieved in the last five years. In fact, many users still confuse the Internet with the Web, that is, the network that serves as the underlying infrastructure with one of its applications. This confusion is not surprising, because most Internet users know no other application on the network. More precisely, although they do use other applications, they do not realize it: all they “see” is the Web, since it serves as the interface to any other application.

In this article we will try to analyze the main characteristics of the web application, starting with its origin, then describing its components, and finally explaining the resources (protocols and languages) that anyone who wants to develop for it must use.

A little history

In 1989, the first proposal for an application to obtain and share information was presented at CERN (Conseil Européen pour la Recherche Nucléaire). It can be said, therefore, that the web was born to meet a very specific need: facilitating access to information among researchers in European research centres. Subsequently, as often happens, the solution devised for this particular use proved to be an appropriate solution to many other problems as well.

Eighteen months after this proposal was presented, the first prototype had already been built, and in February 1993 the now-obsolete Mosaic browser was presented; this one was indeed made in the US, specifically at the NCSA (National Center for Supercomputing Applications). This first product laid solid foundations for the later success thanks to two main features: on the one hand, following Internet custom, the interface was distributed free of charge and, on the other, it was very easy to use.

Anyone, computer expert or not, could use it, which dispelled the cloud of mystery surrounding the network. Following the success of Mosaic, its creator, Marc Andreessen, soon left the NCSA and founded his own company, Netscape, whose aim is to develop software for the web. In 1995 Netscape went public and broke stock-market records, even though it admitted that it did not expect short-term profits. Many believed that Netscape would be a new Microsoft, the next great company to build an empire. Today, however, Microsoft and Netscape are fighting each other to impose their own products. For the moment the user is the winner; what comes later remains to be seen.

In 1994 MIT (Massachusetts Institute of Technology) and the European CERN created the organization called the World Wide Web Consortium to develop standards related to the Web. Its website is available at http://www.w3.org.

Clients and servers

Like any other distributed application, the web is divided into two parts: the server, which runs on its own on one machine, and the client, which runs on another machine and acts on the user's orders. The client goes by different names depending on the language: browser in English, navegador in Spanish and arakatzaile in Basque.

Whoever wants to publish information on the network, and therefore to the world, needs a server, that is, a computer, web server software (both paid and free programs exist) and an Internet connection through some provider. All of this, however, costs far more money than most people can invest. It is therefore much more common to rent part of a server from someone who owns or manages such infrastructure. In this way, everyone prepares their information at home, that is, on their own computer, and sends it to the provider for publication. The information has to be put in HTML format, which we will analyze briefly below.

On the other hand, whoever wants to obtain information needs a browser on their computer and, of course, an Internet connection. We have not said so until now, but it must be borne in mind that the TCP/IP protocols are required on both the server's computer and the browser's computer.

The browser's tasks are, on the one hand, to contact the servers located across the network and retrieve the information they hold and, on the other, to present the received documents to the user and to help and guide the user in searching. Each web document is called a “page.” Not all pages can be displayed directly, as they often contain more than just text: images, sound, video or all three at once. Some browsers are able to render any type of information, but usually the browser hands that work over to another application, an external viewer or helper application.

If you do not have that viewer on your computer, you will not be able to access the information. This multimedia aspect is another feature that makes the web even more fascinating. And the browser's ability to work with other applications has opened up many possibilities: anything can be used together with web pages. That is why, as already mentioned, the user sees only the web even while using many applications. The latest giant step has been to treat the interpreter of a programming language as a helper application, the Java language in particular. We will talk about it later.

Address: URL

To receive a page, three things have to be specified: the server that holds the page, the file in which it is stored and how to request it from the server.

All this is encoded in the URL (Uniform Resource Locator), specifically in the three parts that make it up. See Figure 1 for an example.

If the third part is not given, a special default page is shown as the home page. The second part has its own syntax, that of a DNS or Internet name. As for how to make the request, HTTP (HyperText Transfer Protocol) is the protocol the browser uses to contact the web server, but, and here is another key to the web, a web browser can also contact the servers of other applications. This makes it possible to use the resources that existed on the Internet before the web appeared (FTP, gopher, news...) together with web pages and, as a result, all the information that was previously on the network becomes available to more people. We see again that the user works with different applications although, as we said above, the user sees only the web. In the image we can see examples of different URLs.
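As a rough illustration of the three parts, the sketch below splits a URL with Python's standard urllib.parse module. The example addresses are assumptions made for illustration (the path /pub/index.html and the host ftp.example.org are invented) and are not taken from Figure 1.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return (protocol, server, file) for a URL."""
    parts = urlsplit(url)
    # An empty path means the third part was omitted; the server then
    # decides which default page to send, so "/" stands in for it here.
    return parts.scheme, parts.netloc, parts.path or "/"

# Hypothetical addresses, used only to show the split.
print(url_parts("http://www.w3.org/pub/index.html"))
print(url_parts("ftp://ftp.example.org"))
```

The first element says how to make the request, the second is the DNS name of the server, and the third is the file on that server.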

The truth is that the browser normally does not handle all the protocols that appear in that image itself, because that would be too complicated. The complexity should always be placed on the other side, the server's. For this reason an intermediary that speaks multiple protocols is placed between the browser and the server, acting as a kind of pseudo-server. This is called a proxy. It is usually located on the server's own computer or on another machine directly connected to it. Each proxy usually sits in front of many servers, such as web, ftp, gopher and news. In some cases all these applications run on the server's own computer, but in others they do not.

Rules for speaking: HTTP

Most of the documents requested are in HTML format, so they are accessed with the web's own protocol, called HTTP (HyperText Transfer Protocol). HTTP is constantly evolving. Several versions are currently in use and new ones are being developed, mainly under the control of the WWW Consortium. The characteristics described below are the most basic ones, and although some details may change in future versions, the concepts will remain intact.

It is a very simple protocol, like most application protocols. It is divided into two parts: the requests that the browser can make to the server and the responses that travel in the opposite direction. The new versions support two types of requests, simple and full, both written in ASCII. A simple request consists of the word GET followed by the part of the URL that names the file. For example:

GET /public/contributors/itoiz.html

The response is the raw requested page; the browser has to know how to prepare the received information for presentation to the user. We can try this type of request directly by playing the part of the browser ourselves with the telnet application. Connect to port 80 (telnet server-name 80) and type the line shown above, but change the location of the file, because the one given here is made up.
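The same experiment can be sketched in a few lines of Python instead of telnet. This is only an illustration under stated assumptions: the function names are invented here, the host and path in the commented-out call are placeholders, and many present-day servers may refuse or ignore such a bare simple request.

```python
import socket

def build_simple_request(path):
    """Build a simple request: the word GET, a space, and the file."""
    return f"GET {path}\r\n".encode("ascii")

def fetch_raw(host, path, port=80, timeout=10.0):
    """Send a simple request and return the raw bytes the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_simple_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks)

# Needs network access; host and path are placeholders, not real addresses
# taken from the article:
# print(fetch_raw("www.example.org", "/index.html").decode("latin-1"))
```

Whatever comes back is the raw page, exactly what a browser would receive before preparing it for display.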

Full requests, by contrast, can span several lines, and the last line is always blank. The first line gives the command (GET is one of the options), the page and the protocol version. The remaining lines follow the RFC 822 rules used for email and vary according to the command. HTTP was designed with object-oriented applications in mind, which is why the commands are called “methods”.

The response to a full request consists of a status line and, possibly, attached information. The status line can carry the OK code (number 200) or an error code. The attached information is usually the web page itself.
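The shape of a full request and of a status line can be sketched as follows. The helper names, the User-Agent value and the HTTP/1.0 version string are assumptions chosen for illustration; the text above does not name a specific version.

```python
def build_full_request(method, path, headers, version="HTTP/1.0"):
    """First line: method, page, protocol version; then RFC 822-style lines."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The request always ends with a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_status_line(line):
    """Split a status line into (version, numeric code, reason text)."""
    parts = line.strip().split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason

print(build_full_request("GET", "/index.html", {"User-Agent": "sketch/0.1"}))
print(parse_status_line("HTTP/1.0 200 OK"))
```

A code of 200 means the page follows; any other code would be handled by the browser as an error or as some other special case.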

Presentation standards: HTML

Web pages are in HTML format (HyperText Markup Language). Calling HTML a language can be confusing, since it does not execute anything by itself, and “language” is normally reserved for the code in which algorithms are written. It is more precise, therefore, to call it a format, since it only indicates how the information should appear, as TeX and other formats do. For example, the mark <B> tells us that the text should appear in bold, and the mark </B> cancels that command. The browser must therefore know the meaning of these marks that appear in the text.

In this way, the information is stored without worrying about where it will be displayed: the browser on each machine or system will know how to present text in bold, what format conversions it must perform to carry out the HTML commands, and so on. This is very important, since the same page may be shown on a 1024 x 768 screen in 24-bit color or, for example, on a 640 x 480 screen in 8-bit color.
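To make the idea of marks concrete, here is a small sketch that uses Python's standard html.parser module to find the text enclosed by the B mark. The class and function names are invented for this example, and a real browser does far more than this.

```python
from html.parser import HTMLParser

class BoldFinder(HTMLParser):
    """Collect the text that appears between B marks."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many B marks are currently open
        self.bold_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":        # the parser reports tag names in lowercase
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.bold_text.append(data)

def bold_runs(html):
    finder = BoldFinder()
    finder.feed(html)
    finder.close()
    return finder.bold_text

print(bold_runs("Plain text, <B>bold text</B>, plain again."))
```

The page only says which words are bold; how bold text is actually drawn is left to whatever program reads the marks.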

Like HTTP, HTML is constantly changing. When Mosaic was the only browser, HTML 1.0 was the de facto standard, but when new browsers appeared and an official standard became necessary, HTML 2.0 emerged. Since then other standards have appeared, both official ones recognized by the WWW Consortium and those created by individual vendors. The latter are usually based on the official ones and merely add some novelty. If the innovation succeeds in the market, it is included in the next official standard. This favors the vendor, since its browser supports the feature before the rest. Competitors must invest time and money to catch up.

Describing HTML in detail here would take too long, but the reader can get an idea of what its “raw” form looks like and how such a page would be presented.

... and more: actions. CGI and Java

HTML 1.0 was one-way: the user could receive information from the server but could not send anything back. As commercial organizations joined the network, the need for two-way communication grew. For example, some companies want to receive not only orders but also the customer's credit card number. HTML 2.0 therefore began to offer that possibility. What is more, the CGI (Common Gateway Interface) standard makes it possible to run programs on the server when the user requests it through the browser. For example, suppose the server has a very interesting database and the user wants to query it.

To do so, the user goes to the server page that contains the query form; once it is filled in, the browser sends the form data to the server, such as the access key for the database and the search keywords. If the server accepts the operation, the browser instructs it to run the program referenced in the page. These programs are called widgets or scripts and, like any other document, they are identified by a URL within the HTML document.
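A server-side program of this kind might look roughly like the sketch below. It assumes the form data arrives in the QUERY_STRING environment variable, as CGI does for GET requests; the field names key and keywords and the wording of the reply are hypothetical, and no real database is consulted.

```python
import html
import os
from urllib.parse import parse_qs

def handle_query(query_string):
    """Build a small HTML reply from the submitted form fields."""
    fields = parse_qs(query_string)
    # "keywords" is a hypothetical field name chosen for this sketch.
    keywords = fields.get("keywords", [""])[0]
    # Escape user input so it cannot inject marks into the page.
    body = f"<HTML><BODY>You searched for: {html.escape(keywords)}</BODY></HTML>"
    # A CGI program prints a header, a blank line, and then the document.
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    print(handle_query(os.environ.get("QUERY_STRING", "")), end="")
```

A real script would look up the keywords in the database and check the access key before replying; both steps are left out here.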

However, this cooperation between the browser and the server is not enough for some applications, animations in particular. For that purpose another tool was created: the Java language. The idea is this: the program that carries out the actions associated with an HTML page is loaded into the browser along with the page, and the browser runs it itself. It seems very easy, but it is not so simple. No language runs on every computer, and such a language would also be very dangerous, since it would be the simplest way to introduce viruses. Nevertheless, these barriers have been overcome and the goal achieved.

The language is Java; it resembles C++ and is designed to avoid security problems. Programs written in Java are small and are called applets, which means roughly “small application”. Applets are compiled and stored as byte code, which ensures that all applets look the same regardless of the author or the system on which they were created. This byte code is referenced from the HTML and runs on the browser's computer. Executing it requires an interpreter. The Java system therefore has three components (for more information, see Luis Elizondo's article “Java: pure coffee or the future of computing?”, no. 107, May 1996):

  1. A compiler from Java to byte code, which the Java programmer uses on their own system.
  2. A browser that understands applets.
  3. A byte code interpreter on the browser's computer.

We have described here the web and its components, not in great depth, it is true, but had we done so we would have needed to write a whole volume, and by the time it was published many details would already be obsolete; such is computing. What we have explained here, however, will remain true for some years. Surely not many more.
