Hey there! Have you ever wondered what happens when you type holbertonschool.com or any other web address into your browser? Well, your browser displays the web page! But how does it do that?
Let’s make the trip together and see what is happening. First, we type the letter h into the address bar of the browser.
As you can see, the browser already shows the first option to auto-complete the URL, along with a list of other URL suggestions. These suggestions depend on your browser’s algorithm, which may be based on your search history, cookies, or popular searches. But we want a specific URL the browser doesn’t know yet, so we finish typing holbertonschool.com and hit the Enter key.
Is it a URL or a search term?
When no protocol or valid domain name is given, the browser passes the text from the address bar to its default web search engine. In many cases, the search URL has a special piece of text appended to it to tell the search engine that the query came from a particular browser’s URL bar.
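The decision described above can be sketched in a few lines of Python. This is only a rough illustration, not how any real browser does it: the function name classify_input and the rules it applies (an explicit http/https scheme, or a dotted host name without spaces) are assumptions made for this example.

```python
from urllib.parse import urlparse


def classify_input(text: str) -> str:
    """Return 'url' if the text looks like a web address, else 'search'.

    Hypothetical helper: real browsers use far more elaborate rules.
    """
    text = text.strip()
    parsed = urlparse(text)
    # An explicit http:// or https:// scheme with a host is treated as a URL.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Text with spaces, or without any dot, is treated as a search term.
    if " " in text or "." not in text:
        return "search"
    # Otherwise, check that the part before the first "/" looks like a host name.
    host = text.split("/", 1)[0]
    labels = host.split(".")
    if all(label and label.replace("-", "").isalnum() for label in labels):
        return "url"
    return "search"
```

With these rules, classify_input("holbertonschool.com") is treated as a URL, while classify_input("what is dns") goes to the search engine.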
HTTP and HTTPS/SSL
Besides determining that “holbertonschool.com” is indeed a URL, the browser also needs to work out which protocol and port to connect with. Is it HTTP or HTTPS? But before we continue, we need to know what HTTP and HTTPS are.
Briefly, HTTP, or Hypertext Transfer Protocol, is a protocol that describes how data is exchanged between the client and the web server. It uses port 80 by default. HTTP works with a simple model where the client makes a request to the server and waits for a response.
HTTPS uses port 443 and is essentially a combination of the HTTP protocol and SSL. SSL stands for Secure Sockets Layer (its modern successor is TLS, Transport Layer Security), and it adds security by encrypting the information.
The image below explains how SSL encrypts the information by using a public key to encrypt the content and a private key to decrypt it.
Now that we know about HTTP and HTTPS, we can see what happens after we hit the Enter key. The browser, or client, checks its HSTS (HTTP Strict Transport Security) list, which contains the websites that have asked to be contacted only via HTTPS. If the website we typed is on the list, the client uses the HTTPS protocol and port 443; otherwise, it uses the HTTP protocol and port 80.
After all the above, are we ready to get holbertonschool.com? Absolutely not. We need the IP address of our website.
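As a small sketch of that decision, the snippet below picks a protocol and port from a made-up HSTS list. The set HSTS_HOSTS and the function choose_scheme_and_port are hypothetical names, and putting holbertonschool.com on the list is an assumption for the example, not a statement about the real site.

```python
# Hypothetical HSTS list; a real browser ships a preload list and also
# remembers hosts that sent a Strict-Transport-Security header.
HSTS_HOSTS = {"holbertonschool.com"}


def choose_scheme_and_port(host: str) -> tuple[str, int]:
    """Pick the protocol and default port the way the text describes."""
    if host.lower() in HSTS_HOSTS:
        return ("https", 443)  # the host asked to be contacted only via HTTPS
    return ("http", 80)  # fall back to plain HTTP on its default port
```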
What is DNS?
DNS stands for Domain Name System, the technology that translates text-based domain names into numerical IP addresses. The translation is carried out by several servers working together to provide the browser with the IP address of the requested website. These servers are called DNS servers.
There are four types of DNS servers, structured as a hierarchy:
- DNS recursive resolver, or DNS resolver
- Root name server
- Top-level domain (TLD) name server
- Authoritative name server
Following our example, to get the IP address of a website, the browser first checks whether its DNS cache holds a record for the domain name. If there is no record for holbertonschool.com, the browser makes a DNS request to a DNS server to obtain it.
The request travels up the DNS hierarchy until a DNS server can resolve the IP address of the requested domain. This process is one of the most complex to detail, so we will only cover it briefly. For more information about this topic click here.
How DNS works
- The DNS resolver receives the query from the browser and checks its local cache. If there is a cached IP address for the hostname, the cached IP address is returned. Otherwise, the resolver starts making additional requests, beginning with the root name server.
- The root name server does not know the IP address of holbertonschool.com. It does know the DNS server responsible for the .com top-level domain, so it returns the address of the .com TLD name server to the resolver.
- The resolver then generates a query and sends it to the .com TLD name server. This server knows the name server for the holbertonschool.com domain, so it returns the address of the authoritative name server for holbertonschool.com.
- Now the DNS resolver makes another request and sends it to this authoritative name server. This server has authoritative information about the holbertonschool.com domain. It finds the IP address of holbertonschool.com and returns it to the resolver.
- The DNS resolver receives the response and caches the address for future access. Then the resolver forwards the IP address to the browser.
The image below sums up, step by step, how DNS works to get the IP address of a web page.
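The steps above can be mimicked with a toy, in-memory model of the hierarchy. Nothing here touches the network: the dictionaries ROOT, TLD_SERVERS, and AUTHORITATIVE, the server names inside them, and the address 192.0.2.10 (taken from a range reserved for documentation) are all placeholders invented for this sketch.

```python
# Toy DNS hierarchy. Every name and address below is a made-up placeholder.
ROOT = {"com": "tld-com-server"}
TLD_SERVERS = {"tld-com-server": {"holbertonschool.com": "ns-holberton"}}
AUTHORITATIVE = {"ns-holberton": {"holbertonschool.com": "192.0.2.10"}}

resolver_cache: dict[str, str] = {}


def resolve(hostname: str) -> tuple[str, str]:
    """Return (ip, source), where source says who answered the query."""
    # Step 1: the resolver checks its local cache first.
    if hostname in resolver_cache:
        return resolver_cache[hostname], "cache"
    # Step 2: the root server only knows which server handles the TLD.
    tld = hostname.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]
    # Step 3: the TLD server knows the authoritative server for the domain.
    auth_server = TLD_SERVERS[tld_server][hostname]
    # Step 4: the authoritative server holds the actual IP address.
    ip = AUTHORITATIVE[auth_server][hostname]
    # Step 5: the resolver caches the answer before returning it.
    resolver_cache[hostname] = ip
    return ip, "authoritative"
```

Calling resolve("holbertonschool.com") twice shows the cache at work: the first answer comes from the authoritative server, the second from the resolver's cache.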
Transmission Control Protocol / Internet Protocol
Now that we have the IP address of our website, the browser can use it to initiate a connection to the specified server, more specifically a TCP connection. What is TCP, and how is it going to make that connection?
First, understand that the internet is a global system of interconnected computer networks and devices that communicate using the Transmission Control Protocol / Internet Protocol (TCP/IP). TCP/IP is a protocol suite with four layers, each of which contains several protocols. The four layers are:
- Application layer
- Transport layer
- Network layer
- Data link layer
We won’t go into detail on the layers; for the purpose of this blog, an outline of the TCP connection is enough.
TCP defines how applications can create channels of communication across a network. It manages how a message is broken into smaller packets before being transmitted over the internet and reassembled in the right order at the destination address.
IP defines how to address and route each packet to make sure it reaches the right destination.
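To make the split-and-reassemble idea concrete, here is a minimal sketch. It labels each chunk with its byte offset, which loosely mirrors how TCP numbers the bytes it sends, but it is a simplification: the function names and the chunk size are assumptions, and real TCP also handles loss, retransmission, and flow control.

```python
def split_into_segments(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut a message into (offset, chunk) pairs of at most `size` bytes."""
    return [
        (offset, message[offset:offset + size])
        for offset in range(0, len(message), size)
    ]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in order by offset, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(segments))
```

Even if the segments arrive in reverse order, sorting by offset restores the original message.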
TCP connection flow
- The client sends a SYN (synchronize) packet to the server.
- The server receives the SYN and, if it agrees to open the connection, replies with a SYN of its own that also carries the ACK flag, acknowledging receipt of the client’s SYN packet.
- The client receives the SYN/ACK packet from the server and acknowledges it by sending an ACK packet back, which completes the connection.
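The three packets can be modeled as plain dictionaries to show how the sequence and acknowledgment numbers relate. This is a simulation only, since the operating system performs the real handshake; the function name and the dictionary layout are assumptions for the sketch.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the SYN, SYN/ACK, and ACK packets of a simulated handshake.

    client_isn and server_isn are the initial sequence numbers each side picks.
    """
    # 1. The client opens with a SYN carrying its initial sequence number.
    syn = {"flags": {"SYN"}, "seq": client_isn}
    # 2. The server answers with its own SYN and acknowledges the client's
    #    sequence number plus one.
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn, "ack": syn["seq"] + 1}
    # 3. The client acknowledges the server's sequence number plus one.
    ack = {"flags": {"ACK"}, "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]
```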
Once the TCP connection is established, the browser sends an HTTP request for holbertonschool.com to the server, or more exactly to the web server. Just before reaching it, though, the request might go through a load balancer first.
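For reference, this is roughly what such a request looks like on the wire. The helper build_get_request is a hypothetical name, and a real browser sends many more headers (User-Agent, Accept, cookies, and so on); this sketch keeps only a minimal HTTP/1.1 GET.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # Host is the one header HTTP/1.1 requires
        "Connection: close",     # ask the server to close the connection after replying
        "",                      # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```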
To understand a load balancer, we first have to ask what load balancing is. As we know, web traffic can be massive. Imagine how many people send requests to facebook.com. A lot! Such a large number of requests can be hard for a single web server to handle, and that is where load balancing comes in. This technique distributes the workload across multiple computing resources, such as servers, computers, or virtual machines. The distribution is done by algorithms chosen according to the needs and infrastructure of the servers behind the domain.
Additionally, the load balancer can be a hardware device, for example a router, or software running on a server. Its goal is to get the best possible system performance by optimizing the use of resources.
What happens to our request at the load balancer is basically this: the load balancer determines which server is going to respond to the request. Then the request is directed to the selected server, where it reaches the web server.
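One of the simplest distribution algorithms is round robin, where servers take turns in a fixed order. The sketch below shows that one algorithm only; the text does not say which algorithm any particular site uses, and the class name and the server names web-01 and web-02 are invented for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers one after another, wrapping around at the end."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)  # endless iterator over the server list

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._servers)
```

With two servers, consecutive calls to pick() alternate between them, so each one receives about half of the requests.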
A web server is software whose main function is to respond to the client’s HTTP requests. The most common request is the one to ask the server to retrieve and return some resource, like a web page (a Hypertext Markup Language [HTML] document).
So now our request arrives at the web server, which starts looking through its directories to find the requested content. It’s important to add that the web server only serves static content, in other words, HTML pages, files, videos, and so on. If it finds the content, it returns it as the response. The browser interprets this response and shows the webpage on the screen.
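The lookup can be pictured with a tiny stand-in for the server's directories. In this sketch a dictionary named DOCUMENT_ROOT plays the role of the file system, and both its contents and the function serve_static are hypothetical; a real web server reads files from disk and sets many more response details.

```python
# Hypothetical static content, keyed by request path.
DOCUMENT_ROOT = {
    "/index.html": b"<h1>Welcome</h1>",
    "/style.css": b"h1 { color: red; }",
}


def serve_static(path: str) -> tuple[int, bytes]:
    """Return (status code, body) for a requested path."""
    if path == "/":
        path = "/index.html"  # a bare "/" conventionally maps to the index page
    body = DOCUMENT_ROOT.get(path)
    if body is None:
        return 404, b"Not Found"  # nothing stored under that path
    return 200, body
```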
Have we finally reached the end? Not just yet. We do have the content of holbertonschool.com on our screen, but the interactions and behavior of the webpage are not available yet. Here is where the application server makes its entrance.
OK, so we make the request, and the dynamic content of the webpage is returned by the web server working together with the application server. However, you might wonder how the application server knows about updated data or validates information. Well, it almost slipped by, but the application server helps process requests by connecting to a database and returning the information to the web server.
A database is a structure that stores organized information. A database contains multiple tables, each of which may include several different fields. It makes it easy to access, modify, update, manage, and organize the information of a business.
It comes into play when you request a webpage and the application server needs to access some information in order to build the response.
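As a rough illustration of that last hop, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The courses table, its rows, and the two function names are made up for the example and say nothing about how the real site stores its data.

```python
import sqlite3


def make_database() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO courses (name) VALUES (?)",
        [("Intro to Networking",), ("Databases 101",)],
    )
    conn.commit()
    return conn


def render_courses_page(conn: sqlite3.Connection) -> str:
    """Play the application server: query the database and build HTML."""
    rows = conn.execute("SELECT name FROM courses ORDER BY id").fetchall()
    items = "".join(f"<li>{name}</li>" for (name,) in rows)
    return f"<ul>{items}</ul>"
```

The web server would then send the string returned by render_courses_page back to the browser as the body of the response.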
Finally, after going through the process behind typing a web address and getting the page, our request to holbertonschool.com is complete. We have learned, briefly, what happens and now have a better understanding of how things work under the hood. Thank you so much for your attention. Until next time!