When a session is started, the client is given a “session identifier”, often in a cookie, that is included with subsequent requests to the server. The server uses the session identifier to locate the corresponding session before processing the request.
Rather than storing all the variables needed to maintain state and including them with each request, the browser stores a single session identifier that the server uses to find and initialize the variables it holds. The session identifier is like the ticket given at a cloakroom. The ticket is much easier to carry around and ensures that the holder gets her own hat and coat.
So, the only thing that must be transported between the client and the server is the session ID. All other data relevant to the session resides on the server. As a result, sensitive data does not travel over the internet any more often than necessary.
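The lookup described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a production implementation: the names `SESSIONS`, `start_session`, and `handle_request` are hypothetical, and a plain in-memory dictionary stands in for whatever store a real server uses.

```python
import secrets

# Hypothetical in-memory session store: session ID -> session variables.
SESSIONS = {}

def start_session():
    """Create an empty session and return the identifier the client must keep."""
    session_id = secrets.token_urlsafe(32)  # random, hard-to-guess "ticket"
    SESSIONS[session_id] = {}
    return session_id

def handle_request(session_id, item=None):
    """Locate the session for this request, then process the request in its context."""
    session = SESSIONS.get(session_id)
    if session is None:
        return None  # unknown ticket: there is no stored state to hand back
    if item is not None:
        session.setdefault("cart", []).append(item)
    return session.get("cart", [])

# Only the identifier travels with each request; the cart stays on the server.
sid = start_session()
handle_request(sid, "book")
print(handle_request(sid, "pen"))  # prints ['book', 'pen']
```

The client never sees the cart itself. It presents `sid` on each request, much as the cloakroom ticket is presented to retrieve the coat.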
In summary, there are three characteristics session management over the Web must exhibit:
- Information or state must be “stored”. For example, a selected item in a shopping cart, a customer name, or a credit card number must be maintained across multiple HTTP requests.
- Each HTTP request must carry an “identifier” that allows the server to process the request in the context of the stored state. For example, when an order is submitted, it must be processed with the correct items and customer details.
- Sessions need to have a “timeout”. Otherwise, if a user leaves the website, there is no way the server can tell when the session should end.
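The three characteristics can be shown together in a short Python sketch. The class name `SessionStore`, the 30-minute idle timeout, and the injectable `clock` parameter are assumptions made for illustration and are not taken from any particular server.

```python
import secrets
import time

class SessionStore:
    """Hypothetical store showing stored state, an identifier, and a timeout."""

    def __init__(self, timeout_seconds=1800, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # assumed idle limit (30 minutes)
        self.clock = clock                      # injectable so expiry can be demonstrated
        self._sessions = {}                     # session ID -> (last access time, variables)

    def create(self):
        """Start a session and return its identifier."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = (self.clock(), {})
        return session_id

    def load(self, session_id):
        """Return the session variables, or None if unknown or timed out."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_access, variables = entry
        if self.clock() - last_access > self.timeout_seconds:
            del self._sessions[session_id]      # idle too long: end the session
            return None
        self._sessions[session_id] = (self.clock(), variables)  # restart the idle timer
        return variables

# A fake clock makes the timeout visible without waiting.
now = [0.0]
store = SessionStore(timeout_seconds=1800, clock=lambda: now[0])
sid = store.create()
store.load(sid)["customer"] = "Alice"
now[0] += 600
print(store.load(sid))   # prints {'customer': 'Alice'}
now[0] += 3600
print(store.load(sid))   # prints None
```

Because the server cannot observe the user leaving, the sketch infers the end of a session from inactivity alone.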
When to Use Sessions
Sessions allow some kinds of applications to be developed that otherwise would be difficult to implement on the Web. However, because HTTP is a stateless protocol, building a stateful application can present problems and restrictions. Avoiding the need to maintain state information is often a desirable goal. Here are some reasons why sessions are used:
Performance
In a stateless environment, an application may need to repeat an expensive operation. An example might be a financial calculation that requires many SQL statements and calls to mathematics libraries before displaying the results on several web pages. An application that uses a session variable to remember the result exposes the user, and the server, to the cost of the calculation only once.
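A minimal Python sketch of this idea follows. The function `expensive_calculation` is a hypothetical stand-in for the SQL statements and library calls, and a plain dictionary stands in for the session variables.

```python
# Counts how often the expensive work actually runs.
CALLS = {"count": 0}

def expensive_calculation(principal, rate, years):
    """Hypothetical stand-in for many SQL statements and mathematics library calls."""
    CALLS["count"] += 1
    return round(principal * (1 + rate) ** years, 2)

def get_projection(session, principal, rate, years):
    """Compute once per session, then reuse the remembered result."""
    if "projection" not in session:
        session["projection"] = expensive_calculation(principal, rate, years)
    return session["projection"]

session = {}  # stands in for the server-side session variables
summary_page = get_projection(session, 1000, 0.05, 10)
detail_page = get_projection(session, 1000, 0.05, 10)
print(CALLS["count"])  # prints 1: the second page reused the stored result
```

This sketch ignores changed inputs once a result is stored. An application that lets the user alter the inputs would need to discard the remembered value when they change.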
Sequence of interaction
Often an application needs to present a series of screens in a controlled order. One style of application known as a “wizard” guides a user through what would otherwise be a complex task with a series of screens. Wizards are sometimes used for complex configurations, such as some software installations, and often alter the flow of screens based on user input. Some applications require that a user enters via a known page. Applications, such as online banking, often force a user to enter via a login page rather than allow access directly to a function such as funds transfer.
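Both ideas, forcing entry through a login page and presenting screens in order, can be sketched with a session that records how far the user has progressed. The screen names and the functions `screen_to_show` and `complete_step` are hypothetical.

```python
WIZARD_STEPS = ["choose_product", "enter_address", "confirm"]  # hypothetical screens

def screen_to_show(session, requested):
    """Decide which screen to display for a requested screen name."""
    if not session.get("logged_in"):
        return "login"                    # entry is only allowed via the known page
    step = session.get("step", 0)         # index of the furthest screen reached so far
    if requested in WIZARD_STEPS[: step + 1]:
        return requested                  # already reached: allow it
    return WIZARD_STEPS[step]             # skipping ahead: show the screen that is due

def complete_step(session):
    """Record that the current screen was submitted, unlocking the next one."""
    step = session.get("step", 0)
    session["step"] = min(step + 1, len(WIZARD_STEPS) - 1)

session = {}
print(screen_to_show(session, "confirm"))   # prints login
session["logged_in"] = True
print(screen_to_show(session, "confirm"))   # prints choose_product
complete_step(session)
complete_step(session)
print(screen_to_show(session, "confirm"))   # prints confirm
```

Without the session there would be nothing on the server to say that the earlier screens had been completed, so a direct request for the last screen could not be refused.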
Intermediate results
Many database applications validate data before creating or updating a record in the database, preventing erroneous data from being saved. Sessions can keep the intermediate data so that incomplete data can be edited rather than rekeyed when errors are detected. For example, intermediate results can be used when a database application collects and validates data for a single record over a number of fill-in forms. A shopping cart is an example where complete data may not be created until a user requests a purchase.
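The following Python sketch illustrates keeping intermediate data in the session until it validates. The field names, the simplistic validation rules, and the list used in place of a database are all assumptions made for the example.

```python
def submit_form(session, form_data, database):
    """Validate a submission; save it if valid, otherwise keep it for editing."""
    draft = session.setdefault("draft", {})
    draft.update(form_data)              # remember everything entered so far
    errors = []
    if not draft.get("name"):
        errors.append("name")
    if "@" not in draft.get("email", ""):
        errors.append("email")
    if errors:
        return errors                    # nothing saved; the draft stays in the session
    database.append(dict(draft))         # only complete, valid data reaches the database
    del session["draft"]
    return []

session, database = {}, []
print(submit_form(session, {"name": "Alice", "email": "alice.example.com"}, database))
# prints ['email']; the name is kept in the session, so only the email is re-entered
print(submit_form(session, {"email": "alice@example.com"}, database))
# prints []
print(database)
# prints [{'name': 'Alice', 'email': 'alice@example.com'}]
```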
Reasons to Avoid Sessions
The reasons to avoid sessions focus mainly on the stateless nature of HTTP. The features of HTTP that support browsing access to a disparate collection of resources don’t support stateful applications. Stateful applications often work over the Web at the expense of HTTP features.
Need for centralized session store
In an application that uses sessions, each HTTP request needs to be processed in the context of the session variables to which that request belongs. The state information recorded as the result of one request needs to be available for subsequent requests.
Most applications that implement sessions store session variables in the middle tier. Once a session is created, all subsequent requests must be processed on the web server that holds the session variables. This requirement prevents such applications from using HTTP to distribute requests across multiple servers, so they can’t easily scale horizontally to handle large numbers of requests.
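The restriction can be seen in a small Python sketch. The `WebServer` class is hypothetical, and a dictionary stands in for both the per-server memory and the centralized store, which in practice might be a database.

```python
class WebServer:
    """Hypothetical web server that keeps session variables in a given store."""

    def __init__(self, store):
        self.store = store

    def handle(self, session_id, key=None, value=None):
        """Process a request in the context of the session, optionally setting a variable."""
        session = self.store.setdefault(session_id, {})
        if key is not None:
            session[key] = value
        return dict(session)

# Each server has its own store: a request routed to the other server finds no state.
a, b = WebServer({}), WebServer({})
a.handle("abc123", "cart", ["book"])
print(b.handle("abc123"))   # prints {}

# One store shared by both servers: either server can process any request.
shared = {}
c, d = WebServer(shared), WebServer(shared)
c.handle("abc123", "cart", ["book"])
print(d.handle("abc123"))   # prints {'cart': ['book']}
```

The second arrangement removes the need to send every request to the same server, at the cost of running and reaching a store that all servers share.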
Performance
When a server that offers session management processes a request, there is the unavoidable overhead of identifying and accessing session variables. The session overhead results in longer processing times for requests, which affects the performance and capacity of a site. Sessions can improve application performance (for example, a session can keep the result of an expensive operation), but the gains may be limited and outweighed by the extra processing required.
Servers that manage session variables in memory require more memory. As the amount of memory used by the web server grows, a system may need to move portions of memory to disk, an operation known as swapping. Swapping memory in and out of disk storage is slow and can severely degrade the performance of a server. Servers that use files, such as the default PHP session management, incur the cost of reading and writing a file on disk each time a session is accessed.
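The per-request cost of a file-based store can be made concrete with a Python sketch. It does not reproduce PHP's actual file format or file naming. The JSON encoding, the `sess_` prefix, and the function names are assumptions chosen for illustration.

```python
import json
import os
import tempfile

SESSION_DIR = tempfile.mkdtemp()  # hypothetical directory holding one file per session

def _path(session_id):
    return os.path.join(SESSION_DIR, "sess_" + session_id + ".json")

def read_session(session_id):
    """Load session variables from disk: one file read per request."""
    try:
        with open(_path(session_id), encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def write_session(session_id, variables):
    """Save session variables to disk: one file write per request."""
    with open(_path(session_id), "w", encoding="utf-8") as f:
        json.dump(variables, f)

def count_visit(session_id):
    """Handle a request that only increments a counter, yet still touches the disk twice."""
    variables = read_session(session_id)
    variables["hits"] = variables.get("hits", 0) + 1
    write_session(session_id, variables)
    return variables["hits"]

print(count_visit("abc123"))  # prints 1
print(count_visit("abc123"))  # prints 2
```

Even a trivial request pays for a read and a write here. A real implementation would also need to validate the identifier before using it in a file name.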
Timeouts
Sessions can also cause synchronization problems. Because HTTP is stateless, there is no way of knowing when a user has really finished with an application. Other network applications can catch the fact that a connection has been dropped and clean up the state that was held on behalf of that user, even if the user did not use a logout procedure (such as typing exit or clicking on a logout button).
Cookies and Sessions: