In practice, clustering is usually used with application servers such as IBM WebSphere, BEA WebLogic and Oracle AS (10g). The same environments also use the load balancing features of Application Delivery Controllers (ADCs) such as BIG-IP. (For simplicity, we will talk about "clustering" versus "ADC" approaches.)
Scalability
There are hardware load balancers, of course, but in that world we talk about "pools" or "farms": the groups of servers across which application requests are distributed. In the software world, the term "cluster" is applied to that same group.
Clustering will typically turn one instance of an application server into a master controller, which then processes and distributes requests across multiple instances using industry-standard algorithms such as round robin, weighted round robin or least connections. Clustering is similar to load balancing in that it offers horizontal scalability: a nearly transparent way to add application server instances for more capacity or better response times. To ensure that an instance is actually available, clustering approaches typically use an ICMP ping check or, sometimes, HTTP or TCP connection checks.
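The three selection algorithms named above can be sketched in a few lines of Python. This is only an illustration of the general techniques, not the implementation used by any particular application server or ADC; the server names (`app1`, `app2`, ...), the weights and the connection counts are hypothetical.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in a fixed rotation."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Return an endless iterator in which each server appears once per unit
    of its integer weight, so heavier servers receive more requests."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def least_connections(active):
    """Return the server that currently has the fewest active connections.
    `active` maps server name -> open connection count (hypothetical data)."""
    return min(active, key=active.get)


if __name__ == "__main__":
    rr = round_robin(["app1", "app2", "app3"])
    print([next(rr) for _ in range(4)])

    wrr = weighted_round_robin({"app1": 2, "app2": 1})
    print([next(wrr) for _ in range(6)])

    print(least_connections({"app1": 5, "app2": 2, "app3": 7}))
```

Real products usually interleave weighted choices more smoothly than the simple expansion shown here, but the resulting share of traffic per server is the same.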
Health and Transparency
For load balancing, ADCs support the same industry-standard algorithms but add more computationally complex ones that check parameters such as per-server CPU and memory utilization, fastest response times, etc. ADCs also support more robust health monitoring than the simple app server clustering solutions. They can verify the content of a response, and they can monitor passively, dispensing with even the low impact of health checks on app server instances.
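To make the difference between content verification and passive monitoring concrete, here is a minimal Python sketch. It assumes a hypothetical expected string on the page and a hypothetical threshold of three consecutive failures; it does not reflect how BIG-IP or any other ADC implements its monitors.

```python
from collections import defaultdict


def content_check(status, body, expected_text):
    """Active check with content verification: the instance counts as healthy
    only if it answered 200 AND the body contains the expected text. A server
    that answers a ping but returns an error page fails this check."""
    return status == 200 and expected_text in body


class PassiveMonitor:
    """Passive monitoring: no probe traffic is sent. Health is inferred from
    the outcome of real client requests that pass through the balancer."""

    def __init__(self, max_failures=3):
        # max_failures is an assumed threshold, chosen for illustration.
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def record(self, server, ok):
        """Record the outcome of one real request; a success resets the count."""
        if ok:
            self.failures[server] = 0
        else:
            self.failures[server] += 1

    def is_healthy(self, server):
        """A server is healthy until it fails max_failures times in a row."""
        return self.failures[server] < self.max_failures
```

For example, `content_check(200, "Database error", "Order page")` returns `False` even though the server responded, which is exactly the case a ping or plain TCP connection check would miss.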
For applications that require the user to interact with the same server during a session, clustering uses server affinity to get the user there. This is most common during a process like order entry, where the session stores data between pages (requests), such as a shopping cart, that is needed to close the transaction.
For the same situation, ADCs use persistence. Clustering solutions are usually somewhat limited in the variables they can use for this, while ADCs can use not only traditional application variables but also other information taken from the application or from network-based data.
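The idea behind both affinity and persistence is a lookup from some key to the server that handled the first request. The sketch below is a simplified Python illustration that keys on an application variable (a session ID) and falls back to network-based data (the client IP). The class name, the keys and the round-robin fallback are assumptions made for the example, not a description of any product's behavior.

```python
class PersistenceTable:
    """Remember which server a session was first sent to and keep sending it there."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}   # persistence key -> server name
        self._next = 0    # position for plain round robin on new sessions

    def pick(self, session_id=None, client_ip=None):
        # Prefer the application-level key; fall back to network-level data.
        key = session_id or client_ip
        if key is not None and key in self.table:
            return self.table[key]

        # New (or unkeyed) request: choose the next server by round robin.
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        if key is not None:
            self.table[key] = server
        return server


if __name__ == "__main__":
    table = PersistenceTable(["app1", "app2"])
    print(table.pick(session_id="cart-123"))   # first request picks a server
    print(table.pick(session_id="cart-456"))   # a different session
    print(table.pick(session_id="cart-123"))   # same server as the first call
```

A production table would also expire entries and handle the case where the remembered server has failed its health check.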
More than a few clustering solutions need node agents deployed on each instance of an application server that is "clustered" by a controller. Deploying and managing the agent may not be a burden, since it is often already in place, but it still means more processes running on the servers and consuming memory and CPU resources. Of course, it also adds another possible failure point to the data path. Since ADCs need no server-side components, they remain completely transparent.