Concept
1 General
1.1 Why is Looky called Looky?
Looky is an abbreviation for nothing. Actually, the name has no
specific meaning, no flame, no coffee ...
1.2 Motivation
The idea of the Looky loadbalancer system was born sometime in
2002, after a heavy debugging session on a binary protocol tunneled via
HTTP, in which we tried to work out how to pass this protocol through a
professional HTTP loadbalancer.
While doing this, we discovered that we were actually spending a lot of
time and effort on disabling the 'balancing' feature of our
loadbalancer. This gave us the strange feeling that something was
wrong.
Our need to balance an application across our servers was
somehow 'sticky': only the first HTTP request (the one that starts
the user session) has to be balanced to a server. Once this decision is
made, all following HTTP requests of the same session have to be routed
to the same server (because of some local files written by the
application).
With this in mind, we also discovered that we actually face
about 50,000 balancing decisions per day (we only have to handle the
first HTTP request) and not 50,000 balancing decisions per minute (as
calculated for the HTTP loadbalancer).
So we started to hack the first Looky (a single CGI script) on a
small Linux box. This was just to test whether our application would
run properly.
As expected, our application worked properly and, even more expected,
the small Linux box was able to cope with our balancing needs.
1.3 The Basic Idea
Looky is founded on the following basic ideas:
- The solution balances users on systems, not HTTP requests on
systems.
- The 'balancing decision' is only done for the initial HTTP
request (the one that starts up a user session).
- Once this is done, the client and the server machine are directly
connected.
These ideas are quite easily realized by a simple HTTP redirect (or
an embedded frame to hide the real server).
1.4 When Is Looky Good For You?
- You have an HTML-based application that defines a user session.
- Your client machines are able to address all your server machines.
- Your application does not always pass some kind of session id in
all HTTP requests.
- Your application is a compound of components from several vendors with
logically interconnected user sessions.
- You trust perl applications.
1.5 When Is Looky Not Good For You?
- You just have static HTML pages.
- You don't want to expose real server names in the wild.
- You want a 'real' loadbalancer.
- You do not trust perl applications.
2 Looky Concepts
2.1 Consumers and Resources
A typical Looky usage scenario is shown in the picture on
the left-hand side.
On one side, Looky manages a set of so-called resources
(namely a set of server machines). Each resource has a so-called
session quantity: this represents the number of sessions the server
machine is able to launch.
On the other side, Looky manages requests sent from a set of
so-called consumers. A consumer is a representation of a
user who started an application session.
2.2 How Does Redirection Work?
The Looky system sits in between resources and consumers. The
initial request (step 1) of a consumer is passed to the Looky system.
If there is a resource available for the consumer, the Looky system
calculates a redirect URL to a server (hosting this resource) and
passes this URL back to the consumer (step 2a).
If there is no resource available for the consumer, the Looky system
sends a redirect to a static HTML page that indicates the problem to the
user (step 2b).
Once the consumer has received the redirect, it contacts
(step 3) either the server hosting the resource (as a result of step
2a) or a specific server (which might also be the HTTP server hosting
the Looky system) to get a page indicating a lack of resources (as a
result of step 2b).
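The redirect decision of steps 2a and 2b can be sketched in a few lines of Python. This is only an illustration, not Looky's actual perl code: the names redirect_url, redirect_response and NO_RESOURCE_URL, the host name and the '/app/' path are assumptions made up for this sketch.

```python
from typing import Optional

# Hypothetical address of the static 'no resource' page (step 2b).
NO_RESOURCE_URL = "http://looky.example.org/no-resource.html"


def redirect_url(server: Optional[str], path: str = "/app/") -> str:
    """Return the redirect target for the initial request (step 1)."""
    if server is None:
        # Step 2b: no resource is available, point the user at a static page.
        return NO_RESOURCE_URL
    # Step 2a: send the consumer directly to the server hosting the resource.
    return f"http://{server}{path}"


def redirect_response(server: Optional[str]) -> str:
    """Build a minimal CGI-style redirect response."""
    return f"Status: 302 Found\r\nLocation: {redirect_url(server)}\r\n\r\n"
```

After the redirect, the consumer talks to the chosen server directly (step 3), so the balancer is only involved once per session.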
2.3 Pools
The Looky system organizes its resources in so-called
pools. Each pool has a unique name and holds a set of
resources with assigned session quantities.
Consumers address a pool of the Looky system to request a
resource. The name of the pool is given within the initial HTTP
request (step 1) to the looky server.
NB: Looky (namely the looky balancer client) is
part of the Apache server, so you have a lot of ACLs to control
access to the pools.
Example: (see picture on the right hand side)
Suppose we have two pools, 'admin' and 'sales'. Pool 'admin'
provides two resources: server 192.1.33.10 with a quantity of 50
sessions and server 192.1.22.11 with a quantity of 200
sessions. Within pool 'admin', Looky will be able to balance 200 +
50 = 250 sessions.
Pool 'sales' has three resources: server 192.1.22.8 with 100,
server 192.1.22.9 with 200 and server 192.1.22.10 with 150
available sessions. Here Looky will be able to balance 450 consumer
sessions.
As you can see in this example, a resource may be located in several
pools. This is special and will be explained in section
[2.3.3].
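As a small illustration, the two pools of this example can be written down as a plain mapping from pool name to resources and their session quantities. The dictionary layout and the function name pool_capacity are assumptions for this sketch and say nothing about Looky's real configuration format.

```python
# Pool name -> {resource address: session quantity}, values from the example.
pools = {
    "admin": {"192.1.33.10": 50, "192.1.22.11": 200},
    "sales": {"192.1.22.8": 100, "192.1.22.9": 200, "192.1.22.10": 150},
}


def pool_capacity(pool_name: str) -> int:
    """Total number of sessions that can be balanced within one pool."""
    return sum(pools[pool_name].values())
```

With these values, pool_capacity("admin") gives 250 and pool_capacity("sales") gives 450, matching the sums in the text.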
2.3.1 Seed Methods
If a consumer requests a session from a pool with more than one
resource, Looky has to decide which resource should
be assigned to the consumer.
If the consumer is unknown to Looky (see section
[2.3.2] about 'consumer
identity'), Looky uses a pool property called 'seed method' to
determine a resource for this consumer. Currently
(version 1.0.7) there are four seed methods available:
- serial: With this seed method Looky fills each resource one by one.
Only if a resource is completely assigned to consumers, Looky
continues with the next resource.
- parallel: The seed method 'parallel' is a simple round robin
seed.
- minimal: Using seed method 'minimal', Looky chooses the resource with
the smallest number of consumers.
- weighted: Using seed method 'weighted', Looky chooses the resource with
the lowest percentage of usage.
2.3.1.1 Order Number, Have, Used and Usage Properties
To realize these seed methods, Looky maintains five properties for
each resource of a pool (that is, per pool, not once per resource):
- an order number: all resources are ordered in a
list. The position in this list is equal to the order
number. The order number is used by the seed methods serial and
parallel:
- the seed method serial starts to fill the resource with order
number 1. If this resource is completely consumed, the seed
method continues on the resource with order number 2 etc.
- the seed method parallel uses the order number to define a
round robin cycle.
- a have value: this value is by default equal to the quantity
of allowed sessions on the resource. The 'have' value determines the number of
sessions provided by the resource.
The difference between the 'quantity' and the 'have' value is
that the 'quantity' value is fixed, whereas the 'have' value may be
changed by the load balancing process (namely by the
looky controller).
- a used value: The 'used' value is the number of
consumers, that are currently assigned to this resource. The
'used' value is consulted by seed method 'minimal': the
resource with the lowest used value (and lowest order number in case of
equality) is assigned next.
To make this more precise: the 'used' value is actually a
mixture of Looky's view on assigned consumers and the number of
current sessions. Please read more about this in section
[3.1].
- a usage value: The 'usage' value is computed using the 'used'
and the 'have' value with the following formula:
(used * 100) / have
The 'usage' value depicts the relative usage of the resource in percent.
The 'usage' value is used by the seed method 'weighted' to choose
the next resource: the resource with the lowest usage is assigned
to the next consumer.
- a current resource: The 'current resource' is just a simple
pointer to highlight a specific resource (more precisely, its
order number). The 'current resource' value is used by seed methods
'serial' and 'parallel':
- for serial seeds the 'current resource' is the resource,
that is currently filled by Looky. Once this resource
is completely consumed, Looky will move 'current
resource' to the (as seen by order number) next resource.
- for parallel seeds the 'current resource' is that
resource, that will be assigned to the next consumer.
Once this has happened, the 'current resource' is set (as
seen by order number) to the next resource.
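The four seed methods and the properties they rely on can be sketched as follows. This is a minimal Python illustration of the rules described above, not Looky's perl implementation: the class names Resource and Pool, the method pick and the wrap-around of the 'current resource' pointer are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    order: int      # position in the pool's ordered list (1-based)
    have: int       # number of sessions the resource currently provides
    used: int = 0   # consumers currently assigned

    @property
    def usage(self) -> float:
        # Relative usage in percent: (used * 100) / have
        return self.used * 100 / self.have


class Pool:
    def __init__(self, resources: List[Resource], seed: str):
        self.resources = sorted(resources, key=lambda r: r.order)
        self.seed = seed
        self.current = 0  # index of the 'current resource'

    def _skip_full(self) -> None:
        # Advance the 'current resource' pointer (wrapping around, which is
        # an assumption) until it points at a resource that still has room.
        while self.resources[self.current].used >= self.resources[self.current].have:
            self.current = (self.current + 1) % len(self.resources)

    def pick(self) -> Optional[Resource]:
        """Assign one consumer and return the chosen resource, or None."""
        free = [r for r in self.resources if r.used < r.have]
        if not free:
            return None  # the pool is completely consumed
        if self.seed == "serial":
            self._skip_full()
            chosen = self.resources[self.current]
        elif self.seed == "parallel":
            self._skip_full()
            chosen = self.resources[self.current]
            self.current = (self.current + 1) % len(self.resources)
        elif self.seed == "minimal":
            # Lowest 'used' value, lowest order number on equality.
            chosen = min(free, key=lambda r: (r.used, r.order))
        elif self.seed == "weighted":
            # Lowest 'usage' value; the tie-break by order is an assumption.
            chosen = min(free, key=lambda r: (r.usage, r.order))
        else:
            raise ValueError(f"unknown seed method: {self.seed}")
        chosen.used += 1
        return chosen
```

Calling pick() repeatedly on a pool replays the behaviour of the chosen seed method one consumer at a time.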
2.3.1.2 Seed Method Example
Example: (see picture on the left)
Consider a pool with three resources. Resource 192.1.2.9,
with order number 1, has a 'have' value of 200 and a 'used' value of
100. Resource 192.1.2.10 (order number 2) also has a 'have' of 200,
but its 'used' value is currently 150. Finally we have a third
resource, 192.1.2.11, with a 'have' of 150 and a 'used' of 50.
Using these values, the 'usage' values are as follows: resource
192.1.2.9 = 50%, resource 192.1.2.10 = 75% and resource
192.1.2.11 = 33.3%. The 'current resource' is resource 192.1.2.9.
With a 'serial' seed method, Looky will assign the next 100 consumers to
resource 192.1.2.9, then 50 consumers to resource 192.1.2.10 and
finally another 100 consumers to resource 192.1.2.11.
With a 'parallel' seed method, Looky will assign the next 150 consumers
round robin to resources 192.1.2.9, 192.1.2.10 and 192.1.2.11.
After this, resource 192.1.2.10 is completely consumed, so Looky
will distribute the next 100 consumers round robin to resources
192.1.2.9 and 192.1.2.11.
With a 'minimal' seed method, Looky will first assign 50 consumers
to resource 192.1.2.11. After this the next 100 consumers are
distributed round robin between 192.1.2.9 and 192.1.2.11. Finally
the last 100 consumers are distributed round robin between
192.1.2.9 and 192.1.2.10.
With a 'weighted' seed method, Looky will first assign 25 consumers
to resource 192.1.2.11 (until its 'usage' reaches the 50.0% of
resource 192.1.2.9). The next 88 consumers will be assigned weighted
(200:150) to resources 192.1.2.9 and 192.1.2.11 (until both have a
'usage' of 75%). Finally the last 137 consumers are distributed
weighted (200:200:150) between resources 192.1.2.9, 192.1.2.10 and
192.1.2.11.
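The step sizes of the 'weighted' case can be checked with a little arithmetic. The following Python sketch only recomputes the numbers of this example; the helper name consumers_needed is made up for illustration.

```python
# 'have' and 'used' values per resource, taken from the example above.
have = {"192.1.2.9": 200, "192.1.2.10": 200, "192.1.2.11": 150}
used = {"192.1.2.9": 100, "192.1.2.10": 150, "192.1.2.11": 50}


def consumers_needed(target_usage: float, names: list) -> float:
    """Consumers required to lift every named resource to target_usage percent."""
    return sum(have[n] * target_usage / 100 - used[n] for n in names)


# Step 1: bring 192.1.2.11 up to a usage of 50%.
step1 = consumers_needed(50, ["192.1.2.11"])
# Step 2: bring 192.1.2.9 and 192.1.2.11 up to 75% (minus what step 1 did).
step2 = consumers_needed(75, ["192.1.2.9", "192.1.2.11"]) - step1
# Step 3: whatever is left until all three resources are full.
step3 = sum(have.values()) - sum(used.values()) - step1 - step2

print(step1, step2, step3)
```

This yields 25, 87.5 and 137.5 consumers, which the text rounds to 25, 88 and 137 (250 free sessions in total).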
2.3.2 Consumer Identity and Latency Time
Looky assigns resources exclusively to a consumer. To do this,
the Looky system expects by default that a consumer sends a
unique name when asking for a resource in a pool.
Looky first checks whether a resource was already assigned to this
consumer. If so, the same resource is assigned again
to the consumer (without wasting another resource).
This assignment does not last forever. For each assigned or reassigned
resource the Looky system records the point in time when the resource
was handed over to the consumer. The resource is reserved for this
consumer only for the period of time given by a pool property
named 'latency time'. Once this period of time has elapsed, the
resource is freed by the Looky system and may be distributed again.
However, this method does not fit all needs:
- The heuristic approach of assuming a session length may cause some
problems (e.g. you cannot find a good latency time because
the length of the user sessions varies very much and there are some
limited resources that cannot stay blocked for as long as the given
latency time defines).
In this case, you have to use a Looky component called 'looky
controller' that passes the number of active sessions back
to the 'looky balancer'.
Please read section [Looky Components] to get more
information about this.
- A consumer does not always have a unique name, or maybe you do not
want to make consumers sticky to a specific machine.
In this case, the Looky system offers a pseudo consumer name
(namely a single dash '-') that is always unique. If consumers
pass this pseudo name, Looky will treat all requests as coming from
'unique' consumers.
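The interplay of consumer name, latency time and the pseudo name '-' can be sketched like this. It is a simplified Python illustration under assumptions: the class name Assignments, its methods and the use of seconds for the latency time are invented for this sketch.

```python
import time
from typing import Dict, Optional, Tuple


class Assignments:
    """Remembers which resource a named consumer got, and when."""

    def __init__(self, latency_seconds: float):
        self.latency = latency_seconds
        self._by_consumer: Dict[str, Tuple[str, float]] = {}

    def lookup(self, consumer: str, now: Optional[float] = None) -> Optional[str]:
        """Return the still-reserved resource of a known consumer, if any."""
        now = time.time() if now is None else now
        if consumer == "-":
            return None  # pseudo name: every request counts as a new consumer
        entry = self._by_consumer.get(consumer)
        if entry is None:
            return None
        resource, since = entry
        if now - since > self.latency:
            # Latency time has elapsed: the reservation is dropped.
            del self._by_consumer[consumer]
            return None
        return resource

    def assign(self, consumer: str, resource: str, now: Optional[float] = None) -> None:
        """Record an assignment or reassignment together with its time point."""
        now = time.time() if now is None else now
        if consumer != "-":
            self._by_consumer[consumer] = (resource, now)
```

A caller would first try lookup(); only if that returns nothing does a seed method have to choose a resource, which is then recorded with assign().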
2.3.3 Shared And Coupled Resources
As mentioned earlier, a resource may be assigned to
different pools. If that happens, the Looky system offers two
approaches to cope with the distribution problem of this resource.
- shared resource: If a resource is shared between pools,
each pool has its own set of properties 'quantity', 'used' and
'have' for the resource.
As a result of this, you are able to control exactly how much
of the resource's capacity is available to each of the pools.
Consider the following example (see picture on the left): A
resource is 'shared' between pool 'admin' and pool 'sales'. In
pool 'admin', the resource has a quantity of 50, in pool
'sales' the resource has a quantity of 150.
In this example Looky is able to assign a maximum of 150 sessions
to consumers that request a resource in pool 'sales'. There
are currently 90 sessions, so there are 60 sessions left in this
pool.
Once these 60 sessions are consumed, Looky is not able
to assign more consumers to this resource, even if the quantity
of the same resource located in pool 'admin' is not completely
consumed.
NB: the example is actually not complete; the looky system has
a slightly different behaviour if you use the 'looky
controller' component: If no sessions are available on a
specific pool but the machine has a low load, the controller may
enlarge the number of sessions (actually the 'have' value of
the shared resource in a pool) to run more sessions in that pool
on the same resource.
- coupled resource: If a resource is coupled, Looky
maintains only one 'quantity' and one 'have' property for both
pools. Only the 'used' property is kept per pool.
As a result of this, the 'coupled' resource may be consumed in
any pool where this resource is located.
Consider the following example (see picture on the left): A
resource is 'coupled' between pool 'admin' and pool 'sales'.
There is a coupled quantity of 200 sessions for both pools.
This quantity may be consumed either by consumers asking for pool
'admin' or by consumers asking for pool 'sales'. In our example we have
90 sessions running in pool 'sales' and 35 sessions in pool 'admin'.
So there are 200 - 90 - 35 = 75 sessions left. These 75 sessions
may be assigned to consumers asking for pool 'sales', to consumers
asking for pool 'admin', or to any mixture of both.
NB: the example is actually not complete: the Looky system
has a slightly different behaviour if you use the 'looky
controller' component: If no sessions are available in a
specific pool but at least one machine in the pool's resources
has a low load, the controller may enlarge the number of
sessions (actually the 'have' value of the coupled resource)
to run more sessions on the same resource.
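The difference between the two approaches comes down to where the remaining sessions are counted. A minimal Python sketch, with function names chosen only for illustration and without the controller adjustments mentioned in the notes:

```python
from typing import Dict


def shared_left(quantity_in_pool: int, used_in_pool: int) -> int:
    """Shared resource: every pool has its own quantity and its own 'used'."""
    return quantity_in_pool - used_in_pool


def coupled_left(quantity: int, used_per_pool: Dict[str, int]) -> int:
    """Coupled resource: one quantity for all pools, 'used' kept per pool."""
    return quantity - sum(used_per_pool.values())
```

For the two examples, shared_left(150, 90) gives the 60 sessions left in pool 'sales', and coupled_left(200, {"sales": 90, "admin": 35}) gives the 75 sessions that either pool may still consume.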
2.3.4 Sticky and Queued Distribution Mode
If all resources in a pool are consumed, the Looky system must
decide what to do when another unknown consumer asks for a resource in
that pool.
This decision is controlled by the pool's property 'distribution
mode'. There are two general behaviours:
- sticky mode: If the distribution mode is set to 'sticky',
the consumer will get no resource from the Looky system (step
2b in section [2.2]).
If that happens, the consumer gets a redirection to a static
HTML page indicating the problem: The consumer has to wait
until a resource is freed (by latency time or by the looky
controller).
- queued mode: If the distribution mode is set to 'queued',
the Looky system will automatically free the 'oldest' assigned
resource and reassign it to the new consumer.
In this mode, all consumers will always get a resource from a
pool.
NB: this description is actually not complete: if all
resources of a pool with a 'queued' distribution mode are
frozen or locked, a new consumer will not get any resource
from this pool. The consumer is also redirected to the static
'no-resource' HTML page.
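The two distribution modes can be illustrated with an ordered mapping of current assignments. This Python sketch assumes that 'oldest' means the assignment made first; the function name assign_when_full and the data layout are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


def assign_when_full(mode: str, assigned: OrderedDict, consumer: str) -> Optional[str]:
    """Handle a new consumer when every resource of the pool is consumed.

    'assigned' maps consumer name -> resource, oldest assignment first.
    Returns the resource for the new consumer, or None (step 2b).
    """
    if mode == "sticky":
        return None  # the consumer has to wait until a resource is freed
    if mode == "queued":
        if not assigned:
            return None  # nothing can be freed
        # Free the oldest assignment and hand its resource to the newcomer.
        _, resource = assigned.popitem(last=False)
        assigned[consumer] = resource
        return resource
    raise ValueError(f"unknown distribution mode: {mode}")
```

In 'queued' mode the consumer that held the oldest assignment loses its reservation, which is the price for never turning a new consumer away.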
2.3.5 Frozen and Locked Resources
Finally there are two boolean properties left that may be set on
a resource or on a resource within a pool.
If an administrator wants to remove a resource temporarily from a pool,
he may 'freeze' this resource. If a resource is frozen in
a pool, the resource is ignored by the seed methods of that pool.
If a resource (or more precisely, the node hosting this resource) is
down, the resource may be locked by the looky controller. The
lock is applied to the resource itself and influences the seed methods of all
pools hosting this resource. The lock is automatically
removed by the looky controller (see section 3.1.3)
once the node is available again.
If all resources of a pool are either frozen or locked, no consumer
will get any resource (even if Looky has already assigned the
resource to a known consumer).
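The different scope of the two flags (frozen per pool, locked per resource) can be sketched as a simple filter that runs before any seed method. The sets and the function name eligible are assumptions made for this Python illustration.

```python
from typing import List, Set, Tuple


def eligible(pool: str, resources: List[str],
             frozen: Set[Tuple[str, str]], locked: Set[str]) -> List[str]:
    """Resources of a pool that the seed methods may still consider.

    'frozen' holds (pool, resource) pairs, because a freeze applies to a
    resource in one pool; 'locked' holds resources, because a lock applies
    to the resource in every pool hosting it.
    """
    return [r for r in resources
            if (pool, r) not in frozen and r not in locked]


frozen = {("sales", "192.1.22.9")}   # frozen in pool 'sales' only
locked = {"192.1.22.10"}             # node is down: locked everywhere
```

If eligible() returns an empty list, the consumer ends up at the static 'no-resource' page regardless of the distribution mode.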
3 Looky System
3.1 Looky Components
The Looky system consists of the following components.
Not all of these components are currently
implemented. Please have a brief look at the
[releases].
3.1.1 looky server (LB)
The looky server is the heart of the looky system. It's a
single threaded perl daemon running a simple TCP protocol on
port 4040.
The looky server is contacted by the following looky components:
- looky balancer client: to acquire a resource
- looky controller: to increase/decrease/lock/free
resources based on usage/load probes
- looky balancer manager: to manage the balancer
3.1.2 looky balancer client (LBC)
The looky balancer client (or, shorter, the 'looky
client') is a tool that is embedded in an HTTP server.
Currently two kinds of looky balancer clients exist:
- a 'CGI' client, to be embedded in e.g. an Apache server and
- a 'SQUID' client, to be embedded in a Squid server.
3.1.3 looky controller (LC)
The looky controller is the 'real' load balancer. Although
you may distribute user sessions with just a combination of
looky server and looky balancer client, you have no chance to
recognize 'load' that way. The looky controller is a single threaded
perl daemon running a simple TCP protocol on port 4041.
If you add the looky controller to your system, the looky
server is mastered by the looky controller. In mastered mode,
the looky server expects to get 'real' data from the looky
controller about existing user sessions and current load on
the resource nodes.
The looky controller realizes a complex strategy to balance
'load' with 'sessions'. Have a look at
controlling to learn more about this.
The looky controller is contacted by the following looky
components:
- looky probe client: to deliver probes about load and
number of user sessions (push mode)
- looky observer: same as looky probe client, but using poll
mode
3.1.4 looky probe client (LPC)
The looky probe client is just a simple shell tool that may
be launched by 'wise' tools to pass current load and current
number of processes back to looky controller.
The probe client protects you from connecting directly to the
looky controller. Actually your application would be faster,
if you do so. :-)
3.1.5 looky observer (LO)
The looky observer is a single threaded perl daemon running a
simple TCP protocol on port 4043.
The observer is intended to poll the nodes hosting resources, to check
whether they are available and to collect usage and load data
from these nodes. If you want to use the looky observer
process, you have to install a looky observer client on each
node that is controlled by the looky observer.
Information collected by the looky observer is passed to the looky
controller in the same way a looky probe client would pass the data.
3.1.6 looky observer client (LOC)
The looky observer client is a simple CGI client to be embedded
into an Apache web server on each node that is consulted by the
looky observer.
The looky observer client collects information about current
system usage (directly from Apache) and sends this back to the
looky observer.
3.1.7 looky balancer manager (LBM)
The looky balancer manager is a web based administration tool
to control the looky server in real time.
An inactive demonstration of this tool is available on this web
site: [Example LBM]
3.1.8 looky controller manager (LCM)
The looky controller manager is a web based administration
tool to manage the looky controller in real time.
An inactive demonstration of this tool is available on this web
site: [Example LCM]
3.1.9 looky observer manager (LOM)
The looky observer manager is a web based administration tool
to manage the looky observer in real time.
3.1.10 looky configuration editor (LCE)
The looky configuration editor is a web based administration
tool to edit the central looky configuration file.
3.2 Set-ups
The following three sections provide some information about how a
Looky system could be built with the available components. The manager
components are not shown; if you find a specific server in one
of the systems, you probably also need the corresponding manager tool.
3.2.1 Basic Set-up
A basic set-up of a looky system consists of a looky balancer client
(LBC) and a central looky server (LS).
This system is ok for the following situations:
- the number of consumers or the number of resources is high.
- you are able to identify a consumer by a unique name and have
enough resources for all consumers.
- the length of your sessions is very short.
In all these situations you actually have no need for a real
loadbalancer: you have a need for dynamic user
management.
This is perfectly done by the basic set-up.
3.2.2 Controlled Set-up
If you need real load balancing, or if you just want to know
more exactly when a resource is actually freed, you need
a looky controller.
If you add a looky controller to your Looky system, you give the
Looky system a kind of feedback. The feedback is
provided by looky probe clients, which are installed on each
node hosting a resource.
The looky probe clients regularly check the current load or just the
number of sessions (this part is actually implemented by you). This
information is passed to the looky controller. The looky controller
collects all data sent by the probe clients and does the load balancing on
the looky server:
- the current number of sessions is passed to the looky server to
reduce the 'used' values of resources (so Looky becomes aware when a
session has been terminated)
- the load is used to increase or decrease the 'have' value of
resources: this will protect nodes from stress load or idle
times.
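The two feedback rules can be illustrated with a deliberately simple Python sketch. The real looky controller follows a more complex strategy; the function name apply_probe, the load thresholds and the step size below are assumptions chosen purely for illustration.

```python
def apply_probe(resource: dict, sessions: int, load: float,
                high: float = 0.8, low: float = 0.3, step: int = 10) -> dict:
    """Return updated 'used' and 'have' values for one resource.

    'sessions' and 'load' are the values reported by a probe client.
    The thresholds 'high' and 'low' and the 'step' size are hypothetical.
    """
    updated = dict(resource)
    # Rule 1: the reported session count can only reduce the 'used' value,
    # so terminated sessions are noticed.
    updated["used"] = min(resource["used"], sessions)
    # Rule 2: the load raises or lowers the 'have' value.
    if load > high:
        updated["have"] = max(0, resource["have"] - step)   # protect from stress
    elif load < low:
        updated["have"] = resource["have"] + step           # avoid idle times
    return updated
```

For example, a resource with a 'have' of 200 and a 'used' of 150 that reports 120 sessions under high load would end up with a 'used' of 120 and a 'have' of 190 in this sketch.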
3.2.3 Controlled & Observed Set-up
The last set-up, the 'observed set-up', adds another server process to the looky
set-up. This observer process changes the 'push' mode realized
by the controlled set-up into a 'poll' mode.
With this change, the looky system will be able to detect a loss of
service.