This repository has been archived by the owner on Jul 27, 2022. It is now read-only.

request latency #82

Open
EranGabber opened this issue Jun 15, 2015 · 22 comments · Fixed by #102

@EranGabber
Contributor

The elapsed time of the collector fetching data for the first time from a cluster is more than 10 seconds, which seems like an eternity.

Note that the collector should use parallelism to extract the data, so this may be a bug somewhere.

@EranGabber
Contributor Author

Elapsed time on a test cluster provided by Jack:

$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 8975k 100 8975k 0 0 1394k 0 0:00:06 0:00:06 --:--:-- 2119k
132893 153118 9191413

real 0m6.461s
user 0m0.154s
sys 0m0.038s

@EranGabber
Contributor Author

Interestingly, two containers in this cluster contain an extra-long list of ExecIDs (about 100,000 entries each). This list should be stripped by the cluster-insight minion. This may explain some of the increase in the elapsed time.
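
A minimal sketch (not the actual cluster-insight code) of how the minion could drop this attribute before returning container data; the function name and the assumption that the input is the parsed "docker inspect" JSON are mine:

def strip_exec_ids(container_info):
    """Drop the potentially huge ExecIDs list from one container's inspect data.

    container_info is assumed to be the parsed JSON dict that the Docker
    remote API returns for a single container.
    """
    if isinstance(container_info, dict) and 'ExecIDs' in container_info:
        # Keep the key but discard the payload, which can exceed 8MB of text.
        container_info['ExecIDs'] = None
    return container_info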

@supriyagarg
Contributor

Also, both of these containers are the skydns containers, the same ones that had the extra-long ExecIDs lists in the Cassandra cluster projects.

@EranGabber
Contributor Author

Before the cluster-insight minions delete the values of ExecIDs:
Note that the 2nd request immediately follows the first request.
The target cluster is running the Guestbook app.
$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 8462k 100 8462k 0 0 1007k 0 0:00:08 0:00:08 --:--:-- 2019k
115698 123192 8665329

real 0m8.405s
user 0m0.255s
sys 0m0.072s
$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 8462k 100 8462k 0 0 1583k 0 0:00:05 0:00:05 --:--:-- 2043k
115699 123193 8665408

real 0m5.354s
user 0m0.234s
sys 0m0.100s

@EranGabber
Contributor Author

After the cluster-insight minions omit the values of ExecIDs:
Note that the 2nd request immediately follows the first request.
The target cluster is running the Guestbook app.

$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 403k 100 403k 0 0 108k 0 0:00:03 0:00:03 --:--:-- 108k
11352 18998 413344

real 0m3.726s
user 0m0.022s
sys 0m0.004s
$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 403k 100 403k 0 0 1159k 0 --:--:-- --:--:-- --:--:-- 1163k
11352 18998 413344

real 0m0.358s
user 0m0.012s
sys 0m0.013s

*** important observations ***
The amount of data produced by accessing the /cluster endpoint was reduced from 8462KB to 403KB.
The elapsed time for a cold cache was reduced from 8.4 seconds to 3.7 seconds.

@EranGabber
Contributor Author

It takes about 1 second for Docker to return the details of this container. This time varies from 0.6 seconds to 1.1 seconds. Note that the "docker inspect" command returns over 8 MB of data. Caching at the cluster-insight minion should eliminate the long latency of reading the raw data from Docker.

$ time sudo docker inspect 5e9a552724da | wc
109457 109561 8312039

real 0m0.594s
user 0m0.268s
sys 0m0.028s
$ time sudo docker inspect 5e9a552724da | wc
109458 109562 8312115

real 0m1.104s
user 0m0.248s
sys 0m0.020s
$ time sudo docker inspect 5e9a552724da | wc
109458 109562 8312115

real 0m0.949s
user 0m0.244s
sys 0m0.016s
$ time sudo docker inspect 5e9a552724da | wc
109458 109562 8312115

real 0m0.648s
user 0m0.272s
sys 0m0.008s

@EranGabber
Contributor Author

Current implementation:
(a) the minion keeps a warm cache of all container information, which is initialized at start time.
Any cache miss will also populate the cache. The data is kept in the cache for one hour.
(b) the value of the ExecIDs attribute in all containers is removed by the minions, because it is often a very large list of hex values (more than 8MB of text).
(c) I still need to refresh the cache every hour to keep it warm.
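
For illustration, a rough sketch of the warm cache described in (a) and (c), assuming a simple in-process dictionary keyed by container ID with a one-hour TTL; the class and method names are mine, not the actual cluster-insight code:

import threading
import time

class ContainerCache(object):
    """Hypothetical warm cache of container information with a one-hour TTL."""

    def __init__(self, max_age_seconds=3600):
        self._max_age = max_age_seconds
        self._lock = threading.Lock()
        self._data = {}  # container_id -> (timestamp, value)

    def lookup(self, container_id):
        """Return the cached value, or None on a miss or an expired entry."""
        with self._lock:
            entry = self._data.get(container_id)
            if entry is None:
                return None
            timestamp, value = entry
            if time.time() - timestamp > self._max_age:
                del self._data[container_id]
                return None
            return value

    def update(self, container_id, value):
        """Insert or refresh an entry; called at start time and on a miss."""
        with self._lock:
            self._data[container_id] = (time.time(), value)

A cache miss would fall through to the Docker remote API (as in the "docker inspect" timings above) and then call update(); the hourly refresh in (c) could be a background thread that re-reads all containers and calls update() for each.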

The performance is now as follows on a cluster running the Guestbook app.
Cold collector cache:
$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 160 100 160 0 0 399 0 --:--:-- --:--:-- --:--:-- 400
4 16 160

real 0m0.532s
user 0m0.003s
sys 0m0.006s

Hot collector cache:
$ time curl http://xxx.xxx.xxx.xxx:5555/cluster | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 160 100 160 0 0 1282 0 --:--:-- --:--:-- --:--:-- 1280
4 16 160

real 0m0.131s
user 0m0.000s
sys 0m0.008s

EranGabber added a commit to EranGabber/cluster-insight that referenced this issue Jun 16, 2015
@EranGabber
Contributor Author

Still need to do:

  1. try to increase the number of worker threads in the master.
  2. write a unit test for docker_proxy.py

@EranGabber
Contributor Author

I will close this issue after trying to increase the number of worker threads and write a unit test.

@EranGabber EranGabber reopened this Jun 17, 2015
@EranGabber
Contributor Author

Measuring the collector running on a Cassandra test cluster with 4 minion nodes using the following shell command:
$ for i in $(seq 1 5); do echo $i; sleep 11; time curl http://localhost:5555/cluster | wc; done

Each result is the arithmetic average of 5 requests for the /cluster endpoint. The requests are spaced 11 seconds apart to force a cache miss. The access is from the node that runs the master collector.

1 worker thread: 4.284 seconds
2 worker threads: 3.642 seconds
3 worker threads: 3.640 seconds
4 worker threads: 3.600 seconds (this is the current default, which is the same as the number of minion nodes)
5 worker threads: 4.222 seconds
6 worker threads: 3.798 seconds

Summary: creating the same number of threads as the number of minion nodes produced the lowest latency. This is also the current default.
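
For illustration only (names are assumptions, not the actual master code), sizing the worker pool to the number of minion nodes could look like this:

from concurrent import futures

def collect_from_minions(minion_nodes, fetch_node_data):
    """Fetch data from all minion nodes in parallel.

    One worker thread per minion node gave the lowest latency in the
    measurements above. fetch_node_data is an assumed callable that queries
    a single node's Docker remote API.
    """
    with futures.ThreadPoolExecutor(max_workers=len(minion_nodes)) as pool:
        return list(pool.map(fetch_node_data, minion_nodes))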

EranGabber added a commit to EranGabber/cluster-insight that referenced this issue Jul 4, 2015
@EranGabber
Contributor Author

When I measured the collector running on a Cassandra test cluster with 4 minion nodes again today, I got the following results:
$ for i in $(seq 1 5); do sleep 11; time curl http://localhost:5555/cluster | wc; done
4 worker threads: 5.318 seconds average latency

$ for i in $(seq 1 5); do sleep 11; time curl http://localhost:5555/debug | wc; done
4 worker threads: 5.623 seconds average latency

This does not make sense, because the average latency of the /debug endpoint should be less than the average latency of /cluster, since the output of /debug is about 1/10 the size of the output of /cluster (47KB vs. 712KB).

@EranGabber
Contributor Author

Repeating the measurements:

$ for i in $(seq 1 5); do sleep 11; time curl http://localhost:5555/cluster | wc; done
4 worker threads: 5.640 seconds average latency

$ for i in $(seq 1 5); do sleep 11; time curl http://localhost:5555/debug | wc; done
4 worker threads: 4.728 seconds

This makes more sense. However, the variability of the measurements is worrisome.

@vasbala
Contributor

vasbala commented Jul 6, 2015

The logic for /debug does everything that /cluster does, plus some additional work to output it in DOT format. So, even though the output size is much smaller for /debug, the work done is more than for /cluster. The numbers make sense to me.


@EranGabber
Contributor Author

Performance is still too slow. The master collector should respond in about 1 second for /cluster or /debug requests.

@EranGabber EranGabber reopened this Jul 6, 2015
EranGabber added a commit to EranGabber/cluster-insight that referenced this issue Jul 7, 2015
…calls.

This endpoint should help to find performance problems which increase the
overall response time. See issue google#82.
@EranGabber
Contributor Author

The output of the '/elapsed' endpoint shows the following requests in decreasing order of elapsed time.
These requests were sent to the Kubernetes/Docker back ends while computing the response to the "/cluster" endpoint.

$ time curl http://localhost:5555/cluster | wc
17536 29159 709231
real 0m7.097s
user 0m0.016s
sys 0m0.000s

The top low-level requests by decreasing elapsed times:
"elapsed": {
"average": 0.08926033351732338,
"count": 230,
"max": 0.4532020092010498,
"min": 0.014033794403076172
},

 {
    "elapsed_seconds": 0.4532020092010498, 
    "start_time": "2015-07-07T19:44:53.894966", 
    "what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.e4cc795_fluentdelasticsearch-kubernetes-minion-5emu_default_5f9950d081564dad2522c56233c6ad72_be5bb21b/json"
  }, 
 {
    "elapsed_seconds": 0.31882810592651367, 
    "start_time": "2015-07-07T19:44:50.769393", 
    "what": "https://10.0.0.1:443/api/v1/services"
  }, 
  {
    "elapsed_seconds": 0.29047608375549316, 
    "start_time": "2015-07-07T19:44:50.767264", 
    "what": "https://10.0.0.1:443/api/v1/pods"
  }, 
  {
    "elapsed_seconds": 0.31073594093322754, 
    "start_time": "2015-07-07T19:44:50.765486", 
    "what": "https://10.0.0.1:443/api/v1/pods"
  }, 
  {
    "elapsed_seconds": 0.31626391410827637, 
    "start_time": "2015-07-07T19:44:50.769959", 
    "what": "https://10.0.0.1:443/api/v1/pods"
  }, 
  {
    "elapsed_seconds": 0.33477306365966797, 
    "start_time": "2015-07-07T19:44:50.785799", 
    "what": "https://10.0.0.1:443/api/v1/pods"
  }, 

Note that there are 4 concurrent calls to "api/v1/pods". The reason is that all four requests started at about the same time and each one missed the cache.
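
For reference, a minimal sketch of how each low-level call could be timed and recorded for the '/elapsed' endpoint; the record fields follow the output above, but the helper itself is an assumption, not the actual implementation:

import datetime
import threading
import time

_elapsed_records = []
_elapsed_lock = threading.Lock()

def record_elapsed(what, func, *args, **kwargs):
    """Call func(*args, **kwargs), time it, and record it for /elapsed."""
    start = time.time()
    try:
        return func(*args, **kwargs)
    finally:
        with _elapsed_lock:
            _elapsed_records.append({
                'elapsed_seconds': time.time() - start,
                'start_time': datetime.datetime.fromtimestamp(start).isoformat(),
                'threadIdentifier': threading.current_thread().ident,
                'what': what,
            })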

@EranGabber
Contributor Author

More elapsed time measurements:

$ time curl http://localhost:5555/cluster | wc
real 0m7.823s
user 0m0.008s
sys 0m0.004s

$ curl http://localhost:5555/elapsed
{
"average": 0.08131413026289507,
"count": 231,
"max": 0.5312027931213379,
"min": 0.010451793670654297
},
{
"elapsed_seconds": 0.5312027931213379,
"start_time": "2015-07-07T22:14:41.519287",
"threadIdentifier": 140262333708032,
"what": "http://kubernetes-minion-5emu:4243/images/b6b9a86dc06aa1361357ca1b105feba961f6a4145adca6c54e142c0b
e0fe87b0/json"
},
{
"elapsed_seconds": 0.42257118225097656,
"start_time": "2015-07-07T22:13:51.797875",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/json"
},
{
"elapsed_seconds": 0.30991411209106445,
"start_time": "2015-07-07T22:14:42.644698",
"threadIdentifier": 140262308529920,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.8fdb0e41_kube-dns-v3-pesl1_default_4af962d5-
1eae-11e5-a2f4-42010af0216b_cfad6c4e/top?ps_args=aux"
},
{
"elapsed_seconds": 0.267808198928833,
"start_time": "2015-07-07T22:14:42.807726",
"threadIdentifier": 140262325315328,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-min
ion-5emu_default_5f9950d081564dad2522c56233c6ad72_be5bb21b/top?ps_args=aux"
},
{
"elapsed_seconds": 0.34384989738464355,
"start_time": "2015-07-07T22:14:42.919764",
"threadIdentifier": 140262333708032,
"what": "http://kubernetes-minion-z4dv:4243/images/b2d396dfaf1f81a38b920e6ea89eea6542bcc799fefd259a85e304e8
d4efcedd/json"
},

Thread 140262325315328 had 44 calls.
Thread 140262316922624 had 48 calls.
Thread 140262333708032 had 42 calls.
Thread 140262308529920 had 42 calls.
Thread 140262353975040 had 56 calls.
The first 50 calls or so were performed by thread 140262353975040.
The first call of the 50 started at 2015-07-07T22:13:51.557119.
The last call of the 50 started at 2015-07-07T22:13:54.237528.
This means that for the first 3 seconds of processing the /cluster request, the collector employed only one thread.

@EranGabber
Contributor Author

Observations:

  1. There are about 230 calls to Kubernetes/Docker which are not cached. With 4 concurrent worker threads, that is about 60 calls per thread.
  2. The average call takes about 0.08 - 0.09 seconds, which means that each worker thread takes at least 5.4 seconds (roughly 60 calls x 0.09 seconds) to complete, even without handling occasional longer calls.
  3. The longest calls are the ones that get process information, followed by those that access container information and image information.
  4. We should increase concurrency, perhaps by changing the order in which the calls are created (see the sketch below).
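
A sketch of point 4 above (the callables and structure are assumptions): submit the per-container work to the thread pool as soon as it is known, instead of letting one thread enumerate everything serially before the others start.

from concurrent import futures

def fetch_all_container_details(nodes, list_containers, inspect_container,
                                num_workers=4):
    """Fan per-container Docker calls out across all worker threads.

    list_containers(node) and inspect_container(node, container_id) are
    assumed wrappers around the Docker remote API calls shown above.
    """
    with futures.ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Listing containers per node is cheap; do it first, then let the
        # pool process the expensive per-container inspections concurrently.
        tasks = [pool.submit(inspect_container, node, container_id)
                 for node in nodes
                 for container_id in list_containers(node)]
        return [task.result() for task in tasks]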

The first 10 calls are:
{
"elapsed_seconds": 0.19701814651489258,
"start_time": "2015-07-07T22:13:51.557119",
"threadIdentifier": 140262353975040,
"what": "https://10.0.0.1:443/api/v1/nodes"
},
{
"elapsed_seconds": 0.42257118225097656,
"start_time": "2015-07-07T22:13:51.797875",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/json"
},
{
"elapsed_seconds": 0.08326482772827148,
"start_time": "2015-07-07T22:13:52.220508",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.6e0eef86_cluster-insight-master-controller-v
1-f3h31_default_4ed86e5d-24f5-11e5-a2f4-42010af0216b_de7f60bb/json"
},
{
"elapsed_seconds": 0.07416605949401855,
"start_time": "2015-07-07T22:13:52.304149",
"threadIdentifier": 140262353975040,
"what": "https://10.0.0.1:443/api/v1/pods"
},
{
"elapsed_seconds": 0.04779195785522461,
"start_time": "2015-07-07T22:13:52.388333",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_cluster-insight.83588692_cluster-insight-master-
controller-v1-f3h31_default_4ed86e5d-24f5-11e5-a2f4-42010af0216b_27180b6b/json"
},
{
"elapsed_seconds": 0.1186530590057373,
"start_time": "2015-07-07T22:13:52.448607",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_cluster-insight.1697a734_cluster-insight-minion-
controller-v1-0bv7w_default_4d99e1d8-24f5-11e5-a2f4-42010af0216b_11d5c356/json"
},
{
"elapsed_seconds": 0.062438011169433594,
"start_time": "2015-07-07T22:13:52.581712",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.baeedb8_cluster-insight-minion-controller-v1
-0bv7w_default_4d99e1d8-24f5-11e5-a2f4-42010af0216b_360dba4d/json"
},
{
"elapsed_seconds": 0.043280839920043945,
"start_time": "2015-07-07T22:13:52.650576",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_heapster.ef1494a0_monitoring-heapster-v3-5ohfh_d
efault_244e735f-1f69-11e5-a2f4-42010af0216b_194de2c1/json"
},
{
"elapsed_seconds": 0.013170957565307617,
"start_time": "2015-07-07T22:13:52.699920",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_skydns.c572ec4b_kube-dns-v3-pesl1_default_4af962
d5-1eae-11e5-a2f4-42010af0216b_fafbbd6f/json"
},
{
"elapsed_seconds": 0.025799989700317383,
"start_time": "2015-07-07T22:13:52.716121",
"threadIdentifier": 140262353975040,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_fluentd-elasticsearch.fe026601_fluentd-elasticse
arch-kubernetes-minion-5emu_default_5f9950d081564dad2522c56233c6ad72_2add25c9/json"
},

The first 5 calls made by all other threads are:
{
"elapsed_seconds": 0.11059689521789551,
"start_time": "2015-07-07T22:13:54.297367",
"threadIdentifier": 140262325315328,
"what": "http://kubernetes-minion-9ura:4243/containers/k8s_POD.e4cc795_fluentd-elasticsearch-kubernetes-min
ion-9ura_default_5f9950d081564dad2522c56233c6ad72_cf65aaf4/top?ps_args=aux"
},
{
"elapsed_seconds": 0.14503097534179688,
"start_time": "2015-07-07T22:13:54.284837",
"threadIdentifier": 140262333708032,
"what": "http://kubernetes-minion-5emu:4243/containers/k8s_POD.b70514b6_elasticsearch-logging-v1-0l717_defa
ult_4ac8b711-1eae-11e5-a2f4-42010af0216b_41622438/top?ps_args=aux"
},
{
"elapsed_seconds": 0.14322996139526367,
"start_time": "2015-07-07T22:13:54.316736",
"threadIdentifier": 140262316922624,
"what": "http://kubernetes-minion-y02e:4243/containers/k8s_cassandra.b04ecdce_cassandra-o4jhi_default_0dd57
a4f-1f3c-11e5-a2f4-42010af0216b_535b6797/top?ps_args=aux"
},
{
"elapsed_seconds": 0.048887014389038086,
"start_time": "2015-07-07T22:13:54.431805",
"threadIdentifier": 140262333708032,
"what": "http://kubernetes-minion-5emu:4243/images/2c40b0526b6358710fd09e7b8c022429268cc61703b4777e528ac9d4
69a07ca1/json"
},
{
"elapsed_seconds": 0.05738997459411621,
"start_time": "2015-07-07T22:13:54.430641",
"threadIdentifier": 140262308529920,
"what": "http://kubernetes-minion-9ura:4243/containers/k8s_grafana.686aac74_monitoring-influx-grafana-v1-ig
6id_default_21f74fee-2406-11e5-a2f4-42010af0216b_914d0eb6/top?ps_args=aux"
},

@EranGabber
Contributor Author

More confusing evidence:
The same elapsed time entry appears twice.
{
"elapsed_seconds": 0.1894969940185547,
"start_seconds": 1437420599.953763,
"start_time": "2015-07-20T19:29:59.953763",
"threadIdentifier": 140241479157504,
"what": "get_pods() cache hit"
},
...
{
"elapsed_seconds": 0.1894969940185547,
"start_seconds": 1437420599.953763,
"start_time": "2015-07-20T19:29:59.953763",
"threadIdentifier": 140241479157504,
"what": "get_pods() cache hit"
},

Looking at the log file after one day, I could not find any double entries. This may have been an artifact of the "less" or "more" commands.

@EranGabber
Contributor Author

Another problem:
sometimes a cache hit takes 0.4 seconds or more. It should be instantaneous.

{
"elapsed_seconds": 1.5974044799804688e-05,
"start_seconds": 1437493046.502504,
"start_time": "2015-07-21T15:37:26.502504",
"threadIdentifier": 140175146006272,
"what": "StringCache lookup"
},
{
"elapsed_seconds": 0.2224419116973877,
"start_seconds": 1437493046.360348,
"start_time": "2015-07-21T15:37:26.360348",
"threadIdentifier": 140175146006272,
"what": "get_pods() cache hit"
},
{
"elapsed_seconds": 0.2226250171661377,
"start_seconds": 1437493046.360346,
"start_time": "2015-07-21T15:37:26.360346",
"threadIdentifier": 140175146006272,
"what": "get_selected_pods()"
},

The only expensive operation that JSONCache.lookup() does in addition to StringCache.lookup() is a json.loads() of the cached string. However, json.loads() is called all over the place. Maybe the problem is caused by memory management during concurrent processing.
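
One way to test this hypothesis is to cache the parsed object instead of the string, so json.loads() runs only once per entry; a sketch under the assumption that JSONCache currently stores strings and re-parses them on every lookup (the trade-off is that callers then share one mutable object instead of receiving a private copy):

import json
import threading
import time

class ParsedJSONCache(object):
    """Hypothetical cache variant that stores already-parsed JSON values."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (timestamp, parsed_value)

    def update(self, key, json_string):
        parsed = json.loads(json_string)  # parse once, at insertion time
        with self._lock:
            self._data[key] = (time.time(), parsed)

    def lookup(self, key):
        """Return the parsed value without re-running json.loads()."""
        with self._lock:
            entry = self._data.get(key)
        return entry[1] if entry is not None else None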

@EranGabber
Contributor Author

Running the cluster-insight collector with 1 worker thread and 100% cache hits for containers, images, and processes yields a response time between 0.4 seconds and 4 seconds.
Running it with 4 worker threads and 100% cache hits for containers, images, and processes yields much larger response time variability.

Possible reasons:

  1. The VM that runs the cluster-insight collector is busy, and I do not get any scheduling guarantee.
  2. There is something wrong in the caching code. It is probably related to the json.loads() call. It could be the result of a critical section in memory management.

@EranGabber
Contributor Author

Fresh measurement of the Cluster-Insight collector with aggressive caching (all Kubernetes requests except getting process information are cached).
I ran the following command to collect 10 consecutive elapsed times:
% for i in $(seq 1 10); do echo $i; time curl http://localhost:5555/cluster | wc; curl http://localhost:5555/elapsed > /dev/null; sleep 10; done

"n1-standard-1" machines with 4 worker threads:
min/average/max elapsed time (seconds): 2.451 / 4.497 / 6.827

Output of "curl http://localhost:5555/cluster | wc" was consistently
17390 28831 706407

@EranGabber
Contributor Author

Same measurements of the Cluster-Insight collector with aggressive caching (all Kubernetes requests except getting process information are cached) on a different cluster (thanks, Vas!).
I ran the following command to collect 10 consecutive elapsed times:
% for i in $(seq 1 10); do echo $i; time curl http://localhost:5555/cluster | wc; curl http://localhost:5555/elapsed > /dev/null; sleep 10; done

"n1-standard-1" machines with 4 worker threads:
min/average/max elapsed time (seconds): 0.307 / 0.355 / 0.528

Output of "curl http://localhost:5555/cluster | wc" was consistently
15460 26740 586765

These measurements indicate that the slowness and high variability of the Cluster-Insight collector are caused by the VM and cluster and not by the code. Aggressive caching is good in any case.
