Thursday, 25 April 2019

Virtualization in DC vs AWS

Without getting into the larger debate of private cloud vs public cloud for the organization as a whole, this document tries to give the numbers that let us make a fair comparison between AWS and a private cloud built on OpenStack in terms of performance - and performance alone.
The initial benchmarking stemmed from the proposed DC migration to the west coast. The migration did not happen, but it gave us the opportunity to study the performance impact of having our own cloud.
One of the gripes that I've had about our colo infra is that we never seem to have gotten grade-A processors. Almost all of them were hand-me-down servers or mid-level Intel Xeons, sometimes with absurdly low clock speeds. The storage side wasn't very brilliant either.
This was solved when we obtained a test bed from Dell with the latest and greatest hardware in their lab that we could set up and run benchmarks on.
The hosted hardware in the test environment consisted of:
PowerEdge R740XD (Quantity: 1)
 (2) Intel Gold 6136 3.0GHz 12C processors
 384 GB RAM (12x 32GB configuration)
 Dell PERC H740P controller
 (6) 800GB SAS SSDs (RAID10 configuration)
 Mellanox 2 port 25GbE ConnectX-4 adapter
 Dell S5048 25 GbE 48-port switch
 Dell S3048 1GbE management switch (out-of-band management)
Note that this was not the top-of-the-line Platinum processor with a dedicated chip for syscalls that AWS had. But it represents a decent compromise between cost and performance and, more importantly, it was something we could hope to replicate in our DC without having to trade an arm and a leg (of the company).
We decided to pick the latest and greatest in the AWS arsenal - a c5.4xlarge with a 100G SSD - to pit against our VM, which had the same configuration for CPU, memory and disk.
The VM was set up with pass-through CPU mode - meaning all the flags on the host CPU were exposed to the VM.
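One way to sanity-check that pass-through took effect is to compare the flags line of /proc/cpuinfo on the host against the one inside the guest. The following is only an illustrative Python sketch and not part of the benchmark setup described here; the function names and the sample strings are made up.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_in_guest(host_text, guest_text):
    """Flags the host CPU advertises that the guest does not see."""
    return sorted(cpu_flags(host_text) - cpu_flags(guest_text))


# Tiny made-up samples, NOT real output from either machine.
host_sample = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f\n"
guest_sample = "processor\t: 0\nflags\t\t: fpu sse2 avx2\n"

print(missing_in_guest(host_sample, guest_sample))
```

With real data, the two inputs would be the contents of /proc/cpuinfo read on the host and inside the VM; an empty result would suggest that the guest sees every host flag.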
What to compare
Once we had the hardware, we needed to decide what exactly we wanted to compare with AWS. We already knew that we had no hope of competing with them in terms of storage. We also knew we could match AWS in terms of network performance because we had gotten SR-IOV working - which gives VM NICs bare-metal performance. We ultimately decided on these:
  • Raw CPU performance - Compute
  • Memory performance - Compute
  • Mixed Compute performance - Compilation tests
  • Raw Storage benchmarks
  • SQLite benchmarks - Real world approximation for storage intensive tasks
Care was taken to match OS, software and library versions as much as possible. We also accounted for the fact that ours was the only VM running on the host by dynamically disabling the extra cores on the host - meaning the only thing that would come into play is the virtualization overhead (or so we hope).
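On Linux, logical CPUs can typically be taken offline at runtime by writing 0 to /sys/devices/system/cpu/cpuN/online. This write-up does not record the exact mechanism that was used, so treat the following Python as a sketch under that assumption; the helper names and the CPU counts in the example are hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs location for per-CPU hotplug controls.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def offline_targets(total_cpus, keep):
    """Paths of the 'online' files for every logical CPU beyond the first `keep`."""
    if not 1 <= keep <= total_cpus:
        raise ValueError("keep must be between 1 and total_cpus")
    return [SYSFS_CPU / f"cpu{n}" / "online" for n in range(keep, total_cpus)]


def take_offline(paths, dry_run=True):
    """Write 0 to each path; with dry_run=True only report what would happen."""
    for path in paths:
        if dry_run:
            print(f"would write 0 to {path}")
        else:
            path.write_text("0")  # needs root on a real host


# Hypothetical example: keep 2 of 4 logical CPUs online.
take_offline(offline_targets(total_cpus=4, keep=2))
```

The dry-run default is deliberate: taking the wrong CPUs offline on a production hypervisor is not something to do by accident.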
The Raw compute benchmarks were also later replicated on a host that was running 3 VMs of 16 cores each, with the tests running on all of them simultaneously.
Below we have the real numbers that we obtained out of these tests, and we also discuss what these numbers may imply. We are presenting a subset of all the benchmarks that we did.
The tests were run using the Phoronix Test Suite, and direct commands in the case of OpenSSL.
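The exact OpenSSL invocations are not recorded here. A plausible form is "openssl speed rsa2048" for the single core run and "openssl speed -multi 16 rsa2048" for the multi core run; both the choice of commands and the value 16 are assumptions. A small Python sketch for pulling the numbers out of a result line in the format shown in the tables below might look like this (the function name is made up):

```python
import re

# Matches a result line such as:
#   rsa 2048 bits 0.000036s 0.000001s 27984.2 959951.4
RSA_LINE = re.compile(
    r"rsa\s+(?P<bits>\d+)\s+bits\s+"
    r"(?P<sign_time>[\d.]+)s\s+(?P<verify_time>[\d.]+)s\s+"
    r"(?P<sign_per_s>[\d.]+)\s+(?P<verify_per_s>[\d.]+)"
)


def parse_rsa_line(line):
    """Return the fields of an RSA result line as a dict, or None if no match."""
    match = RSA_LINE.search(line)
    if match is None:
        return None
    return {
        "bits": int(match["bits"]),
        "sign_time_s": float(match["sign_time"]),
        "verify_time_s": float(match["verify_time"]),
        "sign_per_s": float(match["sign_per_s"]),
        "verify_per_s": float(match["verify_per_s"]),
    }


# Line taken from the multi core table in this post.
print(parse_rsa_line("rsa 2048 bits 0.000036s 0.000001s 27984.2 959951.4"))
```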

Raw CPU performance

Single core ssl signing
rsa 2048 bits     sign/s      verify/s
VM in Dell HW     1763.6      61026.7
AWS c5.4xlarge    1613.4      54149.8
Considering 2048 bit key: 8.9% faster
Implications: 
Considering the CPU models that we see inside the AWS instances, it is a bit surprising that our Gold CPU can be faster. It means that either 1) the AWS hypervisor doesn't allocate as much CPU share to each vCPU as the underlying CPU can provide, or 2) the AWS Platinum processors have a decidedly high core count but lower per-core performance than generally available top-end Xeon chips.
Multi core ssl signing
rsa 2048 bits     sign        verify      sign/s      verify/s
VM in Dell HW     0.000036s   0.000001s   27984.2     959951.4
AWS c5.4xlarge    0.000077s   0.000002s   12994.5     437587.3
Considering 2048 bit key: 109% faster
Implications:
Multi-core turbo scaling of commodity Xeon Golds is much better than in AWS, where it is probably uneven or turned off.
One interesting tidbit to mention here is that the Turbo scaling that Intel advertises does not happen equally for all processor cores. It drops more or less linearly as the number of active cores increases.
Check this graph out.

Memory

Redis benchmarking
This can be sort of a composite memory test with CPU being involved a fair bit.
                  Average (requests per second)
VM in Dell HW     1692634.75
AWS c5.4xlarge    1638467.50
22% faster
Implications:
We believe that this is due to the spill-over effect of the AWS CPUs having lower raw compute performance.

Composite Performance

Linux Kernel Compile test
                  Average (seconds)
VM in Dell HW     60.84
AWS c5.4xlarge    99.57
63% faster
Implications:
This is again a CPU intensive test, and the previous results percolate through.
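For reference, a "percent faster" figure depends on whether the metric is a time (lower is better) or a rate (higher is better). The Python sketch below shows one common convention for each; the function names are made up, and this is an assumption about how such figures can be derived rather than a record of how every number in this post was computed.

```python
def percent_faster_by_time(ours_seconds, theirs_seconds):
    """Speed advantage when both results are durations (lower is better)."""
    return (theirs_seconds / ours_seconds - 1.0) * 100.0


def percent_faster_by_rate(ours_rate, theirs_rate):
    """Speed advantage when both results are rates (higher is better)."""
    return (ours_rate / theirs_rate - 1.0) * 100.0


# Kernel compile averages from the table above: 60.84 s vs 99.57 s.
print(round(percent_faster_by_time(60.84, 99.57), 1))  # prints 63.7
```

Under this convention a machine that finishes in half the time is 100% faster, which is the same answer the rate version gives for double the throughput.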

Storage:

Before putting the numbers down here, let me state that storage is a complicated business. Raw storage performance is not what you get in the real world, and things like OS cache, filesystem choice, fsync choice, RAID cache and optimizations by QCOW make it harder to arrive at an apples-to-apples comparison.
So I am going to just paste the raw results of the entire test suite we ran for the Storage benchmarks below.
  1. AWS disk
  2. VM disk through QCOW
Note: The links are down at the moment. I will fix them as soon as I get the chance.

I am more than happy to discuss the specifics of the storage tests over a chat / meet.

Database

Sqlite
We could not run the full PostgreSQL test as part of Phoronix, so we settled for SQLite due to the time crunch. Yes, we know it's not a real DB.
                  Average (seconds)
VM in Dell HW     5.65 (RAIDed SAS SSD)
AWS c5.4xlarge    19.81
But these results are misleading. Here is the interesting thing:
                  Average (seconds)
Dell bare metal   64.3825
AWS c5.4xlarge    19.81
Bare metal takes 3.25x as long (325% of the AWS time)
Further Discussions:
By default, fsync is performed after every transaction in the SQLite test. When measuring raw fsync performance inside our VM, we found that it can be up to 100 times slower for each fsync call.
Check out the image below:

Which means the ridiculously low numbers that we were seeing with respect to Dell VM are probably due to a QCOW optimization.
Making sense of this involves a deep dive to the default KVM virtio storage cache mode, effects of OS cache, and effects of the RAID cache. Perhaps a separate write-up on this is in order.
But one thing we know for sure is that AWS has figured out how to drastically improve storage performance and that we cannot hope to match it unless we pay a storage vendor for a proprietary solution.
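For anyone who wants to reproduce the per-call fsync measurement mentioned above, here is a minimal Python sketch. It is not the tool that produced the numbers in this post; the function name, the 4 KiB payload and the call count are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def fsync_latencies(directory, calls=100, payload=b"x" * 4096):
    """Write `payload` then fsync, `calls` times; return per-call fsync seconds."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(calls):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # only the fsync itself is timed
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    samples = fsync_latencies(tempfile.gettempdir())
    mean_ms = sum(samples) / len(samples) * 1e3
    print(f"mean fsync: {mean_ms:.3f} ms over {len(samples)} calls")
```

The directory should sit on the disk under test. If the temp directory is a RAM-backed filesystem, fsync will be close to free and the numbers will say nothing about the storage stack.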
Final words:
Raw performance is not the only thing, not even the first thing that comes up as a consideration when deciding between private and public cloud. A cloud is much more than an infrastructure as a service platform. However, when choosing between the two, I believe when performance comes up, people usually lean towards the public cloud - this write-up hopes to clear things up (or muddy them further) in that aspect.

Wednesday, 20 March 2019

Interesting numbers #1

Interesting numbers will be a section on my blog where I point out statistics that I find interesting or mildly amusing, and that provide a new, different perspective.



25%
Tamil Nadu is a key market for education loans in India, accounting for 25% of all education loans disbursed across the country, amounting to around Rs. 20,650 crores.

1.5 Million
No. of hectares of forest diverted for non-forest purposes through the Forest Conservation Act

1.8 Million
No. of tribals to be evacuated under a recent Supreme Court order, under the pretext of protecting forests.

Thursday, 31 January 2019

On mentorship

Being a mentor is not an easy thing. There are a lot of things that a mentor has to do, beyond imparting knowledge. To mentor is to mould and sculpt - to recognize there is a permanence to the teaching, for better or worse. One of those things is choosing when to intervene as a mentor and fix the mess created by your student. My favorite description of the importance of this intervention is captured in a novel by Arthur Hailey, titled Airport. One of the pages has a story about an air traffic controller training a new recruit -

"George Wallace nodded and edged closer to the radarscope. He was in his mid-twenties, had been a trainee for almost two years; before that, he had served an enlistment in the U.S. Air Force. Wallace had already shown himself to have an alert, quick mind, plus the ability not to become rattled under tension. In one more week he would be a qualified controller, though for practical purposes he was fully trained now. Deliberately, Keith allowed the spacing between an American Airlines BAC-400 and a National 727 to become less than it should be; he was ready to transmit quick instructions if the closure became critical. George Wallace spotted the condition at once, and warned Keith, who corrected it. That kind of firsthand exercise was the only sure way the ability of a new controller could be gauged. Similarly, when a trainee was at the scope himself, and got into difficulties, he had to be given the chance to show resourcefulness and sort the situation out unaided. At such moments, the instructing controller was obliged to sit back, with clenched hands, and sweat. Someone had once described it as, "hanging on a brick wall by your fingernails." When to intervene or take over was a critical decision, not to be made too early or too late. If the instructor did take over, the trainee's confidence might be permanently undermined, and a potentially good controller lost."

I have always had such mentors who knew when to step in. And to that, I owe them a lifetime of gratitude.

Wednesday, 26 December 2018

Acknowledging privileges

Privilege - A special advantage that people enjoy over others. It is this elusive idea, hard to understand for a lot of people. Why is it so? Why do so many people carry on with their lives completely oblivious to the privileges they enjoy by virtue of their very existence? How can so many people miss something so obvious, something that pervades all their life? Is it the extraordinary levels of insulation that people can enjoy in this society? Or is it merely an inability to think beyond their little ponds? Can people be pathologically incapable of recognising privileges that they enjoy and that others are so cruelly denied?

It is not hard to find people like this. People who rant about having to pay taxes, or people who rant about the reservation system, or people who in general think that the society is unfair to them because they do not always get their way.

Don't get me wrong. People are well within their rights to question high taxes or nepotism or inordinate majoritarianism in cornering resources in the name of reservation. But a little acknowledgement might not be out of place. The transactional attitude to life, where everything is evaluated in individual terms of profit or loss, is simply being wilfully deaf to cries for empathy and basic human decency.

People need to realise that they are all cogs in a huge machinery called society, and that no matter how extraordinarily talented they are and no matter how isolated they think their actions are, others still contribute to their success / gain. It may be hard to see, but it is because there are certain sections of the society who, in so many subtle ways, sacrifice things that are valuable to me, that I get to enjoy those privileges. I get to enjoy flexible timings at work, and I can jolly well proclaim that I deserve it because I line the pockets of my 'company' and that it is a fair compensation for the talent I bring in. But is it really? Isn't the rest of the society toiling away enabling me to enjoy this? I recognise that this is a slippery slope. The society is not toiling away for me in particular - they toil because that's what masses do - in the absence of war or famine, people just work their asses off. But some people in the society enjoy a disproportionate share of the fruit of the society's labour. While a significant portion of the population needs to run on time or risk getting penalised, a small portion gets to strut around oblivious to constraints of time. I get to enjoy food anytime I want because someone chooses to be away from the warmth of family and shelter to deliver my food. And it is not my case that they are not being compensated for it - they very well might be - but it is no one's case that these people are not missing out on some of the most basic things humans want to secure in their life. It is important to acknowledge that privilege I enjoy.

Yes, people get paid - but as the Indian privileged class is fond of claiming - money isn't everything. It is important to acknowledge that certain people - people who work in the so-called white collar industries - are able to enjoy their privileges only because the society operates in a way which incentivises unequal rewards for equitable efforts, and to recognise that we are not some unfeeling variables in an equation of a zero sum game.

But why is this acknowledgement so important? What is so wrong in being oblivious to your privilege as long as it is not causing anybody any harm?


It is important because of the flippant attitude people have. People seem to think it is okay to behave in a certain manner or mete out some treatment because they can somehow morally justify it to themselves, usually in terms of money. 'I pay more in taxes than he earns / than the state gives me back'. Everyone must have heard a variation of this statement in their lifetime. What does this mean? This actually points to a deeply ingrained denial of the value that the society adds to one's life. It is a particularly sociopathic justification of cruelty in terms of flimsy logic. It is a refusal to acknowledge the privilege that one enjoys - by way of being able to reside in a peaceful country, enjoying a lawful state apparatus (for the most part at least), and relying on the society for everything from the very basic needs - from milk to manual scavengers on hire. Somehow everything is looked at through a lens of efficiency and the humanity is scorched out, especially when it suits the person uttering these words.

There is a price that everyone pays for being in a society that adheres to / is compelled to adhere to certain norms that resemble order and civility - some pay it more than others. Some have to make greater sacrifices than others to enjoy the same things - not because of their talent or ability - but seemingly because they did not possess a special advantage at a crucial juncture in their life. A privilege.

Sunday, 29 January 2017

Review: Justice: What's the Right Thing to Do?

Justice: What's the Right Thing to Do? by Michael J. Sandel
My rating: 5 of 5 stars

Ever felt that you are being meted out injustice due to reservations based on caste?
Ever felt that the shopkeepers who sold milk for 300 rupees a packet and water for 200 rupees a can should be punished?
Ever felt strongly for or against same sex marriage? Euthanasia? Cannibalism?

If your answer to any of the questions is a yes, then you need to read this book. This book will take that question, rip it apart, then patch it up again, making you understand the various strands that held the question together in the first place.

One of those rare, must read books.


Saturday, 21 January 2017

Review: Fault Lines: How Hidden Fractures Still Threaten the World Economy

Fault Lines: How Hidden Fractures Still Threaten the World Economy by Raghuram G. Rajan
My rating: 5 of 5 stars

TL;DR: If you want to understand the '08 crisis without getting bogged down in jargon, this is the book to go for. Also, Dr. Rajan makes perceptive observations about India and its politics, governance, and social security.

This book aims to narrate the various factors that contributed to the global economic crisis of '08. The writing is lucid, clear and flows smoothly. Dr. Rajan explains the subprime mortgage crisis very well, and is clear cut in his reasoning as to why the Government intervening (or not) was bad. He doesn't paint anything to be a panacea, and doesn't bat too much against the bankers, nor for too much regulation either. He repeatedly makes it clear that there were a lot of things going wrong, and makes very interesting commentary on the social implications of the crisis, and on the solutions that tried to correct its effects.

Rajan also takes a dig at the idea that everyone is born equal. He also makes convincing arguments for better social security net, improving the school system, and for affirmative action.

In this Indian edition, there is also a commentary on what India should do, post crisis. He makes a lot of perceptive observations, and this chapter alone makes the book worth reading.




Thursday, 22 December 2016

Review: An Era of Darkness: The British Empire in India

An Era of Darkness: The British Empire in India by Shashi Tharoor
My rating: 5 of 5 stars

This is a well researched book that covers all the aspects of the argument against the British Raj. There are not only nationalistic arguments, but points against the social, cultural, moral, technological, political and utilitarian theories that seek to support British rule.

Shashi Tharoor makes no bones about calling out the people who say that the British provided us with democracy, and those who say that they were better rulers because of their liberalism. There are some great data and writing that sum up how British imperialism maimed India badly.

The pages about famine and war efforts make for some really grim reading and ought to shut up anyone who plays the utilitarian card. This might very well be Shashi Tharoor's first classic.

