Friday, October 18, 2013

Performance matters

If spending a few days on improving overall performance can save you 30% of your server costs, you should probably do it.

That means:

  • getting rid of your prototyping language for something more fit for production (drop Ruby/PHP for Python, drop Java/Python for C++, or drop C++ for C).
  • reworking your data model to optimize transfers, seek times, etc. through caching and pre-computed data sets
  • getting rid of your prototyping datastore for something more fit for production (drop MySQL for PostgreSQL; drop NoSQL as your main data store if your model contains more than one kind of item; drop NoSQL as your main data store if the only thing you need is an ultra-fast K/V store)
  • getting rid of your prototyping infrastructure for something more fit for production (drop your standard EC2 instances and get something optimized for your needs: do some profiling, fix your bottlenecks, use physical boxes with loads of RAM and SSDs)


For any big company, that choice should be simple: you might invest a few hundred thousand in research and save 60%+ on a huge budget (as Facebook could, and has begun doing).

For a startup, it's even more relevant, since it extends your runway significantly and pushes the scaling discussion back by an order of magnitude or two in number of users.
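
To put rough numbers on that trade-off, here is a minimal Python sketch. The function names and every figure in it are hypothetical placeholders, not measurements; plug in your own budget.

```python
def months_to_break_even(one_off_cost, monthly_server_cost, savings_fraction):
    """Months until a one-off optimization effort has paid for itself."""
    monthly_savings = monthly_server_cost * savings_fraction
    if monthly_savings <= 0:
        raise ValueError("the optimization must save something")
    return one_off_cost / monthly_savings


def runway_months(cash, monthly_burn):
    """Months of runway left at a constant monthly burn."""
    return cash / monthly_burn


def runway_after_savings(cash, monthly_burn, monthly_server_cost, savings_fraction):
    """Runway once the server part of the burn shrinks by savings_fraction."""
    new_burn = monthly_burn - monthly_server_cost * savings_fraction
    return runway_months(cash, new_burn)


if __name__ == "__main__":
    # Hypothetical startup: 120k in the bank, 20k monthly burn,
    # 10k of which is servers, and a 20k effort that saves 30% of server costs.
    print(months_to_break_even(20_000, 10_000, 0.30))
    print(runway_months(120_000, 20_000))
    print(runway_after_savings(120_000, 20_000, 10_000, 0.30))
```

The same functions work for the big-company case: only the order of magnitude of the inputs changes.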

Usual replies:



  • "a language fit for production will reduce productivity" : 

That has never been proven in any way. It's pure OO propaganda.

  • "ruby is ready for production" :

Rails had an SQL injection weakness (the kind of security hole a one-month beginner would never leave open), and it's much slower than Java or Python.

  • "Python is ready for production" :

It's still changing rapidly, has no stable version that could have been time-tested for security and performance, and we're still talking at least 4x slower than C.

  • "MySQL is just fine" :

Most of the listed features are buggy, slow and incomplete, and the declared feature set covers maybe 5% of the SQL standard, most of it incompatible with said standard.

  • "NoSQL is scalable" :

NoSQL can be scaled by anyone, including those who have no clue at all about databases, and it is made scalable by being inconsistent.

  • "the cloud>you" :

Go ahead and try to get one million I/O operations per second in that cloud and tell me how much it cost you. I'm willing to sell you a physical box that does that for less than half of your yearly costs.

  • "your approach isn't scalable" :

Don't you think pushing the limit from 1,000 concurrent users to 100,000 concurrent users on the same box is already a good step towards scalability? Don't you think it'll make the scale-out that much easier (i.e. a thousand boxes instead of 100,000)?

  • "your approach is more expensive" :

Fewer servers, less expensive servers, more performance per server... it HAS to be more expensive indeed.

  • "<professionals I know> say you're wrong" :

In a field where 99% of the people don't understand what they're doing? Picture me shocked.

  • "Facebook is right, you are wrong" :

I'm not telling you blue isn't the best color, I'm telling you a 30mpg car will go three times as far as a 10mpg car on the same fuel.

It's logic and mathematics; it doesn't depend on points of view, feelings or past experiences.
The only thing one could discuss is the actual cost of the performance improvement, but even rewriting all of Facebook in pure optimized ASM would cost less than 10% of Facebook's yearly server costs.
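
The box-count arithmetic from the scalability reply above fits in a few lines. A minimal sketch, assuming a hypothetical total of 100 million concurrent users (an illustrative number, chosen only because it reproduces the thousand-versus-100,000 comparison) and a made-up helper called `boxes_needed`:

```python
def boxes_needed(concurrent_users, users_per_box):
    """Smallest number of boxes that can hold the given concurrent users."""
    if users_per_box <= 0:
        raise ValueError("users_per_box must be positive")
    # Integer ceiling division: a partially filled box is still a box.
    return -(-concurrent_users // users_per_box)


if __name__ == "__main__":
    total = 100_000_000  # hypothetical total concurrent users
    print(boxes_needed(total, 1_000))    # unoptimized: 1,000 users per box
    print(boxes_needed(total, 100_000))  # optimized: 100,000 users per box
```

Whatever total you pick, a 100x gain per box means roughly 100x fewer boxes to buy, power and administer when you do scale out.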
