Sunday, May 6, 2012

SQL Azure vs Amazon RDS

Comparison of SQL Azure vs Amazon RDS
http://www.develop.com/sqlazurevsamazonrds

CONCLUSION
There are two major differences between the Microsoft SQL Azure and Amazon RDS platforms: pricing and capabilities. If price is no object and the user wants full features and high performance, then RDS is the obvious choice. If the user is more cost-conscious, SQL Azure has enough features and is good enough for many use-cases. The two exceptions are if a user has a database larger than 50GB, or needs a mature backup system. These are not possible with SQL Azure. That said, in most instances, Microsoft developers will favor SQL Azure because of their comfort level with the T-SQL syntax and the Microsoft tooling. Ruby and Java developers who have written MySQL applications will be inclined to choose RDS. With that in mind, perhaps these products aren’t competing with one another after all.

Is it good performance-wise to use Apache httpd with Apache Tomcat for static content?

Short Answer: No

Read the full article here

The short answer is that this is a myth. The longer answer is that back in the days of Tomcat 3 there was some truth to this, depending on circumstances. However, for the versions of Tomcat in use today (5.5.x and 6.0.x) there is no need to use httpd purely for performance reasons. Tomcat now supports the native/APR connector, which uses the same native library (the Apache Portable Runtime, APR) as httpd for the low-level I/O and can therefore achieve similar performance to httpd. When serving static content there is ever so slightly more overhead when using Tomcat compared to httpd, but the differences are so small they are unlikely to be noticeable in production systems.

While raw performance for static content may not be a good reason to use httpd, there are a number of good reasons why you might want to use httpd with Tomcat. The most frequent reason is to provide load-balancing to two or more Tomcat instances. httpd isn't the only option to do this (hardware load balancers or other reverse proxies can be used), but it is a popular choice amongst system administrators as many of them are already familiar with httpd. I'll write more on using httpd as a load-balancer in a future article.

httpd is also used with Tomcat when there is a requirement to support technologies other than Java. While Tomcat can support PHP or Perl, the support for these is better in httpd. Therefore, for sites that need a mix of technologies httpd can be used as the front-end web server, directing requests to mod_php, mod_fastcgi, mod_proxy_http (for Tomcat) or any other module as appropriate.

httpd's support for integrated Windows authentication is also a reason for using httpd in front of Tomcat. There are Tomcat-based solutions for integrated Windows authentication and, as these gain acceptance through wider use, this particular reason for using httpd may become less important. However, at the moment, it remains one of the more frequently cited reasons for using httpd with Tomcat.

In summary, there are good reasons for using httpd with Tomcat but raw performance for static content isn't one of them. If you are using httpd solely to improve static content performance then I recommend taking a look at the Coyote APR/native connector for Apache Tomcat.

Tuesday, May 1, 2012

Digg's Arch

http://about.digg.com/blog/how-digg-is-built

The above link explains at a high level how Digg is built.
Memcached and Redis are in-memory stores used as caches for quick reads.
Writes go to different places depending on how critical they are.
For example, critical writes go straight to the primary database storage.
Less critical writes, which are handed to asynchronous tasks, can be queued
for execution using a message queuing system such as RabbitMQ.
More details at the link above.
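
The read and write paths described above can be sketched roughly as follows. This is only an illustration with in-memory stand-ins, not Digg's actual code: the `Store` class, its method names, and the choice to invalidate the cache on every write are all assumptions made for the example.

```python
from collections import deque


class Store:
    """Hypothetical in-memory stand-ins for a cache, a primary DB and a queue."""

    def __init__(self):
        self.cache = {}       # stands in for Memcached / Redis
        self.database = {}    # stands in for the primary database
        self.queue = deque()  # stands in for a RabbitMQ queue

    def read(self, key):
        # Serve from the cache; fall back to the database on a miss
        # and remember the value for the next read.
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value, critical):
        if critical:
            # Critical writes go straight to the primary database.
            self.database[key] = value
            self.cache.pop(key, None)  # assumption: drop any stale cached copy
        else:
            # Less critical writes are queued for asynchronous execution.
            self.queue.append((key, value))

    def drain_queue(self):
        # What a background worker would do: apply the queued writes.
        applied = 0
        while self.queue:
            key, value = self.queue.popleft()
            self.database[key] = value
            self.cache.pop(key, None)
            applied += 1
        return applied
```

A queued write is not visible to readers until a worker has called `drain_queue()`, which is the trade-off accepted for less critical data.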

Getting Memcached statistics using Ganglia and Cacti


Memcached statistics can be obtained using monitoring tools like Ganglia and Cacti.
These can report the number of hits/misses, gets/sets and evictions.
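
The same counters can also be read without a monitoring tool, since they come from the reply to Memcached's `stats` command. Below is a minimal sketch that parses such a reply and derives the hit ratio. The helper names `parse_stats` and `summarize` are made up for this example, and the numbers in `SAMPLE` are invented, not measurements.

```python
def parse_stats(raw):
    """Parse the text reply of the memcached `stats` command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split()
        # Each statistic arrives as "STAT <name> <value>"; "END" closes the reply.
        if len(parts) == 3 and parts[0] == "STAT":
            name, value = parts[1], parts[2]
            stats[name] = int(value) if value.isdigit() else value
    return stats


def summarize(stats):
    """Pick out hits/misses, gets/sets and evictions, plus the hit ratio."""
    hits = stats.get("get_hits", 0)
    misses = stats.get("get_misses", 0)
    lookups = hits + misses
    return {
        "hits": hits,
        "misses": misses,
        "gets": stats.get("cmd_get", 0),
        "sets": stats.get("cmd_set", 0),
        "evictions": stats.get("evictions", 0),
        "hit_ratio": hits / lookups if lookups else 0.0,
    }


# Invented sample reply, for illustration only.
SAMPLE = """STAT cmd_get 1000
STAT cmd_set 200
STAT get_hits 900
STAT get_misses 100
STAT evictions 5
END"""

summary = summarize(parse_stats(SAMPLE))
```

A falling hit ratio together with a rising eviction count usually suggests the cache is too small for its working set.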

Choosing the right NoSQL database for your application

http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html

Driving Factors for NoSQL Databases


The ppt and links above summarize some important points about NoSQL databases:
1.) Traditional relational databases do not scale out cost-effectively as the amount of data and the number of users grow. This has led to the invention of NoSQL databases.
2.) The other disadvantage of relational databases is their rigid schema structure, whereas NoSQL databases have a more flexible schema structure.
3.) NoSQL databases do not require a set schema before inserting data.
4.) Schema flexibility is greatest in document-oriented and key-value stores, and lower in column-oriented stores, because column-oriented stores require the column families to be specified at the start.
5.) NoSQL databases support auto-sharding without application participation.
6.) The cost of scaling out a NoSQL database grows linearly with the data, whereas for relational databases it grows exponentially.
7.) NoSQL databases support distributed queries.
8.) Less overhead from transaction management, locks and constraints compared to relational databases. Relational databases can be easily scaled up but not scaled out.
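
To make the sharding and schema points above more concrete, here is a toy sketch of hash-based sharding hidden behind a simple put/get interface. It is not how any particular NoSQL product works: `shard_for` and `ShardedStore` are hypothetical names, and real systems also handle rebalancing, replication and node failure, which this ignores.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class ShardedStore:
    """Toy store: callers use put/get and never pick a shard themselves."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate server.
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, document):
        self.shards[shard_for(key, len(self.shards))][key] = document

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(4)
# No schema is declared up front, and the two documents have different fields.
store.put("user:1", {"name": "Ann"})
store.put("user:2", {"name": "Bob", "tags": ["admin"], "age": 30})
```

The application code above never mentions which shard holds which key, which is the sense in which sharding happens "without application participation".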


Thursday, April 26, 2012

NoSQL guide for beginners

The main categories of NoSQL databases are:
Document
Key-value
Tabular
Graph
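
As a rough illustration of how the four categories differ, the sketch below models similar data in each shape using plain Python structures. The field names, keys and the `followers_of` helper are invented for the example and do not correspond to any specific database.

```python
# Document: one self-contained, possibly nested record per entity.
document = {"_id": "user:1", "name": "Ann", "emails": ["ann@example.com"]}

# Key-value: an opaque value looked up by its key.
key_value = {"user:1:name": "Ann"}

# Tabular (column family): row key -> column family -> column -> value.
tabular = {
    "user:1": {
        "profile": {"name": "Ann"},
        "contact": {"email": "ann@example.com"},
    }
}

# Graph: nodes plus labelled edges between them.
graph = {
    "nodes": {"user:1": {"name": "Ann"}, "user:2": {"name": "Bob"}},
    "edges": [("user:1", "FOLLOWS", "user:2")],
}


def followers_of(graph, node_id):
    """Return the ids of nodes with a FOLLOWS edge pointing at node_id."""
    return [
        source
        for source, label, target in graph["edges"]
        if label == "FOLLOWS" and target == node_id
    ]
```

Relationship queries such as `followers_of(graph, "user:2")` are the natural fit for the graph shape, while the other three are organised around looking up a record by its key.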
