
  2010-09-02 Thu

14:40 How long Innodb Shutdown may take (3730 Bytes) » MySQL Performance Blog

How long may it take MySQL with Innodb tables to shut down? It can be quite a while.
In the default configuration (innodb_fast_shutdown=ON), the main job Innodb has to do to complete shutdown is flushing dirty buffers. The number of dirty buffers in the buffer pool depends on innodb_max_dirty_pages_pct as well as on the workload and innodb_log_file_size, and can be anywhere from 10% to 90% in real-life workloads. The Innodb_buffer_pool_pages_dirty status variable shows the actual number.

The flush speed in turn depends on a number of factors. The first is your storage configuration: you may be looking at anything from less than 200 writes/sec for a single entry-level hard drive to tens of thousands of writes/sec for a high-end SSD card. Flushing can be done using multiple threads (in XtraDB and the Innodb Plugin at least), so it scales well with multiple hard drives. The second important variable is your workload, especially how the dirty pages line up on disk. If many of the dirty pages are sequential, Innodb can use larger IOs, up to 1MB, to flush them, which can be a lot faster than flushing data page by page.

So if we have a system with a single hard drive doing 200 IO/sec and a 48G buffer pool which is 90% dirty with completely random page writes, we’re looking at 13500 seconds, or about 5min per 1GB of buffer pool size.
This is the worst-case scenario, though in practice it is quite common to see a shutdown time of about 1min per GB of buffer pool per hard drive.
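
The worst-case arithmetic above can be reproduced with a minimal sketch. It assumes the default 16KB Innodb page size (not stated in the post), one random write per dirty page, and decimal units (1GB = 1,000,000KB) to match the round figures. The function name and parameters are hypothetical, chosen only for illustration.

```python
def estimate_shutdown_seconds(buffer_pool_gb, dirty_fraction, writes_per_sec, page_kb=16):
    """Worst case: every dirty page is flushed with its own random write.

    page_kb=16 is an assumption (the default Innodb page size).
    Decimal units are used so the result matches the figures in the text.
    """
    dirty_pages = buffer_pool_gb * 1_000_000 * dirty_fraction / page_kb
    return dirty_pages / writes_per_sec


seconds = estimate_shutdown_seconds(buffer_pool_gb=48, dirty_fraction=0.90, writes_per_sec=200)
minutes_per_gb = seconds / 48 / 60

print(round(seconds))             # 13500
print(round(minutes_per_gb, 1))   # 4.7, i.e. "about 5min per 1GB"
```

Plugging in the live value of Innodb_buffer_pool_pages_dirty instead of an assumed dirty fraction would give a tighter estimate for a specific server.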

Baron has written a nice post on how to decrease Innodb shutdown time, which you may want to read on this topic.


Entry posted by peter | 3 comments


10:15 A Global Greeting (990 Bytes) » Inside AdSense
Members of the AdSense team from all over the world say hello from Mountain View, CA!



07:26 Distributed Hashing Algorithms by Example: Consistent Hashing (826 Bytes) » High Scalability

Consistent Hashing is a specific implementation of hashing that is well suited for many of today’s web-scale load balancing problems. Specifically, it can be seen in use in various caching solutions like Memcached and is applicable to NoSQL solutions as well. Consistent Hashing is used particularly because it solves the main problem with the typical “hashcode mod n” method of distributing keys across a series of servers: it allows servers to be added or removed without significantly upsetting the distribution of keys, and without requiring that all keys be rehashed to accommodate the change in the number of servers.
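
To make the contrast with “hashcode mod n” concrete, here is a minimal sketch of a hash ring with virtual nodes. It is an illustration only, not the implementation used by Memcached clients or any particular NoSQL system. The class name, server names, replica count, and the choice of md5 as the hash are all assumptions.

```python
import bisect
import hashlib


def ring_hash(key):
    # Stable 64-bit ring position; md5 is used for its distribution, not for security.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas   # virtual nodes per server (assumed value)
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        # The first ring position clockwise from the key owns it (wrapping around).
        i = bisect.bisect(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"key{i}" for i in range(10000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {k: ring.lookup(k) for k in keys}

ring.add("cache-d")
moved = sum(1 for k in keys if ring.lookup(k) != before[k])

# The same change under "hashcode mod n": going from 3 to 4 servers.
mod_moved = sum(1 for k in keys if ring_hash(k) % 3 != ring_hash(k) % 4)

print(f"ring: {moved / len(keys):.0%} of keys moved, mod n: {mod_moved / len(keys):.0%}")
```

With the ring, the only keys that move are the ones taken over by the new server; with mod n, most keys change servers whenever n changes.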

You can read the full story here.

03:15 Better than nothing (is harder than you think) (1430 Bytes) » Seth's Blog

Most of the time, particularly in b2b and luxury sales, the competition is nothing.

"I will buy this treat or I will buy nothing, because I don't really need anything."

"I will buy your consulting services, or I'll continue doing what I'm doing now on that front, which is nothing."

None of the above.

"I will vote for you or I'll do what I usually do, which is not vote."

"I'll hire you or I'll hire no one."

While you think your competition is that woman across town, it's probably apathy, sitting still, ignoring the problem... nothing.

Stop worrying so much about comparing yourself to every other possible competitor you can imagine and start comparing yourself to nothing. Are you really worth the hassle, the risk, the time, the money? Or can't the prospect just wait until tomorrow?

  2010-09-01 Wed

13:45 Scale-out vs Scale-up (585 Bytes) » High Scalability

In this post I'll cover the difference between multi-core concurrency that is often referred to as Scale-Up and distributed computing that is often referred to as Scale-Out mode. 


Source: Scale-out vs Scale-up (http://www.dzone.com/links/r/scaleout_vs_scaleup.html) by Nati Shalom

12:15 Launching the ShipIt Workbook (2553 Bytes) » Seth's Blog

Six months ago, I put together a workbook that would help Linchpin readers ship.

After testing it out on hundreds of people, it's now ready for retail sale. [UPDATE on 9/2--yesterday, the workbook was so popular it went to the top 10 of all books on Amazon. And they sold all the warehouse could take. So it's sold out... I have shipped more to them, but they probably won't go on sale until the 8th. I'll update this post then. Thanks guys.]

You can find details here, or jump right to the buy page. The goal? To make you uncomfortable at the beginning of a project (and successful at the end).

Here's the core idea: it's weird to write in a book. When you do, you're making a commitment. You're combining the open-mindedness that reading brings with the physical action of writing. If you do that at every step in a project--and if your co-workers do too--the seemingly slippery decisions that get made appear a lot more solid.

The ShipIt workbook is designed to be worked on in groups (hence the five pack) and it delivers. If you can confront the mechanics or the fear that's slowing down (or even killing) your project, it's easy to fix it now, before it's too late.

There's no digital version, because without writing things down, it can't work. But there is an mp3 interview that will help you get your arms around how each page works. I'm pricing this first batch at $3.20 each in a pack of five just for the launch. [PS Amazon is having trouble shipping to Canadians right now. It may take a while to figure this out, and all I can do is apologize...]

I hope you'll give it a try.

10:40 Paper: The Case for Determinism in Database Systems (838 Bytes) » High Scalability

Can you have your ACID cake and eat your distributed database too? Yes, explains Daniel Abadi, Assistant Professor of Computer Science at Yale University, in an epic post, The problems with ACID, and how to fix them without going NoSQL, coauthored with Alexander Thomson, on their paper The Case for Determinism in Database Systems. We've already seen VoltDB offer the best of both worlds; this sounds like a completely different approach.

The solution, they propose, is: 

03:45 Responsibility and authority (851 Bytes) » Seth's Blog

Many people struggle at work because they want more authority.

It turns out you can get a lot done if you just take more responsibility instead. It's often offered, rarely taken.

(And you can get even more done if you give away credit, relentlessly).

  2010-08-31 Tue

17:52 Introducing tcprstat, a TCP response time tool (6305 Bytes) » MySQL Performance Blog

Ignacio Nin and I (mostly Ignacio) have worked together to create tcprstat[1], a new tool that times TCP requests and prints out statistics on them. The output looks somewhat like vmstat or iostat, but we’ve chosen the statistics carefully so you can compute meaningful things about your TCP traffic.

What is this good for? In a nutshell, it is a lightweight way to measure response times on a server such as a database, memcached, Apache, and so on. You can use this information for historical metrics, capacity planning, troubleshooting, and monitoring to name just a few.

The tcprstat tool itself is a means of gathering raw statistics, which are suitable for storing and manipulating with other programs and scripts. By default, tcprstat works just like vmstat: it runs once, prints out a line, and exits. You’ll probably want to tell it to run forever, and continue to print out more lines. Each line contains a timestamp and information about the response time of the requests within that time period. Here “response time” means, for a given TCP connection, the time elapsed from the last inbound packet until the first outbound packet. For many simple protocols such as HTTP and MySQL, this is the moral equivalent of a query’s response time.
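
The definition of “response time” above can be sketched in a few lines. This is not tcprstat's code (the tool captures packets with libpcap); it is a simplified illustration that assumes the packets have already been reduced to time-ordered tuples. The function name and the tuple layout are hypothetical.

```python
def response_times(packets):
    """Compute per-request response times in microseconds.

    packets: time-ordered iterable of (timestamp_us, connection_id, direction),
    where direction is "in" or "out" (an assumed, simplified input format).
    """
    last_inbound = {}   # connection_id -> timestamp of the latest inbound packet
    times = []
    for timestamp, conn, direction in packets:
        if direction == "in":
            # A later inbound packet replaces the earlier one: timing starts at the last.
            last_inbound[conn] = timestamp
        elif conn in last_inbound:
            # The first outbound packet ends the measurement; later ones are ignored.
            times.append(timestamp - last_inbound.pop(conn))
    return times


packets = [
    (1000, "a", "in"),
    (1010, "a", "in"),    # request spans two packets
    (1020, "b", "in"),
    (1069, "a", "out"),   # 1069 - 1010 = 59
    (1075, "a", "out"),   # rest of the reply, not counted
    (1120, "b", "out"),   # 1120 - 1020 = 100
]
print(response_times(packets))   # [59, 100]
```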

The statistics we chose to output by default are the count, median, average, min, max, and standard deviation of the response times, in microseconds. These are repeated for the 95th and 99th percentiles as well. Other metrics are also available. Here’s a sample:

[root@server] # tcprstat -p 3306 -n 0 -t 1
timestamp	count	max	min	avg	med	stddev	95_max	95_avg	95_std	99_max	99_avg	99_std
1276827985	1341	24556	23	149	59	767	310	91	69	1030	107	112
1276827986	1329	12098	28	134	63	461	299	91	65	667	104	93
1276827987	1180	13277	22	202	93	873	439	103	79	1523	131	169
1276827988	1441	15878	27	180	139	672	427	116	79	1045	136	128
1276827989	1432	157198	26	272	138	4165	405	115	80	1092	134	123
1276827990	1835	25198	26	183	124	734	448	115	85	1141	137	141
1276827991	1242	6949	29	129	114	301	233	98	61	686	109	84
1276827992	1480	284181	25	442	127	7432	701	128	114	4157	173	293
1276827993	1448	9339	22	161	88	425	392	104	80	1280	126	140
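
As a rough guide to how the columns above relate to each other, the sketch below computes the same kind of summary for one interval. It assumes that the 95_ and 99_ columns are the same statistics taken over the fastest 95% and 99% of requests, and it uses the population standard deviation; both are assumptions for illustration, not a description of tcprstat's internals. The function name and sample data are made up.

```python
import statistics


def summarize(times_us, percentile=100):
    """Summary statistics over the fastest `percentile` percent of the samples."""
    if not times_us:
        raise ValueError("no samples in this interval")
    ordered = sorted(times_us)
    keep = max(1, len(ordered) * percentile // 100)
    kept = ordered[:keep]
    return {
        "count": len(kept),
        "max": kept[-1],
        "min": kept[0],
        "avg": statistics.mean(kept),
        "med": statistics.median(kept),
        "stddev": statistics.pstdev(kept),   # population stddev (assumed)
    }


# Nineteen ordinary requests and one slow outlier, in microseconds (made-up data).
samples = list(range(50, 69)) + [24556]

everything = summarize(samples)
fastest_95 = summarize(samples, percentile=95)

print(everything["max"], everything["med"])
print(fastest_95["max"], fastest_95["avg"])
```

A single slow request dominates max, avg, and stddev for the whole interval, while the statistics over the fastest 95% stay close to the typical request. The same effect shows in the sample output, where max is far larger than 95_max in every row.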

tcprstat uses libpcap to capture traffic. It’s a threaded application that does the minimum possible work and uses efficient data structures. Your feedback on the kernel/userland exchange overhead caused by the packet sniffing would be much appreciated. libpcap allows the user to tune this exchange, so suggestions on how to improve it are welcome.

We build statically linked binaries with the preferred version of libpcap, which means there are no dependencies. You can just run the tool. In the future, packages in the Percona repositories will provide another means for rapid installation via yum and apt.

tcprstat is beta software. Several C/C++ experts reviewed its code and gave it a thumbs-up, so many eyes have been on the code. We’ve performed tests on servers with high loads and observed minimal resource consumption. I personally have been running it for many weeks on some production servers without stopping it and have seen no problems, so I am pretty sure it has no memory leaks or other problems. Nevertheless, it’s a first prototype release, and we want much more testing. We might also change the functionality; as we build tools around it, we discover new things that might be useful. When we’re happy with it and you’re happy with it, we’ll take the Beta label away and make it GA.

The tcprstat user’s manual and links to downloads are on the Percona wiki. Commercial support and services are provided by Percona. Bug reports, feature requests, etc. should go to the Launchpad project linked from the user’s manual. General discussion is welcome on the Google Group, also linked from the user’s manual.

[1] Historical note: we initially called this tool rtime, but did not publicize it. However, some of you might have heard of “rtime” before. This is the same tool.


Entry posted by Baron Schwartz | 9 comments


11:15 Just launched: Linchpin on the Vook on the iPad (2062 Bytes) » Seth's Blog

The details are right here. Created by Vook, based on the hardcover.

Includes new video and interviews with some interesting folks...

The long tail challenge of the iPad store is getting more and more obvious to people. The ratio of "shelf space" to inventory is about the worst of any retail experience in the world. There are more than 24,000 apps listed in the iPad store, and yet the front window (equivalent to the window of a bookstore) shows the user six choices. The spotlight coverflow up top shows another sixteen, fairly randomly. Meaning there's a little worse than a one in a thousand chance that your app will appear in front of someone interacting with the store at the first level.

I have no doubt that as Apple sees revenue increase from this source, they'll do a much better job of crosslinks and browsing. But, once again, the lesson of the long tail is this: you can't count on the gatekeeper to do your promotion for you. Getting picked feels like a needle in a haystack, and the value of permission, of connecting directly to people who care instead of ceding control to a middle man, is at the heart of building an asset. Someone is going to be the gatekeeper, and it should be you.

03:15 The corporate conscience » Seth's Blog

  2010-08-30 Mon

  2010-08-29 Sun

03:15 Don't forget about color » Seth's Blog