A few months ago (around October) we were contacted with a simple question: can you run an Oracle database in the cloud, more specifically the Azure cloud? Well … it depends. The little detail was that the database is about 34TB, there are a few other multi-TB databases, AND there are a lot of copies of them. And … the final decision for go-live is … end of 2016. Well, we accepted the challenge.
The deadline was strict, which is also the reason I had less time to blog. This Azure cloud series won’t be completely chronological, … but (and this is a spoiler alert) I’m interested in sharing what we ended up with.
This post will focus on how the database tests using SLOB were done. Credits to @ for the SLOB tool and to @ for his SLOB testing harness. Combining these two provides a very quick way of running consistent tests. We needed such a quick testing framework because we were changing just about everything to see whether it impacted disk throughput / iops or not.
Why we chose those machines is for another post, but we opted for the DS15_V2 vm ( details here ). The description of the machine, borrowed from the Microsoft website: “Dv2-series, a follow-on to the original D-series, features a more powerful CPU. The Dv2-series CPU is about 35% faster than the D-series CPU. It is based on the latest generation 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, and with the Intel Turbo Boost Technology 2.0, can go up to 3.1 GHz. The Dv2-series has the same memory and disk configurations as the D-series.”
Looks good, right? And we can attach up to 40TB to the machine, which makes it a candidate to be used for the future database servers.
It gets better: this family of servers can also use Microsoft premium storage, which is basically SSDs, and disk caching is possible if needed.
As the databases are rather big, the only option was to use the P30 disks ( more details about them here ). That means a limit of 5000 iops and 200MB/s per disk. Should be ok as a first test.
The first test was done using iozone. The results of that will be in a different blogpost as I still need to do the second tests to crosscheck them. But let’s continue, though not before asking: if there are remarks, questions or suggestions for improvement, I’ll be happy to test them.
The vm was created with one storage account, and that storage account was completely filled up with 35 premium storage SSDs.
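As a rough sanity check on that layout, the per-disk limits quoted earlier (5000 iops and 200MB/s per P30) can be multiplied out over the 35 disks. The sketch below is only back-of-the-envelope arithmetic. The function name is my own, and it deliberately ignores any cap imposed by the VM size or by the storage account, which may well be lower than the sum of the disks.

```python
# Per-disk limits as quoted in this post for a P30 premium storage disk.
P30_IOPS = 5000
P30_MBPS = 200


def aggregate_disk_limits(disk_count, iops_per_disk=P30_IOPS, mbps_per_disk=P30_MBPS):
    """Return the summed (iops, MB/s) ceiling of `disk_count` identical disks.

    This is only the sum of the individual disk limits. Throttling at the
    VM level or at the storage account level is NOT modelled here.
    """
    return disk_count * iops_per_disk, disk_count * mbps_per_disk


if __name__ == "__main__":
    iops, mbps = aggregate_disk_limits(35)
    print(f"35 x P30: up to {iops} iops and {mbps} MB/s on the disk side")
```

So on paper the disks alone add up to 175000 iops and 7000 MB/s, which is a useful upper bound to keep in mind when reading the SLOB results.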
Those disks were presented to the virtual machine and added into one big volume group. On top of that, a striped logical volume was created with an xfs filesystem on it, which hosts the SLOB database.
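For readers who want to reproduce a similar layout, here is a minimal sketch of the kind of LVM commands involved. It only builds and prints the command lines, it does not run them. The device paths, the volume group and logical volume names and the 64k stripe size are placeholders I picked for illustration. They are not the values used in these tests.

```python
import shlex


def striped_xfs_commands(devices, vg="slobvg", lv="sloblv", stripe_kb=64):
    """Build (but do not run) commands for one striped LV with xfs on top.

    `vg`, `lv` and `stripe_kb` are illustrative defaults, not the values
    used in the tests described in this post.
    """
    if not devices:
        raise ValueError("at least one device is required")
    commands = [
        # Initialise every disk as an LVM physical volume.
        ["pvcreate", *devices],
        # Put all physical volumes into one big volume group.
        ["vgcreate", vg, *devices],
        # One logical volume striped over all disks (-i = number of stripes,
        # -I = stripe size in KB), using all free space in the group.
        ["lvcreate", "-i", str(len(devices)), "-I", str(stripe_kb),
         "-l", "100%FREE", "-n", lv, vg],
        # Create the xfs filesystem on the striped logical volume.
        ["mkfs.xfs", f"/dev/{vg}/{lv}"],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Hypothetical device paths for 35 attached data disks.
    disks = [f"/dev/disk/azure/scsi1/lun{n}" for n in range(35)]
    for line in striped_xfs_commands(disks):
        print(line)
```

Generating the commands from a device list keeps the two test setups (one storage account versus many) identical on the Linux side, so only the storage layout differs.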
The db was created using cr_db.sql from the create database kit, after enabling it for the 4k redologs. After finishing all steps to make it a Physical IO test, we were good to launch the testing harness. It ran for a while, and our top load profile looked like this during all the tests:
I think that’s ok? After that it was time to run slob2-analyze.sh to generate a csv file. That csv was loaded into Excel and this was the result.
First I split the write and read iops, but then I decided to use the total iops, as the graph follows the same trend. My understanding (please correct me if wrong) is that around 30000 iops of an 8k database block is around 234MB/s? These tests were done without disk caching.
Then we decided to do the whole test again, but this time, instead of using 1 storage account with a bunch of disks, we used a bunch of storage accounts, each with only one disk in it. The rest of the setup was exactly the same (a new vm of the same size, same volume group, same striping, …) and the database was created using the same scripts again. Here are the results:
I think it is remarkable that even in the cloud, the way you provide the disks to the machine really does matter. Take the 32 workers, for example: with one storage account, remarkably less work was done.
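To check that arithmetic: 30000 × 8192 bytes is 245,760,000 bytes per second, which is 234.375 MiB/s when a megabyte is counted as 1024 × 1024 bytes, or roughly 245.8 MB/s in decimal units. So the 234 figure holds in binary megabytes. A minimal sketch, assuming every IO is exactly one 8k block (a real workload mixes IO sizes, so treat this as an approximation):

```python
def iops_to_throughput(iops, block_bytes=8192):
    """Convert an IO rate to throughput, assuming one fixed block size per IO.

    Returns a tuple (MiB/s, MB/s): binary and decimal megabytes per second.
    """
    bytes_per_sec = iops * block_bytes
    return bytes_per_sec / 1024**2, bytes_per_sec / 1000**2


if __name__ == "__main__":
    mib_s, mb_s = iops_to_throughput(30000)
    print(f"30000 iops x 8k = {mib_s:.1f} MiB/s = {mb_s:.1f} MB/s")
```

Since the P30 limit is quoted as 200MB/s per disk, it is worth being explicit about which unit a graph uses before comparing it with the published limits.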
More to come of course. Feedback is welcome about what might be the next blogpost. Let’s make it interactive 🙂
As always, questions or remarks? Find me on twitter @vanpupi