Karan Singh

Code Never Lies, Comments Sometime Do !!

Ceph User Group Finland : First Meetup


My colleague Toni Willberg, from the Red Hat Finland office, came up with the brilliant idea of creating a local Ceph meetup group, “Ceph User Group Finland”. We both pursued the idea and created Ceph User Group Finland (CugF).

This is a group that brings together software, DevOps, and system engineers and cloud and solution architects with an interest in Ceph, software-defined storage, and data center storage infrastructure, to collaborate and share their knowledge, experience, and opinions about the future of storage.

On 15th February 2017 we organized our first meetup, with two presentations and a great discussion. Here they are:

  • I presented “Ceph Introduction and Beyond”, and here are the slides.

  • My friend Pietari Hyvärinen presented “Detailed view of operations in production scale Ceph systems with Grafana” (I will seek his permission before publishing his slides).

Special thanks to @ToniWillberg for CugF and the 8 different user groups that he runs in the Finland metropolitan area on a wide range of open source technologies.

What’s next: in approximately two months we will schedule our second meetup. Along with snacks and beer (call-out for sponsors), you will enjoy listening to some interesting topics. If you are based in Finland and want to be part of this community, feel free to join and collaborate. We will be happy to see you :)

Ceph Object Storage : Part-II (Indexless Buckets)



This is episode 2 of the Ceph Object Storage blog series. You can refer to episode 1 here, where I explained the internals of Ceph Object Storage and covered the Ceph indexless buckets feature. In this episode we will go through the implementation of Ceph indexless buckets.

By default, Ceph RGW creates standard indexed buckets (i.e. not indexless buckets). These buckets can list the objects stored in them. Let’s verify a standard bucket before configuring indexless buckets.

Ceph Object Storage : Part-I (the Internals)



There is some performance difference between pure RADOS writes (e.g. via RadosBench) and RGW writes. Several factors contribute to this, such as:

  • Object storage access protocols (S3 / Swift) have higher overheads than native RADOS writes
  • Client write requests are translated by RGW, which adds latency and can create additional bottlenecks
  • The most important factor is that RGW maintains bucket indices that need to be updated every time a write operation is done, whereas RADOS writes do not have this overhead of maintaining indexes / metadata

In this blog post I will talk about a new feature that landed in Ceph Jewel v10.1.0, officially known as Indexless Buckets and unofficially as Blind Buckets. Before diving into indexless buckets, let’s understand what RGW does under the covers with a write request.

Working With NUMA/CPU Pinning



The terms CPU pinning, process affinity, and NUMA generally boil down to the same idea: in a multi-socket system, an application performs best when its threads execute on the CPU cores closest to its memory bank. In most cases the Linux process scheduler is intelligent enough to do this. However, if you do it manually, you will most likely enjoy the luxury of increased application performance. Here are some of my notes describing the steps required to set up process affinity.

Verify how the application threads (radosgw in my case) are currently being executed. The 5th column, psr, represents the processor core.

for i in $(pgrep radosgw); do ps -mo pid,tid,fname,user,psr -p $i;done
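The raw listing can be long when a process has many threads, so it may help to summarize it. The following is a minimal sketch, not part of the original notes: a hypothetical helper named threads_per_core that reads the ps output above on standard input and counts how many threads last ran on each core. It assumes psr is the last column, as in the command above.

```shell
# Hypothetical helper: count threads per processor core.
# Reads `ps -mo pid,tid,fname,user,psr` output on stdin and assumes
# psr is the last column. The header line and the per-process summary
# row (where psr is shown as "-") are skipped because they are not numeric.
threads_per_core() {
  awk '$NF ~ /^[0-9]+$/ { count[$NF]++ }
       END { for (core in count) print core, count[core] }' | sort -n
}

# Usage with the command from the text above:
# for i in $(pgrep radosgw); do ps -mo pid,tid,fname,user,psr -p "$i"; done | threads_per_core
```

Each output line is a core number followed by a thread count. If the threads turn out to be spread across sockets, one common way to pin an already running process is `taskset -cp <cpu-list> <pid>` from util-linux; that is an assumption about tooling, not necessarily the method these notes go on to use.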

Don't Underestimate the Power of Ceph Placement Groups



Today I would like to share findings from a test I ran out of curiosity, which came from this basic question.

How does Placement Group count affect Ceph performance?

If you are reading this blog, then I assume you know what Ceph is and how Ceph Placement Groups (PGs) work.

It all started when my colleague Sir Kyle Bader and I were discussing how to get more performance out of our Ceph cluster with the following environment details:

How Application IO's Are Treated by IO Scheduler



Recently I have been doing FIO benchmarking, and I found that the IOPS reported by FIO != the IOPS reported by iostat, which made me think: why the heck?

So here is my FIO job with bs=4M and sequential write:

$ fio --filename=/dev/sdb --name=write-4M --rw=write --ioengine=libaio --bs=4M --numjobs=1 --direct=1 --randrepeat=0  --iodepth=1 --runtime=100 --ramp_time=5 --size=100G --group_reporting

write-4M: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/220.3MB/0KB /s] [0/55/0 iops] [eta 00m:00s]
write-4M: (groupid=0, jobs=1): err= 0: pid=424038: Tue Jun  7 23:48:32 2016
  write: io=22332MB, bw=228677KB/s, iops=55, runt=100001msec

As you can see, fio reported 55 IOPS.
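That number can be cross-checked from the same output, since IOPS is simply bandwidth divided by block size. The sketch below uses a hypothetical helper, iops_from_bw, with the bw=228677KB/s value from the run above and 4096 KB for bs=4M. It uses integer division, so the fractional part is dropped.

```shell
# Hypothetical helper: IOPS = bandwidth / block size (integer division).
iops_from_bw() {
  bw_kb=$1   # bandwidth in KB/s, e.g. 228677 from the fio output above
  bs_kb=$2   # block size in KB, e.g. 4096 for bs=4M
  echo $(( bw_kb / bs_kb ))
}

iops_from_bw 228677 4096   # prints 55
```

The exact quotient is about 55.8, which is consistent with the iops=55 that fio printed.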

Deploy COSBench Using Ansible



In my previous blog about COSBench, I gave an introduction to this tool and explained how you can install it and get it working.

Recently I automated COSBench installation and configuration using Ansible and created an Ansible role. In this blog I will demonstrate how you can get COSBench up and running with minimal steps … because Ansible Rocks 8-)

FIO Tip: Use Genfio to Quickly Generate FIO Job Files



FIO is one of the most popular benchmarking tools out there, and it’s my favourite too. It’s feature rich and provides some pretty useful supporting utilities. One of them is genfio. As the name suggests, it is a tool that generates a FIO job file based on the arguments you provide to it.

Recently I have been benchmarking my server containing 35 disks, such that each operation should run in parallel on all disks and I should get aggregated results for IOPS and bandwidth. So I used genfio to generate a FIO job file and then ran the fio command line using that job file.
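To give a feel for the kind of job file involved, here is a hand-rolled sketch. It is not genfio and does not reproduce genfio's output. The hypothetical helper make_fio_job writes one [global] section and one job section per device. The option values are assumptions borrowed from a typical 4M sequential write test, and the device names in the usage line are placeholders. Jobs in a fio job file run in parallel unless told otherwise, and group_reporting aggregates their results.

```shell
# Hypothetical helper (a stand-in for illustration, not genfio itself):
# emit a fio job file with one [global] section and one job per device.
# The option values below are assumptions, not the ones used in the post.
make_fio_job() {
  printf '[global]\nioengine=libaio\ndirect=1\nbs=4M\nrw=write\ngroup_reporting\n'
  for dev in "$@"; do
    # One job section per device, named after the device's basename.
    printf '\n[job-%s]\nfilename=%s\n' "$(basename "$dev")" "$dev"
  done
}

# Example with placeholder devices (rw=write on a raw device overwrites its data):
# make_fio_job /dev/sdb /dev/sdc > all-disks.fio && fio all-disks.fio
```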

COSBench: Cloud Object Storage Benchmarking Tool


Benchmarking storage systems is a pretty interesting job, and these days I am playing hard with it. Having said that, object storage benchmarking is currently on my radar.

Block and file storage have been around since ancient times, and so have their benchmarking tools. Object storage is quite different, and the traditional benchmarking tools don’t talk object. There are not very many tools out there for benchmarking object storage; the one that is gaining popularity is COSBench, aka Cloud Object Storage Benchmarking tool.

COSBench is developed and open sourced by Intel. It’s based on the classic client / server model, where the client node, aka COSBench driver node (COSBench client), accepts a job from the server, aka COSBench controller node (COSBench server), executes the job against the storage system, gathers the results, and sends them back to the COSBench controller node. Here are some good-to-know things about COSBench:

Introducing Ceph Cookbook



Why do we care

We are a part of a digital world that is producing an enormous amount of data each second. The data growth is unimaginable, and it’s predicted that humankind will possess 40 Zettabytes of data by 2020. Well, that’s not too much, but how about 2050? Should we guesstimate a Yottabyte? The obvious question arises: do we have any way to store this gigantic data, or are we prepared for the future?

Software Defined Something

Well, it’s a great saying that “Software is eating the world”. This appears to be true. However, from another angle, software is the feasible way to go for various computing needs, such as computing weather, networking, storage, datacenters, and burgers, ummm…well, not burgers currently. As you already know, the idea behind a software-defined solution is to build all the intelligence in software itself and use commodity hardware to solve your greatest problem. And the greatest minds in the industry think that the software-defined approach should be the answer to the future’s computing problems.

Ceph is a true open source, software-defined storage solution, purpose-built to handle unprecedented data growth with linear performance improvement. It provides a unified storage experience for file, object, and block storage interfaces from the same system. The beauty of Ceph is its distributed, scalable nature and its performance; reliability and robustness come along with these attributes. Furthermore, it is pocket friendly, that is, economical, providing you more value for each dollar you spend.