The 8th Habit of Highly Effective Big Data Programmers!
Last week I read an interesting book called “The Seven Habits of Highly Effective Big Data Programmers” by Rekha Joshi. I am happy to share with the community what I took away from the book.
Let’s first understand what Big Data is. What exactly is Big Data? At first glance, the term seems rather unclear, referring to something that is large and full of information. That description does indeed fit the bill, yet it provides no information on what Big Data really is.
Big Data is often described as extremely large data sets that have grown beyond the ability to manage and analyze them with traditional data processing tools. Searching the Web for clues reveals an almost universal definition, shared by the majority of those promoting the idea of Big Data, that can be condensed into something like this: Big Data defines a situation in which data sets have grown to such enormous sizes that conventional information technologies can no longer effectively handle either the size of the data set or its scale and growth. In other words, the data set has grown so large that it is difficult to manage and even harder to extract value from. The primary difficulties are the acquisition, storage, searching, sharing, analytics, and visualization of data.
Now, before getting to the eighth habit, let’s talk about the first seven.
- Discover: Whenever we are stuck, we should explore and discover. In the big data and analytics world, this can sometimes feel like searching for a few bits in terabytes of data, and then finding something quite different sticking out that cannot be ignored. Being a discoverer means looking without prejudice or bias as we go about unraveling the future. As discoverers, we acknowledge the past of computing and the direction its future will take, and we shape the present toward the right future. Being able to venture into the unknown is a habit of a highly effective Big Data programmer.
- Creative: Sometimes repeating more of the same work helps, while other times what is required is to break the framework. We are on a pretty good trajectory with big data processing, but unless we try something wildly different, we will never get wildly different results. Be creative. It is a habit of a highly effective Big Data programmer to run with the absurd to validate whether it is really all that absurd.
- Design and Redesign: Do it multiple ways, debate, and brainstorm. Each big data component is designed to be slightly different from the others so that it is a viable alternative, and every tool has its own pros and cons. So design and redesign. It should be in a big data programmer’s DNA. We have many tools to choose from for specific needs, such as Spark, Storm, Samza, Flink, Kafka, Kinesis, ZeroMQ, Vertica, Redshift, Cassandra, MongoDB, HBase, Riak, DynamoDB, Impala, Drill, Presto, Apache Parquet, Sensu, Splunk, and more. The first designs are not always the final ones, and we have to go through many iterations. The habit of designing and redesigning makes us highly effective big data programmers and also helps us navigate the inevitable debates we will have.
- Observe: We really have to observe what happens and check the metrics, performance, security, and monitoring. Monitoring every chosen technology over a period of time and seeing what the numbers indicate is a game changer. Unless we measure a thing, it does not exist. A highly effective big data programmer monitors.
- Evaluation: Big Data is data and technology together. Technology will change, and so will our customers’ needs. The great part of technology is that it is dynamic. Seen another way, the hard part is keeping up with both changing technology and changing customer needs. Hence evaluation is the constant we need to build in, however the technology changes. In the big data world, use cases also change at a fast pace, in real time. Having the habit of evaluation built in allows us to promote the technology that best solves the use cases.
- The Maths: Sometimes with big data we do things like a word count, or check how many people came to the web site home page, how many folks clicked, and much more. The math may be basic, but it is the backbone of our big data work. To conclude, the highly effective big data programmer sits at the intersection of data, science, technology, and mathematics.
- Network: In distributed and parallel computation the network is king, and the distribution of data is critical. So understanding the data is a must. No one asks us to have that skill, but it certainly makes us better equipped. Hence a highly effective big data programmer keeps a watch on the network and understands what is happening under the covers.
- Contribute (the 8th habit): We have to contribute back. Unless we contribute back, the cycle is incomplete. Most big data components and tools are open source, and their development happens in public. Contributing back makes us part of something much larger than ourselves. Collaboration between like-minded people with a shared purpose produces better big data applications, which is good for all. Raising our hand and contributing back is the 8th habit of a highly effective big data programmer.
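As a small illustration of the basic math mentioned under "The Maths" above, here is a minimal word-count sketch in plain Python. It assumes the input is an in-memory list of text lines. The function name `word_count` and the sample sentences are made up for illustration and do not come from the book. A real big data job would spread the same counting idea across a cluster using one of the tools listed earlier.

```python
from collections import Counter
import re


def word_count(lines):
    """Count how often each word appears across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Lowercase the line and keep runs of letters, digits, and apostrophes as words.
        words = re.findall(r"[a-z0-9']+", line.lower())
        counts.update(words)
    return counts


if __name__ == "__main__":
    # Hypothetical sample input, only for demonstration.
    sample = ["Big data is big", "Data about data"]
    for word, n in word_count(sample).most_common():
        print(word, n)
```

The same pattern (split each record into keys, then add up the counts per key) also covers questions like how many visitors reached the home page or how many clicked.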
So, to conclude, the habits of an effective big data programmer are: Discover, Creative, Design and Redesign, Observe, Evaluation, The Maths, Network, and Contribute.
Reference – The Seven Habits Of Highly Effective Big Data Programmers. Rekha Joshi.
Interesting? Please subscribe to our blog at www.dataottam.com to stay current on Big Data, Analytics, and IoT.
And, as always, please feel free to send suggestions or comments to coffee@dataottam.com.