Introduction to Big Data Problems and Their Solutions

With the massive growth in social media usage, the amount of data being generated has increased sharply over time, and much of the success of the top multinational companies (MNCs) rests on that data. The first big question for a company is where to store thousands of terabytes of data; this is the problem of volume. The second problem appears once the data is stored: speed, or velocity, because nowadays no one is willing to wait more than a minute for a result. To keep their business running smoothly, many companies use Hadoop, software that manages a whole cluster in a master-slave model and works on the concept of distributed storage.
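To make the distributed storage idea a little more concrete, here is a minimal Python sketch. It is not Hadoop's actual implementation: it only assumes a hypothetical master that splits a file into fixed-size blocks and assigns every block to several worker nodes in round-robin order. The function names, the node names, the 128 MB block size, and the replication factor of 3 are all illustrative assumptions.

```python
from collections import defaultdict

BLOCK_SIZE_MB = 128   # assumed block size for this sketch
REPLICATION = 3       # assumed number of copies kept of each block


def plan_blocks(file_size_mb, nodes, block_size_mb=BLOCK_SIZE_MB,
                replication=REPLICATION):
    """Return {block_index: [node, ...]}, the master's placement plan.

    Blocks are spread round-robin so that no single node has to hold
    the whole file, and each block lands on `replication` distinct nodes.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = -(-file_size_mb // block_size_mb)  # ceiling division
    plan = {}
    for block in range(num_blocks):
        plan[block] = [nodes[(block + r) % len(nodes)]
                       for r in range(replication)]
    return plan


def blocks_per_node(plan):
    """Count how many block replicas each node ends up storing."""
    counts = defaultdict(int)
    for replicas in plan.values():
        for node in replicas:
            counts[node] += 1
    return dict(counts)


workers = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical workers
plan = plan_blocks(file_size_mb=1024, nodes=workers)
print(len(plan), "blocks")
print(blocks_per_node(plan))
```

Because every block lives on several nodes, losing one machine does not lose the data, and different blocks can be read from different machines at the same time, which helps with both the volume and the velocity problems.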

Figure: Distributed storage cluster

What is Big Data?

Big Data is still data, just at a huge scale. The term describes a collection of data that is huge in volume and keeps growing exponentially with time. In short, such data is so large and complex that traditional data management tools cannot store or process it efficiently.

Example of Big Data:

  • The New York Stock Exchange generates about one terabyte of new trade data per day.
Figure: New York Stock Exchange

Social media and other examples of Big Data:

  • Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated by photo and video uploads, message exchanges, comments, and so on.
  • Statistics show that Instagram has 14 million users, terabytes of photos, hundreds of server instances, and hundreds of technologies.
  • A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, the data generated reaches many petabytes.
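As a rough back-of-the-envelope check of the jet engine figure, the following sketch scales 10 terabytes per 30 minutes up to a full day of flights. The flight count, the average flight length, and the number of engines per aircraft are made-up inputs chosen only for illustration, not measured values.

```python
TB_PER_ENGINE_PER_30_MIN = 10   # figure quoted in the list above


def daily_engine_data_pb(flights_per_day, avg_flight_hours, engines_per_plane):
    """Estimate the petabytes of engine data generated in one day."""
    half_hours = avg_flight_hours * 2
    terabytes = (TB_PER_ENGINE_PER_30_MIN * half_hours
                 * engines_per_plane * flights_per_day)
    return terabytes / 1000  # 1 PB = 1000 TB in decimal units


# Hypothetical inputs: 5,000 flights, 2 hours each, 2 engines per aircraft.
print(daily_engine_data_pb(5000, 2, 2))  # 400.0
```

Even with these modest assumed inputs the estimate lands in the hundreds of petabytes per day, which is consistent with the "many petabytes" claim.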

There are many more examples where Big Data poses a problem for big companies.

How do big companies deal with the Big Data problem?

How does Facebook deal with it?

Facebook, for example, stores photographs. That statement doesn’t begin to boggle the mind until you start to realise that Facebook has more users than China has people. Each of those users has stored a whole lot of photographs. Facebook is storing roughly 250 billion images.

It would take a library of books to explain everything big data practitioners do to process it. But once you start talking about data in terms that go beyond basic buckets, once you start talking about epic quantities, insane flow, and wide assortment, you’re talking about big data.

There are now ways to sift through all that insanity and glean insights that can be applied to solving problems and identifying opportunities. That process is called analytics, and it’s why, when you hear big data discussed, you often hear the term analytics used in the same sentence.

Like every other great power, big data comes with great promise and great responsibility.

How does Google deal with it?

First, Google is indeed an expert in dealing with Big Data. (This section assumes that you already have basic know-how about Big Data.) Google processes 3.5 billion requests per day, and each request queries a database of 20 billion web pages.
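To get a feel for what 3.5 billion requests per day means, the short sketch below converts the daily figure into an average rate per second. It assumes the load is spread evenly over the day, which is a simplification; real traffic has peaks and quiet periods.

```python
REQUESTS_PER_DAY = 3.5e9          # figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds


def requests_per_second(requests_per_day=REQUESTS_PER_DAY):
    """Average request rate, assuming an evenly spread load."""
    return requests_per_day / SECONDS_PER_DAY


print(round(requests_per_second()))  # 40509
```

That is roughly 40,000 searches every second on average, each one consulting an index of billions of pages.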

Let us quickly refresh how search queries work. Google search results come from indexed pages and the Knowledge Graph database. The Google search engine analyzes the phrase entered into the search bar, looking at two aspects of it.

1) Literal search: The search engine looks for a match for some or all of the phrase. The root of your search phrase is then found, examined, and expanded upon to find better results.

2) Semantic search: The search engine attempts to understand the context of the phrase by analyzing its terms and language against the Knowledge Graph database, so that it can directly answer a question with specific information.
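The literal search step can be pictured with an inverted index, a structure that maps each word to the pages that contain it. The sketch below is a toy illustration and not Google's implementation; the sample pages, the function names, and the crude suffix-stripping `root` function (a stand-in for finding the root of a word) are all assumptions made for this example.

```python
from collections import defaultdict


def root(word):
    """Very crude stand-in for stemming: lowercase, strip a few suffixes."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(pages):
    """Map each word root to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.split():
            index[root(word)].add(page_id)
    return index


def literal_search(index, phrase):
    """Rank pages by how many of the phrase's word roots they contain."""
    scores = defaultdict(int)
    for word in phrase.split():
        for page_id in index.get(root(word), ()):
            scores[page_id] += 1
    # Highest score first; ties are broken alphabetically by page id.
    return sorted(scores, key=lambda p: (-scores[p], p))


pages = {  # hypothetical mini "web"
    "a": "storing big data in clusters",
    "b": "big companies store photos",
    "c": "jet engines generate data",
}
index = build_index(pages)
print(literal_search(index, "big data cluster"))  # ['a', 'b', 'c']
```

Page "a" ranks first because it matches all three words of the query, including "cluster", which matches "clusters" once both are reduced to the same root.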

Figure: How Google works with the given data

“How does Google process such an enormous chunk of data?”

Over the years, Google has developed programming models and services such as MapReduce and BigQuery. Today, distributed computing (for example, Google Compute Engine) combined with distributed storage (for example, Hadoop) makes it possible to process and analyze terabytes or petabytes of data in order to provide the most relevant results for your search query.
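As an illustration of the MapReduce programming model mentioned above, here is a single-machine Python sketch of the classic word-count job. It only mimics the map, shuffle, and reduce stages in memory; it does not use Hadoop or any Google API, and the function names and sample documents are made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # In a real cluster each document (or block) would be mapped on a
    # different machine; here the map calls simply run one after another.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


docs = ["big data is big", "data is everywhere"]  # hypothetical input
print(word_count(docs))
# {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The useful property is that the map step needs only one document at a time and the reduce step needs only one key at a time, so both can be spread across many machines.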

There is much more to know about Big Data. This is just a basic piece of research to give you an idea of what Big Data is and how top multinational companies deal with this type of problem.

Figure: Problems that occur in Big Data

I really appreciate that you are reading my post.