Storage Mechanism

Data Governance & Security Mechanism in Distributed Data Storage System

Traditional data storage mechanisms cannot hold the massive volume of data now being generated at high speed for later use, even with vertical scaling. Data is the one fuel expected to accelerate rapid growth across every sector, from business to natural resources to medicine. But the question is how to persist this massive volume of data for processing. The answer is, storing the data...

Basic concept of Data Lake

The infographic on the left represents the basic concept of a Data Lake, where we can use ELT (extract, load, then transform) instead of the traditional ETL (extract, transform, then load) process. ETL belongs to traditional data warehousing systems, where data follows a structured (row and column) format. By leveraging HDFS (Hadoop Distributed File System), we can build a data lake that stores data in any format for processing and analysis. Data can be loaded directly into the Lake...
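As a rough illustration of the ELT idea described above, the sketch below lands raw records untouched and applies a row-and-column shape only when the data is read. It is a minimal sketch under stated assumptions, not the approach of the full post: a local temporary directory stands in for HDFS, and the function names (`load_raw`, `transform_on_read`) and the sample records are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_raw(lake_dir, source_name, raw_records):
    """Extract + Load: write records as-is, one JSON document per line."""
    target = Path(lake_dir) / f"{source_name}.jsonl"
    with target.open("w", encoding="utf-8") as fh:
        for record in raw_records:
            fh.write(json.dumps(record) + "\n")
    return target


def transform_on_read(path, wanted_fields):
    """Transform: impose a tabular shape only when the data is consumed."""
    rows = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            # Missing fields become None instead of blocking the load step.
            rows.append(tuple(record.get(field) for field in wanted_fields))
    return rows


# Hypothetical records whose shapes differ, as they might across sources.
raw = [
    {"order_id": 1, "amount": 250.0, "channel": "web"},
    {"order_id": 2, "amount": 99.5, "comment": {"text": "fast delivery"}},
]

# Assumption: a local temp directory plays the role of the lake (HDFS in the text).
with tempfile.TemporaryDirectory() as lake:
    landed = load_raw(lake, "orders", raw)
    rows = transform_on_read(landed, ["order_id", "amount", "channel"])

print(rows)
```

In an ETL pipeline the second record would have to be reshaped, or rejected, before loading; here it is stored first and the schema decision is deferred to read time.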

Essentiality of Data Wrangling

To roll out a new software product commercially, in any domain, a 360-degree quality check with test data is mandatory. We can compare this to a new vehicle. Once manufacturing is complete, fuel has to be fed to the engine to make it operational. Once the vehicle starts moving, all the quality checks and tests begin, such as brake performance, mileage, and comfort, along with thousands of other factors which are decided/concluded during...

Establishment of Data Lake specific to multi-channel e-commerce application to understand customer’s buying pattern

Post-order-fulfillment data is becoming a very important asset for e-commerce vendors who want to understand their customers' complete buying patterns, especially vendors who sell multiple products ranging from electronics to apparel. Extraction and transformation are time-consuming operations when partially structured data moves from various sources and finally lands in a relational data warehouse. Data extracted from social media is semi-structured (JSON or XML). As an example, Facebook provides information in JSON format through Graph API and same...
