Author - Gautam Goswami

Data Ingestion phase for migrating enterprise data into Hadoop Data Lake

Big Data solutions help extract valuable information to support accurate strategic business decisions. The exponential growth of digitalization, social media, telecommunication, etc. is fueling enormous data generation everywhere. Before processing such huge volumes of data, we need an efficient, distributed data storage mechanism that can hold any form of data, from structured to unstructured. The Hadoop Distributed File System (HDFS), installed on a multi-node cluster, can be leveraged efficiently as a data lake...

Read more...

Why Lambda Architecture in Big Data Processing

Due to the exponential growth of digitization, the entire globe is creating a minimum of 2.5 quintillion (2,500,000,000,000 million) bytes of data every day, and that is what we denote as Big Data. Data generation happens everywhere: social media sites, various sensors, satellites, purchase transactions, mobile devices, GPS signals, and much more. With the advancement of technology, there is no sign of data generation slowing down; instead, it will grow in massive volume. All the major organizations, retailers,...

Read more...

Apache Kafka, the Next-Generation Distributed Messaging System

In a Big Data project, the main challenge is collecting an enormous volume of data, and we need distributed, high-throughput messaging systems to overcome it. Apache Kafka is designed to address this challenge. It was originally developed at LinkedIn Corporation and later became part of the Apache project. A messaging system is typically responsible for transferring data from one application to another; a message is nothing but a bunch of data/information. To ingest huge volumes of data into Hadoop...
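As a rough illustration of the publish-subscribe idea behind Kafka (not Kafka's actual API — a real producer would use a client library against a running broker), here is a toy in-memory sketch: producers append messages to a named topic log, and each consumer reads from its own offset, so the same data can be consumed independently.

```python
from collections import defaultdict

class MiniBroker:
    """Toy in-memory broker illustrating the publish-subscribe model:
    each topic is an append-only log, and consumers track their own
    read offsets rather than removing messages from a shared queue."""

    def __init__(self):
        self.topics = defaultdict(list)   # topic name -> append-only log

    def send(self, topic, message):
        """Producer side: append a message to the topic's log."""
        self.topics[topic].append(message)

    def poll(self, topic, offset):
        """Consumer side: return messages from `offset` onward,
        plus the next offset to resume from."""
        log = self.topics[topic]
        return log[offset:], len(log)

broker = MiniBroker()
broker.send("page-views", "user1:/home")
broker.send("page-views", "user2:/cart")

# A consumer starting at offset 0 sees the whole log.
messages, next_offset = broker.poll("page-views", 0)
print(messages)      # ['user1:/home', 'user2:/cart']
print(next_offset)   # 2
```

Because consuming never deletes messages, a second consumer (say, a Hadoop ingestion job) can poll the same topic from offset 0 and receive the identical stream.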

Read more...

Fog Computing

Fog computing is also referred to as edge computing. Cisco Systems introduced the term "Fog Computing", and it is not a replacement for cloud computing. Ideally, cloud computing refers to storing and accessing data and programs over the Internet instead of a local computer's hard drive or storage; the cloud is simply a metaphor for the Internet. In fog computing, data, processing, and applications are concentrated in devices at the network edge. Here, devices communicate peer-to-peer so that data storage and sharing...

Read more...

Basic concept of Data Lake

The infographic on the left represents the basic concept of a Data Lake, where we can use the approach of ELT (extraction, loading, and then transformation) as opposed to the traditional ETL process (extraction, transformation, and then loading). The ETL process applies to traditional data warehousing systems, where data follows a structured format (rows and columns). By leveraging HDFS (Hadoop Distributed File System), we can develop a data lake to store data in any format for processing and analysis. Data can be loaded directly into the lake...
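The ELT idea can be sketched in a few lines. In this simplified illustration (a plain list stands in for what would really be an HDFS directory), records of varying shape are loaded raw and untouched, and parsing/normalization is deferred until analysis time:

```python
import json

# ELT sketch: the "raw zone" accepts records as-is, with no schema
# enforced at load time (in a real lake this would be files in HDFS).
raw_zone = []

def load(record_line):
    """Extract + Load: store the raw line exactly as received."""
    raw_zone.append(record_line)

def transform():
    """Transform on demand: parse and normalize only when analysing,
    tolerating fields that are missing or typed inconsistently."""
    rows = []
    for line in raw_zone:
        rec = json.loads(line)
        rows.append({"user": rec.get("user", "unknown"),
                     "amount": float(rec.get("amount", 0))})
    return rows

load('{"user": "alice", "amount": "42.5"}')
load('{"amount": 7}')          # different shape; still accepted at load time
print(transform())
# [{'user': 'alice', 'amount': 42.5}, {'user': 'unknown', 'amount': 7.0}]
```

The contrast with ETL is that here nothing is rejected or reshaped on the way in; the schema is applied on read.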

Read more...

Real-time data analytics helps mobile service providers achieve competitive advantages

Smartphone usage has become an integral part of our daily routine. Setting aside phone calls and SMS, we are always engaged in many other activities, from entertainment to domestic shopping, social engagement, etc., by installing various types of mobile applications. Of course, mobile internet is mandatory to carry out the above. Mobile service providers are facing new and difficult challenges: due to the exponential growth of customers' expectations, they need to serve them with advanced mobile technology and handle...

Read more...

How Google News is able to group similar news together

Google News uses clustering, a machine learning technique, to group similar news articles together. Interestingly, they don't employ thousands of news editors; instead, they use clustering techniques to form groups of similar data based on common characteristics. Mahout is machine learning software from the Apache community that applications leverage to analyse large data sets. Before Mahout, it was too complex to analyse large data sets. Mahout extensively utilizes Apache Hadoop to...
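Mahout implements scalable clustering algorithms such as k-means on top of Hadoop; as a toy illustration of the underlying grouping idea only (not Mahout's actual algorithm or scale), headlines can be grouped by word overlap:

```python
def jaccard(a, b):
    """Similarity of two headlines: shared words / total distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def cluster(headlines, threshold=0.3):
    """Greedy single-pass clustering: each headline joins the first
    existing group whose representative is similar enough, otherwise
    it starts a new group of its own."""
    groups = []
    for h in headlines:
        for g in groups:
            if jaccard(h, g[0]) >= threshold:
                g.append(h)
                break
        else:
            groups.append([h])
    return groups

news = [
    "stocks rally as markets surge",
    "global markets surge on stocks rally",
    "new smartphone launched today",
]
print(cluster(news))
# [['stocks rally as markets surge', 'global markets surge on stocks rally'],
#  ['new smartphone launched today']]
```

The two market headlines share most of their words and land in one group, while the unrelated headline forms its own; production systems replace word overlap with richer feature vectors, but the grouping principle is the same.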

Read more...

Essentials of Data Wrangling

To roll out a new software product commercially in the market, irrespective of domain, a 360-degree quality check with test data is mandatory. We can correlate this with the concept of a new vehicle: after manufacturing is complete, fuel has to be injected into the engine to make it operational. Once the vehicle starts moving, all the quality checks and tests begin, like brake performance, mileage, comfort, etc., along with thousands of other factors which are decided/concluded during...

Read more...

Semi-Structured Data

Semi-structured data lies between structured and unstructured data. Data stored in a traditional database system or an Excel sheet, organized in COLUMNS and ROWS, can be denoted as structured data. Unstructured data is any data or piece of information that can't be stored in databases/RDBMS; email, Facebook comments, newspapers, etc. are examples of unstructured data. Semi-structured data does not follow a strict data model structure and is neither raw data nor typed data in...
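JSON is a common example of semi-structured data, and a short sketch (with made-up sample records) shows the key property: each record is self-describing through its tags, yet the set of fields varies from record to record, so consumers cannot rely on a fixed row-and-column schema.

```python
import json

# Two records from the same hypothetical feed: both are valid JSON
# (field names give some structure), but the fields differ per record,
# which is the hallmark of semi-structured data.
records = [
    '{"id": 1, "name": "Alice", "email": "alice@example.com"}',
    '{"id": 2, "name": "Bob", "phones": ["555-0100", "555-0101"]}',
]

for line in records:
    rec = json.loads(line)
    # Consumers must tolerate missing or extra fields rather than
    # assuming every record has the same columns.
    contact = rec.get("email") or rec.get("phones", ["n/a"])[0]
    print(rec["name"], contact)
# Alice alice@example.com
# Bob 555-0100
```

A relational table would force both records into one schema (with NULLs or a redesign); the semi-structured form simply carries whichever fields each record has.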

Read more...

Why the omni-channel approach is becoming a focal point for retailers

In short, we can define retailing as selling goods or services through different types of channels. E-commerce falls under internet/electronic commerce, which is one of the channels in a multi-channel approach. Business-to-consumer (B2C) and business-to-business (B2B) transactions are part of electronic retailing if carried out over the internet. Due to the advancement of technology, a precise and appropriate customer engagement strategy is very important for a retailer's business growth if e-commerce is their prime channel. Even though they have adopted multi...

Read more...