Semi-Structured Data
Gautam Goswami
Semi-structured data lies between structured and unstructured data. Structured data is organized in rows and columns and is typically stored in a traditional database system or an Excel sheet. Unstructured data is any data or piece of information that does not fit such a predefined layout and so cannot be stored directly in a database or RDBMS. Emails, Facebook comments, and newspaper articles are examples of unstructured data.
JSON is a common format for semi-structured data. It is lightweight and efficient compared to XML and easy for humans to read, but it does not map directly onto the rows and columns of a traditional database system, so it cannot simply be persisted and queried there. NoSQL databases such as HBase, MongoDB, and Cassandra, or the Hadoop Distributed File System (HDFS), can be used to store, query, and analyze it. In a typical client-server web application, JSON is widely used for bi-directional data interchange.
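To make the size comparison with XML concrete, here is a minimal Python sketch that serializes the same record both ways using only the standard library. The record, the `Company` root tag, and the helper names `to_json` and `to_xml` are assumptions made for illustration, and character count is only a rough proxy for how lightweight a format is.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, used only to compare the two formats.
record = {
    "Description": "pharmaceutical company",
    "Type": "Medical drugs supplier",
    "Location": "Bangalore",
}


def to_json(rec):
    """Serialize the record as a JSON string."""
    return json.dumps(rec)


def to_xml(rec, root_tag="Company"):
    """Serialize the same record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


json_text = to_json(record)
xml_text = to_xml(record)

# Print each representation together with its length in characters.
print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

XML repeats every field name in an opening and a closing tag, which is why the XML string comes out longer for the same data.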
Here is a sample of unstructured data: "Two companies named ABCD and EFGH are located in Bangalore and Chennai respectively. ABCD is a pharmaceutical company and has 150 employees. It is a medical drugs supplier and is associated with HDFC Bank for all business transactions. Company EFGH manufactures PVC pipes, has 300 employees, and carries out its financial transactions with State Bank of India." This information can be transformed into semi-structured data using the JSON format. It can then be persisted in a NoSQL database and transmitted over the wire as a REST service request or response.
[
  {
    "Description": "pharmaceutical company",
    "Type": "Medical drugs supplier",
    "BusinessBank": "HDFC Bank",
    "Location": "Bangalore"
  },
  {
    "Description": "Manufacturing company",
    "Type": "PVC Pipes",
    "BusinessBank": "State Bank Of India",
    "Location": "Chennai"
  }
]
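Once the information is in JSON form, it can be loaded and queried with a few lines of code. The following Python sketch is one possible way to do that. The `Name` and `Employees` keys are assumptions derived from the sample paragraph (they are not part of the JSON shown in this article), and `find_by_location` and `total_employees` are hypothetical helpers.

```python
import json

# JSON text for the two companies. "Name" and "Employees" are assumed keys
# taken from the sample paragraph, not from the article's JSON.
companies_json = """
[
  {"Name": "ABCD", "Type": "Medical drugs supplier",
   "Employees": 150, "Location": "Bangalore"},
  {"Name": "EFGH", "Type": "PVC Pipes",
   "Employees": 300, "Location": "Chennai"}
]
"""

# Parse the JSON text into a list of Python dictionaries.
companies = json.loads(companies_json)


def find_by_location(records, location):
    """Return the records whose Location field matches the given city."""
    return [r for r in records if r.get("Location") == location]


def total_employees(records):
    """Sum the Employees field, treating a missing value as zero."""
    return sum(r.get("Employees", 0) for r in records)


bangalore = find_by_location(companies, "Bangalore")
print([c["Name"] for c in bangalore])
print(total_employees(companies))
```

Using `r.get(...)` rather than `r[...]` reflects the nature of semi-structured data: a field may be absent from some records without making the whole data set invalid.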
The Facebook Graph API returns semi-structured data in JSON format when we query a specific node using the GET method of its REST service.
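As a rough illustration of such a query, the Python sketch below builds the URL for a GET request on a single node and parses a JSON reply. It assumes the unversioned base URL `https://graph.facebook.com` and the `fields` and `access_token` query parameters. The token is a placeholder, `build_node_url` is a hypothetical helper, and the response body is made up for illustration rather than captured from the real service, so no network call is made.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; real applications usually pin an API version in the path.
GRAPH_BASE_URL = "https://graph.facebook.com"


def build_node_url(node_id, fields, access_token):
    """Build the GET URL for one node, requesting only the listed fields."""
    query = urlencode({
        "fields": ",".join(fields),
        "access_token": access_token,
    })
    return f"{GRAPH_BASE_URL}/{node_id}?{query}"


# Placeholder token; a real one has to be obtained from Facebook.
url = build_node_url("me", ["id", "name"], "YOUR_ACCESS_TOKEN")
print(url)

# Made-up response body, standing in for what the service might return.
sample_response = '{"id": "1234567890", "name": "Example User"}'
node = json.loads(sample_response)
print(node["name"])
```

Sending the request itself could be done with `urllib.request.urlopen(url)`, after which the returned body is parsed with `json.loads` in the same way.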