Most modern applications generate large amounts of data in order to understand the needs and preferences of their customers. However, finding meaningful information within this data is like finding a needle in a haystack. In this session we will look at some solutions currently being used for Big Data search and then take a closer look at one of the frontrunners, Elasticsearch. GitHub, Foursquare, StumbleUpon and SoundCloud all use Elasticsearch to analyze and search through terabytes of data and millions of search requests.
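To give a flavour of what such a search looks like, here is a minimal sketch of an Elasticsearch full-text query body (the `logs` index and `message` field are hypothetical examples; the JSON would be sent as the body of a request to the index's `_search` endpoint):

```python
import json

# Hypothetical query: find the 10 most relevant documents in a "logs"
# index whose "message" field matches the terms "error timeout".
# Sent as the body of: GET /logs/_search
query = {
    "query": {
        "match": {"message": "error timeout"}  # full-text match query
    },
    "size": 10,  # return at most 10 hits
}

body = json.dumps(query)
```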
REST prescribes that services be exposed over HTTP using the standard HTTP verbs (GET/PUT/POST/DELETE) to operate on resource URLs. But in a real-life complex application we need to expose many operations, such as approve and reject, where it seems inevitable to add verbs to the URL. What should we do? Should we just have URLs like ../foods/1/approve? What would go wrong if we used verbs in a REST URL? Is there some rationale behind avoiding them, or is it just REST dogma? Are there any “REST guidelines”?
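One common answer is to model approve/reject not as verbs in the URL but as state transitions on the resource itself: instead of `POST /foods/1/approve`, the client sends `PATCH /foods/1` with a body of `{"status": "approved"}`. A minimal sketch of the server-side handler (the `Food` resource, field names and transition table here are hypothetical):

```python
from dataclasses import dataclass

# Which status transitions a PATCH is allowed to make (hypothetical rules).
VALID_TRANSITIONS = {
    "pending": {"approved", "rejected"},
}

@dataclass
class Food:
    id: int
    status: str = "pending"

def patch_food(food, body):
    """Handle PATCH /foods/{id}: apply a partial update to the resource.

    Returns (http_status_code, resource). The "verb" (approve/reject)
    is expressed as a new value of the resource's "status" field.
    """
    new_status = body.get("status")
    if new_status in VALID_TRANSITIONS.get(food.status, set()):
        food.status = new_status
        return 200, food   # OK: state transition applied
    return 409, food       # Conflict: transition not allowed

code, food = patch_food(Food(id=1), {"status": "approved"})
```

The URL stays a noun, and the allowed operations are just constraints on the resource's state.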
Since its early days, the Hadoop community has made several attempts to stretch Hadoop beyond its role as a distributed programming framework. The key strength that Hadoop brings to the table is its ability to scale linearly. Can we combine this advantage of Hadoop with the efficiency of databases? What does it take to run SQL over Hadoop?
Running SQL-on-Hadoop implies accessing data from “within” Hadoop using SQL as the interface. Accomplishing this demands a significant re-architecture of the storage and compute infrastructures. SQL-on-Hadoop also shifts Hadoop’s role from a technology viewed so far as complementary to databases into one that could compete with them. It’s perhaps the single most significant feature that will help Hadoop find its way into more enterprises. This session highlights some conceptual ideas behind the different ways that SQL processors can be implemented atop Hadoop, and looks at examples of open-source and research products.
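To make the idea concrete, here is a minimal sketch (with hypothetical data and table) of one such approach, the Hive-style translation of SQL into MapReduce, expressing `SELECT dept, COUNT(*) FROM employees GROUP BY dept` as a map phase and a reduce phase:

```python
from collections import defaultdict

# Hypothetical employees table, one (dept,) tuple per row.
rows = [("eng",), ("sales",), ("eng",)]

def map_phase(rows):
    # Emit (key, 1) per row, keyed by the GROUP BY column.
    for (dept,) in rows:
        yield dept, 1

def reduce_phase(pairs):
    # Sum the counts per key -- this is the COUNT(*) aggregate.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

result = reduce_phase(map_phase(rows))  # {"eng": 2, "sales": 1}
```

Real engines differ mainly in what sits between the SQL parser and this execution layer (MapReduce jobs, long-running daemons, or a custom runtime), which is where much of the performance debate lies.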
Harpreet Singh shares his experience of building an enterprise Big Data platform for a 100TB dataset, with a use case from the medical sector. He talks about how they went about managing unstructured data (genomics, imaging) on HBase/Hadoop and structured data (biochemistry, skin tests etc.) in the MongoDB NoSQL database, and the challenges faced along the way.
Vibhore Sharma, CTO of Naukri.com, speaks about emerging technologies in Cloud, Mobile, Big Data, the Internet of Things (IoT) and Analytics. He also shares the challenges of building and running a site like Naukri.com. This talk was delivered at the IndicThreads Conference 2013 at Delhi NCR, India. Vibhore is an internet technology specialist with expertise in building websites of challenging scale and complexity.
The 2nd Annual IndicThreads Delhi NCR Conference will be held on 30-31 August 2013. The conference has a great lineup of speakers & sessions and offers an unmatched opportunity for learning in Big Data, Java, Cloud, Web & Emerging technologies. Do take a look at the sessions and the schedule, and register for the 2013 conference.