From SQL to NoSQL world!

Part 1 of 2:

In this article, we will explore a developer's experience moving from writing SQL queries to NoSQL queries for analytics. We will divide this article into two parts. First part will cover SQL section.

PS: Readers should have in-depth knowledge on SQL and beginner knowledge on aggregation framework of MongoDB.

Let's take an example problem for understanding overall use case for this article.

Problem:

  • We have two tables actor_survey for collecting information about how public rates actor based on their overall acting performance and actor_rating for different ratings used for defining their performance. We will use below MySQL queries to create tables.
-- actor_survey table 
CREATE TABLE actor_survey 
  ( 
     id         INTEGER, 
     actor_name VARCHAR(100), 
     rating_id  INTEGER 
  ); 

-- rating table 
CREATE TABLE actor_rating 
  ( 
     id                 INTEGER, 
     rating_name        VARCHAR(100), 
     rating_description VARCHAR(100) 
  );

Note: Insert dummy data for testing.

For rating, we will use 3 levels:

  1. Excellent
  2. Better
  3. Good

Once we load data into the tables, we can start our journey of writing queries.

Let's start with getting aggregated data for excellent rating.

-- Aggregate query for excellent rating 
SELECT actor_name             AS actorName, 
       Count(acr.rating_name) AS total_excellent 
FROM   actor_survey acs 
       LEFT JOIN actor_rating acr 
              ON acs.rating_id = acr.id 
WHERE  rating_name = 'excellent' 
GROUP  BY actorName;

For getting aggregated data for better rating.

-- Aggregate query for better rating 
SELECT actor_name             AS actorName, 
       Count(acr.rating_name) AS total_better 
FROM   actor_survey acs 
       LEFT JOIN actor_rating acr 
              ON acs.rating_id = acr.id 
WHERE  rating_name = 'better' 
GROUP  BY actorName;

For getting aggregated data for good rating.

-- Aggregate query for good rating  
SELECT actor_name             AS actorName, 
       Count(acr.rating_name) AS total_good 
FROM   actor_survey acs 
       LEFT JOIN actor_rating acr 
              ON acs.rating_id = acr.id 
WHERE  rating_name = 'good' 
GROUP  BY actorName;

Now we can combine above sub queries and get overall aggregated data for our use case.

-- Total aggregated data for excellent, better and good rating 
SELECT excellent.actorname, 
       excellent.total_excellent, 
       better.total_better, 
       good.total_good 
FROM   (SELECT actor_name             AS actorName, 
               Count(acr.rating_name) AS total_excellent 
        FROM   actor_survey acs 
               LEFT JOIN actor_rating acr 
                      ON acs.rating_id = acr.id 
        WHERE  rating_name = 'excellent' 
        GROUP  BY actorName) AS excellent 
       RIGHT JOIN (SELECT actor_name             AS actorName, 
                          Count(acr.rating_name) AS total_better 
                   FROM   actor_survey acs 
                          LEFT JOIN actor_rating acr 
                                 ON acs.rating_id = acr.id 
                   WHERE  rating_name = 'better' 
                   GROUP  BY actorName) AS better 
               ON better.actorName = excellent.actorName 
       RIGHT JOIN (SELECT actor_name             AS actorName, 
                          Count(acr.rating_name) AS total_good 
                   FROM   actor_survey acs 
                          LEFT JOIN actor_rating acr 
                                 ON acs.rating_id = acr.id 
                   WHERE  rating_name = 'good' 
                   GROUP  BY actorName) AS good 
               ON good.actorName = better.actorName

Below is my expected output:

Screenshot (33).png

I hope this article would be useful for proceeding to next part. See you soon! If you have any comments, feedback then I would be happy to receive.