Part 2: From SQL to the NoSQL world!

Part 2 of 2:

In this article, we will finally explore the NoSQL world from the point of view of someone with an SQL background, and look at the developer experience as a whole.

Note: I am also a beginner with NoSQL, so this may not be the best possible solution. Any feedback and comments from the community are highly welcome and would help beginners like me.

We will continue with the same problem set from the first part of this article.

Here, I will simply paste the whole query and explain the steps below.

db.getCollection("actor_survey").aggregate([
  // First pipeline starts here
  {
    // facet for multiple aggregation with different stages inside
    $facet: {
      // Excellent rating data aggregation starts here
      total_excellent: [
        {
          $lookup: {
            from: "actor_rating",
            localField: "rating_id",
            foreignField: "_id",
            as: "ratings",
          },
        },
        {
          $unwind: { path: "$ratings", preserveNullAndEmptyArrays: true },
        },
        // Filter after the join: the "ratings" field is created by $lookup
        { $match: { "ratings.rating_name": "excellent" } },

        { $group: { _id: "$actor_name", total_excellent: { $sum: 1 } } },
        {
          $project: {
            _id: 1,
            total_excellent: 1,
          },
        },
      ], // Excellent rating data aggregation ends here

      // Better rating data aggregation starts here
      total_better: [
        {
          $lookup: {
            from: "actor_rating",
            localField: "rating_id",
            foreignField: "_id",
            as: "ratings",
          },
        },
        {
          $unwind: { path: "$ratings", preserveNullAndEmptyArrays: true },
        },
        // Filter after the join: the "ratings" field is created by $lookup
        { $match: { "ratings.rating_name": "better" } },

        { $group: { _id: "$actor_name", total_better: { $sum: 1 } } },
        {
          $project: {
            _id: 1,
            total_better: 1,
          },
        },
      ], // Better rating data aggregation ends here

      // Good rating data aggregation starts here
      total_good: [
        {
          $lookup: {
            from: "actor_rating",
            localField: "rating_id",
            foreignField: "_id",
            as: "ratings",
          },
        },
        {
          $unwind: { path: "$ratings", preserveNullAndEmptyArrays: true },
        },
        // Filter after the join: the "ratings" field is created by $lookup
        { $match: { "ratings.rating_name": "good" } },

        { $group: { _id: "$actor_name", total_good: { $sum: 1 } } },
        {
          $project: {
            _id: 1,
            total_good: 1,
          },
        },
      ], // Good rating data aggregation ends here
    }, // facet end
  }, // First pipeline ends here

  // Union all result data and project
  {
    $project: {
      data: { $setUnion: ["$total_excellent", "$total_better", "$total_good"] },
    },
  },

  {
    $unwind: { path: "$data", preserveNullAndEmptyArrays: false },
  },

  // format data
  {
    $project: {
      actorName: "$data._id",
      total_excellent: { $toInt: "$data.total_excellent" },
      total_better: { $toInt: "$data.total_better" },
      total_good: { $toInt: "$data.total_good" },
    },
  },

  // Group similar data and sum
  {
    $group: {
      _id: "$actorName",
      total_excellent: { $sum: "$total_excellent" },
      total_better: { $sum: "$total_better" },
      total_good: { $sum: "$total_good" },
    },
  },

  // Final output
  {
    $project: {
      _id: 0,
      actorName: "$_id",
      total_excellent: 1,
      total_better: 1,
      total_good: 1,
    },
  },
]);

We have used the aggregation framework of MongoDB, which provides different pipeline stages for working with collection data.

For our specific use case, we have decided to aggregate the data using $facet, which runs multiple aggregation pipelines within a single stage over the same set of input documents.

Example:

{ $facet:
   {
      total_excellent: [ <stage1>, <stage2>, ... ],
      total_better: [ <stage1>, <stage2>, ... ],
      ...

   }
}

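Here, the pipelines, such as total_excellent and total_better, are independent of each other: each one receives the same input documents, and $facet outputs a single document with one array field per pipeline.

Let's look more closely at the stages inside one of these pipelines, total_excellent.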
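
To make the idea of independent pipelines more concrete, here is a minimal sketch in plain JavaScript. It only models the behaviour and is not MongoDB itself: the facet helper, the sample documents and the filter functions are all hypothetical names made up for illustration.

```javascript
// Plain-JavaScript model of $facet (an illustration, not the MongoDB engine).
// Each named pipeline is a hypothetical function that receives the same
// input documents and returns its own array of results.
function facet(docs, pipelines) {
  const result = {};
  for (const [name, pipeline] of Object.entries(pipelines)) {
    // Every pipeline starts from the original input, never from the
    // output of a sibling pipeline.
    result[name] = pipeline(docs);
  }
  // The stage emits a single document holding one array per pipeline.
  return [result];
}

// Hypothetical sample data, assumed to be already joined with the rating name.
const surveys = [
  { actor_name: "A", rating_name: "excellent" },
  { actor_name: "A", rating_name: "better" },
  { actor_name: "B", rating_name: "excellent" },
];

const facetOutput = facet(surveys, {
  total_excellent: (docs) => docs.filter((d) => d.rating_name === "excellent"),
  total_better: (docs) => docs.filter((d) => d.rating_name === "better"),
});

console.log(JSON.stringify(facetOutput, null, 2));
```

Because the output is one document with several arrays, the later stages of the query have to combine those arrays again before they can group by actor.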

total_excellent: [
        {
          $lookup: {
            from: "actor_rating",
            localField: "rating_id",
            foreignField: "_id",
            as: "ratings",
          },
        },
        {
          $unwind: { path: "$ratings", preserveNullAndEmptyArrays: true },
        },
        // Filter after the join: the "ratings" field is created by $lookup
        { $match: { "ratings.rating_name": "excellent" } },

        { $group: { _id: "$actor_name", total_excellent: { $sum: 1 } } },
        {
          $project: {
            _id: 1,
            total_excellent: 1,
          },
        },
      ], // Excellent rating data aggregation ends here

Here, $match, as the name suggests, filters documents. We can relate it to the WHERE clause in MySQL.

$lookup is similar to a join in MySQL (a left outer join): it connects two collections and stores the matching documents from the other collection in an array field, here ratings.
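
As a rough illustration of what this join produces, the following plain-JavaScript sketch models the simple localField/foreignField form of $lookup. The lookup helper and the sample values are hypothetical, and the model ignores special cases such as array-valued local fields.

```javascript
// Simplified plain-JavaScript model of $lookup (illustration only).
function lookup(docs, foreignDocs, localField, foreignField, as) {
  return docs.map((doc) => ({
    ...doc,
    // All matching foreign documents go into an array.
    // A document without a match keeps an empty array (left outer join).
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

// Hypothetical sample collections using the field names from the query.
const actorRating = [
  { _id: 1, rating_name: "excellent" },
  { _id: 2, rating_name: "better" },
];
const actorSurvey = [
  { actor_name: "A", rating_id: 1 },
  { actor_name: "B", rating_id: 9 }, // no matching rating
];

const joined = lookup(actorSurvey, actorRating, "rating_id", "_id", "ratings");
console.log(JSON.stringify(joined, null, 2));
```

The important difference from an SQL join is that the joined rows arrive as an array inside each document rather than as extra columns.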

$unwind deconstructs an array field from the input documents and outputs one document per element. If preserveNullAndEmptyArrays is set to true, documents whose array field is missing, null or empty are kept in the output instead of being dropped.
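
The effect of preserveNullAndEmptyArrays is easier to see on a small example. The sketch below is a plain-JavaScript model of $unwind, written under the assumption that only a top-level field is unwound. The unwind helper and the sample documents are hypothetical.

```javascript
// Plain-JavaScript model of $unwind on a top-level field (illustration only).
function unwind(docs, field, preserveNullAndEmptyArrays = false) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    const isEmptyArray = Array.isArray(value) && value.length === 0;
    if (value === undefined || value === null || isEmptyArray) {
      if (preserveNullAndEmptyArrays) {
        // Keep the document; an empty array field is dropped from it.
        const kept = { ...doc };
        if (isEmptyArray) delete kept[field];
        out.push(kept);
      }
      continue; // without the option, the document is dropped
    }
    // A non-array value is treated like a one-element array.
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

// Hypothetical sample documents as they might look after the join.
const sample = [
  { actor_name: "A", ratings: [{ rating_name: "excellent" }, { rating_name: "good" }] },
  { actor_name: "B", ratings: [] },
];

console.log(unwind(sample, "ratings").length);       // actor B is dropped
console.log(unwind(sample, "ratings", true).length); // actor B is kept
```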

$group is the equivalent of GROUP BY in MySQL: it groups similar documents, and we can apply accumulators such as $sum to each group.

Finally, $project passes only the required fields to the next stage in the pipeline.

In summary, the following high-level steps are required to perform the overall aggregation.
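
For readers coming from SQL, { $sum: 1 } plays the role of COUNT(*). The following plain-JavaScript sketch models a $group stage that counts documents per key. The groupCount helper and the sample rows are hypothetical.

```javascript
// Plain-JavaScript model of
//   { $group: { _id: "$<keyField>", <countField>: { $sum: 1 } } }
// (illustration only).
function groupCount(docs, keyField, countField) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    // Adding 1 per document is what { $sum: 1 } does for each group.
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => ({ _id: key, [countField]: count }));
}

// Hypothetical rows that already passed the "excellent" filter.
const excellentRows = [
  { actor_name: "A" },
  { actor_name: "B" },
  { actor_name: "A" },
];

const grouped = groupCount(excellentRows, "actor_name", "total_excellent");
console.log(JSON.stringify(grouped));
```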

  • First, we start with $facet to calculate the different aggregates individually: total excellent, total better and total good.
  • Once we have the totals for the different metrics, we combine them in one place with $setUnion and then unwind the combined array for further processing.
  • Next, we format the data as required, group the rows that belong to the same actor and sum their totals.
  • Finally, we project only the fields required for our use case.
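
The combine, group and sum steps above can be sketched in plain JavaScript as follows. This is only a model of what the later stages do with the $facet output. The mergeTotals helper and the sample facet document are hypothetical, and a missing total is treated as 0, which has the same effect on the final sums as $sum ignoring missing values.

```javascript
// Plain-JavaScript model of the stages after $facet (illustration only):
// combine the three arrays, then group by actor and sum each total.
function mergeTotals(facetDoc) {
  const combined = [
    ...facetDoc.total_excellent,
    ...facetDoc.total_better,
    ...facetDoc.total_good,
  ];
  const byActor = new Map();
  for (const row of combined) {
    const totals = byActor.get(row._id) || {
      actorName: row._id,
      total_excellent: 0,
      total_better: 0,
      total_good: 0,
    };
    // Each row carries only one of the three totals; the others count as 0.
    totals.total_excellent += row.total_excellent || 0;
    totals.total_better += row.total_better || 0;
    totals.total_good += row.total_good || 0;
    byActor.set(row._id, totals);
  }
  return [...byActor.values()];
}

// Hypothetical $facet output for two actors.
const facetDoc = {
  total_excellent: [{ _id: "A", total_excellent: 2 }],
  total_better: [{ _id: "A", total_better: 1 }, { _id: "B", total_better: 3 }],
  total_good: [{ _id: "B", total_good: 1 }],
};

console.log(JSON.stringify(mergeTotals(facetDoc), null, 2));
```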

I hope this article makes it clear how one can transition from SQL to NoSQL and relate the two. We can also extend this query with features like pagination, search, filtering and sorting to build powerful server-side tables.

If you have any comments or feedback, I would be happy to receive them. I would also like to thank everyone who has provided valuable feedback to make this article better. I will try to update my blog accordingly.