https://www.w3schools.com/mongodb/mongodb_aggregations_out.php
https://www.geeksforgeeks.org/mongodb-regex/
https://www.mongodb.com/docs/manual/
MongoDB is a document-oriented NoSQL database system that provides high scalability, flexibility, and performance.
Unlike standard relational databases, MongoDB stores data in a JSON document structure form.
mongoDB is key value pair, documents , graph
to improve performance we can add new server in no sql databases
but in regular ones we need to improve hardware..
Relational Database Management System(RDBMS) is not the correct choice when it comes to handling big data by the virtue of their design since they are not horizontally scalable.
If the database runs on a single server, then it will reach a scaling limit.
NoSQL databases are more scalable and provide superior performance.
What’s called Table in RDBMS is called a Collection in MongoDB. Similarly, a Row is called a Document and a Column is called a Field.
MongoDB is such a NoSQL database that scales by adding more and more servers and increases productivity with its flexible document model.
load balancing to manage high traffic
regular expresion based queries
We use a 1 to include a field and 0 to exclude a field.
The sort order can be chosen by using 1 or -1. 1 is ascending and -1 is descending.
Using $match early in the pipeline can improve performance since it limits the number of documents the next stages must process.
$lookup
This aggregation stage performs a left outer join to a collection in the same database.
There are four required fields:
from: The collection to use for lookup in the same database
localField: The field in the primary collection that can be used as a unique identifier in the from collection.
foreignField: The field in the from collection that can be used as a unique identifier in the primary collection.
as: The name of the new field that will contain the matching documents from the from collection.
cursor
var mycursor = db.student.find({studentId:{$gt:1}});
> while(mycursor.hasNext()){
... print(tojson(mycursor.next()));
... }
toArray
var mycursor = db.student.find().pretty()
var docs = mycursor.toArray()
var resultdoc = docs[0]
resultdoc
i: To match both lower case and upper case pattern in the string.
use single quote
first command -- mongo in terminal
use db_name
--creates db
You can create a collection using the createCollection() database method.
db.posts.insertOne({"title": "Post 1"}
This will create the "posts" collection if it does not already exist.
db.posts.updateOne( { title: "Post Title 1" }, { $set: { likes: 2 } } )
----
MongoDB
++++++++
According to International Data Corporation around 2.4 Quintillion bytes of data is generated everyday and more than 90% of data we have today is from last two years only.
Data is generated from mobile devices , social networks and applications.
Big Data
++++++++++
Data volume - zettabytes of data (1024 to power of 8) generated daily
Data velocity - speed of generation , storage and processing is very fast
Data variety - Structured (RDBMS)
Semi-structured (XML,HTML)
Unstructured data (docs , social)
NO SQL DBs
++++++++++++++++
Non - relational - document oriented / column oriented / graph based etc
Distributed and Horizontally scalable - more clusters with low cost hardware
Open Source - wide community support
RDBMS - ACID
NOSQL - CAP
C - Consistency - all nodes of cluster return same data
A - Availability - reads/writes are never declined
P - Partition Tolerance - works across network partitions
At most , only two of CAP can be satisfied.
CP - MongoDB
AP - Cassandra
CA - MySQL
MongoDB
++++++++
Created by 10gen , written in C++.
Key Features
++++++++++++++++
High Performance
Rich Query Language
High Availability
Horizontally Scalable
Support of multiple storage engines
Installation
++++++++++++++
7.0 +
UI : VSCode
Azure Setup
Terminal
- Internal Certification
Common Commands
+++++++++++++++++
1. db - Shows the current DB. test db is the default DB.
2. show dbs - Shows the list of DBs under the server
3. use <
4. db.createCollection("employees") - Create a table in the current DB
5. show collections/show tables -- show the tables in current db
Drop
+++++
1. db.employees.drop() - drop the table
2. db.dropDatabase() - drop the DB
CRUD Operations
++++++++++++++++
1.insert - can insert one or more records - not recommended
2.insertOne - one record only
3.insertMany - multiple records
db.employees.insertOne/insertMany/insert - it will create the table if not created using createCollection and then also insert the record
use EmpDB
db.employees.insertOne({
empId:1,
empName : "test",
dept : "ETA",
salary : 30000
})
{
"acknowledged" : true,
"insertedId" : ObjectId("66eadbc9abec11642808ce38")
}
insertedId - _id - primary key
db.employees.insertOne({
"empId":2,
"empName" : "Lakshmi",
dept : "ETA",
salary : 30000
})
db.employees.find() - See the data / read the data
db.employees.insertOne({
_id: 1001,
"empId":2,
"empName" : "Lakshmi",
dept : "ETA",
salary : 30000
})
{ "acknowledged" : true, "insertedId" : 1001 }
db.employees.insertMany([{
_id: 1002,
"empId":3,
"empName" : "John Doe",
dept : "ETA",
salary : 30000
},{
_id: 1003,
"empId":4,
"empName" : "Peter",
dept : "ETA",
salary : 30000
}])
{ "acknowledged" : true, "insertedIds" : [ 1002, 1003 ] }
Insert data from JSON file
db.employees.find().pretty()
Read Operations
++++++++++++++++
--where cond
db.employees.find({filter cond},{select cond})
db.employees.find({department:"ETA"})
--select cond
db.employees.find({department:"ETA"},{empName:1,department:1})
1 - show the col
0 - hide the col
--no _id
db.employees.find({department:"ETA"},{empName:1,department:1,_id:0})
You can have all 0's or all 1's except for _id field.You cannot have a combination of 1s and 0s.
db.employees.find({department:"ETA"},{empName:1,department:0,_id:0})
db.employees.find({department:"ETA"},{empName:0,department:0,_id:1})
findOne()
++++++++++
db.employees.findOne({department:"ETA"})
Read Operators
++++++++++++++++
$gt,$lt,$gte,$lte,$in,$nin,$eq,$ne etc
---all employees who have salary > 40000
db.employees.find({salary:{$gt:40000}},{empName:1,salary:1,_id:0})
--all emp whose dept is not ETA
db.employees.find({department:{$ne:"ETA"}},{empName:1,department:1,_id:0})
--all emp whose dept is ETA or IVS
db.employees.find({department:{$in:["ETA","IVS"]}},{empName:1,department:1,_id:0})
--all emp not in dept ETA or IVS but having a dept
db.employees.find({department:{$nin:["ETA","IVS"],$exists:true}},{empName:1,department:1,_id:0})
----------------------------------------
----------------------------------------
----------------------------------------
$size - Array Operator - No. of elem within the array
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
db.employees.find({experience:{$size:2}},{empName:1,experience:1,_id:0})
$regex - Regular Expression like in JS
+++++++
--Fetch all emp whose name starts with J
db.employees.find({empName:{$regex:"^J+"}},{empName:1,_id:0})
--Case insensitive search $options : i
+++++++++++++++++++++++++++++++++++++++
db.employees.find({empName:{$regex:"^J+",$options:"i"}},{empName:1,_id:0})
--Fetch all empName start with J or E
$in , $ regex ---cannot be used together
So, write regex using //
db.employees.find({empName:{$in:[/^J+/i,/^E+/i]}},{empName:1,_id:0})
--$and or $or
+++++++++++++++
--Fetch emp who work for ETA or IVS and salary > 30000
db.employees.find({$and:[{
department:{$in:["ETA","IVS"]}
},{
salary:{$gt:30000}
}]},{empName:1,department:1,salary:1,_id:0})
--Fetch emp who work for ETA or IVS or salary > 30000
db.employees.find({$or:[{
department:{$in:["ETA","IVS"]}
},{
salary:{$gt:30000}
}]},{empName:1,department:1,salary:1,_id:0})
$not $gt <> $lte
++++++++++++++++++
--Fetch those emp whose machineid is not greator than 1002
db.employees.find({machineId:{$not:{$gt:1002}}},{empName:1,machineId:1,_id:0})
db.employees.find({machineId:{$exists:true,$not:{$gt:1002}}},{empName:1,machineId:1,_id:0})
db.employees.find({machineId:{$lte:1002}},{empName:1,machineId:1,_id:0})
$elemMatch(Story)
++++++++++
---role : TL
db.employees.find({"experience.role":"Technology Lead"},{empName:1,experience:1,_id:0})
--role:TL exp > 1 year
$and , $gt
db.employees.find({$and:[{},{}]},{})
db.employees.find({$and:[{
"experience.role":"Technology Lead"
},{
"experience.roleExperience":{$gt:1.0}
}]},{empName:1,experience:1,_id:0})//not expected output
$elemMatch
//right output
db.employees.find({
experience:{$elemMatch:{role:"Technology Lead",roleExperience:{$gt:1.0}}}
},{empName:1,experience:1,_id:0})
db.employees.find({
experience:{$elemMatch:{role:"Technology Lead",roleExperience:{$gt:1.0}}}
},{empName:1,"experience.role":1,_id:0})
$all is like $and but for array elemMatch (Check this out in Mongo Notes)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--Update Queries
++++++++++++++++++++++++++++++++
Read Query - colname :{operator : value}
Update Query - operator :{colname : value}
1.update - not recommended . By default it acts like updateOne and with
{multi:true} it acts like updateMany.
2.updateOne
3.updateMany
--Update sal of Chris to 40000
db.employees.updateOne({where cond},{update cond})
db.employees.updateOne({empId:14},{$set:{salary:40000}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
db.employees.updateMany({department:"ETA"},{$set:{salary:42000}})
db.employees.update({department:"ETA"},{$set:{salary:44000}})
db.employees.update({department:"ETA"},{$set:{salary:46000}},{multi:true})
Update Operators
+++++++++++++++++
$inc , $mul , $max , $min etc
--Increase the sal of ETA emp by 2000 rs
db.employees.updateMany({department:"ETA"},{$inc:{salary:2000}})
--Decrease the sal of ETA emp by 2000 rs
db.employees.updateMany({department:"ETA"},{$inc:{salary:-2000}})
--Increase by 10 percent
$mul
db.employees.updateMany({department:"ETA"},{$mul:{salary:1.1}})
//Little confusing
--Each emp must have a min salary of 30000 or more
++++++++++++++++++++++++++++++++++++++++++++++++++++++
$max
db.employees.updateMany({},{$max :{salary:30000}})
--Each emp must have a max salary of 50000 or less
++++++++++++++++++++++++++++++++++++++++++++++++++++++
$min
db.employees.updateMany({},{$min :{salary:50000}})
$unset
+++++++
--Delete department and machine Id for emp id 14
db.employees.updateOne({empId:14},{$unset:{department:"",machineId:""}})
db.employees.updateOne({},{$unset:{department:"",machineId:""}})
--Array Update Operations
+++++++++++++++++++++++++++++++++++++++++++
--Adding a new value to existing array $push
db.employees.updateOne({empId:14},{$push:{experience:{
role:"Senior Systems Engineer",roleExperience:0.1}}})
--Add again , duplicate created
db.employees.updateOne({empId:14},{$push:{experience:{
role:"Senior Systems Engineer",roleExperience:0.1}}})
--To avoid duplicate use $addToSet
db.employees.updateOne({empId:14},{$addToSet:{experience:{
role:"Senior Systems Engineer",roleExperience:0.1}}})
--Add two array values
++++++++++++++++++++++++++++++++
$push - It will create a sub array
db.employees.updateOne({empId:14},{
$push:{experience:[{role:"Technology Analyst",roleExperience:0.1},
{role:"Technology Lead",roleExperience:0.1}]
}})
To avoid that $each
Use with the $push operator to append multiple values to an array
db.employees.updateOne({empId:14},{
$push:{experience:{$each:[{role:"Technology Analyst",roleExperience:0.1},
{role:"Technology Lead",roleExperience:0.1}]
}}})
--Add at a specific position
$position
db.employees.updateOne({empId:14},{
$push:{experience:{$each:[{role :"Systems Engineer Trainee",roleExperience:0.6}],$position:0}}}
)
$- update only the role or roleExp
db.employees.updateOne({$and:[{empId:14},
{"experience.role":"Systems Engineer Trainee"}]},{
$set:{"experience.$.roleExperience":0.9}})
---------
Update Queries [contd]
++++++++++++++++++++++++++
$pop - Remove first or last ele from the array
-1 - First element
1 - Last Element
db.employees.updateOne({empId:14},{$pop:{experience:1}})
db.employees.updateOne({empId:14},{$pop:{experience:-1}})
$pull - Remove ele from array by matching condition
db.employees.updateOne({empId:14},{$pull:{experience:{role:"Senior Systems Engineer",roleExperience:0.1}}})
$pullAll - Remove multiple matching elements(multiple conditions)
---Refer to notes
Delete Commands
++++++++++++++++
1. deleteOne
2. deleteMany
3. remove - By default acts like deleteMany but with {justOne:true} it acts like deleteOne.
db.employees.deleteOne({empId:14})
db.employees.deleteMany({department:"ETA"})
db.employees.remove({department:"DX"},{justOne:true})
Aggregations
+++++++++++++
count , sum , avg , min , max etc
Aggregation Pipeline - the output of one stage is the input to the next stage
(Chaining)
1.countDocuments
+++++++++++++++++++++++++
--Count the no. of rows in the table
db.employees.countDocuments({})
db.employees.countDocuments({department :"ETA"})
db.employees.countDocuments({department :"DX"})
db.employees.countDocuments({},{skip:1,limit:2})
db.employees.countDocuments({},{skip:13,limit:2})
2. distinct
++++++++++++++++
db.employees.distinct("department")
db.employees.distinct("experience.role")
3.$match and $project
++++++++++++++++++++++++++++++++
$match - where cond of find()
$project - select cond of find()
--Fetch all emp who belong to any department
db.employees.aggregate([{stage1},{stage2}......])//basic syntax
db.employees.aggregate([{$match :{department:{$exists:true}}},
{$project:{empName:1,department:1,_id:0}},{$sort:{empName:1}}])
Q : sort for lowercase also
$group
++++++++
db.employees.aggregate([{$group:{_id:"$department"}}])
db.employees.aggregate([{$match:{department:{$exists:true}}},{$group:{_id:"$department"}}])
db.employees.aggregate([{$group:{_id:"$experience.role"}}])
db.employees.aggregate([{$group:{_id:{dept:"$department",sal:"$salary"}}}])
$count
++++++
db.employees.aggregate([{$count:"noofemp"}])
db.employees.aggregate([{$group:{_id:"$department"}},{$count:"noofdepts"}])
$sort,$limit,$out,$skip
++++++++++++++++++++++++++++++
db.employees.aggregate([{$sort:{salary:-1}},{$project:{empName:1,salary:1,_id:0}}])
db.employees.aggregate([{$sort:{salary:-1}},{$limit:3},{$project:{empName:1,salary:1,_id:0}}])
db.employees.aggregate([{$sort:{salary:-1}},{$limit:3},{$project:{empName:1,salary:1,_id:0}},{$out:"Top3Emp"}])
show collections/show tables
db.Top3Emp.find()
Q : $out can take anything apart from table
Only 3rd highest paid emp
db.employees.aggregate([{$sort:{salary:-1}},{$limit:3},{$skip:2},{$project:{empName:1,salary:1,_id:0}}])
Q : is there any other way to get top three employees and what if there are more than one employee with same salary?
$unwind
+++++++++++++++++++++
db.employees.aggregate([{$unwind:"$experience"}])
$sum, $avg , $min , $max
++++++++++++++++++++++++++++++
Pre-req : group
--Group by dept count of emp in each dept and avg salary
db.employees.aggregate([{
$group:{_id:"$department",
"noofemp":{$sum:1},
"avgsal":{$avg:"$salary"}}
}])
--Same way min and max (Refer notes)
$concat
db.employees.aggregate([{$project:{empDept:{$concat:["$empName","-","$department"]},_id:0}}])
$ifNull
db.employees.aggregate([{$project:{empDept:{$concat:[{$ifNull:["$empName","NA"]},"-",{$ifNull:["$department","NA"]}]},_id:0}}])
Date Queries
++++++++++++++++++++++
$year $month
$dateToString
date :
format :
timezone :
onNull :
%Y - Year
%m - Month
%d - date
%H - hours
%M - minutes
%S - Seconds
%L - Milli seconds
db.employees.aggregate([{$project:{birthYear:{$year:"$dateOfBirth"}}}])
db.employees.aggregate([{
$project:{birthdate:{$dateToString:{
date:"$dateOfBirth",
format:"%m-%d"}}}
}])
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++===
Indexing
+++++++++
db.employees.find({salary:{$gte:40000,$lte:50000}}) //Sample Query
db.employees.find({salary:{$gte:40000,$lte:50000}}).explain("executionStats")
db.employees.createIndex({salary:1})
-1:desc
1 :asc
db.employees.find({salary:{$gte:40000,$lte:50000}}).explain("executionStats")
db.employees.getIndexes()
> db.employees.dropIndex({salary:-1})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.employees.getIndexes()
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } ]
Types of Indexes
++++++++++++++++++++++++++++++
1. Single Field Index
2.Compound Index
db.employees.createIndex({salary:1,department:1})
3. Text Index
db.employees.createIndex({department:"text"})
db.employees.createIndex({"experience.role":1})
BulkWrite
++++++++++
--Refer to Notes
No comments:
Post a Comment