DBMS vs NoSQL
NoSQL DB categories
(key/value pairs, sharded arrays, and document-oriented approaches)
Google App Engine datastore. Its syntax in Python is simple and clear, and it was specifically
created to be easy to use and reminiscent of relational databases, while only providing the
services typical of key/value stores based on a massively scalable cloud computing service.
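As a rough illustration of what "only the services typical of key/value stores" can mean, the sketch below models a store that offers nothing beyond put, get and delete by key, with keys spread across shards by hash. The class ShardedKVStore and its methods are hypothetical and are not part of the App Engine API.

```python
import hashlib


class ShardedKVStore:
    """Toy key/value store (hypothetical): put/get/delete by key, nothing else."""

    def __init__(self, num_shards=4):
        # Each shard is a plain dict; a real service would place shards on separate machines.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Hash the key so that the same key always maps to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)

    def delete(self, key):
        self._shard_for(key).pop(key, None)


store = ShardedKVStore()
store.put("position:1", {"job_title": "Accountant"})
store.put("applicant:1", {"name": "Homer Simpson", "position": "position:1"})
print(store.get("applicant:1")["name"])
```

There is no query language here: anything beyond lookup by key has to be written by the caller.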
NoSQL DB benefits
The benefits of various non-relational approaches will be explained
in depth, in terms of simplicity (fewer services lead to less complexity), scalability
(weaker integrity assumptions lead to more dimensions of concurrency), and raw
performance (fewer features means fewer layers to pass through).
SQL vs NoSQL
Example:
How does the Google App Engine datastore do it?
Definition
    from google.appengine.ext import db

    class Position(db.Model):
        job_title = db.StringProperty(multiline=False)
        open_date = db.DateTimeProperty(auto_now_add=False)
        close_date = db.DateTimeProperty(auto_now_add=False)
        salary = db.StringProperty(multiline=False)
        description = db.StringProperty(multiline=True)
        ...

    class Applicant(db.Model):
        # Reference to the Position entity this applicant applied for.
        position = db.ReferenceProperty(Position)
        name = db.StringProperty(multiline=False)
        birth_date = db.DateTimeProperty(auto_now_add=False)
        address = db.StringProperty(multiline=False)
        source = db.StringProperty(multiline=False,
            choices=set(["employee referral", "recruiter", "advertisement"]))
        # Set automatically to the creation time when the entity is first stored.
        applied_date = db.DateTimeProperty(auto_now_add=True)
        ...
Note that position_id and application_id are generated by the system.
Insertion
    pos = Position()
    pos.job_title = "Accountant"
    pos.put()

    app = Applicant()
    app.position = pos.key()
    app.name = "Homer Simpson"
    app.put()
An inner join operation
For an RDBMS, a single SELECT statement is enough.
    SELECT P.job_title, A.name, A.birth_date, ...
    FROM
        Position P
        INNER JOIN Applicant A
            ON A.position_id = P.position_id
    WHERE
        P.salary > 100000
        AND A.state = 'New Jersey'
    ORDER BY
        A.name
FLOW: finding relevant job postings on disk and caching them in memory, writing new applicant records to disk, merging the information about positions and applicants in memory, filtering the results by Boolean expressions, sorting the results, etc.
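To see the declarative version run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the integer salary column and the sample rows are assumptions made for illustration; they do not come from the model definitions in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Position (
        position_id INTEGER PRIMARY KEY,
        job_title   TEXT,
        salary      INTEGER
    );
    CREATE TABLE Applicant (
        applicant_id INTEGER PRIMARY KEY,
        position_id  INTEGER REFERENCES Position(position_id),
        name         TEXT,
        state        TEXT
    );
    -- Sample data, invented for this sketch.
    INSERT INTO Position VALUES (1, 'Accountant', 120000), (2, 'Clerk', 40000);
    INSERT INTO Applicant VALUES
        (1, 1, 'Homer Simpson', 'New Jersey'),
        (2, 1, 'Ned Flanders', 'Ohio'),
        (3, 2, 'Barney Gumble', 'New Jersey');
""")

# The database engine does the merging, filtering and sorting for us.
rows = conn.execute("""
    SELECT P.job_title, A.name
    FROM Position P
    INNER JOIN Applicant A ON A.position_id = P.position_id
    WHERE P.salary > 100000 AND A.state = 'New Jersey'
    ORDER BY A.name
""").fetchall()
print(rows)
```

Only the row that satisfies both conditions survives the join; none of the merging or sorting steps appear in the calling code.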
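How does Google App Engine do it?
    from datetime import datetime

    now = datetime.now()
    # The datastore allows inequality filters on only one property per query,
    # so filter on open_date in the query and check close_date in Python.
    positions = Position.all()
    positions.filter("open_date <", now)
    for position in positions:
        if position.close_date > now:
            pass  # display the position in the list ...

    applicants = Applicant.all()
    applicants.filter("state =", "New Jersey")
    for applicant in applicants:
        # Reading a ReferenceProperty fetches the referenced Position entity.
        position = applicant.position
        # show data containing attributes of both position and applicant objects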
It should be clear by this point that there is some (potentially large) class of
operations that we can achieve declaratively, with no effort, in a SQL database, which
require significant programming in a non-relational database.
However, the other properties of the data access may shift the balance of this
equation; when the task is not to produce a quick report, but instead to manage
this information for millions of users, in order to produce intermediate structures
that can answer search queries in fractions of a millisecond, the prospect of
writing your own access code in this manner (via, for example, a map/reduce
operation) becomes much more attractive.
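The programming effort in question can be sketched in plain Python. The dictionaries below are hypothetical in-memory stand-ins for the two entity kinds, and the integer salary values are an assumption; the function does by hand what INNER JOIN, WHERE and ORDER BY would do in SQL.

```python
# Hypothetical in-memory stand-ins for the two entity kinds, keyed by id.
positions = {
    1: {"job_title": "Accountant", "salary": 120000},
    2: {"job_title": "Clerk", "salary": 40000},
}
applicants = [
    {"name": "Ned Flanders", "state": "Ohio", "position": 1},
    {"name": "Homer Simpson", "state": "New Jersey", "position": 1},
    {"name": "Barney Gumble", "state": "New Jersey", "position": 2},
]


def join_applicants(positions, applicants, state, min_salary):
    """Hand-written equivalent of an INNER JOIN with WHERE and ORDER BY."""
    results = []
    for applicant in applicants:
        if applicant["state"] != state:
            continue  # WHERE A.state = ...
        # Follow the reference from applicant to position (the join step).
        position = positions.get(applicant["position"])
        if position is None or position["salary"] <= min_salary:
            continue  # no matching position, or WHERE P.salary > ... fails
        results.append((position["job_title"], applicant["name"]))
    results.sort(key=lambda row: row[1])  # ORDER BY A.name
    return results


report = join_applicants(positions, applicants, "New Jersey", 100000)
print(report)
```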
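A single-process sketch of the map/reduce idea, assuming hypothetical applicant records: the map step emits (state, name) pairs, the reduce step collapses each group, and the resulting index is the kind of intermediate structure that can answer a lookup without scanning every record. A real map/reduce service would distribute these steps across machines.

```python
from collections import defaultdict

# Sample records, invented for this sketch.
applicants = [
    {"name": "Homer Simpson", "state": "New Jersey"},
    {"name": "Ned Flanders", "state": "Ohio"},
    {"name": "Barney Gumble", "state": "New Jersey"},
]


def map_applicant(applicant):
    # Map step: emit (key, value) pairs; here, one pair per applicant.
    yield applicant["state"], applicant["name"]


def reduce_names(state, names):
    # Reduce step: collapse all values for one key into a sorted list.
    return sorted(names)


def build_index(records):
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_applicant(record):
            grouped[key].append(value)
    return {key: reduce_names(key, values) for key, values in grouped.items()}


index = build_index(applicants)
print(index["New Jersey"])
```

Once the index is built, answering "which applicants are in New Jersey?" is a single dictionary lookup.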
Advantages to using non-relational databases
• Semi-Structured Data
• Alternative Model Paradigms
• Multi-valued properties
• Generalized Analytics
• Version History
• Predictable Scalability
• Schema Evolution
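Three of these points (semi-structured data, multi-valued properties and schema evolution) can be illustrated with plain dictionaries standing in for stored records. The field names skills and schema_version are hypothetical, chosen only for this sketch.

```python
import json

# Two records of the same hypothetical "applicant" kind with different shapes.
old_record = {"name": "Homer Simpson", "state": "New Jersey"}
new_record = {
    "name": "Ned Flanders",
    "state": "Ohio",
    "skills": ["bookkeeping", "payroll"],  # multi-valued property
    "schema_version": 2,                   # field added after old_record was written
}


def skills_of(record):
    # Readers tolerate missing fields, so old records need no migration.
    return record.get("skills", [])


for record in (old_record, new_record):
    print(json.dumps(record, sort_keys=True), skills_of(record))
```

No ALTER TABLE step is involved: the new field simply appears on records written after the change.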
Advantages to using RDBMS databases
• Ease of expression - writing queries is fast and easy, assuming those requirements
are within the purview of what SQL can do natively.
• Concurrency and Transactions - ACID properties
• Eventual Consistency
• Normalized Updates and relational integrity
• Standardization
• Access Control
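The transaction and relational-integrity points can be sketched with Python's built-in sqlite3 module. The tables and values are assumptions for illustration: the second insert refers to a position that does not exist, so the database rejects it and the whole transaction, including the first insert, is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE Position ("
    " position_id INTEGER PRIMARY KEY,"
    " job_title TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Applicant ("
    " applicant_id INTEGER PRIMARY KEY,"
    " position_id INTEGER NOT NULL REFERENCES Position(position_id),"
    " name TEXT NOT NULL)"
)
conn.commit()

try:
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("INSERT INTO Position VALUES (1, 'Accountant')")
        # position_id 99 does not exist, so this violates the foreign key.
        conn.execute("INSERT INTO Applicant VALUES (1, 99, 'Homer Simpson')")
except sqlite3.IntegrityError as exc:
    print("rolled back:", exc)

position_count = conn.execute("SELECT COUNT(*) FROM Position").fetchone()[0]
print(position_count)
```

The Position row inserted just before the failure is gone as well, which is the atomicity part of ACID.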