Relational Databases
- It's usually very useful to use a relational database (关系数据库) for data storage.
- Relational databases are very fast at retrieving specific info.
- … which is critical for web apps.
- There are also some good non-relational data storage options.
- There are several good open source database servers.
- Any language appropriate for web programming will be able to access these databases.
- … but maybe you shouldn't access them directly.
- Don't forget the things you learned/will learn in a database course.
- Especially indexes/keys!
Relational DB Servers
- There are many choices for relational databases.
- The free ones: MySQL, PostgreSQL, SQLite.
- The expensive ones: Oracle, DB2.
- The weird ones: Cassandra, Mongo, …
- For web work, free and relational is probably an obvious choice.
- Except maybe for serious enterprise work (e.g. database of ZJU students/grades/tuition payments).
- … or as an optimization.
- MySQL: the most common open-source choice.
- Well proven: many trillions of rows are out there in MySQL. Nobody is complaining that it destoyed their data.
- Fast.
- PostgreSQL: also very common, more “complete” functionality.
- Database geeks like Postgres better. Better transactions, more data types, better ACID assurance.
- e.g. Postgres had transactions years before MySQL implemented them; can do transactions on schema changes that MySQL can't.
- e.g. Postgres has a latitude/longitude data type, and can query “which rows have a point in this geographic region?” in the database.
- SQLite: not really a database server, just a library that does database things.
- Your “database” is a single file on disk. The SQLite library gives you a relational database in that.
- Amazing for development: nothing to deploy, no permissions to set. Just use SQLite and give it a filename.
- Probably not fast enough for production. Not good at handling concurrent queries.
Object-Relational Mapping
- Actually writing a lot of SQL is a pain.
- Most DB activity is similar: load and store data.
- … that are linked by primary key values,
- … and could be represented as objects.
- An object's properties often map closely to DB columns. e.g.
- A
Person
object will have properties fname, lname, email. - A
people
database table will have the same columns. - You end up writing a lot of code to turn one into the other.
- A
- An Object-Relational Mapper (对象关系映射) automatically connects objects to a relational database.
- Allows you to write code like: (using some fictional ORM)
p = new Person() p.fname = "Greg" p.save()
- or:
people = Person.findall(fname="Greg")
- All of your DB manipulation with no SQL needed.
- … and any differences between database servers can be taken care of for you.
- Allows you to write code like: (using some fictional ORM)
- ORMs can also handle relationships
- … with some configuration about how they are represented.
- e.g. if a person has many books, the
book
table might have columnsperson_id
,title
,author
, … - Then,
person.books
might return a list of books objects owned by this person.
- Bottom line:
- You probably don't have to write SQL.
- But you should probably know what's going on with your database: the abstraction can be leaky.
- Use artificial primary keys for everything, so the relations are easy to represent.
NoSQL
- A general term for modern non-relational databases.
- Doesn't describe a technology, or even specific type of database.
- Motivation: for very high volume sites with huge data stores, the relational/SQL/ACID paradigm is slow.
- Joins can be expensive with lots of data.
- Hard to distribute/parallelize across many servers: have to coordinate data and/or do distributed joins.
- ACID + transactions can also be expensive and unnecessary.
- ACID: Atomicity, Consistency, Isolation, Durability
- Locking/concurrency problems can slow things down, especially when on multiple servers.
- Maybe we don't care much about occasional missed transactions.
- If we're a bank: definitely care.
- If we're storing up/down votes on a link sharing site: 1% failure is totally fine.
- NoSQL DBs have simpler data models.
- key/value pairs: like a persistent Java map
- document store: a “record” is a collection of somewhat free-form data (with no schema).
- ACID rules are often relaxed as well.
- Eventual consistency: multiple servers may have different data temporarily, but will sync eventually.
- Or perhaps simply restrict what operations can be done in a “transaction”.
- Many different technologies are described as “NoSQL”.
- Each makes different tradeoffs of various DB properties.
- The goal is generally speed at the expense of more complicated data management.
- e.g. can't use traditional ORM; maybe need multiple database techs in your system.
- But…
- All published evidence seems anecdotal: no hard data sets with fair comparisons.
- Most seem to be using MySQL as the only representative of relational databases.
- Some poor performance claims for MySQL are hard to believe.
- Many of the NoSQL solutions aren't very mature: I like maturity in my data storage solutions.
- Using NoSQL:
- Probably think of NoSQL as an optimization, instead of part of initial design.
- Using a NoSQL store as an internal cache can be very easy and effective.
- e.g. cache result of some big calculation/query in memcached: retrieve instead of recalculating if possible.
- Summary message:
- There is more to the data storage world than relational/ACID databases.
- Some data is better suited to other tools.
- NoSQL sounds a little faddish.
- Also ask yourself how stable/proven any new data-storage tool is. Will you trust it with your data?s
MVC
- Model/View/Controller architectures.
- Model (模型)
- The information layer.
- Holds the information you store and contains logic for manipulating it.
- Usually classes/objects for the data you're representing.
- Often combined with an ORM to store in the database.
- Should generally be medium independent: model shouldn't care whether it's a web or GUI app.
- Model should combine object-oriented design and relational database design for your data.
- View (视图)
- The display layer.
- Arranges info for the user to see.
- In web apps, usually HTML templates: Cheetah, Razor, Velocity, or something integrated into your framework.
- Should contain no significant logic: that should be in M or C.
- Controller (控制器)
- Responds to user actions (and other events).
- In web apps: responds to HTTP requests.
- i.e. when user requests a page, the controller for that page takes over.
- Uses the model to get the right data; displays the data with a view.
- e.g. instantiates model objects (with the ORM) and renders a template by passing that data.
- Many web frameworks use an MVC pattern with an ORM.
- Rails, Django, CakePHP, Struts, .NET MVC2, …
- Some use the terms M/V/C differently, but the ideas are the same.