Forget about joins and SQL and try NoSQL databases – specifically MongoDB, the leading example
MongoDB is an open source, document-oriented database system written in C++ by Dwight Merriman and Eliot Horowitz. It runs on UNIX machines as well as Windows and supports both replication and sharding (also known as horizontal partitioning), the process of splitting a single database across a cluster of machines. Drivers are available for many programming languages, including C, C++, Erlang, Haskell, Perl, PHP, Python, Ruby and Scala. MongoDB is suitable for many tasks, including archiving, event logging, storing documents, agile development, real-time statistics and analysis, gaming, and mobile and location services. This article shows you how to store Apache log files in a MongoDB database with the help of a small Python script. We’ll also demonstrate how to implement replication in MongoDB. The replica set consists of nodes 192.168.2.4 (port 27019), 192.168.1.10 (port 27019) and 192.168.2.3 (port 27018).
Step 01
Connecting to MongoDB for the first time
Your Linux distribution probably includes a MongoDB package, so go ahead and install it. Alternatively, you can download a precompiled binary or get the source code from www.mongodb.org and compile it yourself. After installation, type mongo --version to find out which MongoDB version you are using, then type mongo to start the MongoDB shell and check whether the MongoDB server process is running.
Step 02
MongoDB terminology
NoSQL databases are designed for the web and do not support joins, complex transactions and other features of the SQL language. You can update a MongoDB database schema without downtime, but you should design your MongoDB database knowing that joins are not available. MongoDB's terminology is a little different from that of relational databases, so you should familiarise yourself with it.
Step 03
The _id field
Every time you insert a BSON document that does not already have an _id field, MongoDB automatically generates one. The _id field acts as the primary key, and an automatically generated value is an ObjectId, which is always 12 bytes long. To find the creation time of the object with _id ‘51cb590584919759671e4687’, execute the following command from the MongoDB shell:
> ObjectId("51cb590584919759671e4687").getTimestamp()
Note: You should remember that queries are case-sensitive.
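The creation time can be recovered because the first four bytes of an ObjectId hold a big-endian count of seconds since the Unix epoch. The following is a minimal Python sketch of that decoding, using only the standard library; the function name objectid_creation_time is made up for this illustration and is not part of MongoDB or PyMongo.

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex):
    """Return the UTC creation time encoded in a 24-character ObjectId string.

    Hypothetical helper for illustration: it only decodes the leading
    4-byte timestamp and ignores the remaining 8 bytes of the ObjectId.
    """
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    raw = bytes.fromhex(oid_hex)
    # Bytes 0-3: seconds since the Unix epoch, stored big-endian.
    seconds = int.from_bytes(raw[:4], byteorder="big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_creation_time("51cb590584919759671e4687")
print(created.isoformat())
```

If PyMongo is installed, the generation_time attribute of bson.ObjectId should give you the same value without any manual decoding.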
Step 04
Inserting an Apache log file into MongoDB
Now that you know a little about MongoDB, it is time to do something interesting and useful: inserting an Apache log file into a MongoDB database with a Python script. The Python script is executed as follows:
…where www6.ex000704.log.gz is the name of the log file, which is compressed to save disk space.
Step 05
The storeDB.py Python script
The storeDB.py script uses the PyMongo Python module to connect to MongoDB. The MongoDB server is running on localhost and listens on port 27017. The host and its port number are hard-coded inside the script, so change them to match yours. For every inserted BSON document, the script prints its _id field on screen. Finally, it prints the total number of documents inserted in the MongoDB database.
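The listing of storeDB.py is not reproduced here, so the following is only a rough sketch of the step that has to happen before insertion: turning each line of the compressed log into a dictionary that can be stored as a document. It assumes the log uses Apache's common log format; the field names Host, Time, Request and Size are invented for this illustration, while StatusCode (kept as a string) matches the field queried later in this article.

```python
import gzip
import re

# Assumed layout: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<Host>\S+) \S+ \S+ \[(?P<Time>[^\]]+)\] '
    r'"(?P<Request>[^"]*)" (?P<StatusCode>\d{3}) (?P<Size>\S+)'
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def read_documents(path):
    """Yield one document (dict) per parsable line of a gzip-compressed log."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            doc = parse_line(line)
            if doc is not None:
                yield doc
```

Reading the file through gzip.open means the log never has to be decompressed on disk, and yielding documents one at a time keeps memory use flat even for large logs.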
Step 06
Connecting to MongoDB using PyMongo
You first need to connect to MongoDB using:
You then select the database name you want (LUD) using the following line of code:
db = connMongo.LUD
And finally you select the name of the collection (apacheLogs) to store the data:
logs = db.apacheLogs
After finishing your interaction with MongoDB, you should close the connection as follows:
connMongo.close()
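Put together, the connect, select, insert and close sequence might look like the sketch below. It assumes PyMongo 3 or later, where MongoClient and insert_one are available; the original script may well have used an older API. The function names store_documents and run_against_local_server are hypothetical, and the database and collection names are the ones used in this article.

```python
def store_documents(client, documents, db_name="LUD", coll_name="apacheLogs"):
    """Insert each document, print its _id and return how many were inserted.

    `client` is expected to behave like a pymongo.MongoClient: indexing it
    by name gives a database, and indexing that by name gives a collection.
    """
    logs = client[db_name][coll_name]
    total = 0
    for doc in documents:
        result = logs.insert_one(doc)
        print(result.inserted_id)
        total += 1
    return total


def run_against_local_server(documents):
    """Hypothetical driver: assumes a MongoDB server on localhost:27017."""
    from pymongo import MongoClient  # imported here so the sketch loads without PyMongo

    connMongo = MongoClient("localhost", 27017)
    try:
        total = store_documents(connMongo, documents)
        print("Documents inserted:", total)
    finally:
        # Close the connection even if an insert fails.
        connMongo.close()
```

Keeping the insertion loop separate from the connection handling makes it easy to point the same code at a different host, port or collection.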
Step 07
Displaying BSON documents from the apacheLogs collection
Type the following in order to connect to the MongoDB shell:
$ mongo
Select the desired database as follows:
> use LUD
See the available collections for the LUD database as follows:
> show collections
apacheLogs
system.indexes
Lastly, execute the following command to see all the contents of the apacheLogs collection:
> db.apacheLogs.find()
If the output is long, type ‘it’ to go to the next screen.
Step 08
A replication example
Imagine that you have your precious data on your MongoDB server and there is a power outage. Can you access your data? Is your data safe? To avoid such difficult questions, you can use replication to keep your data both safe and available. Replication also allows you to perform maintenance tasks without downtime and to run MongoDB servers in different geographical areas.
Step 09
Running the three MongoDB servers from the command line
For this example, you need three MongoDB server processes running. We ran the three MongoDB servers, on their respective machines, as follows:
Note: You are going to see lots of output on your screen.
Step 10
More information about the three MongoDB servers
When you start each MongoDB server, you should specify the name of the replica set (LUDev). The data directory given by the --dbpath parameter must already exist. You do not necessarily need three discrete Linux machines: you can use the same machine (IP address) as long as each server uses a different port number and data directory.
Step 11
The rs.initiate() command
Once you have your MongoDB server processes up and running, you should run the rs.initiate() command to actually create and enable the replica set. If everything is okay, you will see similar output on your screen. If the MongoDB server processes are running successfully, most errors come from misspelled IP addresses or port numbers. The rs.initiate() command is simple but has a huge impact!
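Since misspelled addresses and ports are the usual cause of trouble, it can help to build and check the replica set configuration before handing it to MongoDB. The sketch below assembles a configuration document of the shape that rs.initiate() accepts as an optional argument. The helper names are hypothetical, the check assumes members are given as IP address and port (hostnames would be rejected), and the three members are copied from the introduction of this article, so adjust them to the ports your servers actually listen on.

```python
import ipaddress


def check_member(host):
    """Return host unchanged if it looks like 'IP:port', else raise ValueError."""
    address, sep, port = host.rpartition(":")
    if not sep:
        raise ValueError("missing port in %r" % host)
    ipaddress.ip_address(address)  # raises ValueError for a malformed IP
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError("bad port in %r" % host)
    return host


def build_replica_config(set_name, hosts):
    """Build a replica set configuration document with one entry per host."""
    return {
        "_id": set_name,
        "members": [
            {"_id": index, "host": check_member(host)}
            for index, host in enumerate(hosts)
        ],
    }


config = build_replica_config(
    "LUDev",
    ["192.168.2.4:27019", "192.168.1.10:27019", "192.168.2.3:27018"],
)
print(config)
```

With PyMongo, a document like this could be sent to one of the servers as the argument of the replSetInitiate admin command, which is what rs.initiate() issues from the shell.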
Step 12
Information about replication
Any data-bearing node can become primary, but only one node can be primary at a given time. All write operations are executed on the primary node. Read operations go to the primary by default and can optionally be served by a secondary node. MongoDB performs automatic failover and automatic recovery. Replication is not a substitute for backup, so you should not forget to take backups.
Step 13
More information about replication
If the former primary recovers, it rejoins the set as a secondary. Every node contacts the other nodes every few seconds to make sure that everything is okay. You are advised to read from the primary node, because it is the only one that is guaranteed to contain the latest data. Because any data-bearing node may have to take over as primary, all the machines of a replica set should be equally powerful so that each can handle the full load of the MongoDB database.
Step 14
The rs.status() command output
The rs.status() command shows you the current status of your replica set. It is the first command to execute when you want to find out what is going on. Apart from primary and secondary nodes, a third type of node exists, called an arbiter. An arbiter node does not have a copy of the data and cannot become primary. Arbiter nodes are only used for voting in elections for a primary node.
Step 15
Selecting a new primary node
If you shut down the primary MongoDB server (by pressing Ctrl+C), the logs of the remaining two MongoDB servers will show the failure of the 192.168.1.10:27018 MongoDB server:
Mon Jul 1 11:21:29.371 [rsHealthPoll] couldn't connect to 192.168.1.10:27018: couldn't connect to server 192.168.1.10:27018
Mon Jul 1 11:21:29.371 [rsHealthPoll] couldn't connect to 192.168.1.10:27018: couldn't connect to server 192.168.1.10:27018
It takes about 30 seconds for the new primary server to come up, and the new status can be seen by running the rs.status() command. Important note: an election needs a majority of the replica set's voting members, so once a primary node is down, more than 50 per cent of all the nodes in the set must still be running and able to reach each other in order to select a new primary server. In a three-node set, the two surviving nodes are enough.
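The election rule is easy to check with a little arithmetic. The sketch below assumes that electing a primary requires a strict majority of the set's voting members, counted against the full size of the set rather than against the nodes that are still up; the function names are made up for this illustration.

```python
def votes_needed(voting_members):
    """Smallest number of votes that is a strict majority of the set."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable):
    """True if the reachable members can still form a majority."""
    return reachable >= votes_needed(voting_members)


for size in (3, 4, 5):
    tolerated = size - votes_needed(size)
    print(size, "members:", votes_needed(size), "votes needed,",
          tolerated, "failure(s) tolerated")
```

Under this assumption a four-node set tolerates no more failures than a three-node set, which is why replica sets usually have an odd number of voting members, with an arbiter added when necessary.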
Step 16
Trying to write data to a non-master node
If you try to write to a non-master node, MongoDB will not allow you and will generate an error message.
Step 17
Useful MongoDB commands
Delete the full apacheLogs collection:
> db.apacheLogs.drop()
Show available databases:
> show dbs
Find documents within the apacheLogs collection that have a StatusCode of 404:
> db.apacheLogs.find({"StatusCode" : "404"})
Connect to the 192.168.1.10 server using port number 27017:
$ mongo 192.168.1.10:27017
Step 18
Hints and tips
It is highly recommended that you first run find() to verify your criteria before actually deleting the data with remove(). Should you need to change the database schema and add another field, MongoDB will not complain and will do it for you without any problems or downtime. The way to handle very large datasets is through sharding. MongoDB also has its own mechanism for storing and retrieving large files, called GridFS.
Title: Create and save data with a MongoDB database