BSON
We've been working with JSON files for a while now, and we know that they store different data types, such as strings, numbers, arrays and more, in an organized way. But JSON stores this information as text, which has to be parsed character by character before a program can use it, and it supports only a small set of data types.
MongoDB uses a data format called BSON. BSON documents are very similar to JSON documents, but they are stored in a binary representation with explicit type and length information, which makes them faster for a machine to traverse and decode than JSON text. The binary form is not always smaller than the equivalent JSON, though, because those extra type and length fields take up space.
Here is a chart showing the main differences:
Property | JSON | BSON |
Encoding | UTF-8 String | Binary |
Data Support | String, Boolean, Number, Array, Object, Null | String, Boolean, Number (Integer, Float, Long, Decimal128, ...), Array, Object, Null, Date, Raw Binary |
Readability | Human and Machine | Machine Only |
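To make the "Encoding" row concrete, here is a minimal sketch that hand-encodes the one-field document {"hello": "world"} using only the Python standard library and prints it next to its JSON text. The helper names (encode_string_element, encode_document) are made up for this illustration and only cover string values; in a real application a BSON library does this work for you.

```python
import json
import struct


def encode_string_element(name: str, value: str) -> bytes:
    """Encode one BSON string element (hypothetical helper, strings only)."""
    key = name.encode("utf-8") + b"\x00"     # field name as a null-terminated string
    data = value.encode("utf-8") + b"\x00"   # value bytes plus a trailing null
    # 0x02 is the BSON type byte for a UTF-8 string; the value is prefixed
    # with its byte length as a little-endian 32-bit integer.
    return b"\x02" + key + struct.pack("<i", len(data)) + data


def encode_document(elements: bytes) -> bytes:
    """Wrap already-encoded elements in a BSON document envelope."""
    # The total length counts its own 4 bytes and the closing null byte.
    total = 4 + len(elements) + 1
    return struct.pack("<i", total) + elements + b"\x00"


doc = {"hello": "world"}

as_json = json.dumps(doc).encode("utf-8")
as_bson = encode_document(encode_string_element("hello", "world"))

print(as_json)  # b'{"hello": "world"}'
print(as_bson)  # b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
print(len(as_json), len(as_bson))  # 18 22
```

The JSON version is readable as-is, while the BSON version mixes the text with raw length and type bytes that only a program can interpret. For a document this small the BSON form is a few bytes longer than the JSON text; how the two sizes compare depends on the data.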
BSON, or Binary JSON, is the data format that MongoDB uses to organize and store data. This data format includes all JSON data structure types and adds support for types including dates, integers of different sizes, ObjectIds, and binary data. You can use BSON documents in your Python application by importing the bson package, which is installed together with PyMongo. For a complete list of supported types, see the BSON Types server manual page.
BSON documents are stored in MongoDB collections in binary format, while PyMongo represents BSON documents as Python dictionaries. PyMongo automatically converts Python dictionaries into BSON documents when inserting them into a collection. Likewise, when you retrieve a document from a collection, PyMongo converts the BSON document back into a Python dictionary.
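The conversion described above happens inside ordinary PyMongo calls, so application code only ever handles dictionaries. The following is a small sketch of that round trip; the function name save_and_reload, the connection string, and the database and collection names in the usage comment are all assumptions, and running it for real requires a reachable MongoDB server.

```python
from typing import Any, Optional


def save_and_reload(collection: Any, document: dict) -> Optional[dict]:
    """Insert a dict and read it back (hypothetical helper for illustration).

    `collection` is expected to behave like a PyMongo collection.
    """
    # PyMongo encodes the dictionary as BSON before sending it to the server.
    # If the dict has no "_id" key, insert_one adds one to it.
    result = collection.insert_one(document)

    # The server returns BSON; PyMongo decodes it back into a dictionary.
    return collection.find_one({"_id": result.inserted_id})


# Usage, assuming a MongoDB server is listening on localhost:27017 and
# using made-up database and collection names:
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   stored = save_and_reload(client["demo_db"]["articles"], {"title": "BSON"})
#   print(type(stored), stored)
```

Nothing in this code mentions BSON explicitly, which is the point: the encoding and decoding steps are handled by the driver.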