PostgreSQL 10 – New Features and Functionality

PostgreSQL remains a popular option for organizations that need a traditional SQL database but don’t want to spend the money required for Oracle. We’ve covered this open source database in the past here on the blog. For companies that want extra support, a commercial PostgreSQL option like EnterpriseDB is worth considering.

With PostgreSQL 10 scheduled for release later this year, many users are undoubtedly curious about the new features and functionality. Let’s take a closer look at what’s in the feature set so you can consider either an upgrade or using this new version on your next development project.

Improved Query Performance

One of the most important enhancements in PostgreSQL 10 is its faster query executor. The database is already known for performing essentially as fast as Oracle, so any additional speed boost is sure to make those benchmark comparisons even closer.

Robert Haas, a Vice President at EnterpriseDB and a major contributor to the PostgreSQL codebase, commented on the technical changes behind the executor’s performance boost. “Hash aggregation has been rewritten to use a more efficient hash table and store narrower tuples in it, and work has also been done to speed up queries that compute multiple aggregates and joins where one side can be proven unique,” said Haas.

Improved parallelism is another enhancement in PostgreSQL 10 aimed at boosting query performance. Haas noted that parallel queries now run two to four times faster in version 10. Index scans are also faster, since they can now be carried out by several worker processes in parallel.
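To make the idea of a parallel query a little more concrete, here is a minimal conceptual sketch in Python. It is not PostgreSQL’s implementation; it only mirrors the general shape of a parallel aggregate, where each worker computes a partial result over its own slice of rows and a leader combines them. The function names are hypothetical, and Python threads are used purely to show the structure, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_count(chunk):
    """Each worker computes a partial aggregate over its own slice of rows."""
    return sum(chunk), len(chunk)


def parallel_average(values, workers=4):
    """Split the rows among workers, then combine the partial results."""
    if not values:
        raise ValueError("cannot average an empty list")
    size = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, chunks))
    # The "leader" step: merge the per-worker sums and counts.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


rows = list(range(1, 101))
average = parallel_average(rows)
print(average)
```

The key design point is that the aggregate must be splittable: an average cannot be averaged again, so each worker returns a sum and a count instead.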

The new XMLTABLE support improves query processing against data stored internally as XML. This is the one PostgreSQL 10 enhancement aimed at the NoSQL market.
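For readers who haven’t met XMLTABLE before, the sketch below shows the underlying idea in Python: an XML document goes in and ordinary rows and columns come out. The document, the element names, and the `docs` table mentioned in the comment are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
DOC = """<orders>
  <order id="1"><customer>Ada</customer><total>19.99</total></order>
  <order id="2"><customer>Linus</customer><total>5.00</total></order>
</orders>"""

# A roughly equivalent PostgreSQL 10 query, assuming a hypothetical table
# docs(body xml), might look like this:
#   SELECT x.*
#   FROM docs,
#        XMLTABLE('/orders/order' PASSING docs.body
#                 COLUMNS id int PATH '@id',
#                         customer text PATH 'customer',
#                         total numeric PATH 'total') AS x;


def xml_to_rows(xml_text):
    """Return one (id, customer, total) tuple per <order> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        rows.append((
            int(order.get("id")),
            order.findtext("customer"),
            float(order.findtext("total")),
        ))
    return rows


rows = xml_to_rows(DOC)
for row in rows:
    print(row)
```

Once the data is in row form, it can be filtered, joined, and aggregated like any other table, which is the point of the feature.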

Replication is now Better – and Easier

PostgreSQL 10 now supports replication at the table level; the built-in replication in previous versions copied the entire database cluster. This additional flexibility comes with the bonus of being easier to use as well. Called Logical Replication, it is a feature greatly anticipated in the PostgreSQL community.
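As a rough sketch of what table-level replication looks like in practice, the Python helpers below build the two statements involved: `CREATE PUBLICATION` on the source server and `CREATE SUBSCRIPTION` on the target. The helper functions, the table names, and the connection string are hypothetical, and identifiers are not quoted or validated here, so treat this as an illustration and not production code.

```python
def publication_sql(name, tables):
    """Build the statement run on the publishing (source) server."""
    if not tables:
        raise ValueError("at least one table is required")
    return f"CREATE PUBLICATION {name} FOR TABLE {', '.join(tables)};"


def subscription_sql(name, conninfo, publication):
    """Build the statement run on the subscribing (target) server."""
    escaped = conninfo.replace("'", "''")  # escape quotes in the SQL literal
    return (
        f"CREATE SUBSCRIPTION {name} "
        f"CONNECTION '{escaped}' PUBLICATION {publication};"
    )


# Hypothetical names: replicate only two tables from a sales database.
pub = publication_sql("sales_pub", ["orders", "customers"])
sub = subscription_sql(
    "sales_sub", "host=primary.example.com dbname=sales", "sales_pub"
)
print(pub)
print(sub)
```

Only the tables named in the publication are replicated; the tables must already exist on the subscriber.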

Extended Statistics help with Query Planning

Developers who write complex queries against a PostgreSQL 10 instance benefit from extended statistics that help the query planning process. Haas explains this in more detail: “If the query planner makes a bad row count estimate resulting in a terrible plan, how do you fix it? With extended statistics, you can tell the system to gather additional statistics according to parameters that you specify, which may help it get the plan right.”
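A small worked example shows why a row count estimate can go wrong. When two columns are correlated, multiplying their individual selectivities (as if they were independent) underestimates the number of matching rows. The data and names below are invented for illustration, and the calculation is a simplification of what a planner does.

```python
# Hypothetical address rows in which the zip code fully determines the city.
rows = (
    [("Cincinnati", "45202")] * 50
    + [("Dayton", "45402")] * 30
    + [("Toledo", "43604")] * 20
)

# With extended statistics, the planner can be told about the dependency.
# Assuming a hypothetical table addresses(city, zip):
#   CREATE STATISTICS city_zip_stats (dependencies) ON city, zip FROM addresses;
#   ANALYZE addresses;


def selectivity(rows, index, value):
    """Fraction of rows whose column at `index` equals `value`."""
    return sum(1 for r in rows if r[index] == value) / len(rows)


def independent_estimate(rows, city, zip_code):
    """Row estimate that treats the two columns as unrelated."""
    return len(rows) * selectivity(rows, 0, city) * selectivity(rows, 1, zip_code)


def actual_count(rows, city, zip_code):
    """True number of rows matching both conditions."""
    return sum(1 for r in rows if r == (city, zip_code))


estimate = independent_estimate(rows, "Dayton", "45402")
actual = actual_count(rows, "Dayton", "45402")
print(f"estimated rows: {estimate:.0f}, actual rows: {actual}")
```

Here the independence assumption predicts roughly 9 rows when 30 actually match, the kind of gap that can push a planner toward a poor plan.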

Other PostgreSQL 10 Enhancements

Other significant PostgreSQL 10 improvements include Declarative Partitioning, which makes inserting new records faster, among other benefits. Support for SCRAM authentication enhances the security of a database instance. Durable Hash Indexes are another new feature aimed at boosting database performance.
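To illustrate what partitioning does at insert time, here is a minimal Python sketch of range-based routing: given a row’s date, find the one partition whose range contains it. The partition names and yearly ranges are hypothetical, and this only models the routing decision, not PostgreSQL’s internals.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical yearly partitions: (name, inclusive lower, exclusive upper).
# In PostgreSQL 10 these might be declared along the lines of:
#   CREATE TABLE measurement (logdate date, reading int)
#       PARTITION BY RANGE (logdate);
#   CREATE TABLE measurement_y2017 PARTITION OF measurement
#       FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
PARTITIONS = [
    ("measurement_y2016", date(2016, 1, 1), date(2017, 1, 1)),
    ("measurement_y2017", date(2017, 1, 1), date(2018, 1, 1)),
    ("measurement_y2018", date(2018, 1, 1), date(2019, 1, 1)),
]
LOWER_BOUNDS = [lower for _, lower, _ in PARTITIONS]


def route(logdate):
    """Return the name of the partition that should receive this row."""
    i = bisect_right(LOWER_BOUNDS, logdate) - 1
    if i < 0:
        raise ValueError(f"no partition for {logdate}")
    name, lower, upper = PARTITIONS[i]
    if not (lower <= logdate < upper):
        raise ValueError(f"no partition for {logdate}")
    return name


print(route(date(2017, 6, 15)))
```

A binary search over the sorted bounds keeps routing cheap even with many partitions. A row that fits no partition is rejected, which matches the behavior of PostgreSQL 10, where no default partition exists.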

One enhancement that may arrive in a later release is just-in-time compilation. This is expected to add yet another performance boost to any PostgreSQL implementation.

PostgreSQL 10 adds plenty of new functionality for current users as well as for organizations interested in an alternative to Oracle. While its NoSQL support remains limited, it is a traditional SQL database worthy of your interest. EnterpriseDB also offers commercial-level support for companies still wary of an open source solution.

Keep returning to the Betica Blog for additional dispatches from the software development world. Thanks for reading!

Apache Cassandra – the Highly Scalable NoSQL Database

The NoSQL database movement happened because traditional relational databases simply don’t work as well in the highly distributed environments typical of today’s Web infrastructure. We recently covered the NoSQL graph database, Neo4j, here at the blog. It serves the needs of those looking to find relationships between records within huge Big Data stores.

This time out, we turn our attention to Apache Cassandra. Leveraging a key-value storage model, Cassandra offers high scalability and low latency across widely distributed data centers. Read further to see if this NoSQL database makes sense for your organization’s data management needs.

The Genesis of Cassandra

Cassandra began as an internal project at Facebook, where it was built to power the social network’s inbox search feature. Facebook released the project into the open source community in 2008. It became an Apache Software Foundation top-level project in 2010 after a period in the Apache Incubator.

The latest release of Cassandra, version 3.10, became available in February of this year. As a fully open source database, it is free to download, and its cross-platform support for most popular operating systems makes it easy to try out on a pilot project at your organization. Driver support exists for many current programming languages, like Java (using JDBC), Python, Node.js, Go, and C++.

Enterprises looking for a commercial NoSQL solution built upon Cassandra need to check out DataStax’s offerings. That company is known as the leading commercial provider of support for the database.

Cassandra’s Features and Functionality

Highly scalable distributed performance is Cassandra’s major calling card. DataStax provides a white paper summarizing third-party benchmarks of a few of the most popular NoSQL databases (MongoDB, Couchbase, HBase), in which Cassandra came out as the top performer by a wide margin. Fault tolerance and replication are also handled seamlessly across multiple data centers, an important feature considering the modern global business landscape.

Impressive scalability also distinguishes Cassandra from similar NoSQL database products. Many enterprise users of the database boast massive production deployments, highlighted by Apple’s 10 petabytes of data spread over 75,000 nodes. Netflix also stores 420 terabytes of data across 2,500 nodes. Needless to say, Cassandra has rapidly become the database of choice for these enormous chunks of Big Data.
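A quick back-of-the-envelope calculation puts those deployment figures in perspective. The sketch below assumes decimal units (1 PB = 1,000 TB, 1 TB = 1,000 GB) and simply divides the quoted totals by the quoted node counts; it says nothing about how the data is actually distributed or replicated.

```python
TB_PER_PB = 1000
GB_PER_TB = 1000


def gb_per_node(total_tb, nodes):
    """Average data volume per node, in gigabytes."""
    return total_tb * GB_PER_TB / nodes


apple = gb_per_node(10 * TB_PER_PB, 75_000)   # 10 PB over 75,000 nodes
netflix = gb_per_node(420, 2_500)             # 420 TB over 2,500 nodes

print(f"Apple:   about {apple:.0f} GB per node")
print(f"Netflix: about {netflix:.0f} GB per node")
```

Both work out to well under 200 GB per node on average, which suggests these clusters scale by adding many modest nodes and not by packing huge volumes onto each one.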

The database’s architecture has no single point of failure, so access to the data isn’t hampered by large amounts of network traffic or by the loss of an individual node. Because every node is identical and data is replicated across nodes, an entire data center can go offline without any loss of data, provided replicas exist in another data center. This kind of durability makes Cassandra very attractive to businesses with mission-critical applications, and built-in support for multiple data centers is another plus.
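The idea of identical nodes holding replicated data can be sketched with a simple token ring. This is a deliberately simplified model, assuming one token per node, MD5 as a stand-in hash, and replicas placed on the next nodes clockwise; real Cassandra clusters use their own partitioner, virtual nodes, and data-center-aware placement. All names are hypothetical.

```python
import hashlib
from bisect import bisect_right


def token(value):
    """Map a string to a position on the ring (MD5 is only a stable hash here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy ring in which every node plays the same role."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect_right(self.tokens, token(key)) % len(self.entries)
        count = min(self.replication_factor, len(self.entries))
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(count)
        ]


def readable(ring, key, down):
    """A key stays readable while at least one replica is still up."""
    return any(node not in down for node in ring.replicas(key))


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
print(ring.replicas("user:42"))
print(readable(ring, "user:42", down={"node-a", "node-b"}))
```

With three copies of every key, any two nodes can fail and each key still has a live replica, which is the property that removes the single point of failure.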

Adding new servers to a deployment is also a breeze, according to DataStax’s lead Cassandra evangelist, Patrick McFadin. “You simply boot up a new machine and tell Cassandra where the other nodes are and it takes care of the rest,” said McFadin.
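Part of the reason adding a machine is so painless is the hashing scheme behind the ring. In the simplified sketch below (one token per node, MD5 as a stand-in hash, hypothetical node names), adding a fifth node moves only the keys that fall into the new node’s range; every other key stays where it was.

```python
import hashlib
from bisect import bisect_right


def token(value):
    """Map a string to a position on the ring (MD5 is only a stable hash here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def owner(nodes, key):
    """Return the first node clockwise from the key's token."""
    entries = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in entries]
    i = bisect_right(tokens, token(key)) % len(entries)
    return entries[i][1]


def moved_keys(old_nodes, new_nodes, keys):
    """Keys whose owner changes when the cluster membership changes."""
    return [k for k in keys if owner(old_nodes, k) != owner(new_nodes, k)]


keys = [f"user:{i}" for i in range(1000)]
old = ["node-a", "node-b", "node-c", "node-d"]
new = old + ["node-e"]

moved = moved_keys(old, new, keys)
print(f"{len(moved)} of {len(keys)} keys moved, all to node-e")
```

Compare this with a naive `hash(key) % node_count` scheme, where changing the node count would reshuffle most keys across the whole cluster.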

Superior horizontal scalability combined with ease of administration makes Cassandra a worthy option for businesses looking to embrace NoSQL for their modern database needs. Its driver support for most popular languages lets developers come up to speed quickly. This is one open source database worth checking out.

Stay tuned to the Betica Blog for additional dispatches from the wide world of software development. As always – thanks for reading!