Greenplum DBA | Facts About Greenplum DBA
Last updated on Wed 18 Mar 2020
Greenplum Database builds on the foundations of the open source database PostgreSQL. It primarily functions as a data warehouse and uses a massively parallel processing (MPP) architecture. Greenplum DB is partitioned across multiple segment servers, and each segment owns and manages a distinct portion of the overall data.
Some Interesting Facts About Greenplum DBA:
- Greenplum DB is an array of individual databases based upon PostgreSQL 8.2 working together to present a single database image.
- Greenplum DB stores and processes large amounts of data by distributing the data and the processing workload across several servers.
- The Greenplum DB master is the database instance to which clients connect and submit SQL statements.
- User-defined tables and their indexes are distributed across the available segments in the Greenplum Database system.
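To make the idea of distributing table rows across segments concrete, here is a minimal sketch in Python. It assumes a simple CRC32 hash taken modulo the segment count; this is an illustration only and is not Greenplum's actual hash function. The names `NUM_SEGMENTS`, `segment_for`, and `distribute` are hypothetical.

```python
import zlib
from collections import Counter

NUM_SEGMENTS = 4  # hypothetical cluster size, chosen for illustration


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution-key value to a segment index (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(keys, num_segments: int = NUM_SEGMENTS) -> Counter:
    """Count how many rows would land on each segment."""
    return Counter(segment_for(k, num_segments) for k in keys)


if __name__ == "__main__":
    rows = [f"customer-{i}" for i in range(10_000)]
    for seg, count in sorted(distribute(rows).items()):
        print(f"segment {seg}: {count} rows")
```

Because the same key always hashes to the same segment, each row has exactly one owner, which matches the description of every segment managing a distinct portion of the data.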
The Greenplum DB master is the entry point to the Greenplum Database system. It accepts client connections and SQL queries and distributes the work to the segment instances.
Greenplum Database end users interact with Greenplum DB (through the master) as they would with a typical PostgreSQL database. They connect to the database using client programs such as psql or APIs such as JDBC or ODBC. The master is where the global system catalog resides. The global system catalog is the set of system tables that contain metadata about the Greenplum DB system itself. The master does not contain any user data; data resides only on the segments.
The master:
- authenticates client connections
- processes incoming SQL commands
- distributes workloads among segments
- coordinates the results returned by each segment, and
- presents the final results to the client program
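The dispatch-and-combine role of the master can be sketched as a small scatter/gather simulation. This is not Greenplum code: the `SEGMENTS` dictionary, `segment_partial_sum`, and `master_total_and_count` are hypothetical names, and threads stand in for separate segment servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-segment data: each segment holds a distinct slice of one table.
SEGMENTS = {
    0: [120, 80, 45],
    1: [200, 15],
    2: [60, 60, 60, 10],
}


def segment_partial_sum(rows):
    """Work done locally on one segment: aggregate only the rows it owns."""
    return sum(rows), len(rows)


def master_total_and_count(segments):
    """Send the same task to every segment in parallel, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(segment_partial_sum, segments.values()))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count


if __name__ == "__main__":
    total, count = master_total_and_count(SEGMENTS)
    print(f"total={total}, rows={count}, average={total / count:.2f}")
```

Note that the master function never touches the raw rows; it only sees the small partial results, which mirrors the point that user data lives on the segments.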
Greenplum DB segments are independent PostgreSQL databases that each store a portion of the data and perform the query processing.
When a user connects to the database via the Greenplum master and issues a query, processes are created in each segment database to handle the work of that query.
The servers on which segments run are called segment hosts. A segment host typically runs two to eight Greenplum segments, depending on its CPU cores, RAM, storage, network interfaces, and workloads. The key to obtaining the best performance from Greenplum DB is to distribute data and workloads evenly across a large number of capable segments, so that all segments can begin working on a task simultaneously and complete their work at the same time.
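A rough way to see why even distribution matters: a parallel step finishes only when its most heavily loaded segment finishes. The sketch below uses made-up row counts and an assumed processing rate of one million rows per second per segment; `completion_time` and `skew_ratio` are hypothetical helper names, not Greenplum functions.

```python
def completion_time(rows_per_segment, rows_per_second=1_000_000):
    """A parallel step takes as long as its slowest (most loaded) segment."""
    return max(rows_per_segment) / rows_per_second


def skew_ratio(rows_per_segment):
    """Largest segment load divided by the mean load; 1.0 means perfectly even."""
    mean = sum(rows_per_segment) / len(rows_per_segment)
    return max(rows_per_segment) / mean


if __name__ == "__main__":
    even = [2_500_000] * 4                              # 10M rows spread evenly
    skewed = [7_000_000, 1_000_000, 1_000_000, 1_000_000]  # same 10M rows, skewed
    for name, load in (("even", even), ("skewed", skewed)):
        print(name, completion_time(load), "s, skew ratio", skew_ratio(load))
```

Both layouts hold the same ten million rows, yet under these assumptions the skewed layout takes 7.0 seconds instead of 2.5 because three segments sit idle while one does most of the work.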
The interconnect is the networking layer of Greenplum DB. It refers to the inter-process communication between the segments and the network infrastructure on which this communication relies. The Greenplum interconnect uses a standard 10-Gigabit Ethernet switching fabric and sends messages over the network using the User Datagram Protocol (UDP).
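For readers unfamiliar with UDP, the following sketch sends a single datagram over the loopback interface using Python's standard `socket` module. It shows only the connectionless send/receive pattern of UDP in general; it does not reproduce Greenplum's interconnect protocol, and `udp_round_trip` is a hypothetical helper name.

```python
import socket


def udp_round_trip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one UDP datagram to a local receiver and return what was received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(timeout)      # UDP gives no delivery guarantee, so do not wait forever
        sender.sendto(payload, receiver.getsockname())  # no connection setup is needed
        data, _addr = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_round_trip(b"tuple-batch-1"))
```

UDP itself provides no acknowledgement or retransmission, so any software that builds on it has to handle lost or reordered packets on its own.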
Pivotal Query Optimizer:
The Pivotal Query Optimizer extends the planning and optimization capabilities of the Greenplum DB legacy optimizer. The PQO is extensible and achieves better optimization in multi-core architecture environments. When it is enabled, Greenplum DB uses the PQO to generate an execution plan for a query when possible. It also enhances Greenplum Database query performance tuning in the following areas:
- Queries against partitioned tables
- Queries that contain a common table expression (CTE)
- Queries that contain subqueries
Greenplum DBA Tools:
Aqua Data Studio is one tool that offers a suite of Greenplum DBA features for monitoring, maintaining, and tuning the Greenplum server. It mainly includes the:
- Instance Manager
- Storage Manager
- Security Manager, and
- Session Manager
These managers allow tablespaces, database objects, sessions, users, and permissions to be monitored and maintained through a GUI.