PostgreSQL®, Docker, and Shared Memory

A fast and convenient (and common) way to deploy PostgreSQL® is in a Docker container. However, this can have some unexpected complications since PostgreSQL relies heavily on shared memory, and Docker handles shared memory differently from a traditional system. This means some extra care is needed to ensure your PostgreSQL system is running as well as it can. 

What is Shared Memory? 

A computer will normally have several different types of memory, each with different access latency and read and write speeds. The fastest is the cache that lives on the same chip as your CPU, and the slowest is persistent storage such as a hard drive. Somewhere in the middle of this spectrum is random access memory, commonly known as RAM, and sometimes confusingly called simply "memory."

RAM can be used for many different functions, but today we are specifically interested in shared memory. 

Each process is assigned some RAM space to work with by the operating system. Once that space has been assigned, this memory cannot* be accessed by any other process on the system. This isolation means that processes can't just peek at each other's memory to collaborate on work. While there is more than one way around this limitation, a common and robust one is to use shared memory assigned by the operating system.

A process can request a shared memory space from the OS, which is then created independently of the requesting process and assigned an identifier. Other processes can use that same identifier to access the same shared memory space.
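On Linux, POSIX shared memory objects created this way appear as files under /dev/shm, which is a handy way to see the "identifier" idea in practice. A loose illustration (the name demo_segment is made up, and real shared memory is created with shm_open rather than shell redirection, but it lands in the same place):

# Terminal 1: one process creates a named segment under /dev/shm (a tmpfs)
echo "hello from process A" > /dev/shm/demo_segment

# Terminal 2: a completely separate process can open it by the same name
cat /dev/shm/demo_segment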

 (*This memory can be accessed through “unsafe” means, but this makes developers unhappy.)

Why Does it Impact PostgreSQL? 

Now we arrive at a more pertinent question: what does this have to do with PostgreSQL?  

Because of the way PostgreSQL is designed, it uses multiple processes to achieve its functionality. Some of these are background services that a user will typically not interact with, while others are backend processes responsible for directly executing the queries created by users: hashing, filtering, joining, etc.

PostgreSQL uses shared memory for things such as the shared buffers, a buffer for write-ahead logs (known as the "WAL buffer"), and the commit log that tracks transaction status for concurrency control within the database. This is not an exhaustive list, and each function can use vastly different amounts of shared memory.
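If you are curious how large some of these regions are on your own instance, you can ask PostgreSQL directly. A quick sketch, assuming a container named pg and the default postgres superuser (both illustrative):

docker exec -it pg psql -U postgres -c 'SHOW shared_buffers;' -c 'SHOW wal_buffers;'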

It is important to note that PostgreSQL allocates shared memory in more than one way: the main fixed-size region, which includes "shared_buffers", is created with memory-mapped files (mmap), while dynamic shared memory for things like parallel query workers is, by default on Linux, created with shm_open and therefore lives in /dev/shm. The mmap-based allocation will not be covered here, as we're only interested in the /dev/shm (shm_open) method used for dynamic shared memory allocation.
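You can check which method your instance uses for dynamic shared memory and see the resulting segments. A sketch, again assuming a container named pg and a recent PostgreSQL version:

# How dynamic shared memory segments are created ("posix" means shm_open, i.e. /dev/shm)
docker exec -it pg psql -U postgres -c 'SHOW dynamic_shared_memory_type;'

# The dynamic segments themselves show up in /dev/shm as files named PostgreSQL.<number>
docker exec pg ls -lh /dev/shm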

Docker and SHM-Size 

All this is well and good in a normal PostgreSQL setup. However, if you want to run PostgreSQL in a Docker container, as many do, there is an additional hitch. Docker containers manage their shared memory differently, since their whole appeal is broad compatibility across even the cheapest smart toasters. This means that instead of allowing the shared memory space to grow to fill any and all available memory, Docker places an artificial limit on it. By default, this limit is 64MB. Now, if you're from 1969 or you haven't worked with software outside of an integrated circuit, you might think that's more than enough (enough for a trip to the moon, at least!), but it is substantially less than what modern systems expect.
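You can see this default for yourself by checking the size of /dev/shm inside a fresh container (image tag and exact output are illustrative):

docker run --rm postgres:16 df -h /dev/shm
# Filesystem      Size  Used Avail Use% Mounted on
# shm              64M     0   64M   0% /dev/shm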

With light personal use, this limit is unlikely to present a problem as PostgreSQL can comfortably work with less when it is managing only small amounts of data. However, if you start doing heavier work with the database, you might encounter an error like this: 
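The exact segment name and byte count will vary, but it typically looks something like this (values illustrative):

ERROR:  could not resize shared memory segment "/PostgreSQL.1450751626" to 12615680 bytes: No space left on device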

This means that PostgreSQL has tried to grow its shared memory and found there was no room left to grow into. To fix this, we need to tell Docker to allow the container a larger shared memory space.

This is done by passing the "--shm-size" parameter to the "docker run" command. For example:
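A minimal example, with the container name, password, and 1GB size chosen purely for illustration:

docker run --name my-postgres \
  --shm-size=1g \
  -e POSTGRES_PASSWORD=mysecretpassword \
  -d postgres

The value accepts the usual size suffixes (such as m and g). If you deploy with Docker Compose instead, the equivalent service-level setting is shm_size.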

It is important to remember, as mentioned above, that shared buffers don't use the same /dev/shm memory as other shared memory. While the two can be correlated through the general workload on the system, increasing the /dev/shm size won't directly impact shared buffers, and vice versa.

It is difficult to provide a simple recommendation or rule of thumb for this parameter, as several factors influence it: your maximum number of parallel workers, table sizes, dynamic_shared_memory_type, min_dynamic_shared_memory, and the number of expected concurrent users. For sensitive workloads, aim to overestimate your requirements, as an unexpected "out of memory" error can be the difference between smooth operation and an unscheduled maintenance outage.
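One practical approach is to watch how much of /dev/shm a representative heavy workload actually uses and then size with generous headroom. A rough sketch, assuming a container named pg:

# While your heaviest queries are running, check /dev/shm usage inside the container
docker exec pg df -h /dev/shm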

Conclusion 

Whether you're planning to set up a new PostgreSQL instance in Docker or tuning a running one, there are a lot of different parameters and configurations to consider. Here, we have discussed what shared memory is, why PostgreSQL needs it, and how to increase the amount available to a Docker container.

Or, instead of worrying about picking this or any other magic number to make your database work, head over to Instaclustr at https://console2.instaclustr.com/ and get a Managed PostgreSQL instance up and running in less than 5 minutes. 

Author

Josh Mackay, Product Development