Technical Technical — Postgres Monday 24th May 2021

PostgreSQL Data Types: Mappings to SQL, JDBC, and Java Data Types

By Paul Brebner

1. Data Types

A long time ago at university, I learned my first high-level programming language, Pascal, from the book by Niklaus Wirth:

Algorithms + Data Structures = Programs

But I didn’t really learn much about databases studying computer science, as they were taught by another department (business systems). So maybe now’s the time to start with an imaginary book called something like:

Queries + Data Types = Databases?

Data types have a long and important history in computing, driven initially by word lengths and machine data types but becoming more powerful and abstract as computer science matured. One popular 1980’s magazine was even named after a data type (“BYTE”, early microprocessors such as the 8008, Z80, and 6800 were characterized by BYTE/8-bit word sizes, in an era when the PDP-11 had 16-bit words, the VAX had 32-bit words, and the Cray-1 a massive 64-bit word size).

Byte magazine December 1975 with two data types on the cover (BYTE and Character!)
Byte magazine December 1975 with two data types on the cover (BYTE and Character!)
(Source: Wikimedia)

2. PostgreSQL Data Types

“Data Types” is a popular PostgreSQL search, so I decided to do some investigation of my own into why they are so important. First of all, why do data types matter in PostgreSQL? Doing some preliminary research I found out that data types in PostgreSQL are important for at least the following aspects (possibly more!): 

  1. As column data types when creating a table
  2. For functions and operators
  3. For constraints
  4. For creating types and domains, and
  5. When using PostgreSQL from a programming language (e.g. PostgreSQL to/from Python, and “C”).

PostgreSQL has a lot of built-in data types that are described in Chapter 8 of the documentation. And you can add new data types, so I guess there are really an infinite number of data types possible.

There’s a table that enumerates at least 43 built-in data types, and reveals that along with the official name some types have aliases (used internally for historical reasons). For example “real” has the alias “float4” (a single precision 4-byte floating-point number).

Here’s the full table which shows the variety of data types available:

NameAliasesDescription
bigintint8signed eight-byte integer
bigserialserial8autoincrementing eight-byte integer
bit [ (n) ] fixed-length bit string
bit varying [ (n) ]varbit [ (n) ]variable-length bit string
booleanboollogical Boolean (true/false)
box rectangular box on a plane
bytea binary data (“byte array”)
character [ (n) ]char [ (n) ]fixed-length character string
character varying [ (n) ]varchar [ (n) ]variable-length character string
cidr IPv4 or IPv6 network address
circle circle on a plane
date calendar date (year, month, day)
double precisionfloat8double precision floating-point number (8 bytes)
inet IPv4 or IPv6 host address
integerint, int4signed four-byte integer
interval [ fields ] [ (p) ] time span
json textual JSON data
jsonb binary JSON data, decomposed
line infinite line on a plane
lseg line segment on a plane
macaddr MAC (Media Access Control) address
macaddr8 MAC (Media Access Control) address (EUI-64 format)
money currency amount
numeric [ (p, s) ]decimal [ (p, s) ]exact numeric of selectable precision
path geometric path on a plane
pg_lsn PostgreSQL Log Sequence Number
pg_snapshot user-level transaction ID snapshot
point geometric point on a plane
polygon closed geometric path on a plane
realfloat4single precision floating-point number (4 bytes)
smallintint2signed two-byte integer
smallserialserial2autoincrementing two-byte integer
serialserial4autoincrementing four-byte integer
text variable-length character string
time [ (p) ] [ without time zone ] time of day (no time zone)
time [ (p) ] with time zonetimetztime of day, including time zone
timestamp [ (p) ] [ without time zone ] date and time (no time zone)
timestamp [ (p) ] with time zonetimestamptzdate and time, including time zone
tsquery text search query
tsvector text search document
txid_snapshot user-level transaction ID snapshot (deprecated; see pg_snapshot)
uuid universally unique identifier
xml XML data

Table 1: Postgres Data Types (Name, Alias, Description)

But how do you know what you can do with each data type? Chapter 9 documents which functions and operators are applicable to each data type. The documentation also says that each data type has an external representation, which raises the question of what these “external representations” are either in standard SQL data types or for a specific programming language.

3. Using PostgreSQL From Java—the PgJDBC Driver

How do you use PostgreSQL from Java? With JDBC! (Java Database Connectivity). There’s a PostgreSQL JDBC Driver (PgJDBC for short) which allows Java programs to connect using standard, database independent, Java code. It’s an open source Pure Java (Type 4, which talks native PostgreSQL protocol) driver and is well documented.

It’s easy to download PostgreSQL, install it, and start the database server running. You also need to download the JDBC driver

Connecting to the database is easy from jdbc:

To create tables you need to specify PostgreSQL data types for all the columns. For example, here’s a simple test table with integer id (which is the primary key) and value columns:

You can then do simple things in Java/PgJDBC using statements:

4. Why Do Data Types Matter in JDBC?

For simple Statements, the INSERT statement is just a string, so you don’t need a real data type yet. However, to extract the returned value you do need to know the PostgreSQL type, and also the corresponding Java data type if you are going to do anything more useful than just print it out (i.e. store it, process it):

However, to use Prepared Statements you need to use the correct data types (lines 2, 3, 6, and 9) for both INSERT and SELECT, otherwise you’ll get a run time SQL exception:

Note that rs.getString(“value”) also seems to work ok for the basic data types (i.e. automatic casting), which is a feature we’ll find useful below.

5. Mappings From Postgresql Data Types to SQL/JDBC Data Types

 Mappings from shape to colour
Mappings from shape to colour
(Source: Wikimedia)

But how do you know which Java data types the PostgreSQL data types actually map to in advance?

Some searching resulted in the discovery of the getTypeInfo() method in the java.sql.DatabaseMetaData Interface, which is an interface implemented by driver vendors to let users know the capabilities of the database in combination with the JDBC driver. The getTypeInfo() method retrieves a description of all the data types supported by the database, ordered by DATA_TYPE and then by how closely they map to the corresponding JDBC SQL type. There’s potentially lots of useful information available, but I was mainly interested in TYPE_NAME and DATA_TYPE, which is the SQL data type from java.sql.Types. Initially it was reporting DATA_TYPE as integers (corresponding to the constants in the ENUM java.sql.Types), which wasn’t very useful, but then I managed to get it to report the constant name as follows:

Running this reveals the complete list of PostgreSQL data types and their mapping to SQL/JDBC Data types, for a total of 183 data types. Removing all the (many) OTHER and ARRAY (e.g. _record etc) data types leaves us with this more manageable table:

PostgreSQL Data TypeSQL Data Type
boolBIT
bitBIT
int8BIGINT
bigserialBIGINT
oidBIGINT
byteaBINARY
charCHAR
bpcharCHAR
numericNUMERIC
int4INTEGER
serialINTEGER
int2SMALLINT
smallserialSMALLINT
float4REAL
float8DOUBLE
moneyDOUBLE
nameVARCHAR
textVARCHAR
varcharVARCHAR
dateDATE
timeTIME
timetzTIME
timestampTIMESTAMP
timestamptzTIMESTAMP
cardinal_numberDISTINCT
character_dataDISTINCT
sql_identifierDISTINCT
time_stampDISTINCT
yes_or_noDISTINCT
xmlSQLXML
refcursorREF_CURSOR

Table 2: Mappings From PostgreSQL to SQL/JDBC Data Types

Note that the PostgreSQL data types are using the aliases from Table 1. But this is only part of the answer. How do we know what Java data types correspond to the SQL data types?

6. Mappings From SQL/JDBC Data Types to Java Data Types

After doing some more searching, the best I could do to find mappings from SQL data types to Java data types was to consult the actual JDBC specifications, Appendix B tables. Here’s my solution using Appendix B.4, Java object types mapped to JDBC types. 

PostgreSQL Data TypeSQL/JDBC Data TypeJava Type
boolBITboolean
bitBITboolean
int8BIGINTlong
bigserialBIGINTlong
oidBIGINTlong
byteaBINARYbyte[]
charCHARString
bpcharCHARString
numericNUMERICjava.math.BigDecimal
int4INTEGERint
serialINTEGERint
int2SMALLINTshort
smallserialSMALLINTshort
float4REALfloat
float8DOUBLEdouble
moneyDOUBLEdouble
nameVARCHARString
textVARCHARString
varcharVARCHARString
dateDATEjava.sql.Date
timeTIMEjava.sql.Time
timetzTIMEjava.sql.Time
timestampTIMESTAMPjava.sql.Timestamp
timestamptzTIMESTAMPjava.sql.Timestamp
cardinal_numberDISTINCTMapping of underlying type
character_dataDISTINCTMapping of underlying type
sql_identifierDISTINCTMapping of underlying type
time_stampDISTINCTMapping of underlying type
yes_or_noDISTINCTMapping of underlying type
xmlSQLXMLjava.sql.SQLXML
refcursorREF_CURSORUndefined
_abcARRAYjava.sql.array

Table 3: Complete Mappings from PostgreSQL to SQL/JDBC to Java Data Types

Notes: 

1 REF_CURSOR doesn’t appear in the jdbc appendices, but is mentioned in section “13.3.3.4 REF Cursor Support” of the specification, and may map to Types.REF_CURSOR. 

2 _abc stands for one of many ARRAY data types available in PostgreSQL (_record to _yes_or_no).

7. Type Conversion

So now we know the correct Java types to use, everything should just magically work correctly right? Well, mostly, as long as you remember what the PostgreSQL data types used in the table columns are, and use the correct PostgreSQL to Java mappings. As an experiment I tried a few data types including int2/int4/int8 (short, int, long), char/text (String), numeric (BigDecimal). The main thing to watch out for is conversion/casting between different sized types—going from longer to short types results in run-times errors (if you are lucky) or truncation (if you are unlucky). 

How about a really simple type such as bit, bool, and boolean? That should be idiot proof? True? (or False or Unknown…). Let’s try and see what happens (in my next PostgreSQL blog)!

8. Conclusions

In this blog I discovered what data types are available in PostgreSQL (a lot), and hopefully determined the definitive mapping from PostgreSQL to SQL/JDBC to Java data types. However, even armed with this information you have to be careful about type conversion/casting, and watch out for run-time errors, truncation, or loss of information.

And we’ve really only started to scratch the surface of PostgreSQL data types, as PostgreSQL allows for custom data types, user defined types, arrays, and OTHER data types (e.g. JSON). I was also surprised to discover that PostgreSQL is really an ORDBMS (an object-relational DB), and has support for some object-oriented features such as table inheritance and function overloading. How these all work from JDBC would be more than enough to fill my imaginary database book, or several actual (but currently chimerical) future blogs.