PDFToolkit API (DB Connector)
Requirement
As part of creating the PDFManager module, a method was required to manipulate fillable (form) PDFs.
The requirement was to:
Complete a form PDF by inserting data into its fields (rather than just stamping un-editable text on the form), and
Add a barcode and text to an existing PDF form, and output the resultant PDF as a form (the v1 PDFManager, which used a PHP solution, always output a flat, non-form PDF, even when a form PDF was used as the input).
In 2022 this could not be achieved with an open-source PHP module, but pdftk, a well-established and proven Linux CLI application, could be used, and it provides a couple of features beyond the requirement.
The main Drupal site (served by an Acquia webserver) runs on Linux, but that server is not managed by the City of Boston and the pdftk libraries are not installed on it. Given the tight time constraints, pdftk was deployed in the same container as the DBConnector, leveraging the existing endpoint services (Node/JavaScript/Express) and some shell scripting.
AWS Microservice
The dbconnector service was extended to provide the following endpoints:
Administration Functions
Ping to test that the service is available.
GET
/v1/pdf/heartbeat
Runs an internal test to verify that pdftk is installed properly.
GET
/v1/pdf/test
Internally calls pdftk and captures the version of the CLI.
PDF File Operations
Adds data to fields of a PDF form, and outputs a reference to the completed PDF form.
POST
/v1/pdf/fill
A PDF and data file must be provided. The PDF must be a fillable form PDF and the data file must be a file in an FDF format.
The /v1/pdf/generate_fdf endpoint can be used to generate a blank FDF data file.
Request Body
formfile*
String
URL to a form PDF
datafile*
String
URL to a form data file in FDF format
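As an illustration of the request described above, the sketch below builds (but does not send) a POST to /v1/pdf/fill. The host name is hypothetical, and sending the body as JSON is an assumption; the source does not state the expected encoding or any authentication requirements.

```python
import json
from urllib.request import Request

# Hypothetical host; replace with the real dbconnector address.
BASE_URL = "https://dbconnector.example.org"


def build_fill_request(formfile, datafile):
    """Build a POST request for /v1/pdf/fill without sending it."""
    if not formfile or not datafile:
        raise ValueError("formfile and datafile are both required")
    body = json.dumps({"formfile": formfile, "datafile": datafile})
    return Request(
        BASE_URL + "/v1/pdf/fill",
        data=body.encode("utf-8"),
        # Assumption: the endpoint accepts a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_fill_request(
        "https://example.org/forms/permit.pdf",
        "https://example.org/forms/permit.fdf",
    )
    print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; the response should contain the reference to the completed PDF form.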
Stamps a PDF on to another PDF, and outputs a reference to the merged PDF.
POST
/v1/pdf/overlay
Request Body
basefile*
String
A PDF document - can be a URL or a file-reference returned from another endpoint.
overlayfile*
String
URL to a PDF document
overwrite
String
Defaults to "true"
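The overlay parameters can be assembled as in the sketch below. The helper name is hypothetical, and the meaning of overwrite is not described in the source, so the code only passes it through as the string "true" or "false", matching the documented String type and default.

```python
def overlay_payload(basefile, overlayfile, overwrite=True):
    """Assemble the request-body fields for /v1/pdf/overlay.

    basefile may be a URL or a file-reference returned by another
    endpoint; overlayfile is a URL to the PDF stamped on top of it.
    """
    if not basefile or not overlayfile:
        raise ValueError("basefile and overlayfile are both required")
    return {
        "basefile": basefile,
        "overlayfile": overlayfile,
        # The parameter is documented as a String, so send text, not a bool.
        "overwrite": "true" if overwrite else "false",
    }


if __name__ == "__main__":
    # "filled-form-ref" stands in for a file-reference from a previous call.
    print(overlay_payload("filled-form-ref", "https://example.org/barcode.pdf"))
```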
Updates the PDF document properties and outputs a reference to the updated PDF.
POST
/v1/pdf/metadata
Request Body
pdf_file*
String
A PDF document - can be a URL or a file-reference returned from another endpoint.
meta_data*
String
A file in the following format:
InfoBegin
InfoKey: <one of title, author, subject, creator, producer>
InfoValue: <the value to set>
InfoKey: ..
InfoValue: ...
...
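A metadata file in the layout shown above could be generated as in this sketch. The function name is hypothetical, and the code follows the layout exactly as documented here (a single InfoBegin line followed by key/value pairs); whether meta_data expects the file contents or a URL to the file is not stated in the source.

```python
# Keys listed in the format description above.
ALLOWED_KEYS = ("title", "author", "subject", "creator", "producer")


def build_metadata(properties):
    """Render a dict of document properties in the InfoKey/InfoValue layout."""
    lines = ["InfoBegin"]
    for key, value in properties.items():
        if key.lower() not in ALLOWED_KEYS:
            raise ValueError("unsupported InfoKey: " + key)
        lines.append("InfoKey: " + key)
        lines.append("InfoValue: " + str(value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_metadata({"title": "Permit Application", "author": "City of Boston"}))
```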
Removes compression on a PDF, and returns the decompressed file as an attachment.
GET
/v1/pdf/decompress
This is a useful utility when the PDFManager cannot manipulate a PDF because its compression is later than PDF 1.5.
The endpoint first checks whether it already has a file with the filename specified in the pdf_file query parameter. If it does, it simply returns that file.
NOTE: restarting the dbconnector task(s) on AWS will empty this cache.
If the del parameter is "true", the file is deleted after decompression and download. To reduce load on the endpoint, set it to "false" if the pdf_file does not change often and you expect to call the function frequently.
Query Parameters
pdf_file*
String
URL to a PDF document
del
String
Whether the file should be deleted after it is downloaded. Defaults to "true".
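The query string for this endpoint can be built as in the following sketch. The host name and function name are hypothetical; the example sets del to "false" so that repeated calls for the same pdf_file can be served from the endpoint's cache, as described above.

```python
from urllib.parse import urlencode

# Hypothetical host; replace with the real dbconnector address.
BASE_URL = "https://dbconnector.example.org"


def decompress_url(pdf_file, delete_after=True):
    """Return the GET URL for /v1/pdf/decompress with encoded parameters."""
    query = urlencode({
        "pdf_file": pdf_file,
        # Documented default is "true"; "false" keeps the cached file.
        "del": "true" if delete_after else "false",
    })
    return BASE_URL + "/v1/pdf/decompress?" + query


if __name__ == "__main__":
    print(decompress_url("https://example.org/forms/permit.pdf", delete_after=False))
```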
Returns the decompressed document as an attachment.
The expected headers are:
PDF Retrieval
Returns the requested PDF document from its reference.
GET
/v1/pdf/fetch
Query Parameters
file*
String
A file-reference from one of the endpoints
del
String
Delete the file after downloading. Defaults to "false".
show
String
Download method: D (default) downloads the document as an attachment; I downloads it and displays it in the browser (if supported).
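A fetch URL with these parameters might be built as follows. This is a sketch only: the host name, function name, and the example file-reference are hypothetical, and the code simply rejects any show value other than the two documented ones.

```python
from urllib.parse import urlencode

# Hypothetical host; replace with the real dbconnector address.
BASE_URL = "https://dbconnector.example.org"


def fetch_url(file_ref, delete_after=False, show="D"):
    """Return the GET URL for /v1/pdf/fetch.

    file_ref is a file-reference returned by one of the other endpoints.
    show is "D" (download as attachment) or "I" (display in browser).
    """
    if show not in ("D", "I"):
        raise ValueError("show must be 'D' or 'I'")
    query = urlencode({
        "file": file_ref,
        "del": "true" if delete_after else "false",
        "show": show,
    })
    return BASE_URL + "/v1/pdf/fetch?" + query


if __name__ == "__main__":
    # "abc123.pdf" stands in for a real file-reference.
    print(fetch_url("abc123.pdf", show="I"))
```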
Returns the document as an attachment.
When show=D, expected headers are:
When show=I, expected headers are: