The 'mutool run' command executes a JavaScript program, which has access to most of the features of the MuPDF library. The command supports ECMAScript 5 syntax in strict mode. All of the MuPDF constructors and functions live in the global object, and the command line arguments are accessible from the global 'scriptArgs' object. The name of the script is in the global 'scriptPath' variable.
mutool run script.js [ arguments ... ]
If invoked without any arguments, it will drop you into an interactive REPL (read-eval-print loop).
Example scripts
Create and edit PDF documents:
- pdf-create-lowlevel.js: Create PDF document from scratch using only low level functions.
- pdf-create.js: Create PDF document from scratch, using helper functions.
- pdf-merge.js: Merge pages from multiple PDF documents into one PDF file.
Graphics and the device interface:
- draw-document.js: Draw all pages in a document to PNG files.
- draw-device.js: Use device API to draw graphics and save as a PNG file.
- trace-device.js: Implement a device in JavaScript.
Advanced examples:
- create-thumbnail.js: Create a PDF from rendered page thumbnails.
JavaScript Shell
Several global functions that are common for command line shells are available:
- gc(report)
- Run the garbage collector to free up memory. Optionally report statistics on the garbage collection.
- load(fileName)
- Load and execute script in 'fileName'.
- print(...)
- Print arguments to stdout, separated by spaces and followed by a newline.
- quit()
- Exit the shell.
- read(fileName)
- Read the contents of a file and return them as a UTF-8 decoded string.
- readline()
- Read one line of input from stdin and return it as a string.
- require(module)
- Load a JavaScript module.
- write(...)
- Print arguments to stdout, separated by spaces.
- repr(object)
- Format the object into a string with JavaScript syntax.
Buffer
The Buffer objects are used for working with binary data. They can be used much like arrays, but are much more efficient since they only store bytes.
- new Buffer()
- Create a new empty buffer.
- new Buffer(original)
- Create a new buffer with a copy of the data from the original buffer.
- readFile(fileName)
- Create a new buffer with the contents of a file.
- Buffer#length
- The number of bytes in the buffer.
- Buffer#[n]
- Read/write the byte at index 'n'. Will throw exceptions on out of bounds accesses.
- Buffer#writeByte(b)
- Append a single byte to the end of the buffer.
- Buffer#writeRune(c)
- Encode a unicode character as UTF-8 and append to the end of the buffer.
- Buffer#writeLine(...)
- Append arguments to the end of the buffer, separated by spaces, ending with a newline.
- Buffer#write(...)
- Append arguments to the end of the buffer, separated by spaces.
- Buffer#writeBuffer(data)
- Append the contents of the 'data' buffer to the end of the buffer.
- Buffer#slice(start, end)
- Create a new Buffer containing all or a subset of the data in this buffer. Start and end are offsets from the beginning of this buffer, or from the end of this buffer if negative.
- Buffer#save(fileName)
- Write the contents of the buffer to a file.
Matrices and Rectangles
All dimensions are in points unless otherwise specified.
Matrices are simply 6-element arrays representing a 3-by-3 transformation matrix as
/ a b 0 \
| c d 0 |
\ e f 1 /
This matrix is represented in JavaScript as [a,b,c,d,e,f].
- Identity
- The identity matrix, shorthand for [1,0,0,1,0,0].
- Scale(sx, sy)
- Return a scaling matrix, shorthand for [sx,0,0,sy,0,0].
- Translate(tx, ty)
- Return a translation matrix, shorthand for [1,0,0,1,tx,ty].
- Rotate(theta)
- Return a rotation matrix, shorthand for [cos(theta),sin(theta),-sin(theta),cos(theta),0,0].
- Concat(a, b)
- Concatenate matrices a and b. Bear in mind that matrix multiplication is not commutative.
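The shorthand formulas above can be mirrored in plain JavaScript to see why the argument order of Concat matters. This is only a sketch: the helper names (makeScale, makeTranslate, concatMatrix, transformPoint) are hypothetical stand-ins written for illustration, not the built-in functions, and the multiplication assumes the row-vector convention implied by the matrix layout shown earlier.

```javascript
// Hypothetical plain-JavaScript stand-ins for Scale, Translate and Concat.
function makeScale(sx, sy) { return [sx, 0, 0, sy, 0, 0]; }
function makeTranslate(tx, ty) { return [1, 0, 0, 1, tx, ty]; }

// Multiply two [a,b,c,d,e,f] matrices: 'one' is applied first, then 'two'.
function concatMatrix(one, two) {
    return [
        one[0] * two[0] + one[1] * two[2],
        one[0] * two[1] + one[1] * two[3],
        one[2] * two[0] + one[3] * two[2],
        one[2] * two[1] + one[3] * two[3],
        one[4] * two[0] + one[5] * two[2] + two[4],
        one[4] * two[1] + one[5] * two[3] + two[5]
    ];
}

// Map the point (x, y) through the matrix m.
function transformPoint(x, y, m) {
    return [x * m[0] + y * m[2] + m[4], x * m[1] + y * m[3] + m[5]];
}

// Same two matrices, different order, different result.
var scaleThenMove = concatMatrix(makeScale(2, 2), makeTranslate(10, 0));
var moveThenScale = concatMatrix(makeTranslate(10, 0), makeScale(2, 2));
```

In the first result the translation stays at 10 units; in the second it is scaled along with everything else.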
Rectangles are 4-element arrays, specifying the minimum and maximum corners (typically upper left and lower right, in a coordinate space with the origin at the top left with descending y): [ulx,uly,lrx,lry].
If the minimum x coordinate is bigger than the maximum x coordinate, MuPDF treats the rectangle as infinite in size.
Document and Page
MuPDF can open many document types (PDF, XPS, CBZ, EPUB, FB2 and a handful of image formats).
- new Document(fileName)
- Open the named document.
- Document#needsPassword()
- Returns true if a password is required to open this password protected PDF.
- Document#authenticatePassword(password)
- Returns true if the password matches.
- Document#hasPermission(permission)
- Returns true if the document has permission for "print", "annotate", "edit" or "copy".
- Document#getMetaData(key)
- Return various meta data information. The common keys are: "format", "encryption", "info:Author", and "info:Title".
- Document#setMetaData(key, value)
- Set document meta data information field to a new value.
- Document#isReflowable()
- Returns true if the document is reflowable, such as EPUB, FB2 or XHTML.
- Document#layout(pageWidth, pageHeight, fontSize)
- Layout a reflowable document (EPUB, FB2, or XHTML) to fit the specified page and font size.
- Document#countPages()
- Count the number of pages in the document. This may change if you call the layout function with different parameters.
- Document#loadPage(number)
- Returns a Page object for the given page number. Page number zero (0) is the first page in the document.
- Document#loadOutline()
- Returns an array with the outline (table of contents). In the array is an object for each heading with the property 'title', and a property 'page' containing the page number. If the object has a 'down' property, it contains an array with all the sub-headings for that entry.
- Document#resolveLink(uri)
- Resolve a document internal link uri to a link destination.
- Document#formatLinkURI(linkDestination)
- Format a document internal link destination object to a URI string suitable for Page#createLink().
- Document#outlineIterator()
- Return an OutlineIterator for the document outline.
- Document#isPDF()
- Returns true if the document is a PDF document.
- setUserCSS(userStylesheet, usePublisherStyles)
- Set user styles and whether to use publisher styles when laying out reflowable documents.
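The nested array returned by Document#loadOutline() lends itself to a small recursive walk. The sketch below assumes only the shape described above ('title', 'page', optional 'down'); the function name flattenOutline and the sample data are hypothetical.

```javascript
// Flatten a nested outline array into a list of indented lines.
// Each entry is assumed to have 'title', 'page' and optionally 'down'.
function flattenOutline(outline, depth, result) {
    var i, entry, indent;
    depth = depth || 0;
    result = result || [];
    if (!outline)
        return result; // tolerate a document without an outline
    for (i = 0; i < outline.length; ++i) {
        entry = outline[i];
        indent = new Array(depth + 1).join("  ");
        result.push(indent + entry.title + " (page " + entry.page + ")");
        if (entry.down)
            flattenOutline(entry.down, depth + 1, result);
    }
    return result;
}

// Stand-in data with the documented shape; in a script this would
// come from doc.loadOutline().
var sampleOutline = [
    { title: "Introduction", page: 0 },
    { title: "Usage", page: 2, down: [ { title: "Options", page: 3 } ] }
];
var outlineLines = flattenOutline(sampleOutline);
```

Inside mutool run, each line of the result could then be passed to print().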
A link is a dictionary with keys for:
- Link.bounds:
- A rectangle describing the link's location on the page.
- Link.uri:
- A uri describing either a document internal destination or a URL for a web page.
A link destination points to a location within a document and how a document viewer should show that destination. It consists of a dictionary with keys for:
- LinkDestination.chapter, .page:
- The chapter and page within the document.
- LinkDestination.type:
- Either "Fit", "FitB", "FitH", "FitBH", "FitV", "FitBV", "FitR" or "XYZ", controlling which of the keys below exist.
- LinkDestination.x:
- The left coordinate, valid for FitV, FitBV, FitR and XYZ.
- LinkDestination.y:
- The top coordinate, valid for FitH, FitBH, FitR and XYZ.
- LinkDestination.width:
- The width of the zoomed in region, valid for XYZ.
- LinkDestination.height:
- The height of the zoomed in region, valid for XYZ.
- LinkDestination.zoom:
- The zoom factor, valid for "XYZ".
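As an illustration of the dictionary shape, the sketch below builds two link destinations that carry only the keys their type uses. The helper names are hypothetical, the coordinate and zoom values are arbitrary, and using chapter 0 for documents without chapters is an assumption.

```javascript
// Hypothetical helpers that build link destination dictionaries.
function makeFitDestination(page) {
    // "Fit" needs no coordinates.
    return { chapter: 0, page: page, type: "Fit" };
}

function makeXYZDestination(page, x, y, zoom) {
    // "XYZ" uses the left and top coordinates plus a zoom factor.
    return { chapter: 0, page: page, type: "XYZ", x: x, y: y, zoom: zoom };
}

var wholePage = makeFitDestination(4);
var zoomedIn = makeXYZDestination(4, 72, 144, 200);
```

A dictionary like this is what Document#formatLinkURI() takes, and what Document#resolveLink() returns.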
- Page#bound()
- Returns a rectangle containing the page dimensions.
- Page#run(device, transform, skipAnnotations)
- Calls device functions for all the contents on the page, using the specified transform matrix. The device can be one of the built-in devices or a JavaScript object with methods for the device calls. The transform maps from user space points to device space pixels. If skipAnnotations is true, ignore annotations.
- Page#toPixmap(transform, colorspace, alpha, skipAnnotations)
- Render the page into a Pixmap, using the transform and colorspace. If alpha is true, the page will be drawn on a transparent background, otherwise white.
- Page#toDisplayList(skipAnnotations)
- Record the contents on the page into a DisplayList.
- Page#toStructuredText(options)
- Extract the text on the page into a StructuredText object. The options argument is a comma separated list of flags: preserve-ligatures, preserve-whitespace, preserve-spans, and preserve-images.
- Page#search(needle)
- Search for 'needle' text on the page, and return an array with rectangles of all matches found.
- Page#getLinks()
- Return an array of all the links on the page. Each link is an object with a 'bounds' property, and either a 'page' or 'uri' property, depending on whether it's an internal or external link.
- Page#createLink(rect, destinationUri)
- Create a new link within the rectangle on the page, linking to the destination URI string.
- Page#deleteLink(link)
- Delete the link from the page.
- Page#getLabel()
- Returns the page number as a string using the numbering scheme of the document.
- Page#isPDF()
- Returns true if the page is from a PDF document.
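The Document and Page methods combine naturally into a whole-document search. The following is a minimal sketch that relies only on countPages(), loadPage(), search() and getLabel() as listed above; the function name searchDocument and the shape of the returned records are hypothetical.

```javascript
// Collect search hits for every page of a document.
// 'doc' is any object offering countPages() and loadPage().
function searchDocument(doc, needle) {
    var hits = [];
    var count = doc.countPages();
    var i, page, rects;
    for (i = 0; i < count; ++i) {
        page = doc.loadPage(i); // page numbers start at zero
        rects = page.search(needle);
        if (rects && rects.length > 0)
            hits.push({ page: i, label: page.getLabel(), matches: rects });
    }
    return hits;
}
```

Inside mutool run this might be called as searchDocument(new Document(scriptArgs[0]), scriptArgs[1]).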
StructuredText
StructuredText objects hold text from a page that has been analyzed and grouped into blocks, lines and spans.
- StructuredText#walk(textWalker)
- Call callback methods on the textWalker while replaying the structured text contents.
- StructuredText#search(needle)
- Search the text for all instances of 'needle', and return an array with rectangles of all matches found.
- StructuredText#highlight(p, q)
- Return an array with rectangles needed to highlight a selection defined by the start and end points.
- StructuredText#copy(p, q)
- Return the text from the selection defined by the start and end points.
A StructuredTextWalker has callback methods that are called when StructuredText#walk() walks over blocks, lines, characters and images.
- StructuredTextWalker#onImageBlock(bbox, transform, image)
- Called when walking over an image, describing its location and transform.
- StructuredTextWalker#beginTextBlock(bbox)
- Called when walking over the beginning of a text block, describing its location.
- StructuredTextWalker#endTextBlock()
- Called when walking over the end of a text block.
- StructuredTextWalker#beginLine(bbox, writingMode)
- Called at the beginning of each text line in a text block, describing its location and writing direction.
- StructuredTextWalker#endLine()
- Called when walking over the end of a text line.
- StructuredTextWalker#onChar(utf, position, font, size, quad, color)
- Called when walking over each character in a text line, describing its location, font, size, dimension and color.
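A walker only needs the callbacks it cares about. The sketch below rebuilds plain text lines from onChar calls; it assumes that 'utf' arrives as a string holding the character and that callbacks missing from the walker are skipped. The name makeLineCollector is hypothetical.

```javascript
// Build a StructuredTextWalker that gathers the text of each line.
function makeLineCollector() {
    var lines = [];
    var current = "";
    return {
        lines: lines,
        beginLine: function (bbox, writingMode) {
            current = ""; // start a fresh line
        },
        onChar: function (utf, position, font, size, quad, color) {
            current += utf; // assumes 'utf' is the character as a string
        },
        endLine: function () {
            lines.push(current);
        }
    };
}
```

A collector made this way could be passed to StructuredText#walk(), after which its 'lines' array holds one string per text line.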
ColorSpace
- DeviceGray
- The default grayscale colorspace.
- DeviceRGB
- The default RGB colorspace.
- DeviceBGR
- The default RGB colorspace, but with components in reverse order.
- DeviceCMYK
- The default CMYK colorspace.
- DeviceLab
- The default Lab colorspace.
- ColorSpace#getNumberOfComponents()
- A grayscale colorspace has one component, RGB has 3, CMYK has 4, and DeviceN may have any number of components.
- ColorSpace#toString()
- Return the name of the colorspace.
- ColorSpace#isGray()
- Returns true if the object is a gray color space.
- ColorSpace#isRGB()
- Returns true if the object is an RGB color space.
- ColorSpace#isCMYK()
- Returns true if the object is a CMYK color space.
- ColorSpace#isIndexed()
- Returns true if the object is an Indexed color space.
- ColorSpace#isLab()
- Returns true if the object is a Lab color space.
- ColorSpace#isDeviceN()
- Returns true if the object is a DeviceN color space.
- ColorSpace#isLabICC()
- Returns true if the object is a Lab ICC color space.
- ColorSpace#isSubtractive()
- Returns true if the object is a subtractive color space.
- ColorSpace#isDevice()
- Returns true if the object is a Device color space.
- ColorSpace#isDeviceGray()
- Returns true if the object is a Device gray color space.
- ColorSpace#isDeviceCMYK()
- Returns true if the object is a Device CMYK color space.
Pixmap
A Pixmap object contains a color raster image (short for pixel map). The components in a pixel in the pixmap are all byte values, with the transparency as the last component. A pixmap also has a location (x, y) in addition to its size, so that it can easily be used to represent a tile of a page.
- new Pixmap(colorspace, bounds, alpha)
- Create a new pixmap. The pixel data is not initialized and will contain garbage.
- Pixmap#clear(value)
- Clear the pixels to the specified value. Pass 255 for white, or undefined for transparent.
- Pixmap#bound()
- Return the pixmap bounds.
- Pixmap#getX()
- Pixmap#getY()
- Pixmap#getWidth()
- Pixmap#getHeight()
- Pixmap#getNumberOfComponents()
- Number of colors; plus one if an alpha channel is present.
- Pixmap#getAlpha()
- True if alpha channel is present.
- Pixmap#getStride()
- Number of bytes per row.
- Pixmap#getColorSpace()
- Pixmap#getXResolution()
- Pixmap#getYResolution()
- Pixmap resolution in dots per inch.
- Pixmap#getSample(x, y, k)
- Get the value of component k at position x, y (relative to the image origin: 0, 0 is the top left pixel).
- Pixmap#setResolution(xRes, yRes)
- Set X/Y resolution in dots per inch.
- Pixmap#saveAsPNG(fileName)
- Save the pixmap as a PNG. Only works for Gray and RGB images.
- Pixmap#saveAsJPEG(fileName, quality)
- Save the pixmap as a JPEG. Only works for Gray, RGB, and CMYK images.
- Pixmap#saveAsPAM(fileName)
- Save the pixmap as a PAM.
- Pixmap#saveAsPNM(fileName)
- Save the pixmap as a PNM. Only works for Gray and RGB images without alpha.
- Pixmap#saveAsPBM(fileName)
- Pixmap#saveAsPKM(fileName)
- Save the pixmap as a PBM/PKM. Only works for Gray and CMYK images without alpha.
- Pixmap#invert()
- Invert all pixels. All components are processed, except alpha which is unchanged.
- Pixmap#invertLuminance()
- Transform all pixels so that luminance of each pixel is inverted, and the chrominance remains as unchanged as possible. All components are processed, except alpha which is unchanged.
- Pixmap#gamma(gamma)
- Apply gamma correction to pixmap. All components are processed, except alpha which is unchanged.
- Pixmap#tint(black, white)
- Tint all pixels in an RGB, BGR or Gray pixmap. Map black and white respectively to the given hex RGB values.
- Pixmap#warp(points, width, height)
- Return a warped subsection of the pixmap, where the result has the requested dimensions. Points give the corner points of a convex quadrilateral within the pixmap to be warped, represented as [x0, y0, x1, y1, x2, y2, x3, y3].
- Pixmap#convertToColorSpace(colorspace, proof, defaultColorSpaces, colorParams, keepAlpha)
- Convert pixmap into a new pixmap of a desired colorspace. A proofing colorspace, a set of default colorspaces and color parameters used during conversion may be specified. Finally a boolean indicates if alpha should be preserved (default is to not preserve alpha).
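The accessor methods are enough to inspect a pixmap sample by sample. As a sketch, the function below checks whether every color sample has a given value (for example 255 to detect an all-white Gray or RGB page). The name isUniformPixmap is hypothetical, and the loop is written for clarity and is likely to be slow on large pixmaps.

```javascript
// Return true if every color sample in the pixmap equals 'value'.
// The alpha component, stored last in each pixel, is skipped.
function isUniformPixmap(pixmap, value) {
    var width = pixmap.getWidth();
    var height = pixmap.getHeight();
    var n = pixmap.getNumberOfComponents();
    var colors = pixmap.getAlpha() ? n - 1 : n;
    var x, y, k;
    for (y = 0; y < height; ++y)
        for (x = 0; x < width; ++x)
            for (k = 0; k < colors; ++k)
                if (pixmap.getSample(x, y, k) !== value)
                    return false;
    return true;
}
```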
DrawDevice
The DrawDevice can be used to render to a Pixmap; either by running a Page with it or by calling its methods directly.
- new DrawDevice(transform, pixmap)
- Create a device for drawing into a pixmap. The pixmap bounds used should match the transformed page bounds, or you can adjust them to only draw a part of the page.
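To size a pixmap so that it matches the transformed page bounds, the page rectangle can be pushed through the same matrix and rounded outward to whole pixels. This is a sketch in plain JavaScript; transformRect is a hypothetical helper, not a built-in function.

```javascript
// Bounding box of a rectangle after applying an [a,b,c,d,e,f] matrix,
// rounded outward to integer pixel coordinates.
function transformRect(rect, m) {
    var corners = [
        [rect[0], rect[1]], [rect[2], rect[1]],
        [rect[0], rect[3]], [rect[2], rect[3]]
    ];
    var xs = [], ys = [], i;
    for (i = 0; i < corners.length; ++i) {
        xs.push(corners[i][0] * m[0] + corners[i][1] * m[2] + m[4]);
        ys.push(corners[i][0] * m[1] + corners[i][1] * m[3] + m[5]);
    }
    return [
        Math.floor(Math.min.apply(null, xs)),
        Math.floor(Math.min.apply(null, ys)),
        Math.ceil(Math.max.apply(null, xs)),
        Math.ceil(Math.max.apply(null, ys))
    ];
}

// A 612 x 792 point page scaled by a factor of two.
var letterAt2x = transformRect([0, 0, 612, 792], [2, 0, 0, 2, 0, 0]);
```

The resulting rectangle can serve as the bounds argument of new Pixmap(colorspace, bounds, alpha).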
DisplayList and DisplayListDevice
A display list records all the device calls for playback later. If you want to run a page through several devices, or run it multiple times for any other reason, recording the page to a display list and replaying the display list may be a performance gain since then you can avoid reinterpreting the page each time. Be aware though, that a display list will keep all the graphics required in memory, so will increase the amount of memory required.
- new DisplayList(mediabox)
- Create an empty display list. The mediabox rect has the bounds of the page in points.
- DisplayList#run(device, transform)
- Play back the recorded device calls onto the device.
- DisplayList#bound()
- Returns a rectangle containing the dimensions of the display list contents.
- DisplayList#toPixmap(transform, colorspace, alpha)
- Render display list to a pixmap. If alpha is true, it will render to a transparent background, otherwise white.
- DisplayList#toStructuredText(options)
- Extract the text in the display list into a StructuredText object. The options argument is a comma separated list of flags: preserve-ligatures, preserve-whitespace, preserve-spans, and preserve-images.
- DisplayList#search(needle)
- Search the display list text for all instances of 'needle', and return an array with rectangles of all matches found.
- new DisplayListDevice(displayList)
- Create a device for recording onto a display list.
Device
All built-in devices have the methods listed below. Any function that accepts a device will also accept a JavaScript object with the same methods. Any missing methods are simply ignored, so you only need to create methods for the device calls you care about.
Many of the methods take graphics objects as arguments: Path, Text, Image and Shade.
The stroking state is a dictionary with keys for:
- StrokingState.startCap, .dashCap, .endCap:
- "Butt", "Round", "Square", or "Triangle".
- StrokingState.lineCap:
- Set startCap, dashCap, and endCap all at once.
- StrokingState.lineJoin:
- "Miter", "Round", "Bevel", or "MiterXPS".
- StrokingState.lineWidth:
- Thickness of the line.
- StrokingState.miterLimit:
- Maximum ratio of the miter length to line width, before beveling the join instead.
- StrokingState.dashPhase:
- Starting offset for dash pattern.
- StrokingState.dashes:
- Array of on/off dash lengths.
Colors are specified as arrays with the appropriate number of components for the color space.
The methods that clip graphics must be balanced with a corresponding popClip.
- Device#fillPath(path, evenOdd, transform, colorspace, color, alpha, colorParams)
- Device#strokePath(path, stroke, transform, colorspace, color, alpha, colorParams)
- Device#clipPath(path, evenOdd, transform)
- Device#clipStrokePath(path, stroke, transform)
- Fill/stroke/clip a path.
- Device#fillText(text, transform, colorspace, color, alpha, colorParams)
- Device#strokeText(text, stroke, transform, colorspace, color, alpha, colorParams)
- Device#clipText(text, transform)
- Device#clipStrokeText(text, stroke, transform)
- Fill/stroke/clip a text object.
- Device#ignoreText(text, transform)
- Invisible text that can be searched but should not be visible, such as for overlaying a scanned OCR image.
- Device#fillShade(shade, transform, alpha, colorParams)
- Fill a shade (a.k.a. gradient). TODO: the details of gradient fills are not exposed to JavaScript yet.
- Device#fillImage(image, transform, alpha, colorParams)
- Draw an image. An image always fills a unit rectangle [0,0,1,1], so must be transformed to be placed and drawn at the appropriate size.
- Device#fillImageMask(image, transform, colorspace, color, alpha, colorParams)
- An image mask is an image without color. Fill with the color where the image is opaque.
- Device#clipImageMask(image, transform)
- Clip graphics using the image to mask the areas to be drawn.
- Device#beginMask(area, luminosity, backdropColorspace, backdropColor, backdropAlpha, colorParams)
- Device#endMask()
- Create a soft mask. Any drawing commands between beginMask and endMask are grouped and used as a clip mask. If luminosity is true, the mask is derived from the luminosity (grayscale value) of the graphics drawn; otherwise the color is ignored completely and the mask is derived from the alpha of the group.
- Device#popClip()
- Pop the clip mask installed by the last clipping operation.
- Device#beginGroup(area, isolated, knockout, blendmode, alpha)
- Device#endGroup()
- Push/pop a transparency blending group. Blendmode is one of the standard PDF blend modes: "Normal", "Multiply", "Screen", etc. See the PDF reference for details on isolated and knockout.
- Device#beginTile(areaRect, viewRect, xStep, yStep, transform, id)
- Device#endTile()
- Draw a tiling pattern. Any drawing commands between beginTile and endTile are grouped and then repeated across the whole page. Apply a clip mask to restrict the pattern to the desired shape.
- Device#beginLayer(tag)
- Device#endLayer()
- Begin/end a marked-content layer with the given tag.
- Device#renderFlags(set, clear)
- Set/clear device rendering flags. Both set and clear are arrays where each element is a flag name: "mask", "color", "uncacheable", "fillcolor-undefined", "strokecolor-undefined", "startcap-undefined", "dashcap-undefined", "endcap-undefined", "linejoin-undefined", "miterlimit-undefined", "linewidth-undefined", "bbox-defined", or "gridfit-as-tiled".
- Device#setDefaultColorSpaces(defaults)
- Change the set of default colorspaces for the device. See the DefaultColorSpaces object.
- Device#beginStructure(standard, raw, uid)
- Device#endStructure()
- Begin/end a structure element, given its standard type, the raw tag name and a unique identifier.
- Device#beginMetatext(type, text)
- Device#endMetatext()
- Begin/end meta text information, given the type (one of "ActualText", "Alt", "Abbreviation", or "Title") and the text value itself.
- Device#close()
- Tell the device that we are done, and flush any pending output.
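Because missing methods are ignored, a device written in JavaScript can be very small. The sketch below counts fill and stroke calls and tracks whether clip operations are balanced by popClip. The name makeCountingDevice and the layout of its 'stats' object are hypothetical.

```javascript
// A partial device: counts path fills and strokes, and tracks clip depth.
// Device calls without a matching method here are simply ignored.
function makeCountingDevice() {
    var stats = { fills: 0, strokes: 0, clipDepth: 0, maxClipDepth: 0, closed: false };

    function pushClip() {
        stats.clipDepth += 1;
        if (stats.clipDepth > stats.maxClipDepth)
            stats.maxClipDepth = stats.clipDepth;
    }

    return {
        stats: stats,
        fillPath: function (path, evenOdd, transform, colorspace, color, alpha) {
            stats.fills += 1;
        },
        strokePath: function (path, stroke, transform, colorspace, color, alpha) {
            stats.strokes += 1;
        },
        // Every clipping call must later be matched by one popClip.
        clipPath: function (path, evenOdd, transform) { pushClip(); },
        clipStrokePath: function (path, stroke, transform) { pushClip(); },
        clipText: function (text, transform) { pushClip(); },
        clipStrokeText: function (text, stroke, transform) { pushClip(); },
        clipImageMask: function (image, transform) { pushClip(); },
        popClip: function () { stats.clipDepth -= 1; },
        close: function () { stats.closed = true; }
    };
}
```

An object like this can be handed to Page#run() or DisplayList#run() in place of a built-in device; a clipDepth of zero afterwards indicates balanced clips.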
ColorParams is a dictionary with keys for:
- ColorParams.renderingIntent:
- One of "Perceptual", "RelativeColorimetric", "Saturation" or "AbsoluteColorimetric".
- ColorParams.blackPointCompensation:
- True if black point compensation is activated.
- ColorParams.overPrinting:
- True if overprint is activated.
- ColorParams.overPrintMode:
- The overprint mode.
DefaultColorSpaces is an object with the following methods:
- DefaultColorSpaces#getDefaultGray()
- DefaultColorSpaces#getDefaultRGB()
- DefaultColorSpaces#getDefaultCMYK()
- DefaultColorSpaces#getOutputIntent()
- Get the default gray/RGB/CMYK colorspace, or the output intent.
- DefaultColorSpaces#setDefaultGray(colorspace)
- DefaultColorSpaces#setDefaultRGB(colorspace)
- DefaultColorSpaces#setDefaultCMYK(colorspace)
- DefaultColorSpaces#setOutputIntent(colorspace)
- Set the default gray/RGB/CMYK colorspace, or the output intent, to the provided colorspace.
Path
A Path object represents vector graphics as drawn by a pen. A path can be either stroked or filled, or used as a clip mask.
- new Path()
- Create a new empty path.
- Path#bound(stroke, transform)
- Return a bounding rectangle for the path.
- Path#moveTo(x, y)
- Lift and move the pen to the coordinate.
- Path#lineTo(x, y)
- Draw a line to the coordinate.
- Path#curveTo(x1, y1, x2, y2, x3, y3)
- Draw a cubic bezier curve to (x3,y3) using (x1,y1) and (x2,y2) as control points.
- Path#curveToV(cx, cy, ex, ey)
- Draw a cubic bezier curve to (ex,ey) using the start point and (cx,cy) as control points.
- Path#curveToY(cx, cy, ex, ey)
- Draw a cubic bezier curve to (ex,ey) using (cx,cy) and (ex,ey) as control points.
- Path#closePath()
- Close the path by drawing a line to the last moveTo.
- Path#rect(x1, y1, x2, y2)
- Shorthand for moveTo, lineTo, lineTo, lineTo, closePath to draw a rectangle.
- Path#walk(pathWalker)
- Call moveTo, lineTo, curveTo and closePath methods on the pathWalker object to replay the path.
- Path#transform(transform)
- Transform path by the given transform matrix.
A PathWalker has callback methods that are called when Path#walk() walks over move to, line to, curve to and close path operators in a Path.
- PathWalker#moveTo(x, y)
- Called when walking over a moveTo, moving the current point to x, y.
- PathWalker#lineTo(x, y)
- Called when walking over a lineTo, requesting a line to be drawn from the current point to x, y.
- PathWalker#curveTo(x1, y1, x2, y2, x3, y3)
- Called when walking over a curveTo, requesting a Bézier curve from current point to x3, y3 with x1, y1 and x2, y2 as control points.
- PathWalker#closePath()
- Called when walking over a closePath, closing the current path.
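A PathWalker is a convenient way to serialize a path. The sketch below turns the callbacks into an SVG-style path string; the name makePathPrinter and the choice of output syntax are illustrative assumptions.

```javascript
// Build a PathWalker that records the path as an SVG-style string.
function makePathPrinter() {
    var parts = [];
    return {
        moveTo: function (x, y) { parts.push("M " + x + " " + y); },
        lineTo: function (x, y) { parts.push("L " + x + " " + y); },
        curveTo: function (x1, y1, x2, y2, x3, y3) {
            parts.push("C " + [x1, y1, x2, y2, x3, y3].join(" "));
        },
        closePath: function () { parts.push("Z"); },
        toString: function () { return parts.join(" "); }
    };
}
```

After calling path.walk(printer) on a Path, printer.toString() returns the recorded operators in order.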
Text
A Text object contains text.
- new Text()
- Create a new empty text object.
- Text#showGlyph(font, transform, glyph, unicode, wmode)
- Add a glyph to the text object. Transform is the text matrix, specifying font size and glyph location. For example: [size,0,0,-size,x,y]. Glyph and unicode may be -1 for n-to-m cluster mappings. For example, the "fi" ligature would be added in two steps: first the glyph for the 'fi' ligature and the unicode value for 'f'; then glyph -1 and the unicode value for 'i'. WMode is 0 for horizontal writing, and 1 for vertical writing.
- Text#showString(font, transform, string)
- Add a simple string to the text object. Will do font substitution if the font does not have all the unicode characters required.
- Text#walk(textWalker)
- Call the showGlyph method on the textWalker object for each glyph in the text object.
Font
Font objects can be created from TrueType, OpenType, Type1 or CFF fonts. In PDF there are also special Type3 fonts.
- new Font(fontName or fileName)
- Create a new font, either using a built-in font name or a filename.
The built-in standard PDF fonts are: Times-Roman, Times-Italic, Times-Bold, Times-BoldItalic, Helvetica, Helvetica-Oblique, Helvetica-Bold, Helvetica-BoldOblique, Courier, Courier-Oblique, Courier-Bold, Courier-BoldOblique, Symbol, and ZapfDingbats.
The built-in CJK fonts are referenced by language code: zh-Hant, zh-Hans, ja, ko.
- Font#getName()
- Get the font name.
- Font#encodeCharacter(unicode)
- Get the glyph index for a unicode character. Glyph zero (.notdef) is returned if the font does not have a glyph for the character.
- Font#advanceGlyph(glyph, wmode)
- Return advance width for a glyph in either horizontal or vertical writing mode.
- Font#isBold()
- Font#isItalic()
- Font#isMono()
- Font#isSerif()
- Returns true if font is bold/italic/monospaced/serif.
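Font#encodeCharacter() and Font#advanceGlyph() together allow a rough width estimate for a run of text. The sketch below assumes that advanceGlyph returns the advance for a font size of 1 (so the total is multiplied by the size) and only handles characters in the Basic Multilingual Plane; measureString is a hypothetical helper name.

```javascript
// Approximate width of 'string' set at 'size' points in horizontal
// writing mode (wmode 0).
function measureString(font, string, size) {
    var width = 0;
    var i, glyph;
    for (i = 0; i < string.length; ++i) {
        // charCodeAt yields UTF-16 code units, so only BMP characters
        // are handled correctly here.
        glyph = font.encodeCharacter(string.charCodeAt(i));
        width += font.advanceGlyph(glyph, 0);
    }
    return width * size;
}
```

Characters the font lacks map to glyph zero, so their width is whatever the font reports for .notdef.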
Image
Image objects are similar to Pixmaps, but can contain compressed data.
- new Image(pixmap or fileName)
- Create a new image from pixmap data, or load an image from a file.
- Image#getWidth()
- Image#getHeight()
- Image size in pixels.
- Image#getXResolution()
- Image#getYResolution()
- Image resolution in dots per inch.
- Image#getColorSpace()
- Image#getNumberOfComponents()
- Image#getBitsPerComponent()
- Image#getInterpolate()
- Returns true if interpolation was used during decoding.
- Image#getColorKey()
- Returns an array with 2 * N integers for an N component image with color key masking, or null if masking is not used. Each pair of integers defines an interval, and component values within that interval are not painted.
- Image#getDecode()
- Returns an array with 2 * N numbers for an N component image with color mapping, or null if mapping is not used. Each pair of numbers defines the lower and upper values to which the component values are mapped linearly.
- Image#getOrientation()
- Returns the orientation of the image:
- 1
- no rotation or flipping
- 2
- rotate counter clockwise by 90 degrees
- 4
- rotate counter clockwise by 180 degrees
- 5
- rotate counter clockwise by 270 degrees
- 6
- flip on X, then rotate counter clockwise by 90 degrees
- 7
- flip on X, then rotate counter clockwise by 180 degrees
- 8
- flip on X, then rotate counter clockwise by 270 degrees
- Image#setOrientation(orientation)
- Set the image orientation to the given orientation.
- Image#getImageMask()
- Returns true if this image is an image mask.
- Image#getMask()
- Get another Image used as a mask for this one.
- Image#toPixmap(scaledWidth, scaledHeight)
- Create a pixmap from the image. The scaledWidth and scaledHeight arguments are optional, but may be used to decode a down-scaled pixmap.
Document Writer
Document writer objects are used to create new documents in several formats.
- new DocumentWriter(filename, format, options)
- Create a new document writer to create a document with the specified format and output options. If format is null it is inferred from the filename extension. The options argument is a comma separated list of flags and key-value pairs. See below for more details.
- DocumentWriter#beginPage(mediabox)
- Begin rendering a new page. Returns a Device that can be used to render the page graphics.
- DocumentWriter#endPage(device)
- Finish the page rendering. The argument must be the same device object that was returned by the beginPage method.
- DocumentWriter#close()
- Finish the document and flush any pending output.
The output formats and options supported are the same as in the mutool convert command.
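The beginPage/endPage/close sequence can be wrapped in a small loop. In this sketch each page entry supplies a mediabox and a draw(device) callback; the function name writePages and both property names are hypothetical.

```javascript
// Write one page per entry in 'pages', then finish the document.
function writePages(writer, pages) {
    var i, device;
    for (i = 0; i < pages.length; ++i) {
        device = writer.beginPage(pages[i].mediabox);
        pages[i].draw(device); // issue device calls for this page
        writer.endPage(device); // must be the device beginPage returned
    }
    writer.close(); // flush any pending output
    return pages.length;
}
```

Inside mutool run the writer would typically come from new DocumentWriter(filename, format, options).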
PDFDocument and PDFObject
With MuPDF it is also possible to create, edit and manipulate PDF documents using low level access to the objects and streams contained in a PDF file. A PDFDocument object is also a Document object. You can test a Document object to see if it is safe to use as a PDFDocument by calling document.isPDF().
- new PDFDocument()
- Create a new empty PDF document.
- new PDFDocument(fileName)
- Load a PDF document from file.
- PDFDocument#save(fileName, options)
- Write the PDF document to file. The write options are a string of comma separated options (see the document writer options).
- PDFDocument#canBeSavedIncrementally()
- Returns true if the document can be saved incrementally. Some situations prevent incremental saves, e.g. when the document was repaired or redactions have been applied.
- PDFDocument#countVersions()
- Returns the number of versions of the document in a PDF file, typically 1 + the number of updates.
- PDFDocument#countUnsavedVersions()
- Returns the number of unsaved updates to the document.
- PDFDocument#validateChangeHistory()
- Check the history of the document, return the last version that checks out OK. Returns 0 if the entire history is ok, 1 if the next to last version is ok, but the last version has issues, etc.
- PDFDocument#hasUnsavedChanges()
- Returns true if the document has been changed since it was last opened or saved.
- PDFDocument#wasPureXFA()
- Returns true if the document was an XFA form without AcroForm fields.
- PDFDocument#wasRepaired()
- Returns true if the document was repaired when opened.
- PDFDocument#setPageLabels(index, style, prefix, start)
- Sets the page label numbering for the page and all pages following it, until the next page with an attached label. Style can be one of the following strings: "", "D", "R", "r", "A", or "a". Start is the ordinal with which to start numbering.
- PDFDocument#deletePageLabels(index)
- Removes any associated page label from the page.
- PDFDocument#getVersion()
- Returns the PDF document version as an integer multiplied by 10, so e.g. a PDF-1.4 document would return 14.
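Since PDFDocument#getVersion() returns the version multiplied by 10, turning it back into the familiar form is a one-line calculation. The helper name formatPDFVersion is hypothetical.

```javascript
// Turn the integer from getVersion() into "major.minor" form.
function formatPDFVersion(version) {
    return Math.floor(version / 10) + "." + (version % 10);
}

var versionLabel = formatPDFVersion(14); // the PDF-1.4 example from above
```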
PDF Object Access
A PDF document contains objects, similar to those in JavaScript: arrays, dictionaries, strings, booleans, and numbers. At the root of the PDF document is the trailer object, which contains pointers to the meta data dictionary and the catalog object. The catalog in turn contains the pages and other information.
Pointers in PDF are also called indirect references, and are of the form "32 0 R" (where 32 is the object number, 0 is the generation, and R is magic syntax). All functions in MuPDF dereference indirect references automatically.
PDF has two types of strings: /Names and (Strings). All dictionary keys are names.
Some dictionaries in PDF also have attached binary data. These are called streams, and may be compressed.
- PDFDocument#getTrailer()
- The trailer dictionary. This contains indirect references to the Root and Info dictionaries.
- PDFDocument#countObjects()
- Return the number of objects in the PDF. Object number 0 is reserved, and may not be used for anything.
- PDFDocument#createObject()
- Allocate a new numbered object in the PDF, and return an indirect reference to it. The object itself is uninitialized.
- PDFDocument#deleteObject(obj)
- Delete the object referred to by the indirect reference.
PDFObjects are always bound to the document that created them. Do NOT mix and match objects from one document with another document!
- PDFDocument#addObject(obj)
- Add 'obj' to the PDF as a numbered object, and return an indirect reference to it.
- PDFDocument#addStream(buffer, object)
- Create a stream object with the contents of 'buffer', add it to the PDF, and return an indirect reference to it. If object is defined, it will be used as the stream object dictionary.
- PDFDocument#addRawStream(buffer, object)
- Create a stream object with the contents of 'buffer', add it to the PDF, and return an indirect reference to it. If object is defined, it will be used as the stream object dictionary. The buffer must contain already compressed data that matches the Filter and DecodeParms.
- PDFDocument#newNull()
- PDFDocument#newBoolean(boolean)
- PDFDocument#newInteger(number)
- PDFDocument#newReal(number)
- PDFDocument#newString(string)
- PDFDocument#newByteString(byteString)
- PDFDocument#newName(string)
- PDFDocument#newIndirect(objectNumber, generation)
- PDFDocument#newArray()
- PDFDocument#newDictionary()
The following functions can be used to copy objects from one document to another:
- PDFDocument#graftObject(object)
- Deep copy an object into the destination document. This function will not remember previously copied objects. If you are copying several objects from the same source document using multiple calls, you should use a graft map instead.
- PDFDocument#newGraftMap()
- Create a graft map on the destination document, so that objects that have already been copied can be found again. Each graft map should only be used with one source document! Make sure to create a new graft map for each source document used.
- PDFGraftMap#graftObject(object)
- Use the graft map to copy objects, with the ability to remember previously copied objects.
- PDFGraftMap#graftPage(map, dstPageNumber, srcDoc, srcPageNumber)
- Graft a page at the given page number from the source document to the requested page number in the destination document connected to the map.
- PDFDocument#graftPage(dstDoc, dstPageNumber, srcDoc, srcPageNumber)
- Graft a page and its resources at the given page number from the source document to the requested page number in the destination document.
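When several objects come from the same source document, a single graft map can be shared between the calls. The sketch below uses only newGraftMap() and PDFGraftMap#graftObject() as listed above; the helper name graftAll is hypothetical.

```javascript
// Hypothetical helper: copy several objects that all belong to ONE source
// document into dstDoc. A single graft map is shared by all the calls, so
// objects that were already copied can be found again.
function graftAll(dstDoc, srcObjects) {
  var map = dstDoc.newGraftMap();
  var copies = [];
  var i;
  for (i = 0; i < srcObjects.length; ++i)
    copies.push(map.graftObject(srcObjects[i]));
  return copies;
}
```

For a second source document, call the helper again so that a fresh graft map is created.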
All functions that take PDF objects automatically translate between JavaScript objects and PDF objects using a few basic rules. Null, booleans, and numbers are translated directly. JavaScript strings are translated to PDF names, unless they are surrounded by parentheses: "Foo" becomes the PDF name /Foo and "(Foo)" becomes the PDF string (Foo).
Arrays and dictionaries are recursively translated to PDF arrays and dictionaries. Be aware of cycles though! The translation does NOT cope with cyclic references!
The translation goes both ways: PDF dictionaries and arrays can be accessed similarly to JavaScript objects and arrays by getting and setting their properties.
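The string rule and the warning about cycles can be illustrated in plain JavaScript. This is only a sketch of the rules as stated above, not MuPDF's own code: the helper names classifyString and hasCycle are hypothetical, and the handling of edge cases such as a lone "(" is an assumption.

```javascript
// Hypothetical helper: which PDF type would a JavaScript string become?
// "(Foo)" -> PDF string (Foo); anything else -> PDF name.
function classifyString(s) {
  if (s.length >= 2 && s.charAt(0) === "(" && s.charAt(s.length - 1) === ")")
    return { type: "string", value: s.substring(1, s.length - 1) };
  return { type: "name", value: s };
}

// Hypothetical helper: detect cyclic references in a plain JavaScript
// array/object before handing it to a function that translates it.
function hasCycle(value, ancestors) {
  var key;
  if (value === null || typeof value !== "object")
    return false;
  ancestors = ancestors || [];
  if (ancestors.indexOf(value) >= 0)
    return true; // value is its own ancestor
  ancestors.push(value);
  for (key in value)
    if (Object.prototype.hasOwnProperty.call(value, key))
      if (hasCycle(value[key], ancestors))
        return true;
  ancestors.pop(); // shared but non-cyclic values are fine
  return false;
}
```

Checking with hasCycle before calling, say, PDFDocument#addObject() is one way to fail early instead of relying on the translation to cope.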
- PDFObject#get(key or index)
- PDFObject#put(key or index, value)
- PDFObject#delete(key or index)
- Access dictionaries and arrays. Dictionaries and arrays can also be accessed using normal property syntax: obj.Foo = 42; delete obj.Foo; x = obj[5].
- PDFObject#resolve()
- If the object is an indirect reference, return the object it points to; otherwise return the object itself.
- PDFObject#compare(other)
- Compare the object to another one. Returns 0 on match, non-zero on mismatch. Streams always mismatch.
- PDFObject#isArray()
- PDFObject#isDictionary()
- PDFObject#forEach(function(key,value){...})
- Iterate over all the entries in a dictionary or array and call fun for each key-value pair.
- PDFObject#length
- Length of the array.
- PDFObject#push(item)
- Append item to the end of the array.
- PDFObject#toString()
- Returns the object as a pretty-printed string.
- PDFObject#valueOf()
- Convert a primitive PDF object to the corresponding JavaScript null, boolean, number or string value. Indirect PDF objects are converted to the string "R", while PDF names are converted to plain strings. PDF arrays and dictionaries are returned unchanged.
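As an illustration of PDFObject#forEach(), the following sketch collects the keys of a dictionary (or the indices of an array). The callback argument order (key, value) follows the signature listed above; the helper name listKeys is hypothetical.

```javascript
// Hypothetical helper: gather all keys (dictionary) or indices (array)
// of a PDF object by iterating over its entries.
function listKeys(obj) {
  var keys = [];
  obj.forEach(function (key, value) {
    keys.push(key);
  });
  return keys;
}
```

For example, print(listKeys(doc.getTrailer())) would be expected to show entries such as Root and Info.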
The only way to access a stream is via an indirect object, since all streams are numbered objects.
- PDFObject#isIndirect()
- Is the object an indirect reference?
- PDFObject#asIndirect()
- Return the object number the indirect reference points to.
- PDFObject#isStream()
- True if the object is an indirect reference pointing to a stream.
- PDFObject#readStream()
- Read the contents of the stream object into a Buffer.
- PDFObject#readRawStream()
- Read the raw contents of the stream object into a Buffer, without decoding them (the data stays compressed if the stream has a Filter).
- PDFObject#writeObject(obj)
- Update the object the indirect reference points to.
- PDFObject#writeStream(buffer)
- Update the contents of the stream the indirect reference points to. This will update the Length, Filter and DecodeParms automatically.
- PDFObject#writeRawStream(buffer)
- Update the contents of the stream the indirect reference points to. The buffer must contain already compressed data that matches the Filter and DecodeParms. This will update the Length automatically, but leave the Filter and DecodeParms untouched.
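Since streams are only reachable through indirect references, one way to find them is to walk over all object numbers. The sketch below combines countObjects(), newIndirect() and isStream(); the helper name countStreams is hypothetical, and it assumes generation 0 for every object and that isStream() simply returns false for unused object numbers.

```javascript
// Hypothetical helper: count how many numbered objects are streams.
// Object number 0 is reserved, so the loop starts at 1.
function countStreams(doc) {
  var n = doc.countObjects();
  var count = 0;
  var i, ref;
  for (i = 1; i < n; ++i) {
    ref = doc.newIndirect(i, 0); // assumption: generation 0
    if (ref.isStream())
      ++count;
  }
  return count;
}
```

Replacing the counter with ref.readStream() would give access to the decoded contents of each stream as a Buffer.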
Primitive PDF objects such as booleans, names, and numbers can usually be treated like JavaScript values. When that is not sufficient use these functions:
- PDFObject#isNull()
- Is the object the 'null' object?
- PDFObject#isBoolean()
- Is the object a boolean?
- PDFObject#asBoolean()
- Get the boolean primitive value.
- PDFObject#isNumber()
- Is the object a number?
- PDFObject#asNumber()
- Get the number primitive value.
- PDFObject#isName()
- Is the object a name?
- PDFObject#asName()
- Get the name as a string.
- PDFObject#isString()
- Is the object a string?
- PDFObject#asString()
- Convert a "text string" to a javascript unicode string.
- PDFObject#asByteString()
- Convert a string to an array of byte values.
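The type tests and conversions above can be chained into an explicit conversion routine. This is a sketch with a hypothetical helper name (toPrimitive); it only uses the methods listed in this section.

```javascript
// Hypothetical helper: convert a primitive PDFObject to a JavaScript value
// using the explicit is*/as* methods. Anything else is returned unchanged.
function toPrimitive(obj) {
  if (obj.isNull()) return null;
  if (obj.isBoolean()) return obj.asBoolean();
  if (obj.isNumber()) return obj.asNumber();
  if (obj.isName()) return obj.asName();
  if (obj.isString()) return obj.asString();
  return obj; // arrays, dictionaries, etc.
}
```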
PDF JavaScript actions
- PDFDocument#enableJS()
- Enable interpretation of document JavaScript actions.
- PDFDocument#disableJS()
- Disable interpretation of document JavaScript actions.
- PDFDocument#isJSSupported()
- Returns true if interpretation of document JavaScript actions is supported.
- PDFDocument#setJSEventListener(listener)
- Set a listener object whose onAlert method is called whenever a document JavaScript action triggers an alert.
PDF journalling
- PDFDocument#enableJournal()
- Activate journalling for the document.
- PDFDocument#getJournal()
- Returns a PDF journal object, described below.
- PDFDocument#beginOperation()
- Begin a journal operation.
- PDFDocument#beginImplicitOperation()
- Begin an implicit journal operation. Implicit operations are operations that happen due to other operations, e.g. updating an annotation.
- PDFDocument#endOperation()
- End a previously started normal or implicit operation. After this it can be undone/redone using the methods below.
- PDFDocument#canUndo()
- Returns true if undo is possible in this state.
- PDFDocument#canRedo()
- Returns true if redo is possible in this state.
- PDFDocument#undo()
- Move backwards in the undo history. Changing the document after this throws away all subsequent history.
- PDFDocument#redo()
- Move forwards in the undo history.
A PDF journal object contains a numbered array of operations and a reference into this list indicating the current position.
- PDFJournal.position:
- The current position in the journal.
- PDFJournal.steps:
- An array containing the name of each step in the journal.
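As a small illustration of the undo methods, the sketch below rewinds a document to the start of its undo history. The helper name undoAll is hypothetical; it assumes journalling has already been enabled for the document.

```javascript
// Hypothetical helper: undo every operation in the journal and
// return how many steps were undone.
function undoAll(doc) {
  var steps = 0;
  while (doc.canUndo()) {
    doc.undo();
    ++steps;
  }
  return steps;
}
```

The same loop with canRedo() and redo() would move forwards through the history again.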
PDF Page Access
All page objects are structured into a page tree, which defines the order the pages appear in.
- PDFDocument#countPages()
- Number of pages in the document.
- PDFDocument#findPage(number)
- Return the page object for a page number. The first page is number zero.
- PDFDocument#findPageNumber(pageObject)
- Given a pageObject, look up the page number in the document.
- PDFDocument#deletePage(number)
- Delete the numbered page.
- PDFDocument#insertPage(at, page)
- Insert the page object in the page tree at the given location. If 'at' is -1, the page is inserted at the end of the document.
Pages consist of a content stream, and a resource dictionary containing all of the fonts and images used.
- PDFDocument#addPage(mediabox, rotate, resources, contents)
- Create a new page object. Note: this function does NOT add it to the page tree.
- PDFDocument#addSimpleFont(font, encoding)
- Create a PDF object from the Font object as a simple font. Encoding is either "Latin" (CP-1252), "Greek" (ISO-8859-7), or "Cyrillic" (KOI-8U). The default is "Latin".
- PDFDocument#addCJKFont(font, language, wmode, style)
- Create a PDF object from the Font object as a UTF-16 encoded CID font for the given language ("zh-Hant", "zh-Hans", "ko", or "ja"), writing mode ("H" or "V"), and style ("serif" or "sans-serif").
- PDFDocument#addFont(font)
- Create a PDF object from the Font object as an Identity-H encoded CID font.
- PDFDocument#addImage(image)
- Create a PDF object from the Image object.
- PDFDocument#loadImage(obj)
- Load an Image from a PDF object (typically an indirect reference to an image resource).
- PDFPage#process(processor)
- Run through the page contents stream and call methods on the supplied PDF processor.
- PDFPage#toPixmap(transform, colorspace, alpha, renderExtra, usage)
- Render the page into a Pixmap using the given colorspace and alpha while applying the transform. Rendering of annotations/widgets can be disabled. A page can be rendered for e.g. "View" or "Print" usage.
- PDFPage#getTransform()
- Return the transform from Fitz page space (upper left page origin, y descending, 72 dpi) to PDF user space (arbitrary page origin, y ascending, UserUnit dpi). This may
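Because PDFDocument#addPage() only creates the page object, a second call is needed to make the page part of the document. The following sketch pairs it with insertPage(); the helper name appendPage is hypothetical, and the resources and contents arguments are assumed to have been prepared by the caller (for example with addObject() and addStream()).

```javascript
// Hypothetical helper: create a page object and append it to the page tree.
// rotate is fixed to 0 here; -1 means "at the end of the document".
function appendPage(doc, mediabox, resources, contents) {
  var page = doc.addPage(mediabox, 0, resources, contents);
  doc.insertPage(-1, page);
  return page;
}
```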
PDF Processor
A PDF processor provides callbacks that get called for each PDF operator handled by PDFPage#process(). The callbacks whose names start with op_ correspond to each PDF operator. Refer to the PDF specification for what these do and what the callback arguments are.
- PDFProcessor#push_resources(resources)
- PDFProcessor#pop_resources()
General graphics state callbacks:
- PDFProcessor#op_w(lineWidth)
- PDFProcessor#op_j(lineJoin)
- PDFProcessor#op_J(lineCap)
- PDFProcessor#op_M(miterLimit)
- PDFProcessor#op_d(dashPattern, phase)
- PDFProcessor#op_ri(intent)
- PDFProcessor#op_i(flatness)
- PDFProcessor#op_gs(name, extGState)
Special graphics state:
- PDFProcessor#op_q()
- PDFProcessor#op_Q()
- PDFProcessor#op_cm(a, b, c, d, e, f)
Path construction:
- PDFProcessor#op_m(x, y)
- PDFProcessor#op_l(x, y)
- PDFProcessor#op_c(x1, y1, x2, y2, x3, y3)
- PDFProcessor#op_v(x2, y2, x3, y3)
- PDFProcessor#op_y(x1, y1, x3, y3)
- PDFProcessor#op_h()
- PDFProcessor#op_re(x, y, w, h)
Path painting:
- PDFProcessor#op_S()
- PDFProcessor#op_s()
- PDFProcessor#op_F()
- PDFProcessor#op_f()
- PDFProcessor#op_fstar()
- PDFProcessor#op_B()
- PDFProcessor#op_Bstar()
- PDFProcessor#op_b()
- PDFProcessor#op_bstar()
- PDFProcessor#op_n()
Clipping paths:
- PDFProcessor#op_W()
- PDFProcessor#op_Wstar()
Text objects:
- PDFProcessor#op_BT()
- PDFProcessor#op_ET()
Text state:
- PDFProcessor#op_Tc(charSpace)
- PDFProcessor#op_Tw(wordSpace)
- PDFProcessor#op_Tz(scale)
- PDFProcessor#op_TL(leading)
- PDFProcessor#op_Tf(name, size)
- PDFProcessor#op_Tr(render)
- PDFProcessor#op_Ts(rise)
Text positioning:
- PDFProcessor#op_Td(tx, ty)
- PDFProcessor#op_TD(tx, ty)
- PDFProcessor#op_Tm(a, b, c, d, e, f)
- PDFProcessor#op_Tstar()
Text showing:
- PDFProcessor#op_TJ(textArray), where textArray is an array of numbers and strings
- PDFProcessor#op_Tj(stringOrByteArray)
- PDFProcessor#op_squote(stringOrByteArray)
- PDFProcessor#op_dquote(wordSpace, charSpace, stringOrByteArray)
Type 3 fonts:
- PDFProcessor#op_d0(wx, wy)
- PDFProcessor#op_d1(wx, wy, llx, lly, urx, ury)
Color:
- PDFProcessor#op_CS(name, colorspace)
- PDFProcessor#op_cs(name, colorspace)
- PDFProcessor#op_SC_color(color)
- PDFProcessor#op_sc_color(color)
- PDFProcessor#op_SC_pattern(name, patternID, color)
- API not settled, arguments may change in the future.
- PDFProcessor#op_sc_pattern(name, patternID, color)
- API not settled, arguments may change in the future.
- PDFProcessor#op_SC_shade(name, shade)
- API not settled, arguments may change in the future.
- PDFProcessor#op_sc_shade(name, shade)
- API not settled, arguments may change in the future.
- PDFProcessor#op_G(gray)
- PDFProcessor#op_g(gray)
- PDFProcessor#op_RG(r, g, b)
- PDFProcessor#op_rg(r, g, b)
- PDFProcessor#op_K(c, m, y, k)
- PDFProcessor#op_k(c, m, y, k)
Shadings, images and XObjects:
- PDFProcessor#op_BI(image, colorspace)
- PDFProcessor#op_sh(name, shade)
- API not settled, arguments may change in the future.
- PDFProcessor#op_Do_image(name, image)
- PDFProcessor#op_Do_form(xobject, resources)
Marked content:
- PDFProcessor#op_MP(tag)
- PDFProcessor#op_DP(tag, raw)
- PDFProcessor#op_BMC(tag)
- PDFProcessor#op_BDC(tag, raw)
- PDFProcessor#op_EMC()
Compatibility:
- PDFProcessor#op_BX()
- PDFProcessor#op_EX()
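A processor is an ordinary JavaScript object with some of the callbacks above defined on it. The sketch below counts how often a few operators occur; the factory name makeOperatorCounter is hypothetical, and it is an assumption that callbacks which are not defined on the object are simply skipped.

```javascript
// Hypothetical factory: build a processor object that counts a handful of
// path, painting and text-showing operators. The counts are kept in the
// 'counts' property, keyed by operator name.
function makeOperatorCounter() {
  var counts = {};
  function bump(name) {
    return function () {
      counts[name] = (counts[name] || 0) + 1;
    };
  }
  return {
    counts: counts,
    op_m: bump("m"),
    op_l: bump("l"),
    op_c: bump("c"),
    op_re: bump("re"),
    op_S: bump("S"),
    op_f: bump("f"),
    op_Tj: bump("Tj"),
    op_TJ: bump("TJ")
  };
}
```

A possible use is var counter = makeOperatorCounter(); page.process(counter); print(repr(counter.counts)), where page is a PDFPage obtained elsewhere.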
Embedded files in PDFs
After embedding a file into a PDF, it can be connected to an annotation using PDFAnnotation#setFilespec()
- PDFDocument#addEmbeddedFile(filename, mimetype, contents, creationDate, modificationDate, addChecksum)
- Embed a file into the document, along with its name, mimetype, creation and modification dates. If a checksum is added, the file contents can be verified later. An indirect reference to the filespec PDF object is returned.
- PDFDocument#getEmbeddedFileParams(filespecObject)
- Return an EmbeddedFileParams describing the file referenced by the filespec object.
- PDFDocument#getEmbeddedFileContents(filespecObject)
- Returns a Buffer with the contents of the embedded file referenced by the filespec object.
- PDFDocument#verifyEmbeddedFileChecksum(filespecObject)
- Verify the MD5 checksum of the embedded file contents.
An EmbeddedFileParams object contains metadata about an embedded file.
- EmbeddedFileParams.filename:
- The name of the embedded file.
- EmbeddedFileParams.mimetype:
- The MIME type of the embedded file, or undefined if none exists.
- EmbeddedFileParams.size:
- The size in bytes of the embedded file contents.
- EmbeddedFileParams.creationDate:
- The creation date of the embedded file.
- EmbeddedFileParams.modificationDate:
- The modification date of the embedded file.
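The parameters can be turned into a one-line summary, for instance when listing attachments. This is a sketch; the helper name describeEmbeddedFile is hypothetical and the filespec object is assumed to come from addEmbeddedFile() or an annotation.

```javascript
// Hypothetical helper: summarise an embedded file as text.
// mimetype may be undefined, so a placeholder is used in that case.
function describeEmbeddedFile(doc, filespec) {
  var params = doc.getEmbeddedFileParams(filespec);
  var mimetype = params.mimetype === undefined ? "unknown type" : params.mimetype;
  return params.filename + " (" + mimetype + ", " + params.size + " bytes)";
}
```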
OutlineIterator
An outline iterator can be used to walk over all the items in an Outline and query their properties. To be able to insert items at the end of a list of sibling items, it can also walk one item past the end of the list.
- OutlineIterator#item()
- Return an OutlineIteratorItem or undefined if out of range.
- OutlineIterator#next()
- OutlineIterator#prev()
- OutlineIterator#up()
- OutlineIterator#down()
- Move the iterator position next/prev/up/down. Returns 0 if the new position has a valid item, or 1 if the position contains no valid item, but one may be inserted at this position.
- OutlineIterator#insert(item)
- Insert item before the current item. The position does not change. Returns 0 if the position has a valid item, or 1 if the position has no valid item.
- OutlineIterator#delete()
- Delete the current item. This implicitly moves to the next item. Returns 0 if the new position has a valid item, or 1 if the position contains no valid item, but one may be inserted at this position.
- OutlineIterator#update(item)
- Updates the current item with the properties of the supplied item.
An OutlineIteratorItem is a dictionary with keys for:
- OutlineIteratorItem.title:
- The title of the item.
- OutlineIteratorItem.uri:
- A URI pointing to the destination. Likely to be a document internal link that can be resolved by Document#resolveLink(), otherwise a link to a web page.
- OutlineIteratorItem.open:
- Returns true if the item should be opened when shown in a tree view.
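Walking a list of sibling items comes down to alternating item() and next(). The sketch below collects the titles on the iterator's current level, relying on the return values described above (0 for a valid item); the helper name siblingTitles is hypothetical.

```javascript
// Hypothetical helper: collect the titles of the current item and all
// following siblings. Stops when next() reports no valid item.
function siblingTitles(iter) {
  var titles = [];
  var item = iter.item();
  while (item) {
    titles.push(item.title);
    if (iter.next() !== 0)
      break;
    item = iter.item();
  }
  return titles;
}
```

Descending with down() before calling the helper, and returning with up() afterwards, would extend this to nested levels.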
PDF Annotations
PDFAnnotations belong to a specific PDFPage and may be created/changed/removed. Because annotation appearances may change (for several reasons), it is possible to scan through the annotations on a page and query whether a re-render is necessary. Finally, redaction annotations can be applied to a PDFPage, destructively removing content from the page.
- PDFPage#getAnnotations()
- Return array of all annotations on the page.
- PDFPage#createAnnotation(type)
- Create a new blank annotation of a given annotation type: "Text", "Link", "FreeText", "Line", "Square", "Circle", "Polygon", "PolyLine", "Highlight", "Underline", "Squiggly", "StrikeOut", "Redact", "Stamp", "Caret", "Ink", "Popup", "FileAttachment", "Sound", "Movie", "RichMedia", "Widget", "Screen", "PrinterMark", "TrapNet", "Watermark", "3D" or "Projection".
- PDFPage#deleteAnnotation(annot)
- Delete the annotation from the page.
- PDFPage#update()
- Loop through all annotations of the page and update them. Returns true if re-rendering is needed because at least one annotation was changed (due to either events or javascript actions or annotation editing).
- PDFPage#applyRedactions(blackboxes, imagemethod)
- Apply redaction annotations to the page. The blackboxes argument controls whether black boxes are drawn at each redaction. The imagemethod argument controls whether affected images are ignored, entirely redacted, or have just the covered part redacted.
These are general methods to interact with annotations:
- PDFAnnotation#bound()
- Returns a rectangle containing the location and dimension of the annotation.
- PDFAnnotation#run(device, transform)
- Calls the device functions to draw the annotation.
- PDFAnnotation#toPixmap(transform, colorspace, alpha)
- Render the annotation into a Pixmap, using the transform and colorspace.
- PDFAnnotation#toDisplayList()
- Record the contents of the annotation into a DisplayList.
- PDFAnnotation#getObject()
- Get the underlying PDF object for an annotation.
- PDFAnnotation#process(processor)
- Run through the annotation appearance stream and call methods on the supplied PDF processor.
- PDFAnnotation#setAppearance(appearance, state, transform, displaylist)
- PDFAnnotation#setAppearance(appearance, state, transform, bbox, resources, contents)
- Set the annotation appearance stream for the given appearance ("N", "R" or "D") and state (may be e.g. "On", "Off"). The desired appearance is given as a transform along with either a display list or a bounding box, a PDF dictionary of resources and a content stream.
- PDFAnnotation#update()
- Update the appearance stream to account for changes in the annotation.
- PDFAnnotation#getHot(), #setHot(hot)
- Get/set the annotation as being hot, i.e. that the pointer is hovering over the annotation.
- PDFAnnotation#getHiddenForEditing(), #setHiddenForEditing(hidden)
- Get/set a special annotation hidden flag for editing. This flag prevents the annotation from being rendered.
These properties are available for all annotation types.
- PDFAnnotation#getType()
- Return the annotation type.
- PDFAnnotation#getFlags(), #setFlags(flags)
- Get/set the annotation flags.
- PDFAnnotation#getContents(), #setContents(text)
- Get/set the annotation contents.
- PDFAnnotation#getBorder(), #setBorder(width)
- Get/set the annotation border line width in points. Use PDFAnnotation#setBorderWidth() to avoid removing the border effect.
- PDFAnnotation#getColor(), #setColor(color)
- Get/set the annotation color, represented as an array of up to 4 component values.
- PDFAnnotation#getOpacity(), #setOpacity(opacity)
- Get/set the annotation opacity.
- PDFAnnotation#getCreationDate(), #setCreationDate(milliseconds)
- Get the annotation creation date as a Date object. Set it to the number of milliseconds since the epoch.
- PDFAnnotation#getModificationDate(), #setModificationDate(milliseconds)
- Get the annotation modification date as a Date object. Set it to the number of milliseconds since the epoch.
- PDFAnnotation#getQuadding(), #setQuadding(value)
- Get/set the annotation quadding (justification), 0 for left-justified, 1 for centered, 2 for right-justified.
- PDFAnnotation#getLanguage(), #setLanguage(language)
- Get/set the annotation language (getting returns the inherited document language if the annotation has none of its own).
These properties are only present for some annotation types, so support for them must be checked before use.
- PDFAnnotation#getRect(), #setRect(rect)
- Get/set the annotation bounding box.
- PDFAnnotation#getDefaultAppearance(), #setDefaultAppearance(font, size, color)
- Get/set the default text appearance used for free text annotations.
- PDFAnnotation#hasInteriorColor(), #getInteriorColor(), #setInteriorColor(color)
- Check for support/get/set the annotation interior color, represented as an array of up to 4 component values.
- PDFAnnotation#hasAuthor(), #getAuthor(), #setAuthor(author)
- Check for support/get/set the annotation author.
- PDFAnnotation#hasLineEndingStyles(), #getLineEndingStyles(), #setLineEndingStyles(start, end)
- Check for support/get/set the annotation line ending styles, either of "None", "Square", "Circle", "Diamond", "OpenArrow", "ClosedArrow", "Butt", "ROpenArrow", "RClosedArrow" or "Slash".
- PDFAnnotation#hasIcon(), #getIcon(), #setIcon(iconname)
- Check for support/get/set the annotation icon. Standard icon names for:
- File attachment annotations:
- "Graph", "PaperClip", "PushPin" and "Tag".
- Sound annotations:
- "Mic" and "Speaker".
- Stamp annotations:
- "Approved", "AsIs", "Confidential", "Departmental", "Draft", "Experimental", "Expired", "Final", "ForComment", "ForPublicRelease", "NotApproved", "NotForPublicRelease", "Sold" and "TopSecret".
- Text annotations:
- "Comment", "Help", "Insert", "Key", "NewParagraph", "Note" and "Paragraph".
- PDFAnnotation#hasLine(), #getLine(), #setLine(endpoints)
- Check for support/get/set line end points, represented by an array of two points, each represented as an [x, y] array.
- PDFAnnotation#hasOpen(), #isOpen(), #setIsOpen(state)
- Check for support/get/set annotation open state, represented as a boolean.
- PDFAnnotation#hasFilespec(), #getFilespec(), #setFilespec(filespecObject)
- Check for support/get/set annotation file specification, represented by a PDF object.
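The support checks fit naturally into annotation creation. The sketch below creates a "Text" annotation and only sets the author when hasAuthor() reports support; the helper name addTextNote is hypothetical, and it is an assumption that text annotations accept setRect().

```javascript
// Hypothetical helper: add a text (sticky note) annotation to a page.
// Optional properties are guarded by their has*() check.
function addTextNote(page, rect, contents, author) {
  var annot = page.createAnnotation("Text");
  annot.setRect(rect); // assumption: supported for "Text" annotations
  annot.setContents(contents);
  if (annot.hasAuthor())
    annot.setAuthor(author);
  annot.update(); // regenerate the appearance stream
  return annot;
}
```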
The border drawn around some annotations can be controlled by:
- PDFAnnotation#hasBorder(), #getBorderStyle(), #setBorderStyle(style)
- Check for support/get/set the annotation border style, either of "Solid" or "Dashed".
- PDFAnnotation#getBorderWidth(), #setBorderWidth(width)
- Get/set the border width in points. Retain any existing border effects.
- PDFAnnotation#getBorderDashCount()
- Returns the number of items in the border dash pattern.
- PDFAnnotation#getBorderDashItem(i)
- Returns the length of dash pattern item i.
- PDFAnnotation#setBorderDashPattern(dashpattern)
- Set the annotation border dash pattern to the given array of dash item lengths.
- PDFAnnotation#clearBorderDash()
- Clear the entire border dash pattern for an annotation.
- PDFAnnotation#addBorderDashItem(length)
- Append an item (of the given length) to the end of the border dash pattern.
Annotations that have a border effect allow the effect to be controlled by:
- PDFAnnotation#hasBorderEffect(), #getBorderEffect(), #setBorderEffect(effect)
- Check for support/get/set the annotation border effect, either of "None" or "Cloudy".
- PDFAnnotation#getBorderEffectIntensity(), #setBorderEffectIntensity(intensity)
- Get/set the annotation border effect intensity. Recommended values are between 0 and 2 inclusive.
Ink annotations consist of a number of strokes, each consisting of a sequence of vertices between which a smooth line will be drawn. These can be controlled by:
- PDFAnnotation#hasInkList(), #getInkList(), #setInkList(inkList)
- Check for support/get/set the annotation ink list, represented as an array of strokes, each an array of points each an array of its X/Y coordinates.
- PDFAnnotation#clearInkList()
- Clear the list of ink strokes for the annotation.
- PDFAnnotation#addInkList(stroke)
- To the list of strokes, append a stroke, represented as an array of vertices each an array of its X/Y coordinates.
- PDFAnnotation#addInkListStroke()
- Add a new empty stroke to the ink annotation.
- PDFAnnotation#addInkListStrokeVertex(vertex)
- Append a vertex to end of the last stroke in the ink annotation. The vertex is an array of its X/Y coordinates.
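The nesting of the ink list (list of strokes, each a list of [x, y] vertices) is easiest to see when a stroke is generated in code. This sketch builds one wavy stroke; the helper name makeWaveStroke and its parameters are hypothetical.

```javascript
// Hypothetical helper: build one stroke as an array of [x, y] vertices
// following a single sine period across the given width.
function makeWaveStroke(x0, y0, width, amplitude, segments) {
  var stroke = [];
  var i, t;
  for (i = 0; i <= segments; ++i) {
    t = i / segments;
    stroke.push([x0 + t * width, y0 + amplitude * Math.sin(t * 2 * Math.PI)]);
  }
  return stroke;
}
```

An ink list with a single stroke would then be [makeWaveStroke(100, 100, 200, 20, 16)], which could be passed to setInkList(), or the stroke alone to addInkList().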
Text markup and redaction annotations consist of a set of quadrilaterals controlled by:
- PDFAnnotation#hasQuadPoints(), #getQuadPoints(), #setQuadPoints(quadPoints)
- Check for support/get/set the annotation quadpoints, describing the areas affected by text markup annotations and link annotations.
- PDFAnnotation#clearQuadPoints()
- Clear the list of quad points for the annotation.
- PDFAnnotation#addQuadPoint(quadpoint)
- Append a single quad point as an array of 8 elements, where each pair are the X/Y coordinates of a corner of the quad.
Polygon and polyline annotations consist of a sequence of vertices with a straight line between them. Those can be controlled by:
- PDFAnnotation#hasVertices(), #getVertices(), #setVertices(vertices)
- Check for support/get/set the annotation vertices, represented as an array of vertices each an array of its X/Y coordinates.
- PDFAnnotation#clearVertices()
- Clear the list of vertices for the annotation.
- PDFAnnotation#addVertex(vertex)
- Append a single vertex as an array of its X/Y coordinates.
Stamp annotations have the option to set a custom image as their appearance.
- PDFAnnotation#setStampImage(image)
- Set a custom image appearance for a stamp annotation.
PDF Widgets
- PDFPage#getWidgets()
- Return array of all widgets on the page.
- PDFWidget#getFieldType()
- Return string indicating type of widget: "button", "checkbox", "combobox", "listbox", "radiobutton", "signature" or "text".
- PDFWidget#getFieldFlags()
- Return the field flags. Refer to the PDF specification for their meanings.
- PDFWidget#getRect(), #setRect(rect)
- Get/set the widget bounding box.
- PDFWidget#getMaxLen()
- Get maximum allowed length of the string value.
- PDFWidget#getValue(), #setTextValue(value), #setChoiceValue(value)
- Get/set the widget string value.
- PDFWidget#toggle()
- Toggle the state of the widget, returns 1 if the state changed.
- PDFWidget#getOptions()
- Returns an array with one text string describing the state of each kid of radio button/checkbox field.
- PDFWidget#layoutTextWidget()
- Layout the value of a text widget. Returns a TextLayout object described below.
- PDFWidget#isReadOnly()
- Returns true if the value is read-only and the widget cannot be interacted with.
- PDFWidget#getLabel()
- Get the field name as a string.
- PDFWidget#getEditingState(), #setEditingState()
- Get/set whether the widget is in editing state. When in editing state, any changes to the widget value will not cause any side-effects such as changing other widgets or running JavaScript. This is intended for e.g. when a text widget is interactively having characters typed into it. Once editing is finished, the state should be reverted back before updating the widget value again.
- PDFWidget#update()
- Update the appearance stream to account for changes to the widget.
- PDFWidget#isSigned()
- Returns true if the signature is signed.
- PDFWidget#validateSignature()
- Returns the number of updates ago when the signature became invalid. Returns 0 if the signature is still valid, 1 if it became invalid during the last save, etc.
- PDFWidget#checkCertificate()
- Returns "OK" if signature checked out OK, otherwise a text string containing an error message, e.g. "Self-signed certificate." or "Signature invalidated by change to document.", etc.
- PDFWidget#getSignatory()
- Returns a text string with the distinguished name from a signed signature, or a text string with an error message.
- PDFWidget#previewSignature(signer, signatureConfig, image, reason, location)
- Return a Pixmap preview of what the signature would look like if signed with the given configuration. Reason and location may be undefined, in which case they are not shown. The signature configuration is described below.
- PDFWidget#sign(signer, signatureConfig, image, reason, location)
- Sign the signature with the given configuration. Reason and location may be undefined, in which case they are not shown. The signature configuration is described below.
- PDFWidget#clearSignature()
- Clear a signed signature, making it unsigned again.
- new PDFPKCS7Signer(filename, password)
- Read a certificate and private key from a pfx file and create a signer to hold this information. Used with PDFWidget#sign().
A signature configuration object specifies what to include in the signature appearance, using these keys:
- showLabels:
- Whether to include both labels and values or just values on the right hand side.
- showDN:
- Whether to include the distinguished name on the right hand side.
- showTextName:
- Whether to include the name of the signatory on the right hand side.
- showDate:
- Whether to include the date of signing on the right hand side.
- showGraphicName:
- Whether to include the signatory name on the left hand side.
- showLogo:
- Whether to include the MuPDF logo in the background.
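A signature configuration can be written as a plain object literal. The values below are only an example, and it is an assumption that each key takes a boolean.

```javascript
// Example configuration (assumed boolean values): show everything
// except the MuPDF logo in the background.
var signatureConfig = {
  showLabels: true,
  showDN: true,
  showTextName: true,
  showDate: true,
  showGraphicName: true,
  showLogo: false
};
```

Such an object would be passed as the signatureConfig argument of PDFWidget#previewSignature() or PDFWidget#sign().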
- PDFWidget#eventEnter()
- Trigger the event when the pointing device enters a widget's active area.
- PDFWidget#eventExit()
- Trigger the event when the pointing device exits a widget's active area.
- PDFWidget#eventDown()
- Trigger the event when the pointing device's button is depressed within a widget's active area.
- PDFWidget#eventUp()
- Trigger the event when the pointing device's button is released within a widget's active area.
- PDFWidget#eventFocus()
- Trigger the event when a widget gains input focus.
- PDFWidget#eventBlur()
- Trigger the event when a widget loses input focus.
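The widget methods in this section can be combined to fill in form fields from a script. The sketch below sets every editable text widget on a page to the same value; the helper name setTextFields is hypothetical.

```javascript
// Hypothetical helper: set the value of all text widgets on a page that
// are not read-only, and return how many widgets were changed.
function setTextFields(page, value) {
  var widgets = page.getWidgets();
  var changed = 0;
  var i, widget;
  for (i = 0; i < widgets.length; ++i) {
    widget = widgets[i];
    if (widget.getFieldType() === "text" && !widget.isReadOnly()) {
      widget.setTextValue(value);
      widget.update(); // refresh the appearance stream
      ++changed;
    }
  }
  return changed;
}
```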
A TextLayout object describes the layout of the text value of a text widget, with keys:
- TextLayout.matrix:
- TextLayout.invMatrix:
- Normal and inverted transform matrices for the layouted text.
- TextLayout.lines:
- An array of text lines belonging to the layouted text:
- lines.x:
- lines.y:
- The coordinate for the text line.
- lines.fontSize:
- The text size used for the layouted text line.
- lines.index:
- The index of the beginning of the line in the text string.
- lines.rect:
- The bounding rectangle for the text line.
- lines.chars:
- An array of characters in the text line:
- chars.x:
- chars.advance:
- The position and advance of the character.
- chars.index:
- The index of the character in the text string.
- chars.rect:
- The bounding rectangle for the character.
- new Archive(path)
- Create a new archive based either on a tar- or zip-file or the contents of a directory.
- Archive#getFormat()
- Returns a string describing the archive format.
- Archive#countEntries()
- Returns the number of entries in the archive.
- Archive#listEntry(idx)
- Returns the name of entry number idx in the archive.
- Archive#hasEntry(name)
- Returns true if an entry of the given name exists in the archive.
- Archive#readEntry(name)
- Returns the contents of the entry of the given name.
- new MultiArchive()
- Create a new empty multi archive.
- MultiArchive#mountArchive(subArchive, path)
- Add an archive to the set of archives handled by a multi archive. If path is null, the subArchive contents appear at the top-level, otherwise they will appear prefixed by the string path.
- new TreeArchive()
- Create a new empty tree archive.
- TreeArchive#add(name, buffer)
- Add a named buffer to a tree archive.
Story
- new Story(contents, user_css, em, archive)
- Create a new story with the given contents, formatted according to the provided user css and em size, and an archive to look up images, etc.
- Story#document()
- Return a DOM for an unplaced story. This allows adding content before placing the story.
- Story#place(rect)
- Place (or continue placing) a story into the supplied rectangle, returning a PlacementResult described below. Call Story#draw() to draw the placed content before calling Story#place() again to continue placing remaining content.
- Story#draw(device, matrix)
- Draw the placed story to the given device with the given transform.
- PlacementResult.filled:
- The rectangle of the actual area that was used.
- PlacementResult.more:
- True if more content remains to be placed; false if all of the content has been placed.
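The place and draw cycle described above can be sketched as a loop that runs until PlacementResult.more is false. This is a minimal sketch, not a definitive implementation: the function drawStoryPages and its beginPage, endPage and maxPages parameters are hypothetical, with beginPage expected to return a device for a fresh page (for example one opened with a document writer). The identity transform is written as a six-number array.

```javascript
"use strict";

// Hypothetical helper: place and draw a story page by page.
// beginPage() must return a device for a new page; endPage() finishes it.
// maxPages bounds the loop in case the rectangle is too small for any
// content to ever fit. Returns the number of pages produced.
function drawStoryPages(story, pageRect, beginPage, endPage, maxPages) {
    var identity = [1, 0, 0, 1, 0, 0];
    var pages = 0;
    var result;
    do {
        var device = beginPage();
        result = story.place(pageRect);   // lay out as much as fits
        story.draw(device, identity);     // draw before placing again
        endPage();
        ++pages;
    } while (result.more && pages < maxPages);
    return pages;
}
```

Each iteration draws the placed content before calling place again, which is the order the Story#place description requires.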
DOM
- DOM#body()
- Return a DOM for the body element.
- DOM#documentElement()
- Return a DOM for the top level element.
- DOM#createElement(tag)
- Create an element with the given tag type, but do not link it into the DOM yet.
- DOM#createTextNode(text)
- Create a text node with the given text contents, but do not link it into the DOM yet.
- DOM#find(tag, attribute, value)
- Find the element matching the tag, attribute and value. Set any of these to null to match anything.
- DOM#findNext(tag, attribute, value)
- Find the next element matching the tag, attribute and value. Set any of these to null to match anything.
- DOM#appendChild(dom, childdom)
- Insert an element as the last child of a parent, unlinking the child from its current position if required.
- DOM#insertBefore(dom, elementDom)
- Insert an element before this element, unlinking the new element from its current position if required.
- DOM#insertAfter(dom, elementDom)
- Insert an element after this element, unlinking the new element from its current position if required.
- DOM#remove()
- Remove an element from the DOM. The element can be added back elsewhere if required.
- DOM#clone()
- Clone an element (and its children). The clone is not yet linked into the DOM.
- DOM#firstChild()
- Return the first child of the element as a DOM, or null if no child exists.
- DOM#parent()
- Return the parent of the element as a DOM, or null if no parent exists.
- DOM#next()
- Return the next element as a DOM, or null if no such element exists.
- DOM#previous()
- Return the previous element as a DOM, or null if no such element exists.
- DOM#addAttribute(attribute, value)
- Add an attribute with the given value and return the updated element as a DOM.
- DOM#removeAttribute(attribute)
- Remove the specified attribute from the element.
- DOM#attribute(attribute)
- Return the element's attribute value as a string, or null if no such attribute exists.
- DOM#getAttributes()
- Return an object whose property names and values correspond to the element's attribute names and values.
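The navigation methods above are enough to traverse a whole tree. The following minimal sketch walks an element and its descendants using firstChild() and next(), and collects the values of one attribute with attribute(). The helper names walk and collectAttribute are hypothetical, and the sketch assumes that every node reached this way supports attribute() and that next() returns the following sibling.

```javascript
"use strict";

// Hypothetical helper: visit an element and all of its descendants,
// parents before children, siblings in order.
function walk(element, visit, depth) {
    depth = depth || 0;
    visit(element, depth);
    for (var child = element.firstChild(); child; child = child.next())
        walk(child, visit, depth + 1);
}

// Hypothetical helper: collect the values of one attribute from every
// element in the tree that has it.
function collectAttribute(root, name) {
    var values = [];
    walk(root, function (element) {
        var value = element.attribute(name);
        if (value !== null && value !== undefined)
            values.push(value);
    });
    return values;
}
```

For a story, a call such as collectAttribute(story.document().body(), "id") would be the expected starting point, under the same assumptions.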
TODO
There are areas in MuPDF that still need bindings to access from JavaScript:
- Shadings