Class PdfPageText

Namespace
nebulae.dotPDFium
Assembly
dotPDFium.dll
public class PdfPageText : PdfObject, IDisposable
Inheritance
PdfPageText
Implements
IDisposable

Properties

CountChars

Gets the total number of characters in the associated PDF text.

public int CountChars { get; }

Property Value

int

WebLinkCount

Gets the number of web links associated with the current PDF text object.

public int WebLinkCount { get; }

Property Value

int

Methods

CloseWebLinks()

Releases resources associated with the web links and resets the handle.

public void CloseWebLinks()

Remarks

Call this method to clean up resources when web link processing is no longer needed. Failing to call it may result in resource leaks.

CountRects(int, int)

Counts how many bounding rectangles exist for a specific range of characters in the associated PDF text.

public int CountRects(int startIndex, int count)

Parameters

startIndex int

The index of the first character in the range to count bounding rectangles for

count int

The number of characters in the range

Returns

int

The number of bounding rectangles that exist for the specified range of characters

Exceptions

ObjectDisposedException

Dispose(bool)

Releases the resources held by the PdfPageText object. This method is called when the object is disposed.

protected override void Dispose(bool disposing)

Parameters

disposing bool

Whether to dispose of managed resources

Find(string, bool, bool, int)

Searches for the specified text in the associated PDF text. The search can be case-sensitive and/or whole-word only.

public PdfTextSearch Find(string searchTerm, bool matchCase = false, bool matchWholeWord = false, int startIndex = 0)

Parameters

searchTerm string

The term to search for

matchCase bool

A boolean indicating whether or not the search should be case-sensitive

matchWholeWord bool

A flag indicating whether the search should match whole words only

startIndex int

The index of the character to begin searching at

Returns

PdfTextSearch

A PdfTextSearch result

Exceptions

ArgumentException

Throws if the search term is null or empty

dotPDFiumException

Throws on a PDFium library error

GetChar(int)

Gets the character at the specified index in the associated PDF text.

public uint GetChar(int index)

Parameters

index int

Returns

uint

Exceptions

ObjectDisposedException
ArgumentOutOfRangeException
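
GetChar returns a uint rather than a char, so the value usually needs converting before it can be shown as text. The following is a minimal sketch, assuming the returned value is a Unicode code point (PDFium's own FPDFText_GetUnicode returns one, but this page does not confirm that GetChar wraps it). CodePointSketch and ToText are hypothetical names, not part of dotPDFium.

```csharp
using System;

// Hypothetical helper for illustration; not part of dotPDFium.
public static class CodePointSketch
{
    // Assumption: the uint returned by GetChar is a Unicode code point.
    public static string ToText(uint codePoint)
    {
        bool isSurrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        if (codePoint > 0x10FFFF || isSurrogate)
        {
            // char.ConvertFromUtf32 would throw for these values,
            // so fall back to U+FFFD REPLACEMENT CHARACTER.
            return "\uFFFD";
        }
        // Code points above U+FFFF become a surrogate pair (two chars).
        return char.ConvertFromUtf32((int)codePoint);
    }
}
```

With an already-open PdfPageText named pageText that holds at least one character, the call would look like `string first = CodePointSketch.ToText(pageText.GetChar(0));`.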

GetCharAngle(int)

Retrieves the rotation angle of the character at the specified index within the text object.

public float GetCharAngle(int index)

Parameters

index int

The zero-based index of the character whose rotation angle is to be retrieved.
Must be within the range of 0 to the total number of characters minus one.

Returns

float

The rotation angle of the character at the specified index, in degrees.

Exceptions

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

GetCharBox(int, out double, out double, out double, out double)

Gets the bounding box of the character at the specified index in the associated PDF text.

public bool GetCharBox(int index, out double left, out double right, out double bottom, out double top)

Parameters

index int

The index of the character

left double

The out parameter for the left dimension

right double

The out parameter for the right dimension

bottom double

The out parameter for the bottom dimension

top double

The out parameter for the top dimension

Returns

bool

true if the bounding box was retrieved; otherwise, false. The bounding box of the character is returned in the out parameters

Exceptions

ObjectDisposedException

Throws if the PdfPageText object has been disposed

ArgumentOutOfRangeException

Throws if the index is out of range
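
The out parameters arrive in the order left, right, bottom, top, which differs from the left, top, right, bottom order many rectangle types use, so it is easy to swap them. A small sketch of turning the four values into a width and height follows. CharBoxSketch and Size are hypothetical names, and the helper deliberately takes plain doubles so that it makes no assumption about how the page text object was obtained.

```csharp
using System;

// Hypothetical helper for illustration; not part of dotPDFium.
public static class CharBoxSketch
{
    // Parameter order mirrors GetCharBox: left, right, bottom, top.
    public static (double Width, double Height) Size(
        double left, double right, double bottom, double top)
    {
        // Math.Abs keeps the result non-negative regardless of which
        // direction the y-axis runs in the caller's coordinate space.
        return (Math.Abs(right - left), Math.Abs(top - bottom));
    }
}
```

A call site might read `if (pageText.GetCharBox(0, out double l, out double r, out double b, out double t)) { var size = CharBoxSketch.Size(l, r, b, t); }`, where pageText is an existing PdfPageText.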

GetCharIndexAtPos(double, double, double, double)

Gets the index of the character located at the specified position within the text.

public int GetCharIndexAtPos(double x, double y, double xTolerance = 2, double yTolerance = 2)

Parameters

x double

The x-coordinate of the position to check, in device-independent points.

y double

The y-coordinate of the position to check, in device-independent points.

xTolerance double

The horizontal tolerance, in device-independent points, for matching the position to a character. Defaults to 2.0.

yTolerance double

The vertical tolerance, in device-independent points, for matching the position to a character. Defaults to 2.0.

Returns

int

The zero-based index of the character at the specified position, or -1 if no character is found within the specified tolerances.

Exceptions

ObjectDisposedException

Thrown if the underlying text object has been disposed.
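
Because a miss is reported through a sentinel value rather than an exception, callers need to check the result before using it as an index. One way to make that check hard to forget is to map the sentinel to null, sketched below. HitTestSketch and ToNullableIndex are hypothetical names; treating every negative value as a miss (not only -1) is a defensive assumption, not documented behavior.

```csharp
// Hypothetical helper for illustration; not part of dotPDFium.
public static class HitTestSketch
{
    // Maps the "not found" sentinel to null so the caller must
    // handle the miss explicitly before indexing.
    public static int? ToNullableIndex(int rawIndex)
    {
        return rawIndex < 0 ? (int?)null : rawIndex;
    }
}
```

For example, `int? hit = HitTestSketch.ToNullableIndex(pageText.GetCharIndexAtPos(72.0, 700.0));` leaves hit as null when no character lies within the default tolerances.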

GetCharIndexFromTextIndex(int)

Converts a text index to the corresponding character index within the document.

public int GetCharIndexFromTextIndex(int textIndex)

Parameters

textIndex int

The zero-based index of the text element to be converted. Must be a valid index within the document's text content.

Returns

int

The zero-based character index corresponding to the specified text index.

Remarks

This method maps a logical text index to its associated character index, which can be used for further text processing. Ensure that the provided textIndex is within the bounds of the document's text content to avoid exceptions.

GetCharOrigin(int, out double, out double)

public bool GetCharOrigin(int index, out double x, out double y)

Parameters

index int
x double
y double

Returns

bool

GetFillColor(int)

Retrieves the fill color of the character at the specified index.

public RgbaColor? GetFillColor(int index)

Parameters

index int

The zero-based index of the character whose fill color is to be retrieved.

Returns

RgbaColor?

An RgbaColor representing the fill color of the character, or null if the fill color is not available.

Exceptions

ObjectDisposedException

Thrown if the object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

GetFontInfo(int)

Retrieves font information for the character at the specified index.

public PdfFontInfo? GetFontInfo(int index)

Parameters

index int

The zero-based index of the character for which to retrieve font information. Must be within the range of available characters.

Returns

PdfFontInfo?

A PdfFontInfo object containing the font name and style flags for the specified character, or null if the font information cannot be retrieved. Flags can be checked using the PdfFontFlags enum and .HasFlag().

Exceptions

ObjectDisposedException

Thrown if the PdfPageText object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.
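
The note about checking flags with .HasFlag() can be illustrated with a stand-in enum. The members of the real PdfFontFlags enum and the property names on PdfFontInfo are not listed on this page, so SketchFontFlags below is purely hypothetical; its bit values follow the font descriptor flags defined in the PDF specification, which the real enum may or may not mirror.

```csharp
using System;

// Stand-in for illustration only; the real PdfFontFlags members may differ.
[Flags]
public enum SketchFontFlags
{
    None = 0,
    FixedPitch = 1 << 0,
    Serif = 1 << 1,
    Symbolic = 1 << 2,
    Script = 1 << 3,
    Nonsymbolic = 1 << 5,
    Italic = 1 << 6,
    ForceBold = 1 << 18,
}

// Hypothetical helper for illustration; not part of dotPDFium.
public static class FontFlagsSketch
{
    // Builds a short description from two of the style bits.
    public static string Describe(SketchFontFlags flags)
    {
        string slant = flags.HasFlag(SketchFontFlags.Italic) ? "italic" : "upright";
        string pitch = flags.HasFlag(SketchFontFlags.FixedPitch) ? "monospaced" : "proportional";
        return slant + ", " + pitch;
    }
}
```

The same HasFlag pattern should apply to the flags carried by a non-null PdfFontInfo, once the actual enum member names are known.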

GetFontSize(int)

Retrieves the font size of the character at the specified index in the text.

public double GetFontSize(int index)

Parameters

index int

The zero-based index of the character whose font size is to be retrieved. Must be within the range of 0 to the total character count minus one.

Returns

double

The font size of the character at the specified index, expressed as a double.

Exceptions

ObjectDisposedException

Thrown if the underlying text object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total character count.

GetFontWeight(int)

Retrieves the font weight of the character at the specified index.

public int GetFontWeight(int index)

Parameters

index int

The zero-based index of the character whose font weight is to be retrieved. Must be within the range of available characters.

Returns

int

An integer representing the font weight of the specified character. The value corresponds to the font weight as defined in the PDF document.

Exceptions

ObjectDisposedException

Thrown if the underlying PDF text object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.
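
PDF font descriptors express weight on a 100 to 900 scale, where 400 is normal and 700 is bold. Assuming GetFontWeight reports values on that scale (this page only says the value is "as defined in the PDF document"), a rough classification could look like the sketch below. FontWeightSketch and Classify are hypothetical names, and treating non-positive values as "unknown" is an assumption about how a missing weight might be reported.

```csharp
// Hypothetical helper for illustration; not part of dotPDFium.
public static class FontWeightSketch
{
    // Buckets a weight on the PDF 100-900 scale.
    public static string Classify(int weight)
    {
        if (weight <= 0) return "unknown"; // assumed "no weight available"
        if (weight >= 700) return "bold";
        if (weight < 400) return "light";
        return "regular";
    }
}
```

A call such as `FontWeightSketch.Classify(pageText.GetFontWeight(0))` would then label the first character on the page.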

GetLooseCharBox(int)

Retrieves the loose bounding box of a character at the specified index within the text.

public FsRectF? GetLooseCharBox(int index)

Parameters

index int

The zero-based index of the character whose bounding box is to be retrieved.

Returns

FsRectF?

A FsRectF structure representing the loose bounding box of the character if successful; otherwise, null.

Remarks

The loose bounding box may include additional space around the character, such as padding or spacing, depending on the font and rendering context.

Exceptions

ObjectDisposedException

Thrown if the text object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

GetMatrix(int)

Retrieves the transformation matrix for the character at the specified index.

public FsMatrix? GetMatrix(int index)

Parameters

index int

The zero-based index of the character for which to retrieve the transformation matrix.

Returns

FsMatrix?

An FsMatrix representing the transformation matrix of the character if the operation is successful; otherwise, null.

Remarks

The transformation matrix describes how the character is positioned and scaled within the document.

Exceptions

ObjectDisposedException

Thrown if the object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

GetStrokeColor(int)

Retrieves the stroke color of the character at the specified index.

public RgbaColor? GetStrokeColor(int index)

Parameters

index int

The zero-based index of the character whose stroke color is to be retrieved.

Returns

RgbaColor?

An RgbaColor representing the stroke color of the character, or null if the stroke color could not be determined.

Exceptions

ObjectDisposedException

Thrown if the object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

GetTextIndexFromCharIndex(int)

Converts a character index to the corresponding text index in the document.

public int GetTextIndexFromCharIndex(int charIndex)

Parameters

charIndex int

The zero-based index of the character within the text content.

Returns

int

The zero-based text index corresponding to the specified character index.

Remarks

This method maps a character index to its equivalent text index, which may differ depending on the document's internal representation of text. Ensure that charIndex is within the valid range of the text content to avoid exceptions.

GetTextObject(int)

Retrieves the text object at the specified index within the PDF text.

public PdfTextObject? GetTextObject(int index)

Parameters

index int

The zero-based index of the text object to retrieve. Must be within the range of available text objects.

Returns

PdfTextObject?

A PdfTextObject representing the text object at the specified index, or null if no text object exists at the specified index.

Exceptions

ObjectDisposedException

Thrown if the current PdfPageText instance has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of text objects.

GetTextRange(int, int)

Returns the specified number of characters from the associated PDF text, starting at the specified index.

public string GetTextRange(int index, int count)

Parameters

index int

The starting character index

count int

The number of characters to return

Returns

string

A string containing the specified number of characters, starting at the start character

Exceptions

ObjectDisposedException

Throws if the PdfPageText object has been disposed

ArgumentOutOfRangeException

Throws if the starting index or the ending index is out of bounds
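
Since an out-of-bounds start or end raises ArgumentOutOfRangeException, callers that compute ranges from user input may want to clamp them against CountChars first. The sketch below shows one way to do that. TextRangeSketch and Clamp are hypothetical names, and the helper works on plain integers so it does not depend on how the page text was opened.

```csharp
using System;

// Hypothetical helper for illustration; not part of dotPDFium.
public static class TextRangeSketch
{
    // Clamps (index, count) so that index >= 0 and
    // index + count never exceeds totalChars (assumed >= 0).
    public static (int Index, int Count) Clamp(int index, int count, int totalChars)
    {
        int start = Math.Max(0, Math.Min(index, totalChars));
        int length = Math.Max(0, Math.Min(count, totalChars - start));
        return (start, length);
    }
}
```

A call site could then skip empty ranges entirely: `var (start, len) = TextRangeSketch.Clamp(10, 50, pageText.CountChars); string text = len > 0 ? pageText.GetTextRange(start, len) : string.Empty;`.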

HasUnicodeMapError(int)

Determines whether the character at the specified index has a Unicode mapping error.

public bool HasUnicodeMapError(int index)

Parameters

index int

The zero-based index of the character to check. Must be within the range of valid character indices.

Returns

bool

true if the character at the specified index has a Unicode mapping error; otherwise, false.

Exceptions

ObjectDisposedException

Thrown if the object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total character count.

IsGenerated(int)

Determines whether the character at the specified index is a generated character.

public bool IsGenerated(int index)

Parameters

index int

The zero-based index of the character to check. Must be within the valid range of characters.

Returns

bool

true if the character at the specified index is a generated character; otherwise, false.

Exceptions

ObjectDisposedException

Thrown if the underlying object has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total character count.
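
In PDFium, a "generated" character is one the text extractor synthesized (for example a space or line break) rather than one read from the page content; whether dotPDFium uses exactly that meaning is an assumption here. A sketch of collecting only the non-generated character indices follows. GeneratedFilterSketch and NonGeneratedIndices are hypothetical names, and the predicate is passed as a delegate so the helper stays independent of the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of dotPDFium.
public static class GeneratedFilterSketch
{
    // Returns the indices in [0, count) for which isGenerated is false.
    public static List<int> NonGeneratedIndices(int count, Func<int, bool> isGenerated)
    {
        var indices = new List<int>();
        for (int i = 0; i < count; i++)
        {
            if (!isGenerated(i))
            {
                indices.Add(i);
            }
        }
        return indices;
    }
}
```

With an existing PdfPageText named pageText, the method group can be passed directly: `GeneratedFilterSketch.NonGeneratedIndices(pageText.CountChars, pageText.IsGenerated)`.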

IsHyphen(int)

Determines whether the character at the specified index is a hyphen.

public bool IsHyphen(int index)

Parameters

index int

The zero-based index of the character to check. Must be within the range of valid character indices.

Returns

bool

true if the character at the specified index is a hyphen; otherwise, false.

Exceptions

ObjectDisposedException

Thrown if the underlying text resource has been disposed.

ArgumentOutOfRangeException

Thrown if index is less than 0 or greater than or equal to the total number of characters.

LoadWebLinks()

Loads the web links associated with the current PDF document.

public void LoadWebLinks()

Remarks

This method initializes the web link handle for the PDF document if it has not already been loaded. If the handle is already initialized, the method returns without doing anything. If the operation fails, an exception is thrown.

Exceptions

dotPDFiumException

Thrown if the web links cannot be loaded successfully.
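
Taken together with the resource-leak warning on CloseWebLinks, the two methods suggest a load, use, close pairing. The sketch below wraps that pairing in try/finally so the close step runs even if the work in between throws. WebLinkScopeSketch and Use are hypothetical names; that WebLinkCount is only meaningful after LoadWebLinks has run is an assumption this page does not state.

```csharp
using System;

// Hypothetical helper for illustration; not part of dotPDFium.
public static class WebLinkScopeSketch
{
    // Runs load, then body, and always runs close afterwards.
    // If load itself throws, close is not called, since nothing was loaded.
    public static T Use<T>(Action load, Func<T> body, Action close)
    {
        load();
        try
        {
            return body();
        }
        finally
        {
            close();
        }
    }
}
```

With an existing PdfPageText named pageText, a call might read `int links = WebLinkScopeSketch.Use(pageText.LoadWebLinks, () => pageText.WebLinkCount, pageText.CloseWebLinks);`.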

TryCountRects(int, int, int)

Attempts to count how many bounding rectangles exist for a specific range of characters in the associated PDF text.

public bool TryCountRects(int startIndex, int count, int rects)

Parameters

startIndex int

The index of the first character in the range to count bounding rectangles for

count int

The number of characters in the range

rects int

The out parameter that holds the number of bounding rectangles that exist for the specified range of characters

Returns

bool

true on success, false on failure

TryGetCharBox(int, out double, out double, out double, out double)

public bool TryGetCharBox(int index, out double left, out double right, out double bottom, out double top)

Parameters

index int
left double
right double
bottom double
top double

Returns

bool

TryGetCharIndexAtPos(double, double, out int, double, double)

Attempts to retrieve the character index at the specified position within the document.

public bool TryGetCharIndexAtPos(double x, double y, out int index, double xTolerance = 2, double yTolerance = 2)

Parameters

x double

The x-coordinate of the position, in device-independent points.

y double

The y-coordinate of the position, in device-independent points.

index int

When this method returns, contains the zero-based index of the character at the specified position, if the operation succeeds. If the operation fails, this will be set to 0.

xTolerance double

The horizontal tolerance, in device-independent points, for determining the character at the position. The default value is 2.0.

yTolerance double

The vertical tolerance, in device-independent points, for determining the character at the position. The default value is 2.0.

Returns

bool

true if a character index was successfully retrieved at the specified position; otherwise, false.

Remarks

This method returns false if the underlying document handle is invalid or if no character is found at the specified position within the given tolerances.

TryGetCharOrigin(int, out double, out double)

public bool TryGetCharOrigin(int index, out double x, out double y)

Parameters

index int
x double
y double

Returns

bool

TryGetTextRange(int, int, out string)

Attempts to return the specified number of characters from the associated PDF text, starting at the specified index.

public bool TryGetTextRange(int index, int count, out string text)

Parameters

index int

The starting character index

count int

The number of characters to return

text string

Returns

bool

true on success, false on failure. On success, text contains the specified number of characters, starting at the start character