Class PdfPageText
public class PdfPageText : PdfObject, IDisposable
- Inheritance
- PdfObject
PdfPageText
- Implements
- IDisposable
Properties
CountChars
Gets the total number of characters in the associated PDF text.
public int CountChars { get; }
Property Value
- int
WebLinkCount
Gets the number of web links associated with the current PDF text object.
public int WebLinkCount { get; }
Property Value
- int
Methods
CloseWebLinks()
Releases resources associated with the web links and resets the handle.
public void CloseWebLinks()
Remarks
This method should be called to clean up resources when web link processing is no longer needed. Failing to call this method may result in resource leaks.
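The remark above suggests pairing every LoadWebLinks() call with a CloseWebLinks() call. A minimal sketch of that pairing with try/finally follows. IWebLinkSource is a hypothetical stand-in interface that mirrors only the three web-link members documented on this page, so the snippet compiles without the library. It is also an assumption that WebLinkCount should be read after LoadWebLinks(), since this page does not state the required order.

```csharp
using System;

// Hypothetical stand-in: mirrors only the web-link members documented on this page.
public interface IWebLinkSource
{
    int WebLinkCount { get; }
    void LoadWebLinks();
    void CloseWebLinks();
}

public static class WebLinkHelper
{
    // Loads the web links, reads the count, and always releases the handle,
    // even if reading the count throws.
    public static int CountWebLinks(IWebLinkSource text)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));

        text.LoadWebLinks();
        try
        {
            return text.WebLinkCount;
        }
        finally
        {
            text.CloseWebLinks();
        }
    }
}
```

With the real class, the same try/finally shape would presumably apply directly to a PdfPageText instance.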
CountRects(int, int)
Counts how many bounding rectangles exist for a specific range of characters in the associated PDF text.
public int CountRects(int startIndex, int count)
Parameters
startIndex
int
The index of the first character to count bounding rectangles for
count
int
The number of characters to count bounding rectangles for
Returns
- int
The number of bounding rectangles that exist for the specified range of characters
Exceptions
Dispose(bool)
Releases the resources held by the PdfPageText object. This method is called when the object is disposed.
protected override void Dispose(bool disposing)
Parameters
disposing
bool
Whether to dispose of managed resources
Find(string, bool, bool, int)
Searches for the specified text in the associated PDF text. The search can be case-sensitive and/or whole-word only.
public PdfTextSearch Find(string searchTerm, bool matchCase = false, bool matchWholeWord = false, int startIndex = 0)
Parameters
searchTerm
string
The term to search for
matchCase
bool
Whether the search should be case-sensitive
matchWholeWord
bool
Whether the search should match whole words only
startIndex
int
The index of the character to begin searching at
Returns
- PdfTextSearch
A PdfTextSearch result
Exceptions
- ArgumentException
Thrown if the search term is null or empty
- dotPDFiumException
Thrown on a PDFium library error
GetChar(int)
Gets the character at the specified index in the associated PDF text.
public uint GetChar(int index)
Parameters
index
int
Returns
- uint
Exceptions
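GetChar(int) returns a uint rather than a char. A minimal sketch of turning such a value into a string follows, assuming the value is a Unicode code point (this page does not say so explicitly). CodePointToString is a hypothetical helper name and is not part of the library.

```csharp
using System;

public static class CodePoints
{
    // Hypothetical helper: converts a value such as the one returned by GetChar(int)
    // to a string, ASSUMING the value is a Unicode code point.
    public static string CodePointToString(uint codePoint)
    {
        // char.ConvertFromUtf32 rejects surrogate halves and values above U+10FFFF,
        // so substitute the replacement character U+FFFD for those inputs.
        bool isSurrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        if (codePoint > 0x10FFFF || isSurrogate)
        {
            return "\uFFFD";
        }

        return char.ConvertFromUtf32((int)codePoint);
    }
}
```

Returning a string instead of a char keeps code points outside the Basic Multilingual Plane intact, because they need two UTF-16 code units.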
GetCharAngle(int)
Retrieves the rotation angle of the character at the specified index within the text object.
public float GetCharAngle(int index)
Parameters
index
int
The zero-based index of the character whose rotation angle is to be retrieved. Must be within the range of 0 to the total number of characters minus one.
Returns
- float
The rotation angle of the character at the specified index, in degrees.
Exceptions
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetCharBox(int, out double, out double, out double, out double)
Gets the bounding box of the character at the specified index in the associated PDF text.
public bool GetCharBox(int index, out double left, out double right, out double bottom, out double top)
Parameters
index
int
The index of the character
left
double
The out parameter for the left dimension
right
double
The out parameter for the right dimension
bottom
double
The out parameter for the bottom dimension
top
double
The out parameter for the top dimension
Returns
- bool
true on success, with the bounding box of the character returned in the out parameters; otherwise, false
Exceptions
- ObjectDisposedException
Thrown if the PdfPageText object has been disposed
- ArgumentOutOfRangeException
Thrown if the index is out of range
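GetCharBox reports the box as four separate edges. A minimal sketch of packaging those edges into one value with a derived width and height follows. CharBox is a hypothetical type introduced only for illustration; its constructor takes the edges in the same order as the out parameters above (left, right, bottom, top). The use of Math.Abs and Math.Min/Math.Max is a defensive assumption so the type does not depend on which edge has the larger coordinate.

```csharp
using System;

// Hypothetical value type; not part of the library.
public readonly struct CharBox
{
    public CharBox(double left, double right, double bottom, double top)
    {
        Left = left;
        Right = right;
        Bottom = bottom;
        Top = top;
    }

    public double Left { get; }
    public double Right { get; }
    public double Bottom { get; }
    public double Top { get; }

    // Absolute values keep the sizes non-negative regardless of axis direction.
    public double Width => Math.Abs(Right - Left);
    public double Height => Math.Abs(Top - Bottom);

    // True when the point lies inside the box or on its border.
    public bool Contains(double x, double y) =>
        x >= Math.Min(Left, Right) && x <= Math.Max(Left, Right) &&
        y >= Math.Min(Bottom, Top) && y <= Math.Max(Bottom, Top);
}
```

Given a PdfPageText instance obtained elsewhere, a caller would presumably check the bool result of GetCharBox first and only then construct a CharBox from the four out values.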
GetCharIndexAtPos(double, double, double, double)
Gets the index of the character located at the specified position within the text.
public int GetCharIndexAtPos(double x, double y, double xTolerance = 2, double yTolerance = 2)
Parameters
x
double
The x-coordinate of the position to check, in device-independent points.
y
double
The y-coordinate of the position to check, in device-independent points.
xTolerance
double
The horizontal tolerance, in device-independent points, for matching the position to a character. Defaults to 2.0.
yTolerance
double
The vertical tolerance, in device-independent points, for matching the position to a character. Defaults to 2.0.
Returns
- int
The zero-based index of the character at the specified position, or -1 if no character is found within the specified tolerances.
Exceptions
- ObjectDisposedException
Thrown if the underlying text object has been disposed.
GetCharIndexFromTextIndex(int)
Converts a text index to the corresponding character index within the document.
public int GetCharIndexFromTextIndex(int textIndex)
Parameters
textIndex
int
The zero-based index of the text element to be converted. Must be a valid index within the document's text content.
Returns
- int
The zero-based character index corresponding to the specified text index.
Remarks
This method maps a logical text index to its associated character index, which can be used for further text processing. Ensure that the provided textIndex is within the bounds of the document's text content to avoid exceptions.
GetCharOrigin(int, out double, out double)
public bool GetCharOrigin(int index, out double x, out double y)
Parameters
index
int
x
double
y
double
Returns
- bool
GetFillColor(int)
Retrieves the fill color of the character at the specified index.
public RgbaColor? GetFillColor(int index)
Parameters
index
int
The zero-based index of the character whose fill color is to be retrieved.
Returns
- RgbaColor?
An RgbaColor representing the fill color of the character, or null if the fill color is not available.
Exceptions
- ObjectDisposedException
Thrown if the object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetFontInfo(int)
Retrieves font information for the character at the specified index.
public PdfFontInfo? GetFontInfo(int index)
Parameters
index
int
The zero-based index of the character for which to retrieve font information. Must be within the range of available characters.
Returns
- PdfFontInfo?
A PdfFontInfo object containing the font name and style flags for the specified character, or null if the font information cannot be retrieved. Flags can be checked using the PdfFontFlags enum and .HasFlag().
Exceptions
- ObjectDisposedException
Thrown if the PdfPageText object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetFontSize(int)
Retrieves the font size of the character at the specified index in the text.
public double GetFontSize(int index)
Parameters
index
int
The zero-based index of the character whose font size is to be retrieved. Must be within the range of 0 to the total character count minus one.
Returns
- double
The font size of the character at the specified index.
Exceptions
- ObjectDisposedException
Thrown if the underlying text object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total character count.
GetFontWeight(int)
Retrieves the font weight of the character at the specified index.
public int GetFontWeight(int index)
Parameters
index
int
The zero-based index of the character whose font weight is to be retrieved. Must be within the range of available characters.
Returns
- int
An integer representing the font weight of the specified character. The value corresponds to the font weight as defined in the PDF document.
Exceptions
- ObjectDisposedException
Thrown if the underlying PDF text object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetLooseCharBox(int)
Retrieves the loose bounding box of a character at the specified index within the text.
public FsRectF? GetLooseCharBox(int index)
Parameters
index
int
The zero-based index of the character whose bounding box is to be retrieved.
Returns
- FsRectF?
An FsRectF structure representing the loose bounding box of the character if successful; otherwise, null.
Remarks
The loose bounding box may include additional space around the character, such as padding or spacing, depending on the font and rendering context.
Exceptions
- ObjectDisposedException
Thrown if the text object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetMatrix(int)
Retrieves the transformation matrix for the character at the specified index.
public FsMatrix? GetMatrix(int index)
Parameters
index
int
The zero-based index of the character for which to retrieve the transformation matrix.
Returns
- FsMatrix?
An FsMatrix representing the transformation matrix of the character if the operation is successful; otherwise, null.
Remarks
The transformation matrix describes how the character is positioned and scaled within the document.
Exceptions
- ObjectDisposedException
Thrown if the object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetStrokeColor(int)
Retrieves the stroke color of the character at the specified index.
public RgbaColor? GetStrokeColor(int index)
Parameters
index
int
The zero-based index of the character whose stroke color is to be retrieved.
Returns
- RgbaColor?
An RgbaColor representing the stroke color of the character, or null if the stroke color could not be determined.
Exceptions
- ObjectDisposedException
Thrown if the object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
GetTextIndexFromCharIndex(int)
Converts a character index to the corresponding text index in the document.
public int GetTextIndexFromCharIndex(int charIndex)
Parameters
charIndex
int
The zero-based index of the character within the text content.
Returns
- int
The zero-based text index corresponding to the specified character index.
Remarks
This method maps a character index to its equivalent text index, which may differ depending on the document's internal representation of text. Ensure that charIndex is within the valid range of the text content to avoid exceptions.
GetTextObject(int)
Retrieves the text object at the specified index within the PDF text.
public PdfTextObject? GetTextObject(int index)
Parameters
index
int
The zero-based index of the text object to retrieve. Must be within the range of available text objects.
Returns
- PdfTextObject?
A PdfTextObject representing the text object at the specified index, or null if no text object exists at the specified index.
Exceptions
- ObjectDisposedException
Thrown if the current PdfPageText instance has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of text objects.
GetTextRange(int, int)
Returns the specified number of characters from the associated PDF text, starting at the specified index.
public string GetTextRange(int index, int count)
Parameters
index
int
count
int
Returns
- string
A string that starts at the character at index and contains count characters
Exceptions
- ObjectDisposedException
Thrown if the PdfPageText object has been disposed
- ArgumentOutOfRangeException
Thrown if the starting index or the ending index is out of bounds
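Because GetTextRange is documented to throw when either end of the range is out of bounds, a caller may want to clip a requested range before making the call. A minimal sketch of that clipping follows. TextRanges.Clamp is a hypothetical helper, and totalChars is assumed to come from the CountChars property; neither the helper nor this usage is prescribed by the library.

```csharp
using System;

public static class TextRanges
{
    // Hypothetical helper: clips the half-open range [index, index + count)
    // to [0, totalChars) and returns the clipped start and length.
    public static (int Index, int Count) Clamp(int index, int count, int totalChars)
    {
        if (totalChars < 0) throw new ArgumentOutOfRangeException(nameof(totalChars));

        int start = Math.Clamp(index, 0, totalChars);

        // Long arithmetic so that index + count cannot overflow int.
        long requestedEnd = (long)index + Math.Max(count, 0);
        int end = (int)Math.Clamp(requestedEnd, start, totalChars);

        return (start, end - start);
    }
}
```

When the returned Count is 0 the requested range lies entirely outside the text, so skipping the GetTextRange call is probably the safest choice; this page does not say how a zero-length request is handled.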
HasUnicodeMapError(int)
Determines whether the character at the specified index has a Unicode mapping error.
public bool HasUnicodeMapError(int index)
Parameters
index
int
The zero-based index of the character to check. Must be within the range of valid character indices.
Returns
- bool
true if the character has a Unicode mapping error; otherwise, false.
Exceptions
- ObjectDisposedException
Thrown if the object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total character count.
IsGenerated(int)
Determines whether the character at the specified index is a generated character.
public bool IsGenerated(int index)
Parameters
index
int
The zero-based index of the character to check. Must be within the valid range of characters.
Returns
- bool
true if the character is a generated character; otherwise, false.
Exceptions
- ObjectDisposedException
Thrown if the underlying object has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total character count.
IsHyphen(int)
Determines whether the character at the specified index is a hyphen.
public bool IsHyphen(int index)
Parameters
index
int
The zero-based index of the character to check. Must be within the range of valid character indices.
Returns
- bool
true if the character is a hyphen; otherwise, false.
Exceptions
- ObjectDisposedException
Thrown if the underlying text resource has been disposed.
- ArgumentOutOfRangeException
Thrown if index is less than 0 or greater than or equal to the total number of characters.
LoadWebLinks()
Loads the web links associated with the current PDF document.
public void LoadWebLinks()
Remarks
This method initializes the web link handle for the PDF document if it has not already been loaded. If the web link handle is already initialized, the method returns without performing any action. If the operation fails, an exception is thrown.
Exceptions
- dotPDFiumException
Thrown if the web links cannot be loaded successfully.
TryCountRects(int, int, int)
Counts how many bounding rectangles exist for a specific range of characters in the associated PDF text.
public bool TryCountRects(int startIndex, int count, int rects)
Parameters
startIndex
int
The index of the first character to count bounding rectangles for
count
int
The number of characters to count bounding rectangles for
rects
int
The out parameter that holds the number of bounding rectangles that exist for the specified range of characters
Returns
- bool
true on success, false on failure
TryGetCharBox(int, out double, out double, out double, out double)
public bool TryGetCharBox(int index, out double left, out double right, out double bottom, out double top)
Parameters
index
int
left
double
right
double
bottom
double
top
double
Returns
- bool
TryGetCharIndexAtPos(double, double, out int, double, double)
Attempts to retrieve the character index at the specified position within the document.
public bool TryGetCharIndexAtPos(double x, double y, out int index, double xTolerance = 2, double yTolerance = 2)
Parameters
x
double
The x-coordinate of the position, in device-independent points.
y
double
The y-coordinate of the position, in device-independent points.
index
int
When this method returns, contains the zero-based index of the character at the specified position, if the operation succeeds. If the operation fails, this will be set to 0.
xTolerance
double
The horizontal tolerance, in device-independent points, for determining the character at the position. The default value is 2.0.
yTolerance
double
The vertical tolerance, in device-independent points, for determining the character at the position. The default value is 2.0.
Returns
- bool
true if a character index was successfully retrieved at the specified position; otherwise, false.
Remarks
This method returns false if the underlying document handle is invalid or if no character is found at the specified position within the given tolerances.
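Since index is set to 0 on failure and 0 is also a valid character index, the bool result is the only reliable success signal. A minimal sketch of wrapping the call so that "no hit" becomes null follows. IHitTester is a hypothetical stand-in interface that copies only the TryGetCharIndexAtPos signature from this page so the snippet compiles on its own; CharIndexAt is likewise a hypothetical helper name.

```csharp
using System;

// Hypothetical stand-in: mirrors only the TryGetCharIndexAtPos signature documented above.
public interface IHitTester
{
    bool TryGetCharIndexAtPos(double x, double y, out int index,
                              double xTolerance = 2, double yTolerance = 2);
}

public static class HitTesting
{
    // Returns the character index at (x, y), or null when nothing is hit.
    // The same tolerance is applied horizontally and vertically.
    public static int? CharIndexAt(IHitTester text, double x, double y, double tolerance = 2)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));

        return text.TryGetCharIndexAtPos(x, y, out int index, tolerance, tolerance)
            ? index
            : (int?)null;
    }
}
```

A nullable result forces callers to handle the miss case explicitly instead of mistaking a failed lookup for the first character.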
TryGetCharOrigin(int, out double, out double)
public bool TryGetCharOrigin(int index, out double x, out double y)
Parameters
index
int
x
double
y
double
Returns
- bool
TryGetTextRange(int, int, out string)
Returns the specified number of characters from the associated PDF text, starting at the specified index.
public bool TryGetTextRange(int index, int count, out string text)
Parameters
index
int
count
int
text
string
Returns
- bool
true on success, with the requested characters returned in text; false on failure