The type initializer for 'iText.Commons.Actions.EventManager' threw an exception

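I'm trying to build a simple PowerShell cmdlet that uses iText7 to open a PDF file and output its text. I tried the iText7Module from the PowerShell Gallery, but the LocationTextExtractionStrategy that ships with iText7 is too strict about vertical position. I'd like to implement the approach from "iText7 reads out lines in the wrong order" to work around this. Text extraction is the only feature I need from iText7, so I figured it would be best to build a custom cmdlet that does just that (with a lax extraction strategy).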

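I've been building the cmdlet step by step. Starting from a basic "Hello, World" cmdlet, I've been adding the iText7 calls I need. It worked until I tried to create a PdfDocument object. When I run the cmdlet, I get "The type initializer for 'iText.Commons.Actions.EventManager' threw an exception."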

I'm using Visual Studio Code. Apologies in advance: my C#/.NET development skills are very limited (all the rest of the heavy lifting for this project will be done in PowerShell).

    using System;
    using System.Management.Automation;
    using iText.Kernel.Pdf;

    namespace TestCmdlet {
        [Cmdlet(VerbsCommon.Get, "LaxPDFText")]
        public class LaxPDFText : PSCmdlet {
            [Parameter(Mandatory = true)]
            public string filePath { get; set; }

            [Parameter(Mandatory = true)]
            public int laxRange { get; set; }

            protected override void BeginProcessing()
            {
                WriteObject("filePath: " + filePath);
                WriteObject("laxRange: " + laxRange);
                try {
                    PdfReader pdfReader = new PdfReader(filePath);
                    try {
                        PdfDocument pdfDocument = new PdfDocument(pdfReader);
                        pdfDocument.Close();
                    }
                    catch (Exception e) { WriteObject("Couldn't create pdfDocument"); WriteObject(e.Message); WriteObject(e.StackTrace); }
                    pdfReader.Close();
                }
                catch (Exception e) { WriteObject("Couldn't create pdfReader"); WriteObject(e.Message); WriteObject(e.StackTrace); }
            }
        }
    }

Then, trying it in PowerShell...

    PS C:\Users\Van Drunens\documents> import-module TestCmdlet
    PS C:\Users\Van Drunens\documents> Get-LaxPDFText -filePath "c:\users\van drunens\documents\test.pdf" -laxRange 5
    filePath: c:\users\van drunens\documents\test.pdf
    laxRange: 5
    Couldn't create pdfDocument
    The type initializer for 'iText.Commons.Actions.EventManager' threw an exception.
       at iText.Kernel.Pdf.PdfDocument.Open(PdfVersion newPdfVersion)
       at TestCmdlet.LaxPDFText.BeginProcessing()

I'm confused by the error referencing iText.Kernel.Pdf.PdfDocument.Open(PdfVersion newPdfVersion), which involves a PdfVersion type...

itext7 powershell-cmdlet typeinitializeexception

1 answer

Building a custom PowerShell cmdlet for text extraction with iText7 is a great approach, especially if you want to customize the extraction behavior. The EventManager error means that the static initialization of that class failed, which could be related to iText7 dependencies or to the way the library is loaded in the PowerShell environment.

Here are a few things you could try to resolve it:

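**Verify .NET Runtime Compatibility**

Check that the target framework in your project file is one that the PowerShell host loading the module can run:

    <TargetFramework>net8.0</TargetFramework>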
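
Alongside the target framework, it may be worth checking whether the iText DLLs actually end up next to TestCmdlet.dll, since a type initializer can fail when a dependent assembly cannot be found at load time. A minimal sketch of a project file, assuming an SDK-style class library; CopyLocalLockFileAssemblies is a standard MSBuild property, but that a missing DLL is the cause in this case is only an assumption:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- net8.0 assemblies need a PowerShell 7.4 or later host -->
    <TargetFramework>net8.0</TargetFramework>
    <!-- Copy NuGet dependency DLLs (itext.kernel, itext.commons, ...) to the build output -->
    <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
  </PropertyGroup>
</Project>
```

After rebuilding, the output folder should contain the iText assemblies beside the cmdlet DLL; import the module from that folder.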

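**Check NuGet Package Dependencies**

Make sure the itext7 package is referenced by the project. Install-Package is the Visual Studio Package Manager Console command; with the dotnet CLI (for example from VS Code) the equivalent is dotnet add package itext7.

    Install-Package itext7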

**Test Text Extraction**

    using System;
    using System.Management.Automation;
    using iText.Kernel.Pdf;
    using iText.Kernel.Pdf.Canvas.Parser;

    namespace TestCmdlet {
        [Cmdlet(VerbsCommon.Get, "LaxPDFText")]
        public class LaxPDFText : PSCmdlet {
            [Parameter(Mandatory = true)]
            public string filePath { get; set; }

            protected override void BeginProcessing() {
                WriteObject($"filePath: {filePath}");

                try {
                    using (PdfReader pdfReader = new PdfReader(filePath))
                    using (PdfDocument pdfDocument = new PdfDocument(pdfReader)) {
                        WriteObject("PdfDocument created successfully.");
                        string text = PdfTextExtractor.GetTextFromPage(pdfDocument.GetPage(1));
                        WriteObject("Extracted Text:");
                        WriteObject(text);
                    }
                } catch (Exception e) {
                    WriteObject("Error extracting text from PDF.");
                    WriteObject($"Exception: {e.GetType().Name}, Message: {e.Message}");
                    WriteObject(e.StackTrace);
                }
            }
        }
    }

Or:

**Customize Text Extraction Strategy**

    using System.Collections.Generic;
    using System.Text;
    using iText.Kernel.Pdf.Canvas.Parser;
    using iText.Kernel.Pdf.Canvas.Parser.Data;
    using iText.Kernel.Pdf.Canvas.Parser.Listener;

    public class LaxTextExtractionStrategy : ITextExtractionStrategy {
        private readonly StringBuilder text = new StringBuilder();

        // Called by the parser for every content-stream event.
        public void EventOccurred(IEventData data, EventType type) {
            // Customize text extraction logic as needed
            if (type == EventType.RENDER_TEXT && data is TextRenderInfo renderInfo) {
                text.Append(renderInfo.GetText());
                text.Append(" "); // Add space to prevent words from sticking together
            }
        }

        // Returning null tells iText that this listener accepts all event types.
        public ICollection<EventType> GetSupportedEvents() {
            return null;
        }

        public string GetResultantText() {
            return text.ToString();
        }
    }

You would use this class in your LaxPDFText cmdlet when extracting text:

    string text = PdfTextExtractor.GetTextFromPage(pdfDocument.GetPage(1), new LaxTextExtractionStrategy());
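
Since the goal in the question is a strategy that is lax about vertical position, here is a rough sketch of how the laxRange parameter could be used to group text chunks into lines. The class name LaxLineExtractionStrategy and the grouping rule are hypothetical and not taken from iText or from the linked question; the sketch only records each chunk's baseline start point and treats chunks whose y coordinates lie within laxRange of the first chunk on the line as belonging to the same line.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using iText.Kernel.Geom;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Data;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

// Hypothetical sketch: groups text chunks into lines using a vertical tolerance.
public class LaxLineExtractionStrategy : ITextExtractionStrategy {
    private readonly float laxRange;
    private readonly List<(float X, float Y, string Text)> chunks =
        new List<(float X, float Y, string Text)>();

    public LaxLineExtractionStrategy(float laxRange) {
        this.laxRange = laxRange;
    }

    public void EventOccurred(IEventData data, EventType type) {
        if (type == EventType.RENDER_TEXT && data is TextRenderInfo info) {
            // Record where the chunk's baseline starts, plus its text.
            Vector start = info.GetBaseline().GetStartPoint();
            chunks.Add((start.Get(Vector.I1), start.Get(Vector.I2), info.GetText()));
        }
    }

    // Returning null means all event types are accepted.
    public ICollection<EventType> GetSupportedEvents() {
        return null;
    }

    public string GetResultantText() {
        var result = new StringBuilder();
        var line = new List<(float X, float Y, string Text)>();
        // PDF y coordinates grow upward, so descending y reads top to bottom.
        foreach (var chunk in chunks.OrderByDescending(c => c.Y)) {
            // Start a new line once a chunk sits more than laxRange below the line's first chunk.
            if (line.Count > 0 && line[0].Y - chunk.Y > laxRange) {
                AppendLine(result, line);
                line.Clear();
            }
            line.Add(chunk);
        }
        AppendLine(result, line);
        return result.ToString();
    }

    private static void AppendLine(StringBuilder result, List<(float X, float Y, string Text)> line) {
        if (line.Count == 0) return;
        // Crude joining: left to right with single spaces, no gap detection.
        result.AppendLine(string.Join(" ", line.OrderBy(c => c.X).Select(c => c.Text)));
    }
}
```

It would be passed to the extractor the same way, for example PdfTextExtractor.GetTextFromPage(pdfDocument.GetPage(1), new LaxLineExtractionStrategy(laxRange)). Joining chunks with a single space is a simplification and may split words that the PDF draws in several pieces.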

**Resolve the EventManager Issue**

Clear the local NuGet caches, then restore and rebuild the project:

    dotnet nuget locals all --clear
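
If the error persists, the message "The type initializer ... threw an exception" is only the outer wrapper: a TypeInitializationException carries the real cause in its InnerException, which the cmdlet in the question never prints. A small sketch for walking the chain; the helper name ExceptionChain is hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: lists an exception and all of its inner exceptions.
public static class ExceptionChain {
    public static IEnumerable<string> Describe(Exception e) {
        for (Exception cur = e; cur != null; cur = cur.InnerException) {
            yield return $"{cur.GetType().FullName}: {cur.Message}";
        }
    }
}
```

Inside a catch block of the cmdlet this could be used as foreach (string line in ExceptionChain.Describe(e)) WriteObject(line); so that the innermost entry (for example a file-not-found error naming an assembly) shows what actually failed.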