I need to take an existing PDF, search for sections of it, insert and/or change text, and save the PDF again.
I saw the FixedContentEditor, but I only saw how to add content; I didn't see how to search for text or replace it.
Could you point me in the right direction?
Thank you!
1 Answer, 1 is accepted
Hello Rick,
Currently, the search functionality is not available in the PdfProcessing library. We have a feature request for it on our feedback portal. You can track its progress, subscribe to status changes, and add your comment to it here: PdfProcessing: Introduce Find API.
At this point, I can suggest manually iterating the text fragments of the pdf file. Here is an example of this:
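using System;
using System.IO;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.Model.Text;

// Import the existing PDF into a fixed document.
var pdfProvider = new PdfFormatProvider();
var fixedDocument = pdfProvider.Import(File.ReadAllBytes(@"..\..\lorem.pdf"));

// Walk every content element on every page and print the text fragments.
foreach (var page in fixedDocument.Pages)
{
    foreach (var item in page.Content)
    {
        if (item is TextFragment fragment)
        {
            Console.WriteLine(fragment.Text);
        }
    }
}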
The issue with this approach is that in the PDF format each word can be split into several fragments with different positions. To reconstruct the text you therefore have to use the position of each fragment, which is a complex task that requires a lot of work.
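To give a rough idea of what the position-based step involves, here is a minimal sketch that rebuilds lines of text from fragments. It assumes you have already copied each fragment's text and its X/Y offsets into a plain record. The names `Piece`, `LineBuilder.BuildLines`, and `tolerance` are hypothetical and not part of PdfProcessing. It also assumes that fragments on the same line have nearly the same Y offset and that a smaller Y means higher on the page. It does not detect missing spaces between words, which real files may need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a text fragment: its text plus the X/Y offsets
// taken from the fragment's position. Not a PdfProcessing type.
public record Piece(string Text, double X, double Y);

public static class LineBuilder
{
    // Groups pieces whose Y offset is within `tolerance` of the first piece
    // of the current line, then joins each line's pieces from left to right.
    public static List<string> BuildLines(IEnumerable<Piece> pieces, double tolerance = 2.0)
    {
        var lines = new List<List<Piece>>();
        foreach (var piece in pieces.OrderBy(p => p.Y))
        {
            var current = lines.LastOrDefault();
            if (current != null && Math.Abs(current[0].Y - piece.Y) < tolerance)
            {
                current.Add(piece);   // same line as the previous piece
            }
            else
            {
                lines.Add(new List<Piece> { piece });   // start a new line
            }
        }

        return lines
            .Select(line => string.Concat(line.OrderBy(p => p.X).Select(p => p.Text)))
            .ToList();
    }
}
```

For example, passing the pieces ("lo", X=20, Y=100), ("Hel", X=10, Y=100.5) and ("World", X=10, Y=120) yields two lines, "Hello" and "World". The tolerance value is a guess and would likely need tuning against the font size used in the file.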
Should you have any other questions do not hesitate to ask.
Regards,
Dimitar
Progress Telerik
Hi Dimitar,
I think I am a bit lost with all of this.
I used your code to cycle through the fragments, that works.
I am trying to write fragments back to the document, and in the documentation I have already seen about four different ways to write text. Which is the RIGHT way? For example, I have seen AddTextFragment and DrawText, among others.
Also, when I look at the text fragment item, it shows the font as a truetype.
When I try to write text back, I need to select a font, and I cannot seem to use TrueType.
Also, if I want to insert text in the middle of a paragraph in the PDF, do I need to reposition the rest of the paragraph below it myself? Is there no way to insert in the middle and have the lower fragments move down automatically?
Thank you,
Rick Doll.
Hi Rick,
The best approach here would be to alter the text of the existing fragment. This way the text within that fragment will shift if you are editing in the middle of a paragraph, but the other fragments will not move. Unfortunately, there is no automatic way to shift all contents of the file. Please note that the PDF format is very different from a typical Word document and cannot be edited in a similar way. If you are using a template document, it would be best to leave enough space for the new content. You could also consider creating a flow document and converting it to PDF.
In addition, you can directly use the font of the existing fragments:
var fragment = (TextFragment)item;
var newFragment = new TextFragment();
newFragment.Font = fragment.Font;
Creating new fonts is described in the following article: PdfProcessing - Fonts.
I hope this helps. Should you have any other questions do not hesitate to ask.
Hi Dimitar,
The issue is that in some cases I need to edit the text, and in some cases I need to insert new text in the middle of a paragraph.
We have a third-party till that creates an invoice, and in some cases we are required to word the text a bit differently for certain customers. The plan was to run the generated invoice through a tool I put together, which would search through the text, replace text in some areas, and insert text in the middle of others. Let's say there is a section of text that is 6 lines long: I might need to insert a new line after the second, and the hope was that lines 3-6 would shift down.
The other issue is that the generated invoice isn't made up of one text fragment per word; every character of every word is its own fragment.
I am missing something here.
This is what I did to try to get the font. I took the first font used by any fragment, as that is the font that will be used for the other text.
Then I assign that font to a new fragment.
Anytime I assign that font to the text fragment, it doesn't write to the pdf. If I assign NO font, it writes to the pdf.
Hi Rick,
Ok, I understand your scenario. Perhaps it would be better to find the fragments that hold the first and the last words of the template text and delete everything between them and then insert a new fragment with the complete text.
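The list manipulation behind this suggestion can be sketched roughly as follows. This is only an illustration of the "find first, find last, replace everything in between" logic on an ordinary `List<T>`. `FragmentRange.ReplaceRange` is a hypothetical helper, not a PdfProcessing API. With a real document the items would be the page's content elements, the markers would have to be matched against text reconstructed from the per-character fragments, and the replacement fragment would still need its position and font set.

```csharp
using System;
using System.Collections.Generic;

public static class FragmentRange
{
    // Removes every item from the first match of `isFirst` through the next
    // match of `isLast` (both inclusive) and inserts `replacement` there.
    // Returns false and leaves the list unchanged if either marker is missing.
    public static bool ReplaceRange<T>(
        List<T> items, Predicate<T> isFirst, Predicate<T> isLast, T replacement)
    {
        int start = items.FindIndex(isFirst);
        if (start < 0)
        {
            return false;
        }

        // Search for the end marker starting at the first marker.
        int end = items.FindIndex(start, isLast);
        if (end < 0)
        {
            return false;
        }

        items.RemoveRange(start, end - start + 1);
        items.Insert(start, replacement);
        return true;
    }
}
```

For instance, calling it on the list "Dear", "Pay", "within", "30", "days", "Thanks" with markers "Pay" and "days" and the replacement "Pay within 60 days" leaves "Dear", "Pay within 60 days", "Thanks". Returning false instead of throwing is a design choice so that invoices without the template text can simply be skipped.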
The font name suggests that this is a stripped subset that contains only the characters used in the specific text. With such fonts, you might see only a part of the text or no text at all. Unfortunately, our API does not provide a way to get the actual font name. I have logged this on our feedback portal so we can improve it and pass the actual name if possible. You can track its progress, subscribe to status changes, and add your comment to it here. I have updated your Telerik points as well.
My guess is that the template will not use another font, so until this is implemented you can examine the font in Adobe (see attached). This is the only workaround I can think of for this case. In addition, if you replace the entire text the font will look consistent.
Let me know if I can assist you further.
Hello,
Have there been any updates to the font issue described above?
I'm seeing similar results and have dropped back to using Arial (as a test); when I set the Font property of the TextFragment, the font name is OMCIFW+Arial-BoldMT. When my PDF is generated, I see no text output.
In addition, do you have a list of supported fonts somewhere on your site? I ask only because I'm curious but most often, fonts can't be just swapped out as they are usually selected at the corporate/marketing level - which, for most of us, locks us into the font in question.
Thanks,
Mark
Hi Mark,
The issue is resolved. If you are not getting the expected result I would recommend opening a new ticket and attaching your file there. This way we will be able to properly investigate this.
Thank you in advance for your patience and cooperation.