Solved: Grab Text (Not) Wrapping Issue

  • Question
  • Updated 5 months ago
  • (Edited)
I have a long table in row format that I want to port to Excel.

But when I grab the text, the line breaks are honoured on a random basis.  Is there a way around this or have I got to replace them all manually (it's a LONG table)?



EDIT:
Solved.  I found an even better solution: an online PDF to Excel converter.  Worked a treat.

Paul

  • 1637 Posts
  • 1220 Reply Likes

Posted 5 months ago


Ed Covney

  • 346 Posts
  • 232 Reply Likes
OK, problem solved but would you have had the same results by copying the same PDF lines to notepad, then re-copying that to Excel?
Can you share the original snag?
(Edited)

Paul

  • 1637 Posts
  • 1220 Reply Likes
Hi Ed

Yes, exactly the same issue in Notepad++, because it's caused by a lack of line breaks in the grabbed text.

The original snag file is here.

george

  • 102 Posts
  • 21 Reply Likes
Hi Paul,
I would be interested in more details about your solution, please.  I have a long Index in a Help file application which I would like to obtain a hard copy of in order to review it with others.  It used to be so easy to capture scrolling windows in earlier versions of SnagIt!
George

Paul

  • 1637 Posts
  • 1220 Reply Likes
I just googled "PDF to Excel conversion online" :)

Joe Morgan

  • 7138 Posts
  • 3873 Reply Likes
SnagIt's version of Abbyy's "Grab Text" is grossly underpowered compared to Abbyy's FineReader, the software whose engine drives Grab Text.


I'm running Version 12 from 2013. Granted, it wasn't cheap software.

I downloaded Paul's .snag, and Grab Text couldn't read it with the formatting intact. It failed to place the dates that should have followed the names. I knew it was a low-performance application. Having Abbyy FineReader, I don't need it. But this is nothing more than sub-par performance by the application, in my opinion.

 

Here's Abbyy FineReader 2013 handling the task with ease.



It seems to me Abbyy owes TechSmith a more robust version.

Regards, Joe


Paul

  • 1637 Posts
  • 1220 Reply Likes
The intrepid Joe strikes again!

Thanks, Joe.

Out of interest, did it handle all the columns as well as the ones shown in your image?  Because the PDF to Excel online converter I used offset a block of columns by one column.
(Edited)

Joe Morgan

  • 7111 Posts
  • 3864 Reply Likes
It wasn't a perfect process. Sorry it took me so long to respond; I had to be out and about today.
There was more than one formatting error. It seemed to make a decision or 2 of its own. It reduced the number of date rows for some songs, because the space for dates posted was wider overall.  Here's the file so you can peek.

https://www.mediafire.com/file/ovxkb3fkm3979g3/Pauls_Example.xlsx/file

It's a wonder they get the algorithms to function as well as they do. That's a pretty big list.

I'm using Abbyy 2013; I would think the newest version is improved in these departments.


Paul

  • 1637 Posts
  • 1220 Reply Likes
Well, that's a much better result than I got with the online conversion process.  But that was free and Abbyy is eye-wateringly expensive.

Joe Morgan

  • 7138 Posts
  • 3873 Reply Likes
Curiosity got the better of me. I downloaded the trial of Abbyy 14's newest version.
It definitely does a better job than version 12 {:>)

https://www.mediafire.com/file/oy23goe7xr25gz8/Abbyy_2019.xlsx/file

It follows the formatting much more closely. The only places it seemed to get tripped up are where the text encroached on rows of text it shouldn't have. In these 2 images, the very bottom of the text slips below the line. "Dates Actually"

On the right in Abbyy is the OCR conversion. It places the text correctly. If you look at the original image at the left and bottom, you'll see the actual text placement.

To the far left you'll see the error in Excel. There is text dipping below the line everywhere I find an error. I'm wondering whether the SnagIt capture produced the anomaly, or whether the original image is flawed.

Anyway, all seems pretty impressive for scanning an image to me.

 


(Edited)

Joe Morgan

  • 7138 Posts
  • 3873 Reply Likes
I thought I'd take a second bite at this issue, Paul.

The biggest problem is the length of the scroll. I'm not sure where the shortcoming lies. It could be worth reporting as a bug.

I looked at the 654 x 6041 image and thought: SnagIt doesn't handle large images well. Never has. So I cropped the image to 654 x 2095. It did a pretty good job of grabbing the text. More errors than Abbyy 14, but the formatting was pretty spot on.

I copied it into Word, which could have been sufficient.

Then I created an editable PDF.


Provided you were using Adobe Acrobat, you could export the edited version to Excel.



For what it's worth.

Paul

  • 1637 Posts
  • 1220 Reply Likes
Thanks for your diligence, Joe.  I should've thought about chunking the capture down.  Duh!

Anyway, the moment has passed.  I took the online PDF > Excel conversion and dragged and dropped the errant content.  Fortunately, this is a bit of a one-off.

Have a beer on me:





Joe Morgan

  • 7111 Posts
  • 3864 Reply Likes
Thanks, looks tasty. {:>)


Ya know, in a 4K world, it's only a page-and-a-half scroll. That's not much text if you think about it.

I called tech support. I talked to them about your 900 X 6000 image and what was going wrong. I thought, this must be a bug?

They told me Grab Text has limitations: it is currently designed for captures of 1500 or 2000 pixels. They weren't sure of the exact figure off the top of their heads.
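
For anyone who wants to chunk a long scrolling capture before running Grab Text on it, here is a minimal Python sketch of the arithmetic. It only works out crop boxes; it does not call SnagIt or any OCR engine. The function name `chunk_boxes`, the 2000-pixel default (the upper figure tech support mentioned, which they were not sure of) and the 40-pixel overlap are all assumptions for illustration, not anything TechSmith documents.

```python
def chunk_boxes(width, height, max_height=2000, overlap=40):
    """Return (left, top, right, bottom) crop boxes that cover a tall image.

    max_height and overlap are illustrative assumptions, not documented
    SnagIt limits. The overlap repeats a strip of pixels between
    neighbouring chunks so a text row cut by one boundary appears whole
    in the next chunk.
    """
    if max_height <= overlap:
        raise ValueError("max_height must be larger than overlap")
    boxes = []
    top = 0
    while True:
        bottom = min(top + max_height, height)
        boxes.append((0, top, width, bottom))
        if bottom >= height:
            break
        # Start the next chunk slightly above the previous cut.
        top = bottom - overlap
    return boxes


if __name__ == "__main__":
    # The 654 x 6041 capture discussed in this thread.
    for box in chunk_boxes(654, 6041):
        print(box)
```

Each box could then be cropped in any image editor (or passed to an imaging library's crop function) and grabbed separately. Rows that fall in an overlap strip will show up twice and need de-duplicating by hand.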