Print long topic to PDF, redux, again

feedback

(Dan Thomas) #1

This has been discussed before, yet as I understand it, it is not considered important. The issue is, there’s no convenient method to save a long topic to a PDF file. Personally, I can’t understand why this wouldn’t be considered important.

I’m posting this here because I’m a member of the Keyboard Maestro forum (which obviously uses Discourse), and it was suggested there that someone (me) try to see if I could get some progress here on this forum.

So here I am, bringing up a subject that seems to be destined for the trash bin, or at least the dusty back pile of “yeah, maybe, someday, if someone yells loudly enough”.

(I’m being tongue in cheek here, a little - I understand priorities. But still, SOMEONE YELLING LOUDLY here.) Any hope?


(Jeff Atwood) #2

Why does it need to be convenient? The request comes up so rarely. Printing is not even a 1%-of-the-time task these days.


(Mittineague) #3

That, and AFAIK most browsers have “print to PDF” from their print menu.


(Dan Thomas) #4

I didn’t say I wanted to print it. I said I wanted a PDF of it. To put on my iPad. To read when I have time, potentially in an unconnected state.


(Dan Thomas) #5

The issue is long topics. Try to print a long topic to PDF, and you’ll get some topic text then a lot of blank pages. It seems the only feasible way to PDF long topics is to turn off JavaScript, which limits the web page to around 20 posts, print that to PDF, go to the next page, etc.
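
To get a feel for how many print-to-PDF passes that workaround takes, here is a small sketch. It assumes the JavaScript-free view shows 20 posts per page (the post above only says "around 20") and that later pages are reached with a `?page=N` query parameter; both are assumptions, and the function name and example URL are made up for illustration.

```python
import math

# Assumed page size of the JavaScript-free view ("around 20 posts").
POSTS_PER_PAGE = 20


def static_page_urls(topic_url: str, post_count: int,
                     per_page: int = POSTS_PER_PAGE) -> list[str]:
    """Return one URL per static page that would need printing.

    The "?page=N" parameter is an assumption about how the
    JavaScript-free view paginates; verify it on your own instance.
    """
    if post_count < 1:
        raise ValueError("post_count must be at least 1")
    pages = math.ceil(post_count / per_page)  # the last page may be partial
    return [f"{topic_url}?page={n}" for n in range(1, pages + 1)]


# Hypothetical topic URL; a 1000-post topic needs 50 separate passes.
urls = static_page_urls("https://forum.example.com/t/some-topic/123", 1000)
print(len(urls))
```

Fifty manual print jobs for a 1000-post topic is why the page-by-page approach stops being practical for long topics.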


(Kane York) #6

This sounds like it would be better served by a stored partial copy of the database.


(Joshua Rosenfeld) #7

How complicated would it be to add an Export button to the bottom of each topic, which automatically formats the topic for print? Something that removes the UI, and simply displays the post text (and images) one after another? An example would be what Gmail does removing all of the UI.


(Dan Thomas) #8

Or a “Printer-friendly version”?


(Mittineague) #9

There already is a print.css

It is somewhat aged and in need of some updating. But other than the new timeline causing another page it looks good to me.

Have you tried it?

print-to-pdf.pdf (213.7 KB)


(cpradio) #10

But it won’t work beyond 20 posts. To my knowledge there is still nothing that will do that for you.


(Dan Thomas) #11

I didn’t realize this would get so much attention so quickly. Have to go to dinner now, but please, continue on without me. :smile:


(Joshua Rosenfeld) #12

I am guessing that is due to infinite scrolling…posts beyond 20 (above or below) aren’t loaded.

Edit: Seeing as the UI is all JS, I feel that something will need to generate the full topic in “printer-friendly version”, and load another page. If Discourse could generate the PDF itself, all the better!
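
As a rough illustration of what such a "printer-friendly version" could look like, the sketch below renders an already-fetched list of posts as one plain HTML page with no UI. It is not how Discourse does or would do it. The post fields `post_number`, `username` and `cooked` (the rendered HTML body) are assumed to match what the topic JSON provides, and `render_printable` and `PRINT_CSS` are names invented for this example.

```python
import html

# Minimal print styling. The "pre" rule wraps long code lines instead of
# hiding them behind a scroll bar, so nothing is cut off on paper.
PRINT_CSS = """
body { font-family: serif; margin: 2em; }
article { border-bottom: 1px solid #ccc; padding: 1em 0; }
pre { white-space: pre-wrap; overflow: visible; }
"""


def render_printable(title: str, posts: list[dict]) -> str:
    """Render posts one after another as a single UI-free HTML page.

    Each post dict is assumed to carry 'post_number', 'username' and
    'cooked' (already-sanitized HTML); these field names are assumptions.
    """
    parts = [
        "<!DOCTYPE html>",
        "<html><head><meta charset='utf-8'>",
        f"<title>{html.escape(title)}</title>",
        f"<style>{PRINT_CSS}</style></head><body>",
        f"<h1>{html.escape(title)}</h1>",
    ]
    for post in sorted(posts, key=lambda p: p["post_number"]):
        parts.append(
            f"<article><h2>#{post['post_number']} "
            f"{html.escape(post['username'])}</h2>"
            f"{post['cooked']}</article>"  # inserted as-is, trusted HTML
        )
    parts.append("</body></html>")
    return "\n".join(parts)
```

The resulting page could then be printed to PDF from any browser, since every post is present in the document and nothing depends on scrolling.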


(Jeff Atwood) #13

Good ideas here, but I have no resources to allocate to this task, and won’t for the foreseeable future. It is possible @erlend_sh could fund something like this from the community, if someone here is particularly interested in building it.


(Tobias Eigen) #14

I occasionally wish for this myself as admin but am beginning to think that offering this to users would prevent the desired behavior - we want people to read and participate via discourse. It’s bad enough that we allow participation by email.

I’d love to see this implemented as a moderator wrench option to “export topic as PDF”. So in the odd instance when my CEO requests a PDF to read on the plane I can provide it for him but it’s not something that is readily available to all users.

(…maybe that option could be extended later to allow export as JSON that can then be imported to another discourse instance)


(Tarak'ha) #15

I remember a topic on SitePoint regarding this feature, as SitePoint is also powered by Discourse.

HTTrack may provide something in the meantime while waiting for a PR; with the proper scan rules set plus JavaScript turned off.


(Erlend Sogge Heggen) #16

I think the main use case of this feature, namely “going offline with Discourse”, is best solved by implementing ServiceWorkers and other new standard APIs for enabling offline functionality.

Another alternative path you could go down would be to commission someone to do a client-side browser plugin for this.


(Dan Thomas) #17

Just an FYI - If you add support for this, make sure that “code blocks” (or whatever you call them) don’t end up getting truncated. Since they’re usually displayed in a box with scroll bars, this is certainly a possibility.

Thanks.

PS: I actually love Discourse, as a user. I’m a typical user in that I bellyache for something I don’t have, but I’d like to be atypical and say “Thanks”. Good stuff.


(Simon Cossar) #18

This might be too much of a workaround, but if you have access to a WordPress site, you can use this plugin to save a Discourse topic as a WordPress post:

And then use a plugin like this to convert the post to a PDF:

This will work for reasonably long topics, but not infinitely long. The ‘Discourse Topic Archive’ plugin needs better error handling. The ‘pdf-print’ plugin can be quite slow for long topics.

Here’s a PDF of the ‘What is the most awesome plugin for Discourse, that does not yet exits?’ topic that was created by this method.

awesome-plugins.pdf (1.1 MB)


(Dan Thomas) #19

Thanks.

Actually, what ended up working pretty well was to turn off JavaScript, select and copy the text on a page, paste it into TextEdit (I’m on a Mac), go to the next page, and repeat until done.

Surprisingly, TextEdit handled the pasted text pretty darn well - can’t say the same for Word.

So it wasn’t too bad.


(Dean Taylor) #20

I don’t know how long this will last (due to future coding changes)…
… totally not recommended …
… and you follow these instructions at your own risk - ie. there is “no support” …
… and your browser may crash doing this …
… but for anybody desperate enough to need to do it.

There is a global variable that influences “cloaking” - which is what hides / removes content when it scrolls out of view to save memory / allow Discourse to behave as well as it does on low-powered devices whilst scrolling.

That variable is window.inTestEnv.

However it needs to be set before the main Discourse code loads.

You can use Chrome Dev Tools to do this:

  1. Go to the topic page you want to print in the browser
  2. Scroll to the top
  3. Press F12 (this opens Chrome Dev Tools)
  4. Press F5 (refreshes the page)
  5. Select the “Sources” tab
  6. Find the source code to the page you are currently on in the left hand nav, double click it to open the actual source code.
  7. Scroll down until you find the first <script> HTML tag in the source
  8. Add a breakpoint by clicking the line number of the first line of script.
  • You should see a blue arrow, like the one on line 43 pictured above (it might not be line 43).
  9. Press F5 (refreshes the page), this time stopping the JavaScript execution on the break point.
  10. Press Esc, this toggles the display of the “Console”, if you don’t see it - press it again and it should appear.
  11. In the Console type window.inTestEnv = true and press Enter
  12. Press F8 - this resumes the JavaScript execution and the page loading.
  13. Press F12 (this closes Chrome Dev Tools)
  14. Slowly and repeatedly press Page Dn until you reach the bottom of the topic.
  15. Press Ctrl+P (this opens the print dialog).
  16. Select “Save as PDF” Destination and press “Save”.
  17. Select the location to save the file, press “Save”.

Note that you may have to repeat the sequence 15 through 17 quickly / repeatedly until the file actually saves - as the print dialog closes / crashes when there is an issue. I found doing 15 through 17 faster and changing the destination away from a printer to PDF got me there in the end.
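
For anyone who would rather script the steps above than click through them, here is an untested sketch using Playwright for Python. It is a swapped-in technique, not part of the original instructions: `add_init_script` stands in for the breakpoint trick by setting `window.inTestEnv` before the page's own scripts run. It assumes Playwright and its Chromium build are installed, and that the Discourse version in use still honours `window.inTestEnv`, which may no longer be true. The function name, the wait time and the "bottom reached" check are all invented for illustration.

```python
# Same statement the steps above type into the Console.
INIT_SCRIPT = "window.inTestEnv = true;"


def save_topic_pdf(topic_url: str, out_path: str,
                   max_presses: int = 2000) -> None:
    """Load a topic with inTestEnv set early, page down, save as PDF."""
    # Imported here so the module loads even without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless Chromium, needed for page.pdf()
        page = browser.new_page()
        # Runs before any script on the page, like stopping at the breakpoint.
        page.add_init_script(INIT_SCRIPT)
        page.goto(topic_url)

        times_at_bottom = 0
        for _ in range(max_presses):
            page.keyboard.press("PageDown")
            page.wait_for_timeout(300)  # "slowly": let more posts load
            at_bottom = page.evaluate(
                "window.innerHeight + window.scrollY"
                " >= document.body.scrollHeight - 2"
            )
            # Stop only after staying at the bottom for several presses,
            # since infinite scrolling may still append more posts.
            times_at_bottom = times_at_bottom + 1 if at_bottom else 0
            if times_at_bottom >= 5:
                break

        page.pdf(path=out_path, format="A4")
        browser.close()
```

Calling `save_topic_pdf("https://forum.example.com/t/some-topic/123", "topic.pdf")` (a hypothetical URL) would write the PDF to `topic.pdf`. The same warnings apply as for the manual route: no support, it may break at any time, and it should only be pointed at your own instance.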

I test printed the “1000 replies” topic on “try” and it “works for me” to prove it works for long topics (outputs a 206 page A4 document).

Please don’t try this here on “meta”, only on your own personal instances.


By the way, my opinion is that time should be put into Service Worker / offline support - not printing.