Monday, January 23, 2012

3 Ways to Serve PDF Files using Htaccess Cookies, Headers, Rewrites

SkyHi @ Monday, January 23, 2012
FYI, the Mod_Rewrite Variables Cheatsheet makes this example, and all advanced .htaccess code, easier to understand. This demo lets you set a cookie to 1 of 3 values; you then request the PDF file with a normal link click and get 1 of 3 different responses. This is accomplished with a nice bit of .htaccess code.

As I explain the .htaccess code that achieves this, keep in mind that this is merely one simple application of it. It's much more advanced than your basic .htaccess trick: notice how this .htaccess file acts like a PHP script, which is very unusual. I really wanted to share this trick after I created it for one of my clients, because this is just the tip of the iceberg. Another use would be to serve an alternate style sheet depending on a user's theme preference. The coolest thing is that it combines multiple advanced .htaccess ideas. This code uses mod_headers to set the Content-Disposition header that forces a download, and uses mod_rewrite to:
  1. Send different Content-Type headers
  2. Check the value of a cookie
  3. Set environment variables for later use by the mod_headers Header directive
Set PDF Viewing Mode - Make a selection, then click the View PDF button.
Inline | Download | Save As | View PDF using selected mode »

What's Going On

There are 3 different ways for a server to send a PDF file in response to a request for one, and each leads to a different way of opening or viewing the PDF file in the client's browser.
  1. The browser displays a "Save File As" dialog, allowing you to save the file or open it.
  2. The browser opens the PDF file "Inline", displaying it in the browser like a web page.
  3. The browser "Downloads" the PDF file automatically as an "Attachment" and then has an external PDF reader program such as Adobe Reader open the file.
Some people prefer to have the option of saving the file to view later, some prefer opening it with an external program, and some just like the PDF file to load right in the browser. The point is that by using .htaccess, we can let them choose any of the 3 methods and save their preference for all further PDF files that user requests from our site.

How It Works

When you click one of the 3 demo buttons above, "Inline", "Save As", or "Download", the JavaScript below saves a cookie named askapache_pdf in your browser, with its value set according to the button you clicked. Then, when you request the PDF file, the .htaccess code below uses mod_rewrite to read the value of the askapache_pdf cookie and, depending on your preference, sends different HTTP headers that control how your browser handles the file.

Unique HTTP Headers Returned

When it comes down to it, the following headers are what define the 3 modes. Notice that each set is different, because these headers are the only thing controlling how your browser handles the file.

Save As Mode (askapache_pdf=s)

Content-Disposition: attachment
Content-Type: application/pdf

Inline Mode (askapache_pdf=i)

Content-Type: application/pdf

Download Mode (askapache_pdf=a)

Content-Type: application/octet-stream
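
To check which mode a response actually used, the three header sets listed above can be told apart programmatically. The following is only a sketch: classifyPdfResponse is a hypothetical helper name that does not appear in the original demo, and it assumes the response headers have already been collected into a plain object.

```javascript
// Hypothetical helper (not part of the original demo): given a plain object
// of response headers, report which of the three modes the server selected.
function classifyPdfResponse(headers) {
  // Header names are case-insensitive, so normalize the keys first.
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = String(value).trim().toLowerCase();
  }
  // Ignore any parameters after the media type, e.g. "; charset=...".
  const type = (h['content-type'] || '').split(';')[0].trim();
  const disposition = h['content-disposition'] || '';

  if (type === 'application/pdf' && disposition.startsWith('attachment')) return 'Save As';
  if (type === 'application/pdf') return 'Inline';
  if (type === 'application/octet-stream') return 'Download';
  return 'Unknown';
}

console.log(classifyPdfResponse({ 'Content-Type': 'application/pdf' })); // Inline
```

Anything that matches none of the three header sets is reported as 'Unknown' rather than guessed at.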

Htaccess Demo File

For the demo I created the folder /storage/pdf/, and this is the .htaccess file at /storage/pdf/.htaccess
Set the default Content-Type for .pdf files. This gives the Content-Type header for .pdf files the default value 'application/pdf', but the default can be overridden by using a RewriteRule with the [T='different/type'] flag.
AddType application/pdf .pdf
Turn on the rewrite engine. If it's already on, you don't need this.
RewriteEngine On
Skip the RewriteRules if the request is not for a .pdf file, such as an autoindex request. The next 2 RewriteRule directives are specific to .pdf files, so if the requested filename does not end in .pdf, the [S=2] flag causes the next 2 RewriteRule directives to be skipped completely.
RewriteRule !.*\.pdf$ - [S=2]
The first RewriteCond checks whether the askapache_pdf cookie is NOT set. The second RewriteCond checks whether the askapache_pdf cookie has the value s, which corresponds to someone clicking the "Save As" button.
The [NC,OR] flags mean that if the askapache_pdf cookie does not exist, OR (next cond) if the askapache_pdf cookie does exist and is set to 's', then the RewriteRule is processed. If neither cond is true, the RewriteRule is skipped.
The RewriteRule applies to any and all requests (.*) but doesn't rewrite anything (-). Instead, it sets the Apache environment variable ASKAPACHE_PDFS to 1 when either RewriteCond is true. The variable can then be checked by directives that follow the RewriteRule in the .htaccess file. The name ASKAPACHE_PDFS ends in S because, if this variable exists, the user's preference is 'Save As'.
Notice that if the user requested the PDF file without selecting a preference, i.e. no cookie exists, the ASKAPACHE_PDFS variable is still set. This lets us pick the default preference for them; in this example the default is 'Save As'.
RewriteCond %{HTTP_COOKIE} !^.*askapache_pdf.*$ [NC,OR]
RewriteCond %{HTTP_COOKIE} ^.*askapache_pdf=s.*$ [NC]
RewriteRule .* - [E=ASKAPACHE_PDFS:1]
The RewriteCond checks the askapache_pdf cookie for the value 'a', which represents 'Download'.
If the cookie's value is 'a', the RewriteRule overrides the default Content-Type of 'application/pdf' set with AddType earlier, changing it to 'application/octet-stream'. This is a special content type that tells the browser the file cannot be loaded 'Inline' but must be saved, after which it is opened by an external viewer depending on browser configuration and plugins.
RewriteCond %{HTTP_COOKIE} ^.*askapache_pdf=a.*$
RewriteRule .* - [T=application/octet-stream]
This is superfly. If the user's cookie preference was 'Save As' (s), then the RewriteRule before the last one set the environment variable ASKAPACHE_PDFS to 1. The Header directive here is ONLY processed if the variable ASKAPACHE_PDFS exists. That is what the trailing 'env=ASKAPACHE_PDFS' does: it is the condition that must be met, or the Header directive is skipped. If the ASKAPACHE_PDFS environment variable set by the RewriteRule does exist, the Header directive adds the header 'Content-Disposition: attachment' to the normal response headers. The 'Content-Disposition: attachment' header instructs your browser to present you with the 'Save As' dialog box, allowing you to choose whether you want to save or open the file.
Header set Content-Disposition "attachment" env=ASKAPACHE_PDFS
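
To make the decision flow easier to follow, here is a rough JavaScript model of what the rules above decide for a given request. It is a sketch under stated assumptions, not how Apache works internally: pdfResponseHeaders is a hypothetical function name, the path is assumed to be relative to /storage/pdf/, and only the headers influenced by this .htaccess file are returned.

```javascript
// Sketch only: mirrors the demo .htaccess rules in plain JavaScript.
// pdfResponseHeaders is a hypothetical name, not part of Apache or the demo.
function pdfResponseHeaders(path, cookieHeader) {
  const cookie = cookieHeader || '';

  // RewriteRule !.*\.pdf$ - [S=2]: non-PDF requests skip both cookie rules,
  // so nothing is added for them.
  if (!/\.pdf$/.test(path)) return {};

  // AddType application/pdf .pdf: the default Content-Type.
  const headers = { 'Content-Type': 'application/pdf' };

  // No askapache_pdf cookie at all, OR askapache_pdf=s (both conds use [NC]).
  // This stands in for the ASKAPACHE_PDFS environment variable.
  const saveAs = !/askapache_pdf/i.test(cookie) || /askapache_pdf=s/i.test(cookie);

  // askapache_pdf=a: override the type, like [T=application/octet-stream].
  if (/askapache_pdf=a/.test(cookie)) {
    headers['Content-Type'] = 'application/octet-stream';
  }

  // Header set Content-Disposition "attachment" env=ASKAPACHE_PDFS
  if (saveAs) headers['Content-Disposition'] = 'attachment';
  return headers;
}

console.log(pdfResponseHeaders('demo.pdf', 'askapache_pdf=i'));
```

Calling it with an empty cookie string shows the default behavior described above: the 'Save As' headers are returned even though no preference was chosen.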

JavaScript used by Demo

The best place for JavaScript information is QuirksMode, which has a definitive article on setting, reading, and parsing cookies.
Note: I now prefer using jQuery over my AAJS JavaScript library. Also, the whole cookie aspect is just there to highlight some advanced .htaccess; you can accomplish this much more easily without JavaScript or cookies.
// gi, getCookie, setCookie and addMyEvent are helpers from the AAJS library (not shown here).
(function(){
var pdfr=gi('pdfr');
if(!pdfr)return; // no status element on this page, so nothing to do
var cval=getCookie('askapache_pdf');

// Show the currently saved preference, if any.
if(cval=='i'){pdfr.innerHTML='Currently set to "Inline".';}
else if(cval=='a'){pdfr.innerHTML='Currently set to "Download" mode.';}
else if(cval=='s'){pdfr.innerHTML='Currently set to "Save As" mode.';}

// Each button stores its mode in the askapache_pdf cookie.
addMyEvent(gi('pdfi'),"mousedown",function(){
  setCookie("askapache_pdf", "i", "", "/", "www.askapache.com"); gi('pdfr').innerHTML = 'Changed mode to "Inline".'; return false; });
addMyEvent(gi('pdfa'),"mousedown",function(){
  setCookie("askapache_pdf", "a", "", "/", "www.askapache.com"); gi('pdfr').innerHTML = 'Changed mode to "Download".'; return false; });
addMyEvent(gi('pdfs'),"mousedown",function(){
  setCookie("askapache_pdf", "s", "", "/", "www.askapache.com"); gi('pdfr').innerHTML = 'Changed mode to "Save As".'; return false; });
})();
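
The cookie helpers called in the demo script come from the AAJS library and are not shown. As a rough idea of what such helpers do, here is a minimal sketch; buildCookieString and readCookie are hypothetical names, and the argument order (name, value, expires, path, domain) is only inferred from the calls in the demo. They are written as pure functions so they can run outside a browser; in a page you would assign the result of buildCookieString to document.cookie and pass document.cookie to readCookie.

```javascript
// Build the string a setCookie-style helper would assign to document.cookie.
// Hypothetical sketch; the real AAJS implementation may differ.
function buildCookieString(name, value, expires, path, domain) {
  let cookie = encodeURIComponent(name) + '=' + encodeURIComponent(value);
  if (expires) cookie += '; expires=' + expires; // empty string means a session cookie
  if (path) cookie += '; path=' + path;
  if (domain) cookie += '; domain=' + domain;
  return cookie;
}

// Find one cookie's value in a "k1=v1; k2=v2" string such as document.cookie.
// Returns null when the cookie is not present.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip fragments without a value
    const key = pair.slice(0, index).trim();
    if (decodeURIComponent(key) === name) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return null;
}

console.log(buildCookieString('askapache_pdf', 'i', '', '/', 'www.askapache.com'));
// askapache_pdf=i; path=/; domain=www.askapache.com
```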

Alternative Method - No Cookies + PHP

This is what I came up with first for my client. Then, while programming the PHP, I noticed: Hey! I think I can do the same thing using .htaccess, which would save me CPU, memory, potential security issues, etc. But this works great too, though you will probably need to hack the code to get it working.
Note that the .htaccess rewrite code I used here relies on the file name, FILENAME-i.pdf or FILENAME-s.pdf, to pass the preference to the pdf-dl.php script; it also worked for FILENAME.pdf?i=i

pdf-dl.php

Alternate Method .htaccess

Deny direct requests to the pdf-dl.php file
RewriteCond %{THE_REQUEST} ^.*pdf-dl\.php.*$ [NC]
RewriteRule .* - [F]
Handle PDF files named anything-i.pdf as inline
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ([^/]*)-i\.pdf$  /cgi-bin/pdf-dl.php?i=i&file=%{DOCUMENT_ROOT}/storage/pdf/$1.pdf [L,NC,QSA,S=1]
Handle PDF files without -i.pdf as attachments
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ([^/]*)\.pdf$  /cgi-bin/pdf-dl.php?file=%{DOCUMENT_ROOT}/storage/pdf/$1.pdf [L,NC,QSA]
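
To see what the rewrite rules above produce for a given request, here is a small JavaScript model of a single pass through them. This is a sketch, not Apache behavior in full: rewritePdfRequest, docRoot and fileExists are hypothetical names introduced for illustration, with docRoot standing in for %{DOCUMENT_ROOT} and fileExists standing in for the !-f test.

```javascript
// Sketch only: models one pass through the three alternate-method rules.
// rewritePdfRequest, docRoot and fileExists are hypothetical names.
function rewritePdfRequest(path, query, docRoot, fileExists) {
  // Deny direct requests that mention pdf-dl.php ([F] means 403 Forbidden).
  if (/pdf-dl\.php/i.test(path + '?' + (query || ''))) return { action: 'forbid' };

  // RewriteCond %{REQUEST_FILENAME} !-f: leave real files alone.
  if (fileExists(path)) return { action: 'pass', target: path };

  // [QSA] appends the original query string to the new one.
  const qsa = query ? '&' + query : '';
  const base = '/cgi-bin/pdf-dl.php?';

  // anything-i.pdf is handed to the script with i=i (inline).
  let m = /([^\/]*)-i\.pdf$/i.exec(path);
  if (m) {
    return { action: 'rewrite', target: base + 'i=i&file=' + docRoot + '/storage/pdf/' + m[1] + '.pdf' + qsa };
  }

  // Any other .pdf is handed to the script as an attachment.
  m = /([^\/]*)\.pdf$/i.exec(path);
  if (m) {
    return { action: 'rewrite', target: base + 'file=' + docRoot + '/storage/pdf/' + m[1] + '.pdf' + qsa };
  }
  return { action: 'pass', target: path };
}

// '/var/www' is an assumed document root, used only for this example.
console.log(rewritePdfRequest('/storage/pdf/report-i.pdf', '', '/var/www', () => false).target);
// /cgi-bin/pdf-dl.php?i=i&file=/var/www/storage/pdf/report.pdf
```

Because the inline rule is checked first and ends with [L], a name such as report-i.pdf never reaches the attachment rule, even though it would match that pattern too.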



REFERENCES
http://www.askapache.com/htaccess/pdf-cookies-headers-rewrites.html