Reference: mod_rewrite, URL rewriting and “pretty links” explained
Reference: mod_rewrite, URL rewriting and “pretty links” explained
"Pretty links" is an often requested topic, but it is rarely fully explained. mod_rewrite is one way to make "pretty links", but it's complex and its syntax is very terse, hard to grok and the documentation assumes a certain level of proficiency in HTTP. Can someone explain in simple terms how "pretty links" work and how mod_rewrite can be used to create them?
Other common names, aliases, terms for clean urls: RESTful URLs, User-friendly URLs, SEO-friendly URLs, Slugging, MVC urls (probably a misnomer)
@Mike Sort of, but slugs are often a part of pretty URLs. A slug is pretty specifically when, for example, the headline of an article is turned into a URL-friendly form which then acts as the identifier of that article. So
reference-mod-rewrite-url-rewriting-explained
is the slug, /questions/20563772/reference-mod-rewrite-url-rewriting-explained
is the pretty URL.– deceze♦
Dec 13 '13 at 12:54
reference-mod-rewrite-url-rewriting-explained
/questions/20563772/reference-mod-rewrite-url-rewriting-explained
I think that the
.htaccess
and mod-rewrite
tags should be updated to include a link to this question, as it covers much of what is asked on a regular basis. Thoughts?– Mike Rockétt
Feb 13 '16 at 13:23
.htaccess
mod-rewrite
4 Answers
4
To understand what mod_rewrite does you first need to understand how a web server works. A web server responds to HTTP requests. An HTTP request at its most basic level looks like this:
GET /foo/bar.html HTTP/1.1
This is the simple request of a browser to a web server requesting the URL /foo/bar.html
from it. It is important to stress that it does not request a file, it requests just some arbitrary URL. The request may also look like this:
/foo/bar.html
GET /foo/bar?baz=42 HTTP/1.1
This is just as valid a request for a URL, and it has more obviously nothing to do with files.
The web server is an application listening on a port, accepting HTTP requests coming in on that port and returning a response. A web server is entirely free to respond to any request in any way it sees fit/in any way you have configured it to respond. This response is not a file, it's an HTTP response which may or may not have anything to do with physical files on any disk. A web server doesn't have to be Apache, there are many other web servers which are all just programs which run persistently and are attached to a port which respond to HTTP requests. You can write one yourself. This paragraph was intended to divorce you from any notion that URLs directly equal files, which is really important to understand. :)
The default configuration of most web servers is to look for a file that matches the URL on the hard disk. If the document root of the server is set to, say, /var/www
, it may look whether the file /var/www/foo/bar.html
exists and serve it if so. If the file ends in ".php" it will invoke the PHP interpreter and then return the result. All this association is completely configurable; a file doesn't have to end in ".php" for the web server to run it through the PHP interpreter, and the URL doesn't have to match any particular file on disk for something to happen.
/var/www
/var/www/foo/bar.html
mod_rewrite is a way to rewrite the internal request handling. When the web server receives a request for the URL /foo/bar
, you can rewrite that URL into something else before the web server will look for a file on disk to match it. Simple example:
/foo/bar
RewriteEngine On
RewriteRule /foo/bar /foo/baz
This rule says whenever a request matches "/foo/bar", rewrite it to "/foo/baz". The request will then be handled as if /foo/baz
had been requested instead. This can be used for various effects, for example:
/foo/baz
RewriteRule (.*) $1.html
This rule matches anything (.*
) and captures it ((..)
), then rewrites it to append ".html". In other words, if /foo/bar
was the requested URL, it will be handled as if /foo/bar.html
had been requested. See http://regular-expressions.info for more information about regular expression matching, capturing and replacements.
.*
(..)
/foo/bar
/foo/bar.html
Another often encountered rule is this:
RewriteRule (.*) index.php?url=$1
This, again, matches anything and rewrites it to the file index.php with the originally requested URL appended in the url
query parameter. I.e., for any and all requests coming in, the file index.php is executed and this file will have access to the original request in $_GET['url']
, so it can do anything it wants with it.
url
$_GET['url']
Primarily you put these rewrite rules into your web server configuration file. Apache also allows* you to put them into a file called .htaccess
within your document root (i.e. next to your .php files).
.htaccess
* If allowed by the primary Apache configuration file; it's optional, but often enabled.
What mod_rewrite does not do
mod_rewrite does not magically make all your URLs "pretty". This is a common misunderstanding. If you have this link in your web site:
<a href="/my/ugly/link.php?is=not&very=pretty">
there's nothing mod_rewrite can do to make that pretty. In order to make this a pretty link, you have to:
Change the link to a pretty link:
<a href="/my/pretty/link">
Use mod_rewrite on the server to handle the request to the URL /my/pretty/link
using any one of the methods described above.
/my/pretty/link
(One could use mod_substitute
in conjunction to transform outgoing HTML pages and their contained links. Though this is usally more effort than just updating your HTML resources.)
mod_substitute
There's a lot mod_rewrite can do and very complex matching rules you can create, including chaining several rewrites, proxying requests to a completely different service or machine, returning specific HTTP status codes as responses, redirecting requests etc. It's very powerful and can be used to great good if you understand the fundamental HTTP request-response mechanism. It does not automatically make your links pretty.
See the official documentation for all the possible flags and options.
Maybe mention the FallbackResource directive introduced in version 2.2.16 as the preffered way of rewriting to a dispatcher.
– Darsstar
Dec 13 '13 at 10:31
To expand on deceze's answer, I wanted to provide a few examples and explanation of some other mod_rewrite functionality.
All of the below examples assume that you have already included RewriteEngine On
in your .htaccess
file.
RewriteEngine On
.htaccess
Rewrite Example
Lets take this example:
RewriteRule ^blog/([0-9]+)/([A-Za-z0-9-+]+)/?$ /blog/index.php?id=$1&title=$2 [NC,L,QSA]
The rule is split into 4 sections:
RewriteRule
^blog/([0-9]+)/([A-Za-z0-9-+]+)/?$
blog/index.php?id=$1&title=$2
[NC,L,QSA]
The above rewrite would allow you to link to something like /blog/1/foo/
and it would actually load /blog/index.php?id=1&title=foo
.
/blog/1/foo/
/blog/index.php?id=1&title=foo
^
example.com/blog/...
example.com/foo/blog/...
(…)
([0-9]+)
$1
-
+
+
$2
?
/blog/1/foo/
/blog/1/foo
$
These are options that are added in square brackets at the end of your rewrite rule to specify certain conditions. Again, there are a lot of different flags which you can read up on in the documentation, but I'll go through some of the more common flags:
NC
The no case flag means that the rewrite rule is case insensitive, so for the example rule above this would mean that both /blog/1/foo/
and /BLOG/1/foo/
(or any variation of this) would be matched.
/blog/1/foo/
/BLOG/1/foo/
L
The last flag indicates that this is the last rule that should be processed. This means that if and only if this rule matches, no further rules will be evaluated in the current rewrite processing run. If the rule does not match, all other rules will be tried in order as usual. If you do not set the L
flag, all following rules will be applied to the rewritten URL afterwards.
L
END
Since Apache 2.4 you can also use the [END]
flag. A matching rule with it will completely terminate further alias/rewrite processing. (Whereas the [L]
flag can oftentimes trigger a second round, for example when rewriting into or out of subdirectories.)
[END]
[L]
QSA
The query string append flag allows us to pass in extra variables to the specified URL which will get added to the original get parameters. For our example this means that something like /blog/1/foo/?comments=15
would load /blog/index.php?id=1&title=foo&comments=15
/blog/1/foo/?comments=15
/blog/index.php?id=1&title=foo&comments=15
R
This flag isn't one I used in the example above, but is one I thought is worth mentioning. This allows you to specify a http redirect, with the option to include a status code (e.g. R=301
). For example if you wanted to do a 301 redirect on /myblog/ to /blog/ you would simply write a rule something like this:
R=301
RewriteRule ^/myblog/(*.)$ /blog/$1 [R=301,QSA,L]
Rewrite Conditions
Rewrite conditions make rewrites even more powerful, allowing you to specify rewrites for more specific situations. There are a lot of conditions which you can read about in the documentation, but I'll touch on a few common examples and explain them:
# if the host doesn't start with www. then add it and redirect
RewriteCond %HTTP_HOST !^www.
RewriteRule ^ http://www.%HTTP_HOST%REQUEST_URI [L,R=301]
This is a very common practice, which will prepend your domain with www.
(if it isn't there already) and execute a 301 redirect. For example, loading up http://example.com/blog/
it would redirect you to http://www.example.com/blog/
www.
http://example.com/blog/
http://www.example.com/blog/
# if it cant find the image, try find the image on another domain
RewriteCond %REQUEST_URI .(jpg|jpeg|gif|png)$ [NC]
RewriteCond %REQUEST_FILENAME !-f
RewriteCond %REQUEST_FILENAME !-d
RewriteRule (.*)$ http://www.example.com/$1 [L]
This is slightly less common, but is a good example of a rule that doesn't execute if the filename is a directory or file that exists on the server.
%REQUEST_URI .(jpg|jpeg|gif|png)$ [NC]
%REQUEST_FILENAME !-f
%REQUEST_FILENAME !-d
Stack Overflow has many other great resources to get started:
^/
.htaccess
And newcomer-friendly regex overviews even:
.*
[^/]+
d+
w+
[A-Za-z0-9_]
[w-]+
-
_
[w-.,]+
-
[…]
.
.
[…]
Each of these placeholders is usually wrapped in (…)
parentheses as capture group. And the whole pattern often in ^………$
start + end markers. Quoting "patterns" is optional.
(…)
^………$
RewriteRules
The following examples are PHP-centric and a bit more incremental, easier to adapt for similar cases.
They're just summaries, often link to more variations or detailed Q&As.
/contact
/about
Shortening a few page names to internal file schemes is most simple:
RewriteRule ^contact$ templ/contact.html
RewriteRule ^about$ about.php
/object/123
Introducing shortcuts like http://example.com/article/531
to existing PHP scripts is also easy. The numeric placeholder can just be remapped to a $_GET
parameter:
http://example.com/article/531
$_GET
RewriteRule ^article/(d+)$ article-show.php?id=$1
# └───────────────────────────┘
/article/with-some-title-slug
You can easily extend that rule to allow for /article/title-string
placeholders:
/article/title-string
RewriteRule ^article/([w-]+)$ article-show.php?title=$1
# └────────────────────────────────┘
Note that your script must be able (or be adapted) to map those titles back to database-ids. RewriteRules alone can't create or guess information out of thin air.
/readable/123-plus-title
Therefore you'll often see mixed /article/529-title-slug
paths used in practice:
/article/529-title-slug
RewriteRule ^article/(d+)-([w-]+)$ article.php?id=$1&title=$2
# └───────────────────────────────┘
Now you could just skip passing the title=$2
anyway, because your script will typically rely on the database-id anyway. The -title-slug
has become arbitrary URL decoration.
title=$2
-title-slug
/foo/…
/bar/…
/baz/…
If you have similar rules for multiple virtual page paths, then you can match and compact them with |
alternative lists. And again just reassign them to internal GET parameters:
|
# ┌─────────────────────────┐
RewriteRule ^(blog|post|user)/(w+)$ disp.php?type=$1&id=$2
# └───────────────────────────────────┘
You can split them out into individual RewriteRule
s should this get too complex.
RewriteRule
/date/SWITCH/backend
A more practical use of alternative lists are mapping request paths to distinct scripts. For example to provide uniform URLs for an older and a newer web application based on dates:
# ┌─────────────────────────────┐
# │ ┌───────────┼───────────────┐
RewriteRule ^blog/(2009|2010|2011)/([d-]+)/?$ old/blog.php?date=$2
RewriteRule ^blog/(d+)/([d-]+)/?$ modern/blog/index.php?start=$2
# └──────────────────────────────────────┘
This simply remaps 2009-2011 posts onto one script, and all other years implicitly to another handler.
Note the more specific rule coming first. Each script might use different GET params.
/
/user-123-name
You're most commonly seeing RewriteRules to simulate a virtual directory structure. But you're not forced to be uncreative. You can as well use -
hyphens for segmenting or structure.
-
RewriteRule ^user-(d+)$ show.php?what=user&id=$1
# └──────────────────────────────┘
# This could use `(w+)` alternatively for user names instead of ids.
For the also common /wiki:section:Page_Name
scheme:
/wiki:section:Page_Name
RewriteRule ^wiki:(w+):(w+)$ wiki.php?sect=$1&page=$2
# └─────┼────────────────────┘ │
# └────────────────────────────┘
Occasionally it's suitable to alternate between /
-delimiters and :
or .
in the same rule even. Or have two RewriteRules again to map variants onto different scripts.
/
:
.
/
/dir
/dir/
When opting for directory-style paths, you can make it reachable with and without a final /
RewriteRule ^blog/([w-]+)/?$ blog/show.php?id=$1
# ┗┛
Now this handles both http://example.com/blog/123
and /blog/123/
. And the /?$
approach is easy to append onto any other RewriteRule.
http://example.com/blog/123
/blog/123/
/?$
.*/.*/.*/.*
Most rules you'll encounter map a constrained set of /…/
resource path segments to individual GET parameters. Some scripts handle a variable number of options however.
The Apache regexp engine doesn't allow optionalizing an arbitrary number of them. But you can easily expand it into a rule block yourself:
/…/
Rewriterule ^(w+)/?$ in.php?a=$1
Rewriterule ^(w+)/(w+)/?$ in.php?a=$1&b=$2
Rewriterule ^(w+)/(w+)/(w+)/?$ in.php?a=$1&b=$2&c=$3
# └─────┴─────┴───────────────────┴────┴────┘
If you need up to five path segments, then copy this scheme along into five rules. You can of course use a more specific [^/]+
placeholder each.
Here the ordering isn't as important, as neither overlaps. So having the most frequently used paths first is okay.
[^/]+
Alternatively you can utilize PHPs array parameters via ?p=$1&p=$2&p=3
query string here - if your script merely prefers them pre-split.
(Though it's more common to just use a catch-all rule, and let the script itself expand the segments out of the REQUEST_URI.)
?p=$1&p=$2&p=3
See also: How do I transform my URL path segments into query string key-value pairs?
prefix/opt?/.*
A common variation is to have optional prefixes within a rule. This usually makes sense if you have static strings or more constrained placeholders around:
RewriteRule ^(w+)(?:/([^/]+))?/(w+)$ ?main=$1&opt=$2&suffix=$3
Now the more complex pattern (?:/([^/])+)?
there simply wraps a non-capturing (?:…)
group, and makes it optional )?
. The contained
placeholder ([^/]+)
would be substitution pattern $2
, but be empty if there's no middle /…/
path.
(?:/([^/])+)?
(?:…)
)?
([^/]+)
$2
/…/
/prefix/123-capture/…/*/…whatever…
As said before, you don't often want too generic rewrite patterns. It does however make sense to combine static and specific comparisons with a .*
sometimes.
.*
RewriteRule ^(specific)/prefix/(d+)(/.*)?$ speci.php?id=$2&otherparams=$2
This optionalized any /…/…/…
trailing path segments. Which then of course requires the handling script to split them up, and variabl-ify extracted parameters
itself (which is what Web-"MVC" frameworks do).
/…/…/…
/old/path.HTML
URLs don't really have file extensions. Which is what this entire reference is about (= URLs are virtual locators, not necessarily a direct filesystem image).
However if you had a 1:1 file mapping before, you can craft simpler rules:
RewriteRule ^styles/([w.-]+).css$ sass-cache.php?old_fn_base=$1
RewriteRule ^images/([w.-]+).gif$ png-converter.php?load_from=$2
Other common uses are remapping obsolete .html
paths to newer .php
handlers, or just aliasing directory names only for individual (actual/real) files.
.html
.php
/ugly.html
/pretty
So at some point you're rewriting your HTML pages to carry only pretty links, as outlined by deceze.
Meanwhile you'll still receive requests for the old paths, sometimes even from bookmarks. As workaround, you can ping-pong browsers to display/establish
the new URLs.
This common trick involves sending a 30x/Location redirect whenever an incoming URL follows the obsolete/ugly naming scheme.
Browsers will then rerequest the new/pretty URL, which afterwards is rewritten (just internally) to the original or new location.
# redirect browser for old/ugly incoming paths
RewriteRule ^old/teams.html$ /teams [R=301,QSA,END]
# internally remap already-pretty incoming request
RewriteRule ^teams$ teams.php [QSA,END]
Note how this example just uses [END]
instead of [L]
to safely alternate. For older Apache 2.2 versions you can use other workarounds, besides also remapping
query string parameters for example:
Redirect ugly to pretty URL, remap back to the ugly path, without infinite loops
[END]
[L]
/this+that+
It's not that pretty in browser address bars, but you can use spaces in URLs. For rewrite patterns use backslash-escaped ␣
spaces.
Else just "
-quote the whole pattern or substitution:
␣
"
RewriteRule "^this [w ]+/(.*)$" "index.php?id=$1" [L]
Clients serialize URLs with +
or %20
for spaces. Yet in RewriteRules they're interpreted with literal characters for all relative path segments.
+
%20
Frequent duplicates:
RewriteCond %REQUEST_URI !-f
RewriteCond %REQUEST_URI !-d
RewriteRule ^.*$ index.php [L]
Which is often used by PHP frameworks or WebCMS / portal scripts. The actual path splitting then is handled in PHP using $_SERVER["REQUEST_URI"]
. So conceptionally it's pretty much the opposite of URL handling "per mod_rewrite". (Just use FallBackResource
instead.)
$_SERVER["REQUEST_URI"]
FallBackResource
www.
Note that this doesn't copy a query string along, etc.
# ┌──────────┐
RewriteCond %HTTP_HOST ^www.(.+)$ [NC] │
RewriteRule ^(.*)$ http://%1/$1 [R=301,L] │
# ↓ └───┼────────────┘
# └───────────────┘
See also:
· URL rewriting for different protocols in .htaccess
· Generic htaccess redirect www to non-www
· .htaccess - how to force "www." in a generic way?
Note that RewriteCond/RewriteRule combos can be more complex, with matches (%1
and $1
) interacting in both directions even:
%1
$1
Apache manual - mod_rewrite intro, Copyright 2015 The Apache Software Foundation, AL-2.0
HTTPS://
RewriteCond %SERVER_PORT 80
RewriteRule ^(.*)$ https://example.com/$1 [R,L]
See also: https://wiki.apache.org/httpd/RewriteHTTPToHTTPS
RewriteCond %REQUEST_FILENAME.php -f
RewriteRule ^(.+)$ $1.php [L] # or [END]
See also: Removing the .php extension with mod_rewrite
See: http://httpd.apache.org/docs/2.4/rewrite/remapping.html#backward-compatibility
See mod_rewrite, php and the .htaccess file
See How can i get my htaccess to work (subdomains)?
Prevalent .htaccess
pitfalls
.htaccess
Now take this with a grain of salt. Not every advise can be generalized to all contexts.
This is just a simple summary of well-known and a few unobvious stumbling blocks:
mod_rewrite
.htaccess
To actually use RewriteRules in per-directory configuration files you must:
Check that your server has AllowOverride All
enabled. Otherwise your per-directory .htaccess
directives will go ignored, and RewriteRules won't work.
AllowOverride All
.htaccess
Obviously have mod_rewrite
enabled in your httpd.conf
modules section.
mod_rewrite
httpd.conf
Prepend each list of rules with RewriteEngine On
still. While mod_rewrite is implicitly active in <VirtualHost>
and <Directory>
sections,
the per-directory .htaccess
files need it individually summoned.
RewriteEngine On
<VirtualHost>
<Directory>
.htaccess
^/
You shouldn't start your .htaccess
RewriteRule patterns with ^/
normally:
.htaccess
^/
RewriteRule ^/article/d+$ …
↑
This is often seen in old tutorials. And it used to be correct for ancient Apache 1.x versions. Nowadays request paths are conveniently fully directory-relative in .htaccess
RewriteRules. Just leave the leading /
out.
.htaccess
/
· Note that the leading slash is still correct in <VirtualHost>
sections though. Which is why you often see it ^/?
optionalized for rule parity.
· Or when using a RewriteCond %REQUEST_URI
you'd still match for a leading /
.
· See also Webmaster.SE: When is the leading slash (/) needed in mod_rewrite patterns?
<VirtualHost>
^/?
RewriteCond %REQUEST_URI
/
<IfModule *>
You've probably seen this in many examples:
<IfModule mod_rewrite.c>
Rewrite…
</IfModule>
<VirtualHost>
.htaccess
However you don't want that usually in your own .htaccess
files.
.htaccess
500
404
What seems enticing as generalized safeguard, often turns out to be an obstacle in practice.
RewriteBase
Many copy+paste examples contain a RewriteBase /
directive. Which happens to be the implicit default anyway. So you don't actually need this. It's a workaround for fancy VirtualHost rewriting schemes, and misguessed DOCUMENT_ROOT paths for some shared hosters.
RewriteBase /
It makes sense to use with individual web applications in deeper subdirectories. It can shorten RewriteRule patterns in such cases. Generally it's best to prefer relative path specifiers in per-directory rule sets.
See also How does RewriteBase work in .htaccess
MultiViews
URL rewriting is primarily used for supporting virtual incoming paths. Commonly you just have one dispatcher script (index.php
) or a few individual handlers (articles.php
, blog.php
, wiki.php
, …). The latter might clash with similar virtual RewriteRule paths.
index.php
articles.php
blog.php
wiki.php
A request for /article/123
for example could map to article.php
with a /123
PATH_INFO implicitly. You'd either have to guard your rules then with the commonplace RewriteCond
!-f
+!-d
, and/or disable PATH_INFO support, or perhaps just disable Options -MultiViews
.
/article/123
article.php
/123
RewriteCond
!-f
!-d
Options -MultiViews
Which is not to say you always have to. Content-Negotiation is just an automatism to virtual resources.
See Everything you ever wanted to know about mod_rewrite
if you haven't already. Combining multiple RewriteRules often leads to interaction. This isn't something to prevent habitually per [L]
flag, but a scheme you'll embrace once versed.
You can re-re-rewrite virtual paths from one rule to another, until it reaches an actual target handler.
[L]
Still you'd often want to have the most specific rules (fixed string /forum/…
patterns, or more restrictive placeholders [^/.]+
) in the early rules.
Generic slurp-all rules (.*
) are better left to the later ones. (An exception is a RewriteCond -f/-d
guard as primary block.)
/forum/…
[^/.]+
.*
RewriteCond -f/-d
When you introduce virtual directory structures /blog/article/123
this impacts relative resource references in HTML (such as <img src=mouse.png>
).
Which can be solved by:
/blog/article/123
<img src=mouse.png>
href="/old.html"
src="/logo.png"
<base href="/index">
<head>
You could alternatively craft further RewriteRules to rebind .css
or .png
paths to their original locations.
But that's both unneeded, or incurs extra redirects and hampers caching.
.css
.png
See also: CSS, JS and images do not display with pretty url
A common misinterpetation is that a RewriteCond blocks multiple RewriteRules (because they're visually arranged together):
RewriteCond %SERVER_NAME localhost
RewriteRule ^secret admin/tools.php
RewriteRule ^hidden sqladmin.cgi
Which it doesn't per default. You can chain them using the [S=2]
flag. Else you'll have to repeat them. While sometimes you can craft an "inverted" primary rule to [END] the rewrite processing early.
[S=2]
You can't match RewriteRule index.php?x=y
, because mod_rewrite compares just against relative paths per default. You can match them separately however via:
RewriteRule index.php?x=y
RewriteCond %QUERY_STRING b(?:param)=([^&]+)(?:&|$)
RewriteRule ^add/(.+)$ add/%1/$1 # ←──﹪₁──┘
See also How can I match query string variables with mod_rewrite?
.htaccess
<VirtualHost>
If you're using RewriteRules in a per-directory config file, then worrying about regex performance is pointless. Apache retains
compiled PCRE patterns longer than a PHP process with a common routing framework. For high-traffic sites you should however consider
moving rulesets into the vhost server configuration, once they've been battle-tested.
In this case, prefer the optionalized ^/?
directory separator prefix. This allows to move RewriteRules freely between PerDir and server
config files.
^/?
Fret not.
Compare access.log
and error.log
access.log
error.log
Often you can figure out how a RewriteRule misbehaves just from looking at your error.log
and access.log
.
Correlate access times to see which request path originally came in, and which path/file Apache couldn't resolve to (error 404/500).
error.log
access.log
This doesn't tell you which RewriteRule is the culprit. But inaccessible final paths like /docroot/21-.itle?index.php
may give away where to inspect further.
Otherwise disable rules until you get some predictable paths.
/docroot/21-.itle?index.php
Enable the RewriteLog
See Apache RewriteLog docs. For debugging you can enable it in the vhost sections:
# Apache 2.2
RewriteLogLevel 5
RewriteLog /tmp/rewrite.log
# Apache 2.4
LogLevel alert rewrite:trace5
#ErrorLog /tmp/rewrite.log
That yields a detailed summary of how incoming request paths get modified by each rule:
[..] applying pattern '^test_.*$' to uri 'index.php'
[..] strip per-dir prefix: /srv/www/vhosts/hc-profi/index.php -> index.php
[..] applying pattern '^index.php$' to uri 'index.php'
Which helps to narrow down overly generic rules and regex mishaps.
See also:
· .htaccess not working (mod_rewrite)
· Tips for debugging .htaccess rewrite rules
Before asking your own question
As you might know, Stack Overflow is very suitable for asking questions on mod_rewrite. Make them on-topic
by including prior research and attempts (avoid redundant answers), demonstrate basic regex understanding, and:
$_SERVER
access.log
error.log
rewrite.log
This nets quicker and more exact answers, and makes them more useful to others.
.htaccess
If you copy examples from somewhere, take care to include a # comment and origin link
. While it's merely bad manners to omit attribution,
it often really hurts maintenance later. Document any code or tutorial source. In particular while unversed you should be
all the more interested in not treating them like magic blackboxes.
# comment and origin link
Disclaimer: Just a pet peeve. You often hear pretty URL rewriting schemes referred to as "SEO" links or something. While this is useful for googling examples, it's a dated misnomer.
None of the modern search engines are really disturbed by .html
and .php
in path segments, or ?id=123
query strings for that matter. Search engines of old, such as AltaVista, did avoid crawling websites with potentially ambigious access paths. Modern crawlers are often even craving for deep web resources.
.html
.php
?id=123
What "pretty" URLs should conceptionally be used for is making websites user-friendly.
/common/tree/nesting
However don't sacrifice unique requirements for conformism.
There are various online tools to generate RewriteRules for most GET-parameterish URLs:
Mostly just output [^/]+
generic placeholders, but likely suffices for trivial sites.
[^/]+
Still needs a bit rewriting, more links, and the many subheadlines are somewhat obnoxious. There's some overlap with the other answers here, so can be cut down perhaps. It's mainly about the visual examples though, and that list of common gotchas.
– mario
Jul 7 '15 at 22:07
Didn't saw such a beauty of an answer for a long time! My eyes are glowing while I'm reading it. Please don't stop posting such answers :)
– Rizier123
Jul 8 '15 at 18:56
Excellent post. Made me understand the basic concepts of mod_rewrite very quickly!
– breez
Dec 9 '16 at 11:20
Many basic virtual URL schemes can be achieved without using RewriteRules. Apache allows PHP scripts to be invoked without .php
extension, and with a virtual PATH_INFO
argument.
.php
PATH_INFO
Nowadays AcceptPathInfo On
is often enabled by default. Which basically allows .php
and other resource URLs to carry a virtual argument:
AcceptPathInfo On
.php
http://example.com/script.php/virtual/path
Now this /virtual/path
shows up in PHP as $_SERVER["PATH_INFO"]
where you can handle any extra arguments however you like.
/virtual/path
$_SERVER["PATH_INFO"]
This isn't as convenient as having Apache separate input path segments into $1
, $2
, $3
and passing them as distinct $_GET
variables to PHP. It's merely emulating "pretty URLs" with less configuration effort.
$1
$2
$3
$_GET
.php
The simplest option to also eschew .php
"file extensions" in URLs is enabling:
.php
Options +MultiViews
This has Apache select article.php
for HTTP requests on /article
due to the matching basename. And this works well together with the aforementioned PATH_INFO feature. So you can just use URLs like http://example.com/article/virtual/title
. Which makes sense if you have a traditional web application with multiple PHP invocation points/scripts.
article.php
/article
http://example.com/article/virtual/title
Note that MultiViews has a different/broader purpose though. It incurs a very minor performance penalty, because Apache always looks for other files with matching basenames. It's actually meant for Content-Negotiation, so browsers receive the best alternative among available resources (such as article.en.php
, article.fr.php
, article.jp.mp4
).
article.en.php
article.fr.php
article.jp.mp4
.php
A more directed approach to avoid carrying around .php
suffixes in URLs is configuring the PHP handler for other file schemes. The simplest option is overriding the default MIME/handler type via .htaccess
:
.php
.htaccess
DefaultType application/x-httpd-php
This way you could just rename your article.php
script to just article
(without extension), but still have it processed as PHP script.
article.php
article
Now this can have some security and performance implications, because all extensionless files would be piped through PHP now. Therefore you can alternatively set this behaviour for individual files only:
<Files article>
SetHandler application/x-httpd-php
# or SetType
</Files>
This is somewhat dependent on your server setup and the used PHP SAPI. Common alternatives include ForceType application/x-httpd-php
or AddHandler php5-script
.
ForceType application/x-httpd-php
AddHandler php5-script
Again take note that such settings propagate from one .htaccess
to subfolders. You always should disable script execution (SetHandler None
and Options -Exec
or php_flag engine off
etc.) for static resources, and upload/ directories etc.
.htaccess
SetHandler None
Options -Exec
php_flag engine off
Among its many options, Apache provides mod_alias
features - which sometimes work just as well as mod_rewrite
s RewriteRules. Note that most of those must be set up in a <VirtualHost>
section however, not in per-directory .htaccess
config files.
mod_alias
mod_rewrite
<VirtualHost>
.htaccess
ScriptAliasMatch
is primarily for CGI scripts, but also ought to works for PHP. It allows regexps just like any RewriteRule
. In fact it's perhaps the most robust option to configurate a catch-all front controller.
ScriptAliasMatch
RewriteRule
And a plain Alias
helps with a few simple rewriting schemes as well.
Alias
Even a plain ErrorDocument
directive could be used to let a PHP script handle virtual paths. Note that this is a kludgy workaround however, prohibits anything but GET requests, and floods the error.log by definition.
ErrorDocument
See http://httpd.apache.org/docs/2.2/urlmapping.html for further tips.
Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).
Would you like to answer one of these unanswered questions instead?
Slug or Slugging is another common alias/term for pretty urls.
– Mike B
Dec 13 '13 at 12:30