PHP Security Mistakes : Article from Developer Shed

PHP Security Mistakes

The purpose of this document is to inform PHP programmers of common security mistakes that can be overlooked in PHP scripts. While many of the following concepts may appear to be common sense, they are unfortunately not always common practice. After applying the following practices to your coding, you will be able to eliminate the vast majority of security holes that plague many scripts. Many of these security holes have been found in widely-used open source and commercial PHP scripts in the past.

The most important concept to learn from this article is that you should never trust the user to input exactly what is expected. Most PHP scripts are compromised by someone entering unexpected data to exploit security holes inadvertently left in the script.

Always keep the following principles in mind when designing your scripts:

1. Never include, require, or otherwise open a file with a filename based on user input, without thoroughly checking it first.

Take the following example:

include($page);

Since there is no validation being done on $page, a malicious user could hypothetically call your script like this (assuming register_globals is set to ON):

script.php?page=/etc/passwd

This would cause your script to include the server’s /etc/passwd file. When a non-PHP file is include()’d or require()’d, it is displayed as HTML/text, not parsed as PHP code.

On many PHP installations, the include() and require() functions can include remote files. If the malicious user were to pass the URL of a script on a server he or she controls (call it evilscript.php) as the value of page, evilscript.php could output any PHP code that he or she wanted your script to execute. Imagine if that code deleted content from your database or sent sensitive information directly to the browser.

Solution: validate the input. One method of validation would be to create a list of acceptable pages. If the input did not match any of those pages, an error could be displayed.

$pages = array('index.html', 'page2.html', 'page3.html');

if( in_array($page, $pages, true) ) {
    include($page);
} else {
    die("Nice Try.");
}
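
The same whitelist check can be wrapped in a small function, so that every script resolves page names the same way. This is only a sketch: the function name resolve_page() and the fallback to index.html are assumptions made for illustration, not part of the original script.

```php
<?php
// Hypothetical helper: returns the requested page only if it is on the
// whitelist, otherwise falls back to a known-safe default.
function resolve_page($requested, array $allowed, $default = 'index.html')
{
    // Strict comparison so that only exact string matches are accepted.
    if (is_string($requested) && in_array($requested, $allowed, true)) {
        return $requested;
    }
    return $default;
}

$pages = array('index.html', 'page2.html', 'page3.html');

// Read the value from $_GET explicitly instead of relying on register_globals.
$requested = isset($_GET['page']) ? $_GET['page'] : 'index.html';
$page = resolve_page($requested, $pages);
// include($page); // $page is now guaranteed to be one of the listed files
```

Falling back to a default page rather than calling die() is a design choice; either way, nothing outside the list ever reaches include().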


2. Be careful with eval()

Placing user-inputted values into the eval() function can be extremely dangerous. You essentially give the malicious user the ability to execute any command he or she wishes! You may envision the input coming from a drop-down menu of options you specify, but your user may decide to send input like this:

script.php?input=;passthru("cat /etc/passwd");

By putting his own code in that statement, the user could cause your program to output your server’s complete /etc/passwd file.

Use eval() sparingly, and by all means, validate the input. It should only be used when absolutely necessary — when there is dynamically generated PHP code. If you are using it to substitute template variables into a string or substitute user-inputted values, then you are using it for the wrong reason. Try sprintf() or a template system instead.
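
As a rough illustration of substituting values into a string without eval(), the sketch below uses strtr() and sprintf(). The template text and the {name} and {count} placeholders are assumptions made up for this example; the point is only that the user's value is treated as data and never executed.

```php
<?php
// Hypothetical template: placeholders are plain text, never executed as PHP.
$template = 'Hello, {name}! You have {count} new messages.';

// Simulated user input that would be dangerous inside eval().
$name = ';passthru("cat /etc/passwd");';

// strtr() swaps each placeholder for its value in a single pass.
$output = strtr($template, array(
    '{name}'  => htmlspecialchars($name, ENT_QUOTES),
    '{count}' => (string) 3,
));

// sprintf() does the same job with positional format specifiers.
$greeting = sprintf('Hello, %s! You have %d new messages.', 'Joe', 3);

echo $output, "\n", $greeting, "\n";
```

The hostile input is printed as harmless text; nothing in it is ever run.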

3. Be careful when using register_globals = ON

This has been a major issue since this feature was invented. It was originally designed to make programming in PHP easier (and that it did), but misuse of it often led to security holes. As of PHP 4.2.0, register_globals is set to OFF by default, and the setting was removed entirely in PHP 5.4.0. It is recommended that you use the superglobals to deal with input ($_GET, $_POST, $_COOKIE, $_SESSION, etc.).

For example, let us say that you had a variable that specified what page to include:

include($page);

but you intended $page to be defined in a config file or somewhere else in the script, and not to come as user input. In one instance you forgot to pre-define $page. If register_globals is set to ON, the malicious user can take over and define $page for you, by calling your script like this:

script.php?page=/etc/passwd
I recommend you develop with register_globals set to OFF, and use the superglobals when accessing user input. In addition, you should always develop with full error reporting, which can be specified like this (at the top of your script):

error_reporting(E_ALL);
This way, you will receive a notice for every variable you try to call that was not previously defined. Yes, PHP does not require you to define variables so there may be notices that you can ignore, but this will help you to catch undefined variables that you did expect to come from input or other sources. In the previous example, when $page was referenced in the include() statement, PHP would issue a notice that $page was not defined.

Whether or not you want to use register_globals is up to you, but make sure you are aware of the advantages and disadvantages of it and how to remedy the possible security holes.
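
Put together, the advice in this section might look something like the sketch below. The helper name input_value() and the default of index.html are assumptions for illustration only.

```php
<?php
// Report everything, including notices about undefined variables.
error_reporting(E_ALL);

// Define $page yourself before use, so nothing outside the script can set it.
$page = 'index.html';

// Hypothetical helper: fetch a value from an input array with a default,
// instead of relying on register_globals to create the variable.
function input_value(array $source, $key, $default = null)
{
    return isset($source[$key]) ? $source[$key] : $default;
}

// User input is read explicitly from the superglobal.
$requested = input_value($_GET, 'page', $page);
```

Because every variable is either defined in the script or read explicitly from a superglobal, any notice about an undefined variable points to a genuine oversight.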

4. Never run unescaped queries

PHP has a feature, enabled by default, that automatically escapes (adds a backslash in front of) certain characters that come in from a GET, POST, or COOKIE. The single quote (‘) is one example of a character that is escaped automatically. This is done so that if you include input variables in your SQL queries, it will not treat single quotes as part of the query. Say your user entered $name from a form and you performed this query:

UPDATE users SET Name='$name' WHERE ID=1;

Normally, if they had entered $name with single quotes in them, they would be escaped, so MySQL would see this:

UPDATE users SET Name='Joe\'s' WHERE ID=1

so that the single quote entered into “Joe’s” would not interfere with the query syntax.

In some situations, you may use stripslashes() on an input variable. If you then put the variable into a query, make sure to use addslashes() or mysql_escape_string() to escape the single quotes before you run the query. Imagine if an unslashed query went in, and a malicious user had entered part of a query as their name!

UPDATE users SET Name='Joe',Admin='1' WHERE ID=1

On the input form, the user would have entered:

Joe',Admin='1

as their name, and since the single quotes were not escaped, he or she would be able to actually end the name definition, place in a comma, and set another column called Admin!

The final query, with the user’s input running from Joe to the 1, would look like this:

UPDATE users SET Name='Joe',Admin='1' WHERE ID=1

In some configurations, magic_quotes_gpc (the feature that automatically adds slashes to all input) is actually set to OFF, and it was removed altogether in PHP 5.4.0. You can use the function get_magic_quotes_gpc() to see if it’s on or not (it returns true or false). If it returns false, simply use addslashes() to add slashes to all of the input. This is easiest if you use $_POST, $_GET, and $_COOKIE (or $HTTP_POST_VARS, $HTTP_GET_VARS, and $HTTP_COOKIE_VARS) instead of globals, because you can step through those arrays with a foreach() loop and add slashes to each value.
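
The foreach() approach described above could be sketched as follows. The function name add_slashes_deep() is hypothetical, and the function_exists() guard is an assumption added because get_magic_quotes_gpc() no longer exists in PHP 8.

```php
<?php
// Hypothetical helper: add slashes to every value of an input array,
// descending into nested arrays (e.g. form fields named item[]).
function add_slashes_deep(array $input)
{
    foreach ($input as $key => $value) {
        $input[$key] = is_array($value)
            ? add_slashes_deep($value)
            : addslashes($value);
    }
    return $input;
}

// Only call get_magic_quotes_gpc() where it is still available.
$magic_quotes_on = function_exists('get_magic_quotes_gpc')
    && @get_magic_quotes_gpc();

if (!$magic_quotes_on) {
    $_GET    = add_slashes_deep($_GET);
    $_POST   = add_slashes_deep($_POST);
    $_COOKIE = add_slashes_deep($_COOKIE);
}
```

Escaping by hand is easy to get wrong. On current PHP versions, parameterised queries (prepared statements, available through PDO or mysqli) are generally the more robust way to keep input out of the query syntax.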

5. For protected areas, use sessions or validate the login every time.

There are some cases where programmers use some sort of login.php script only to first validate the username and password (entered through a form), test whether the visitor is an administrative or valid user, and then set a variable through a cookie, or even pass it as a hidden form field. Then in the code, they check whether the visitor has access like this:

if( $admin ) {
    // let them in
} else {
    // kick them out
}

The above code makes the fatal assumption that the $admin variable can only come from a cookie or input form that the malicious user has no control over. However, that is simply not the case. With register_globals enabled, injecting designed input into the $admin variable is as easy as calling the script like so:

script.php?admin=1
Furthermore, even if you use the superglobals $_COOKIE or $_POST, a malicious user can easily forge a cookie or create his own HTML form to post any information to your script.

There are two good solutions to this problem. One is on the same track as setting an $admin variable, but this time set $admin as a session variable. In this case, it is stored on the server and is much less likely to be forged. On subsequent calls to the same script, your user’s previous session information will be available on the server, and you will be able to verify if the user is an administrator like so:

if( $_SESSION['admin'] )
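
A slightly more defensive version of that check is sketched below. The helper name session_is_admin() and the convention of storing a boolean true under the admin key are assumptions for illustration; the session arrays are simulated so the example runs on its own.

```php
<?php
// Hypothetical helper: only a strict boolean true stored by your own login
// code counts as admin access; anything else is refused.
function session_is_admin(array $session)
{
    return isset($session['admin']) && $session['admin'] === true;
}

// In a real script, call session_start() first and pass $_SESSION:
//   session_start();
//   if (!session_is_admin($_SESSION)) { die('Access denied.'); }

// Simulated session contents, as a login script might have stored them.
$admin_session   = array('username' => 'joe', 'admin' => true);
$visitor_session = array('username' => 'guest');

var_dump(session_is_admin($admin_session));   // bool(true)
var_dump(session_is_admin($visitor_session)); // bool(false)
```

Using isset() avoids a notice when the key is missing, and the strict comparison means a stray string such as '1' is not mistaken for the flag your login code sets.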

The second solution is to only store their username and password in a cookie, and with every call to the script, validate the username and password and verify if the user is an administrator. You could have two functions — one called validate_login($username,$password) that verified the user’s login information, and one called is_admin($username) that queried the database to see if that username is an administrator. The code would be placed at the top of any protected script:

if( !validate_login( $_COOKIE['username'], $_COOKIE['password'] ) )
{
    echo "Sorry, invalid login";
    exit;
}

// the login is ok if we made it down here

if( !is_admin( $_COOKIE['username'] ) )
{
    echo "Sorry, you do not have access to this section";
    exit;
}

Personally I recommend using sessions, as the latter solution is not scalable.

6. If you don’t want the file contents to be seen, give the file a .php extension.

It was common practice for a while to name include files or library files with a .inc extension. Here’s the problem: if a malicious user simply enters the address of the .inc file into his browser, it will be displayed as plain text, not parsed as PHP. Even if the browser did not like the file type, an option to download it would most likely be given. Imagine if this file had your database login and password, or even more sensitive information.

This goes for any extension other than .php (and a few others), so even a .conf or a .cfg file would not be safe.

The solution is to put a .php extension on the end of it. Since your include files or config files usually just define variables and/or functions and do not really output anything, a user who loaded one of them directly in a browser would most likely be shown nothing at all, unless the file outputs something. Either way, the file would be parsed as PHP instead of just displaying your code.

There are also some reports of people adding Apache directives that will deny access to .inc files; however, I do not recommend this because of the lack of portability. If you rely on .inc files and that Apache directive to deny access to them and one day you move your scripts to another server and forget to place the Apache directive in, you are wide open.


Web Development Mistakes : Article from Roger Johansson

DOCTYPE confusion

Completely missing, incorrect, or in the wrong place. I have seen HTML 4.0 Transitional used in documents containing XHTML markup as well as in <frameset> documents, DOCTYPE declarations appearing after the opening <html> tag, and incomplete DOCTYPES.
Why? Two reasons. First, it’s required, as stated in the W3C HTML 4.01 spec as well as in the W3C XHTML 1.0 spec. Second, modern web browsers use the specified DOCTYPE to decide which rendering mode to use. This is also known as “DOCTYPE switching”. For more consistent results across browsers, especially when using CSS, you’ll want browsers to use their “Standards compliance mode”. More info on DOCTYPE switching can be found in Fix Your Site With the Right DOCTYPE! and Activating the Right Layout Mode Using the Doctype Declaration.

<span> mania

A common way of styling something with CSS is to wrap it in a <span> element with a class attribute and use that to hook up the styling. I’m sure we’ve all seen things like <span class="heading"> and <span class="bodytext">.
Why? It is, in most cases, completely unnecessary, has no semantic value, and just clutters the markup. Use heading elements for headings, put paragraphs in paragraph elements, mark up lists with HTML list elements. Use CSS to style those elements. If necessary, add class or id attributes.

(too much) Visual thinking

Treating the web as WYSIWYG – starting off by focusing on how things look instead of thinking about structure first, and presentation later.
Why? While most people using the web are sighted, not all are. And there is no way of making the web WYSIWYG. There will always be variations as long as people use different browsers, operating systems, monitor sizes, screen resolutions, window sizes, colour calibration, and font sizes. The web is not print or television. Make your design flexible.

Lack of semantics

Non-semantic markup. Basing the choice of which HTML element to use on the way most graphical browsers render it by default, instead of on which meaning the element has.
Why? This mistake is closely related to “<span> mania”, in that it does not make proper use of existing HTML elements to give content meaning. Without semantic HTML, it is much harder for non-visual user agents to make sense of the content. Semantic HTML also tends to be easy to style with CSS.

Character encoding mismatches

Specifying one character encoding in the HTTP header sent by the server, and using another in the document. This may confuse browsers and make them display the document improperly.
Why? Because you want to make sure all your visitors can read your content.
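
One way to keep the two declarations from drifting apart is to generate both from a single value. The sketch below assumes UTF-8 and uses two hypothetical helper functions; it is an illustration, not the only way to do it.

```php
<?php
// Single source of truth for the encoding (UTF-8 is an assumption here).
define('PAGE_CHARSET', 'UTF-8');

// Hypothetical helpers: both the HTTP header and the <meta> element are
// built from the same constant, so they cannot disagree.
function content_type_header($charset)
{
    return 'Content-Type: text/html; charset=' . $charset;
}

function content_type_meta($charset)
{
    return '<meta http-equiv="Content-Type" content="text/html; charset='
        . htmlspecialchars($charset, ENT_QUOTES) . '">';
}

// Send the header before any output, then emit the matching meta element.
if (!headers_sent()) {
    header(content_type_header(PAGE_CHARSET));
}
echo content_type_meta(PAGE_CHARSET), "\n";
```

The file itself still has to be saved in the declared encoding; the code only guarantees that the header and the markup say the same thing.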

Bad alt attributes

Missing or useless. <img> elements with missing alt attributes can be found by the billions on the web. Not quite as common are useless attribute values like “spacer GIF used to make the layout look good”, “big blue bullet with dropshadow”, and “JPEG image, 123 KB”. Remember, the alt attribute is required for <img> and <area> elements.
Why? It’s required, and without it, any information in the image will be invisible to screen readers, text-only browsers, search engine robots, or users with images turned off. Note that alternate text should be relevant. Do not specify alternate text for decorative images or images used for layout. In those cases, specify an empty string, alt="".

Invalid id and class attributes

Multiple uses of the same value for the id attribute. Invalid characters used in id and class attributes and CSS selectors.

For CSS (CSS 2.1 Syntax and basic data types):

In CSS 2.1, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters U+00A1 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit.

For HTML (Basic HTML data types):

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens (“-“), underscores (“_”), colons (“:”), and periods (“.”).

Why? Browsers that follow the specification will not display your document as intended. If a document has multiple occurrences of the same id value, any JavaScript using that value is likely to break or behave unpredictably.
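
If your markup is generated by a script, both problems can be checked before the page is sent. The sketch below applies the HTML rule quoted above as a regular expression and looks for repeated id values; both function names are hypothetical.

```php
<?php
// Pattern taken from the HTML rule quoted above: a letter first, then any
// number of letters, digits, hyphens, underscores, colons, and periods.
function is_valid_html_id($id)
{
    return preg_match('/^[A-Za-z][A-Za-z0-9_:.-]*$/D', $id) === 1;
}

// Hypothetical helper: returns the id values that occur more than once.
function duplicate_ids(array $ids)
{
    $counts = array_count_values($ids);
    return array_keys(array_filter($counts, function ($n) {
        return $n > 1;
    }));
}
```

For example, is_valid_html_id() accepts main-nav but rejects 1col, and duplicate_ids() reports nav when it is given nav, main, nav, and footer.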

Browser sniffing

Using scripts, server or client side, in an attempt to detect the visitor’s browser, and send or execute browser-specific code. Very commonly fails for reasons like new browsers, updated browsers, and user agent spoofing (Opera does this by default).
Why? It adds unnecessary complexity, and will break eventually.

Missing units in CSS

Length values (horizontal or vertical measurements) require units in CSS, except when the value is zero. It’s not like in HTML, where you can type width="10". In CSS, it has to be width:10px; (or whatever unit you’re using).
Why? It doesn’t work in browsers that follow the specification.

Browser-specific CSS

Scrollbar styling, expressions, filters etc. Proprietary CSS that only works in Internet Explorer. Invalid, too.
Why? Only works in a specific browser. If you really must use IE-specific CSS, move it to a separate file and use conditional comments, or some other means, to make sure only IE sees the invalid rules.

JavaScript dependency

Making a site depend on JavaScript. More people than you’d like are either using a browser with no JavaScript support, or have disabled JavaScript in their browser. Current stats (Browser Statistics at W3Schools) indicate that this is 8-10 percent of web users. Search engine robots currently don’t interpret JavaScript very well either, even though there are reports that Google is working on JavaScript support for Googlebot. If your site requires JavaScript to navigate, don’t expect great search engine rankings.
Why? Inaccessible and bad for search engine rankings.

Flash dependency

Assuming everybody has Flash installed. Not everybody has. And most search engine robots do not (Google has reportedly started experimenting with indexing of Flash files, but they still recommend that you make sure all your text content and navigation is available in HTML files), so if your whole site, or your site navigation, depends on Flash being available, you’re not going to score high with search engines.
Why? Inaccessible and bad for search engine rankings. I’m not saying you shouldn’t use Flash at all, just that you should use it sensibly.

Text as image

Making images of text, and not providing a more accessible alternative. Not only does it take longer for visitors to download images instead of text, you also make it impossible for all visitors to copy the text, and for most visitors to enlarge it.
Why? Inaccessible, increases load time, bad for search engine rankings.

Bad forms

Inaccessible, hard-to-use forms. Learn to use the <label>, <fieldset>, and <legend> elements, and do not use a “Reset” button.
Why? Inaccessible, decreased usability. Read Creating Accessible Forms, Better Accessible Forms, and Reset and Cancel Buttons to learn more about creating accessible and usable forms.

Old skool HTML

Multiple nested tables, spacer GIFs, <font> elements, presentational markup. So common I don’t really have to mention it here.
Why? Increased complexity, bloated pages, slow, inaccessible, bad for search engine rankings.

Being IE-centric

Coding for IE/Win first, then adjusting for others, if there is time.
Why? Takes more time, encourages bad coding practices. IE/Win is notorious for accepting sloppy, invalid HTML, which breaks in many other browsers. IE also accepts well-formed, valid HTML, which works in all browsers, so by using valid HTML you make all browsers happy, and it doesn’t take more time or cost more. Also see The IE Factor.

Invalid HTML attributes

Using deprecated or browser specific attributes like marginwidth, leftmargin, language, height for <table> elements, border for <img> elements etc.
Why? Invalid and unnecessary. Use CSS instead. For <script> elements, use type, not language, to specify the scripting language (almost always JavaScript).

Unencoded ampersands

Many URIs contain long query strings with unencoded ampersands (&). This is invalid, and may cause problems. Ampersands must be written as &amp;.
Why? An explanation as well as an example of what can go wrong can be found in Ampersands and validation.
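
When links are generated by a script, the encoding can be left to the language. The sketch below uses PHP's http_build_query() and htmlspecialchars(); the parameter names and the script.php URL are made up for the example.

```php
<?php
// Hypothetical query parameters for a link.
$params = array('page' => 'news', 'id' => 42);

// http_build_query() joins the pairs; passing '&amp;' as the separator
// produces a query string that is ready to be placed in an href attribute.
$href = 'script.php?' . http_build_query($params, '', '&amp;');

// For a URL you already have as a string, htmlspecialchars() encodes the
// ampersands (and quotes) before the URL goes into the markup.
$raw  = 'script.php?page=news&id=42';
$safe = htmlspecialchars($raw, ENT_QUOTES);

echo '<a href="' . $href . '">News</a>', "\n";
```

Both approaches yield the same attribute value; the browser decodes &amp; back to a plain ampersand when it follows the link.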


Frames

Using frames to split the browser viewport into several independent documents.
Why? First of all, let me say that frames may be useful, if used in the right way, in intranets and certain web applications. For a public website, however, frames have too many accessibility and usability problems. Bookmarking problems, printing difficulties, trouble with deep linking, and having to do search engine workarounds are a few of the drawbacks to using frames.

Inaccessible data tables

Tables containing tabular data, but marked up as if they were layout tables, not using any of the many elements and attributes that are available for making tables structured and accessible.
Why? Screen readers and other assistive technologies have no way to make sense of a data table unless it is marked up correctly. A whole bunch of links to articles describing how to mark up data tables can be found in A table, s’il vous plaît, at the Web Standards Project.

Divitis and classitis

Related to <span> mania. Adding unnecessary div elements and class attributes.
Why? See “<span> mania” and “lack of semantics”.

Too wide fixed width

If you use a fixed width design, don’t make it too wide. Note: I’m not getting into the whole debate on fixed vs fluid width here.
Why? If your specified width is wider than your visitors can fit on their monitor, you force them to scroll horizontally, which is really bad for usability.

Vague and/or presentational class and id names

Naming a class or id based on how it looks rather than on what it does.
Why? Doing this is asking for confusion when you redesign. A class named largeblue may end up making text small and red. An id named leftcol may be displayed to the right.

No background colour

Failure to declare a background colour for the body element.
Why? Many users do not have their browser set to display the same default background colour as you do.

Non well-formed XHTML

Using XHTML that is not well-formed.
Why? If XHTML is served as “application/xhtml+xml”, which it should be, strictly compliant browsers, like those based on Mozilla, will not render non well-formed XHTML. Note that this site currently does not serve all documents as “application/xhtml+xml”, for certain reasons explained in my post on Content negotiation.

Incomplete colours for text input fields

Specifying only background or text colour for form fields, especially single and multi-line text inputs (input type="text" and textarea).
Why? Some people set their browser or operating system to use inverted colours. The default for a text input would then be white text on a black background, instead of black on white.

If you set the text colour for text inputs to dark grey, and don’t specify a background colour, people with inverted colours would get dark grey text on a black background, which is next to impossible to read. The opposite will also cause problems – specifying a light grey background without specifying the text colour would lead to white text on a light grey background.

Always specify either both text and background colours, or neither, for text input fields.

That’s a pretty long list of things to watch out for. Avoid them all and you’re doing very well. If you’re currently making some of these mistakes, well, if it’s any consolation, I’ve been guilty of making a lot of them at some point. Hopefully this list will help you make fewer mistakes in the future.