The Quick Guide on Writing Secure PHP Code

By John, 28 December, 2020

A combination Masterlock hanging on a doorknob.

A young member of my extended family is just starting to get into web development. He recently sent me a link to the code and working example he was hosting on a Raspberry Pi in his home. There were a few serious security issues in how his code was written. I tried to find a quick guide to writing secure PHP code, but everything I came across was pretty elaborate. The best quick primer I have found so far is this article. This quick guide will cover the basics of sanitizing inputs and outputs. It won't cover the more esoteric concepts that a hobbyist developer just starting out would likely not need to worry about.

First Rule of Writing Secure Code: Don't!

I think the best thing you can do to avoid having security vulnerabilities in the code you write is to not write your own code. These days, you have plenty of open source options to choose from for all sorts of applications. If you need a guest book or a forum, you can easily find something. Big content management systems like Drupal or WordPress can provide these types of things as well. My general view is that all custom code is pretty much garbage, no matter who wrote it. The main reason for this is that custom code is something that needs to be maintained. For example, if some bit of code is found to have a security vulnerability, the person responsible for fixing it is going to be the person supporting that software project. If it is an open source module, the problem can generally be identified and fixed by a community of developers. An organization with a heavily customized project might not even have a support developer at all (even though they should), and if they aren't actively probing their own site for vulnerabilities (which would be the responsibility of a support developer to coordinate), those problems are likely to go unnoticed. Side note: I've started playing with Zed Attack Proxy, and it seems like a very formidable tool for discovering security vulnerabilities.

Modern web development is often more about gluing different systems together than writing your own code for them. As a developer I will set up a CMS and then enable a module to provide a forum or a guest book, or even just subscribe to a service where I can have a subdomain set up and then use single sign-on to let users log in. Any custom code that I write is usually working within these existing systems. This is why I don't consider myself much of an expert in writing "secure" code from the standpoint of managing user inputs and outputs. I never really need to, because the framework I am working within usually handles it for me. That's the way it should be. It's better to have everyone using the same well-tested tools instead of making the same mistakes over and over.

Sanitizing your Inputs and Outputs

When it comes to actually writing any code, the number one thing to keep in mind is sanitizing your inputs and outputs. You want to sanitize inputs to avoid any code injection that can lead to remote code execution (this is probably the most dangerous vulnerability one could have). You also want to avoid SQL injection, where someone can inject malicious SQL code into your queries. That kind of vulnerability has been famously depicted in this comic from XKCD. How you sanitize your inputs also ties heavily into how you are storing your data. How you output your data is important too, because un-sanitized content can lead to Cross-Site Scripting (XSS) vulnerabilities. Nothing I say here is anywhere near the definitive guide on how you should handle user input, so refer to some other sources as well, but hopefully it gives you a start.

Sanitizing and Storing Input

Where you store user-submitted data will have a big influence on how you handle it. Most web applications will use some sort of database. In SQL, this means using prepared statements. But if you are using a flat file system, you need to be more cautious. I don't recommend storing your data in a flat file system, but if you are going to, you should make sure that you sanitize inputs so that they don't include any code that can be injected (JavaScript or PHP). Ideally, you store it in such a way that the user data is preserved with its original contents but is never handled in a way that would allow any sort of execution of the stored content as code. But if you are using flat files because you are trying to avoid running a MySQL server (or similar), then why not just use something like a NoSQL data store?
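To make the prepared statement idea concrete, here is a minimal sketch using PDO. The guest book table, its columns, and the function names are made up for illustration, and it assumes the pdo_sqlite extension is enabled so it can run against an in-memory SQLite database instead of a real server.

```php
<?php
// Hypothetical guest book example: table, column, and function names are illustrative only.
// An in-memory SQLite database keeps the sketch runnable without a database server
// (assumes the pdo_sqlite extension is enabled).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE guestbook (id INTEGER PRIMARY KEY, name TEXT, message TEXT)');

function addEntry(PDO $pdo, string $name, string $message): void
{
    // Placeholders keep the user data separate from the SQL text.
    $stmt = $pdo->prepare('INSERT INTO guestbook (name, message) VALUES (:name, :message)');
    $stmt->execute([':name' => $name, ':message' => $message]);
}

function findEntriesByName(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT name, message FROM guestbook WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// A classic injection attempt is stored as plain text instead of being run as SQL.
$hostileName = "Robert'); DROP TABLE guestbook;--";
addEntry($pdo, $hostileName, 'Hello');
$rows = findEntriesByName($pdo, $hostileName);
```

The same pattern should carry over to MySQL or PostgreSQL by changing the connection string passed to PDO. The important part is that user input only ever reaches the query through execute(), never through string concatenation.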

I've heard some people claim you should base64 encode user data, which could be an option, but I would recommend against it for the reasons in the SO link.

Embedding User Output

If you are embedding user input on a page, make sure you use "htmlspecialchars();" to convert the contents into an HTML-encoded version. Be aware that when you use this, all HTML code is converted into its HTML entity version, which will display as literal HTML text on the page instead of being rendered, e.g.

<span>this is in a span</span>

Vs.

&lt;span&gt;this is in a span&lt;/span&gt;

Some security scanners will even call out the second example, even though it plainly is not an executable snippet. But the point I'm trying to make is that unless you want your users to be able to add raw unfiltered HTML text, you should be converting to HTML entities, which means that how you are storing the data should not include any HTML.
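As a rough sketch of what that conversion looks like in practice, the snippet below wraps htmlspecialchars() in a small helper. The helper name and the sample comment are made up for illustration. The flags are one reasonable choice, not the only one.

```php
<?php
// Hypothetical helper name; htmlspecialchars() itself is the built-in PHP function.
function escapeForHtml(string $userInput): string
{
    // ENT_QUOTES also converts single quotes, which matters inside HTML attributes.
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($userInput, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Illustrative user input that would run as a script if printed unescaped.
$comment = '<script>alert("hi")</script>';
$safe = escapeForHtml($comment);

// The browser shows the tags as text instead of executing them.
echo '<p>' . $safe . '</p>' . PHP_EOL;
```

Escaping at the point of output, not when storing, keeps the stored data in its original form, which fits the storage advice earlier in this guide.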

Including files

PHP includes are a very common way to organize the structure of a web application. What they should not be is a mechanism for fetching user-submitted data. Using this:

include 'some-file.html';

Is fine when you are including known parts of your application. But if the contents of the included files are not known to you, this can be incredibly dangerous. That is because files included in this way are interpreted by PHP, so any PHP code inside them gets executed. An attacker can easily inject something that will execute a GET parameter as shell commands, and then they have access to your entire system. It would be safer to do something like:

$output = file_get_contents('my/file.html');

And print the contents of $output.
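Here is a slightly fuller sketch of that idea. The helper name, the temporary directory, and the file name are all assumptions made up for the demo. It reads a stored file as plain data, strips any directory parts from the requested name, and escapes the contents when printing.

```php
<?php
// Hypothetical helper: reads a stored file as plain data and never executes it.
// $baseDir and the file names below are illustrative assumptions.
function readStoredFile(string $baseDir, string $requestedName): ?string
{
    // basename() drops directory parts such as "../" from the requested name.
    $path = $baseDir . DIRECTORY_SEPARATOR . basename($requestedName);
    if (!is_file($path)) {
        return null;
    }
    $contents = file_get_contents($path);
    return $contents === false ? null : $contents;
}

// Demo setup: a "user submitted" file that happens to contain PHP code.
$baseDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'guide_demo_' . uniqid();
mkdir($baseDir);
$stored = '<?php echo "pwned"; ?><b>hi</b>';
file_put_contents($baseDir . DIRECTORY_SEPARATOR . 'note.html', $stored);

// The traversal attempt is reduced to "note.html" inside $baseDir.
$output = readStoredFile($baseDir, '../note.html');

// The PHP tag is just text here; escaping on output keeps the markup inert too.
echo htmlspecialchars((string) $output, ENT_QUOTES, 'UTF-8') . PHP_EOL;

// Clean up the demo files.
unlink($baseDir . DIRECTORY_SEPARATOR . 'note.html');
rmdir($baseDir);
```

Had the same file been pulled in with include, the embedded PHP would have run. With file_get_contents() it is only ever a string.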

That's it for now. I just wanted to get something quick up here, but I'll come back and make updates as I think of better suggestions, or realize I said something wrong.

More Advanced Concepts

Code always lives within a system with many moving parts. A web page can include many libraries and tools, which themselves can become sources of vulnerabilities. Securing these areas requires knowledge of things like Content Security Policy (CSP) and Cross-Origin Resource Sharing (CORS). But these are concepts that are more applicable to sites that deal in PII or financial transactions.