Last week, I read a post by Beau Simensen about resource location in PHP. It was interesting, because I had already pondered over this topic in the context of Symfony2. Thinking about how to make Beau’s proposal work for Symfony2 brought me to a result that hit me like a stone. Both he and Amy Stephen made me realize how much this could change not just Symfony2, but the PHP ecosystem as a whole. After PSR-0 and Composer, this is the next important step to ensure interoperability between PHP libraries. And it could almost completely replace the new and hotly debated autoloading PSR. Let me take a few minutes of your time to explain why.

Resource Location

In almost every PHP application, you need to locate some kind of resource. A resource, in this context, could be a template file, a configuration file, an image or any other kind of tangible asset in your project. Zend Framework 2, for example, ships translation files for its validation messages. The configuration for using these files looks like this:

$translator = new Zend\I18n\Translator\Translator();
$translator->addTranslationFile(
    'phpArray',
    __DIR__.'/../vendor/zendframework/zendframework/resources/languages/en.php',
    'default',
    'en_US'
);
Zend\Validator\AbstractValidator::setDefaultTranslator($translator);

You provide an absolute path to the file and the translator will know where to find it. This works well and is generic. Unfortunately, it is also verbose and couples the configuration to the specific directory structure of the zendframework/zendframework package.

Resource Identifiers

Many libraries and frameworks reduce verbosity by inventing some kind of identifier for resources. Let’s take an example from the CakePHP documentation:

class UsersController extends AppController {
    public function view_active() {
        $this->layout = 'Contacts.contact';
    }
}

The string “Contacts.contact” is the identifier of a layout template that is going to be used as layout for the view_active action. This template must be located in one of the following paths:

  • /app/Plugin/Contacts/View/Layout/contact.ctp
  • /app/View/Plugin/Contacts/Layout/contact.ctp

So it can be shipped with the Contacts plugin, but also overridden in the application. Essentially, CakePHP creates a mapping from resource identifiers (such as “Contacts.contact”) to paths in the file system. What happens if you want to use a *.ctp file that is located somewhere else, for example in a package you just installed via Composer? That’s not possible.
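The mapping that CakePHP performs here can be pictured as a small function. This is only an illustrative sketch, not CakePHP’s actual implementation: the name layoutCandidates() and the lookup order (application override first) are assumptions, and only the two path patterns come from the list above.

```php
<?php

/**
 * Hypothetical helper, not part of CakePHP's API: expands a "Plugin.layout"
 * identifier into the two candidate paths listed above.
 *
 * @return string[] candidate paths, application override first (assumed order)
 */
function layoutCandidates(string $identifier, string $appDir = '/app'): array
{
    if (strpos($identifier, '.') === false) {
        throw new InvalidArgumentException("Expected 'Plugin.layout', got '$identifier'");
    }

    // "Contacts.contact" => plugin "Contacts", layout "contact"
    [$plugin, $layout] = explode('.', $identifier, 2);

    return [
        "$appDir/View/Plugin/$plugin/Layout/$layout.ctp",
        "$appDir/Plugin/$plugin/View/Layout/$layout.ctp",
    ];
}

print_r(layoutCandidates('Contacts.contact'));
```

The set of candidate paths is hard-wired into the function, which is exactly why a file outside of these directories cannot be addressed.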

Let’s look at another example from Symfony2. In Symfony2’s configuration system, you can import configuration files into other configuration files. Let’s look at a typical routing configuration:

_wdt:
    resource: "@WebProfilerBundle/Resources/config/routing/wdt.xml"
    prefix:   /_wdt

Again, a custom kind of resource identifier is used to refer to the wdt.xml file in WebProfilerBundle. What if you want to import a routing configuration file from some place that is not in a bundle? Not possible (except if you use the absolute path, but that’s not portable so I’ll dismiss this solution).

This list goes on, but the pattern repeats.

Autoloading

Let me take a short excursion to talk about autoloading. As you probably know, autoloading refers to the dynamic loading of class files when classes are used for the first time. PSR-0 did a great deal to unify autoloading among PHP projects, and a new PSR (“PSR-X”) is being discussed to improve PSR-0.

The core of PSR-0 is the mapping of class names to paths on the file system. For example:

require __DIR__.'/vendor/composer/ClassLoader.php';
 
$classLoader = new Composer\Autoload\ClassLoader();
$classLoader->add('Acme\\Demo', '/path/to/acme/demo');
$classLoader->register();
 
// PSR-0 appends the full namespace path to the registered directory, so this
// loads /path/to/acme/demo/Acme/Demo/Controller/ContactController.php
$controller = new Acme\Demo\Controller\ContactController();

Do you see the pattern? Mapping identifiers of classes to PHP files and namespaces to directories is essentially the same as all the other mappings described before.

Let’s extend this example: What if I want to load some file from a location that is known to the autoloader? For example, I know that config.ini is located in the same directory as the class Acme\Demo\Application. The only way to do this right now is by using reflection:

$reflClass = new \ReflectionClass('Acme\\Demo\\Application');
 
$file = dirname($reflClass->getFileName()).'/config.ini';

But reflection is slow.

Uniform Resource Location

What if we could solve all of the above problems and use cases with one simple pattern? Our basic requirements are:

  1. Map resource identifiers to one or more paths on the file system.
  2. Use the same identifier pattern across different PHP libraries to make them interoperable.

The good thing: A specification for Uniform Resource Identifiers (URI) already exists (RFC 3986). Why not reuse it?

Let’s see how we could leverage URIs to locate PHP classes and files or directories relative to PHP classes:

$locator = new ResourceLocator();
$locator->addPath('classpath', '/Acme/Demo/', '/path/to/acme/demo');
 
echo $locator->findResource('classpath:///Acme/Demo/Parser.php');
// => /path/to/acme/demo/Parser.php
 
echo $locator->findResource('classpath:///Acme/Demo/resources');
// => /path/to/acme/demo/resources
 
echo $locator->findResource('classpath:///Acme/Demo/resources/config.ini');
// => /path/to/acme/demo/resources/config.ini

This is basically the same thing that autoloaders are doing today, but a little more generic. The presented URIs have two parts:

  • a scheme: “classpath”
  • a path: “/Acme/Demo/Parser.php”

The “authority” (“host:port”) part is empty, which is why the double slash (“//”) is immediately followed by the initial slash of the path. Other URI parts like the query (“?query”) are not required either, but could be added for custom (non-interoperable) implementations.
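To make the idea more tangible, here is a minimal sketch of what such a ResourceLocator could look like, assuming PHP 7.1 or newer. It is not taken from any released library. The matching rules are my assumptions: the longest URI prefix wins, several directories may be registered for one prefix, and only paths that actually exist on disk are returned. A failed lookup throws an exception.

```php
<?php

/**
 * Minimal sketch of a resource locator (assumed behavior, see lead-in).
 */
final class ResourceLocator
{
    /** @var array<string, array<string, string[]>> scheme => URI prefix => base paths */
    private $paths = [];

    public function addPath(string $scheme, string $prefix, string $path): void
    {
        $prefix = rtrim($prefix, '/').'/';
        $this->paths[$scheme][$prefix][] = rtrim($path, '/');

        // Keep longer (more specific) prefixes first.
        uksort($this->paths[$scheme], function (string $a, string $b): int {
            return strlen($b) - strlen($a);
        });
    }

    public function findResource(string $uri): string
    {
        // Split "scheme://" from the path by hand; the authority is assumed empty.
        $parts = explode('://', $uri, 2);
        if (count($parts) !== 2 || $parts[1] === '' || $parts[1][0] !== '/') {
            throw new InvalidArgumentException("Unsupported resource URI: $uri");
        }
        [$scheme, $uriPath] = $parts;

        foreach ($this->paths[$scheme] ?? [] as $prefix => $basePaths) {
            // Appending "/" lets "/Acme/Demo" match the prefix "/Acme/Demo/".
            if (strpos($uriPath.'/', $prefix) !== 0) {
                continue;
            }
            $relative = (string) substr($uriPath, strlen($prefix));
            foreach ($basePaths as $basePath) {
                $candidate = rtrim($basePath.'/'.$relative, '/');
                if (file_exists($candidate)) {
                    return $candidate;
                }
            }
        }

        throw new RuntimeException("Resource not found: $uri");
    }
}

// Usage against a throwaway directory, so that the existence check can succeed.
$base = sys_get_temp_dir().'/locator-demo-'.uniqid();
mkdir($base.'/resources', 0777, true);
touch($base.'/Parser.php');

$locator = new ResourceLocator();
$locator->addPath('classpath', '/Acme/Demo/', $base);

echo $locator->findResource('classpath:///Acme/Demo/Parser.php'), "\n";
echo $locator->findResource('classpath:///Acme/Demo/resources'), "\n";
```

Checking for existence is what makes “one or more paths” useful: an application directory registered before a package directory can override individual files.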

An autoloader can be based on such a resource locator by turning backslashes into forward slashes:

spl_autoload_register(function ($class) use ($locator) {
    try {
        include $locator->findResource('classpath:///'.strtr($class, '\\', '/').'.php');
    } catch (\Exception $e) {
        // The locator does not know this class; let other autoloaders try.
    }
});

Many of PHP’s standard functions can be used to work with URIs:

echo dirname('classpath:///Acme/Demo/Parser.php');
// => classpath:///Acme/Demo
 
echo basename('classpath:///Acme/Demo/Parser.php');
// => Parser.php

parse_url() unfortunately cannot be used, because it explicitly does not support URIs, just URLs.
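Without parse_url(), the two URI parts have to be split by hand. A minimal sketch, assuming an empty authority as in the examples above; splitResourceUri() is a hypothetical name, not a PHP built-in:

```php
<?php

/**
 * Hypothetical helper: splits a resource URI with an empty authority
 * ("scheme:///path") into its scheme and its path.
 *
 * @return string[] [scheme, path]
 */
function splitResourceUri(string $uri): array
{
    // Scheme grammar from RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    if (!preg_match('~^([a-z][a-z0-9+.\-]*)://(/.*)$~i', $uri, $matches)) {
        throw new InvalidArgumentException("Not a resource URI: $uri");
    }

    return [$matches[1], $matches[2]];
}

[$scheme, $path] = splitResourceUri('classpath:///Acme/Demo/Parser.php');

echo $scheme, "\n"; // => classpath
echo $path, "\n";   // => /Acme/Demo/Parser.php
```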

As you see, URIs are a very strong concept that decouples resource location completely from the mechanisms of individual frameworks.

More URI Schemes

To see the use of URIs in practice, let’s replace Symfony’s resource identifiers:

_wdt:
    resource: "classpath:///Symfony/Bundle/WebProfilerBundle/Resources/config/routing/wdt.xml"
    prefix:   /_wdt

That’s already a lot more generic. But it is also a lot more verbose than the previous code. To fix this problem, we can add a context-specific scheme, for example “config”:

$locator->addPath(
    'config',
    '/symfony/web-profiler-bundle/',
    '/path/to/web-profiler-bundle/Resources/config'
);

Like before, the first argument of addPath() is the URI scheme, the second the URI path prefix and the last one the actual file path. As URI path prefix I used the Composer package name.

The new scheme allows us to simplify the configuration quite a lot:

_wdt:
    resource: "config:///symfony/web-profiler-bundle/routing/wdt.xml"
    prefix:   /_wdt

If the configuration parser assumes that resource URIs are in the “config” scheme by default (unless an explicit scheme is given), we can even reduce it to:

_wdt:
    # The scheme config:// is assumed if no other is given
    resource: /symfony/web-profiler-bundle/routing/wdt.xml
    prefix:   /_wdt

Now we have completely replaced the custom identifier with a generic, framework-independent one. And this genericity opens up many possibilities.
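The “assume a default scheme” rule is simple to implement on the consumer side. A minimal sketch; withDefaultScheme() is a hypothetical helper name, and a configuration parser would call it before handing the URI to the locator:

```php
<?php

/**
 * Hypothetical helper: prefixes a resource with a default scheme unless
 * the resource already names one.
 */
function withDefaultScheme(string $resource, string $defaultScheme): string
{
    // A leading "scheme://" means the URI is already fully qualified.
    if (preg_match('~^[a-z][a-z0-9+.\-]*://~i', $resource)) {
        return $resource;
    }

    return $defaultScheme.'://'.$resource;
}

echo withDefaultScheme('/symfony/web-profiler-bundle/routing/wdt.xml', 'config'), "\n";
// => config:///symfony/web-profiler-bundle/routing/wdt.xml

echo withDefaultScheme('classpath:///Acme/Demo/resources/config.ini', 'config'), "\n";
// => classpath:///Acme/Demo/resources/config.ini
```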

Composer Integration

Composer already generates an autoloader for you. Imagine that it also generated a resource locator. Each package could specify in its composer.json where the directories for (a) namespace prefixes and (b) custom schemes are located:

{
    "name": "acme/demo",
    "autoload": "psr-x",
    "resources": {
        "classpath": {
            "Acme\\Demo\\": "src/"
        },
        "config": "resources/config/",
        "view": "resources/templates/",
        "lang": "resources/translations/"
    }
}
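How a generator might expand such a “resources” section can be sketched in a few lines. Everything here is hypothetical: the section is only proposed in this post, resourceMappings() is an invented name, and I assume that namespace prefixes become URI path prefixes under “classpath” while all other schemes use the package name as URI path prefix.

```php
<?php

/**
 * Hypothetical sketch: turns the proposed "resources" section of one package
 * into [scheme, URI prefix, directory] triples for a locator's addPath().
 *
 * @param array  $package    decoded composer.json
 * @param string $installDir directory the package is installed in
 *
 * @return string[][]
 */
function resourceMappings(array $package, string $installDir): array
{
    $mappings = [];

    foreach ($package['resources'] ?? [] as $scheme => $target) {
        if (is_array($target)) {
            // Namespace prefixes, e.g. "Acme\Demo\" => "src/"
            foreach ($target as $namespace => $dir) {
                $prefix = '/'.trim(strtr($namespace, '\\', '/'), '/').'/';
                $mappings[] = [$scheme, $prefix, $installDir.'/'.rtrim($dir, '/')];
            }
        } else {
            // Custom schemes: the package name is the URI path prefix.
            $prefix = '/'.$package['name'].'/';
            $mappings[] = [$scheme, $prefix, $installDir.'/'.rtrim($target, '/')];
        }
    }

    return $mappings;
}

$json = <<<'JSON'
{
    "name": "acme/demo",
    "resources": {
        "classpath": {"Acme\\Demo\\": "src/"},
        "config": "resources/config/"
    }
}
JSON;

foreach (resourceMappings(json_decode($json, true), '/vendor/acme/demo') as $mapping) {
    echo implode(' ', $mapping), "\n";
}
// => classpath /Acme/Demo/ /vendor/acme/demo/src
// => config /acme/demo/ /vendor/acme/demo/resources/config
```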

With the generated locator you can easily locate resources in any package and use them anywhere that resource URIs are supported, independent of the internal directory structure of that package! Sit back and think about the consequences for a second. For example:

demo:
    # The scheme config:// is assumed if no other is given
    resource: /acme/demo/routing.xml
    prefix:   /demo

Or for loading translations from the zendframework/zendframework package:

$translator->addTranslationFile(
    'phpArray',
    $locator->findResource('lang:///zendframework/zendframework/en.php'),
    'default',
    'en_US'
);

Or for using custom templates from a webmozart/theme package in CakePHP:

class UsersController extends AppController {
    public function view_active() {
        // The scheme view:// is assumed if no other is given
        // .ctp is automatically appended
        $this->layout = '/webmozart/theme/layout';
    }
}

And so on. This will have a huge impact on developing with PHP.

Stream Wrappers

Although I see this more as a toy, a consequence of using URIs for identifying resources is that you could wrap a stream wrapper around the resource locator. For example:

ResourceLocatorBasedStreamWrapper::setResourceLocator($locator);
 
stream_wrapper_register('classpath', 'ResourceLocatorBasedStreamWrapper');
stream_wrapper_register('view', 'ResourceLocatorBasedStreamWrapper');
 
echo file_get_contents('classpath:///Acme/Demo/Parser.php');
echo file_get_contents('view:///webmozart/theme/layout.ctp');
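The wrapper class itself could be quite small. The following is a read-only sketch under a few assumptions: PHP 7.2 or newer, a locator that exposes findResource() and throws an exception on failure, and only the stream methods that file_get_contents() needs. The stub locator in the usage part exists purely to keep the example self-contained.

```php
<?php

final class ResourceLocatorBasedStreamWrapper
{
    /** @var resource|null Set by PHP when a stream context is passed. */
    public $context;

    /** @var object|null Any object with a findResource(string $uri): string method. */
    private static $locator;

    /** @var resource|null Handle of the real file behind the URI. */
    private $handle;

    public static function setResourceLocator(object $locator): void
    {
        self::$locator = $locator;
    }

    public function stream_open(string $uri, string $mode, int $options, &$openedPath): bool
    {
        // Read-only sketch: refuse every mode that could write.
        if (self::$locator === null || strpbrk($mode, 'waxc+') !== false) {
            return false;
        }

        try {
            $file = self::$locator->findResource($uri);
        } catch (\Exception $e) {
            return false;
        }

        $handle = @fopen($file, $mode);
        if ($handle === false) {
            return false;
        }
        $this->handle = $handle;

        return true;
    }

    public function stream_read(int $count)
    {
        return fread($this->handle, $count);
    }

    public function stream_eof(): bool
    {
        return feof($this->handle);
    }

    public function stream_stat()
    {
        return fstat($this->handle);
    }

    public function stream_close(): void
    {
        fclose($this->handle);
    }
}

// Usage with a stub locator that resolves a single URI to a temporary file.
$file = tempnam(sys_get_temp_dir(), 'view');
file_put_contents($file, '<h1>Layout</h1>');

$locator = new class($file) {
    private $file;

    public function __construct(string $file)
    {
        $this->file = $file;
    }

    public function findResource(string $uri): string
    {
        if ($uri !== 'view:///webmozart/theme/layout.ctp') {
            throw new RuntimeException("Resource not found: $uri");
        }

        return $this->file;
    }
};

ResourceLocatorBasedStreamWrapper::setResourceLocator($locator);
stream_wrapper_register('view', 'ResourceLocatorBasedStreamWrapper');

echo file_get_contents('view:///webmozart/theme/layout.ctp'), "\n";
```

Functions such as file_exists() or include would need further methods (for example url_stat()), which I left out here.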

Roadmap

Yesterday, I created a PSR proposal for uniform resource location. I would be happy about feedback in the corresponding Google Groups topic, especially from the people currently discussing PSR-X. By specifying the “classpath” scheme in detail, this PSR would almost completely replace PSR-X.

The next steps I see are:

  1. Finish the framework survey (help appreciated!)
  2. Specify the “classpath” scheme
  3. Implement a sample locator
  4. Vote on and release the PSR
  5. Rebase PSR-X on this PSR
  6. Integrate both into Composer
  7. Integrate resource URIs into member projects

I’m excited. Are you? What possibilities do you foresee?

Posted Wednesday, June 19th, 2013 at 11:56
Filed under category: Thinking Ahead

11 Responses to “The Power of Uniform Resource Location in PHP”

Yes, very interesting, hope the FIG will accept it!

It’s not such a toy, and there is a little precedent.

Drupal has been using stream wrappers for a few years now for Drupal-managed files. We have public:// for “public” file uploads, private:// for private file uploads, etc. You can register your own streams for things like S3, YouTube, etc.

We’ve also been working on, but not yet merged, stream wrappers for Modules and Themes: https://drupal.org/node/1308152

From benchmarking we’ve found that it really sucks to do that for PHP, because it breaks APC. However, it would work fine for resource files. In our case we’re talking about something like:

$css_file = 'module://modulename/something.css';

But I could easily see that being ‘stylesheet://modulename/something.css’

There’s definitely potential here.

Bernhard

Thank you for the interesting insight Larry!

Performance is the reason why I didn’t want to put too much focus on stream wrappers (yet). If the resource locator is used by a PSR autoloader, it must be *very* efficient. What the resource locator does now is essentially the same as the current PSR autoloader implementations: Turn some identifier (FQCN, URI) into an absolute file path, which can then be used as always.

What I foresee is probably a combination: Direct access to the resource locator where performance is critical, and registered stream wrappers for everywhere else (if performance is good enough).

I think here instead of this:

echo dirname('classpath:///Acme/Demo/Parser.php');
// => classpath:///Acme/Demo

echo dirname('classpath:///Acme/Demo/Parser.php');
// => Parser.php

you mean this:

echo dirname('classpath:///Acme/Demo');
// => classpath:///Acme/Demo

echo dirname('classpath:///Acme/Demo/Parser.php');
// => Parser.php

Also, you talked about stream wrapping the resource locator, but I did not get the point of doing this in the sense that it is a `toy` :) Could you please explain this a bit further?

Also, I know it is perhaps too much to ask, but do you envision Symfony 2.4 or 2.5 supporting this? Or will that come even later?

I really like the idea, except for the default schemes. That makes things too magic and introduces yet another convention.
Conventions are IMO one of the things to avoid in your project. They make the learning curve pretty steep and keep you from doing something else when you want to.

Bernhard

@cordoval: I already fixed the code sample. The second “dirname()” should have been a “basename()”. dirname() also strips the last path component if it is a directory.

Maybe “toy” was a bad word to use here :) My point was that I don’t know about the performance implications of stream wrappers yet, so I don’t want to promote them right now. If this PSR gets accepted, and if we find that stream wrappers are efficient, we can definitely use them for more than a toy. See also my reply to Larry above.

We also use such URLs to combine resources within our CMS. We have something like a resource generation/compilation phase where all these URIs get resolved and replaced with plain PHP.

This makes it easy for the dev to use (express with URIs) but keeps it performant for our frontend.

William

Very interesting. Having implemented and worked with similar approaches, I found that, when combined with stream wrappers, this technique provides a very flexible approach to resource access, as you can pass the URI to almost any function that expects a file path.

I would like to suggest one change. By allowing multiple and seemingly random schemes you are increasing the chance of a scheme conflicting with existing code, as well as forcing people to think about what that URI might do. Is view:// in this application defined as above or has it been implemented differently? Can I pass it to fopen etc.? Therefore, would it not be better to have a single standard scheme and then use the authority part to specify the resource type? E.g.
psr://config/foo/bar.ini
psr://view/foo/bar.twig
If you need to differentiate between URIs that can be passed to fopen and those that can’t, then two schemes might be better.
psr://… for URIs that are not stream wrappers
psrf://… for URIs that are stream wrappers

Bernhard

Thank you William for this interesting idea! We will take your proposal into consideration once we flesh out the details of the PSR.

Kirill Khatsko

I have the same problem right now: how to locate resources.

But I think you miss the main point. _Every_ file has to know where your `locator` is, and this breaks all the benefits. The problem is not which scheme, style or convention you pass to $locator. The problem is that you have to load $locator in the first place.

IMHO it should be some Unix daemon. But then you need a lot of PHP code to contact it. It would be good if someone had a better idea.
