caching approach for nginx-proxy #54

@gingerlime

The nginx-proxy image proxies thumbor, primarily for caching (and perhaps easier handling of CORS). It's built on top of https://github.com/jwilder/nginx-proxy, with the caching parts based on APSL's original implementation.

It serves us well, but recently I've been having second thoughts about whether this is the best approach. Reasons:

In my opinion, though, those aren't the core issues but rather symptoms. Why?

The caching relies on thumbor storing generated images in a result storage folder; nginx then serves them from there, if they exist, using try_files. This works well, but has some limitations. For example:

  • the content-type can't be determined just from looking at the cached file. We essentially "guess" it from the filename (e.g. if it has ".png" in the filename then it's image/png) or from the filter. But if we serve a file that has no extension and no filter (not very common, but it can happen), we just can't send the right content-type header
  • WEBP is another edge case, since it depends on browser headers as well as result storage
  • there's quite a bit of "code" (nginx configuration) to deal with this: for example, calculating the hash of the request via JS to find the cached file, plus all the content-type handling
  • for this to work, nginx-proxy and thumbor must share the "data" folder where result storage is kept
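
To make the content-type limitation concrete, here is a minimal sketch of path-based guessing using an nginx map. It is not the actual nginx-proxy configuration: the variable name $guessed_content_type and the patterns are assumptions for illustration only, and how the guessed value gets attached to the response is left out.

```nginx
# Illustrative only (http context): guess the content type from the request path.
# $guessed_content_type and the patterns are hypothetical, not taken from nginx.tmpl.
map $uri $guessed_content_type {
    # No extension and no format filter: nothing to go on.
    default              application/octet-stream;

    # Regexes are tried in order, so the explicit format filter
    # (e.g. .../filters:format(png)/...) is checked before the extension.
    "~*format\(png\)"    image/png;
    "~*format\(webp\)"   image/webp;

    ~*\.png              image/png;
    ~*\.gif              image/gif;
    ~*\.jpe?g            image/jpeg;
    ~*\.webp             image/webp;
}
```

Anything that falls through to the default line is exactly the case where the right header can't be served.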

It feels a bit like re-inventing the wheel here. After all, nginx has built-in proxy caching functionality, so why not rely on it?

I played around with it a bit, and I think it can be simpler, more elegant, and just as performant (perhaps even more so) if we simply use the built-in proxy_cache directives.

Advantages:

  • no need to enable result storage in thumbor; caching is independent of it
  • cache disk space, retention, etc. are also independent
  • no need to share a volume or "find" the result storage files; nginx takes care of it by itself
  • the content-type header is cached "for free". Nginx "remembers" it as far as I can tell (TODO: check what happens with WEBP, but I imagine it would work)
  • fewer modifications to the nginx.tmpl file compared to the base
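
As a rough sketch of what the built-in approach could look like, the fragment below uses only standard proxy_cache directives. The cache path, zone name, sizes, retention values, and the upstream address thumbor:8000 are all placeholders I made up for illustration, not values from the current image.

```nginx
# Sketch only (http context). Path, zone name, and sizes are placeholders.
proxy_cache_path /var/cache/nginx/thumbor levels=1:2 keys_zone=thumbor_cache:10m
                 max_size=1g inactive=30d use_temp_path=off;

server {
    listen 80;

    location / {
        # Assumed upstream address; in nginx-proxy this comes from the template.
        proxy_pass http://thumbor:8000;

        proxy_cache thumbor_cache;
        # Used when the upstream sends no Cache-Control / Expires headers.
        proxy_cache_valid 200 30d;
        proxy_cache_valid 404 1m;

        # Let only one request per key populate the cache at a time.
        proxy_cache_lock on;
        # Serve a stale copy if thumbor errors out or times out.
        proxy_cache_use_stale error timeout updating;

        # Expose HIT / MISS / EXPIRED etc. for debugging.
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```

Nginx stores the upstream response headers along with the body, so the Content-Type that thumbor sent is replayed on cache hits. For WEBP, nginx (1.7.7 and later) takes a Vary response header into account when caching, so this would presumably hinge on thumbor sending Vary: Accept for auto-converted images; that still needs checking.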

Disadvantages:

  • we'd need custom variables for the proxy settings (e.g. cache storage size, expiration, etc.), or let users "bring their own" proxy.conf file (but that won't be very useful)
  • it breaks backwards compatibility if we simply change how nginx-proxy works. So maybe introduce a new proxy image? But then there are two, so which one should people pick? The transition might be tricky, and I need to think about it
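
For the first point, one option might be to read the settings from environment variables inside nginx.tmpl, the same way the base jwilder/nginx-proxy template reads its own settings via $.Env. A minimal sketch, where PROXY_CACHE_SIZE and PROXY_CACHE_INACTIVE are hypothetical variable names and the defaults are arbitrary:

```nginx
# Fragment for nginx.tmpl (docker-gen template syntax), http context.
# PROXY_CACHE_SIZE and PROXY_CACHE_INACTIVE are hypothetical names;
# the fallbacks after each "or" apply when the variable is unset.
{{ $cache_size := or ($.Env.PROXY_CACHE_SIZE) "1g" }}
{{ $cache_inactive := or ($.Env.PROXY_CACHE_INACTIVE) "30d" }}

proxy_cache_path /var/cache/nginx/thumbor levels=1:2 keys_zone=thumbor_cache:10m
                 max_size={{ $cache_size }} inactive={{ $cache_inactive }};
```

This keeps the defaults in one place while still letting users tune the cache without supplying a whole proxy.conf.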
