12.10. Troubleshooting server problems

12.10.1. Problems under Docker

See troubleshooting under Docker.

12.10.2. Installation problems

12.10.2.1. matplotlib complains about being unable to build freetype, png

Note

This bug relates to old versions of CamCOPS and should no longer be seen.

Observed with matplotlib==2.2.0 under Windows 10.

See

Unsuccessful method:

  • Do this and retry CamCOPS installation:

    pip install freetype-py
    pip install pypng
    

Successful method:

  • Upgrade matplotlib from 2.2.0 to 3.0.2 (as of CamCOPS v2.3.1).

12.10.3. Problems starting CamCOPS

12.10.3.1. “Posix spec does not match last transition”

This error has been reported (on 2020-02-14) when running CamCOPS 2.3.6 (version number likely not very important) under Ubuntu 18.04 LTS hosted via the Windows Subsystem for Linux (WSL) version 1 running under Windows 10, in Singapore (but using a GMT timezone).

The problem appears to be that the WSL does not properly implement the POSIX standard for time “transitions” (we think this refers to e.g. daylight savings time adjustments in a given timezone; see tzfile). The problem arises via the Pendulum Python module.

This may reflect known WSL bugs:

Our suggestion would be to run Ubuntu in a proper virtual machine (e.g. VirtualBox) or on physical hardware.

12.10.4. Memory problems

12.10.4.1. Celery worker process size grows until the process dies

See Celery-related memory leak.

12.10.5. Web server errors from Apache

12.10.5.1. Apache itself is working, but not talking to CamCOPS

Suppose:

  • your Apache server is running and meant to be serving the site’s root page;

  • CamCOPS is meant to be accessible at http://mysite/camcops.

We’ll use curl rather than wget for testing, since curl can talk to UNIX domain sockets. Check the following:

  • Watch your Apache access and error logs (e.g. with tail -f /var/log/apache2/access.log and tail -f /var/log/apache2/error.log).

  • Check you can fetch the web server’s root page from Apache:

    curl --verbose https://mysite/
    

    You should see your site’s root page HTML. If not, the problem is with the Apache configuration itself.

  • As root, check you can talk directly to CamCOPS.

    If you are using a UNIX domain socket, e.g. /run/camcops/camcops.socket:

    # Must use http://, not https:// (Apache, not CamCOPS, handles the SSL)
    sudo curl --verbose --unix-socket /run/camcops/camcops.socket http://mysite/camcops
    

    If you are using TCP/IP via a local port:

    # Must use http://, not https:// (Apache, not CamCOPS, handles the SSL)
    sudo curl --verbose http://mysite:MY_INTERNAL_PORT/
    

    You should see HTML from CamCOPS. If not, the problem is with the CamCOPS configuration (supervisor config file, CamCOPS config file, permissions, etc.).

  • Ask Apache for CamCOPS:

    curl --verbose https://mysite/camcops
    

    If you get an Apache HTTP 404 “Not Found” error, then…

    • Check static images are being served – you should have configured these to be served by Apache itself.

      curl --verbose https://mysite/camcops/static/logo_local.png
      

      If that doesn’t work, your Apache config or permissions are wrong.

    • Check that you haven’t put a trailing slash at the end of your ProxyPass* commands, e.g. using /camcops/ when you meant /camcops. If that is the problem, then you will find:

      curl --verbose https://mysite/camcops  # fails
      curl --verbose https://mysite/camcops/  # works
      

      An alternative solution in this case is to ensure you have run sudo a2enmod rewrite to enable mod_rewrite and add this to your Apache config:

      RewriteEngine on
      RewriteRule ^/camcops$ camcops/ [L,R=301]
      

      For details, see RewriteRule, but in brief, the flag “L” says that this is the last rewrite required, and “R=301” means “redirect with HTTP 301 Moved Permanently”.

12.10.5.2. Web server returns “permission denied”-type messages; Apache error log is full of file permission errors

Ownerships, permissions, or SELinux security settings on files in the DocumentRoot tree are probably wrong. See Server configuration; Linux flavours.

12.10.5.3. Operation (e.g. table upload) fails; Apache error log says “client denied by server configuration”

The Apache configuration file might be missing a section saying

<Directory /usr/share/camcops/server>
    # ...
    <Files "database.py"> # CGI script for tablets to upload data
        Allow from all
    </Files>
    # ...
</Directory>

Note

This bug relates to old versions of CamCOPS and should no longer be seen.

12.10.5.4. SSL not working; Apache log contains “Invalid method in request \x16\x03\x01

Misconfigured server; it is speaking unencrypted HTTP on port 443. Do you have the VirtualHost section configured properly? Do you have LoadModule ssl_module modules/mod_ssl.so?

12.10.5.5. Slow upload fails; Apache error log contains “Internal server error” or “Proxy error”

This may be a timeout problem. Gunicorn’s default timeout, for example, is 30 seconds, which may be too little for lots of data and/or a slow network.

Look for [CRITICAL] WORKER TIMEOUT (pid:385) or similar in the CamCOPS logs.

Check the following:

  • The Gunicorn timeout. Try 300 (seconds) rather than 30.

    • For modern CamCOPS servers, change the GUNICORN_TIMEOUT_S setting in the config file.

    • In old CamCOPS servers, this was the --timeout parameter on the command line; see camcops_server serve_gunicorn --help.

  • The Apache timeout. Try 300 (seconds).

    • This can be specified in the ProxyPass command, or if not, defaults to the value of ProxyTimeout, or if that’s not specified, to TimeOut. See the Apache docs.

  • When you make changes, be absolutely sure everything has restarted, including

    • Apache

    • supervisord itself (e.g. service supervisor restart), to pick up any changes in the supervisor config file

    • CamCOPS – e.g. do you have a problem such that supervisorctl stop camcops says it’s stopping CamCOPS but leaves orphan processes (check via ps aux | grep camcops)?

12.10.5.6. Data download fails; Apache reports proxy error

For example:

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /camcops/basic_dump.

Reason: Error reading from remote server

This is probably Apache saying “CamCOPS was too slow to respond, so I timed out”. Consider adding e.g. timeout=300 to the ProxyPass command in the Apache config. The timeout value is in seconds. See e.g. https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypass.

12.10.5.7. Other Apache errors

See front-end web server configuration, which has specimen Apache config sections.

12.10.6. Web server URL errors

12.10.6.1. Can’t find //login

If your CamCOPS server routes you to the login page and then says it can’t find a URL like //login, it’s likely that your Apache file is failing to swallow a trailing slash. If it looks like this:

ProxyPass /camcops ...
ProxyPassReverse /camcops ...

then change it to

ProxyPass /camcops/ ...
ProxyPassReverse /camcops/ ...

12.10.7. Web server errors in general

12.10.7.1. Web server returns content but says <class ‘ConfigParser.NoSectionError’>: No section: ‘server’

Unless the CamCOPS config files are broken, this probably means that the /etc/camcops/* configuration files have the wrong ownerships/permissions/SELinux security settings. See the chown and chcon commands 9. It’s also possible that the configuration files have been damaged, or that the Apache configuration file is pointing to a non-existing configuration file.

12.10.7.2. I can log in, but then seem to be logged out again immediately

Is your browser correctly storing session cookies? Especially, are you trying to run CamCOPS over a non-encrypted (HTTP) link? The session cookies are set to secure and httponly for security reasons, and will not work without HTTPS.

Is an intermediary between the web client and server (such as Azure Application Gateway) correctly setting the remote client IP address (REMOTE_ADDR header)?

Check the entries in the database:

mysql> select * FROM _security_webviewer_sessions WHERE user_id IS NOT NULL;

+-------+--------------------------+---------------+---------+---------------------+----------------+----------------+
| id    | token                    | ip_address    | user_id | last_activity_utc   | number_to_view | task_filter_id |
+-------+--------------------------+---------------+---------+---------------------+----------------+----------------+
| 42622 | x7UAFgDHgVMRqpSTBA78bA== | 192.0.2.1     |       2 | 2021-02-16 17:51:16 |           NULL |           NULL |
| 42559 | B_uRclOdM12xhAFOdHUaNA== | 203.0.113.7   |       6 | 2021-02-16 17:42:54 |           NULL |           NULL |
+-------+--------------------------+---------------+---------+---------------------+----------------+----------------+

The IP addresses in the table above are for a correct setup. If the addresses in the ip_address column include a port, which changes with each request, CamCOPS will consider these to be new sessions and the user will be logged out.

Fix for Azure Application Gateway: - https://docs.microsoft.com/en-us/azure/application-gateway/rewrite-http-headers

12.10.8. Tablet upload errors

12.10.8.1. MySQL server has gone away

Tablet uploads fails with error including “(2006, ‘MySQL server has gone away’)”. Apache log contains “OperationalError: (2006, ‘MySQL server has gone away’)”.

CamCOPS takes care to ping the database connection, so it’s unlikely that a connection has timed out. The probable cause is that the relevant max_allowed_packet parameter is set too small; MySQL also generates this error if the query is too big. You will need to edit the MySQL my.cnf configuration file; see Setting up MySQL under Linux. The most probable time to see this error is when uploading the BLOB table (blobs).

12.10.8.2. Read timed out

Tablet BLOB upload fails with error “Read timed out”.

Likely problem: large BLOB (big photo), slow network. For example, in one of our tests a BLOB took more than 17 s to upload, so the tablet needs to wait at least that long after starting to send it. Increase the tablet’s network timeout (e.g. try increasing from 5000 to 60000 ms) in Settings ‣ Server settings.

12.10.8.3. Row size too large (>8126)

Note

This bug relates to old versions of CamCOPS and should no longer be seen.

In full, the error was:

Uploading error: Operational error (1118). Row size too large (> 8126).
Changing some columns to TEXT or BLOB may help. In current row format, BLOB
prefix of 0 bytes is stored inline.

This is a problem with some TEXT-heavy tables, e.g. psychiatricclerking.

Best thing: change the table format.

  1. Edit the MySQL config file, e.g. /etc/my.cnf. In the [mysqld] section, add the lines:

    innodb_file_per_table
    innodb_file_format = Barracuda
    
  2. Restart MySQL.

  3. At the MySQL command line:

    ALTER TABLE psychiatricclerking
        ENGINE=InnoDB
        ROW_FORMAT=COMPRESSED
        KEY_BLOCK_SIZE=8;
    

Another method to consider for MySQL versions before 5.7.5: making the MySQL log file bigger, e.g. 512 Mb.

  1. At the MySQL console: SET GLOBAL innodb_fast_shutdown=0;

  2. Stop MySQL.

  3. Edit the MySQL config file, e.g. /etc/my.cnf, and in the [mysqld] section, add the line:

    innodb_log_file_size = 512M
    
  4. Delete the old log files, e.g. /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1.

  5. Restart MySQL

From CamCOPS v1.32, the server autoconverts tables to Barracuda when using the make-tables command, to avoid this problem.

12.10.9. MySQL: “Too many connections”

This error occurs if programs collectively attempt to open more connections to MySQL than the configured limit 1. The easiest way to make it happen in CamCOPS is to launch a web server with a very high maximum number of threads, in excess of the MySQL limit, and then work the web server hard.

Fix the problem by limiting the maximum number of threads/processes used by CamCOPS or by increasing the MySQL connection limit.

12.10.10. MySQL: “Illegal mix of collations…”

In full: MySQL reports: “Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation ‘=’”

In summary, part of your database is out of date, and this is probably because you’ve upgraded from an old version of CamCOPS.

12.10.10.1. Background: character sets and collations

MySQL character sets and collations can be confusing.

A character set is the way in which the computer represents characters. We are a long way beyond the basic ASCII 7-bit character set (which could represent 128 characters) and we should be supporting all Unicode characters (of which there are >136,000, including accents like ä and symbols like ≈). There are several ways to represent Unicode, but the most popular is UTF-8, which uses a variable number of bytes (from 1 to 4) per character. This means that simple text using only plain 7-bit ASCII characters still takes one byte per character, and more bytes are added for non-ASCII characters. CamCOPS uses UTF-8.

A collation is a MySQL term for the way that characters are compared. For example, if you are sorting in Swedish and you don’t care about case sensitivity, you want the ordering A–Z then ÅÄÖ, and you want a to be considered equal to A, å to be considered equal to Å, and so on.

12.10.10.2. Background: MySQL’s support for character sets and collations

Now, to add to the difficulty, MySQL supports multiple different levels at which you can define the character set and collation.

Character sets are supported for these things: literal, column, table, database, server — plus client, connection (though I think that’s the same as ‘literal’), filesystem, results, system 2. (Some don’t change: system, for storing identifiers, always UTF8 3. Some we don’t need to care about: filesystem, for referring to filenames. The client one is for statements arriving from the client. The connection one is used for literals, and I have no idea why you might want this to be different from the client setting. The results one is the one that the server uses to return result sets or error messages. You can inspect some of these settings with SHOW VARIABLES LIKE 'char%', and table-specific settings using SHOW CREATE TABLE tablename.)

Collations are supported for the following (from greatest to least precedence): query, column, table, database, connection, server 4. (The connection collation is applicable for the comparison of literal strings.)

Note

You can inspect the database/connection/server collations using

SHOW VARIABLES LIKE 'collation%'

and table-specific settings using

SHOW CREATE TABLE tablename

Inspect all table collations for a given database with

SELECT table_catalog, table_schema, table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE collation_name IS NOT NULL
AND table_schema = 'camcops';  -- or whatever your database is named

12.10.10.3. How CamCOPS configures MySQL character sets and collations

CamCOPS (via SQLAlchemy) creates all MySQL tables using the ‘utf8mb4’ character set. This is MySQL’s “proper” 4-byte UTF8 character set. (The MySQL ‘utf8’ character set uses up to 3 bytes per character and can’t store all characters 5.)

CamCOPS uses the ‘utf8mb4_unicode_ci’ collation, which uses the ‘utf8mb4’ character set and implements the Unicode standard for sorting 6. The ‘_ci’ suffix means “case insensitive”. CamCOPS sets the table collation when it creates tables, and then ignores collations for queries/columns.

You can see what CamCOPS is doing easily from a running CamCOPS server, using the “Inspect table definitions” view. You’ll see table definitions like:

CREATE TABLE phq9 (
    q1 INTEGER COMMENT 'Q1 (anhedonia) (0 not at all - 3 nearly every day)',
    -- lots of other columns here
    -- lots of CONSTRAINT statements next
) CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC ENGINE=InnoDB;

12.10.10.4. The problem that creates this error

The underlying reason: a string comparison is occurring between columns with different collations and MySQL cannot resolve the conflict 7.

The CamCOPS reason: if you have a very old CamCOPS database, it might not have the table collations set properly. Pick some tables and use syntax like SHOW CREATE TABLE phq9 \\G. (The \G is a special MySQL console suffix to show the results in non-tabular format.) If you don’t see a COLLATE command at the end, that’s probably the reason for the error.

12.10.10.5. How to generate the error

The error is typically triggered by viewing a clinical text view (CTV) that joins across “old” and “new” tables. A textual comparison will happen on the _era column. In the following example, cisr is a table with the collation set correctly (DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci), and patient is an old one (DEFAULT CHARSET=utf8mb4 only, with an implicit collation of utf8mb4_general_ci). You can then generate an error by hand with the following SQL:

SELECT COUNT(*)
FROM cisr c INNER JOIN patient p
    ON c.patient_id = p.id  -- integer; fine
    AND c._device_id = p._device_id  -- integer; fine
    AND c._era = p._era  -- string; collation mismatch; not fine
    -- COLLATE utf8mb4_unicode_ci  -- uncomment to fix the error for this query only
;

12.10.10.6. A quick solution

Rather than applying the collation to each table, you’d think we could change the database collation (and character set, while we’re at it) like this:

ALTER DATABASE <dbname> CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

However, that doesn’t work, because the old tables and columns both still have ‘hidden’ collation information:

SHOW TABLE STATUS;
SHOW FULL COLUMNS FROM patient;

So you have to go through all tables. To automate this 8, execute the following command to generate all the necessary SQL:

SELECT CONCAT(
        'ALTER TABLE ', table_schema, '.', table_name,
        ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;'
    ) AS ExecuteTheString
FROM information_schema.tables
WHERE table_schema = 'your_database_name'
AND table_type = 'BASE TABLE';

Then a quick bit of copying/pasting and you should be there.

12.10.10.7. A CamCOPS table-creation bug, fixed

2018-07-02: this happened again on the CPFT machine. Most tables had a collation of utf8mb4_general_ci except a new one (khandaker_1_medicalhistory). There is no reference to utf8mb4_general_ci in the code except here in the help. The DDL report looks right. Is there still a residual problem? Are upgrades messing with the collations somehow? As of 2018-07-02 (CamCOPS server version 2.2.3), all tables set back to CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; let’s see what happens.

In MySQL 5.7.22, this code:

USE testdb;
CREATE TABLE testtable (
    pk INTEGER PRIMARY KEY AUTO_INCREMENT,
    sometext VARCHAR(50)
) CHARSET=utf8mb4 COLLATE utf8mb4_unicode_ci;

successfully creates a table with the right collation. However, this does not:

USE testdb;
CREATE TABLE testtable (
    pk INTEGER PRIMARY KEY AUTO_INCREMENT,
    sometext VARCHAR(50)
) COLLATE utf8mb4_unicode_ci CHARSET=utf8mb4;

The second produces fields with the default collation (e.g. uft8mb4_generai_ci). So, the CHARSET command has to come before the COLLATE command.

Now, in Alembic, these end up being passed as mysql_charset and mysql_collate parameters to the op.create_table() call. In Alembic 0.9.9, the wrong order is generated. Ultimately, CamCOPS sets these via Base.__table_args__. SQLAlchemy finds that via ext/declarative/base.py and copies it to self.table_args. Now, that looks like it can be a dict or a tuple, so potentially ordered. However, in Alembic, they are also passed as keyword arguments to alembic.operations.ops.CreateTableOp.create_table, so go into an unordered dict. The final result is typically COLLATE utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC ENGINE=InnoDB CHARSET=utf8mb4, which doesn’t get the collation right.

So maybe one hacky option is to replace

Base.__table_args__ = {
    'mysql_engine': 'InnoDB',
    'mysql_row_format': 'DYNAMIC',
    'mysql_charset': 'utf8mb4',
    'mysql_collate': 'utf8mb4_unicode_ci'
}

with

Base.__table_args__ = {
    'mysql_engine': 'InnoDB',
    'mysql_row_format': 'DYNAMIC',
    'mysql_charset': 'utf8mb4 COLLATE utf8mb4_unicode_ci',
}

since the COLLATE part is really an argument to CHARSET (or CHARACTER SET); see https://dev.mysql.com/doc/refman/8.0/en/create-table.html.

(Of course, Python dictionaries are becoming ordered too; see https://stackoverflow.com/questions/1867861/dictionaries-how-to-keep-keys-values-in-same-order-as-declared.)

Yes, that works. Therefore: fixed as of 2018-07-08, v2.2.4.

12.10.11. Web browser reports: “DevTools failed to parse SourceMap…”

In full, the web browser reports:

DevTools failed to parse SourceMap: https://wombat/camcops/deform_static/css/bootstrap.min.css.map

This file (bootstrap.min.css.map) should be shipped with Deform, but isn’t. For now: don’t worry about it.

12.10.12. Logo PNG with transparency crashes PDF generator

Note

This bug relates to old versions of CamCOPS and should no longer be seen.

Problem

If your institutional logo PNG file contains a transparency layer, it will crash the xhtml2pdf PDF generator (as of 2015-02-05). The error looks like:

 /usr/local/lib/python2.7/dist-packages/xhtml2pdf/xhtml2pdf_reportlab.py in getRGBData...
    426                     self.mode = 'RGB'
    427                 elif mode not in ('L', 'RGB', 'CMYK'):
=>  428                     im = im.convert('RGB')
    429                     self.mode = 'RGB'
    430                 self._data = im.tostring()
im = <PIL.PngImagePlugin.PngImageFile image mode=P size=590x118>, im.convert = <bound method PngImageFile.convert of <PIL.PngImagePlugin.PngImageFile image mode=P size=590x118>>

 /usr/local/lib/python2.7/dist-packages/PIL/Image.py in convert(self=<PIL.PngImagePlugin.PngImageFile image mode=P size=590x118>, mode='RGB', matrix=None, dither=3, palette=0, colors=256)
    808         if delete_trns:
    809             #crash fail if we leave a bytes transparency in an rgb/l mode.
=>  810             del(new.info['transparency'])
    811         if trns is not None:
    812             if new_im.mode == 'P':
global new = <function new>, new.info undefined

<type 'exceptions.UnboundLocalError'>: local variable 'new' referenced before assignment
      args = ("local variable 'new' referenced before assignment",)
      message = "local variable 'new' referenced before assignment"

Solution 1

Remove the transparency layer. For example:

  • Load the file in GIMP;

  • Layer ‣ Transparency ‣ Remove alpha channel

  • Resave, e.g. with File ‣ Overwrite…

Solution 2

Use a newer version of CamCOPS; from server version 1.40, it uses wkhtmltopdf instead, which is also faster.


Footnotes

1

https://dev.mysql.com/doc/refman/5.7/en/too-many-connections.html

2

https://dev.mysql.com/doc/refman/5.7/en/charset-connection.html

3

https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html

4

https://stackoverflow.com/questions/24356090/difference-between-database-table-column-collation

5

https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

6

https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci

7

https://stackoverflow.com/questions/3029321/troubleshooting-illegal-mix-of-collations-error-in-mysql

8

https://stackoverflow.com/questions/10859966/how-to-convert-all-tables-in-database-to-one-collation; https://stackoverflow.com/questions/1294117/how-to-change-collation-of-database-table-column

9

See Linux flavours.