In text mode, COPY will be simply defeated by the presence of a backslash in the JSON. So, for example, any field that contains an embedded double quote mark, or an embedded newline, or anything else that needs escaping according to the JSON spec, will cause failure. And in text mode you have very little control over how it works - you can't, for example, specify a different ESCAPE character. So text mode simply won't work.
CSV mode is more flexible, but poses different problems. Here, instead of backslash causing a problem, QUOTE characters can cause a problem. First, JSON itself uses the default QUOTE character (double quote) to quote all string values. But if we change use an alternative like single quote, then the presence of any single quote in the JSON leads us into difficulties. Second, JSON also uses the default DELIMITER (comma) extensively. So, clearly we need to use something else for the QUOTE and DELIMITER options. (By default, in CSV mode, the ESCAPE character is the same as the QUOTE character, so we don't need to worry about it separately.)
What we in fact want is to specify QUOTE and DELIMITER characters that can't appear at all in the JSON. Then the whole line will be seen as a single unquoted datum, which is exactly what we want. There is a small set of single-byte characters that happen to be illegal in JSON, so we can be sure that choosing them for these options should do the right thing with any legal JSON. These are the control characters. So the solution we came up with looks like this:
Of course, if the JSON has embedded newlines as punctuation, this wont work. So it's important that you configure whatever is producing the JSON not to insert newlines anywhere but at the end of each JSON document.copy the_table(jsonfield) from '/path/to/jsondata' csv quote e'\x01' delimiter e'\x02';
Now this solution is a bit of a hack. I wonder if there's a case for a COPY mode that simply treats each line as a single datum. I also wonder if we need some more specialized tools for importing JSON, possibly one or more Foreign Data Wrappers. Such things could handle, say, embedded newline punctuation.
Note too that files produced by PostgreSQL's COPY ... TO command will be properly quoted and escaped and won't need to be handled like this to read them back. Of course, if you want them to be readable by other non-CSV processors, then you might need to use similar options to those above to avoid unwanted quoting and escaping.