CSV Primer & Troubleshooting Tips
I have come across several questions posted on the Community related to the CSV (comma-separated values) format and felt it might be helpful to review the specifications and some gotchas. You might ask what's so complicated about the CSV format. After all, it's simply field values separated by commas. While it's true, there are a few more details that might come handy if you ever have to troubleshoot CSV import/export issues.
DEFINITION OF THE CSV FORMAT
The CSV format is quite simple and defined in IETF RFC 4180 Section 2. While I encourage you to read the original text, here's a condensed version:
- Each line is terminated by CRLF.
- The last line doesn't need to be terminated by CRLF.
- An optional header line is allowed in the first line, useful for adding field names.
- Each line may contain one or more fields, separated by commas. All lines must contain the same number of fields.
- Each field may be enclosed in double quotes (optional, but see below).
- Fields containing line breaks (CRLF), double quotes, and commas must be enclosed in double quotes.
- If a field is enclosed in double quotes, all double quotes inside the field must be escaped by another double quote (replace all " with "").
6 and 7 are in bold, as they tend to cause issues in malformed CSV data.
GOTCHAS TO WATCH OUT FOR
Here are some comments and gotchas to watch out for, in addition to 6 and 7 from above:
- Depending on locales, commas may be used as decimal points or thousand separators for the display values of numeric/currency fields. Be sure to account for those cases. This also applies to date fields.
- If you're seeing more fields than expected, look for commas in unenclosed field values.
- If field values are split across multiple lines, look for unenclosed CRLF.
- While RFC 4180 only specifies commas, some CSV formats (including some cases in ServiceNow) allow other delimiters.
- RFC 4180 Section 2 incudes this caveat: "While there are various specifications and implementations for the CSV format, there is no formal specification in existence, which allows for a wide variety of interpretations of CSV files. This section documents the format that seems to be followed by most implementations."
escapeCSV FUNCTION
Here's a simple JavaScript function based on RFC 4180 that can be used in processors or script includes for generating CSV data:
/**
Takes raw field value and returns CSV escaped value
based on RFC 4180 Common Format and MIME Type for CSV Files October 2005 http://tools.ietf.org/html/rfc4180
*
@param {string} raw - raw field value
@return {string} escaped CSV field value, if applicable per RFC 4180
*/
function escapeCSV(raw) {
var value = (raw + '').replace(/"/g,'""'); // escape all double quotes
if (/[\r\n",]/.test(value)) value = '"' + value + '"'; // if value has special characters, enclose in double quotes
return value;
}
Please feel free to connect, follow, post feedback / questions / comments, share, like, bookmark, endorse.
https://www.servicenow.com/community/developer-blog/csv-primer-troubleshooting-tips/ba-p/2282861
