Sanitizing CSV files with regex

Often you want to feed a CSV file to another program, but commas inside fields and double or single quotation marks can trip up programs that parse CSV naively.

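  1. The first regex replaces a comma inside a double-quoted field with its Unicode escape, \u002C. It only matches a quoted field that follows another field, so the first field on a line is left alone. It also replaces only one comma per line on each run, so repeat it until vim reports that the pattern is not found.
  2. The second then removes all the double quotation marks.
  3. The third replaces each single quotation mark with its Unicode escape, \u0027.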

These are all in vim syntax.

%s/\(,"[^\"]*\),\(.*"\)/\1\\u002C\2/
%s/"//g
%s/'/\\u0027/g
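
For use outside vim, the same three steps can be sketched in Python. This is a rough translation under stated assumptions, not a general CSV parser: the function name sanitize_line is hypothetical, the character class is simplified to [^"], and the first substitution is looped so that every comma in a quoted field is replaced in one call rather than one per run.

```python
import re

# Assumption: mirrors the first vim pattern, with the character class
# simplified to [^"]. It matches one comma inside a double-quoted field
# that is preceded by a comma (so never the first field on the line).
COMMA_IN_QUOTES = re.compile(r'(,"[^"]*),(.*")')


def sanitize_line(line: str) -> str:
    """Apply the three substitutions to a single CSV line (hypothetical helper)."""
    # Step 1: replace one quoted comma per pass until nothing changes.
    while True:
        replaced = COMMA_IN_QUOTES.sub(
            lambda m: m.group(1) + "\\u002C" + m.group(2), line, count=1
        )
        if replaced == line:
            break
        line = replaced
    # Step 2: remove all double quotation marks.
    line = line.replace('"', "")
    # Step 3: replace single quotation marks with the literal text \u0027.
    return line.replace("'", "\\u0027")


if __name__ == "__main__":
    print(sanitize_line('1,"Doe, John",it\'s'))
```

As in the vim version, a quoted first field keeps its commas, so a line such as "a,b",c comes out as a,b,c with one extra column.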
