Question:
On Linux, I have a directory with lots of files. Some of them have non-ASCII characters in their names, but they are all valid UTF-8. One program has a bug that prevents it from working with non-ASCII filenames, and I have to find out how many files are affected. I was going to do this with find, then use grep to print the names containing non-ASCII characters, and then use wc -l to get the count. It doesn't have to be grep; I can use any standard Unix regular expression tool, like Perl, sed, AWK, etc.
However, is there a regular expression for 'any character that's not an ASCII character'?
Solution:
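One common way to express "any character that is not ASCII" is the negated character class `[^\x00-\x7F]`, which matches anything outside the 7-bit ASCII range. The following is a minimal sketch in Python rather than the find/grep/wc pipeline described in the question; the function names `has_non_ascii` and `count_non_ascii_names` are made up for illustration, and it assumes only regular file names (not directory names) should be counted.

```python
import os
import re

# Negated class: any character outside U+0000..U+007F (7-bit ASCII).
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def has_non_ascii(name):
    """Return True if the name contains at least one non-ASCII character."""
    return NON_ASCII.search(name) is not None


def count_non_ascii_names(root="."):
    """Count files under root whose own name (not the path) is non-ASCII."""
    count = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if has_non_ascii(name):
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical usage: count affected files under the current directory.
    print(count_non_ascii_names("."))
```

With GNU grep, the same character class should work with the Perl-compatible option, for example `find . | grep -P '[^\x00-\x7F]' | wc -l`. That pipeline matches against full paths, so a non-ASCII directory name would also cause every file beneath it to be counted.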
Reference 1: https://en.stackoom.com/question/8uYE
Reference 2: https://stackoom.com/question/8uYE